HTML parser for Mac and iPhone/iPod touch

Apple Inc. provides XML parsers based on open source XML parsers, but it doesn’t provide one for HTML.

One good parser I found is Element Parser, and its source code is hosted on GitHub. However, there is no good explanation of how to use it.
So I took a look at its sample source code and its FAQ page. My impression is that it uses CSS selectors.

So, if there are HTML tags like these:

<meta name="generator" content="WordPress 2.8.1" /> <!-- leave this for stats -->
<link rel="stylesheet" href="http://icodeblog.com/wp-content/themes/bluez/style.css" type="text/css" media="screen" />
<link rel="alternate" type="application/rss+xml" title="RSS 2.0" href="http://icodeblog.com/feed/" />
<link rel="alternate" type="text/xml" title="RSS .92" href="http://icodeblog.com/feed/rss/" />
<link rel="alternate" type="application/atom+xml" title="Atom 0.3" href="http://icodeblog.com/feed/atom/" />
<link rel="pingback" href="http://icodeblog.com/xmlrpc.php" />

To retrieve the RSS and Atom feed links (the link tags whose rel attribute is "alternate"), the search pattern should be:

link[rel="alternate"]

There is a good explanation of CSS selectors at Selectutorial: CSS selectors.
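
To make it concrete which tags that pattern picks out, here is a minimal sketch in Python using only the standard library's html.parser module. It is not Element Parser's API; the AttributeSelector class and the select_feed_links function are names I made up for illustration, and the sketch only handles a single selector of the form tag[attr="value"].

```python
from html.parser import HTMLParser

# The sample tags from this post.
HTML = """\
<meta name="generator" content="WordPress 2.8.1" /> <!-- leave this for stats -->
<link rel="stylesheet" href="http://icodeblog.com/wp-content/themes/bluez/style.css" type="text/css" media="screen" />
<link rel="alternate" type="application/rss+xml" title="RSS 2.0" href="http://icodeblog.com/feed/" />
<link rel="alternate" type="text/xml" title="RSS .92" href="http://icodeblog.com/feed/rss/" />
<link rel="alternate" type="application/atom+xml" title="Atom 0.3" href="http://icodeblog.com/feed/atom/" />
<link rel="pingback" href="http://icodeblog.com/xmlrpc.php" />
"""


class AttributeSelector(HTMLParser):
    """Hypothetical helper: collects tags matching tag[attr="value"]."""

    def __init__(self, tag, attr, value):
        super().__init__()
        self.tag = tag
        self.attr = attr
        self.value = value
        self.matches = []  # one dict of attributes per matching tag

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also end up here, because
        # the default handle_startendtag calls handle_starttag.
        attributes = dict(attrs)
        if tag == self.tag and attributes.get(self.attr) == self.value:
            self.matches.append(attributes)


def select_feed_links(html):
    """Return the attributes of every tag matching link[rel="alternate"]."""
    parser = AttributeSelector("link", "rel", "alternate")
    parser.feed(html)
    parser.close()
    return parser.matches


if __name__ == "__main__":
    for link in select_feed_links(HTML):
        print(link["title"], link["href"])
```

With the sample tags, only the three feed links match. The stylesheet and pingback links are skipped because their rel values differ; selecting the CSS link instead would take link[rel="stylesheet"].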

5 responses to this post.

  1. Posted by touchtank on October 10, 2009 at 6:25 PM

    Thanks for the mention of ElementParser. Did you see the intro at http://touchtank.wordpress.com/element-parser/ What else could we post to make it easier to learn and use? We’d love your feedback.

    • Posted by jongampark on October 10, 2009 at 8:07 PM

      Hello, TouchTank
      Yeah, I’ve read the intro. However, I found that it leaves out some things, such as the “attribute” method of the Element class.

      I have no strong opinion about how a reference manual should look, but it would be great to have one like Apple’s (i.e. a list of methods for each class, what they do, and so on). Yes, there are comments once a file such as Element.h is opened, but it can be hard for users to figure out which classes and methods are for internal use.

      Also, I found that the array returned by the selectElements call is not key-value compliant. So, to filter things out, I had to build an array of dictionaries. If the array returned from that method were key-value compliant, it would be possible to apply a predicate directly.

      My last word is…… Thank you for writing a great HTML parser!!! :)

    • Posted by jongampark on October 11, 2009 at 6:24 PM

      I also noticed one problem.
      When I tried to extract the RSS link from Cocoa With Love’s HTML, it couldn’t find the link tags whose rel attribute is “alternate”. This is due to JavaScript located in front of the meta tag. When I removed the JavaScript, it detected the RSS links correctly.
