nekohtml

A thin wrapper around NekoHTML as provided by Celerity.

At the moment this gem depends on Celerity to provide the NekoHTML jar. Once I can figure out how to make that dependency optional, I’ll bundle the jar here for installs where the celerity gem isn’t present.

Usage

jruby-1.4.0 > require 'nekohtml'
 => true 
jruby-1.4.0 > html= "<html><head><title>Title of Majesty</title></head></html>" 
 => "<html><head><title>Title of Majesty</title></head></html>" 
jruby-1.4.0 > doc= Nekohtml.parse(html)
 => #<Nekohtml::HtmlDocument:0x3f70119f ... >
jruby-1.4.0 > doc.search("//TITLE")
 => #<Nekohtml::HtmlNodeList:0x1a7b5617 ... >
jruby-1.4.0 > _.first.text
 => "Title of Majesty"

Note that XPath expressions must use all-caps tag names, so "//TITLE" matches but "//title" does not. This is a limitation of NekoHTML; I may plunder Celerity’s source to see how it (or HtmlUnit) handles this, but for now, that’s what you’ve got.
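Until that is sorted out, one possible workaround is to upcase the tag names before searching. The sketch below is only an illustration: upcase_xpath_tags is a hypothetical helper, not part of this gem, and it only handles simple location paths. Element names inside predicates are left untouched, and a quoted string that contains a slash could be changed by mistake.

```ruby
# Hypothetical helper, NOT part of the nekohtml gem.
# Upcases element names in simple XPath location paths.
# A name is treated as an element name when it follows the start of the
# expression, a "/" or an axis separator "::", and is not itself followed
# by "(" (a function or node test) or "::" (an axis name).
# Attribute tests such as @href never match because "@" is not a letter.
def upcase_xpath_tags(xpath)
  xpath.gsub(%r{(\A|/|::)([A-Za-z][\w-]*)(?![\w-]|\(|::)}) do
    "#{$1}#{$2.upcase}"
  end
end

puts upcase_xpath_tags("//title")
puts upcase_xpath_tags("//div/p[1]")
puts upcase_xpath_tags("//a/@href")
puts upcase_xpath_tags("//p/text()")
```

Assuming the session above, doc.search(upcase_xpath_tags("//title")) should then behave like doc.search("//TITLE").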

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Commit, but do not touch the Rakefile, version, or history. (If you want to have your own version, that is fine, but bump the version in a separate commit that I can ignore when I pull.)

  • Send me a pull request. Bonus points for topic branches.

Copyright © 2010 Alex Young. See LICENSE for details.