Rack HTMLTidy

The rack-htmltidy gem is a middleware that adds HTML validation to Rack applications. It uses Dave Raggett's HTML Tidy to check HTML pages and writes the results to the log. The middleware depends on the TidyLib C library and the tidy gem.

The idea of using middleware to validate HTML comes from Marcin Kulik; see [Rack middleware using HTMLTidy] [mkulik].
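
To illustrate the general idea, here is a minimal sketch of a Rack middleware that inspects HTML responses and logs any problems it finds. It is not the rack-htmltidy source: the class name `HTMLCheckSketch` and the injected `checker` callable are hypothetical stand-ins for the real TidyLib call.

```ruby
require 'logger'

# Hypothetical sketch of an HTML-checking middleware (not the gem's code).
class HTMLCheckSketch
  # checker: any object responding to call(html) that returns an array of
  # message strings. In a real setup this would wrap TidyLib.
  def initialize(app, checker:, logger: Logger.new($stderr))
    @app = app
    @checker = checker
    @logger = logger
  end

  def call(env)
    status, headers, body = @app.call(env)
    if html?(headers)
      # Collect the response body so it can be checked as one document.
      html = +""
      body.each { |chunk| html << chunk }
      body.close if body.respond_to?(:close)
      @checker.call(html).each do |msg|
        @logger.warn("#{env['PATH_INFO']}: #{msg}")
      end
      body = [html]
    end
    # The response itself is passed through unchanged.
    [status, headers, body]
  end

  private

  # True when the response declares a text/html content type.
  def html?(headers)
    pair = headers.find { |key, _| key.to_s.downcase == "content-type" }
    !pair.nil? && pair[1].to_s.include?("text/html")
  end
end
```

The response is only read and logged, never modified, so the middleware can be dropped from the stack without changing what the application serves.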

Limitations of TidyLib: Currently, all character encoding support is hard wired into the library. This means we do a poor job of supporting many popular encodings such as GB2312, EUC-KR, Eastern European languages, Cyrillic, etc. Text in any of these encodings must first be transcoded into ISO-10646/Unicode before Tidy can work with it.
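
As a rough sketch of the transcoding step, Ruby 1.9 and later can convert a page from a known legacy encoding to UTF-8 with `String#encode`. The helper name `to_utf8` is hypothetical, and the sketch assumes you already know the source encoding; the middleware does not detect it for you.

```ruby
# Transcode a byte string from a known legacy encoding to UTF-8.
# Raises an Encoding error if the bytes are not valid in that encoding.
def to_utf8(bytes, source_encoding)
  bytes.dup.force_encoding(source_encoding).encode("UTF-8")
end

# Six bytes spelling a Russian greeting in Windows-1251 (Cyrillic).
raw = "\xCF\xF0\xE8\xE2\xE5\xF2".b
html = to_utf8(raw, "Windows-1251")
```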

Using with Rack application

Rack::HTMLTidy can be used with any Rack application, for example with a Sinatra application. If your application includes a rackup file or uses Rack::Builder to construct the application pipeline, simply require and use as follows:

require 'rack/htmltidy'
use Rack::HTMLTidy, 
  :errors => true, 
  :diagnostics => true, 
  :path => "/usr/lib/libtidy-0.99.so.0"
run app

Remember to update the :path option to the location of TidyLib on your system.
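
If you are unsure where TidyLib lives, a small helper can probe a few likely locations. This is only a sketch: the method name `find_tidy_lib` is hypothetical, and apart from the path used in the example above, the candidate paths are guesses that you should adjust for your system.

```ruby
# Candidate locations for the TidyLib shared library.
# Only the first entry comes from this README; the rest are assumptions.
TIDY_CANDIDATES = [
  "/usr/lib/libtidy-0.99.so.0",
  "/usr/lib64/libtidy-0.99.so.0",
  "/usr/lib/libtidy.so",
  "/usr/local/lib/libtidy.dylib"
]

# Return the first candidate path that exists, or nil if none does.
def find_tidy_lib(candidates = TIDY_CANDIDATES)
  candidates.find { |path| File.exist?(path) }
end
```

The result can then be passed as the `:path` option, for example `:path => find_tidy_lib`, after checking that it is not nil.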

Using with Rails 2.3.2

To use the middleware in a Rails application, add the following to the config/environment.rb file:

require 'rack/htmltidy'

Rails::Initializer.run do |config|  
  config.gem "rack-htmltidy"
  config.middleware.use(Rack::HTMLTidy,
    :errors => true, 
    :diagnostics => true,
    :path => "/usr/lib/libtidy-0.99.so.0")
end  

Check the Rack configuration:

rake middleware

Remember to update the :path option to the location of TidyLib on your system.

Miscellaneous stuff

1. To install TidyLib on Fedora 9 and above:

yum install libtidy libtidy-devel

2. If you hit the crash `tidybuf.rb:40: [BUG] Segmentation fault`, clone, build, and install the tidy gem from this repository:

git://github.com/ak47/tidy.git

[mkulik]: http://sickill.net/blog/2009/05/10/rack-middleware-using-html-tidy.html "Marcin Kulik Blog"