Version
Version 1.0.4 is released; see https://rubygems.org/gems/dq-readability
- competing structure for fighting invalid characters
- Wikipedia image case resolved
Install
Command line:
gem install dq-readability   # prefix with sudo if your Ruby installation requires it
Bundler:
gem "dq-readability"
Example
require 'rubygems'
require 'dq-readability'
source = "http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/Sorting/radixSort.htm"
# Whitelists of the HTML tags and attributes to keep in the extracted content.
tags = %w[div pre p h1 h2 h3 h4 td table tr b a img br li ul ol center hr blockquote em strong
          sub sup font tbody tt span dl dd t code figure fieldset legend dir noscript]
attributes = %w[href src align width color height]
puts DQReadability::Document.new(source, :tags => tags, :attributes => attributes).content
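A long tag whitelist can be easier to maintain when it is built from named groups. The following is a minimal sketch in plain Ruby, not part of the gem: the constant names, the group names, and the `readability_options` helper are hypothetical. It only assumes that the constructor accepts `:tags` and `:attributes` as in the example above.

```ruby
# Hypothetical grouping of the whitelist from the example; the group names
# are illustrative only and are not defined by dq-readability.
TAG_GROUPS = {
  :block  => %w[div p pre blockquote h1 h2 h3 h4 center hr br],
  :inline => %w[a b em strong sub sup span font tt code img],
  :list   => %w[ul ol li dl dd],
  :table  => %w[table tbody tr td],
  :other  => %w[figure fieldset legend dir noscript t]
}
ATTRIBUTES = %w[href src align width color height]

# Flatten the groups into one option hash, dropping any duplicate entries.
def readability_options(groups = TAG_GROUPS, attributes = ATTRIBUTES)
  { :tags => groups.values.flatten.uniq, :attributes => attributes.uniq }
end

options = readability_options
puts options[:tags].join(" ")
```

With the gem installed, the resulting hash could then be passed the same way as in the example: `DQReadability::Document.new(source, :tags => options[:tags], :attributes => options[:attributes]).content`.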