Web scraper with an elegant DSL that parses structured data from web pages.


gem install wombat

Note: Requires Ruby 1.9.3 or later (activesupport requires Ruby >= 1.9.3).

Scraping a page:

The simplest way to use Wombat is by calling Wombat.crawl and passing it a block:

require 'wombat'

Wombat.crawl do
  base_url "http://www.github.com"
  path "/"

  headline xpath: "//h1"
  subheading css: "p.subheading"

  what_is({ css: ".teaser h3" }, :list)

  links do
    explore xpath: '//*[@id="wrapper"]/div[1]/div/ul/li[1]/a' do |e|
      e.gsub(/Explore/, "Love")
    end

    search css: '.search'
    features css: '.features'
    blog css: '.blog'
  end
end

The code above returns the following hash:

{
  "headline"=>"Build software better, together.",
  "subheading"=>"Powerful collaboration, review, and code management for open source and private development projects.",
  "what_is"=>[
    "Great collaboration starts with communication.",
    "Manage and contribute from all your devices.",
    "The world’s largest open source community."
  ],
  "links"=>{
    "explore"=>"Love GitHub",
    # ... followed by the "search", "features", and "blog" entries
  }
}
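
Because the result is an ordinary Ruby Hash with string keys, it can be read with the usual Hash and Array methods. The following is only a minimal sketch: the `result` hash is a hand-copied sample of the output shown above (not the product of a live crawl), and the local variable names are illustrative assumptions.

```ruby
# Hand-copied sample of the hash shown above (an assumption for illustration);
# in real use this value would come from Wombat.crawl.
result = {
  "headline" => "Build software better, together.",
  "what_is"  => [
    "Great collaboration starts with communication.",
    "Manage and contribute from all your devices.",
    "The world’s largest open source community."
  ],
  "links" => { "explore" => "Love GitHub" }
}

# Top-level properties are looked up by their string keys.
headline = result["headline"]

# A property declared with :list comes back as an Array.
teaser_count = result["what_is"].length

# Properties declared inside a nested block come back as a nested Hash.
explore_label = result["links"]["explore"]

puts headline
puts "#{teaser_count} teasers"
puts explore_label
```

Note that the keys are strings, not symbols, so `result[:headline]` would return nil on a hash shaped like this one.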


