Class: Mida::Document

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/mida/document.rb

Overview

Class that holds the extracted Microdata

Constructor Details

#initialize(target, page_url = nil) ⇒ Document

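Creates a new Document by parsing the target and extracting its top-level Microdata items.

[target] The string containing the HTML that you want to parse, or an already parsed Nokogiri::XML::Document.
[page_url] The URL of target, used to form absolute URLs. This must include the filename, e.g. index.html.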



# File 'lib/mida/document.rb', line 18

def initialize(target, page_url=nil)
  @doc = target.kind_of?(Nokogiri::XML::Document) ? target : Nokogiri(target)
  @page_url = page_url
  @items = extract_items
end
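
The note that page_url must include the filename follows from how relative URLs are resolved against a base URL. The code Mida uses for this is not shown on this page, so the sketch below is only an illustration using Ruby's standard URI.join; the helper name absolute_url and the example.com URLs are hypothetical.

```ruby
require 'uri'

# Hypothetical helper, not part of Mida: resolve a (possibly relative) URL
# found in the markup against the URL of the page it came from.
def absolute_url(page_url, href)
  URI.join(page_url, href).to_s
end

# The last path segment of the base is replaced during resolution,
# so a base with the filename keeps the "blog/" directory...
puts absolute_url('http://example.com/blog/index.html', 'images/logo.png')
# prints "http://example.com/blog/images/logo.png"

# ...while a base without the filename (and without a trailing slash) loses it.
puts absolute_url('http://example.com/blog', 'images/logo.png')
# prints "http://example.com/images/logo.png"
```

With this kind of resolution, a base URL that omits the filename resolves relative links one directory too high, which is consistent with the requirement stated for page_url.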

Instance Attribute Details

#items ⇒ Object (readonly)

An Array of Mida::Item objects. These are all top-level items, and hence not properties of other Items.



# File 'lib/mida/document.rb', line 11

def items
  @items
end

Instance Method Details

#each ⇒ Object

Implements each for Enumerable, yielding every top-level item in turn.



# File 'lib/mida/document.rb', line 25

def each
  @items.each {|item| yield(item)}
end
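
Because the class includes Enumerable and defines each, methods such as map, select and first become available on it. The following is a minimal sketch of that pattern using a stand-in class (ItemList is hypothetical and holds plain strings rather than Mida::Item objects), so that it runs without the gem.

```ruby
# Stand-in for illustration only; the real class is Mida::Document.
class ItemList
  include Enumerable

  def initialize(items)
    @items = items
  end

  # Same shape as Document#each: yield every stored item to the block.
  def each
    @items.each { |item| yield(item) }
  end
end

list = ItemList.new(%w[Person Review Organization])

# None of these are defined in ItemList; Enumerable builds them on #each.
p list.map(&:upcase)                          # ["PERSON", "REVIEW", "ORGANIZATION"]
p list.select { |t| t.start_with?('P') }      # ["Person"]
p list.first                                  # "Person"
```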

#search(itemtype, items = @items) ⇒ Object

Returns an array of matching Mida::Item objects.

This drills down through each Item, including Items nested as property values, to find matching items.

[itemtype] A regexp to match the item types against.
[items] An array of items to search. If no argument is supplied, all items in the document are searched.



# File 'lib/mida/document.rb', line 36

def search(itemtype, items=@items)
  items.each_with_object([]) do |item, found_items|
    # Allows matching against empty string, otherwise couldn't match
    # as item.type can be nil
    if (item.type.nil? && "" =~ itemtype) || (item.type =~ itemtype)
      found_items << item
    end
    found_items.concat(search_values(item.properties.values, itemtype))
  end
end
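
The recursion in search can be tried out without the gem using stand-in objects. In the sketch below, Item is a hypothetical Struct standing in for Mida::Item, search is the same logic written as a free function, and search_values is an assumed implementation of the private helper, which is not shown on this page.

```ruby
# Stand-in for Mida::Item, for illustration only.
Item = Struct.new(:type, :properties)

# Same logic as Document#search, written as a free function.
def search(itemtype, items)
  items.each_with_object([]) do |item, found_items|
    # An item without a type only matches a regexp that matches "".
    if (item.type.nil? && "" =~ itemtype) || (item.type =~ itemtype)
      found_items << item
    end
    found_items.concat(search_values(item.properties.values, itemtype))
  end
end

# ASSUMPTION: the real helper is private and not shown; this version
# descends into nested Items and arrays and ignores plain values.
def search_values(values, itemtype)
  values.flat_map do |value|
    case value
    when Item  then search(itemtype, [value])
    when Array then search_values(value, itemtype)
    else []
    end
  end
end

address = Item.new('http://schema.org/PostalAddress', { 'locality' => ['Oslo'] })
person  = Item.new('http://schema.org/Person',
                   { 'name' => ['Ann'], 'address' => [address] })
untyped = Item.new(nil, { 'note' => ['no itemtype'] })

# The nested address is found even though only person is top-level.
p search(/PostalAddress/, [person, untyped]).map(&:type)
# An empty-string pattern selects items that have no type at all.
p search(/\A\z/, [person, untyped]).length
```

Under these assumptions, the first call returns the nested address item and the second returns only the untyped item.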