Class: MicroformatParser::Extractor
- Inherits: Object
- Defined in: lib/uformatparser.rb
Overview
Implements an extractor using a simple expression format.
For more information see MicroformatParser.extractor.
Constant Summary
- REGEX =
Parses each extractor into three parts:
- $1: function name (excluding parentheses)
- $2: element name
- $3: attribute name (including the leading @)
If a match is found, the result is either $1, or $2 and/or $3.
/^(\w+)\(\)|([A-Za-z][A-Za-z0-9_\-:]*)?(@[A-Za-z][A-Za-z0-9_\-:]*)?$/
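To make the three capture groups concrete, here is a minimal sketch that applies a copy of the pattern to a few expressions. The helper name `parse_extractor` is hypothetical and not part of the library; it only returns the captures as `[function, element, attribute]`.

```ruby
# Copy of the REGEX constant documented above.
REGEX = /^(\w+)\(\)|([A-Za-z][A-Za-z0-9_\-:]*)?(@[A-Za-z][A-Za-z0-9_\-:]*)?$/

# Hypothetical helper: returns the captures as [function, element, attribute],
# or nil if the pattern does not match at all.
def parse_extractor(expression)
  match = REGEX.match(expression)
  match ? match.captures : nil
end

p parse_extractor("text()")   # function only
p parse_extractor("a@href")   # element and attribute
p parse_extractor("abbr")     # element only
p parse_extractor("@title")   # attribute only
p parse_extractor("123")      # matches the empty string at the end: all captures nil
```

Because every part of the second alternative is optional, the pattern can match an empty string at the end of otherwise invalid input. A caller therefore has to check that at least one capture is set instead of relying on the match failing.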
Instance Method Summary
- #extract(node) ⇒ Object
  Extracts a value from the node based on the extractor expression.
- #initialize(context, statement) ⇒ Extractor (constructor)
- #inspect ⇒ Object
Constructor Details
#initialize(context, statement) ⇒ Extractor
# File 'lib/uformatparser.rb', line 612
def initialize(context, statement)
  statement.strip!
  @extracts = []
  # Break the statement into multiple extraction rules, separated by |.
  statement.split('|').each do |extract|
    parts = REGEX.match(extract)
    if parts[1] then
      # Function. Find a method in the context object (the rule class),
      # report an error if not found.
      begin
        @extracts << context.method(parts[1])
      rescue NameError => error
        raise InvalidExtractorException, error.message, error.backtrace
      end
    elsif parts[2] and parts[3]
      # Apply only if element of this type, and extract the named attribute.
      attr_name = parts[3][1..-1]
      @extracts << proc { |node| node.attributes[attr_name] if node.name == parts[2] }
    elsif parts[2]
      # Apply only if element of this type, and extract the text value.
      @extracts << proc { |node| text(node) if node.name == parts[2] }
    elsif parts[3]
      # Extract the named attribute.
      attr_name = parts[3][1..-1]
      @extracts << proc { |node| node.attributes[attr_name] }
    else
      raise InvalidExtractorException, "Invalid extraction statement"
    end
  end
  raise InvalidExtractorException, "Invalid (empty) extraction statement" if @extracts.size == 0
end
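The following self-contained sketch shows how a statement such as `abbr@title|a@href|text()` could turn into a list of callables, following the same branch order as the constructor. It is an illustration under stated assumptions, not the library's code: `Node`, `DemoContext`, and `build_rules` are hypothetical stand-ins, and `ArgumentError` replaces `InvalidExtractorException` so the example runs without the library.

```ruby
# Hypothetical stand-in for a parsed node; the real parser's node class is not used here.
Node = Struct.new(:name, :attributes, :text)

# Copy of the REGEX constant documented for this class.
REGEX = /^(\w+)\(\)|([A-Za-z][A-Za-z0-9_\-:]*)?(@[A-Za-z][A-Za-z0-9_\-:]*)?$/

# Hypothetical context object that supplies the functions named in "name()" rules.
class DemoContext
  def text(node)
    node.text
  end
end

# Builds one callable per |-separated rule (assumed helper, not part of the library).
def build_rules(context, statement)
  rules = statement.strip.split('|').map do |rule|
    parts = REGEX.match(rule)
    function, element, attribute = parts && parts.captures
    attr_name = attribute && attribute[1..-1] # drop the leading @
    if function
      context.method(function)                # raises NameError if undefined
    elsif element && attr_name
      proc { |node| node.attributes[attr_name] if node.name == element }
    elsif element
      proc { |node| node.text if node.name == element }
    elsif attr_name
      proc { |node| node.attributes[attr_name] }
    else
      raise ArgumentError, "Invalid extraction statement: #{rule.inspect}"
    end
  end
  raise ArgumentError, "Invalid (empty) extraction statement" if rules.empty?
  rules
end

rules = build_rules(DemoContext.new, "abbr@title|a@href|text()")
link = Node.new("a", { "href" => "http://example.com/" }, "Example")
puts rules[1].call(link) # the a@href rule applies to an <a> node
```

Keeping both `Method` objects and procs in one array works because both respond to `call`, so the code that evaluates the rules does not need to know which kind it holds.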
Instance Method Details
#extract(node) ⇒ Object
Extracts a value from the node based on the extractor expression.
# File 'lib/uformatparser.rb', line 646
def extract(node)
  # Iterate over all extraction rules, returning the first value.
  value = nil
  @extracts.each do |extract|
    value = extract.call(node)
    break if value
  end
  value
end
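The first-match behaviour of `extract` can be sketched in isolation. In this minimal example the node is a plain Hash and `first_value` is a hypothetical helper, both assumptions made for illustration. Unlike the method above, the helper always returns `nil` when no rule yields a value.

```ruby
# Returns the first truthy result produced by the rules, or nil if none match.
def first_value(rules, node)
  rules.each do |rule|
    value = rule.call(node)
    return value if value
  end
  nil
end

# Hypothetical node represented as a plain Hash for illustration.
node = { name: "abbr", title: "2005-10-10", text: "October 10" }

rules = [
  proc { |n| n[:href] },  # no :href key, so this yields nil
  proc { |n| n[:title] }, # first rule to produce a value
  proc { |n| n[:text] }   # never reached for this node
]

puts first_value(rules, node)
```

The order of the rules in the statement therefore acts as a priority list: later rules serve only as fallbacks for earlier ones.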
#inspect ⇒ Object
# File 'lib/uformatparser.rb', line 656
def inspect
  @extracts.join('|')
end