Class: Ariel::RuleSet

Inherits: Object
Defined in:
lib/ariel/rule_set.rb

Overview

A RuleSet acts as a container for a Node::Structure’s start and end rules. These are stored as ordered arrays and are applied in turn until there is a successful match. A RuleSet takes responsibility for applying its start and end rules in order to extract a Node::Extracted.
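As a rough sketch of intended use (start_rule_a, start_rule_b and end_rule below are hypothetical placeholders for learnt Ariel::Rule objects, and tokenstream stands for an Ariel::TokenStream covering the region to extract from):

# Rules are passed as ordered arrays; earlier rules take priority and
# later ones are only consulted if the earlier ones fail to match.
rule_set = Ariel::RuleSet.new([start_rule_a, start_rule_b], [end_rule])
extracted_streams = rule_set.apply_to(tokenstream)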

Instance Method Summary

Constructor Details

#initialize(start_rules, end_rules) ⇒ RuleSet

Returns a new instance of RuleSet.



# File 'lib/ariel/rule_set.rb', line 8

def initialize(start_rules, end_rules)
  @start_rules = start_rules
  @end_rules = end_rules
end
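Note that both arguments are expected to be arrays, even when only a single rule has been learnt; ordering is significant because the rules are tried in turn when the set is applied. For illustration (the rule objects here are hypothetical):

Ariel::RuleSet.new([only_start_rule], [only_end_rule])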

Instance Method Details

#apply_to(tokenstream) ⇒ Object

Returns an array of the extracted tokenstreams. An empty array is returned if the rules cannot be applied. TODO: Think more about the way list iteration rules are applied



# File 'lib/ariel/rule_set.rb', line 16

def apply_to(tokenstream)
  start_idxs = nil
  end_idxs = nil
  # Try each start rule in turn, keeping the first one that matches.
  @start_rules.each do |rule|
    start_idxs = rule.apply_to tokenstream
    break if !start_idxs.empty?
  end
  # Try each end rule in turn, keeping the first one that matches.
  @end_rules.each do |rule|
    end_idxs = rule.apply_to tokenstream
    end_idxs.reverse! # So the start_idxs and end_idxs match up
    break if !end_idxs.empty?
  end
  result = []
  unless start_idxs.empty? && end_idxs.empty?
    # Following expression deals with the case where the first start rule
    # matches after the first end rule, indicating that all tokens up to the
    # end rule match should be a list item
    if start_idxs.first > end_idxs.first
      start_idxs.insert(0, 0)
    end
    if end_idxs.last < start_idxs.last
      end_idxs << (tokenstream.size - 1)
    end
    Log.debug "RuleSet matched with start_idxs=#{start_idxs.inspect} and end_idxs=#{end_idxs.inspect}"
    start_idxs.zip(end_idxs) do |start_idx, end_idx|
      if start_idx && end_idx
        next if start_idx > end_idx
        result << tokenstream.slice_by_token_index(start_idx, end_idx)
        yield result.last if block_given?
      else
        break
      end
    end
  end
  return result
end
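A minimal usage sketch, assuming rule_set is a populated RuleSet and tokenstream is an Ariel::TokenStream. With a block, each extraction is yielded as soon as it is sliced out; without one, the extractions are simply returned, and an empty array signals that no rule matched. When several matches are found (as for list items), start and end indexes are paired up positionally, with the stream boundaries used as sentinels where the matches are offset.

rule_set.apply_to(tokenstream) do |child_stream|
  # Each child_stream is a slice of the parent tokenstream.
end

extractions = rule_set.apply_to(tokenstream)
puts "no match" if extractions.empty?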