Class: Ariel::Rule
- Inherits:
- Object
- Inheritance chain: Object, Ariel::Rule
- Defined in:
- lib/ariel/rule.rb
Overview
A rule contains an array of landmarks, each of which is an array of individual landmark features. This landmark array is accessible through Rule#landmarks. A Rule also has a direction, :forward or :back, which determines whether it is applied from the beginning or the end of a tokenstream.
Constant Summary
- @@RuleMatchData = Struct.new(:token_loc, :type)
- @@cache = {}
Instance Attribute Summary
- #direction ⇒ Object
  Returns the value of attribute direction.
- #exhaustive ⇒ Object
  Returns the value of attribute exhaustive.
- #landmarks ⇒ Object
  Returns the value of attribute landmarks.
Class Method Summary
- .prepare_tokenstream(tokenstream, direction) ⇒ Object
  Reverses the given tokenstream if necessary based on its current direction, and the direction given (corresponding to the sort of rule you hope to apply to it).
Instance Method Summary
- #==(rule) ⇒ Object (also: #eql?)
  Two rules are equal if they have the same list of landmarks and the same direction.
- #apply_to(tokenstream) ⇒ Object
  Given a TokenStream and a rule, applies the rule on the stream and returns an empty array if the match fails and an array of token_locs if the match succeeds.
- #closest_match(tokenstream, preference = :none) ⇒ Object
  Only used in rule learning on labeled tokenstreams.
- #deep_clone ⇒ Object
- #exhaustive? ⇒ Boolean
- #generalise_feature(landmark_index, feature_index = 0) ⇒ Object
- #hash ⇒ Object
- #initialize(landmarks, direction, exhaustive = false) ⇒ Rule
  constructor
  A rule’s direction can be :forward or :back, which determines whether it is applied from the start or the end of the TokenStream.
- #matches(tokenstream, *types) ⇒ Object
  Returns true or false depending on whether the match of this rule on the given tokenstream is of any of the given types (could be a combination of :perfect, :early, :fail and :late).
- #partial(range) ⇒ Object
  Returns a rule that contains a given range of this rule's landmarks.
- #wildcard_count ⇒ Object
  Returns the number of wildcards included as features in the list of rule landmarks.
Constructor Details
#initialize(landmarks, direction, exhaustive = false) ⇒ Rule
A rule’s direction can be :forward or :back, which determines whether it is applied from the start or the end of the TokenStream. The landmark array contains an array for each landmark, which consists of one or more features, e.g. Rule.new([[:anything, "Example"], ["Test"]], :forward).
# File 'lib/ariel/rule.rb', line 16
def initialize(landmarks, direction, exhaustive=false)
  @landmarks=landmarks
  raise(ArgumentError, "Not a valid direction") unless [:forward, :back].include?(direction)
  @direction=direction
  @exhaustive=exhaustive
end
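The constructor contract can be tried out without loading the library. The following is a minimal sketch: `SketchRule` is a hypothetical stand-in that copies the validation from the listing above, so it is an illustration of the described behavior rather than the library class itself.

```ruby
# Hypothetical stand-in mirroring the constructor listing; not Ariel::Rule itself.
class SketchRule
  attr_reader :landmarks, :direction, :exhaustive

  def initialize(landmarks, direction, exhaustive = false)
    @landmarks = landmarks
    raise(ArgumentError, "Not a valid direction") unless [:forward, :back].include?(direction)
    @direction = direction
    @exhaustive = exhaustive
  end
end

# Two landmarks: the first has two features, the second has one.
rule = SketchRule.new([[:anything, "Example"], ["Test"]], :forward)

# Any direction other than :forward or :back is rejected.
begin
  SketchRule.new([["Test"]], :sideways)
  rejected = false
rescue ArgumentError
  rejected = true
end
```

With the library loaded, the same arguments would presumably be passed to Ariel::Rule.new.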
Instance Attribute Details
#direction ⇒ Object
Returns the value of attribute direction.
# File 'lib/ariel/rule.rb', line 8
def direction
  @direction
end
#exhaustive ⇒ Object
Returns the value of attribute exhaustive.
# File 'lib/ariel/rule.rb', line 8
def exhaustive
  @exhaustive
end
#landmarks ⇒ Object
Returns the value of attribute landmarks.
# File 'lib/ariel/rule.rb', line 8
def landmarks
  @landmarks
end
Class Method Details
.prepare_tokenstream(tokenstream, direction) ⇒ Object
Reverses the given tokenstream if necessary based on its current direction, and the direction given (corresponding to the sort of rule you hope to apply to it).
# File 'lib/ariel/rule.rb', line 121
def self.prepare_tokenstream(tokenstream, direction)
  if tokenstream.reversed?
    target=tokenstream if direction==:back
    target=tokenstream.reverse if direction==:forward
  elsif not tokenstream.reversed?
    target=tokenstream if direction==:forward
    target=tokenstream.reverse if direction==:back
  end
  target.rewind #rules are applied from the beginning of the stream
  return target
end
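The decision in the listing above reduces to "reverse only when the stream's orientation disagrees with the rule direction". The sketch below illustrates that with a hypothetical `SketchStream` class that implements only the three methods the listing calls (`reversed?`, `reverse`, `rewind`). The real TokenStream is not shown on this page, so its behavior is assumed here.

```ruby
# Hypothetical minimal stream; only the methods prepare_tokenstream relies on.
class SketchStream
  attr_reader :tokens, :position

  def initialize(tokens, reversed = false)
    @tokens = tokens
    @reversed = reversed
    @position = 0
  end

  def reversed?
    @reversed
  end

  # Returns a new stream with the tokens in the opposite order.
  def reverse
    SketchStream.new(@tokens.reverse, !@reversed)
  end

  def rewind
    @position = 0
    self
  end
end

# Same outcome as the listing for :forward and :back:
# a :back rule wants a reversed stream, a :forward rule an unreversed one.
def prepare(stream, direction)
  wants_reversed = (direction == :back)
  target = (stream.reversed? == wants_reversed) ? stream : stream.reverse
  target.rewind
  target
end

stream = SketchStream.new(%w[a b c])
forward_view = prepare(stream, :forward) # orientation already matches
back_view = prepare(stream, :back)       # reversed copy
```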
Instance Method Details
#==(rule) ⇒ Object Also known as: eql?
Two rules are equal if they have the same list of landmarks and the same direction.
# File 'lib/ariel/rule.rb', line 29
def ==(rule)
  return ((self.landmarks == rule.landmarks) && self.direction==rule.direction)
end
#apply_to(tokenstream) ⇒ Object
Given a TokenStream and a rule, applies the rule on the stream and returns an empty array if the match fails and an array of token_locs if the match succeeds. Yields a RuleMatchData Struct with accessors token_loc (the position of the match in the stream) and type if a block is given. type is nil if the TokenStream has no label, :perfect if all tokens up to the labeled token are consumed, :early if the rule’s final position is before the labeled token, and :late if it is after. The returned token_loc is the position in the stream as it was passed in. That is, the token_loc is always from the left of the given stream whether it is in a reversed state or not.
# File 'lib/ariel/rule.rb', line 74
def apply_to(tokenstream)
  target=self.class.prepare_tokenstream(tokenstream, @direction)
  cache_check=@@cache[[tokenstream.cache_hash, self.hash]]
  if cache_check
    token_locs=cache_check
  else
    token_locs=[]
    while result=seek_landmarks(target)
      token_locs << correct_match_location(tokenstream, result)
      break unless exhaustive?
    end
    @@cache[[tokenstream.cache_hash, self.hash]]=token_locs
  end
  if block_given?
    generate_match_data(target, token_locs).each {|md| yield md}
  end
  return token_locs
end
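The match types described above can be illustrated with a small classifier. This is only a sketch of one reading of the description: `classify_match` is a hypothetical helper (the library's private `generate_match_data` is not shown on this page), and it assumes a match is :perfect when its token_loc equals the label index.

```ruby
# Same shape as the Struct named in the constant summary.
RuleMatchData = Struct.new(:token_loc, :type)

# Hypothetical helper: assumes :perfect means the match lands exactly on the label.
def classify_match(token_loc, label_index)
  # Unlabeled streams carry no type information.
  return RuleMatchData.new(token_loc, nil) if label_index.nil?

  type = case token_loc <=> label_index
         when 0 then :perfect
         when -1 then :early
         else :late
         end
  RuleMatchData.new(token_loc, type)
end

# Three candidate match positions against a label at index 5.
results = [2, 5, 7].map { |loc| classify_match(loc, 5) }
types = results.map(&:type)
```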
#closest_match(tokenstream, preference = :none) ⇒ Object
Only used in rule learning on labeled tokenstreams. Needed to provide the match index most relevant to the currently labeled list item. A preference of :early or :late can be passed, which restricts the result to a token_loc before or after the stream’s label_index, respectively.
# File 'lib/ariel/rule.rb', line 113
def closest_match(tokenstream, preference=:none)
  token_locs=self.apply_to(tokenstream)
  return find_closest_match(token_locs, tokenstream.label_index)
end
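The private `find_closest_match` helper is not shown on this page, and the listing above does not pass `preference` on to it. The sketch below shows one way the documented behavior could look: `closest_loc` is a hypothetical function that filters candidates by preference and then picks the one nearest the label index. Treat it as an assumption, not as the library's algorithm.

```ruby
# Hypothetical helper; not the library's find_closest_match.
def closest_loc(token_locs, label_index, preference = :none)
  candidates = case preference
               when :early then token_locs.select { |loc| loc < label_index }
               when :late then token_locs.select { |loc| loc > label_index }
               else token_locs
               end
  # min_by returns nil when there are no candidates.
  candidates.min_by { |loc| (loc - label_index).abs }
end

locs = [1, 4, 9]
nearest = closest_loc(locs, 5)
nearest_early = closest_loc(locs, 5, :early)
nearest_late = closest_loc(locs, 5, :late)
```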
#deep_clone ⇒ Object
# File 'lib/ariel/rule.rb', line 43
def deep_clone
  Marshal::load(Marshal.dump(self))
end
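The Marshal round trip matters because landmarks are nested arrays, and a plain `dup` copies only the outer array. A short sketch on a bare landmark array (no library classes involved) shows the difference:

```ruby
landmarks = [[:anything, "Example"], ["Test"]]

shallow = landmarks.dup                      # copies only the outer array
deep = Marshal.load(Marshal.dump(landmarks)) # copies every nested object

# Mutate an inner array through each copy.
shallow[0][0] = :changed  # also visible through landmarks
deep[1][0] = "Changed"    # not visible through landmarks
```

After these two assignments, the original's first landmark has been altered through the shallow copy, while its second landmark is untouched by the change made through the deep copy.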
#exhaustive? ⇒ Boolean
# File 'lib/ariel/rule.rb', line 23
def exhaustive?
  @exhaustive
end
#generalise_feature(landmark_index, feature_index = 0) ⇒ Object
# File 'lib/ariel/rule.rb', line 47
def generalise_feature(landmark_index, feature_index=0)
  feature=self.landmarks[landmark_index][feature_index]
  alternates=[]
  Wildcards.matching(feature) do |wildcard|
    r=self.deep_clone
    r.landmarks[landmark_index][feature_index]=wildcard
    alternates << r
    yield r if block_given?
  end
  return alternates
end
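The listing above produces one alternate rule per wildcard that matches the chosen feature. The sketch below reproduces that pattern on bare landmark arrays. `SKETCH_WILDCARDS` and `generalise` are hypothetical: the real Wildcards module is not shown on this page, so the wildcard names and patterns here are assumptions chosen for illustration.

```ruby
# Hypothetical wildcard table; the library's actual wildcard list is not shown here.
SKETCH_WILDCARDS = {
  anything: /.+/,
  numeric: /\A\d+\z/,
  alpha: /\A[[:alpha:]]+\z/
}.freeze

# Returns one deep-copied landmark array per wildcard matching the feature.
def generalise(landmarks, landmark_index, feature_index = 0)
  feature = landmarks[landmark_index][feature_index]
  alternates = []
  SKETCH_WILDCARDS.each do |name, pattern|
    next unless feature.is_a?(String) && feature =~ pattern
    copy = Marshal.load(Marshal.dump(landmarks)) # leave the original untouched
    copy[landmark_index][feature_index] = name
    alternates << copy
    yield copy if block_given?
  end
  alternates
end

original = [["2024"], ["Test"]]
alternates = generalise(original, 0)
```

Here "2024" matches the assumed :anything and :numeric patterns but not :alpha, so two alternates are produced and `original` is left unchanged.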
#hash ⇒ Object
# File 'lib/ariel/rule.rb', line 34
def hash
  [@landmarks, @direction, @exhaustive].hash
end
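Note that the #== listing compares landmarks and direction only, while #hash also includes the exhaustive flag. The sketch below uses a hypothetical `SketchRule` stand-in that copies both listings to show the consequence: two rules differing only in exhaustive compare equal, yet their hash inputs differ, so they will generally be treated as distinct Hash keys (including keys of the @@cache used by #apply_to).

```ruby
# Hypothetical stand-in copying the #== and #hash listings; not Ariel::Rule itself.
class SketchRule
  attr_reader :landmarks, :direction, :exhaustive

  def initialize(landmarks, direction, exhaustive = false)
    @landmarks = landmarks
    @direction = direction
    @exhaustive = exhaustive
  end

  def ==(other)
    landmarks == other.landmarks && direction == other.direction
  end
  alias_method :eql?, :==

  def hash
    [@landmarks, @direction, @exhaustive].hash
  end
end

a = SketchRule.new([["Test"]], :forward)
b = SketchRule.new([["Test"]], :forward)
c = SketchRule.new([["Test"]], :forward, true)

same_rule = (a == b)          # identical landmarks and direction
same_hash = (a.hash == b.hash) # identical hash inputs
equal_despite_flag = (a == c) # exhaustive is ignored by ==
```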
#matches(tokenstream, *types) ⇒ Object
Returns true or false depending on whether the match of this rule on the given tokenstream is of any of the given types (could be a combination of :perfect, :early, :fail and :late). Only valid on streams with labels.
# File 'lib/ariel/rule.rb', line 96
def matches(tokenstream, *types)
  raise ArgumentError, "No match types given" if types.empty?
  raise ArgumentError, "Only applicable to tokenstreams containing a label" if tokenstream.label_index.nil?
  match = nil
  apply_to(tokenstream) {|md| match=md.type if md.type;}
  match = :fail if match.nil?
  if types.include? match
    return true
  else
    return false
  end
end
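The final steps of the listing above (treating a missing type as :fail, then testing membership in the requested types) can be isolated in a few lines. `type_matches?` is a hypothetical helper written for this sketch; it takes an already computed match type instead of a tokenstream.

```ruby
# Hypothetical helper isolating the membership test from #matches.
def type_matches?(match_type, *types)
  raise ArgumentError, "No match types given" if types.empty?
  # A nil type means no match was recorded, which counts as :fail.
  types.include?(match_type || :fail)
end

early_or_late = type_matches?(:early, :early, :late)
failed = type_matches?(nil, :fail)
perfect_only = type_matches?(:late, :perfect)
```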
#partial(range) ⇒ Object
Returns a rule that contains a given range of this rule's landmarks.
# File 'lib/ariel/rule.rb', line 39
def partial(range)
  return Rule.new(@landmarks[range], @direction)
end
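Since the listing slices @landmarks with the given range, ordinary Array range indexing applies. A short sketch on a bare landmark array, without any library classes:

```ruby
landmarks = [[:anything, "a"], ["b"], ["c"]]

first_two = landmarks[0..1]   # landmarks 0 and 1
last_two = landmarks[-2..-1]  # negative indices count from the end

# Slicing copies the outer array only; the inner landmark arrays are shared.
shares_inner = first_two[0].equal?(landmarks[0])
```

Two details follow from the listing itself: the exhaustive flag is not passed to Rule.new, so the partial rule falls back to the default of false, and the slice shares its inner arrays with the original rule.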
#wildcard_count ⇒ Object
Returns the number of wildcards included as features in the list of rule landmarks.
# File 'lib/ariel/rule.rb', line 61
def wildcard_count
  @landmarks.flatten.select {|feature| feature.kind_of? Symbol}.size
end
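The listing counts every Symbol feature as a wildcard. The same computation on a bare landmark array looks like this (the symbol names are arbitrary placeholders, not a claim about which wildcards the library defines):

```ruby
landmarks = [[:anything, "Example"], ["Test"], [:numeric]]

# Flatten the landmarks into a single feature list and count the Symbols.
wildcard_count = landmarks.flatten.count { |feature| feature.is_a?(Symbol) }
```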