Module: Stamina::Automaton::Walking
- Included in:
- Stamina::Automaton
- Defined in:
- lib/stamina-core/stamina/automaton/walking.rb
Overview
Provides useful automaton walking methods. This module is automatically included in Automaton and is not intended to be used directly.
Examples
# Building an automaton for the regular language a(ba)*
s0, s1 = nil
fa = Automaton.new do
s0 = add_state(:initial => true)
s1 = add_state(:accepting => true)
connect(0,1,'a')
connect(1,0,'b')
end
# some examples with reached
fa.dfa_reached('? a b') # -> s0 dfa variant method
fa.dfa_reached('? a a') # -> nil
fa.dfa_reached('? b a', s1) # -> s1 from an explicit init state
fa.reached('? a b') # -> [s0] generic method on automaton
fa.reached('? a a') # -> []
fa.reached('? b a', s1) # -> [s1]
# some examples with split (the most powerful one!)
fa.dfa_split('? a b a b') # [['a','b','a','b'], s0, []]
fa.dfa_split('? a b b a') # [['a','b'], s0, ['b','a']]
fa.split('? a b a b') # [['a','b','a','b'], [s0], []]
fa.split('? a b b a') # [['a','b'], [s0], ['b','a']]
# All of this works on non-deterministic automata as well (and epsilon
# symbols are taken into account), but you'll probably need to read
# the following section to master the power of this module in this case!
Using this module
This section fully details the design choices that has been made for the implementation of the Walking module used by Stamina on Automaton. It is provided because Walking is one of the core classes of Stamina, that probably all users (and contributors) will use. Walking usage is really user-friendly, so you are normally not required to read this section in the first place ! Read it only if of interest for you, or if you experiment unexpected results.
Methods defined by this module respect common conventions that you must be aware of:
Generic methods vs. dfa variants
The convention is simple: methods whose name starts with ‘dfa_’ are expected to be used on deterministic automata only (that is, automata answering true to the deterministic? method invocation). We refer to those methods as ‘dfa variants’. Strange results may occur if invoked on non-deterministic automata. Other methods are called ‘generic methods’ and can be used on any automaton. Generic methods and dfa variants sometimes use different conventions according to arguments and returned values, as explained below.
Argument conventions
-
all methods taking a symbol argument expect it to be a valid instance of the class used for representing input symbols on edges of your automaton (that is, the mark you’ve installed under :symbol key on the edge, see Automaton documentation for details).
-
all methods taking an input argument support the following objects for it:
-
InputString instance: an real input string typically coming from a Sample.
-
Array of symbols: where by symbol, we mean an input symbol as explained above (and not a Ruby Symbol instance). The array is never modified by the methods, so that you don’t have to worry about where this array comes from.
-
String (a real Ruby one): in this case, the input is expected to be an ADL input string, which is parsed using ADL::parse_string. Note that ‘a b a b’ is NOT a valid ADL input string, so that you typically have to use the ‘?’ sign to indicate that the tested string is basically unlabeled.
-
-
all methods taking a from argument support the following objects for it:
-
ommited: from is interpreted as the set of initial states by generic methods and the last rule applies. from is interpreted as the unique initial state of the deterministic automaton by dfa method variants (
dfa_xxx
), and the third rule applies. -
Integer: from is interpreted as a state index, and the next rule applies on the index-th state of the automaton.
-
State: from is interpreted by the generic methods as a singleton set containing the state and the last rule applies. Deterministic method variants interpret it as the start state from which the walk must start. In this case, they always return a State instance (or nil) instead of an array of states.
-
Array: from is interpreted as a set of states (duplicates are supported so it’s actually a bag) from which the walk must start. Indexes of states are also supported, see Automaton documentation about indexes.
-
Returned value conventions
Moreover, (unless stated explicitely) all methods returning states as (part of) their returned value respect the following return conventions (which somewhat summarizes the from conventions above):
-
generic methods always return an array of states (without duplicates) which can be modified. This array is never sorted by state index. To insist: even when invoked on a deterministic automaton with a State argument as from, they will return an array of states as show by the code excerpt below. Lastly, the returned value is never nil, but an empty array may be returned when it makes sense (no reached states for example).
fa = Automaton.new do ... end # build a(ba)* automaton s0 = fa.initial_state fa.reached('? a b a b', s0) # returns [s0], not s0 !
-
dfa variant methods respond to your query using the same language as you: if from is ommitted, is a State or an Integer, the the result will be a single State instance, or nil if it makes sense (no reached state for example). Otherwise, they behaves exactly as generic methods (always return an array of states, …)
Epsilon symbol aware methods
Stamina does not allow epsilon symbols on deterministic automata; thus, this subsection only applies to generic methods.
Methods documented as ‘epsilon aware’ (almost all generic methods) always take epsilon symbols into account in their computations (Stamina uses nil as epsilon symbol, by convention), in a natural way. For example:
fa = Automaton.new do ... end # build a non-deterministic automaton
# with epsilon symbols
# the line below computes the set of reached states
# (from the set of initial states) by walking the dfa
# with a string.
#
# The actual computation is in fact the set of reached
# states with the string (regex) 'eps* a eps* b eps*',
# where eps is the epsilon symbol.
reached = fa.reached('? a b')
Detailed API
Instance Method Summary collapse
-
#accepts?(input, from = nil) ⇒ Boolean
(also: #accept?)
Checks if the automaton accepts an input string.
-
#delta(from, symbol) ⇒ Object
Computes an array representing the set of states that can be reached from from states with the given input symbol.
-
#dfa_delta(from, symbol) ⇒ Object
Returns the target state (or the target states, according to from conventions) that can be reached from from states with a given input symbol.
-
#dfa_reached(input, from = nil) ⇒ Object
Same as reached, respecting dfa conventions.
-
#dfa_split(input, from = nil) ⇒ Object
Same as split, respecting dfa conventions.
-
#dfa_step(from, symbol) ⇒ Object
Returns the state reached from from states with an input symbol.
-
#label_of(str) ⇒ Object
Returns ‘1’ if the string is accepted by the automaton, ‘0’ otherwise.
-
#parses?(input, from = nil) ⇒ Boolean
(also: #parse?)
Checks if the automaton is able to parse an input string.
-
#reached(input, from = nil) ⇒ Object
Walks the automaton with an input string, starting at states from, collects the set of all reached states and returns it.
-
#rejects?(input, from = nil) ⇒ Boolean
(also: #reject?)
Checks if the automaton rejects an input string.
-
#split(input, from = nil, sort = false) ⇒ Object
Splits a given input and returns a triplet
[parsed,reached,remaining]
where parsed is an array of parsed symbols, reached is the set of reached states with the parsed input string and remaining is an array of symbols with the unparsable part of the string. -
#step(from, symbol) ⇒ Object
Returns reachable states from from states with an input symbol.
Instance Method Details
#accepts?(input, from = nil) ⇒ Boolean Also known as: accept?
Checks if the automaton accepts an input string. Returns true if at least one accepting state can be reached, false otherwise.
299 300 301 |
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 299 def accepts?(input, from=nil) not reached(input,from).select{|s| s.accepting? and not s.error?}.empty? end |
#delta(from, symbol) ⇒ Object
Computes an array representing the set of states that can be reached from from states with the given input symbol.
This method is epsilon symbol aware (represented with nil) on non deterministic automata, meaning that it actually computes the set of reachable states through strings respecting the eps* symbol eps*
regular expression, where eps is the epsilon symbol.
160 161 162 |
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 160 def delta(from, symbol) walking_to_from(from).map{|s| s.delta(symbol)}.flatten.uniq end |
#dfa_delta(from, symbol) ⇒ Object
Returns the target state (or the target states, according to from conventions) that can be reached from from states with a given input symbol. Returns nil (or an empty array, according to the same conventions) if no such state exists.
170 171 172 173 174 175 176 177 |
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 170 def dfa_delta(from, symbol) if from.is_a?(Automaton::State) from.dfa_delta(symbol) else delta = walking_to_from(from).map{|s| s.dfa_delta(symbol)}.flatten.uniq walking_to_dfa_result(delta, from) end end |
#dfa_reached(input, from = nil) ⇒ Object
Same as reached, respecting dfa conventions.
281 282 283 |
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 281 def dfa_reached(input, from=nil) walking_to_dfa_result(reached(input,from),from) end |
#dfa_split(input, from = nil) ⇒ Object
Same as split, respecting dfa conventions.
225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 |
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 225 def dfa_split(input, from=nil) from = initial_state if from.nil? from = ith_state(from) if from.is_a?(Integer) if from.is_a?(Automaton::State) # the three elements of the triplet parsed = [] reached = from remaining = walking_to_modifiable_symbols(input) # walk now until remaining.empty? symb = remaining[0] next_reached = reached.dfa_delta(symb) # stop it if no reached state break if next_reached.nil? # otherwise, update triplet parsed << remaining.shift reached = next_reached end [parsed, reached, remaining] else # the three elements of the triplet parsed = [] reached = walking_to_from(from) remaining = walking_to_modifiable_symbols(input) # walk now until remaining.empty? symb = remaining[0] next_reached = dfa_delta(reached, symb) # stop it if no reached state break if next_reached.nil? or next_reached.empty? # otherwise, update triplet parsed << remaining.shift reached = next_reached end [parsed, walking_to_dfa_result(reached, from), remaining] end end |
#dfa_step(from, symbol) ⇒ Object
Returns the state reached from from states with an input symbol. Returns nil or the empty array (according to from conventions) if no state can be reached with the given symbol.
146 147 148 149 |
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 146 def dfa_step(from, symbol) step = walking_to_from(from).map{|s| s.dfa_step(symbol)}.flatten.uniq walking_to_dfa_result(step, from) end |
#label_of(str) ⇒ Object
Returns ‘1’ if the string is accepted by the automaton, ‘0’ otherwise.
315 316 317 |
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 315 def label_of(str) accepts?(str) ? '1' : '0' end |
#parses?(input, from = nil) ⇒ Boolean Also known as: parse?
Checks if the automaton is able to parse an input string. Returns true if at least one state can be reached, false otherwise. Unlike accepts?, the labeling of the reached state does not count.
290 291 292 |
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 290 def parses?(input, from=nil) not(reached(input,from).empty?) end |
#reached(input, from = nil) ⇒ Object
Walks the automaton with an input string, starting at states from, collects the set of all reached states and returns it. Unlike split, returned array is empty if the string is not parsable by the automaton. This method is epsilon symbol aware.
275 276 277 278 |
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 275 def reached(input, from=nil) parsed, reached, remaining = split(input, from) remaining.empty? ? reached : [] end |
#rejects?(input, from = nil) ⇒ Boolean Also known as: reject?
Checks if the automaton rejects an input string. Returns true if no accepting state can be reached, false otherwise.
308 309 310 |
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 308 def rejects?(input, from=nil) not(accepts?(input, from)) end |
#split(input, from = nil, sort = false) ⇒ Object
Splits a given input and returns a triplet [parsed,reached,remaining]
where parsed is an array of parsed symbols, reached is the set of reached states with the parsed input string and remaining is an array of symbols with the unparsable part of the string. This method is epsilon symbol aware.
By construction, the following properties are verified:
-
parsed + remaining == input
(assuming input is an array of symbols), which means that atring concatenation of parsed and remaining symbols is is the input string. -
reached.empty? == false
, because at least initial states (or from if provided) are reached. -
remaining.empty? == parses?(input,from)
, meaning that the automaton parses the whole input if there is no remaining symol. -
delta(reached, remaining[0]).empty? unless remaining.empty?
, which express the splitting stop condition: splitting continues while at least one state can be reached with the next symbol.
197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 |
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 197 def split(input, from=nil, sort=false) if deterministic? parsed, reached, remaining = dfa_split(input, from) [parsed, walking_from_dfa_to_nfa_result(reached), remaining] else # the three elements of the triplet parsed = [] reached = walking_to_from(from) remaining = walking_to_modifiable_symbols(input) # walk now until remaining.empty? symb = remaining[0] next_reached = delta(reached, symb) # stop it if no reached state break if next_reached.empty? # otherwise, update triplet parsed << remaining.shift reached = next_reached end reached.sort! if sort [parsed, reached, remaining] end end |
#step(from, symbol) ⇒ Object
Returns reachable states from from states with an input symbol. This method is not epsilon symbol aware.
136 137 138 139 |
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 136 def step(from, symbol) from = walking_to_from(from) from.map{|s| s.step(symbol)}.flatten.uniq end |