Module: Stamina::Automaton::Walking

Included in:
Stamina::Automaton
Defined in:
lib/stamina-core/stamina/automaton/walking.rb

Overview

Provides useful automaton walking methods. This module is automatically included in Automaton and is not intended to be used directly.

Examples

# Building an automaton for the regular language a(ba)*
s0, s1 = nil
fa = Automaton.new do
  s0 = add_state(:initial => true)
  s1 = add_state(:accepting => true)
  connect(0,1,'a')
  connect(1,0,'b')
end

# some examples with reached
fa.dfa_reached('? a b')      # -> s0     dfa variant method
fa.dfa_reached('? a a')      # -> nil
fa.dfa_reached('? b a', s1)  # -> s1     from an explicit init state

fa.reached('? a b')          # -> [s0]   generic method on automaton
fa.reached('? a a')          # -> []
fa.reached('? b a', s1)      # -> [s1]

# some examples with split (the most powerful one!)
fa.dfa_split('? a b a b')    # [['a','b','a','b'], s0, []]
fa.dfa_split('? a b b a')    # [['a','b'], s0, ['b','a']]

fa.split('? a b a b')        # [['a','b','a','b'], [s0], []]
fa.split('? a b b a')        # [['a','b'], [s0], ['b','a']]

# All of this works on non-deterministic automata as well (and epsilon
# symbols are taken into account), but you'll probably need to read
# the following section to master the power of this module in this case!

Using this module

This section fully details the design choices that has been made for the implementation of the Walking module used by Stamina on Automaton. It is provided because Walking is one of the core classes of Stamina, that probably all users (and contributors) will use. Walking usage is really user-friendly, so you are normally not required to read this section in the first place ! Read it only if of interest for you, or if you experiment unexpected results.

Methods defined by this module respect common conventions that you must be aware of:

Generic methods vs. dfa variants

The convention is simple: methods whose name starts with ‘dfa_’ are expected to be used on deterministic automata only (that is, automata answering true to the deterministic? method invocation). We refer to those methods as ‘dfa variants’. Strange results may occur if invoked on non-deterministic automata. Other methods are called ‘generic methods’ and can be used on any automaton. Generic methods and dfa variants sometimes use different conventions according to arguments and returned values, as explained below.

Argument conventions

  • all methods taking a symbol argument expect it to be a valid instance of the class used for representing input symbols on edges of your automaton (that is, the mark you’ve installed under :symbol key on the edge, see Automaton documentation for details).

  • all methods taking an input argument support the following objects for it:

    • InputString instance: an real input string typically coming from a Sample.

    • Array of symbols: where by symbol, we mean an input symbol as explained above (and not a Ruby Symbol instance). The array is never modified by the methods, so that you don’t have to worry about where this array comes from.

    • String (a real Ruby one): in this case, the input is expected to be an ADL input string, which is parsed using ADL::parse_string. Note that ‘a b a b’ is NOT a valid ADL input string, so that you typically have to use the ‘?’ sign to indicate that the tested string is basically unlabeled.

  • all methods taking a from argument support the following objects for it:

    • ommited: from is interpreted as the set of initial states by generic methods and the last rule applies. from is interpreted as the unique initial state of the deterministic automaton by dfa method variants (dfa_xxx), and the third rule applies.

    • Integer: from is interpreted as a state index, and the next rule applies on the index-th state of the automaton.

    • State: from is interpreted by the generic methods as a singleton set containing the state and the last rule applies. Deterministic method variants interpret it as the start state from which the walk must start. In this case, they always return a State instance (or nil) instead of an array of states.

    • Array: from is interpreted as a set of states (duplicates are supported so it’s actually a bag) from which the walk must start. Indexes of states are also supported, see Automaton documentation about indexes.

Returned value conventions

Moreover, (unless stated explicitely) all methods returning states as (part of) their returned value respect the following return conventions (which somewhat summarizes the from conventions above):

  • generic methods always return an array of states (without duplicates) which can be modified. This array is never sorted by state index. To insist: even when invoked on a deterministic automaton with a State argument as from, they will return an array of states as show by the code excerpt below. Lastly, the returned value is never nil, but an empty array may be returned when it makes sense (no reached states for example).

    fa = Automaton.new do ... end     # build a(ba)* automaton
    s0 = fa.initial_state
    fa.reached('? a b a b', s0)         # returns [s0], not s0 !
    
  • dfa variant methods respond to your query using the same language as you: if from is ommitted, is a State or an Integer, the the result will be a single State instance, or nil if it makes sense (no reached state for example). Otherwise, they behaves exactly as generic methods (always return an array of states, …)

Epsilon symbol aware methods

Stamina does not allow epsilon symbols on deterministic automata; thus, this subsection only applies to generic methods.

Methods documented as ‘epsilon aware’ (almost all generic methods) always take epsilon symbols into account in their computations (Stamina uses nil as epsilon symbol, by convention), in a natural way. For example:

fa = Automaton.new do ... end    # build a non-deterministic automaton
                                 # with epsilon symbols

# the line below computes the set of reached states
# (from the set of initial states) by walking the dfa
# with a string.
#
# The actual computation is in fact the set of reached
# states with the string (regex) 'eps* a eps* b eps*',
# where eps is the epsilon symbol.
reached = fa.reached('? a b')

Detailed API

Instance Method Summary collapse

Instance Method Details

#accepts?(input, from = nil) ⇒ Boolean Also known as: accept?

Checks if the automaton accepts an input string. Returns true if at least one accepting state can be reached, false otherwise.

Returns:

  • (Boolean)


299
300
301
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 299

def accepts?(input, from=nil)
  not reached(input,from).select{|s| s.accepting? and not s.error?}.empty?
end

#delta(from, symbol) ⇒ Object

Computes an array representing the set of states that can be reached from from states with the given input symbol.

This method is epsilon symbol aware (represented with nil) on non deterministic automata, meaning that it actually computes the set of reachable states through strings respecting the eps* symbol eps* regular expression, where eps is the epsilon symbol.



160
161
162
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 160

def delta(from, symbol)
  walking_to_from(from).map{|s| s.delta(symbol)}.flatten.uniq
end

#dfa_delta(from, symbol) ⇒ Object

Returns the target state (or the target states, according to from conventions) that can be reached from from states with a given input symbol. Returns nil (or an empty array, according to the same conventions) if no such state exists.



170
171
172
173
174
175
176
177
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 170

def dfa_delta(from, symbol)
  if from.is_a?(Automaton::State)
    from.dfa_delta(symbol)
  else
    delta = walking_to_from(from).map{|s| s.dfa_delta(symbol)}.flatten.uniq
    walking_to_dfa_result(delta, from)
  end
end

#dfa_reached(input, from = nil) ⇒ Object

Same as reached, respecting dfa conventions.



281
282
283
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 281

def dfa_reached(input, from=nil)
  walking_to_dfa_result(reached(input,from),from)
end

#dfa_split(input, from = nil) ⇒ Object

Same as split, respecting dfa conventions.



225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 225

def dfa_split(input, from=nil)
  from = initial_state if from.nil?
  from = ith_state(from) if from.is_a?(Integer)
  if from.is_a?(Automaton::State)
    # the three elements of the triplet
    parsed = []
    reached = from
    remaining = walking_to_modifiable_symbols(input)

    # walk now
    until remaining.empty?
      symb = remaining[0]
      next_reached = reached.dfa_delta(symb)

      # stop it if no reached state
      break if next_reached.nil?

      # otherwise, update triplet
      parsed << remaining.shift
      reached = next_reached
    end
    [parsed, reached, remaining]
  else
    # the three elements of the triplet
    parsed = []
    reached = walking_to_from(from)
    remaining = walking_to_modifiable_symbols(input)

    # walk now
    until remaining.empty?
      symb = remaining[0]
      next_reached = dfa_delta(reached, symb)

      # stop it if no reached state
      break if next_reached.nil? or next_reached.empty?

      # otherwise, update triplet
      parsed << remaining.shift
      reached = next_reached
    end
    [parsed, walking_to_dfa_result(reached, from), remaining]
  end
end

#dfa_step(from, symbol) ⇒ Object

Returns the state reached from from states with an input symbol. Returns nil or the empty array (according to from conventions) if no state can be reached with the given symbol.



146
147
148
149
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 146

def dfa_step(from, symbol)
  step = walking_to_from(from).map{|s| s.dfa_step(symbol)}.flatten.uniq
  walking_to_dfa_result(step, from)
end

#label_of(str) ⇒ Object

Returns ‘1’ if the string is accepted by the automaton, ‘0’ otherwise.



315
316
317
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 315

def label_of(str)
  accepts?(str) ? '1' : '0'
end

#parses?(input, from = nil) ⇒ Boolean Also known as: parse?

Checks if the automaton is able to parse an input string. Returns true if at least one state can be reached, false otherwise. Unlike accepts?, the labeling of the reached state does not count.

Returns:

  • (Boolean)


290
291
292
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 290

def parses?(input, from=nil)
  not(reached(input,from).empty?)
end

#reached(input, from = nil) ⇒ Object

Walks the automaton with an input string, starting at states from, collects the set of all reached states and returns it. Unlike split, returned array is empty if the string is not parsable by the automaton. This method is epsilon symbol aware.



275
276
277
278
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 275

def reached(input, from=nil)
  parsed, reached, remaining = split(input, from)
  remaining.empty? ? reached : []
end

#rejects?(input, from = nil) ⇒ Boolean Also known as: reject?

Checks if the automaton rejects an input string. Returns true if no accepting state can be reached, false otherwise.

Returns:

  • (Boolean)


308
309
310
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 308

def rejects?(input, from=nil)
  not(accepts?(input, from))
end

#split(input, from = nil, sort = false) ⇒ Object

Splits a given input and returns a triplet [parsed,reached,remaining] where parsed is an array of parsed symbols, reached is the set of reached states with the parsed input string and remaining is an array of symbols with the unparsable part of the string. This method is epsilon symbol aware.

By construction, the following properties are verified:

  • parsed + remaining == input (assuming input is an array of symbols), which means that atring concatenation of parsed and remaining symbols is is the input string.

  • reached.empty? == false, because at least initial states (or from if provided) are reached.

  • remaining.empty? == parses?(input,from), meaning that the automaton parses the whole input if there is no remaining symol.

  • delta(reached, remaining[0]).empty? unless remaining.empty?, which express the splitting stop condition: splitting continues while at least one state can be reached with the next symbol.



197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 197

def split(input, from=nil, sort=false)
  if deterministic?
    parsed, reached, remaining = dfa_split(input, from)
    [parsed, walking_from_dfa_to_nfa_result(reached), remaining]
  else
    # the three elements of the triplet
    parsed = []
    reached = walking_to_from(from)
    remaining = walking_to_modifiable_symbols(input)

    # walk now
    until remaining.empty?
      symb = remaining[0]
      next_reached = delta(reached, symb)

      # stop it if no reached state
      break if next_reached.empty?

      # otherwise, update triplet
      parsed << remaining.shift
      reached = next_reached
    end
    reached.sort! if sort
    [parsed, reached, remaining]
  end
end

#step(from, symbol) ⇒ Object

Returns reachable states from from states with an input symbol. This method is not epsilon symbol aware.



136
137
138
139
# File 'lib/stamina-core/stamina/automaton/walking.rb', line 136

def step(from, symbol)
  from = walking_to_from(from)
  from.map{|s| s.step(symbol)}.flatten.uniq
end