Class: Kleene::BatchMatchTracker

Inherits:
Object
  • Object
show all
Defined in:
lib/kleene/multi_match_dfa.rb

Instance Attribute Summary collapse

Instance Method Summary collapse

Constructor Details

#initializeBatchMatchTracker

Returns a new instance of BatchMatchTracker.



253
254
255
# File 'lib/kleene/multi_match_dfa.rb', line 253

def initialize
  reset
end

Instance Attribute Details

#candidate_match_start_positionsObject

The NFA keys in the following two structures are not the original NFAs supplied to the MultiMatchDFA. They are the original NFAs that have been augmented with a dead end error state, so the keys are objects that are the internal state of a MultiMatchDFA



234
235
236
# File 'lib/kleene/multi_match_dfa.rb', line 234

def candidate_match_start_positions
  @candidate_match_start_positions
end

#empty_matchesObject

: Hash(NFA, Array(Int32)) # NFA -> Array(IndexPositionOfEmptyMatch)



246
247
248
# File 'lib/kleene/multi_match_dfa.rb', line 246

def empty_matches
  @empty_matches
end

#match_end_positionsObject

The end positions are indices at which, after handling the character, the DFA was observed to be in a match/accept state;

however, the interpretation is ambiguous, because the accepting state may be as a result of (1) transitioning to an error state that is also marked final/accepting,
OR it may be as a result of transitioning to (2) a non-error final state.
In the case of (1), the match may be an empty match, where after transitioning to an error state, the DFA is in a state that
is equivalent to the error state and start state and final state (e.g. as in an optional or kleene star DFA),
while in the case of (2), the match may be a "normal" match.
The ambiguity is problematic because it isn't clear whether the index position of the match is end inclusive end of a match
or the beginning of an empty match.
This ambiguity is all due to the construction of the composite DFA in the MultiMatchDFA - the dead end error states are epsilon-transitioned
to the composite DFA's start state.


245
246
247
# File 'lib/kleene/multi_match_dfa.rb', line 245

def match_end_positions
  @match_end_positions
end

#matchesObject

The NFA keys in the following structure are the original NFAs supplied to the MultiMatchDFA. This is in contrast to the augmented NFAs that are used as keys in the candidate_match_start_positions and match_end_positions structures, documented above ^^^.



251
252
253
# File 'lib/kleene/multi_match_dfa.rb', line 251

def matches
  @matches
end

Instance Method Details

#add_empty_match(nfa_with_dead_end, token_index) ⇒ Object



293
294
295
296
# File 'lib/kleene/multi_match_dfa.rb', line 293

def add_empty_match(nfa_with_dead_end, token_index)
  positions = empty_match_positions(nfa_with_dead_end)
  positions << token_index
end

#add_end_of_match(nfa_with_dead_end, token_index) ⇒ Object

the end positions are inclusive of the index of the last character matched, so empty matches are not accounted for in the match_end_positions array



287
288
289
290
291
# File 'lib/kleene/multi_match_dfa.rb', line 287

def add_end_of_match(nfa_with_dead_end, token_index)
  # puts "add_end_of_match(#{nfa.object_id}, #{token_index})"
  positions = end_positions(nfa_with_dead_end)
  positions << token_index
end

#add_match(nfa, match) ⇒ Object



309
310
311
312
# File 'lib/kleene/multi_match_dfa.rb', line 309

def add_match(nfa, match)
  matches = matches_for(nfa)
  matches << match
end

#add_start_of_candidate_match(nfa_with_dead_end, token_index) ⇒ Object



280
281
282
283
284
# File 'lib/kleene/multi_match_dfa.rb', line 280

def add_start_of_candidate_match(nfa_with_dead_end, token_index)
  # puts "add_start_of_candidate_match(#{nfa.object_id}, #{token_index})"
  positions = start_positions(nfa_with_dead_end)
  positions << token_index
end

#empty_match_positions(nfa) ⇒ Object



272
273
274
# File 'lib/kleene/multi_match_dfa.rb', line 272

def empty_match_positions(nfa)
  empty_matches[nfa] ||= Array.new
end

#end_positions(nfa) ⇒ Object



268
269
270
# File 'lib/kleene/multi_match_dfa.rb', line 268

def end_positions(nfa)
  match_end_positions[nfa] ||= Array.new
end

#invert_candidate_match_start_positionsObject

: Hash(Int32, Array(NFA))



298
299
300
301
302
303
304
305
306
307
# File 'lib/kleene/multi_match_dfa.rb', line 298

def invert_candidate_match_start_positions # : Hash(Int32, Array(NFA))
  index_to_nfas = Hash.new
  candidate_match_start_positions.each do |nfa_with_dead_end, indices|
    indices.each do |index|
      nfas = index_to_nfas[index] ||= Array.new
      nfas << nfa_with_dead_end
    end
  end
  index_to_nfas
end

#matches_for(nfa) ⇒ Object



276
277
278
# File 'lib/kleene/multi_match_dfa.rb', line 276

def matches_for(nfa)
  matches[nfa] ||= Array.new
end

#resetObject



257
258
259
260
261
262
# File 'lib/kleene/multi_match_dfa.rb', line 257

def reset
  @candidate_match_start_positions = Hash.new
  @match_end_positions = Hash.new
  @empty_matches = Hash.new
  @matches = Hash.new
end

#start_positions(nfa) ⇒ Object



264
265
266
# File 'lib/kleene/multi_match_dfa.rb', line 264

def start_positions(nfa)
  candidate_match_start_positions[nfa] ||= Array.new
end