Class: TwitterCldr::Segmentation::PossibleWord
- Inherits: Object
  - Object
    - TwitterCldr::Segmentation::PossibleWord
- Defined in: lib/twitter_cldr/segmentation/possible_word.rb
Constant Summary
- POSSIBLE_WORD_LIST_MAX = 20
  List size, limited by the maximum number of words in the dictionary that form a nested sequence.
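To make "nested sequence" concrete: words are nested when each one is a proper prefix of the next, so all of them can match at the same text position. The sketch below counts the longest such chain in a small made-up word list. The helper `longest_nested_chain` and the sample words are hypothetical and not part of TwitterCldr.

```ruby
# Hypothetical helper, not part of TwitterCldr: returns the length of the
# longest chain of words in which each word is a proper prefix of the next.
def longest_nested_chain(words)
  best = {}
  # Process shorter words first, so every proper prefix of a word has
  # already been scored by the time the word itself is reached.
  words.uniq.sort_by(&:length).each do |word|
    longest_so_far = 0
    (1...word.length).each do |len|
      prefix = word[0, len]
      longest_so_far = [longest_so_far, best[prefix]].max if best.key?(prefix)
    end
    best[word] = longest_so_far + 1
  end
  best.values.max || 0
end

# "a", "an", "ant", "ante", "antelope" nest inside each other; "bee" does not.
puts longest_nested_chain(%w[a an ant ante antelope bee])  # prints 5
```

For this list, up to five candidates can start at one position, so a candidate list of five entries would be enough; the constant plays the same role for the real dictionary.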
Instance Method Summary
- #accept_marked(cursor) ⇒ Object
  Select the currently marked candidate, point the cursor after it in the text, and invalidate self.
- #back_up(cursor) ⇒ Object
  Back up from the current candidate to the next shorter one; return true if one exists and point the cursor after it.
- #candidates(cursor, dictionary, end_pos) ⇒ Object
  Fill the list of candidates if needed, select the longest, and return the number found.
- #initialize ⇒ PossibleWord (constructor)
  A new instance of PossibleWord.
- #longest_prefix ⇒ Object
  Return the longest prefix this candidate location shares with a dictionary word.
- #mark_current ⇒ Object
  Mark the current candidate as the one we like.
Constructor Details
#initialize ⇒ PossibleWord
Returns a new instance of PossibleWord.
# File 'lib/twitter_cldr/segmentation/possible_word.rb', line 13
def initialize
  @lengths = []
  @count = nil
  @offset = -1
end
Instance Method Details
#accept_marked(cursor) ⇒ Object
Select the currently marked candidate, point the cursor after it in the text, and invalidate self. Returns the length of the marked candidate.
# File 'lib/twitter_cldr/segmentation/possible_word.rb', line 46
def accept_marked(cursor)
  cursor.position = @offset + @lengths[@mark]
  @lengths[@mark]
end
#back_up(cursor) ⇒ Object
Back up from the current candidate to the next shorter one. Returns true and points the cursor after that candidate if one exists; otherwise returns false and leaves the cursor unchanged.
# File 'lib/twitter_cldr/segmentation/possible_word.rb', line 53
def back_up(cursor)
  if @current > 0
    @current -= 1
    cursor.position = @offset + @lengths[@current]
    return true
  end

  false
end
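The backtracking step can be illustrated without the library. This is a minimal sketch, assuming a cursor that only needs a writable `position`; the `StepCursor` struct, the helper `back_up_positions`, and the candidate lengths are all made up for illustration, and local variables stand in for `@offset`, `@lengths`, and `@current`.

```ruby
# Stand-in for the cursor object; only `position` is needed here (assumption).
StepCursor = Struct.new(:position)

# Hypothetical helper: returns every cursor position visited when starting at
# the longest candidate and backing up until no shorter candidate remains.
# `lengths` is expected shortest to longest, as `candidates` assumes.
def back_up_positions(offset, lengths)
  return [] if lengths.empty?

  current = lengths.size - 1 # `candidates` selects the longest first
  cursor  = StepCursor.new(offset + lengths[current])
  visited = [cursor.position]

  # Same condition and update as back_up: stop once the shortest is reached.
  while current > 0
    current -= 1
    cursor.position = offset + lengths[current]
    visited << cursor.position
  end

  visited
end

# Three candidates of lengths 2, 4 and 7 starting at offset 10.
p back_up_positions(10, [2, 4, 7])  # prints [17, 14, 12]
```

Each step moves the cursor to the end of a shorter candidate, which is what lets a caller retry segmentation from an earlier boundary.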
#candidates(cursor, dictionary, end_pos) ⇒ Object
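Fill the list of candidates if needed, select the longest, and return the number found. The dictionary lookup runs only when the cursor position differs from the offset of the previous call; on return the cursor points after the longest candidate, or at its starting position if no candidate was found.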
# File 'lib/twitter_cldr/segmentation/possible_word.rb', line 20
def candidates(cursor, dictionary, end_pos)
  start = cursor.position

  if start != @offset
    @offset = start

    @count, _, @lengths, @prefix = dictionary.matches(
      cursor, end_pos - start, POSSIBLE_WORD_LIST_MAX
    )

    # dictionary leaves text after longest prefix, not longest word, so back up.
    if @count <= 0
      cursor.position = start
    end
  end

  if @count > 0
    cursor.position = start + @lengths[@count - 1]
  end

  @current = @count - 1
  @mark = @current
  return @count
end
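The cursor handling in `candidates` is easier to follow with a runnable stand-in. The sketch below is not the library's dictionary or cursor: `DemoCursor`, `DemoDictionary`, `DemoPossibleWord`, and `demo_candidates` are hypothetical, and the four-element return value of `matches` is inferred only from the destructuring in the listing above. `DemoPossibleWord` copies the logic of `candidates` so that the example runs without TwitterCldr installed.

```ruby
# Hypothetical stand-ins; the real cursor and dictionary behaviour is assumed.
DemoCursor = Struct.new(:text, :position)

class DemoDictionary
  def initialize(words)
    @words = words
  end

  # Returns [count, ignored, lengths, prefix], matching the destructuring in
  # `candidates`. Lengths are sorted shortest to longest (assumption).
  def matches(cursor, max_length, limit)
    rest    = cursor.text[cursor.position, max_length]
    lengths = @words.select { |w| rest.start_with?(w) }.map(&:length).sort.first(limit)
    prefix  = @words.map { |w| common_prefix_length(w, rest) }.max || 0
    cursor.position += prefix # leave the cursor after the longest prefix
    [lengths.size, nil, lengths, prefix]
  end

  private

  def common_prefix_length(a, b)
    a.chars.zip(b.chars).take_while { |x, y| x == y }.size
  end
end

# Local copy of the logic shown in the listing above.
class DemoPossibleWord
  POSSIBLE_WORD_LIST_MAX = 20

  def initialize
    @lengths = []
    @count = nil
    @offset = -1
  end

  def candidates(cursor, dictionary, end_pos)
    start = cursor.position
    if start != @offset
      @offset = start
      @count, _, @lengths, @prefix = dictionary.matches(
        cursor, end_pos - start, POSSIBLE_WORD_LIST_MAX
      )
      cursor.position = start if @count <= 0
    end
    cursor.position = start + @lengths[@count - 1] if @count > 0
    @current = @count - 1
    @mark = @current
    @count
  end

  def longest_prefix
    @prefix
  end
end

# Returns [number of candidates, cursor position afterwards, longest prefix].
def demo_candidates(text, words)
  cursor = DemoCursor.new(text, 0)
  word   = DemoPossibleWord.new
  found  = word.candidates(cursor, DemoDictionary.new(words), text.length)
  [found, cursor.position, word.longest_prefix]
end

p demo_candidates('antenxyz', %w[an ant antelope antenna])  # prints [2, 3, 5]
p demo_candidates('axe', %w[an ant])                        # prints [0, 0, 1]
```

In the first call the stand-in dictionary leaves the cursor after the five-character prefix "anten", and `candidates` moves it back to the end of the longest complete word, "ant". In the second call nothing matches, so the cursor returns to its starting position.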
#longest_prefix ⇒ Object
Return the longest prefix this candidate location shares with a dictionary word, as recorded by the most recent dictionary lookup in #candidates.
# File 'lib/twitter_cldr/segmentation/possible_word.rb', line 64
def longest_prefix
  @prefix
end
#mark_current ⇒ Object
Mark the current candidate as the one we like, so that a later #accept_marked selects it.
# File 'lib/twitter_cldr/segmentation/possible_word.rb', line 69
def mark_current
  @mark = @current
end