Class: RMMSeg::Ferret::Tokenizer

Inherits:
Ferret::Analysis::TokenStream
  • Object
Defined in:
lib/rmmseg/ferret.rb

Overview

The Tokenizer tokenizes text with RMMSeg::Algorithm.

Instance Method Summary

Constructor Details

#initialize(str) ⇒ Tokenizer

Creates a new Tokenizer for the given text.



# File 'lib/rmmseg/ferret.rb', line 35

def initialize(str)
  self.text = str
end

Instance Method Details

#next ⇒ Object

Returns the next token as a Ferret::Analysis::Token, or nil when there are no more tokens.



# File 'lib/rmmseg/ferret.rb', line 40

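def next
  # Ask the segmentation algorithm for its next token.
  tk = @algor.next_token
  if tk.nil?
    # No tokens left: signal end of stream.
    nil
  else
    # Wrap the RMMSeg token in a Ferret token, keeping its text and offsets.
    ::Ferret::Analysis::Token.new(tk.text, tk.start_pos, tk.end_pos)
  end
end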
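
Because #next returns nil once the input is exhausted, a caller can drain the stream with a plain loop. The sketch below only illustrates that contract. `StubToken`, `StubStream`, and `drain` are hypothetical names, not part of RMMSeg or Ferret, and the stub splits on whitespace rather than running the real segmentation algorithm, so the example runs without either gem installed.

```ruby
# Hypothetical stand-in for a token: the same three fields that
# Tokenizer#next passes to Ferret::Analysis::Token.new.
StubToken = Struct.new(:text, :start_pos, :end_pos)

# Hypothetical stand-in for RMMSeg::Ferret::Tokenizer. It splits on
# whitespace (NOT the RMMSeg algorithm) purely to show the protocol.
class StubStream
  def initialize(str)
    @tokens = []
    str.scan(/\S+/) do |word|
      start = Regexp.last_match.begin(0) # character offset of this match
      @tokens << StubToken.new(word, start, start + word.length)
    end
  end

  # Same contract as Tokenizer#next: a token, or nil at end of input.
  def next
    @tokens.shift
  end
end

# Collect every token from any object that follows the #next contract.
def drain(stream)
  tokens = []
  while (tk = stream.next)
    tokens << tk
  end
  tokens
end

tokens = drain(StubStream.new("hello token stream"))
words = tokens.map(&:text)
```

With a real Tokenizer, the same `drain` loop should apply unchanged, assuming the rmmseg and ferret gems are loaded, since it relies only on #next returning nil at the end.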

#text ⇒ Object

Returns the text being tokenized.



# File 'lib/rmmseg/ferret.rb', line 50

def text
  @text
end

#text=(str) ⇒ Object

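Sets the text to be tokenized and creates a new algorithm instance for it, so tokenization restarts from the beginning of the new text.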



# File 'lib/rmmseg/ferret.rb', line 55

def text=(str)
  @text = str
  @algor = RMMSeg::Config.algorithm_instance(@text)
end
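
Since #text= replaces the algorithm instance, assigning new text to an existing tokenizer discards any tokens not yet consumed. The following sketch mirrors that shape with a hypothetical `ReusableStub` class. It is not the real Tokenizer, and it uses a whitespace split in place of RMMSeg::Config.algorithm_instance.

```ruby
# Hypothetical stand-in mirroring the shape of Tokenizer: #text= stores
# the string and rebuilds the internal tokenization state.
class ReusableStub
  attr_reader :text

  def initialize(str)
    self.text = str # same pattern as Tokenizer#initialize
  end

  def text=(str)
    @text = str
    # The real class calls RMMSeg::Config.algorithm_instance(@text) here;
    # this stub substitutes a simple whitespace split.
    @pending = str.split
  end

  # A token string, or nil once the current text is exhausted.
  def next
    @pending.shift
  end
end

stream = ReusableStub.new("first input")
first = stream.next          # consumes "first"; "input" is still pending
stream.text = "second run"   # pending "input" is dropped with the old state
second = stream.next
```

This reuse pattern avoids allocating a new tokenizer object for each string, at the cost of losing whatever remained of the previous text.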