Class: ClassifierReborn::Tokenizer::Token

Inherits:
String
  • Object
Defined in:
lib/classifier-reborn/extensions/tokenizer/token.rb

Instance Method Summary

Constructor Details

#initialize(string, stemmable: true, maybe_stopword: true) ⇒ Token

A token is created from a single string plus optional attributes. For example:

t = ClassifierReborn::Tokenizer::Token.new 'Tokenize', stemmable: true, maybe_stopword: false

Attributes available are:

stemmable:        (default: true)  Whether the token may be stemmed. Set this to false for terms that must not be stemmed; otherwise leave it true.
maybe_stopword:   (default: true)  Whether the token may be a stopword. Set this to false for terms that are never stopwords; otherwise leave it true.


# File 'lib/classifier-reborn/extensions/tokenizer/token.rb', line 16

def initialize(string, stemmable: true, maybe_stopword: true)
  super(string)
  @stemmable = stemmable
  @maybe_stopword = maybe_stopword
end
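
As a minimal sketch of how the constructor and its two attributes behave, the snippet below uses a stand-in class, SketchToken (a hypothetical name, not part of the library). It copies the constructor and predicate bodies listed on this page so that it runs without the gem installed. With the gem loaded, ClassifierReborn::Tokenizer::Token should behave the same way for these calls.

```ruby
# Stand-in for ClassifierReborn::Tokenizer::Token (hypothetical name).
# The method bodies are copied from the source listings on this page.
class SketchToken < String
  def initialize(string, stemmable: true, maybe_stopword: true)
    super(string)
    @stemmable = stemmable
    @maybe_stopword = maybe_stopword
  end

  def stemmable?
    @stemmable
  end

  def maybe_stopword?
    @maybe_stopword
  end
end

# Both attributes default to true when omitted.
plain = SketchToken.new('Tokenize')
puts plain.stemmable?       # true
puts plain.maybe_stopword?  # true

# A term that should be skipped by both stemming and stopword removal.
fixed = SketchToken.new('C++', stemmable: false, maybe_stopword: false)
puts fixed.stemmable?       # false
puts fixed == 'C++'         # true: the token still compares as a String
```

Because the class inherits from String, a token can be passed wherever a string is expected, while the two flags travel with it.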

Instance Method Details

#maybe_stopword? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/classifier-reborn/extensions/tokenizer/token.rb', line 26

def maybe_stopword?
  @maybe_stopword
end

#stem ⇒ Object



# File 'lib/classifier-reborn/extensions/tokenizer/token.rb', line 30

def stem
  stemmed = super
  self.class.new(stemmed, stemmable: @stemmable, maybe_stopword: @maybe_stopword)
end
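
The override above can be illustrated with a small sketch. The String#stem reached through super is defined outside this page, so a toy module, ToyStemmer, and the class SketchStemToken stand in for it here. Both names are hypothetical, and the toy stemmer only strips a trailing "ing" or "s". The point shown is only that #stem returns a new instance of the same class with both flags carried over, rather than a plain String.

```ruby
# Toy stand-in for the String#stem that a stemming library would provide.
# Assumption: it only strips a trailing "ing" or "s"; real stemmers do more.
module ToyStemmer
  def stem
    String.new(sub(/(ing|s)\z/, ''))
  end
end

# Stand-in for ClassifierReborn::Tokenizer::Token (hypothetical name).
class SketchStemToken < String
  include ToyStemmer

  def initialize(string, stemmable: true, maybe_stopword: true)
    super(string)
    @stemmable = stemmable
    @maybe_stopword = maybe_stopword
  end

  def maybe_stopword?
    @maybe_stopword
  end

  # Same body as the #stem listing on this page.
  def stem
    stemmed = super
    self.class.new(stemmed, stemmable: @stemmable, maybe_stopword: @maybe_stopword)
  end
end

token = SketchStemToken.new('tokens', maybe_stopword: false)
stemmed = token.stem

puts stemmed                   # token
puts stemmed.class             # SketchStemToken
puts stemmed.maybe_stopword?   # false
```

Note that the method as listed does not itself consult stemmable?; deciding whether to call #stem is presumably left to the caller.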

#stemmable? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/classifier-reborn/extensions/tokenizer/token.rb', line 22

def stemmable?
  @stemmable
end
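
One plausible way a caller might use these predicates is sketched below: a stopword filter that skips tokens flagged as never being stopwords. This is an illustration under stated assumptions, not the library's own tokenizer pipeline. SketchFlagToken, HYPOTHETICAL_STOPWORDS, and drop_stopwords are hypothetical names, and the stopword list is made up.

```ruby
# Stand-in for ClassifierReborn::Tokenizer::Token (hypothetical name),
# reduced to the constructor and the predicate used in this sketch.
class SketchFlagToken < String
  def initialize(string, stemmable: true, maybe_stopword: true)
    super(string)
    @stemmable = stemmable
    @maybe_stopword = maybe_stopword
  end

  def maybe_stopword?
    @maybe_stopword
  end
end

# Made-up stopword list, for illustration only.
HYPOTHETICAL_STOPWORDS = %w[a of the].freeze

# Drops a token only when it is allowed to be a stopword and is in the list.
def drop_stopwords(tokens, stopwords = HYPOTHETICAL_STOPWORDS)
  tokens.reject { |token| token.maybe_stopword? && stopwords.include?(token) }
end

tokens = [
  SketchFlagToken.new('the'),                         # ordinary stopword
  SketchFlagToken.new('the', maybe_stopword: false),  # protected term
  SketchFlagToken.new('cat')
]

kept = drop_stopwords(tokens)
puts kept.length                  # 2
puts kept.first.maybe_stopword?   # false
```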