Class: RMMSeg::Word

Inherits:
Object
Defined in:
lib/rmmseg/word.rb

Overview

An object representing a CJK word.

Constant Summary

TYPES =
{
  :unrecognized => :unrecognized,
  :basic_latin_word => :basic_latin_word,
  :cjk_word => :cjk_word
}.freeze
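
Because the hash is frozen and each key maps to itself, it works as a fixed set of type symbols. The sketch below illustrates this, assuming Ruby 1.9 or later; `known_type?` is a hypothetical helper that is not part of RMMSeg, and the hash is copied here only so the example runs on its own.

```ruby
# Copy of the TYPES constant, reproduced so the example is self-contained.
TYPES = {
  :unrecognized => :unrecognized,
  :basic_latin_word => :basic_latin_word,
  :cjk_word => :cjk_word
}.freeze

# Hypothetical helper (not part of RMMSeg): is this symbol a known type?
def known_type?(type)
  TYPES.key?(type)
end

# A frozen hash rejects modification.
frozen_error_raised = begin
  TYPES[:punctuation] = :punctuation
  false
rescue RuntimeError # FrozenError (Ruby 2.5+) is a subclass of RuntimeError
  true
end
```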

Constructor Details

#initialize(text, type = TYPES[:unrecognized], frequency = nil) ⇒ Word

Initialize a Word object.



# File 'lib/rmmseg/word.rb', line 21

def initialize(text, type=TYPES[:unrecognized], frequency=nil)
  @text = text
  @type = type
  @frequency = frequency
  @length = @text.jlength
end
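
A minimal sketch of how the constructor defaults behave. `jlength` comes from the `jcode` library of Ruby 1.8, which was removed in Ruby 1.9, so the example uses a hypothetical stand-in class, `SketchWord`, that mirrors the constructor above with `String#length` swapped in. It assumes Ruby 2.0 or later with UTF-8 source; the frequency value is an arbitrary placeholder.

```ruby
# Hypothetical stand-in for RMMSeg::Word, for illustration only.
class SketchWord
  TYPES = {
    :unrecognized => :unrecognized,
    :basic_latin_word => :basic_latin_word,
    :cjk_word => :cjk_word
  }.freeze

  attr_reader :text, :type, :frequency, :length

  def initialize(text, type = TYPES[:unrecognized], frequency = nil)
    @text = text
    @type = type
    @frequency = frequency
    # Assumption: Ruby 1.9+, where String#length counts characters.
    @length = @text.length
  end
end

# Only the text is required; type and frequency fall back to their defaults.
default_word = SketchWord.new("???")

# A two-character CJK word with an explicit type and no frequency.
cjk_word = SketchWord.new("中文", SketchWord::TYPES[:cjk_word])

# A one-character word carrying a frequency (placeholder value).
single_char = SketchWord.new("的", SketchWord::TYPES[:cjk_word], 12345)
```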

Instance Attribute Details

#frequency ⇒ Object (readonly)

The frequency of the word. This value is meaningful only when this is a one-character word.



# File 'lib/rmmseg/word.rb', line 18

def frequency
  @frequency
end

#text ⇒ Object (readonly)

The content text of the word.



# File 'lib/rmmseg/word.rb', line 11

def text
  @text
end

#type ⇒ Object (readonly)

The type of the word. It may be one of the keys of TYPES.



# File 'lib/rmmseg/word.rb', line 14

def type
  @type
end

Instance Method Details

#byte_size ⇒ Object

The number of bytes in the word.



# File 'lib/rmmseg/word.rb', line 34

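def byte_size
  # On Ruby 1.8, String#length counts bytes. On Ruby 1.9+ it counts
  # characters, and String#bytesize gives the byte count instead.
  @text.length
end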

#length ⇒ Object

The number of characters in the word, not the number of bytes.



# File 'lib/rmmseg/word.rb', line 29

def length
  @length
end
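
The difference between the character count and the byte count can be seen on a plain Ruby string. This sketch assumes Ruby 2.0 or later with UTF-8 text, and it calls `String#length` and `String#bytesize` directly rather than the library's own `#length` and `#byte_size`.

```ruby
text = "中文"                 # two CJK characters

char_count = text.length     # counts characters on Ruby 1.9+
byte_count = text.bytesize   # counts bytes; each of these characters takes 3 in UTF-8

# For plain ASCII text the two counts coincide.
latin = "word"
latin_counts_match = (latin.length == latin.bytesize)
```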