Class: Natto::MeCabNode

Inherits:
MeCabStruct
Defined in:
lib/natto/struct.rb

Overview

MeCabNode is a wrapper for the mecab_node_t struct, which holds a parsed node.

Values for the MeCab node attributes may be obtained by using the following Symbols as keys to the layout associative array of FFI::Struct members.

  • :prev - pointer to previous node
  • :next - pointer to next node
  • :enext - pointer to the node which ends at the same position
  • :bnext - pointer to the node which starts at the same position
  • :rpath - pointer to the right path; nil if MECAB_ONE_BEST mode
  • :lpath - pointer to the left path; nil if MECAB_ONE_BEST mode
  • :surface - surface string; length may be obtained with length/rlength members
  • :feature - feature string
  • :id - unique node id
  • :length - length of surface form
  • :rlength - length of the surface form including white space before the morph
  • :rcAttr - right attribute id
  • :lcAttr - left attribute id
  • :posid - part-of-speech id
  • :char_type - character type
  • :stat - node status; 0 (NOR), 1 (UNK), 2 (BOS), 3 (EOS), 4 (EON)
  • :isbest - 1 if this node is the best node
  • :alpha - forward accumulative log summation, only with marginal probability flag
  • :beta - backward accumulative log summation, only with marginal probability flag
  • :prob - marginal probability, only with marginal probability flag
  • :wcost - word cost
  • :cost - best accumulative cost from bos node to this node
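
The numeric :stat codes listed above can be mapped to readable labels. The following is a minimal sketch; STAT_NAMES and stat_name are hypothetical helpers introduced here for illustration and are not part of natto.

```ruby
# Hypothetical lookup table; the codes mirror the :stat values listed above.
STAT_NAMES = {
  0 => 'NOR',
  1 => 'UNK',
  2 => 'BOS',
  3 => 'EOS',
  4 => 'EON'
}.freeze

# Returns the label for a stat code, or a marker string for unlisted codes.
def stat_name(stat)
  STAT_NAMES.fetch(stat) { "invalid(#{stat})" }
end

puts stat_name(0)   # prints NOR
puts stat_name(3)   # prints EOS
```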

Usage

An instance of MeCabNode is yielded to the block used with MeCab#parse, where the above-mentioned node attributes may be accessed by name.

nm = Natto::MeCab.new

nm.parse('卓球なんて死ぬまでの暇つぶしだよ。') do |n| 
  puts "#{n.surface}\t#{n.cost}" if n.is_nor? 
end

While it is also possible to use the Symbol for the MeCab node member to index into the FFI::Struct layout associative array, please use the attribute accessors. In the case of :surface and :feature, MeCab returns the raw bytes, so natto will convert that into a string using the default encoding.
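
The byte-to-string conversion described above can be illustrated in plain Ruby without MeCab. This is only a sketch: it simulates raw bytes with a binary-tagged string and relabels them as UTF-8 explicitly, whereas the initialize source shown on this page uses Encoding.default_external. The variable names are illustrative.

```ruby
# Simulate raw bytes as they might arrive from a C library:
# "\u5353\u7403" is the UTF-8 text 卓球, and String#b returns a
# binary-tagged copy of the same bytes.
raw = "\u5353\u7403".b

# force_encoding relabels the bytes with an encoding; no bytes are changed.
# UTF-8 is assumed here so the sketch does not depend on the locale.
text = raw.dup.force_encoding(Encoding::UTF_8)

puts raw.bytesize          # prints 6
puts text.length           # prints 2
puts text.valid_encoding?  # prints true
```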

Constant Summary

NOR_NODE = 0

Normal MeCab node defined in the dictionary, cf. stat.

UNK_NODE = 1

Unknown MeCab node not defined in the dictionary, cf. stat.

BOS_NODE = 2

Virtual node representing the beginning of the sentence, cf. stat.

EOS_NODE = 3

Virtual node representing the end of the sentence, cf. stat.

EON_NODE = 4

Virtual node representing the end of an N-Best MeCab node list, cf. stat.

Methods inherited from MeCabStruct

#method_missing

Constructor Details

#initialize(nptr) ⇒ MeCabNode

Initializes this node instance. Sets the MeCab feature value for this node.

Parameters:

  • nptr (FFI::Pointer)

    pointer to MeCab node



# File 'lib/natto/struct.rb', line 245

def initialize(nptr)
  super(nptr)
  @pointer = nptr

  if self[:feature]
    @feature = self[:feature].force_encoding(Encoding.default_external)
  end
end

Dynamic Method Handling

This class handles dynamic methods through the method_missing method in the class Natto::MeCabStruct
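
As a rough illustration of how method_missing can turn a member name into an accessor, consider the sketch below. It does not reproduce natto's MeCabStruct; FakeStruct is a hypothetical stand-in that keeps its members in a Hash instead of an FFI::Struct layout.

```ruby
# Hypothetical stand-in for an FFI::Struct-like object: members live in a Hash.
class FakeStruct
  def initialize(members)
    @members = members
  end

  # Index by Symbol, similar to indexing into a struct layout.
  def [](key)
    @members.fetch(key)
  end

  # Treat an unknown zero-argument method call as a member lookup;
  # anything else falls through to the default NoMethodError.
  def method_missing(name, *args)
    return @members[name] if args.empty? && @members.key?(name)
    super
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @members.key?(name) || super
  end
end

node = FakeStruct.new(stat: 0, cost: 1234)
puts node.cost     # prints 1234
puts node[:stat]   # prints 0
```

Defining respond_to_missing? alongside method_missing is a common Ruby convention so that respond_to? reports the dynamic accessors accurately.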

Instance Attribute Details

#feature ⇒ String

Returns corresponding feature value.

Returns:

  • (String)

    corresponding feature value.



# File 'lib/natto/struct.rb', line 204

def feature
  @feature
end

#pointer ⇒ FFI::Pointer (readonly)

Returns pointer to MeCab node struct.

Returns:

  • (FFI::Pointer)

    pointer to MeCab node struct.



# File 'lib/natto/struct.rb', line 206

def pointer
  @pointer
end

#surface ⇒ String

Returns the surface value of the morpheme.

Returns:

  • (String)

    surface value of the morpheme.



# File 'lib/natto/struct.rb', line 202

def surface
  @surface
end

Instance Method Details

#inspect ⇒ String

Overrides Object#inspect.

Returns:

  • (String)

    encoded object id, underlying FFI pointer, stat, surface, and feature



# File 'lib/natto/struct.rb', line 274

def inspect
  self.to_s
end

#is_bos? ⇒ Boolean

Returns true if this is a virtual MeCab node representing the beginning of the sentence.

Returns:

  • (Boolean)


# File 'lib/natto/struct.rb', line 292

def is_bos?
  self.stat == BOS_NODE
end

#is_eon? ⇒ Boolean

Returns true if this is a virtual MeCab node representing the end of the node list.

Returns:

  • (Boolean)


# File 'lib/natto/struct.rb', line 304

def is_eon?
  self.stat == EON_NODE
end

#is_eos? ⇒ Boolean

Returns true if this is a virtual MeCab node representing the end of the sentence.

Returns:

  • (Boolean)


# File 'lib/natto/struct.rb', line 298

def is_eos?
  self.stat == EOS_NODE 
end

#is_nor? ⇒ Boolean

Returns true if this is a normal MeCab node found in the dictionary.

Returns:

  • (Boolean)


# File 'lib/natto/struct.rb', line 280

def is_nor?
  self.stat == NOR_NODE
end

#is_unk? ⇒ Boolean

Returns true if this is an unknown MeCab node not found in the dictionary.

Returns:

  • (Boolean)


# File 'lib/natto/struct.rb', line 286

def is_unk?
  self.stat == UNK_NODE
end
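
The stat predicates documented in this section can be combined to filter a list of nodes. The sketch below runs without MeCab: FakeNode is a hypothetical stand-in that copies only the constants and predicates it needs, and the node list is made-up sample data rather than real parser output.

```ruby
# Hypothetical stand-in for MeCabNode carrying only a surface and a stat.
class FakeNode
  NOR_NODE = 0
  UNK_NODE = 1
  BOS_NODE = 2
  EOS_NODE = 3

  attr_reader :surface, :stat

  def initialize(surface, stat)
    @surface = surface
    @stat = stat
  end

  def is_nor?
    stat == NOR_NODE
  end

  def is_unk?
    stat == UNK_NODE
  end
end

# Made-up sample data: two virtual nodes around two word nodes.
nodes = [
  FakeNode.new('', FakeNode::BOS_NODE),
  FakeNode.new('natto', FakeNode::NOR_NODE),
  FakeNode.new('xyzzy', FakeNode::UNK_NODE),
  FakeNode.new('', FakeNode::EOS_NODE)
]

# Keep dictionary and unknown words; drop the virtual BOS/EOS nodes.
words = nodes.select { |n| n.is_nor? || n.is_unk? }.map(&:surface)
puts words.inspect   # prints ["natto", "xyzzy"]
```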

#to_s ⇒ String

Returns human-readable details for the MeCab node. Overrides Object#to_s.

  • encoded object id
  • underlying FFI pointer to MeCab Node
  • stat (node type: NOR, UNK, BOS/EOS, EON)
  • surface
  • feature

Returns:

  • (String)

    encoded object id, underlying FFI pointer, stat, surface, and feature



# File 'lib/natto/struct.rb', line 263

def to_s
   [ super.chop,
     "@pointer=#{@pointer},",
     "stat=#{self[:stat]},", 
     "@surface=\"#{self.surface}\",",
     "@feature=\"#{self.feature}\">" ].join(' ')
end
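
The super.chop idiom used in to_s above relies on Object#to_s returning a string that ends with '>': chopping that last character leaves room to append extra fields before closing the bracket again. A minimal sketch of the same idea follows; Demo is a hypothetical class and is not part of natto.

```ruby
# Hypothetical class demonstrating the super.chop pattern.
class Demo
  def initialize(surface)
    @surface = surface
  end

  def to_s
    # Object#to_s looks like "#<Demo:0x...>"; chop removes the trailing '>'
    # so that additional fields can be appended and the bracket re-closed.
    [super.chop, "@surface=\"#{@surface}\">"].join(' ')
  end
end

s = Demo.new('natto').to_s
puts s
```

The exact object id portion of the output varies from run to run, so only the prefix and the appended fields are stable.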