Class: Nimono::Cabocha
- Inherits:
-
FFI::AutoPointer
- Object
- FFI::AutoPointer
- Nimono::Cabocha
- Includes:
- CabochaLib, OptionParse
- Defined in:
- lib/nimono/nimono.rb
Overview
‘Cabocha’ is a class providing an interface to the CaboCha library. The arguments supported by CaboCha can be used with this class in almost the same way.
Constant Summary
Constants included from CabochaLib
Nimono::CabochaLib::CABOCHA_PATH
Constants included from OptionParse
Instance Attribute Summary
-
#chunks ⇒ Array
readonly
Array of chunks.
-
#libpath ⇒ String
readonly
Absolute file path to CaboCha library.
-
#options ⇒ Hash
readonly
CaboCha options as Key-Value pairs.
-
#tokens ⇒ Array
readonly
Array of Token.
Class Method Summary
Instance Method Summary
-
#initialize(options = {}) ⇒ Cabocha
constructor
Initializes CaboCha with the given ‘options’.
-
#parse(text) ⇒ String
Parses the given ‘text’, returning the CaboCha output as a string.
-
#to_s ⇒ String
The result of parsing Japanese text.
Methods included from CabochaLib
Methods included from OptionParse
Constructor Details
#initialize(options = {}) ⇒ Cabocha
Initializes CaboCha with the given ‘options’, which may be either a string of CaboCha command-line arguments or a Ruby-style hash.
Options supported are:
-
:output_format
-
:input_layer
-
:output_layer
-
:ne
-
:parser_model
-
:chunker_model
-
:ne_model
-
:posset
-
:charset
-
:charset_file
-
:rcfile
-
:mecabrc
-
:mecab_dicdir
-
:mecab_userdic
-
:output
CaboCha command-line arguments, in short form (-f1) or long form (--output-format=1), may be used in addition to Ruby-style hashes, e.g.:
require 'nimono'
nc = Nimono::Cabocha.new(output_format: 1)
or nc = Nimono::Cabocha.new('-f1')
=> #<Nimono::Cabocha:0x6364e48d
@sparse_tostr=#<Proc:0x74d917f5@/home/foo/nimono/lib/nimono/nimono.rb:54 (lambda)>,
@libpath="/usr/local/lib/libcabocha.so",
@options={:output_format=>1},
@tree=#<FFI::Pointer address=0x7f6ecc2e3790>,
@parser=#<FFI::Pointer address=0x7f6ecc2e3830>>
puts nc.parse('太郎は花子が読んでいる本を次郎に渡した')
太郎 名詞,固有名詞,人名,名,*,*,太郎,タロウ,タロー
は 助詞,係助詞,*,*,*,*,は,ハ,ワ
* 1 2D 0/1 1.700175
花子 名詞,固有名詞,人名,名,*,*,花子,ハナコ,ハナコ
が 助詞,格助詞,一般,*,*,*,が,ガ,ガ
* 2 3D 0/2 1.825021
読ん 動詞,自立,*,*,五段・マ行,連用タ接続,読む,ヨン,ヨン
で 助詞,接続助詞,*,*,*,*,で,デ,デ
いる 動詞,非自立,*,*,一段,基本形,いる,イル,イル
* 3 5D 0/1 -0.742128
本 名詞,一般,*,*,*,*,本,ホン,ホン
を 助詞,格助詞,一般,*,*,*,を,ヲ,ヲ
* 4 5D 1/2 -0.742128
次 名詞,一般,*,*,*,*,次,ツギ,ツギ
郎 名詞,一般,*,*,*,*,郎,ロウ,ロー
に 助詞,格助詞,一般,*,*,*,に,ニ,ニ
* 5 -1D 0/1 0.000000
渡し 動詞,自立,*,*,五段・サ行,連用形,渡す,ワタシ,ワタシ
た 助動詞,*,*,*,特殊・タ,基本形,た,タ,タ
EOS
=> nil
# File 'lib/nimono/nimono.rb', line 93
def initialize(options={})
  @options = self.class.()
  opt_str = self.class.(@options)
  @libpath = self.class.cabocha_library
  @parser = self.class.cabocha_new2(opt_str)
  super @parser

  if @parser.address == 0x0
    raise CabochaError.new("Could not initialize CaboCha with options: '#{opt_str}'")
  end

  @tree = self.class.cabocha_sparse_totree(@parser, "")

  if @options[:output_layer]
    self.class.cabocha_tree_set_output_layer(@tree, @options[:output_layer])
  end

  @sparse_tostr = ->(text) {
    begin
      self.class.cabocha_sparse_tostr(@parser, text).force_encoding(Encoding.default_external)
    rescue
      raise CabochaError.new 'Parse Error'
    end
  }
end
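The option keys listed above mirror CaboCha's long-form command-line flags, with underscores in place of hyphens. As a rough illustration only, a hash could be turned into an argument string along the following lines. The helper name `to_cabocha_args` is hypothetical and is not part of nimono; the library's own OptionParse helpers are not shown on this page and may behave differently.

```ruby
# Hypothetical helper, NOT nimono's API: builds a CaboCha long-form
# argument string from a Ruby-style options hash.
def to_cabocha_args(options)
  options.map do |key, value|
    # :output_format => "--output-format"
    flag = "--#{key.to_s.tr('_', '-')}"
    "#{flag}=#{value}"
  end.join(' ')
end

puts to_cabocha_args(output_format: 1, input_layer: 0)
```

Under this assumption, `output_format: 1` and the string `'--output-format=1'` would describe the same configuration.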
Instance Attribute Details
#chunks ⇒ Array (readonly)
Returns Array of chunks.
# File 'lib/nimono/nimono.rb', line 20
def chunks
  @chunks
end
#libpath ⇒ String (readonly)
Returns absolute file path to CaboCha library.
# File 'lib/nimono/nimono.rb', line 18
def libpath
  @libpath
end
#options ⇒ Hash (readonly)
Returns CaboCha options as Key-Value pairs.
# File 'lib/nimono/nimono.rb', line 16
def options
  @options
end
#tokens ⇒ Array (readonly)
Returns Array of Token.
# File 'lib/nimono/nimono.rb', line 23
def tokens
  @tokens
end
Class Method Details
.release(ptr) ⇒ Object
# File 'lib/nimono/nimono.rb', line 25
def self.release(ptr)
  self.class.cabocha_destroy(ptr)
end
Instance Method Details
#parse(text) ⇒ String
Parses the given ‘text’, returning the CaboCha output as a string. As a side effect, it rebuilds #chunks and #tokens from the parse tree and freezes both arrays. Raises CabochaError if ‘text’ is nil.
# File 'lib/nimono/nimono.rb', line 125
def parse(text)
  if text.nil?
    raise CabochaError.new 'Text to parse cannot be nil'
  else
    @result = @sparse_tostr.call(text)
    @tree = self.class.cabocha_sparse_totree(@parser, text)

    @tokens = []
    self.class.cabocha_tree_token_size(@tree).times do |i|
      @tokens << Nimono::Token.new(self.class.cabocha_tree_token(@tree, i))
    end
    @tokens.freeze

    @chunks = []
    @tokens.each {|token| @chunks << token.chunk unless token.chunk.nil?}
    @chunks.each_with_index do |chunk, index|
      tokens = []
      chunk.token_size.times do |i|
        tokens << @tokens[chunk.token_pos + i]
      end
      chunk.instance_variable_set(:@tokens, tokens)
      chunk.instance_variable_set(:@id, index)
    end
    @chunks.freeze

    self.to_s
  end
end
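The grouping step in #parse can be easier to follow in isolation. The sketch below reproduces it with plain Structs standing in for Nimono::Token and the chunk objects; the Struct shapes and the `group_tokens` name are assumptions made for illustration and do not reflect the real classes beyond the `chunk`, `token_pos` and `token_size` accessors used in the listing above.

```ruby
# Stand-ins for the real token and chunk classes (assumed shapes).
Token = Struct.new(:surface, :chunk)
Chunk = Struct.new(:token_pos, :token_size, :tokens, :id)

# Mirrors the loop in #parse: collect the chunk carried by each
# chunk-initial token, then give every chunk its slice of tokens
# and a sequential id.
def group_tokens(tokens)
  chunks = tokens.map(&:chunk).compact
  chunks.each_with_index do |chunk, index|
    chunk.tokens = tokens[chunk.token_pos, chunk.token_size]
    chunk.id = index
  end
  chunks
end

first  = Chunk.new(0, 2) # covers tokens 0..1
second = Chunk.new(2, 1) # covers token 2
tokens = [
  Token.new('太郎', first),
  Token.new('は', nil),   # non-initial tokens carry no chunk
  Token.new('渡した', second)
]

group_tokens(tokens).each do |chunk|
  puts "#{chunk.id}: #{chunk.tokens.map(&:surface).join}"
end
```

Only the first token of each chunk carries a non-nil chunk here, which is what the `unless token.chunk.nil?` guard in #parse relies on.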
#to_s ⇒ String
The result of parsing Japanese text, as produced by the most recent call to #parse.
# File 'lib/nimono/nimono.rb', line 156
def to_s
  @result
end
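Because #to_s returns the raw CaboCha output, callers who only have that string may want to post-process it. The following is a minimal sketch, assuming the lattice format (output_format: 1) shown in the constructor example, where each chunk starts with a header line of the form `* id linkD head/func score` and token lines separate surface and features with a tab. `ChunkInfo` and `parse_lattice` are hypothetical names, not part of nimono.

```ruby
# Hypothetical record for one chunk header plus its token surfaces.
ChunkInfo = Struct.new(:id, :link, :head, :func, :score, :surfaces)

HEADER = %r{\A\* (\d+) (-?\d+)D (\d+)/(\d+) (-?\d+\.\d+)\z}

# Parses CaboCha lattice output into an array of ChunkInfo.
def parse_lattice(output)
  chunks = []
  output.each_line(chomp: true) do |line|
    next if line.empty? || line == 'EOS'
    if (m = HEADER.match(line))
      chunks << ChunkInfo.new(m[1].to_i, m[2].to_i, m[3].to_i,
                              m[4].to_i, m[5].to_f, [])
    elsif chunks.any?
      # Token line: surface, a tab, then comma-separated features.
      chunks.last.surfaces << line.split("\t", 2).first
    end
  end
  chunks
end

sample = <<~LATTICE
  * 4 5D 1/2 -0.742128
  次\t名詞,一般,*,*,*,*,次,ツギ,ツギ
  郎\t名詞,一般,*,*,*,*,郎,ロウ,ロー
  に\t助詞,格助詞,一般,*,*,*,に,ニ,ニ
  * 5 -1D 0/1 0.000000
  渡し\t動詞,自立,*,*,五段・サ行,連用形,渡す,ワタシ,ワタシ
  た\t助動詞,*,*,*,特殊・タ,基本形,た,タ,タ
  EOS
LATTICE

parse_lattice(sample).each do |c|
  puts "chunk #{c.id} -> #{c.link}: #{c.surfaces.join}"
end
```

When the parser object itself is available, #chunks and #tokens are likely the more convenient route; string parsing of this kind is mainly useful for stored or logged output.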