Class: RMMSeg::Config
- Inherits: Object (Object → RMMSeg::Config)
- Defined in: lib/rmmseg/config.rb
Overview
Configurations of RMMSeg.
Class Attribute Summary
- .dictionaries ⇒ Object
  An array of dictionary files.
- .max_word_length ⇒ Object
  The maximum length of a CJK word.
Class Method Summary
- .algorithm ⇒ Object
  Get the name of the algorithm currently in use.
- .algorithm=(algor) ⇒ Object
  Set the algorithm name used to segment.
- .algorithm_instance(text, tok = Token) ⇒ Object
  Get an instance of the algorithm object corresponding to the configured algorithm name.
- .on_ambiguity ⇒ Object
  Get the behavior applied when an unresolved ambiguity occurs.
- .on_ambiguity=(behavior) ⇒ Object
  Set the behavior on an unresolved ambiguity.
Class Attribute Details
.dictionaries ⇒ Object
An array of dictionary files. Each element should have the form [file, whether_dic_include_frequency_info]. Set this before the dictionaries are loaded (they are loaded lazily, only when first used); otherwise, call Dictionary.instance.reload manually to reload them.
# File 'lib/rmmseg/config.rb', line 58
def dictionaries
  @dictionaries
end
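As a rough illustration of the element shape described above, the sketch below builds a list of [file, flag] pairs and checks each entry. The file paths, the constant DICTIONARIES, and the helper valid_dictionary_entry? are hypothetical and not part of RMMSeg. The commented-out assignment assumes the attribute is writable, as the description implies.

```ruby
# Hypothetical dictionary list; the paths are placeholders, not files known to ship with RMMSeg.
DICTIONARIES = [
  ["data/chars.dic", true],   # entries carry frequency information
  ["data/words.dic", false]   # plain word list without frequency information
].freeze

# Hypothetical helper (not an RMMSeg API): checks the [file, flag] shape.
def valid_dictionary_entry?(entry)
  entry.is_a?(Array) && entry.size == 2 &&
    entry[0].is_a?(String) && [true, false].include?(entry[1])
end

malformed = DICTIONARIES.reject { |entry| valid_dictionary_entry?(entry) }
warn "Malformed dictionary entries: #{malformed.inspect}" unless malformed.empty?

# With the gem loaded, and assuming the attribute has a writer:
#   RMMSeg::Config.dictionaries = DICTIONARIES
# If the dictionaries were already loaded, the text above says to reload:
#   Dictionary.instance.reload
```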
.max_word_length ⇒ Object
The maximum length of a CJK word. The default value is 4. Setting this value too high may slow down segmentation.
# File 'lib/rmmseg/config.rb', line 62
def max_word_length
  @max_word_length
end
Class Method Details
.algorithm ⇒ Object
Get the name of the algorithm currently in use.
# File 'lib/rmmseg/config.rb', line 19
def algorithm
  @algorithm
end
.algorithm=(algor) ⇒ Object
Set the algorithm name used to segment. Valid values are :complex and :simple. The former is the default.
# File 'lib/rmmseg/config.rb', line 24
def algorithm=(algor)
  unless [:complex, :simple].include? algor
    raise ArgumentError, "Unknown algorithm #{algor}"
  end
  @algorithm = algor
end
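To try the validation without installing the gem, the sketch below copies the setter into a stand-in module. The name DemoConfig is hypothetical. Only the setter body comes from the listing above, and the initial value mirrors the documented default.

```ruby
# Stand-in for RMMSeg::Config (DemoConfig is a hypothetical name).
module DemoConfig
  @algorithm = :complex  # documented default

  class << self
    attr_reader :algorithm

    # Same check as the listing above: only the two known symbols pass.
    def algorithm=(algor)
      unless [:complex, :simple].include? algor
        raise ArgumentError, "Unknown algorithm #{algor}"
      end
      @algorithm = algor
    end
  end
end

DemoConfig.algorithm = :simple   # accepted

begin
  DemoConfig.algorithm = :fancy  # not a valid value
rescue ArgumentError => e
  puts e.message                 # prints "Unknown algorithm fancy"
end
```

Because the check compares against symbols, a string such as "simple" is rejected as well, and a rejected assignment leaves the previous value in place.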
.algorithm_instance(text, tok = Token) ⇒ Object
Get an instance of the algorithm object corresponding to the configured algorithm name. tok is the class of the token object to be returned. For example, to use RMMSeg with Ferret, pass ::Ferret::Analysis::Token.
# File 'lib/rmmseg/config.rb', line 34
def algorithm_instance(text, tok=Token)
  RMMSeg.const_get("#{@algorithm}".capitalize+"Algorithm").new(text, tok)
end
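The lookup in this method turns the configured symbol into a class name, so :simple resolves to SimpleAlgorithm and :complex to ComplexAlgorithm. The sketch below reproduces that mechanism with stand-ins. The module DemoSeg and its Struct-based classes are hypothetical placeholders, and the algorithm name is passed as an argument here instead of being read from configuration.

```ruby
# Hypothetical stand-ins; the real classes live in the RMMSeg namespace.
module DemoSeg
  Token            = Struct.new(:text)
  ComplexAlgorithm = Struct.new(:text, :token_class)
  SimpleAlgorithm  = Struct.new(:text, :token_class)

  # Same name-based lookup as the listing above:
  # :simple -> "Simple" + "Algorithm" -> DemoSeg::SimpleAlgorithm
  def self.algorithm_instance(name, text, tok = Token)
    const_get("#{name}".capitalize + "Algorithm").new(text, tok)
  end
end

algo = DemoSeg.algorithm_instance(:simple, "some text")
puts algo.class        # prints "DemoSeg::SimpleAlgorithm"
puts algo.token_class  # prints "DemoSeg::Token"
```

Passing a different class as the third argument, as the Ferret example above suggests, only changes which token class the algorithm receives; the lookup itself is unaffected.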
.on_ambiguity ⇒ Object
Get the behavior applied when an unresolved ambiguity occurs.
# File 'lib/rmmseg/config.rb', line 39
def on_ambiguity
  @on_ambiguity
end
.on_ambiguity=(behavior) ⇒ Object
Set the behavior on an unresolved ambiguity. Valid values are :raise_exception and :select_first. The latter is the default.
# File 'lib/rmmseg/config.rb', line 45
def on_ambiguity=(behavior)
  unless [:raise_exception, :select_first].include? behavior
    raise ArgumentError, "Unknown behavior on ambiguity: #{behavior}"
  end
  @on_ambiguity = behavior
end