Class: TwitterCldr::Segmentation::Parser
- Inherits: Object
  - Object
    - TwitterCldr::Segmentation::Parser
- Defined in: lib/twitter_cldr/segmentation/parser.rb
Instance Method Details
#parse(text, options = {}) ⇒ Object
# File 'lib/twitter_cldr/segmentation/parser.rb', line 10
def parse(text, options = {})
  # Split the rule around its boundary symbol; the capture group keeps the symbol.
  left_str, boundary_symbol_str, right_str = text.split(/([÷×])/)
  boundary_symbol = boundary_symbol_for(boundary_symbol_str)
  # Either side of the rule may be missing, so fall back to an empty string.
  left = compile_token_list(tokenize_regex(left_str || ''), options)
  right = compile_token_list(tokenize_regex(right_str || ''), options)
  klass = class_for(boundary_symbol)
  klass.new(left, right)
end
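The first line of #parse relies on how String#split behaves when the pattern contains a capture group: the matched boundary symbol is kept in the result, and trailing empty strings are dropped. The sketch below isolates that step so the `|| ''` fallbacks are easier to follow. `split_rule` is a hypothetical helper written for illustration, not part of TwitterCldr, and the rule strings are made-up examples.

```ruby
# Hypothetical helper (not part of TwitterCldr) that mirrors the split
# performed at the top of #parse.
def split_rule(text)
  # The capture group makes split return the boundary symbol as its own element.
  left_str, boundary_symbol_str, right_str = text.split(/([÷×])/)

  # A rule with nothing after the symbol yields only two elements, so
  # right_str would be nil without the fallback.
  [left_str || '', boundary_symbol_str, right_str || '']
end

p split_rule('$CR × $LF')  # => ["$CR ", "×", " $LF"]
p split_rule('÷ $LF')      # => ["", "÷", " $LF"]
p split_rule('$CR ÷')      # => ["$CR ", "÷", ""]
```

The surrounding whitespace stays attached to each side here. In #parse, the blank tokens it produces are discarded later by #tokenize_regex.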
#tokenize_regex(text) ⇒ Object
# File 'lib/twitter_cldr/segmentation/parser.rb', line 19
def tokenize_regex(text)
  regex_tokenizer.tokenize(text).reject do |token|
    token.value.strip.empty?
  end
end
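The filtering step in #tokenize_regex can be tried without the real tokenizer. The following is a minimal sketch that assumes only that each token responds to `value` with a string. `SketchToken` and `drop_blank_tokens` are hypothetical stand-ins introduced here, and the token types are illustrative rather than the ones TwitterCldr's tokenizer emits.

```ruby
# Hypothetical stand-in for the tokenizer's token objects; only #value matters here.
SketchToken = Struct.new(:type, :value)

# Same filter as #tokenize_regex: drop tokens whose value is empty or whitespace-only.
def drop_blank_tokens(tokens)
  tokens.reject { |token| token.value.strip.empty? }
end

tokens = [
  SketchToken.new(:variable, '$CR'),
  SketchToken.new(:string, ' '),    # whitespace between rule elements
  SketchToken.new(:variable, '$LF'),
  SketchToken.new(:string, '')      # empty value
]

p drop_blank_tokens(tokens).map(&:value)  # => ["$CR", "$LF"]
```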