Class: TwitterCldr::Segmentation::RuleSetBuilder

Inherits:

Object

Object
TwitterCldr::Segmentation::RuleSetBuilder

show all

Defined in:: lib/twitter_cldr/segmentation/rule_set_builder.rb

Class Method Summary collapse

.exception_rule_for(locale, boundary_type) ⇒ Object

See the comment above exceptions_for.
.implicit_end_of_text_rule ⇒ Object

The implicit initial rules are always “start-of-text ÷” and “÷ end-of-text”.
.implicit_final_rule ⇒ Object

The implicit final rule is always “Any ÷ Any”.
.load(locale, boundary_type, options = {}) ⇒ Object

Class Method Details

.exception_rule_for(locale, boundary_type) ⇒ `Object`

See the comment above exceptions_for. Basically, we only support exceptions for the “sentence” boundary type since the ULI JSON data doesn’t distinguish between boundary types.

# File 'lib/twitter_cldr/segmentation/rule_set_builder.rb', line 19

def exception_rule_for(locale, boundary_type)
  cache_key = TwitterCldr::Utils.compute_cache_key(locale, boundary_type)
  exceptions_cache[cache_key] ||= begin
    exceptions = exceptions_for(locale, boundary_type)
    regex_contents = exceptions.map { |exc| Regexp.escape(exc) }.join("|")
    parse("(?:#{regex_contents}) ×", nil).tap do |rule|
      rule.id = 0
    end
  end
end

.implicit_end_of_text_rule ⇒ `Object`

The implicit initial rules are always “start-of-text ÷” and “÷ end-of-text”. We don’t need the start-of-text one.

# File 'lib/twitter_cldr/segmentation/rule_set_builder.rb', line 40

def implicit_end_of_text_rule
  @implicit_end_of_text_rule ||=
    parse('.\z ÷', nil).tap do |rule|
      rule.id = 9998
    end
end

.implicit_final_rule ⇒ `Object`

The implicit final rule is always “Any ÷ Any”

# File 'lib/twitter_cldr/segmentation/rule_set_builder.rb', line 31

def implicit_final_rule
  @implicit_final_rule ||=
    parse('. ÷ .', nil).tap do |rule|
      rule.id = 9999
    end
end

.load(locale, boundary_type, options = {}) ⇒ `Object`

# File 'lib/twitter_cldr/segmentation/rule_set_builder.rb', line 11

def load(locale, boundary_type, options = {})
  rules = compile_rules_for(boundary_type)
  RuleSet.new(locale, rules, boundary_type, options)
end

Class: TwitterCldr::Segmentation::RuleSetBuilder

Class Method Summary collapse

Class Method Details

.exception_rule_for(locale, boundary_type) ⇒ Object

.implicit_end_of_text_rule ⇒ Object

.implicit_final_rule ⇒ Object

.load(locale, boundary_type, options = {}) ⇒ Object

.exception_rule_for(locale, boundary_type) ⇒ `Object`

.implicit_end_of_text_rule ⇒ `Object`

.implicit_final_rule ⇒ `Object`

.load(locale, boundary_type, options = {}) ⇒ `Object`