Class: TwitterCldr::Shared::UnicodeRegex
- Inherits: Object
  - Object
  - TwitterCldr::Shared::UnicodeRegex
- Extended by: Forwardable
- Defined in: lib/twitter_cldr/shared/unicode_regex.rb
Instance Attribute Summary
- #elements ⇒ Object (readonly)
  Returns the value of attribute elements.
- #modifiers ⇒ Object (readonly)
  Returns the value of attribute modifiers.
Class Method Summary
- .all_unicode ⇒ Object
  All Unicode characters.
- .compile(str, modifiers = "", symbol_table = nil) ⇒ Object
- .invalid_regexp_chars ⇒ Object
  A few <control> characters (i.e. 2..7) and public/private surrogates (i.e. 55296..57343).
- .valid_regexp_chars ⇒ Object
Instance Method Summary
- #initialize(elements, modifiers = nil) ⇒ UnicodeRegex (constructor)
  A new instance of UnicodeRegex.
- #to_regexp ⇒ Object
- #to_regexp_str ⇒ Object
- #to_s ⇒ Object
Constructor Details
#initialize(elements, modifiers = nil) ⇒ UnicodeRegex
Returns a new instance of UnicodeRegex.
# File 'lib/twitter_cldr/shared/unicode_regex.rb', line 58
def initialize(elements, modifiers = nil)
  @elements = elements
  @modifiers = modifiers
end
Instance Attribute Details
#elements ⇒ Object (readonly)
Returns the value of attribute elements.
# File 'lib/twitter_cldr/shared/unicode_regex.rb', line 56
def elements
  @elements
end
#modifiers ⇒ Object (readonly)
Returns the value of attribute modifiers.
# File 'lib/twitter_cldr/shared/unicode_regex.rb', line 56
def modifiers
  @modifiers
end
Class Method Details
.all_unicode ⇒ Object
All Unicode characters, i.e. the code points 0..0x10FFFF.
# File 'lib/twitter_cldr/shared/unicode_regex.rb', line 21
def all_unicode
  @all_unicode ||= TwitterCldr::Utils::RangeSet.new(
    [0..0x10FFFF]
  )
end
.compile(str, modifiers = "", symbol_table = nil) ⇒ Object
# File 'lib/twitter_cldr/shared/unicode_regex.rb', line 12
def compile(str, modifiers = "", symbol_table = nil)
  new(
    parser.parse(tokenizer.tokenize(str), {
      symbol_table: symbol_table
    }),
    modifiers
  )
end
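The listing above shows compile as a factory: tokenize the pattern, parse the tokens into elements, then hand the elements and modifiers to the constructor. The real tokenizer and parser are not shown on this page, so the following is only a minimal sketch of the same shape. ToyTokenizer, ToyParser, and ToyRegex are hypothetical stand-ins, and treating the symbol table as a plain token substitution map is an assumption made for illustration.

```ruby
# Hypothetical stand-in: splits a pattern into single-character tokens.
class ToyTokenizer
  def tokenize(str)
    str.chars
  end
end

# Hypothetical stand-in: replaces tokens found in the symbol table
# (assumed here to be a simple Hash) and passes the rest through.
class ToyParser
  def parse(tokens, options = {})
    table = options[:symbol_table] || {}
    tokens.map { |token| table.fetch(token, token) }
  end
end

class ToyRegex
  attr_reader :elements, :modifiers

  class << self
    # Same pipeline shape as the listing: tokenize, parse, construct.
    def compile(str, modifiers = "", symbol_table = nil)
      new(
        parser.parse(tokenizer.tokenize(str), { symbol_table: symbol_table }),
        modifiers
      )
    end

    private

    def tokenizer
      @tokenizer ||= ToyTokenizer.new
    end

    def parser
      @parser ||= ToyParser.new
    end
  end

  def initialize(elements, modifiers = nil)
    @elements = elements
    @modifiers = modifiers
  end
end

TOY_COMPILED = ToyRegex.compile("a$b", "i", { "$" => "X" })
```

In this toy version, TOY_COMPILED.elements holds the parsed tokens and TOY_COMPILED.modifiers holds the modifier string unchanged, which mirrors how the constructor simply stores both values.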
.invalid_regexp_chars ⇒ Object
A few <control> characters (code points 2..7, i.e. U+0002..U+0007) and public/private surrogates (code points 55296..57343, i.e. U+D800..U+DFFF). These don't play nicely with Ruby's regular expression engine, and I think we can safely disregard them.
# File 'lib/twitter_cldr/shared/unicode_regex.rb', line 30
def invalid_regexp_chars
  @invalid_regexp_chars ||= TwitterCldr::Utils::RangeSet.new(
    [2..7, 55296..57343]
  )
end
.valid_regexp_chars ⇒ Object
# File 'lib/twitter_cldr/shared/unicode_regex.rb', line 36
def valid_regexp_chars
  @valid_regexp_chars ||= all_unicode.subtract(invalid_regexp_chars)
end
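To make the subtraction concrete, here is a rough sketch of what removing the invalid ranges from the full code point range leaves behind. It uses plain Ruby ranges rather than TwitterCldr::Utils::RangeSet; subtract_ranges is a hypothetical helper written for this illustration and is not part of the library.

```ruby
# Hypothetical helper, not part of TwitterCldr: removes each range in
# `holes` from the single range `full` and returns the leftover ranges.
def subtract_ranges(full, holes)
  remaining = []
  cursor = full.first

  holes.sort_by(&:first).each do |hole|
    # Keep the gap between the cursor and the start of this hole, if any.
    remaining << (cursor..(hole.first - 1)) if hole.first > cursor
    # Move past the hole (max guards against overlapping holes).
    cursor = [cursor, hole.last + 1].max
  end

  # Keep whatever is left after the last hole.
  remaining << (cursor..full.last) if cursor <= full.last
  remaining
end

ALL_CODE_POINTS = 0..0x10FFFF
INVALID_CODE_POINTS = [2..7, 55296..57343]
VALID_CODE_POINTS = subtract_ranges(ALL_CODE_POINTS, INVALID_CODE_POINTS)
# VALID_CODE_POINTS is [0..1, 8..55295, 57344..1114111]
```

Under these assumptions the valid set splits into three ranges: the two code points before the excluded controls, everything between the controls and the surrogates, and everything after the surrogates up to 0x10FFFF.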
Instance Method Details
#to_regexp ⇒ Object
# File 'lib/twitter_cldr/shared/unicode_regex.rb', line 63
def to_regexp
  @regexp ||= Regexp.new(to_regexp_str, modifier_union)
end
#to_regexp_str ⇒ Object
# File 'lib/twitter_cldr/shared/unicode_regex.rb', line 67
def to_regexp_str
  @regexp_str ||= elements.map(&:to_regexp_str).join
end
#to_s ⇒ Object
# File 'lib/twitter_cldr/shared/unicode_regex.rb', line 71
def to_s
  @elements.inject('') do |ret, element|
    ret + element.to_s
  end
end
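The three instance methods all delegate to the elements: to_regexp_str joins each element's regexp string, to_s concatenates each element's own string form, and to_regexp wraps the joined string in a cached Regexp. The sketch below reproduces that composition with stand-in types. ToyElement and ToyUnicodeRegex are hypothetical, the example strings are invented, and modifier handling is left out because modifier_union is not shown on this page.

```ruby
# Hypothetical element: anything that responds to #to_regexp_str and #to_s.
ToyElement = Struct.new(:source, :regexp_str) do
  def to_regexp_str
    regexp_str
  end

  def to_s
    source
  end
end

class ToyUnicodeRegex
  attr_reader :elements

  def initialize(elements)
    @elements = elements
  end

  # Builds the native Regexp once and caches it.
  def to_regexp
    @regexp ||= Regexp.new(to_regexp_str)
  end

  # Joins the regexp fragment contributed by each element.
  def to_regexp_str
    @regexp_str ||= elements.map(&:to_regexp_str).join
  end

  # Concatenates each element's own string form.
  def to_s
    elements.inject('') { |ret, element| ret + element.to_s }
  end
end

TOY_PATTERN = ToyUnicodeRegex.new([
  ToyElement.new("[a-c]", "(?:[a-c])"),
  ToyElement.new("+", "+")
])
```

Here TOY_PATTERN.to_s rebuilds the pattern from the elements' string forms, while TOY_PATTERN.to_regexp_str is the string actually passed to Regexp.new. Repeated calls to to_regexp return the same cached object.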