Method: UnicodeUtils.downcase
Defined in: lib/unicode_utils/downcase.rb
.downcase(str, language_id = nil) ⇒ Object
Perform a full case-conversion of str
to lowercase according to the Unicode standard.
Some conversion rules are language dependent; these are in effect only when a non-nil language_id is given. If non-nil, the language_id must be a two-letter language code as defined in BCP 47 (tools.ietf.org/rfc/bcp/bcp47.txt), passed as a symbol. If a language doesn’t have a two-letter code, use its three-letter code. If locale-independent behaviour is required, nil should be passed explicitly, because a later version of UnicodeUtils may default to something else.
Examples:
require "unicode_utils/downcase"
UnicodeUtils.downcase("ᾈ") => "ᾀ"
UnicodeUtils.downcase("aBI\u{307}", :tr) => "abi"
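The Turkish example is easier to follow once the codepoints are spelled out: the input has four codepoints and the documented result has three, because the combining dot above (U+0307) after the capital I does not survive in the output. The sketch below uses only core Ruby and does not call UnicodeUtils; `codepoint_labels` is a hypothetical helper introduced for illustration, and the output string is copied from the example above rather than computed.

```ruby
# Hypothetical helper (not part of UnicodeUtils): list a string's
# codepoints in the usual U+XXXX notation.
def codepoint_labels(str)
  str.codepoints.map { |cp| format("U+%04X", cp) }
end

input  = "aBI\u{307}" # argument from the Turkish example
output = "abi"        # result the documentation lists for :tr

p codepoint_labels(input)  # a, B, I, combining dot above
p codepoint_labels(output) # a, b, i
```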
# File 'lib/unicode_utils/downcase.rb', line 28

def downcase(str, language_id = nil)
  String.new.force_encoding(str.encoding).tap { |res|
    if Impl::LANGS_WITH_RULES.include?(language_id)
      # ensure O(1) lookup by index
      str = str.encode(Encoding::UTF_32LE)
    end
    pos = 0
    str.each_codepoint { |cp|
      special_mapping =
        Impl.conditional_downcase_mapping(cp, str, pos, language_id) ||
        SPECIAL_DOWNCASE_MAP[cp]
      if special_mapping
        special_mapping.each { |m| res << m }
      else
        res << (SIMPLE_DOWNCASE_MAP[cp] || cp)
      end
      pos += 1
    }
  }
end
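The lookup order in the source above (a special, possibly one-to-many mapping first, then a simple one-to-one mapping, then the codepoint unchanged) can be tried in isolation. This is a minimal sketch, not the library's implementation: `TOY_SPECIAL_DOWNCASE_MAP`, `TOY_SIMPLE_DOWNCASE_MAP` and `toy_downcase` are hypothetical names, the tables hold only a few hand-picked entries, and the language-conditional step (`Impl.conditional_downcase_mapping`) is left out.

```ruby
# Toy tables for illustration only; the real maps cover all of Unicode.
TOY_SPECIAL_DOWNCASE_MAP = {
  0x0130 => [0x0069, 0x0307] # İ -> i + combining dot above (one-to-many)
}.freeze

TOY_SIMPLE_DOWNCASE_MAP = {
  0x0041 => 0x0061, # A -> a
  0x0042 => 0x0062, # B -> b
  0x1F88 => 0x1F80  # ᾈ -> ᾀ
}.freeze

# Hypothetical, language-independent stand-in for the method above.
def toy_downcase(str)
  res = String.new.force_encoding(str.encoding)
  str.each_codepoint do |cp|
    special = TOY_SPECIAL_DOWNCASE_MAP[cp]
    if special
      # A special mapping may expand to several codepoints.
      special.each { |m| res << m }
    else
      # Fall back to the simple mapping, or keep the codepoint as is.
      res << (TOY_SIMPLE_DOWNCASE_MAP[cp] || cp)
    end
  end
  res
end

puts toy_downcase("AB\u{1F88}")
puts toy_downcase("\u{130}").codepoints.inspect
```

Appending an Integer with `String#<<` adds that codepoint in the receiver's encoding, which is why the result string is given the input's encoding before the loop starts.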