Module: ActiveSupport::Multibyte
- Defined in:
- lib/active_support/multibyte/chars.rb,
lib/active_support/multibyte.rb,
lib/active_support/multibyte/utils.rb,
lib/active_support/multibyte/exceptions.rb,
lib/active_support/multibyte/unicode_database.rb
Overview
:nodoc:
Defined Under Namespace
Classes: Chars, Codepoint, EncodingError, UnicodeDatabase
Constant Summary
- NORMALIZATION_FORMS =
A list of all available normalization forms. See www.unicode.org/reports/tr15/tr15-29.html for more information about normalization.
[:c, :kc, :d, :kd]
- UNICODE_VERSION =
The Unicode version that is supported by the implementation
'5.1.0'
- VALID_CHARACTER =
Regular expressions that describe valid byte sequences for a character
{
  # Borrowed from the Kconv library by Shinji KONO - (also as seen on the W3C site)
  'UTF-8' => /\A(?:
    [\x00-\x7f] |
    [\xc2-\xdf] [\x80-\xbf] |
    \xe0 [\xa0-\xbf] [\x80-\xbf] |
    [\xe1-\xef] [\x80-\xbf] [\x80-\xbf] |
    \xf0 [\x90-\xbf] [\x80-\xbf] [\x80-\xbf] |
    [\xf1-\xf3] [\x80-\xbf] [\x80-\xbf] [\x80-\xbf] |
    \xf4 [\x80-\x8f] [\x80-\xbf] [\x80-\xbf])\z /xn,
  # Quick check for valid Shift-JIS characters, disregards the odd-even pairing
  'Shift_JIS' => /\A(?:
    [\x00-\x7e \xa1-\xdf] |
    [\x81-\x9f \xe0-\xef] [\x40-\x7e \x80-\x9e \x9f-\xfc])\z /xn
}
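The UTF-8 pattern is anchored with \A and \z, so it describes exactly one character's worth of bytes rather than a whole string. The following is a minimal sketch of how such a pattern could be exercised outside the library. The constant UTF8_CHAR and the helper single_utf8_char? are hypothetical names introduced here for illustration and are not part of ActiveSupport.

```ruby
# Standalone copy of the UTF-8 branch of VALID_CHARACTER (hypothetical constant name).
# The /n flag makes this a binary regexp; /x lets the alternatives be laid out on lines.
UTF8_CHAR = /\A(?:
  [\x00-\x7f]                                     |
  [\xc2-\xdf] [\x80-\xbf]                         |
  \xe0        [\xa0-\xbf] [\x80-\xbf]             |
  [\xe1-\xef] [\x80-\xbf] [\x80-\xbf]             |
  \xf0        [\x90-\xbf] [\x80-\xbf] [\x80-\xbf] |
  [\xf1-\xf3] [\x80-\xbf] [\x80-\xbf] [\x80-\xbf] |
  \xf4        [\x80-\x8f] [\x80-\xbf] [\x80-\xbf]
)\z/xn

# Hypothetical helper: true when `bytes` is exactly one character
# that the pattern above accepts.
def single_utf8_char?(bytes)
  # String#b returns a binary (ASCII-8BIT) copy, which a /n regexp can match.
  UTF8_CHAR.match?(bytes.b)
end

one_byte   = single_utf8_char?("a")          # plain ASCII
two_bytes  = single_utf8_char?("é")          # bytes C3 A9
overlong   = single_utf8_char?("\xC0\x80")   # lead byte C0 is never allowed
two_chars  = single_utf8_char?("ab")         # more than one character
```

Because the pattern matches a single character, checking a longer string with it would require splitting the string into candidate characters first.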
- UCD =
The Unicode database
UnicodeDatabase.new
Class Method Summary
-
.clean(string) ⇒ Object
Removes all invalid characters from the string.
-
.proxy_class ⇒ Object
Returns the current proxy class.
-
.proxy_class=(klass) ⇒ Object
The proxy class returned when calling mb_chars.
-
.valid_character ⇒ Object
Returns a regular expression that matches valid characters in the current encoding.
-
.verify(string) ⇒ Object
Verifies the encoding of a string.
-
.verify!(string) ⇒ Object
Verifies the encoding of the string and raises an exception when it’s not valid.
Class Method Details
.clean(string) ⇒ Object
Removes all invalid characters from the string.
Note: this method is a no-op in Ruby 1.9
# File 'lib/active_support/multibyte/utils.rb', line 46
def self.clean(string)
  string
end
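Since the implementation shown above returns its argument unchanged, callers who actually need invalid bytes removed have to do it themselves. One possible approach, sketched here under the assumption that Ruby 2.1 or later is available, is the core method String#scrub. The helper name strip_invalid_bytes is hypothetical and is not part of ActiveSupport::Multibyte.

```ruby
# Hypothetical helper, not part of ActiveSupport: drops every byte sequence
# that is invalid in the string's own encoding.
def strip_invalid_bytes(string)
  # String#scrub replaces invalid sequences with the given replacement;
  # an empty replacement removes them.
  string.scrub("")
end

dirty   = "abc\xFFdef"                 # \xFF can never appear in UTF-8
cleaned = strip_invalid_bytes(dirty)   # "abcdef"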
.proxy_class ⇒ Object
Returns the current proxy class
# File 'lib/active_support/multibyte.rb', line 31
def self.proxy_class
  @proxy_class ||= ActiveSupport::Multibyte::Chars
end
.proxy_class=(klass) ⇒ Object
The proxy class returned when calling mb_chars. You can use this accessor to configure your own proxy class so you can support other encodings. See the ActiveSupport::Multibyte::Chars implementation for an example of how to do this.
Example:
ActiveSupport::Multibyte.proxy_class = CharsForUTF32
# File 'lib/active_support/multibyte.rb', line 26
def self.proxy_class=(klass)
  @proxy_class = klass
end
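The reader and writer together form a memoized, overridable default: the reader falls back to a built-in class until the writer stores something else. The sketch below reproduces that pattern in a standalone module so it can be run without ActiveSupport. MultibyteSketch, DefaultChars, wrap, and this CharsForUTF32 are all stand-in names assumed for illustration; a real proxy class would need to implement the behaviour that ActiveSupport::Multibyte::Chars provides.

```ruby
# Stand-in module mirroring the accessor pair (not the real ActiveSupport code).
module MultibyteSketch
  # Stand-in for the default proxy; it only remembers the wrapped string.
  class DefaultChars
    attr_reader :wrapped

    def initialize(string)
      @wrapped = string
    end
  end

  # Writer: stores the class that should wrap strings from now on.
  def self.proxy_class=(klass)
    @proxy_class = klass
  end

  # Reader: returns the configured class, or the default if none was set.
  def self.proxy_class
    @proxy_class ||= DefaultChars
  end

  # Hypothetical helper showing where the proxy class would be consulted.
  def self.wrap(string)
    proxy_class.new(string)
  end
end

# Hypothetical custom proxy; a real one would add encoding-specific behaviour.
class CharsForUTF32 < MultibyteSketch::DefaultChars
end

before = MultibyteSketch.proxy_class            # the default
MultibyteSketch.proxy_class = CharsForUTF32
after  = MultibyteSketch.wrap("abc").class      # the configured class
```

The setting is process-wide state, so it is usually assigned once during application start-up rather than changed per request.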
.valid_character ⇒ Object
Returns a regular expression that matches valid characters in the current encoding
# File 'lib/active_support/multibyte/utils.rb', line 7
def self.valid_character
  VALID_CHARACTER[Encoding.default_external.to_s]
end
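The lookup is a plain Hash access keyed by the encoding's name, so it returns nil whenever the default external encoding is not one of the keys in VALID_CHARACTER. The sketch below illustrates that behaviour with a stand-in table. PATTERNS and pattern_for are hypothetical names, and the symbol values are placeholders for the real regular expressions.

```ruby
# Stand-in for VALID_CHARACTER; the symbols are placeholders for the regexps.
PATTERNS = {
  "UTF-8"     => :utf8_pattern,
  "Shift_JIS" => :shift_jis_pattern
}.freeze

# Hypothetical helper mirroring the lookup: Encoding#to_s yields the
# encoding's name, which is used as the hash key.
def pattern_for(encoding)
  PATTERNS[encoding.to_s]
end

utf8   = pattern_for(Encoding::UTF_8)        # :utf8_pattern
sjis   = pattern_for(Encoding::Shift_JIS)    # :shift_jis_pattern
latin1 = pattern_for(Encoding::ISO_8859_1)   # nil, no entry for this encoding
```

Callers that may run under other encodings would therefore need to handle a nil result before using the return value as a regular expression.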
.verify(string) ⇒ Object
Verifies the encoding of a string
# File 'lib/active_support/multibyte/utils.rb', line 23
def self.verify(string)
  string.valid_encoding?
end
.verify!(string) ⇒ Object
Verifies the encoding of the string and raises an exception when it’s not valid
# File 'lib/active_support/multibyte/utils.rb', line 38
def self.verify!(string)
  raise EncodingError.new("Found characters with invalid encoding") unless verify(string)
end
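The two verification methods differ only in how they report a failure: one returns a boolean, the other raises. The following standalone sketch mirrors that pair so the difference can be tried without loading ActiveSupport. SketchEncodingError and the top-level verify and verify! methods are stand-ins assumed for illustration, not the library's own definitions.

```ruby
# Stand-in for ActiveSupport::Multibyte::EncodingError.
class SketchEncodingError < StandardError; end

# Boolean check: delegates to Ruby's own String#valid_encoding?.
def verify(string)
  string.valid_encoding?
end

# Raising check: returns nil for valid input, raises otherwise.
def verify!(string)
  raise SketchEncodingError, "Found characters with invalid encoding" unless verify(string)
end

good = "héllo"
bad  = "h\xFFllo"   # \xFF is not a valid byte in UTF-8

message =
  begin
    verify!(bad)
    "no error"
  rescue SketchEncodingError => e
    e.message
  end
```

The boolean form suits branching logic, while the raising form suits code paths where invalid input should stop processing immediately.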