Module: Normatron::Filters::KeepFilter
- Extended by: Helpers
- Defined in: lib/normatron/filters/keep_filter.rb
Overview
Removes the characters that don’t match the given properties.
The character properties follow the rules of the \p{} construct described in the Regexp class.
The \p{} construct matches characters with the named property, much like POSIX bracket classes.
To pass named properties to this filter, use them as Symbols:
Property | Description |
---|---|
:Alnum | Alphabetic and numeric character |
:Alpha | Alphabetic character |
:Blank | Space or tab |
:Cntrl | Control character |
:Digit | Digit |
:Graph | Non-blank character (excludes spaces, control characters, and similar) |
:Lower | Lowercase alphabetical character |
:Print | Like :Graph, but includes the space character |
:Punct | Punctuation character |
:Space | Whitespace character ([:blank:], newline, carriage return, etc.) |
:Upper | Uppercase alphabetical character |
:XDigit | Digit allowed in a hexadecimal number (i.e., 0-9a-fA-F) |
:Word | A member of one of the following Unicode general categories: Letter, Mark, Number, Connector_Punctuation |
:ASCII | A character in the ASCII character set |
:Any | Any Unicode character (including unassigned characters) |
:Assigned | An assigned character |
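The effect of keeping by property can be sketched with plain Ruby regular expressions. This is only an illustration, assuming the filter deletes every character outside a negated \p{} character class; it does not call Normatron itself, and the sample string is made up.

```ruby
# Illustration only: plain Regexp usage, not Normatron's own code.
text = "Ruby 1.9.3!"

# Keep alphabetic characters: delete everything that is not \p{Alpha}.
only_alpha = text.gsub(/[^\p{Alpha}]/, "")               # => "Ruby"

# Keep digits only.
only_digit = text.gsub(/[^\p{Digit}]/, "")               # => "193"

# Several properties combine inside one character class.
alpha_and_blank = text.gsub(/[^\p{Alpha}\p{Blank}]/, "") # => "Ruby "

# Keep punctuation only.
only_punct = text.gsub(/[^\p{Punct}]/, "")               # => "..!"
```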
A Unicode character’s General Category value can also be matched with :Ab, where Ab is the category’s abbreviation as described below:
Property | Description |
---|---|
:L | Letter |
:Ll | Letter: Lowercase |
:Lm | Letter: Modifier |
:Lo | Letter: Other |
:Lt | Letter: Titlecase |
:Lu | Letter: Uppercase |
:M | Mark |
:Mn | Mark: Nonspacing |
:Mc | Mark: Spacing Combining |
:Me | Mark: Enclosing |
:N | Number |
:Nd | Number: Decimal Digit |
:Nl | Number: Letter |
:No | Number: Other |
:P | Punctuation |
:Pc | Punctuation: Connector |
:Pd | Punctuation: Dash |
:Ps | Punctuation: Open |
:Pe | Punctuation: Close |
:Pi | Punctuation: Initial Quote |
:Pf | Punctuation: Final Quote |
:Po | Punctuation: Other |
:S | Symbol |
:Sm | Symbol: Math |
:Sc | Symbol: Currency |
:Sk | Symbol: Modifier |
:So | Symbol: Other |
:Z | Separator |
:Zs | Separator: Space |
:Zl | Separator: Line |
:Zp | Separator: Paragraph |
:C | Other |
:Cc | Other: Control |
:Cf | Other: Format |
:Cn | Other: Not Assigned |
:Co | Other: Private Use |
:Cs | Other: Surrogate |
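General Category abbreviations behave the same way inside \p{}. The sketch below uses plain Ruby regular expressions on a made-up sample string to show which characters would survive for a few categories; it assumes the filter keeps exactly the characters the category matches.

```ruby
# Illustration only: plain Regexp usage, not Normatron's own code.
sample = "Hello World 42 + $5"

# :Lu keeps uppercase letters.
uppercase = sample.gsub(/[^\p{Lu}]/, "")              # => "HW"

# :L keeps every letter, whatever its case.
letters = sample.gsub(/[^\p{L}]/, "")                 # => "HelloWorld"

# :Nd and :Sc together keep decimal digits and currency symbols.
digits_currency = sample.gsub(/[^\p{Nd}\p{Sc}]/, "")  # => "42$5"

# :Sm keeps math symbols.
math = sample.gsub(/[^\p{Sm}]/, "")                   # => "+"
```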
Lastly, this method matches a character’s Unicode script. The following scripts are supported:
Arabic, Armenian, Balinese, Bengali, Bopomofo, Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic, Deseret, Devanagari, Ethiopic, Georgian, Glagolitic, Gothic, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunoo, Hebrew, Hiragana, Inherited, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer, Lao, Latin, Lepcha, Limbu, Linear_B, Lycian, Lydian, Malayalam, Mongolian, Myanmar, New_Tai_Lue, Nko, Ogham, Ol_Chiki, Old_Italic, Old_Persian, Oriya, Osmanya, Phags_Pa, Phoenician, Rejang, Runic, Saurashtra, Shavian, Sinhala, Sundanese, Syloti_Nagri, Syriac, Tagalog, Tagbanwa, Tai_Le, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh, Ugaritic, Vai, and Yi.
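Script names are used in the same position as the other properties. The following sketch, again with plain Ruby regular expressions and a made-up sample string, separates mixed text by script. Note that the spaces belong to the Common script, so none of the expressions below keeps them.

```ruby
# Illustration only: plain Regexp usage, not Normatron's own code.
mixed = "Tokyo 東京 とうきょう トウキョウ"

latin    = mixed.gsub(/[^\p{Latin}]/, "")     # => "Tokyo"
han      = mixed.gsub(/[^\p{Han}]/, "")       # => "東京"
hiragana = mixed.gsub(/[^\p{Hiragana}]/, "")  # => "とうきょう"
katakana = mixed.gsub(/[^\p{Katakana}]/, "")  # => "トウキョウ"
```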
Class Method Summary
- .call(input, *properties) ⇒ String
Performs input conversion according to filter requirements.
Methods included from Helpers
acronym_regex, acronyms, evaluate_regexp, inflections, mb_send
Class Method Details
.call(input, *properties) ⇒ String
Performs input conversion according to filter requirements.
When the first argument is not a String, this method returns it unchanged.
    # File 'lib/normatron/filters/keep_filter.rb', line 112
    def self.call(input, *properties)
      input.kind_of?(String) ? evaluate_regexp(input, :keep, properties) : input
    end
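To see the String check in action without the gem, the method can be mirrored in a small stand-alone module. `KeepFilterSketch` and its `evaluate_regexp` are hypothetical stand-ins: the real helper lives in Helpers and is not shown on this page, so its body here is an assumption (a negated \p{} class whose matches are deleted).

```ruby
# Hypothetical stand-in module; not part of Normatron.
module KeepFilterSketch
  # Assumed behaviour of Helpers#evaluate_regexp: for :keep, delete every
  # character outside the listed properties; otherwise delete the matches.
  # Expects at least one property (an empty class is an invalid Regexp).
  def self.evaluate_regexp(input, action, properties)
    char_class = properties.map { |prop| "\\p{#{prop}}" }.join
    negation = action == :keep ? "^" : ""
    input.gsub(Regexp.new("[#{negation}#{char_class}]"), "")
  end

  # Same shape as the documented method: non-String input passes through.
  def self.call(input, *properties)
    input.kind_of?(String) ? evaluate_regexp(input, :keep, properties) : input
  end
end

kept    = KeepFilterSketch.call("Ruby 1.9.3!", :Alpha, :Digit)  # => "Ruby193"
number  = KeepFilterSketch.call(42, :Digit)                     # => 42
nothing = KeepFilterSketch.call(nil, :Alpha)                    # => nil
```

With the gem loaded, the equivalent call would be `Normatron::Filters::KeepFilter.call(input, *properties)`, per the signature documented above.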