Module: Porter2
- Defined in: lib/porter2stemmer/constants.rb
Overview
Constants for the Porter 2 stemmer
Constant Summary
- C =
A non-vowel
"[^aeiouy]"
- V =
A vowel: a, e, i, o, u or y
"[aeiouy]"
- CW =
A non-vowel other than w, x, or Y
"[^aeiouywxY]"
- Double =
Doubled consonants created when a suffix is added; they are undoubled when the word is stemmed
"(bb|dd|ff|gg|mm|nn|pp|rr|tt)"
- Valid_LI =
A valid letter that can come before ‘li’ (or ‘ly’)
"[cdeghkmnrt]"
- SHORT_SYLLABLE =
A specification for a short syllable.
A short syllable in a word is either:
  - a vowel followed by a non-vowel other than w, x or Y and preceded by a non-vowel, or
  - a vowel at the beginning of the word followed by a non-vowel.
(The original document is silent on whether sequences of two or more non-vowels make a syllable long. Because this specification is only used to find the sequence non-vowel, vowel, non-vowel, end-of-word, the ambiguity has no effect.)
"((#{C}#{V}#{CW})|(^#{V}#{C}))"
- STEP_2_MAPS =
Suffix transformations used in porter2_step2. (The ogi and li endings are handled in the procedure itself.)
{"tional" => "tion", "enci" => "ence", "anci" => "ance", "abli" => "able", "entli" => "ent", "ization" => "ize", "izer" => "ize", "ational" => "ate", "ation" => "ate", "ator" => "ate", "alism" => "al", "aliti" => "al", "alli" => "al", "fulness" => "ful", "ousli" => "ous", "ousness" => "ous", "iveness" => "ive", "iviti" => "ive", "biliti" => "ble", "bli" => "ble", "fulli" => "ful", "lessli" => "less" }
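Because some keys are suffixes of others ("tional" and "ational", for example), a lookup has to prefer the longest match. The sketch below shows one way to do that on a few entries copied from the table. `STEP_2_SUBSET` and `apply_suffix_map` are hypothetical names, and the region (R1) check of the published algorithm is omitted:

```ruby
# A few entries copied from STEP_2_MAPS; the full table has more.
STEP_2_SUBSET = {
  "tional" => "tion", "ational" => "ate", "ization" => "ize",
  "fulness" => "ful", "ousli" => "ous"
}

# Hypothetical helper: replace the longest matching suffix, if there is one.
# Assumption: no R1 region check is performed.
def apply_suffix_map(word, map)
  suffix = map.keys.sort_by { |key| -key.length }.find { |key| word.end_with?(key) }
  suffix ? word.delete_suffix(suffix) + map[suffix] : word
end

apply_suffix_map("relational", STEP_2_SUBSET)   # "relate"     ("ational" wins over "tional")
apply_suffix_map("conditional", STEP_2_SUBSET)  # "condition"
apply_suffix_map("hopefulness", STEP_2_SUBSET)  # "hopeful"
apply_suffix_map("cat", STEP_2_SUBSET)          # "cat"        (no suffix matches)
```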
- STEP_3_MAPS =
Suffix transformations used in porter2_step3. (The ative ending is handled in the procedure itself.)
{"tional" => "tion", "ational" => "ate", "alize" => "al", "icate" => "ic", "iciti" => "ic", "ical" => "ic", "ful" => "", "ness" => "" }
- STEP_4_MAPS =
Suffix transformations used in porter2_step4. (The ion ending is handled in the procedure itself.)
{"al" => "", "ance" => "", "ence" => "", "er" => "", "ic" => "", "able" => "", "ible" => "", "ant" => "", "ement" => "", "ment" => "", "ent" => "", "ism" => "", "ate" => "", "iti" => "", "ous" => "", "ive" => "", "ize" => "" }
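The "ion" ending cannot be expressed as a plain table entry because, in the published Porter2 algorithm, it is deleted only when preceded by s or t. A sketch of that condition follows. `strip_ion` is a hypothetical helper, and the region (R2) check is omitted:

```ruby
# Hypothetical helper: delete a trailing "ion" only when it follows s or t.
# Assumption: the R2 region check of the published algorithm is not performed.
def strip_ion(word)
  word.sub(/(?<=[st])ion\z/, "")
end

strip_ion("adoption")  # "adopt"
strip_ion("opinion")   # "opinion"  (preceded by n, so left alone)
```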
- SPECIAL_CASES =
Special-case stemmings
{"skis" => "ski", "skies" => "sky", "dying" => "die", "lying" => "lie", "tying" => "tie", "idly" => "idl", "gently" => "gentl", "ugly" => "ugli", "early" => "earli", "only" => "onli", "singly" => "singl", "sky" => "sky", "news" => "news", "howe" => "howe", "atlas" => "atlas", "cosmos" => "cosmos", "bias" => "bias", "andes" => "andes" }
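A table like this is typically consulted before any stemming step runs. The sketch below uses a few copied entries. `SPECIAL_CASES_SUBSET` and `stem_with_special_cases` are hypothetical names, and the block passed in is only a stand-in for the real stemming procedure:

```ruby
# A few entries copied from SPECIAL_CASES; the full table has more.
SPECIAL_CASES_SUBSET = {
  "skis" => "ski", "skies" => "sky", "dying" => "die", "news" => "news"
}

# Hypothetical wrapper: return the table entry if present, otherwise run the given block.
def stem_with_special_cases(word, table)
  table.fetch(word) { yield word }
end

# Stand-in stemmer for illustration only: it just strips a trailing "ing".
naive = ->(w) { w.sub(/ing\z/, "") }

stem_with_special_cases("dying", SPECIAL_CASES_SUBSET, &naive)  # "die" (from the table)
stem_with_special_cases("news", SPECIAL_CASES_SUBSET, &naive)   # "news" (invariant form)
stem_with_special_cases("going", SPECIAL_CASES_SUBSET, &naive)  # "go"  (falls through)
```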
- STEP_1A_SPECIAL_CASES =
Special-case words for which processing stops after step 1a.
%w[ inning outing canning herring earring proceed exceed succeed ]
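A membership test is enough to apply this list. In the sketch below, `stop_after_step_1a?` is a hypothetical helper, and where the gem performs the check is an assumption:

```ruby
# Value copied from the listing.
STEP_1A_SPECIAL_CASES = %w[ inning outing canning herring earring proceed exceed succeed ]

# Hypothetical check: words in the list are returned unchanged once step 1a is done,
# so later steps never strip "ing" or "ed" from them.
def stop_after_step_1a?(word)
  STEP_1A_SPECIAL_CASES.include?(word)
end

stop_after_step_1a?("herring")  # true
stop_after_step_1a?("hopping")  # false
```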