Module: PragmaticSegmenter::Languages::Kazakh

Includes: Common
Defined in: lib/pragmatic_segmenter/languages/kazakh.rb

Defined Under Namespace

Modules: Abbreviation
Classes: AbbreviationReplacer, Processor

Constant Summary

MULTI_PERIOD_ABBREVIATION_REGEX =
/\b\p{Cyrillic}(?:\.\s?\p{Cyrillic})+[.]|b[a-z](?:\.[a-z])+[.]/i
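
As a rough illustration of what this pattern picks up, the sketch below copies the regex into a standalone script and scans a sample sentence. The helper name `multi_period_abbreviations` and the sample string are assumptions made for this example; they are not part of the gem, and how AbbreviationReplacer uses the constant internally is not shown here.

```ruby
# Standalone sketch: the regex is copied from the constant summary above so
# the snippet runs without loading the gem.
MULTI_PERIOD_ABBREVIATION_REGEX =
  /\b\p{Cyrillic}(?:\.\s?\p{Cyrillic})+[.]|b[a-z](?:\.[a-z])+[.]/i

# Hypothetical helper (not part of pragmatic_segmenter): returns every
# substring of +text+ that the pattern matches.
def multi_period_abbreviations(text)
  text.scan(MULTI_PERIOD_ABBREVIATION_REGEX)
end

# Illustrative sample sentence containing the abbreviation "т.б.".
sample = "Ол кітап, дәптер т.б. сатып алды."
p multi_period_abbreviations(sample)
```

The first alternative of the pattern requires single Cyrillic letters separated by periods (with at most one whitespace character after each inner period) and a closing period, so an ordinary word followed by a sentence-final period, such as "алды." in the sample, is not matched.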

Constants included from Common

Common::BETWEEN_DOUBLE_QUOTES_REGEX, Common::CONTINUOUS_PUNCTUATION_REGEX, Common::ExtraWhiteSpaceRule, Common::FileFormatRule, Common::GeoLocationRule, Common::KommanditgesellschaftRule, Common::NUMBERED_REFERENCE_REGEX, Common::PARENS_BETWEEN_DOUBLE_QUOTES_REGEX, Common::PossessiveAbbreviationRule, Common::Punctuations, Common::QUOTATION_AT_END_OF_SENTENCE_REGEX, Common::QuestionMarkInQuotationRule, Common::SENTENCE_BOUNDARY_REGEX, Common::SPLIT_SPACE_QUOTATION_AT_END_OF_SENTENCE_REGEX, Common::SingleNewLineRule, Common::SubSingleQuoteRule