Module: PragmaticSegmenter::Languages::Deutsch
- Includes: Common
- Defined in: lib/pragmatic_segmenter/languages/deutsch.rb
Defined Under Namespace
Modules: Abbreviation, Numbers
Classes: AbbreviationReplacer, BetweenPunctuation, Processor
Constant Summary
- BETWEEN_UNCONVENTIONAL_DOUBLE_QUOTE_DE_REGEX =
Rubular: rubular.com/r/OdcXBsub0w
/,,(?>[^“\\]+|\\{2}|\\.)*“/
- SPLIT_DOUBLE_QUOTES_DE_REGEX =
Rubular: rubular.com/r/2UskIupGgP
/\A„(?>[^“\\]+|\\{2}|\\.)*“/
- BETWEEN_DOUBLE_QUOTES_DE_REGEX =
Rubular: rubular.com/r/TkZomF9tTM
/„(?>[^“\\]+|\\{2}|\\.)*“/
- MONTHS =
['Januar', 'Februar', 'März', 'April', 'Mai', 'Juni', 'Juli', 'August', 'September', 'Oktober', 'November', 'Dezember'].freeze
- SingleLowerCaseLetterRule =
Rubular: rubular.com/r/B4X33QKIL8
Rule.new(/(?<=\s[a-z])\.(?=\s)/, '∯')
- SingleLowerCaseLetterAtStartOfLineRule =
Rubular: rubular.com/r/iUNSkCuso0
Rule.new(/(?<=^[a-z])\.(?=\s)/, '∯')
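The following is a minimal sketch of how the German quotation patterns and the two single-letter rules listed above behave on sample text. It copies the patterns into a standalone module so that it runs without the gem. The module name DeutschSketch and its methods (quoted_spans, leading_quote?, protect_single_letters) are hypothetical and not part of PragmaticSegmenter. The sketch also assumes that a Rule is applied as a global substitution of its pattern with its replacement string, which the listing itself does not state.

```ruby
# Illustrative sketch only: DeutschSketch and its methods are hypothetical
# names, not part of the PragmaticSegmenter API.
module DeutschSketch
  # Patterns copied from the constant summary above.
  BETWEEN_DOUBLE_QUOTES_DE_REGEX = /„(?>[^“\\]+|\\{2}|\\.)*“/
  SPLIT_DOUBLE_QUOTES_DE_REGEX = /\A„(?>[^“\\]+|\\{2}|\\.)*“/
  SINGLE_LOWER_CASE_LETTER = /(?<=\s[a-z])\.(?=\s)/
  SINGLE_LOWER_CASE_LETTER_AT_START = /(?<=^[a-z])\.(?=\s)/
  PLACEHOLDER = '∯'

  module_function

  # Returns every „…“ span found in the text.
  def quoted_spans(text)
    text.scan(BETWEEN_DOUBLE_QUOTES_DE_REGEX)
  end

  # True when the text begins with a „…“ span (the \A-anchored variant).
  def leading_quote?(text)
    text.match?(SPLIT_DOUBLE_QUOTES_DE_REGEX)
  end

  # Assumption: a rule is applied as a plain global substitution.
  # Replaces the period after a lone lowercase letter with the placeholder.
  def protect_single_letters(text)
    text.gsub(SINGLE_LOWER_CASE_LETTER_AT_START, PLACEHOLDER)
        .gsub(SINGLE_LOWER_CASE_LETTER, PLACEHOLDER)
  end
end

sample = 'Er sagte: „Das ist gut. Wirklich.“ Dann ging er.'
p DeutschSketch.quoted_spans(sample)
# => ["„Das ist gut. Wirklich.“"]

puts DeutschSketch.protect_single_letters('Siehe Abschnitt a. und b. im Anhang.')
# prints: Siehe Abschnitt a∯ und b∯ im Anhang.
```

In this sketch the periods after "a" and "b" are replaced, while the period that ends the sentence is left alone because it does not follow a whitespace-delimited single letter and has no whitespace after it.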
Constants included from Common
Common::BETWEEN_DOUBLE_QUOTES_REGEX, Common::CONTINUOUS_PUNCTUATION_REGEX, Common::ExtraWhiteSpaceRule, Common::FileFormatRule, Common::GeoLocationRule, Common::KommanditgesellschaftRule, Common::MULTI_PERIOD_ABBREVIATION_REGEX, Common::NUMBERED_REFERENCE_REGEX, Common::PARENS_BETWEEN_DOUBLE_QUOTES_REGEX, Common::PossessiveAbbreviationRule, Common::Punctuations, Common::QUOTATION_AT_END_OF_SENTENCE_REGEX, Common::QuestionMarkInQuotationRule, Common::SENTENCE_BOUNDARY_REGEX, Common::SPLIT_SPACE_QUOTATION_AT_END_OF_SENTENCE_REGEX, Common::SingleNewLineRule, Common::SubSingleQuoteRule