Class: PragmaticSegmenter::Languages::Japanese::Cleaner
- Defined in:
- lib/pragmatic_segmenter/languages/japanese.rb
Constant Summary
- NewLineInMiddleOfWordRule =
  Rubular: rubular.com/r/N4kPuJgle7
  Rule.new(/(?<=の)\n(?=\S)/, '')
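The pattern in NewLineInMiddleOfWordRule deletes a newline only when it directly follows の and is directly followed by a non-whitespace character. The sketch below applies the same regular expression with plain String#gsub to show which inputs it touches. The constant name NEWLINE_AFTER_NO and the helper join_after_no are hypothetical, and gsub is only a stand-in for however the library's Rule objects are actually applied.

```ruby
# Same pattern as NewLineInMiddleOfWordRule; the name below is illustrative only.
NEWLINE_AFTER_NO = /(?<=の)\n(?=\S)/

# Hypothetical helper: String#gsub stands in for the library's rule application.
def join_after_no(text)
  text.gsub(NEWLINE_AFTER_NO, '')
end

# Newline directly after の and before a non-space character: removed.
puts join_after_no("これは父の\n家です。").inspect

# Newline not preceded by の: left alone.
puts join_after_no("これは大きな\n家です。").inspect

# Newline after の but followed by a space: left alone, because (?=\S) fails.
puts join_after_no("私の\n 家").inspect
```

Both lookarounds are zero-width, so only the newline itself is replaced and the surrounding characters stay in place.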
Constants included from Cleaner::Rules
Cleaner::Rules::ConsecutiveForwardSlashRule, Cleaner::Rules::ConsecutivePeriodsRule, Cleaner::Rules::DoubleNewLineRule, Cleaner::Rules::DoubleNewLineWithSpaceRule, Cleaner::Rules::EscapedCarriageReturnRule, Cleaner::Rules::EscapedNewLineRule, Cleaner::Rules::InlineFormattingRule, Cleaner::Rules::NEWLINE_IN_MIDDLE_OF_SENTENCE_REGEX, Cleaner::Rules::NO_SPACE_BETWEEN_SENTENCES_DIGIT_REGEX, Cleaner::Rules::NO_SPACE_BETWEEN_SENTENCES_REGEX, Cleaner::Rules::NewLineFollowedByBulletRule, Cleaner::Rules::NewLineFollowedByPeriodRule, Cleaner::Rules::NoSpaceBetweenSentencesDigitRule, Cleaner::Rules::NoSpaceBetweenSentencesRule, Cleaner::Rules::QuotationsFirstRule, Cleaner::Rules::QuotationsSecondRule, Cleaner::Rules::ReplaceNewlineWithCarriageReturnRule, Cleaner::Rules::TableOfContentsRule, Cleaner::Rules::TypoEscapedCarriageReturnRule, Cleaner::Rules::TypoEscapedNewLineRule, Cleaner::Rules::URL_EMAIL_KEYWORDS
Instance Attribute Summary
Attributes inherited from Cleaner
Instance Method Summary
Methods inherited from Cleaner
Constructor Details
This class inherits a constructor from PragmaticSegmenter::Cleaner
Instance Method Details
#clean ⇒ Object
# File 'lib/pragmatic_segmenter/languages/japanese.rb', line 12
def clean
  super
  remove_newline_in_middle_of_word
end
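The #clean override follows a common shape: run the inherited cleaning pass, then apply one language-specific rule. Here is a minimal, self-contained sketch of that shape. BaseCleaner and JapaneseCleaner are hypothetical stand-ins, not the library's classes, and the generic pass shown (collapsing repeated newlines) is an assumption chosen only to make the ordering visible.

```ruby
# Hypothetical stand-in for the parent cleaner; not PragmaticSegmenter::Cleaner.
class BaseCleaner
  attr_reader :text

  def initialize(text:)
    @text = text.dup
  end

  # Assumed generic pass for illustration: collapse runs of newlines into one.
  def clean
    @text.gsub!(/\n{2,}/, "\n")
    @text
  end
end

# Hypothetical stand-in for the Japanese cleaner, mirroring the shape of #clean.
class JapaneseCleaner < BaseCleaner
  NEWLINE_IN_MIDDLE_OF_WORD = /(?<=の)\n(?=\S)/

  def clean
    super                             # generic pass first
    remove_newline_in_middle_of_word  # then the Japanese-specific rule
  end

  private

  def remove_newline_in_middle_of_word
    # gsub! returns nil when nothing matched, so return @text explicitly.
    @text.gsub!(NEWLINE_IN_MIDDLE_OF_WORD, '')
    @text
  end
end

puts JapaneseCleaner.new(text: "父の\n\n家です。").clean.inspect
```

Calling super first means the language-specific rule sees text that the generic pass has already normalized. In this sketch the doubled newline is reduced to one before the の rule removes it.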