Class: PragmaticSegmenter::List
- Inherits:
- Object
- Inheritance hierarchy:
- PragmaticSegmenter::List < Object
- Defined in:
- lib/pragmatic_segmenter/list.rb
Overview
This class searches for a list within a string and adds newlines before each list item.
Constant Summary
- ROMAN_NUMERALS =
%w(i ii iii iv v vi vii viii ix x xi xii xiii xiv x xi xii xiii xv xvi xvii xviii xix xx)
- LATIN_NUMERALS =
('a'..'z').to_a
- ALPHABETICAL_LIST_WITH_PERIODS =
Rubular: rubular.com/r/XcpaJKH0sz
/(?<=^)[a-z](?=\.)|(?<=\A)[a-z](?=\.)|(?<=\s)[a-z](?=\.)/
- ALPHABETICAL_LIST_WITH_PARENS =
Rubular: rubular.com/r/Gu5rQapywf
/(?<=\()[a-z]+(?=\))|(?<=^)[a-z]+(?=\))|(?<=\A)[a-z]+(?=\))|(?<=\s)[a-z]+(?=\))/i
- SubstituteListPeriodRule =
Rule.new(/♨/, '∯')
- ListMarkerRule =
Rule.new(/☝/, '')
- SpaceBetweenListItemsFirstRule =
Rubular: rubular.com/r/Wv4qLdoPx7
Rule.new(/(?<=\S\S|^)\s(?=\S\s*\d{1,2}♨)/, "\r")
- SpaceBetweenListItemsSecondRule =
Rubular: rubular.com/r/AizHXC6HxK
Rule.new(/(?<=\S\S|^)\s(?=\d{1,2}♨)/, "\r")
- SpaceBetweenListItemsThirdRule =
Rubular: rubular.com/r/GE5q6yID2j
Rule.new(/(?<=\S\S|^)\s(?=\d{1,2}☝)/, "\r")
- NUMBERED_LIST_REGEX_1 =
/\s\d{1,2}(?=\.\s)|^\d{1,2}(?=\.\s)|\s\d{1,2}(?=\.\))|^\d{1,2}(?=\.\))|(?<=\s\-)\d{1,2}(?=\.\s)|(?<=^\-)\d{1,2}(?=\.\s)|(?<=\s\⁃)\d{1,2}(?=\.\s)|(?<=^\⁃)\d{1,2}(?=\.\s)|(?<=s\-)\d{1,2}(?=\.\))|(?<=^\-)\d{1,2}(?=\.\))|(?<=\s\⁃)\d{1,2}(?=\.\))|(?<=^\⁃)\d{1,2}(?=\.\))/
- NUMBERED_LIST_REGEX_2 =
/(?<=\s)\d{1,2}\.(?=\s)|^\d{1,2}\.(?=\s)|(?<=\s)\d{1,2}\.(?=\))|^\d{1,2}\.(?=\))|(?<=\s\-)\d{1,2}\.(?=\s)|(?<=^\-)\d{1,2}\.(?=\s)|(?<=\s\⁃)\d{1,2}\.(?=\s)|(?<=^\⁃)\d{1,2}\.(?=\s)|(?<=\s\-)\d{1,2}\.(?=\))|(?<=^\-)\d{1,2}\.(?=\))|(?<=\s\⁃)\d{1,2}\.(?=\))|(?<=^\⁃)\d{1,2}\.(?=\))/
- NUMBERED_LIST_PARENS_REGEX =
/\d{1,2}(?=\)\s)/
- EXTRACT_ALPHABETICAL_LIST_LETTERS_REGEX =
Rubular: rubular.com/r/NsNFSqrNvJ
/\([a-z]+(?=\))|(?<=^)[a-z]+(?=\))|(?<=\A)[a-z]+(?=\))|(?<=\s)[a-z]+(?=\))/i
- ALPHABETICAL_LIST_LETTERS_AND_PERIODS_REGEX =
Rubular: rubular.com/r/wMpnVedEIb
/(?<=^)[a-z]\.|(?<=\A)[a-z]\.|(?<=\s)[a-z]\./i
- ROMAN_NUMERALS_IN_PARENTHESES =
Rubular: rubular.com/r/GcnmQt4a3I
/\(((?=[mdclxvi])m*(c[md]|d?c*)(x[cl]|l?x*)(i[xv]|v?i*))\)(?=\s[A-Z])/
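To make the lookaround-based patterns above more concrete, here is a minimal sketch that runs two of them against sample strings. The regular expressions are copied verbatim from the constant summary; the `list_markers` helper and the sample strings are assumptions made for illustration and are not part of the library.

```ruby
# Patterns copied verbatim from the constant summary above.
ALPHABETICAL_LIST_WITH_PERIODS =
  /(?<=^)[a-z](?=\.)|(?<=\A)[a-z](?=\.)|(?<=\s)[a-z](?=\.)/
NUMBERED_LIST_PARENS_REGEX = /\d{1,2}(?=\)\s)/

# Hypothetical helper (not part of the library): returns the list markers
# that a pattern finds in a string.
def list_markers(text, pattern)
  text.scan(pattern)
end

# The lookarounds keep the period or parenthesis out of the match itself.
p list_markers("a. apples b. pears", ALPHABETICAL_LIST_WITH_PERIODS) # => ["a", "b"]
p list_markers("1) apples 2) pears", NUMBERED_LIST_PARENS_REGEX)     # => ["1", "2"]

# ALPHABETICAL_LIST_WITH_PERIODS has no /i flag, so upper-case markers
# are not picked up by this particular pattern.
p list_markers("A. apples B. pears", ALPHABETICAL_LIST_WITH_PERIODS) # => []
```

Because the delimiters sit inside lookaheads, only the marker (`a`, `1`, and so on) is returned, which is presumably what lets the class inspect the sequence of markers before deciding whether a run of text is a list.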
Instance Attribute Summary
- #text ⇒ Object (readonly)
Returns the value of attribute text.
Instance Method Summary
- #add_line_break ⇒ Object
- #initialize(text:) ⇒ List (constructor)
A new instance of List.
- #replace_parens ⇒ Object
Constructor Details
#initialize(text:) ⇒ List
Returns a new instance of List.
# File 'lib/pragmatic_segmenter/list.rb', line 50
def initialize(text:)
  @text = text.dup
end
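The constructor stores a copy of its argument rather than the argument itself. A plausible reason is that the class later mutates `text` in place (see the `gsub!` in `#replace_parens`), and `dup` keeps those changes away from the caller's string. The sketch below illustrates the effect with a hypothetical stand-in class, `ListSketch`, and a made-up mutating method, `shout!`; neither exists in the library.

```ruby
# Hypothetical stand-in for the documented class. Only the constructor,
# the reader, and one in-place mutation are modeled.
class ListSketch
  attr_reader :text

  def initialize(text:)
    @text = text.dup # private copy, mirroring the documented constructor
  end

  # Made-up method: mutates the stored string in place, comparable in
  # effect to the gsub! call in #replace_parens.
  def shout!
    text.upcase!
    text
  end
end

# Hypothetical helper: returns the caller's string and the instance's
# string after an in-place mutation.
def dup_demo(original)
  list = ListSketch.new(text: original)
  list.shout!
  [original, list.text]
end

p dup_demo("1. first 2. second")
# => ["1. first 2. second", "1. FIRST 2. SECOND"]
```

A side effect of `dup` is that a frozen input still yields an unfrozen copy, so in-place methods can run on it without raising.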
Instance Attribute Details
#text ⇒ Object (readonly)
Returns the value of attribute text.
# File 'lib/pragmatic_segmenter/list.rb', line 49
def text
  @text
end
Instance Method Details
#add_line_break ⇒ Object
# File 'lib/pragmatic_segmenter/list.rb', line 54
def add_line_break
  format_alphabetical_lists
  format_roman_numeral_lists
  format_numbered_list_with_periods
  format_numbered_list_with_parens
end
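The four helpers called above are private and not shown on this page, so their exact steps are not documented here. As a rough illustration of the kind of substitution the rule constants describe, the sketch below applies the pattern and replacement from `SpaceBetweenListItemsSecondRule`, followed by those from `SubstituteListPeriodRule`, using plain `gsub`. It assumes that an earlier step has already replaced each list period with the `♨` placeholder; that assumption, the constant names, and the `break_numbered_items` helper are all hypothetical.

```ruby
# Patterns copied from SpaceBetweenListItemsSecondRule and
# SubstituteListPeriodRule in the constant summary.
SPACE_BETWEEN_ITEMS = /(?<=\S\S|^)\s(?=\d{1,2}♨)/
LIST_PERIOD_MARKER  = /♨/

# Hypothetical helper, not the library's implementation. Assumes the
# list periods were already swapped for the ♨ placeholder.
def break_numbered_items(marked_text)
  marked_text
    .gsub(SPACE_BETWEEN_ITEMS, "\r") # line break before each "N♨" marker
    .gsub(LIST_PERIOD_MARKER, '∯')   # replace the placeholder with ∯
end

p break_numbered_items("1♨ apples 2♨ pears 3♨ plums")
# => "1∯ apples\r2∯ pears\r3∯ plums"
```

Only the whitespace that directly precedes a numbered marker becomes `"\r"`; the space after each marker is left alone because the lookahead requires one or two digits followed by `♨`.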
#replace_parens ⇒ Object
# File 'lib/pragmatic_segmenter/list.rb', line 61
def replace_parens
  text.gsub!(ROMAN_NUMERALS_IN_PARENTHESES, '&✂&\1&⌬&'.freeze)
  text
end
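The effect of the substitution is easier to see on a sample string. The following sketch reuses the pattern and replacement string shown above in a non-mutating helper; `replace_parens_sketch` is a hypothetical name, and it uses `gsub` instead of `gsub!` so that it can be called on string literals.

```ruby
# Pattern copied verbatim from the constant summary.
ROMAN_NUMERALS_IN_PARENTHESES =
  /\(((?=[mdclxvi])m*(c[md]|d?c*)(x[cl]|l?x*)(i[xv]|v?i*))\)(?=\s[A-Z])/

# Hypothetical non-mutating variant of #replace_parens.
def replace_parens_sketch(text)
  # \1 is the numeral captured between the parentheses; the surrounding
  # "&✂&" and "&⌬&" stand in for "(" and ")".
  text.gsub(ROMAN_NUMERALS_IN_PARENTHESES, '&✂&\1&⌬&')
end

p replace_parens_sketch("(i) Hello (ii) World")
# => "&✂&i&⌬& Hello &✂&ii&⌬& World"

# The lookahead requires whitespace plus an upper-case letter, so a
# numeral followed by lower-case text is left untouched.
p replace_parens_sketch("see (iv) below")
# => "see (iv) below"
```

Note that the documented method returns `text` explicitly after `gsub!`. This matters because `gsub!` returns `nil` when nothing was substituted, so returning its result directly would not be reliable.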