Top Level Namespace
Defined Under Namespace
Modules: Ramparts
Classes: EmailParser, PhoneParser, UrlParser
Constant Summary
- ARGUMENT_ERROR_TEXT =
'Parameter 1, the block of text to parse, is not a string'.freeze
- MR_ALGO =
The map reduce (MR) algorithm. Roughly 2x faster than the glorified regex (GR) algorithm. It maps parts of the text such as ‘at’ or ‘FOUR’ down to ‘@’ and ‘4’, removes spaces and similar noise, and then runs a simple regex over the remainder. Information is lost along the way, so it cannot return the indices of matches.
'MR'.freeze
- GR_ALGO =
The glorified regex (GR) algorithm. An obtuse yet robust regex that does a single pass over the text. Because the regex is so complicated, it is slower than the map reduce algorithm. No information is lost, so it can return the indices of where phone numbers and other matches occur.
'GR'.freeze
- EMAIL_DOMAINS =
%w[ gmail yahoo hotmail aol icloud live outlook ymail comcast shaw rogers msn mail me att careguide sbcglobal rocketmail telus sympatico cox gmai email aim yandex gamil gmx student students earthlink gnail juno gmsil netzero ail gmil gmal hmail yaho alumni gmial googlemail tampabay mtroyal usa cfl yshoo protonmail rediffmail liberty maine inbox optimum example yhaoo yorku mchsi yahoi zoho hushmail libero hotmal ukr wowway post lycos yaboo contractor yahool ].freeze
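The map reduce idea described under MR_ALGO can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `normalize_for_phone_search`, `contains_phone_number?`, the word table, and the ten-digit regex are hypothetical and are not taken from the Ramparts source. The sketch covers only the phone-number case.

```ruby
# Hypothetical helper, not part of Ramparts: map, reduce, then match.
def normalize_for_phone_search(text)
  # Assumed mapping table; the library's real mappings are not shown on this page.
  word_to_digit = {
    'zero' => '0', 'one' => '1', 'two' => '2', 'three' => '3', 'four' => '4',
    'five' => '5', 'six' => '6', 'seven' => '7', 'eight' => '8', 'nine' => '9'
  }
  pattern = /\b(?:#{word_to_digit.keys.join('|')})\b/

  # Map: spelled-out digits become numerals ("FOUR" -> "4").
  mapped = text.downcase.gsub(pattern, word_to_digit)
  # Reduce: drop whitespace and common separators.
  mapped.gsub(/[\s\-().]/, '')
end

# Hypothetical helper: a deliberately simple regex, any run of ten digits.
def contains_phone_number?(text)
  normalize_for_phone_search(text).match?(/\d{10}/)
end

normalized = normalize_for_phone_search('call me at FOUR one six 555 1234')
# => "callmeat4165551234"
found = contains_phone_number?('call me at FOUR one six 555 1234')
# => true
```

The normalized string no longer lines up character for character with the input, which is why an approach of this kind can report that a match exists but not where it sits in the original text.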
Instance Method Summary
- #ranges_overlap?(r1, r2) ⇒ Boolean
Checks whether two ranges overlap.
- #replace(text, instances, &block) ⇒ Object
Replaces each matched instance in the given text with the insertable returned by the block.
- #scan(text, regex, type) ⇒ Object
Scans the given text with the given regex and returns the matches.
Instance Method Details
#ranges_overlap?(r1, r2) ⇒ Boolean
Checks whether two ranges overlap.
    # File 'lib/ramparts/helpers.rb', line 44
    def ranges_overlap?(r1, r2)
      r1.cover?(r2.first) || r2.cover?(r1.first)
    end
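A short usage sketch may help show how the check behaves at range boundaries. The method body is copied from the listing so the snippet runs on its own; the three ranges are made-up offsets, not values produced by the library.

```ruby
# Copied from the listing so the snippet is self-contained.
def ranges_overlap?(r1, r2)
  r1.cover?(r2.first) || r2.cover?(r1.first)
end

# Hypothetical offsets of matches found in the same text.
first_match  = (0...10)
second_match = (5...15)
third_match  = (10...20)

overlapping = ranges_overlap?(first_match, second_match) # => true
adjacent    = ranges_overlap?(first_match, third_match)  # => false

# With inclusive ranges, a shared endpoint counts as an overlap.
touching = ranges_overlap?((0..10), (10..20))            # => true
```

Because exclusive ranges (`...`) do not cover their end value, two matches that sit back to back are not reported as overlapping.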
#replace(text, instances, &block) ⇒ Object
Replaces each matched instance in the given text with the insertable returned by the block. The block is called once per instance.
    # File 'lib/ramparts/helpers.rb', line 19
    def replace(text, instances, &block) # rubocop:disable Lint/UnusedMethodArgument
      altered_text = String.new(text)
      instances.map do |instance|
        insertable = yield instance
        altered_text[instance[:start_offset]...instance[:end_offset]] = insertable
      end
      altered_text
    end
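The following sketch shows one way the method might be called. The method body is copied from the listing; the sample text and the hand-written instances are assumptions for illustration. Since the offsets index into a copy that is mutated as replacements are made, the second call passes the instances last to first so that earlier offsets stay valid when the replacement differs in length from the match. That ordering is a choice made in this sketch, not something the page confirms about the library's callers.

```ruby
# Copied from the listing so the snippet is self-contained.
def replace(text, instances, &block) # rubocop:disable Lint/UnusedMethodArgument
  altered_text = String.new(text)
  instances.map do |instance|
    insertable = yield instance
    altered_text[instance[:start_offset]...instance[:end_offset]] = insertable
  end
  altered_text
end

text = 'Call 555-1234 or 555-9876 today'

# Hand-written instances in the shape the method expects (illustrative values).
instances = [
  { start_offset: 5,  end_offset: 13, value: '555-1234', type: :phone },
  { start_offset: 17, end_offset: 25, value: '555-9876', type: :phone }
]

# Same-length replacement: offsets stay valid in any order.
masked = replace(text, instances) { |instance| '*' * instance[:value].length }
# => "Call ******** or ******** today"

# Different-length replacement: process the last match first (sketch's assumption).
redacted = replace(text, instances.reverse) { |_instance| '[PHONE]' }
# => "Call [PHONE] or [PHONE] today"
```

The original `text` is left untouched because the method works on a copy created with `String.new(text)`.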
#scan(text, regex, type) ⇒ Object
Scans the given text with the given regex and returns one hash per match containing its start offset, end offset, value, and type.
    # File 'lib/ramparts/helpers.rb', line 30
    def scan(text, regex, type)
      text
        .enum_for(:scan, regex)
        .map do
          {
            start_offset: Regexp.last_match.begin(0),
            end_offset: Regexp.last_match.begin(0) + Regexp.last_match.to_s.length,
            value: Regexp.last_match.to_s,
            type: type
          }
        end
    end
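As a usage sketch, the method can be exercised with a simple regex to see the shape of what it returns. The method body is copied from the listing; the sample text, the `/\d+/` pattern, and the `:number` type are assumptions chosen for illustration and are not the patterns the library uses.

```ruby
# Copied from the listing so the snippet is self-contained.
def scan(text, regex, type)
  text
    .enum_for(:scan, regex)
    .map do
      {
        start_offset: Regexp.last_match.begin(0),
        end_offset: Regexp.last_match.begin(0) + Regexp.last_match.to_s.length,
        value: Regexp.last_match.to_s,
        type: type
      }
    end
end

text = 'a1 b22 c333'
matches = scan(text, /\d+/, :number)
# => [
#      { start_offset: 1, end_offset: 2,  value: "1",   type: :number },
#      { start_offset: 4, end_offset: 6,  value: "22",  type: :number },
#      { start_offset: 8, end_offset: 11, value: "333", type: :number }
#    ]

# The offsets form an exclusive range into the original text.
slices = matches.map { |m| text[m[:start_offset]...m[:end_offset]] }
# => ["1", "22", "333"]
```

The returned hashes use the same `:start_offset` and `:end_offset` keys that `replace` reads, so the output of one can be fed to the other.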