Module: Tickle
- Defined in:
- lib/tickle.rb, lib/tickle/tickle.rb, lib/tickle/handler.rb
Overview
:nodoc:
Defined Under Namespace
Classes: InvalidArgumentException, InvalidDateExpression, Repeater, Token
Constant Summary
- VERSION =
"0.1.7"
Class Method Summary
-
.base_tokenize(text) ⇒ Object
Split the text on spaces and convert each word into a Token.
-
.combine_multiple_numbers ⇒ Object
Turns compound numbers, like ‘twenty first’ => 21.
- .debug=(val) ⇒ Object
- .dwrite(msg, line_feed = nil) ⇒ Object
-
.guess ⇒ Object
The heavy lifting.
- .guess_month_names ⇒ Object
- .guess_number_and_unit ⇒ Object
- .guess_ordinal ⇒ Object
- .guess_ordinal_and_unit ⇒ Object
- .guess_special ⇒ Object
- .guess_unit_types ⇒ Object
- .guess_weekday ⇒ Object
- .is_date(str) ⇒ Object
-
.normalize(text) ⇒ Object
Clean up the specified input text by stripping unwanted characters, converting idioms to their canonical form, converting number words to numbers (three => 3), and converting ordinal words to numeric ordinals (third => 3rd).
-
.normalize_us_holidays(text) ⇒ Object
Converts natural language US Holidays into a date expression to be parsed.
-
.parse(text, specified_options = {}) ⇒ Object
Parses a natural-language recurrence expression. Accepts the configuration options start and until.
-
.post_tokenize ⇒ Object
Normalizes each token.
-
.pre_filter(text) ⇒ Object
Normalizes a natural-language string by removing prefix words such as ‘every’, ‘each’ and ‘repeat’.
-
.process_for_ending(text) ⇒ Object
Processes the remaining expression to see whether an ‘until’, ‘end’ or ‘ending’ clause is specified.
-
.scan_expression(text, options) ⇒ Object
Scans the expression for a variety of natural formats, such as ‘every thursday starting tomorrow until May 15th’.
-
.token_types ⇒ Object
Returns an array of types for all tokens.
Class Method Details
.base_tokenize(text) ⇒ Object
Split the text on spaces and convert each word into a Token
    # File 'lib/tickle/tickle.rb', line 181

    def base_tokenize(text) #:nodoc:
      text.split(' ').map { |word| Token.new(word) }
    end
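As a rough illustration of this tokenizing step, the sketch below uses a hypothetical `SketchToken` struct in place of Tickle's `Token` class, whose definition is not shown on this page. The method name `base_tokenize_sketch` is likewise made up for the example.

```ruby
# SketchToken is a hypothetical stand-in for Tickle::Token.
SketchToken = Struct.new(:original, :word, :type)

# Split on runs of whitespace and wrap each word, as base_tokenize does.
def base_tokenize_sketch(text)
  text.split(' ').map { |word| SketchToken.new(word) }
end

tokens = base_tokenize_sketch('every 3rd  tuesday')
puts tokens.map(&:original).inspect
```

Because `String#split` with a single-space argument collapses repeated whitespace, the doubled space in the input does not produce an empty token.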
.combine_multiple_numbers ⇒ Object
Turns compound numbers, like ‘twenty first’ => 21
    # File 'lib/tickle/tickle.rb', line 239

    def combine_multiple_numbers
      if [:number, :ordinal].all? {|type| token_types.include? type}
        number = token_of_type(:number)
        ordinal = token_of_type(:ordinal)
        combined_original = "#{number.original} #{ordinal.original}"
        Tickle.dwrite "number.start = #{number.start}"
        Tickle.dwrite "number.start.to_s = #{number.start.to_s}"
        Tickle.dwrite "number.start.to_s[0] = #{number.start.to_s[0]}"
        combined_word = ([number.start.to_s.chars.first, ordinal.word].join(""))
        combined_value = ([number.start.to_s.chars.first, ordinal.start.to_s].join(""))
        new_number_token = Token.new(combined_original, combined_word, :ordinal, combined_value, 365)
        @tokens.reject! {|token| (token.type == :number || token.type == :ordinal)}
        @tokens << new_number_token
      end
    end
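The string arithmetic in this method can be hard to follow, so here is a minimal sketch of just the combining step. `PartToken` and `combine_parts` are hypothetical names, and the sample values ('20' for ‘twenty’, '1st' for ‘first’) are an assumption about what the earlier normalization produces.

```ruby
# PartToken is a hypothetical stand-in exposing only the fields the method reads.
PartToken = Struct.new(:original, :word, :start)

# Join the first digit of the number with the ordinal, e.g. "2" + "1st".
def combine_parts(number, ordinal)
  tens_digit = number.start.to_s.chars.first
  {
    original: "#{number.original} #{ordinal.original}",
    word: [tens_digit, ordinal.word].join,
    value: [tens_digit, ordinal.start.to_s].join
  }
end

number = PartToken.new('twenty', '20', 20)
ordinal = PartToken.new('first', '1st', 1)
p combine_parts(number, ordinal)
```

Since only the first digit of the number is kept, the approach appears to target two-digit values such as 20 or 30 followed by a single-digit ordinal.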
.debug=(val) ⇒ Object
    # File 'lib/tickle.rb', line 87

    def self.debug=(val); @debug = val; end
.dwrite(msg, line_feed = nil) ⇒ Object
    # File 'lib/tickle.rb', line 89

    def self.dwrite(msg, line_feed=nil)
      (line_feed ? p(">> #{msg}") : puts(">> #{msg}")) if @debug
    end
.guess ⇒ Object
The heavy lifting. Goes through each token grouping to determine which natural language should either be parsed by Chronic or returned. This methodology makes extension fairly simple, as new token types can easily be added in Repeater and then processed by the guess method.

    # File 'lib/tickle/handler.rb', line 8

    def guess()
      return nil if @tokens.empty?

      guess_unit_types
      guess_weekday unless @next
      guess_month_names unless @next
      guess_number_and_unit unless @next
      guess_ordinal unless @next
      guess_ordinal_and_unit unless @next
      guess_special unless @next

      # check to see if next is less than now and, if so, set it to next year
      @next = Time.local(@next.year + 1, @next.month, @next.day, @next.hour, @next.min, @next.sec) if @next && @next.to_date < @start.to_date

      # return the next occurrence
      return @next.to_time if @next
    end
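The final adjustment in `guess` moves a guessed occurrence into the following year when it falls before the start date. The sketch below isolates that rule. `roll_forward` is a hypothetical name, and the dates are arbitrary sample values.

```ruby
require 'date' # provides Time#to_date

# If the candidate falls before the start date, repeat it one year later.
def roll_forward(next_time, start_time)
  return next_time unless next_time.to_date < start_time.to_date

  Time.local(next_time.year + 1, next_time.month, next_time.day,
             next_time.hour, next_time.min, next_time.sec)
end

start = Time.local(2024, 6, 1)
puts roll_forward(Time.local(2024, 3, 15, 9, 0, 0), start).year
puts roll_forward(Time.local(2024, 7, 1), start).year
```

The comparison is made on dates rather than times, so a candidate earlier on the same day as the start is left unchanged.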
.guess_month_names ⇒ Object
    # File 'lib/tickle/handler.rb', line 37

    def guess_month_names
      @next = chronic_parse_with_start("#{Date::MONTHNAMES[token_of_type(:month_name).start]} 1") if token_types.same?([:month_name])
    end
.guess_number_and_unit ⇒ Object
    # File 'lib/tickle/handler.rb', line 41

    def guess_number_and_unit
      @next = @start.bump(:day, token_of_type(:number).interval) if token_types.same?([:number, :day])
      @next = @start.bump(:week, token_of_type(:number).interval) if token_types.same?([:number, :week])
      @next = @start.bump(:month, token_of_type(:number).interval) if token_types.same?([:number, :month])
      @next = @start.bump(:year, token_of_type(:number).interval) if token_types.same?([:number, :year])
      @next = chronic_parse_with_start("#{token_of_type(:month_name).word} #{token_of_type(:number).start}") if token_types.same?([:number, :month_name])
      @next = chronic_parse_with_start("#{token_of_type(:specific_year).word}-#{token_of_type(:month_name).start}-#{token_of_type(:number).start}") if token_types.same?([:number, :month_name, :specific_year])
    end
.guess_ordinal ⇒ Object
    # File 'lib/tickle/handler.rb', line 50

    def guess_ordinal
      @next = handle_same_day_chronic_issue(@start.year, @start.month, token_of_type(:ordinal).start) if token_types.same?([:ordinal])
    end
.guess_ordinal_and_unit ⇒ Object
    # File 'lib/tickle/handler.rb', line 54

    def guess_ordinal_and_unit
      @next = handle_same_day_chronic_issue(@start.year, token_of_type(:month_name).start, token_of_type(:ordinal).start) if token_types.same?([:ordinal, :month_name])
      @next = handle_same_day_chronic_issue(@start.year, @start.month, token_of_type(:ordinal).start) if token_types.same?([:ordinal, :month])
      @next = handle_same_day_chronic_issue(token_of_type(:specific_year).word, token_of_type(:month_name).start, token_of_type(:ordinal).start) if token_types.same?([:ordinal, :month_name, :specific_year])

      if token_types.same?([:ordinal, :weekday, :month_name])
        @next = chronic_parse_with_start("#{token_of_type(:ordinal).word} #{token_of_type(:weekday).start.to_s} in #{Date::MONTHNAMES[token_of_type(:month_name).start]}")
        @next = handle_same_day_chronic_issue(@start.year, token_of_type(:month_name).start, token_of_type(:ordinal).start) if @next.to_date == @start.to_date
      end

      if token_types.same?([:ordinal, :weekday, :month])
        @next = chronic_parse_with_start("#{token_of_type(:ordinal).word} #{token_of_type(:weekday).start.to_s} in #{Date::MONTHNAMES[get_next_month(token_of_type(:ordinal).start)]}")
        @next = handle_same_day_chronic_issue(@start.year, @start.month, token_of_type(:ordinal).start) if @next.to_date == @start.to_date
      end
    end
.guess_special ⇒ Object
    # File 'lib/tickle/handler.rb', line 70

    def guess_special
      guess_special_other
      guess_special_beginning unless @next
      guess_special_middle unless @next
      guess_special_end unless @next
    end
.guess_unit_types ⇒ Object
    # File 'lib/tickle/handler.rb', line 26

    def guess_unit_types
      @next = @start.bump(:day) if token_types.same?([:day])
      @next = @start.bump(:week) if token_types.same?([:week])
      @next = @start.bump(:month) if token_types.same?([:month])
      @next = @start.bump(:year) if token_types.same?([:year])
    end
.guess_weekday ⇒ Object
    # File 'lib/tickle/handler.rb', line 33

    def guess_weekday
      @next = chronic_parse_with_start("#{token_of_type(:weekday).start.to_s}") if token_types.same?([:weekday])
    end
.is_date(str) ⇒ Object
    # File 'lib/tickle.rb', line 93

    def self.is_date(str)
      begin
        Date.parse(str.to_s)
        return true
      rescue Exception => e
        return false
      end
    end
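The listing rescues `Exception`, which also catches unrelated failures such as interrupts. A narrower variant is sketched below under the assumption that only parse failures matter. `date_like?` is a hypothetical name and is not part of Tickle.

```ruby
require 'date'

# Returns true when Date.parse accepts the string, false when it rejects it.
# Date.parse signals an unparseable string with ArgumentError or its
# subclass Date::Error, depending on the Ruby version.
def date_like?(str)
  Date.parse(str.to_s)
  true
rescue ArgumentError
  false
end

puts date_like?('2024-06-01')
puts date_like?('xyz')
```

Calling `to_s` first means `nil` is treated as an empty string, which `Date.parse` rejects.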
.normalize(text) ⇒ Object
Clean up the specified input text by stripping unwanted characters, converting idioms to their canonical form, converting number words to numbers (three => 3), and converting ordinal words to numeric ordinals (third => 3rd)
    # File 'lib/tickle/tickle.rb', line 196

    def normalize(text) #:nodoc:
      normalized_text = text.to_s.downcase
      normalized_text = Numerizer.numerize(normalized_text)
      normalized_text.gsub!(/['"\.]/, '')
      normalized_text.gsub!(/([\/\-\,\@])/) { ' ' + $1 + ' ' }
      normalized_text
    end
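To show what the two substitutions do on their own, the sketch below repeats them without the `Numerizer` step, so number words are left untouched here. `normalize_punctuation` is a hypothetical name.

```ruby
# Lowercase, drop quotes and periods, then pad separators with spaces.
# The Numerizer step from the listing is deliberately omitted.
def normalize_punctuation(text)
  normalized = text.to_s.downcase
  normalized = normalized.gsub(/['"\.]/, '')
  normalized.gsub(/([\/\-\,\@])/) { " #{$1} " }
end

puts normalize_punctuation("St. Patrick's 3/17")
```

Padding the separators lets a later split on spaces treat `/`, `-`, `,` and `@` as tokens of their own.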
.normalize_us_holidays(text) ⇒ Object
Converts natural language US Holidays into a date expression to be parsed.
    # File 'lib/tickle/tickle.rb', line 206

    def normalize_us_holidays(text) #:nodoc:
      normalized_text = text.to_s.downcase
      normalized_text.gsub!(/\bnew\syear'?s?(\s)?(day)?\b/, "january 1, #{next_appropriate_year(1, 1)}")
      normalized_text.gsub!(/\bnew\syear'?s?(\s)?(eve)?\b/, "december 31, #{next_appropriate_year(12, 31)}")
      normalized_text.gsub!(/\bm(artin\s)?l(uther\s)?k(ing)?(\sday)?\b/, 'third monday in january')
      normalized_text.gsub!(/\binauguration(\sday)?\b/, 'january 20')
      normalized_text.gsub!(/\bpresident'?s?(\sday)?\b/, 'third monday in february')
      normalized_text.gsub!(/\bmemorial\sday\b/, '4th monday of may')
      normalized_text.gsub!(/\bindepend(e|a)nce\sday\b/, "july 4, #{next_appropriate_year(7, 4)}")
      normalized_text.gsub!(/\blabor\sday\b/, 'first monday in september')
      normalized_text.gsub!(/\bcolumbus\sday\b/, 'second monday in october')
      normalized_text.gsub!(/\bveterans?\sday\b/, "november 11, #{next_appropriate_year(11, 1)}")
      normalized_text.gsub!(/\bthanksgiving(\sday)?\b/, 'fourth thursday in november')
      normalized_text.gsub!(/\bchristmas\seve\b/, "december 24, #{next_appropriate_year(12, 24)}")
      normalized_text.gsub!(/\bchristmas(\sday)?\b/, "december 25, #{next_appropriate_year(12, 25)}")
      normalized_text.gsub!(/\bsuper\sbowl(\ssunday)?\b/, 'first sunday in february')
      normalized_text.gsub!(/\bgroundhog(\sday)?\b/, "february 2, #{next_appropriate_year(2, 2)}")
      normalized_text.gsub!(/\bvalentine'?s?(\sday)?\b/, "february 14, #{next_appropriate_year(2, 14)}")
      normalized_text.gsub!(/\bs(ain)?t\spatrick'?s?(\sday)?\b/, "march 17, #{next_appropriate_year(3, 17)}")
      normalized_text.gsub!(/\bapril\sfool'?s?(\sday)?\b/, "april 1, #{next_appropriate_year(4, 1)}")
      normalized_text.gsub!(/\bearth\sday\b/, "april 22, #{next_appropriate_year(4, 22)}")
      normalized_text.gsub!(/\barbor\sday\b/, 'fourth friday in april')
      normalized_text.gsub!(/\bcinco\sde\smayo\b/, "may 5, #{next_appropriate_year(5, 5)}")
      normalized_text.gsub!(/\bmother'?s?\sday\b/, 'second sunday in may')
      normalized_text.gsub!(/\bflag\sday\b/, "june 14, #{next_appropriate_year(6, 14)}")
      normalized_text.gsub!(/\bfather'?s?\sday\b/, 'third sunday in june')
      normalized_text.gsub!(/\bhalloween\b/, "october 31, #{next_appropriate_year(10, 31)}")
      normalized_text.gsub!(/\belection\sday\b/, 'second tuesday in november')
      normalized_text.gsub!(/\bkwanzaa\b/, "january 1, #{next_appropriate_year(1, 1)}")
      normalized_text
    end
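Each fixed-date holiday is rewritten to a month, day and year. The sketch below shows one such substitution in isolation. The helper `next_appropriate_year` is not shown on this page, so `next_year_for` is a hypothetical replacement whose behaviour (this year if the date has not yet passed, otherwise next year) is an assumption. The `today` parameter is added only to make the example deterministic.

```ruby
require 'date'

# Hypothetical helper: the year in which month/day next falls on or after today.
def next_year_for(month, day, today)
  Date.new(today.year, month, day) < today ? today.year + 1 : today.year
end

# Rewrite the word "halloween" into an explicit date expression.
def expand_halloween(text, today)
  text.to_s.downcase.gsub(/\bhalloween\b/, "october 31, #{next_year_for(10, 31, today)}")
end

puts expand_halloween('Halloween', Date.new(2024, 6, 1))
puts expand_halloween('Halloween', Date.new(2024, 11, 5))
```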
.parse(text, specified_options = {}) ⇒ Object
Configuration options
- start - start date for future occurrences. Must be in valid date format.
- until - last date to run occurrences until. Must be in valid date format.
Use by calling Tickle.parse and passing natural language with or without options.
    def get_next_occurrence
      results = Tickle.parse('every Wednesday starting June 1st until Dec 15th')
      return results[:next] if results
    end
    # File 'lib/tickle/tickle.rb', line 36

    def parse(text, specified_options = {})
      # get options and set defaults if necessary. Ability to set now is mostly for debugging
      default_options = {:start => Time.now, :next_only => false, :until => nil, :now => Time.now}
      options = default_options.merge specified_options

      # ensure an expression was provided
      raise(InvalidArgumentException, 'date expression is required') unless text

      # ensure the specified options are valid
      specified_options.keys.each do |key|
        raise(InvalidArgumentException, "#{key} is not a valid option key.") unless default_options.keys.include?(key)
      end
      raise(InvalidArgumentException, ':start specified is not a valid datetime.') unless (is_date(specified_options[:start]) || Chronic.parse(specified_options[:start])) if specified_options[:start]

      # check to see if a valid datetime was passed
      return text if text.is_a?(Date) || text.is_a?(Time)

      # check to see if this event starts some other time and reset now
      event = scan_expression(text, options)

      Tickle.dwrite("start: #{@start}, until: #{@until}, now: #{options[:now].to_date}")

      # => ** this is mostly for testing. Bump by 1 day if today (or in the past for testing)
      raise(InvalidDateExpression, "the start date (#{@start.to_date}) cannot occur in the past for a future event") if @start && @start.to_date < options[:now].to_date
      raise(InvalidDateExpression, "the start date (#{@start.to_date}) cannot occur after the end date") if @until && @start.to_date > @until.to_date

      # no need to guess at expression if the start_date is in the future
      best_guess = nil
      if @start.to_date > options[:now].to_date
        best_guess = @start
      else
        # put the text into a normal format to ease scanning using Chronic
        event = pre_filter(event)

        # split into tokens
        @tokens = base_tokenize(event)

        # process each original word for implied word
        post_tokenize

        @tokens.each {|x| Tickle.dwrite("raw: #{x.inspect}")}

        # scan the tokens with each token scanner
        @tokens = Repeater.scan(@tokens)

        # remove all tokens without a type
        @tokens.reject! {|token| token.type.nil? }

        # combine number and ordinals into single number
        combine_multiple_numbers

        @tokens.each {|x| Tickle.dwrite("processed: #{x.inspect}")}

        # if we can't guess it maybe chronic can
        best_guess = (guess || chronic_parse(event))
      end

      raise(InvalidDateExpression, "the next occurrence takes place after the end date specified") if @until && best_guess.to_date > @until.to_date

      if !best_guess
        return nil
      elsif options[:next_only] != true
        return {:next => best_guess.to_time, :expression => event.strip, :starting => @start, :until => @until}
      else
        return best_guess
      end
    end
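The option check at the top of `parse` rejects any key that is not among the defaults. A minimal standalone sketch of that idea follows. `validate_option_keys` and `DEFAULT_KEYS` are hypothetical names, and the standard `ArgumentError` stands in for Tickle's `InvalidArgumentException`.

```ruby
# The four keys present in the default options hash of the listing.
DEFAULT_KEYS = [:start, :next_only, :until, :now].freeze

# Raise on the first unknown key; otherwise hand the options back unchanged.
# ArgumentError is used here in place of Tickle::InvalidArgumentException.
def validate_option_keys(specified_options)
  unknown = specified_options.keys - DEFAULT_KEYS
  raise ArgumentError, "#{unknown.first} is not a valid option key." unless unknown.empty?

  specified_options
end

p validate_option_keys({:start => Time.now, :next_only => true}).keys

begin
  validate_option_keys({:begin => Time.now})
rescue ArgumentError => e
  puts e.message
end
```

Validating before merging catches misspelled keys that would otherwise be silently ignored in favour of the defaults.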
.post_tokenize ⇒ Object
Normalizes each token.

    # File 'lib/tickle/tickle.rb', line 186

    def post_tokenize
      @tokens.each do |token|
        token.word = normalize(token.original)
      end
    end
.pre_filter(text) ⇒ Object
Normalizes a natural-language string by removing prefix words such as ‘every’, ‘each’ and ‘repeat’.

    # File 'lib/tickle/tickle.rb', line 167

    def pre_filter(text)
      return nil unless text

      text.gsub!(/every(\s)?/, '')
      text.gsub!(/each(\s)?/, '')
      text.gsub!(/repeat(s|ing)?(\s)?/, '')
      text.gsub!(/on the(\s)?/, '')
      text.gsub!(/([^\w\d\s])+/, '')
      text.downcase.strip
      text = normalize_us_holidays(text)
    end
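Two details of the listing are worth noting: the `gsub!` calls modify the caller's string in place, and the result of `text.downcase.strip` is discarded. The sketch below is a non-mutating variant that keeps the same patterns and does use the stripped result. `strip_prefixes` is a hypothetical name, and the holiday step is left out.

```ruby
# Remove prefix words and punctuation without modifying the argument.
# The holiday normalization step from the listing is omitted here.
def strip_prefixes(text)
  return nil unless text

  text.gsub(/every(\s)?/, '')
      .gsub(/each(\s)?/, '')
      .gsub(/repeat(s|ing)?(\s)?/, '')
      .gsub(/on the(\s)?/, '')
      .gsub(/([^\w\d\s])+/, '')
      .downcase
      .strip
end

puts strip_prefixes('every Thursday!')
puts strip_prefixes('repeating on the 3rd')
```

The patterns are case-sensitive, as in the listing, so a capitalized ‘Every’ would not be removed by this sketch either.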
.process_for_ending(text) ⇒ Object
Processes the remaining expression to see whether an ‘until’, ‘end’ or ‘ending’ clause is specified.

    # File 'lib/tickle/tickle.rb', line 157

    def process_for_ending(text)
      regex = /^(.*)(\s(?:\bend|until)(?:s|ing)?)(.*)/i
      if text =~ regex
        return text.match(regex)[1], text.match(regex)[3]
      else
        return text, nil
      end
    end
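To see what the regular expression captures, the sketch below applies the same pattern outside the module. `split_ending` and `ENDING_REGEX` are hypothetical names. The behaviour mirrors the listing, except that the match is performed once.

```ruby
ENDING_REGEX = /^(.*)(\s(?:\bend|until)(?:s|ing)?)(.*)/i

# Return [expression, ending], with ending set to nil when no clause is found.
def split_ending(text)
  match = text.match(ENDING_REGEX)
  match ? [match[1], match[3]] : [text, nil]
end

p split_ending('thursday until May 15th') # => ["thursday", " May 15th"]
p split_ending('thursday')                # => ["thursday", nil]
```

The ending keeps its leading space because the third capture group starts immediately after the keyword.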
.scan_expression(text, options) ⇒ Object
Scans the expression for a variety of natural formats, such as ‘every thursday starting tomorrow until May 15th’.

    # File 'lib/tickle/tickle.rb', line 105

    def scan_expression(text, options)
      starting = ending = nil

      start_every_regex = /^(start(?:s|ing)?)\s(.*)(\s(?:every|each|\bon\b|repeat)(?:s|ing)?)(.*)/i
      every_start_regex = /^(every|each|\bon\b|repeat(?:the)?)\s(.*)(\s(?:start)(?:s|ing)?)(.*)/i
      start_ending_regex = /^(start(?:s|ing)?)\s(.*)(\s(?:\bend|until)(?:s|ing)?)(.*)/i
      if text =~ start_every_regex
        starting = text.match(start_every_regex)[2].strip
        text = text.match(start_every_regex)[4].strip
        event, ending = process_for_ending(text)
      elsif text =~ every_start_regex
        event = text.match(every_start_regex)[2].strip
        text = text.match(every_start_regex)[4].strip
        starting, ending = process_for_ending(text)
      elsif text =~ start_ending_regex
        starting = text.match(start_ending_regex)[2].strip
        ending = text.match(start_ending_regex)[4].strip
        event = 'day'
      else
        event, ending = process_for_ending(text)
      end

      # they gave a phrase so if we can't interpret then we need to raise an error
      if starting
        Tickle.dwrite("starting: #{starting}")
        @start = chronic_parse(pre_filter(starting))
        if @start
          @start.to_time
        else
          raise(InvalidDateExpression,"the starting date expression \"#{starting}\" could not be interpretted")
        end
      else
        @start = options[:start].to_time #rescue nil
      end

      if ending
        @until = chronic_parse(pre_filter(ending))
        if @until
          @until.to_time
        else
          raise(InvalidDateExpression,"the ending date expression \"#{ending}\" could not be interpretted")
        end
      else
        @until = options[:until].to_time rescue nil
      end

      @next = nil

      return event
    end
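As an illustration of the ‘every ... starting ...’ branch, the sketch below applies the second pattern from the listing to the example phrase. `split_every_start` and `EVERY_START` are hypothetical names. Only the splitting is shown, not the Chronic parsing that follows it.

```ruby
EVERY_START = /^(every|each|\bon\b|repeat(?:the)?)\s(.*)(\s(?:start)(?:s|ing)?)(.*)/i

# Return [event, remainder] for an "every X starting Y" phrase, or nil.
def split_every_start(text)
  match = text.match(EVERY_START)
  return nil unless match

  [match[2].strip, match[4].strip]
end

event, remainder = split_every_start('every thursday starting tomorrow until May 15th')
puts event     # thursday
puts remainder # tomorrow until May 15th
```

In the listing, the remainder is then passed to `process_for_ending`, which separates the start phrase from any ‘until’ clause.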
.token_types ⇒ Object
Returns an array of types for all tokens.

    # File 'lib/tickle/tickle.rb', line 256

    def token_types
      @tokens.map(&:type)
    end