Class: Milemarker
- Inherits: Object
  - Object
    - Milemarker
- Defined in: lib/milemarker.rb, lib/milemarker/version.rb, lib/milemarker/structured.rb
Overview
The Milemarker class keeps track of progress over time for long-running, iterating processes.
Direct Known Subclasses
Defined Under Namespace
Classes: Structured
Constant Summary
- VERSION = "1.0.0"
Instance Attribute Summary
- #batch_end_time ⇒ Time (readonly): Time the last batch ended processing.
- #batch_number ⇒ Integer (readonly): Which batch number (total increment / batch_size).
- #batch_size ⇒ Integer: Batch size for computing `on_batch` calls.
- #batch_start_time ⇒ Time (readonly): Time the last batch started processing.
- #count ⇒ Integer (readonly): Total records (really, increments) for the full run.
- #last_batch_seconds ⇒ Integer (readonly): Number of seconds to process the last batch.
- #last_batch_size ⇒ Integer (readonly): Number of records (really, number of increments) in the last batch.
- #logger ⇒ Logger, #info: Logging object for automatic logging methods.
- #name ⇒ String: Optional “name” of this milemarker, for logging purposes.
- #prev_count ⇒ Integer (readonly): Total count at the time of the last on_batch call.
- #start_time ⇒ Time (readonly): Time the full process started.
Instance Method Summary
- #_increment_and_on_batch(&blk) ⇒ Object (also: #increment_and_on_batch): Single call to increment and run (if needed) the on_batch block.
- #batch_line ⇒ String: A line describing the batch suitable for logging, of the form load records.ndj 8_000_000.
- #batch_rate ⇒ Float: Rate of the last batch (in recs/second).
- #batch_rate_str(decimals = 0) ⇒ String: Rate-per-second in form XXX.YY.
- #batch_seconds_so_far ⇒ Float: Total seconds since this batch started.
- #create_logger!(*args, **kwargs) ⇒ Milemarker: Create a logger for use in logging milemarker information.
- #final_batch_size ⇒ Integer (also: #batch_count_so_far): Record how many increments there have been since the last on_batch call.
- #final_line ⇒ String: A line describing the entire run, suitable for logging, of the form load records.ndj FINISHED.
- #incr(increase = 1) ⇒ Milemarker (also: #increment): Increment the counter (the number of records processed so far).
- #increment_and_log_batch_line(level: :info) ⇒ Object: Convenience method, exactly the same as the common idiom `mm.incr; mm.on_batch {|mm| log.info mm.batch_line}`.
- #initialize(batch_size: 1000, name: nil, logger: nil) ⇒ Milemarker (constructor): Create a new milemarker tracker, with an optional name and logger.
- #log(msg, level: :info) ⇒ Object: Log a line using the internal logger.
- #log_batch_line(level: :info) ⇒ Object: Log the batch line, as described in #batch_line.
- #log_final_line(level: :info) ⇒ Object: Log the final line, as described in #final_line.
- #on_batch {|Milemarker| ... } ⇒ Object: Run the given block if we’ve exceeded the batch size for the current batch.
- #reset_for_next_batch! ⇒ Object: Reset the internal counters/timers at the end of a batch.
- #set_milemarker! ⇒ Object: Set/reset all the internal state.
- #threadsafe_increment_and_on_batch(&blk) ⇒ Object: Threadsafe version of #increment_and_on_batch, doing the whole thing as a single atomic action.
- #threadsafify! ⇒ Milemarker: Turn `increment_and_on_batch` (and thus `increment_and_log_batch_line`) into a threadsafe version.
- #total_rate ⇒ Float: Total rate so far (in recs/second).
- #total_rate_str(decimals = 0) ⇒ String: Rate-per-second in form XXX.YY.
- #total_seconds_so_far ⇒ Float: Total seconds since the beginning of this milemarker.
Constructor Details
#initialize(batch_size: 1000, name: nil, logger: nil) ⇒ Milemarker
Create a new milemarker tracker, with an optional name and logger
    # File 'lib/milemarker.rb', line 52
    def initialize(batch_size: 1000, name: nil, logger: nil)
      @batch_size = batch_size
      @name = name
      @logger = logger

      @batch_number = 0
      @last_batch_size = 0
      @last_batch_seconds = 0

      @start_time = Time.now
      @batch_start_time = @start_time
      @batch_end_time = @start_time

      @count = 0
      @prev_count = 0
    end
Instance Attribute Details
#batch_end_time ⇒ Time (readonly)
Returns Time the last batch ended processing.
    # File 'lib/milemarker.rb', line 39
    def batch_end_time
      @batch_end_time
    end
#batch_number ⇒ Integer (readonly)
Returns which batch number (total increment / batch_size).
    # File 'lib/milemarker.rb', line 24
    def batch_number
      @batch_number
    end
#batch_size ⇒ Integer
Returns batch size for computing `on_batch` calls.
    # File 'lib/milemarker.rb', line 18
    def batch_size
      @batch_size
    end
#batch_start_time ⇒ Time (readonly)
Returns Time the last batch started processing.
    # File 'lib/milemarker.rb', line 36
    def batch_start_time
      @batch_start_time
    end
#count ⇒ Integer (readonly)
Returns Total records (really, increments) for the full run.
    # File 'lib/milemarker.rb', line 42
    def count
      @count
    end
#last_batch_seconds ⇒ Integer (readonly)
Returns number of seconds to process the last batch.
    # File 'lib/milemarker.rb', line 27
    def last_batch_seconds
      @last_batch_seconds
    end
#last_batch_size ⇒ Integer (readonly)
Returns number of records (really, number of increments) in the last batch.
    # File 'lib/milemarker.rb', line 30
    def last_batch_size
      @last_batch_size
    end
#logger ⇒ Logger, #info
Returns logging object for automatic logging methods.
    # File 'lib/milemarker.rb', line 21
    def logger
      @logger
    end
#name ⇒ String
Returns optional “name” of this milemarker, for logging purposes.
    # File 'lib/milemarker.rb', line 15
    def name
      @name
    end
#prev_count ⇒ Integer (readonly)
Returns Total count at the time of the last on_batch call. Used to figure out how many records were in the final batch.
    # File 'lib/milemarker.rb', line 46
    def prev_count
      @prev_count
    end
#start_time ⇒ Time (readonly)
Returns Time the full process started.
    # File 'lib/milemarker.rb', line 33
    def start_time
      @start_time
    end
Instance Method Details
#_increment_and_on_batch(&blk) ⇒ Object Also known as: increment_and_on_batch
Single call to increment and run (if needed) the on_batch block
    # File 'lib/milemarker.rb', line 107
    def _increment_and_on_batch(&blk)
      incr.on_batch(&blk)
    end
#batch_line ⇒ String
A line describing the batch suitable for logging, of the form
load records.ndj 8_000_000. This batch 2_000_000 in 26.2s (76_469 r/s). Overall 72_705 r/s.
    # File 'lib/milemarker.rb', line 142
    def batch_line
      # rubocop:disable Layout/LineLength
      "#{name} #{ppnum(count, 10)}. This batch #{ppnum(last_batch_size, 5)} in #{ppnum(last_batch_seconds, 4, 1)}s (#{batch_rate_str} r/s). Overall #{total_rate_str} r/s."
      # rubocop:enable Layout/LineLength
    end
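The `ppnum` helper called in #batch_line is private and its source is not shown on this page. The following is only a sketch of what such a helper might look like, assuming (from the call sites) that the second argument is a minimum width and the third a number of decimal places. The name `ppnum_sketch` is hypothetical, and the real helper may behave differently.

```ruby
# Hypothetical stand-in for the private ppnum helper. The real implementation
# is not shown in this documentation and may differ. Non-negative input only.
def ppnum_sketch(num, width = 0, decimals = 0)
  whole, frac = format("%.#{decimals}f", num).split(".")
  # Group the integer digits in threes from the right, joined by underscores.
  grouped = whole.reverse.scan(/\d{1,3}/).join("_").reverse
  text = frac ? "#{grouped}.#{frac}" : grouped
  text.rjust(width) # pad on the left up to the assumed minimum width
end

puts ppnum_sketch(8_000_000, 10)
puts ppnum_sketch(26.154, 4, 1)
```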
#batch_rate ⇒ Float
Returns rate of the last batch (in recs/second).
    # File 'lib/milemarker.rb', line 169
    def batch_rate
      return 0.0 if count.zero?

      last_batch_size.to_f / last_batch_seconds
    end
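The division in #batch_rate can be illustrated with plain numbers. The sketch below uses illustrative values (not measurements) and also shows one consequence of the code as listed here: the guard checks only `count`, and the constructor sets both `last_batch_size` and `last_batch_seconds` to 0, so calling the method after some increments but before the first batch completes appears to evaluate `0.0 / 0`, which Ruby returns as NaN rather than raising.

```ruby
# Illustrative values only.
last_batch_size = 2_000_000
last_batch_seconds = 26.2

# Same expression as in batch_rate: force float division.
rate = last_batch_size.to_f / last_batch_seconds
puts rate.round(1)

# Initial state per the constructor: size 0 and seconds 0.
# Float division by zero does not raise in Ruby; 0.0 / 0 is NaN.
idle_rate = 0.to_f / 0
puts idle_rate.nan?
```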
#batch_rate_str(decimals = 0) ⇒ String
Returns Rate-per-second in form XXX.YY.
    # File 'lib/milemarker.rb', line 178
    def batch_rate_str(decimals = 0)
      ppnum(batch_rate, 0, decimals)
    end
#batch_seconds_so_far ⇒ Float
Total seconds since this batch started
    # File 'lib/milemarker.rb', line 204
    def batch_seconds_so_far
      Time.now - batch_start_time
    end
#create_logger!(*args, **kwargs) ⇒ Milemarker
Create a logger for use in logging milemarker information
    # File 'lib/milemarker.rb', line 92
    def create_logger!(*args, **kwargs)
      @logger = Logger.new(*args, **kwargs)
      self
    end
#final_batch_size ⇒ Integer Also known as: batch_count_so_far
Record how many increments there have been since the last on_batch call. Most useful to count how many items are in the final (usually incomplete) batch. Note that since Milemarker can’t tell when you’re done processing, you can call this at any time and get the number of items processed since the last on_batch call.
    # File 'lib/milemarker.rb', line 153
    def final_batch_size
      count - prev_count
    end
#final_line ⇒ String
A line describing the entire run, suitable for logging, of the form
load records.ndj FINISHED. 27_138_118 total records in 00h 12m 39s. Overall 35_718 r/s.
    # File 'lib/milemarker.rb', line 162
    def final_line
      # rubocop:disable Layout/LineLength
      "#{name} FINISHED. #{ppnum(count, 10)} total records in #{seconds_to_time_string(total_seconds_so_far)}. Overall #{total_rate_str} r/s."
      # rubocop:enable Layout/LineLength
    end
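The `seconds_to_time_string` helper used in #final_line is private and not shown here. As a rough sketch, assuming it truncates to whole seconds and produces the `00h 12m 39s` shape seen in the sample line, it could be written as follows. The name `seconds_to_time_string_sketch` is hypothetical and the real helper may round or pad differently.

```ruby
# Hypothetical stand-in for the private seconds_to_time_string helper.
def seconds_to_time_string_sketch(seconds)
  # Drop fractional seconds, then split into hours / minutes / seconds.
  hours, remainder = seconds.to_i.divmod(3600)
  minutes, secs = remainder.divmod(60)
  format("%02dh %02dm %02ds", hours, minutes, secs)
end

puts seconds_to_time_string_sketch(759.79)
```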
#incr(increase = 1) ⇒ Milemarker Also known as: increment
Increment the counter (the number of records processed so far).
    # File 'lib/milemarker.rb', line 82
    def incr(increase = 1)
      @count += increase
      self
    end
#increment_and_log_batch_line(level: :info) ⇒ Object
Convenience method, exactly the same as the common idiom
`mm.incr; mm.on_batch {|mm| log.info mm.batch_line}`
    # File 'lib/milemarker.rb', line 123
    def increment_and_log_batch_line(level: :info)
      increment_and_on_batch { log_batch_line(level: level) }
    end
#log(msg, level: :info) ⇒ Object
Log a line using the internal logger. Do nothing if no logger is configured.
    # File 'lib/milemarker.rb', line 229
    def log(msg, level: :info)
      logger&.send(level, msg)
    end
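The "do nothing if no logger is configured" behaviour comes from the safe-navigation operator (`&.`). The sketch below reproduces that one-liner outside the class, using only the standard library `Logger` writing to a `StringIO`. The free-standing `log_to` method is a hypothetical name used for illustration and is not part of the gem.

```ruby
require "logger"
require "stringio"

# Hypothetical free-standing version of the same one-liner.
def log_to(logger, msg, level: :info)
  # &. skips the call (and returns nil) when logger is nil.
  logger&.send(level, msg)
end

io = StringIO.new
logger = Logger.new(io)

log_to(logger, "batch done", level: :warn) # dispatches to logger.warn
result_without_logger = log_to(nil, "ignored") # no logger: nothing happens

puts io.string
```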
#log_batch_line(level: :info) ⇒ Object
Log the batch line, as described in #batch_line
    # File 'lib/milemarker.rb', line 129
    def log_batch_line(level: :info)
      log(batch_line, level: level)
    end
#log_final_line(level: :info) ⇒ Object
Log the final line, as described in #final_line
    # File 'lib/milemarker.rb', line 135
    def log_final_line(level: :info)
      log(final_line, level: level)
    end
#on_batch {|Milemarker| ... } ⇒ Object
Run the given block if we’ve exceeded the batch size for the current batch
    # File 'lib/milemarker.rb', line 99
    def on_batch
      if batch_size_exceeded?
        set_milemarker!
        yield self
      end
    end
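To show how #incr, #on_batch and #final_batch_size interact in a loop, here is a sketch built on a trimmed stand-in class so it runs without the gem installed. `MiniMarker` is hypothetical. Its public method bodies follow the listings on this page, but the private `batch_divisor` and `batch_size_exceeded?` are not shown in this documentation, so their definitions below are assumptions.

```ruby
# Hypothetical stand-in, not the gem's class.
class MiniMarker
  attr_reader :count, :prev_count, :batch_number, :batch_size, :last_batch_size

  def initialize(batch_size: 1000)
    @batch_size = batch_size
    @batch_number = 0
    @last_batch_size = 0
    @count = 0
    @prev_count = 0
  end

  def incr(increase = 1)
    @count += increase
    self # returning self is what allows mm.incr.on_batch { ... }
  end

  def on_batch
    return unless batch_size_exceeded?

    set_milemarker!
    yield self
  end

  def final_batch_size
    count - prev_count
  end

  private

  # ASSUMPTION: number of whole batches contained in the current count.
  def batch_divisor
    count.div(batch_size)
  end

  # ASSUMPTION: a batch boundary has been crossed since the last reset.
  def batch_size_exceeded?
    batch_divisor > batch_number
  end

  # Timing bookkeeping from the original is omitted for brevity.
  def set_milemarker!
    @last_batch_size = @count - @prev_count
    @prev_count = @count
    @batch_number = batch_divisor
  end
end

mm = MiniMarker.new(batch_size: 1000)
batch_sizes = []
2_500.times do
  mm.incr.on_batch { |m| batch_sizes << m.last_batch_size }
end

puts batch_sizes.inspect
puts mm.final_batch_size
```

Under these assumptions the block fires twice, and the leftover increments are what #final_batch_size reports after the loop.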
#reset_for_next_batch! ⇒ Object
Reset the internal counters/timers at the end of a batch. Taken care of by #on_batch; should probably not be called manually.
    # File 'lib/milemarker.rb', line 220
    def reset_for_next_batch!
      @batch_start_time = batch_end_time
      @prev_count = count
      @batch_number = batch_divisor
    end
#set_milemarker! ⇒ Object
Set/reset all the internal state. Called by #on_batch when necessary; should probably not be called manually
    # File 'lib/milemarker.rb', line 210
    def set_milemarker!
      @batch_end_time = Time.now
      @last_batch_size = @count - @prev_count
      @last_batch_seconds = @batch_end_time - @batch_start_time

      reset_for_next_batch!
    end
#threadsafe_increment_and_on_batch(&blk) ⇒ Object
Threadsafe version of #increment_and_on_batch, doing the whole thing as a single atomic action
    # File 'lib/milemarker.rb', line 114
    def threadsafe_increment_and_on_batch(&blk)
      @mutex.synchronize do
        _increment_and_on_batch(&blk)
      end
    end
#threadsafify! ⇒ Milemarker
Turn `increment_and_on_batch` (and thus `increment_and_log_batch_line`) into a threadsafe version
    # File 'lib/milemarker.rb', line 72
    def threadsafify!
      @mutex = Mutex.new
      define_singleton_method(:increment_and_on_batch) do |&blk|
        threadsafe_increment_and_on_batch(&blk)
      end
      self
    end
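The pattern in #threadsafify! (create a `Mutex`, then override one method on this single instance with `define_singleton_method` so every call goes through the lock) can be sketched on a small hypothetical class. `TinyCounter` is not part of the gem, and its batch test (a simple modulo) is a simplification chosen for this example.

```ruby
# Hypothetical counter illustrating the pattern; not the gem's class.
class TinyCounter
  attr_reader :count, :batches

  def initialize(batch_size)
    @batch_size = batch_size
    @count = 0
    @batches = 0
  end

  # Unsynchronized version: increment, then yield on each batch boundary.
  def _increment_and_on_batch
    @count += 1
    return unless (@count % @batch_size).zero?

    @batches += 1
    yield self
  end
  alias increment_and_on_batch _increment_and_on_batch

  # Replace the public method on this instance with a locked wrapper.
  def threadsafify!
    @mutex = Mutex.new
    define_singleton_method(:increment_and_on_batch) do |&blk|
      @mutex.synchronize { _increment_and_on_batch(&blk) }
    end
    self
  end
end

counter = TinyCounter.new(1_000).threadsafify!
seen = []
threads = 4.times.map do
  Thread.new do
    1_000.times { counter.increment_and_on_batch { |c| seen << c.count } }
  end
end
threads.each(&:join)

puts counter.count
puts seen.inspect
```

Because the block runs inside `synchronize`, the increment, the batch check and the callback happen as one step from the point of view of other threads. The trade-off is that a slow callback holds the lock and stalls every other caller.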
#total_rate ⇒ Float
Returns total rate so far (in rec/second).
    # File 'lib/milemarker.rb', line 183
    def total_rate
      return 0.0 if @count.zero?

      count / total_seconds_so_far
    end
#total_rate_str(decimals = 0) ⇒ String
Returns Rate-per-second in form XXX.YY.
    # File 'lib/milemarker.rb', line 192
    def total_rate_str(decimals = 0)
      ppnum(total_rate, 0, decimals)
    end
#total_seconds_so_far ⇒ Float
Total seconds since the beginning of this milemarker
    # File 'lib/milemarker.rb', line 198
    def total_seconds_so_far
      Time.now - start_time
    end