Class: PdMetrics

Inherits: Object

Defined in:
  lib/pd_metrics.rb,
  lib/pd_metrics/version.rb

Defined Under Namespace
  Modules: NumericExtensions
  Classes: Counter, Gauge, Histogram, NumericMetric

Constant Summary
  VERSION = "1.0.0"
Class Method Summary
- .gauge(namespace, key, value, tags = {}, additional_data = {}) ⇒ Object
  Captures the current value for a metric.
- .histogram(namespace, key, value, tags = {}, additional_data = {}) ⇒ Object
  Captures statistical metrics for a set of values within a given timeframe.
- .incr(namespace, key, increment_by = 1, tags = {}, additional_data = {}) ⇒ Object
  Captures an increase/decrease in a counter.
- .send_event(namespace, metrics_and_tags = {}, additional_data = {}) ⇒ Object
  Logs an event to the metric backend.
- .time(namespace, key, tags = {}, additional_data = {}) ⇒ Object
  Captures timing metrics for a block of Ruby code.
Class Method Details
.gauge(namespace, key, value, tags = {}, additional_data = {}) ⇒ Object
Captures the current value for a metric.
Unlike a counter, this value cannot be combined with itself in a meaningful way, so only the last value reported within each sampling window (normally every 10 seconds) is recorded in DataDog.
You can use this method to capture metrics that change over time, such as the amount of memory used. Usually, this sampling occurs at a regular frequency via a timer.
PdMetrics.gauge('ruby', 'live_objects', ObjectSpace.live_objects)
The following line will be printed in SumoLogic for each call to gauge.
ruby #live_objects=30873|
Additionally, the following metric will be available in DataDog.
ruby.live_objects
# File 'lib/pd_metrics.rb', line 127
def self.gauge(namespace, key, value, tags = {}, additional_data = {})
  gauge_data = tags || {}
  gauge_data[key] = value.gauge
  send_event(namespace, gauge_data, additional_data)
end
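The `value.gauge` call above relies on the NumericExtensions module listed under Defined Under Namespace. The gem's actual implementation is not shown on this page; the following is only a minimal sketch of how such extensions might tag a number with its aggregation type (the Struct-based NumericMetric here is an assumption, not the gem's real class):

```ruby
# Illustrative sketch only; PdMetrics' real NumericMetric/NumericExtensions
# implementation may differ. A NumericMetric pairs a raw value with the
# aggregation type DataDog should use for it.
NumericMetric = Struct.new(:value, :type) do
  def to_s
    value.to_s
  end
end

module NumericExtensions
  def counter
    NumericMetric.new(self, :counter)
  end

  def gauge
    NumericMetric.new(self, :gauge)
  end

  def histogram
    NumericMetric.new(self, :histogram)
  end
end

Numeric.include(NumericExtensions)
```

With something like this in place, `30873.gauge` evaluates to a NumericMetric whose type tells send_event how the backend should aggregate the value.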
.histogram(namespace, key, value, tags = {}, additional_data = {}) ⇒ Object
Captures statistical metrics for a set of values within a given timeframe. This is very similar to the time method, but generalized for use with arbitrary values.
An example usage would be calculating the size of JSON payloads received by an API. You could use a counter, but that wouldn’t tease out what the average and median payload sizes are.
PdMetrics.histogram('api', 'payload_size', payload.size, account: 'Netflix')
The following line will be printed in SumoLogic for every payload.
api #account=Netflix|#payload_size=1234
api #account=Netflix|#payload_size=0
Additionally, DataDog will have the following metrics available. Note, these metrics are captured every 10 seconds, so they likely represent multiple requests within that time window.
api.payload_size.count
api.payload_size.avg
api.payload_size.median
api.payload_size.max
api.payload_size.95percentile
# File 'lib/pd_metrics.rb', line 158
def self.histogram(namespace, key, value, tags = {}, additional_data = {})
  histogram_data = tags || {}
  histogram_data[key] = value.histogram
  send_event(namespace, histogram_data, additional_data)
end
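To make the derived metrics above concrete, here is a hypothetical helper (plain Ruby, not part of PdMetrics or DataDog) that computes the same statistics from the histogram samples collected within one 10-second flush window:

```ruby
# Hypothetical helper, not PdMetrics code: derives the statistics a backend
# like DataDog reports for histogram samples from one flush window.
def histogram_stats(samples)
  sorted = samples.sort
  n = sorted.size
  {
    count: n,
    avg: sorted.sum.to_f / n,
    median: sorted[n / 2],
    max: sorted.last,
    p95: sorted[[(n * 0.95).floor, n - 1].min]
  }
end
```

For the two payload sizes from the SumoLogic lines above, `histogram_stats([0, 1234])` yields a count of 2 and an average of 617.0, which a plain counter could not tease out.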
.incr(namespace, key, increment_by = 1, tags = {}, additional_data = {}) ⇒ Object
Captures an increase/decrease in a counter.
You can use this to capture metrics that should be added together when viewed on a graph.
PdMetrics.incr('logins', 'success')
PdMetrics.incr('emails', 'bytes_received', email_bytes.size, account: 'Netflix')
Those calls will produce the following lines in SumoLogic.
logins #success=1
emails #account=Netflix|#bytes_received=1234|
Additionally, the following metrics will be defined in DataDog.
logins.success
emails.bytes_received
# File 'lib/pd_metrics.rb', line 101
def self.incr(namespace, key, increment_by = 1, tags = {}, additional_data = {})
  incr_data = tags || {}
  incr_data[key] = increment_by.counter
  send_event(namespace, incr_data, additional_data)
end
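The difference between a counter and a gauge shows up at aggregation time. An illustration (plain Ruby, not PdMetrics internals) of how samples within one flush window combine:

```ruby
# Within one flush window, counter samples are added together, while a gauge
# keeps only the last reported value. (Illustration, not PdMetrics code.)
counter_samples = [1, 1, 5]          # e.g. three incr calls
gauge_samples   = [30_100, 30_873]   # e.g. two gauge calls

counter_value = counter_samples.sum  # counters sum across the window
gauge_value   = gauge_samples.last   # gauges keep the final reading
```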
.send_event(namespace, metrics_and_tags = {}, additional_data = {}) ⇒ Object
Logs an event to the metric backend. In general, you can log any key-value pairs.
PdMetrics.send_event('api', account: 'Netflix', wait_delta: 0.01, run_delta: 0.1)
This will result in the following line being logged in SumoLogic. No data will be sent to DataDog.
api #account=Netflix|#run_delta=0.1|#wait_delta=0.01|
In order to support aggregated graphs in DataDog, you’ll need to mark the type of any numerical metrics you want aggregated.
PdMetrics.send_event('api', wait_delta: (0.01).histogram, run_delta: (0.1).histogram)
This extra bit of detail is needed to let DataDog know how to aggregate multiple events in a single timeslice.
- counter - adds together multiple data points. Use this for things like visits, errors, etc.
- gauge - takes the last value. Use this for things like free memory, connections to the database, etc.
- histogram - derives count, avg, median, max, min, and 95th percentile from a single value. Use this for things like latency, bytes written, etc.
Note that when DataDog metrics are supplied, any non-metric data is passed to DataDog as tags. Depending on how many tags you have, this can be counterproductive in DataDog. To have additional data logged only to SumoLogic, pass it in the additional_data parameter.
PdMetrics.send_event('api', {wait_delta: (0.01).histogram, run_delta: (0.1).histogram}, account: 'Netflix')
# File 'lib/pd_metrics.rb', line 39
def self.send_event(namespace, metrics_and_tags = {}, additional_data = {})
  logger.debug { "send_event #{namespace} #{metrics_and_tags.inspect} #{additional_data.inspect}" }
  metrics_and_tags ||= {}
  additional_data ||= {}
  send_datadog_format(namespace, metrics_and_tags)
  send_sumologic_format(namespace, metrics_and_tags, additional_data)
end
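The SumoLogic lines shown throughout this page follow a simple shape: keys sorted alphabetically, each pair rendered as #key=value, joined with |. The gem's own send_sumologic_format is private and may differ; this is only an approximation of the format for illustration:

```ruby
# Approximation of the SumoLogic line format used in the examples on this
# page; the gem's real formatter (send_sumologic_format) is not shown here.
def sumologic_line(namespace, pairs)
  body = pairs.sort_by { |k, _| k.to_s }
              .map { |k, v| "##{k}=#{v}" }
              .join('|')
  "#{namespace} #{body}|"
end
```

For example, `sumologic_line('api', account: 'Netflix', wait_delta: 0.01, run_delta: 0.1)` reproduces the send_event log line shown above.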
.time(namespace, key, tags = {}, additional_data = {}) ⇒ Object
Captures timing metrics for a block of Ruby code.
PdMetrics.time('api', 'receive_email', account: 'Netflix') do
# process the email
end
Assuming the request took 2 seconds to process, the following log message will be written in SumoLogic.
api #account=Netflix|#receive_email=2.0|#failed=false|
Additionally, the following histogram metrics will be captured in DataDog.
api.receive_email.count
api.receive_email.avg
api.receive_email.median
api.receive_email.max
api.receive_email.95percentile
In addition to capturing the latency of the request, the success or failure of the block of code is captured as well. The block is considered failed if it raises an exception.
# File 'lib/pd_metrics.rb', line 70
def self.time(namespace, key, tags = {}, additional_data = {})
  failed = false
  start = Time.now
  yield
rescue
  failed = true
  raise
ensure
  timing_data = tags || {}
  timing_data[key] = (Time.now - start).histogram
  timing_data['failed'] = failed
  send_event(namespace, timing_data)
end
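The rescue/ensure pattern in time above can be reproduced standalone. This is a hypothetical helper, not part of the gem, showing that the block's exception still propagates to the caller while the ensure clause records the elapsed time and failure flag first:

```ruby
# Sketch of the rescue/ensure pattern used by PdMetrics.time: the exception
# re-raises to the caller, but the ensure clause records the outcome first.
def time_block(results, key)
  failed = false
  start = Time.now
  yield
rescue
  failed = true
  raise
ensure
  results[key] = { elapsed: Time.now - start, failed: failed }
end
```

Because the recording happens in ensure, a timing entry is written whether the block succeeds or raises, which is how time can always emit the failed tag.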