Visibility into your application is one of the most critical parts of software development. At best, visibility is typically an afterthought and this is a problem. So what do you do?

Instrument all the things

InstrumentAllTheThings

At Terminus we use DataDog for our application visibility. InstrumentAllTheThings provides simple ways to quickly and unobtrusively add detailed instrumentation to Datadog metrics and Datadog APM.

Installation

Add this line to your application's Gemfile:

gem 'instrument_all_the_things'

And then execute:

$ bundle

Or install it yourself as:

$ gem install instrument_all_the_things

Usage

Note: For convenience the InstrumentAllTheThings constant is aliased to IATT unless the constant is already defined.

Method Instrumentation

Method instrumentation both in APM as well as in simple counts & timings is the bread and butter of visibility, and it is trivial to implement with IATT.

Each measured metric may be individually disabled, and some may be provided additional configuration. All measurments default to on, unless otherwise specified. You may disable the specified measurement by providing a falsy value to the configuration key when calling instrument

Example

class Foo
  include InstrumentAllTheThings

  instrument config_key: {configuration_option: 123}
  def foo
  end
end

Error Logging

Configuration Key log_errors

When set to a non falsy value all errors raised in a span will be logged to the configured IATT logger and re-emitted. The first traces span where they are seen logs them only, they will not be re-logged by IATT at any future level

By default call stacks are logged without all gem paths (as defined by the Bundler.bundle_path)

Option Description Default
rescue_class The parent error which should be caught and logged StandardError
exclude_bundle_path If truthy, remove all entries from the error backtrace which are in the bundle path true

Tracing

Configuration Key trace

When set to a non falsy value, a span for this method will be created. The defaults are listed below. This hash will also be passed to the DataDog tracer, and their options should also be understood.

Option Description Default
service This is the value which shows up on the first page of the APM screen this should be set at the entry point of your application or process nil
resource How this method will show up when viewing the service in APM. For instance methods ClassName.method_name
For class methods ClassName#method_name
span_name You probably don't want to change this method.execution
span_type See DD Docs. nil
tags Set of tags to be added to the span, expected to be a hash {}

Dynamic Tags

If a trace's tags need to reference either an instance variable or a parameter to the method being traced, you must pass in a proc for the tag value.

For instance variables, this will look like this:

instrument trace: { tags: [-> { "some_stat:#{some_instance_var}" }]}
def my_instance_method
...
end

For parameter references, your proc will need to to have a parameter of args or kwargs (or both), depending on if the parameter you need to reference is a normal parameter or a keyword parameter. Examples:

instrument trace: { tags: [->(args) { "log_args:#{args[0]}" }]}
def my_instance_method(my_var)
...
end

instrument trace: { tags: [->(kwargs) { "log_args:#{kwargs[:my_arg]}" }]}
def my_instance_method(my_arg:)
...
end

Testing Support

You can setup your test environment by running the following setup:

require 'instrument_all_the_things/testing/stat_tracker'
require 'instrument_all_the_things/testing/trace_tracker'
require 'instrument_all_the_things/testing/rspec_matchers'

Datadog.configure do |c|
  c.tracing.test_mode.enabled = true 
  c.tracing.test_mode.writer_options = {
    transport: InstrumentAllTheThings::Testing::TraceTracker.tracker
  }
end

IATT.stat_reporter = IATT::Testing::StatTracker.new

RSpec.configure do |config|
  config.include InstrumentAllTheThings::Testing::RSpecMatchers
  config.before(:each) do
    IATT::Testing::TraceTracker.tracker.reset!
    IATT.stat_reporter.reset!
  end
end

This injects middleware and in the StatsD interface as well as in the Tracer output. By doing this you can start using some awesome rspec helpers like so:

  let(:klass) do
    Class.new do
      include InstrumentAllTheThings::Helpers

      instrument
      def foo
      end

      def self.inspect
        'KlassName'
      end
    end
  end

  it 'traces' do
    expect {
      klass.new.foo
    }.to change{
      emitted_spans(
        filtered_by: {resource: 'KlassName.foo'}
      )
    }.by(1)
  end
end

Stat Transmission

InstrumentAllTheThings provides no real functionality on top of the build in Datadog statsd utility. The primary goal of the IATT library is to allow for easy testing. The following method are provided on the InstrumentAllTheThings module.

Increment

Examples:

expect{
  InstrumentAllTheThings.increment('my.counter')
}.to change{ counter_value('my.counter') }.from(0).to(1)

expect{
  InstrumentAllTheThings.increment('my.counter', by: 5)
}.to change { counter_value('my.counter') }.from(0).to(5)

# You can also filter by tags
expect {
  InstrumentAllTheThings.increment('my.counter', by: 3, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.increment('my.counter', by: 2, tags: ['a:b', 'foo:baz'])
}.to change { counter_value('my.counter') }.from(0).to(5)
  .and change { counter_value('my.counter', with_tags: ['a:b']) }.from(0).to(5)
  .and change { counter_value('my.counter', with_tags: ['foo:bar']) }.from(0).to(3)
  .and change { counter_value('my.counter', with_tags: ['foo:baz']) }.from(0).to(2)

Decrement

See Increment for all examples

expect{
  InstrumentAllTheThings.increment('my.counter')
}.to change{ counter_value('my.counter') }.from(0).to(-1)

Count

Count underlies both increment and decrement, and works in a similar way.

expect {
  InstrumentAllTheThings.count('my.counter', 3, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.count('my.counter', 2, tags: ['a:b', 'foo:baz'])
}.to change { counter_value('my.counter') }.from(0).to(5)
  .and change { counter_value('my.counter', with_tags: ['a:b']) }.from(0).to(5)
  .and change { counter_value('my.counter', with_tags: ['foo:bar']) }.from(0).to(3)
  .and change { counter_value('my.counter', with_tags: ['foo:baz']) }.from(0).to(2)

Gauge

expect {
  InstrumentAllTheThings.gauge('my.gauge', 1)
  InstrumentAllTheThings.gauge('my.gauge', 2)
}.to change { gauge_value('my.gauge') }.from(nil).to(2)

expect {
  InstrumentAllTheThings.gauge('my.gauge', 3, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.gauge('my.gauge', 1, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.gauge('my.gauge', 2, tags: ['a:b', 'foo:baz'])
  InstrumentAllTheThings.gauge('my.gauge', 7, tags: ['a:b'])
}.to change { gauge_value('my.gauge') }.to(7)
  .and change { gauge_value('my.gauge', with_tags: ['a:b']) }.to(7)
  .and change { gauge_value('my.gauge', with_tags: ['foo:bar']) }.to(1)
  .and change { gauge_value('my.gauge', with_tags: ['foo:baz']) }.to(2)

Set

expect {
  InstrumentAllTheThings.set('my.set', 1)
  InstrumentAllTheThings.set('my.set', 2)
}.to change { set_value('my.set') }.from(0).to(2)

expect {
  InstrumentAllTheThings.set('my.set', 3, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.set('my.set', 3, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.set('my.set', 5, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.set('my.set', 6, tags: ['a:b', 'foo:baz'])
  InstrumentAllTheThings.set('my.set', 9, tags: ['a:b'])
}.to change { set_value('my.set') }.to(4)
  .and change { set_value('my.set', with_tags: ['a:b']) }.to(4)
  .and change { set_value('my.set', with_tags: ['foo:bar']) }.to(2)
  .and change { set_value('my.set', with_tags: ['foo:baz']) }.to(1)

Histogram

Histogram is a pesudo metric in Datadog based on the StatsD timing metric. InstrumentAllTheThings provides a way to test the values emitted, not the statistical calculations derived.

expect {
  InstrumentAllTheThings.histogram('my.histogram', 1)
  InstrumentAllTheThings.histogram('my.histogram', 2)
}.to change { histogram_values('my.histogram') }.from([]).to([1, 2])

expect {
  InstrumentAllTheThings.histogram('my.histogram', 3, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.histogram('my.histogram', 5, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.histogram('my.histogram', 6, tags: ['a:b', 'foo:baz'])
  InstrumentAllTheThings.histogram('my.histogram', 9, tags: ['a:b'])
}.to change { histogram_values('my.histogram') }.to([3, 5, 6, 9])
  .and change { histogram_values('my.histogram', with_tags: ['a:b']) }.to([3, 5, 6, 9])
  .and change { histogram_values('my.histogram', with_tags: ['foo:bar']) }.to([3, 5])
  .and change { histogram_values('my.histogram', with_tags: ['foo:baz']) }.to([6])

Distribution

expect {
  InstrumentAllTheThings.distribution('my.distribution', 1)
  InstrumentAllTheThings.distribution('my.distribution', 2)
}.to change { distribution_values('my.distribution') }.from([]).to([1, 2])

expect {
  InstrumentAllTheThings.distribution('my.distribution', 3, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.distribution('my.distribution', 5, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.distribution('my.distribution', 6, tags: ['a:b', 'foo:baz'])
  InstrumentAllTheThings.distribution('my.distribution', 9, tags: ['a:b'])
}.to change { distribution_values('my.distribution') }.to([3, 5, 6, 9])
  .and change { distribution_values('my.distribution', with_tags: ['a:b']) }.to([3, 5, 6, 9])
  .and change { distribution_values('my.distribution', with_tags: ['foo:bar']) }.to([3, 5])
  .and change { distribution_values('my.distribution', with_tags: ['foo:baz']) }.to([6])

Timing

expect {
  InstrumentAllTheThings.timing('my.timing', 1)
  InstrumentAllTheThings.timing('my.timing', 2)
}.to change { timing_values('my.timing') }.from([]).to([1, 2])

expect {
  InstrumentAllTheThings.timing('my.timing', 3, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.timing('my.timing', 5, tags: ['a:b', 'foo:bar'])
  InstrumentAllTheThings.timing('my.timing', 6, tags: ['a:b', 'foo:baz'])
  InstrumentAllTheThings.timing('my.timing', 9, tags: ['a:b'])
}.to change { timing_values('my.timing') }.to([3, 5, 6, 9])
  .and change { timing_values('my.timing', with_tags: ['a:b']) }.to([3, 5, 6, 9])
  .and change { timing_values('my.timing', with_tags: ['foo:bar']) }.to([3, 5])
  .and change { timing_values('my.timing', with_tags: ['foo:baz']) }.to([6])

Time

The time helper wraps a block with reporting to the timing metric time. As such specs should leverage that spec helper.

expect {
  InstrumentAllTheThings.time('my.time') {}
  InstrumentAllTheThings.time('my.time') {}
}.to change { timing_values('my.time').length }.from(0).to(2)

expect {
  InstrumentAllTheThings.time('my.time', tags: ['a:b', 'foo:bar']) {}
  InstrumentAllTheThings.time('my.time', tags: ['a:b', 'foo:bar']) {}
  InstrumentAllTheThings.time('my.time', tags: ['a:b', 'foo:baz']) {}
  InstrumentAllTheThings.time('my.time', tags: ['a:b']) {}
}.to change { timing_values('my.time').length }.to(4)
  .and change { timing_values('my.time', with_tags: ['a:b']).length }.to(4)
  .and change { timing_values('my.time', with_tags: ['foo:bar']).length }.to(2)
  .and change { timing_values('my.time', with_tags: ['foo:baz']).length }.to(1)

Performance

Is it slow? That depends quite a bit on your perspective. We keep a benchmark as part of our specs and monitor the overall cost of instrumentation. The current numbers look roughly like:

Calculating -------------------------------------
      uninstrumetned     17.038M (± 3.3%) i/s -     85.164M in   5.004312s
           the_works      7.404k (±12.9%) i/s -     36.936k in   5.100630s
          only_trace     27.968k (±12.7%) i/s -    139.209k in   5.061907s
  only_error_logging    638.098k (± 4.6%) i/s -      3.231M in   5.075275s
only_execution_counts     9.847k (±11.1%) i/s -     49.088k in   5.073475s

All of these calculations are performed against an empty method, which ruby executes in 0.000059ms. If we consider this the baseline, running all of the included instrumentation in the non-error case has a cost of 0.135ms. If this is an expensive calculation is directly proportional to the execution duration of the method. If you are tracing a 1ms method, it represents a 13% overhead, but if you are tracing a 100ms method it is only 0.13%.

The moral of the story, don't blindly add the works to every method in your stack, instead choose methods who's output will be diagnostically meaningful.

We are also always looking for patches with will reduce execution time and allocations of the instrumentation, so please open some PRs!

Configuration

The configuration for IATT is available on the top level InstrumentAllTheThings module.

Config Name Description Default
stat_namespace The string to add to all outbound stats (may not be changed after stat transmiter initialization) nil
logger The logger used to report errors and info If the constant Rails is set, use Rails.logger.
If App and it responds to logger use App.logger. Otherwise create a new Logger sent to STDOUT
stat_reporter The class which receives simple stats If Datadog::Statsd is found, use that, otherwise the Blackhole client is used
tracer The instance of a tracer which will handle all traces If Datadog is defined and responds to tracer, use the value returned by that. Otherwise use the Blackhole. Gem

Stats Reporters

Datadog

The default client if the constant Datadog::Statsd is found.

Initialized with environment variables

  • DATADOG_HOST if set, otherwise localhost
  • DATADOG_POST if set, otherwise 8125

Tracers

Datadog

The default client if the constant Datadog is found and has a non-null value for tracer.

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Breaking Changes

V3 => V4

Removed the default StatsD based wrappers.

V2 => V3

IATT no longer takes ENV var configuration for directly. All DataDog configuration is handled by their env vars.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/GetTerminus/instrument_all_the_things.