Beaver, chewing through logs to make something useful

Beaver is a light DSL and command line utility for parsing HTTP and Rails production logs back into usable data. It answers questions like:

  • How many failed logins have there been today? Were they for the same user? From the same IP?

  • How many 500, 404, etc. errors yesterday? On what pages?

  • How many Widgets were created yesterday, and with what data?

  • Did anyone submit a form with the words “kill them all”? Yikes.

  • Rails 3.2 tagged logging is cool, but what’s a good way to review them?

For a full list of matchers available to “hit”, see the Beaver::Dam class. For a full list of methods available inside a “hit” block, or to members of the “hits” array in a “dam” block, see the Beaver::Request, Beaver::Parsers::Rails, and Beaver::Parsers::HTTP classes.

Installation

[sudo] gem install beaver

Use beaver with a DSL file

hit :failed_logins, :method => :post, :path => '/login', :status => 401

hit :new_widgets, :path => '/widgets', :method => :post, :status => 302 do
  puts "A Widget named #{params[:widget][:name]} was created!"
end

hit :help, :path => %r|^/help| do
  skip! if path == '/help/page_i_want_to_ignore'
  puts "#{ip} looked for help at #{path}"
end

hit :tagged => 'user 1' do
  puts "user 1 was tagged at #{path} - other tags were: #{tags.join(', ')}"
end

hit :errors, :status => (500..503)

dam :failed_logins do
  puts "Failed logins:"
  hits.group_by { |h| h.params[:username] } do |user, fails|
    puts " #{user}"
    fails.each do |hit|
      puts "  from #{hit.ip} at #{hit.time.to_s}"
    end
  end
end

dam :errors do
  puts tablize(' | ') { |hit| [hit.status, hit.path, hit.ip, hit.time] }
end

Run ”beaver” from the command line, passing in your beaver file and some logs:

beaver my_beaver_file.rb /var/www/rails-app/log/production.log*

Use beaver as a library

require 'rubygems'
require 'beaver'

Beaver.stream '/path/to/httpd/access_logs*' do
  hit :failed_logins, :method => :post, :path => '/login', :status => 401
  dam :failed_logins do
    puts "#{hits.size} failed logins!"
  end
end

Or with even less DSL…

beaver = Beaver.new('/path/to/httpd/access_logs*')
beaver.hit :failed_logins, :method => :post, :path => '/login', :status => 401
beaver.dam :failed_logins do
  puts "#{hits.size} failed logins!"
end
beaver.stream

Or just to parse and return the matching requests…

requests = Beaver.dam('/path/to/rails/production.log*', :method => :put)

Or as just a parser (no filtering)…

requests = Beaver.parse('/path/to/rails/production.log*')

Use beaver as part of the *nix toolchain

It’s difficult to grep through a multi-line log format like Rails’ and output each matching multi-line event (though I hear Google is working on a ‘Context-Free Grep’, which may help solve that). Until then, for Rails anyway, beaver is happy to step in.

beaver --path="/widgets" --method=post,put --tagged=tag1,tag2 /var/www/rails-app/log/production.log

Or format the output to a single line:

beaver --controller=widgets /var/www/rails-app/log/production.log --print "%{ip} hit %{action} using %{method}"

Also accepts log content from pipes and stdin. Use it to filter log files:

cat /var/www/rails-app/log/production.log* | beaver --action=edit

beaver --action=edit < /var/www/rails-app/log/production.log.1 

beaver --action=edit --stdin

Or for dead-simple real-time event monitoring:

tail -f /var/log/nginx/access_log | beaver www.rb

See all options with ‘beaver –help’.

Example use with Logwatch

This assumes 1) you’re rotating your Rails logs daily and 2) you’re running logwatch daily.

Check your beaver DSL file into your Rails app, maybe under /script.

Add a logwatch config file for your new service at /etc/logwatch/conf/services/your_app.conf:

Title = "Your Rails App"
LogFile = NONE

In /etc/logwatch/scripts/services/your_app:

beaver /var/www/your_app/script/beaver.rb --yesterday /var/www/your_app/log/production.log{,.1}

Your Rails app should return appropriate HTTP statuses

Rails does a lot of great things for us, but one thing largely up to us are HTTP status codes. For example, your failed logins are probably returning 200 when they should arguably be returning 400 or 401. It’s easy to do, and very useful to Beaver.

render :action => :login, :status => 401

A detailed description of each status code and when to use it can be found at www.w3.org.

Complex Rails tag querying

Beaver supports complex Rails Tagged Logger tag quering, in both DSL and command-line modes.

DSL

# Matches any request tagged with "foo"
hit :tagged, :tagged => 'foo'
# Could also be written as ['foo']

# Matches any request tagged with "foo" AND "bar"
hit :tagged_with_all, :tagged => 'foo, bar'
# Could also be written ['foo', 'bar']

# Matches any request tagged with "foo" OR "bar"
hit :tagged_with_any, :tagged => [['foo'], ['bar']]

# Matches any request tagged with ("foo" AND "bar") OR ("bar" AND "baz") OR "yay"
hit :tagged_and_or_and_or, :tagged => [['foo', 'bar'], ['bar', 'baz'], ['yay']]
# Could also be written ['foo, bar', 'bar, baz', 'yay']

Command-line

# Matches any request tagged with "foo"
beaver --tagged foo production.log

# Matches any request tagged with "foo" AND "bar"
beaver --tagged foo,bar production.log

# Matches any request tagged with "foo" OR "bar"
beaver --tagged foo --tagged bar production.log

# Matches any request tagged with ("foo" AND "bar") OR ("bar" AND "baz") OR "yay"
beaver --tagged foo,bar --tagged bar,baz --tagged yay production.log

License

Copyright 2011 Jordan Hollinger

Licensed under the Apache License