Project Description

dm-cutie - The 'cutie' stands for Query Tracking. She is super thorough and easy as hell, just the way you like it.

Tracks queries that are executed by DataMapper. Models can be optionally excluded from tracking. Cutie currently only supports DataObjects Adapters.

Quickstart- The whole caboodle.

There are two major pieces to the Cutie Project.

  • dm-cutie - the query tracking portion - this gets installed 'in your app'
  • dm-cutie-ui - A sinatra app that can connect to your dm-cutie repo and provide statisics.

Installing the query tracker (dm-cutie) $ gem sources -a http://gemcutter.org # Add gem cutter to your gem sources $ gem install dm-cutie $ gem install dm-cutie-extras # This is a plugin pack for dm-cutie $ gem install dm-cutie-ui # this is the front-end to dm-cutie

Adding it to your projects # Somewhere after your DataMapper repositories are setup require 'dm-cutie' DataMapper::Cutie.enable_adapter :data_objects

# If you want some extras, sorry its Mysql centric, its what Im primarily familiar with
# require 'dm-cutie-extras'
#
# # Query Plan for Sqlite3
# DataMapper::Cutie::Extras.load :sqlite3_execution_step
#
# # Query Plan for Mysql
# DataMapper::Cutie::Extras.load :mysql_execution_step
#
# # Mysql Index information
# DataMapper::Cutie::Extras.load :mysql_index
#
# # Mysql warnings & errors
# DataMapper::Cutie::Extras.load :mysql_warning
#

DataMapper::Cutie.setup do |c|
  # The name of the datamapper repository to store dm cutie tracked data in.
  #   If you are using something like mysql make sure that you created the schema
  #   
  c[:repo_name]       = :dm_cutie 

  # List of any models in YOUR application that you want dm-cutie to ignore
  c[:exclude_models]  = false #[:person, :car]

  # List of models you want dm-cutie to specifically track
  c[:only_models]     = false #[:article, :address]

  # What you consider a slow query
  c[:slow_query_length] = 2
end

# passing true will tell dm-cutie to force a migration of all of HER models.
#   she will always do a migration if she can't find her tables. In a development
#   environment it is recommended to pass 'true' as your DataMapper models are probably
#   changing often and there for should be re-profiled.
DataMapper::Cutie.start(true)

# If you are doing long term tracking in a production environment you will want to pass
#   'false'  DM cutie will still migrate her tables the first time to make sure everything
#   is set up correctly. PS, dm-cutie causes TONS of queries to be executed to adequatly track
#   what is going on in your application.  It would be recommended to only use her in development
#   and staging environments.
# DataMapper::Cutie.start(false)

Starting the dm-cutie-ui sinatra app # A bin file should have been included with the gem # # Want to see options? # dm-cutie-ui --help # # The default port is 9292 $ dm-cutie-ui

Log into dm-cutie-ui

  • Go to http://localhost:9292 #or whatever port you used
  • Enter the path to your dm-cutie repo # mysql://root@localhost/dm-cutie
  • Click 'connect' or enter a password if the account you are connecting with needs a password
  • Learn?

How It Works

DM Cutie overrides DataObjectsAdapters and puts in some hooks to track queries. Cutie uses four models to store information about queries

  • GeneralizedQuery - This is a query without variables bound to it
    • Example: Person.first(:email => "[email protected]") would be stored as "SELECT * FROM people WHERE email = ?"
    • This query will be unique to this table.
    • Useful for determine how many generic queries are being generated by your application
  • ExecutedQuery - This is a query WITH variables bound to it
    • Example: Person.first(:email => "[email protected]") would be stored as "SELECT * FROM people WHERE email = '[email protected]'"
    • EVERY executed statement is stored in this table, executed statements do not have a unique constraint.
    • This table has a foreign Key to GeneralizedQuery, so you can determine which ExecutedQuery objects are related
      • Example: Person.first(:email => "[email protected]") and Person.first(:email => "[email protected]")
      • Both ExecutedQuery objects would relate back to the GeneralizedQuery, Person.first(:email => ?) ("SELECT * FROM people WHERE email = ?")
    • This also stores the line number and file name of the caller.
  • RepositoryStorage - This keeps track of all storages (tables) across repositories.
    • UnboundQueries 'belong' to RepositoryStorages through QueryStorageLink
    • Example:
      • if you have a Person model that is stored in a YAML repository that would be one RepositoryStorage
      • if you have a Person model that is stored in a MySQL repostiroy that would be an additional RepositoryStorage
      • UnboundQueries directed at MySQL for the Person model would be related to the MySQL Person RepositoryStorage object
  • QueryStorageLink - ties a RepositoryStorage
    • Denotes whether a RepositoryStorage was the 'primary storage' for a query
    • The primary storage is determine by the Repository#name and Repository#adapter of the query object
    • Other QueryStorageLinks on an GeneralizedQuery are a result of relationships that are stored in the Query#links

Important

DM Cutie generates a ton of queries itself to track what is going on in DataMapper. It would be wise to use it to only to track your queries in a Staging or Development environment or temporarily in production.

DM Cutie essentially records a 'snapshot' of everything that happened in your applications repository.

Usage

require 'dm-cutie' 

# Load the hook for the adapter you want to track
DataMapper::Cutie.enable_adapter :data_objects

#note, these haven't been written yet
# DataMapper::Cutie.enable_adapter :yaml 
# DataMapper::Cutie.enable_adapter :in_memory

# Optional
DataMapper::Cutie.setup do |c|
  # The repo to use to store Cutie's data in
  c[:repo_name]                  = :dm_cutie

  # Models to always exclude from tracking
  c[:exclude_models]            = false #[:person, :car]

  # Models to only tracker
  c[:only_models]               = false #[:article, :address]

  # Time in seconds you consider a query to be slow
  c[:slow_query_length]         = 2
end

DataMapper::Cutie.start

Adapter Trackers vs Tracker Hooks

  • Adapter Trackers are the basic trackers for a specific adapter (DataObjectsAdapter, MySQLAdapter, etc)
    • Provide the basic functionality to 'capture' a query in progress and the basic storage of details in ExecutedQuery, GeneralizedQuery, RepositoryStorage, and QueryStorageLink
    • Currently this is NOT DRY and is a bit hostile, as it literally reimplements the CRUD layer methods in DO
  • Tracker Hooks allow you to create additional storages for details that are outside the basic 4 models included in Cutie
    • Example: All SQL queries have a RepositoryStorage (the tables in the FROM statement), but only a MySQL query would have a MySQL execution plan, so you can have a Tracker Hook that will only store MySQL Execution plans if its the mysql adapter
    • Allow you to perform logic on an 'executed query'

Enabling Tracker Hooks

A project of tracker hooks exists called dm-cutie-extras.

require 'dm-cutie'
DataMapper::Cutie.enable_adapter :data_objects

# Simply require the additional trackers
require 'dm-cutie-extras'
DataMapper::Cutie::Extras.load :mysql_execution_step
DataMapper::Cutie::Extras.load :sqlite3_execution_step

DataMapper::Cutie.start(true)

Currently included are:

  • MySQL Execution Step - Runs a query plan for MySQL queries
  • Sqlite3 Execution Step - Runs a query plan for Sqlite3 queries

Writing Tracker Hooks

To write a tracker simply include DataMapper::Cutie::Tracker::Hook::Abstract into a class and define a few methods class MyTrackerHook include DataMapper::Cutie::Tracker::Hook::Abstract

  # The name of your hook
  def self.hook_name
    "Really cool hook"
  end

  # the statements this tracker applies to
  def self.supported_statements
    [:select]
  end

  # The adapters this tracker applies to
  def self.suppored_adapters
    [:mysql]
  end

  # This is a factory method that will run your tracking logic
  # this method will receive the current ExecutedQuery
  #
  def self.track( executed_query )
    # Your tracking logic goes here
  end
end

Then let Cutie know she has a new hook to work with DataMapper::Cutie.add_tracker_hook :my_tracker_hook #Symbol representation of your class name

Example "Hi Mom! Tracker Hook" - Sends mom an email telling her what SELECT statements where ran

class HiMomTrackerHook
  include DataMapper::Cutie::Tracker::Hook::Abstract

  # The name of your hook
  def self.hook_name
    "Hi Mom! Tracker hook"
  end

  # the statements this tracker applies to
  def self.supported_statements
    [:select]
  end

  # The adapters this tracker applies to
  def self.suppored_adapters
    [:mysql]
  end

  def self.track( executed_query )
    # This is an absolutely awesome tracker, because my mom can optimize the heck out of a query.

    my_cool_mail_method(
      :to => "[email protected]",
      :from => "[email protected]",
      :subject => "my friends are reading my data",
      :body => "My friend just executed this query: #{executed_query.generalized_query.statement}"
    )
  end
end

# Of course tell cutie she has a new hook
DataMapper::Cutie.add_tracker_hook :hi_mom_tracker_hook

Example "Which Server Executed What" - Keeps track of what Application server sent the query # Create a model to store server names in
class QueryingServer
require 'socket' include DataMapper::Cutie::Tracker::Hook::Abstract

  include DataMapper::Resource
  has n, :executed_queries

  property :id, Serial
  property :name, String, :unique_index => true

  def self.hook_name
    "Which Server Executed A Query Hook"
  end

  def self.supported_statements
    [:select, :insert, :update, :delete]
  end

  def self.suppored_adapters
    [:mysql]
  end

  def self.track( executed_query )
    DataMapper::Cutie.repo do #get access to the Cutie's repo (you can store this wherever, probably makes sense here though)

      trans = DataMapper::Transaction.new #start a transaction (optional)

      trans.link do 
        querying_server = QueryingServer.first_or_create(:name => Socket.gethostname)
        querying_server.executed_queries << executed_query
        querying_server.save
      end

    end
  end

end

# Let ExecutedQuery know its getting a new column :P
ExecutedQuery.send :belongs_to, :querying_server

# Of course let cutie know she has a new hook
DataMapper::Cutie.add_tracker_hook :querying_server

# Here we have one other concern, we've created a new model: QueryingServer.  We want Cutie to treat this model as an internal model, so she doesn't track it (which would cause an infinite loop)
DataMapper::Cutie.add_internal_model :querying_server    

Tracker hooks don't have to be a model, but they can be.

The #track method receives a ExecutedQuery object. Here are some methods available on that object

executed_query.generalized_query  #The GeneralizedQuery object
executed_query.statement      #The bound SQL Statement
executed_query.generalized_query.statement #The generalized SQL Statement
executed_query.bind_values    #The bind values for the SQL Statement
# etc... Everything that is available on a DataMapper::Query object

Writing an Adapter Tracker

Trackers should 'hook' around the method that is to be tracked. In the DataObjectsTracker example the methods being tracked are

  • select_statement
  • insert_statement
  • update_statement
  • delete_statement

A model that is performing a crud action will only be tracked if the adapter of the repository that the action is happening in is based on the DO Adapter.

TODOS

  • Model.all => Doesnt track -> Probably need to add an override to Collection class
  • Rspecs, convert test_script.rb to specs

  • Format documentation for rdoc/yard whatever.

  • is there an easier way to 'hi jack' something at the DO Adapter level other than making the code unDRY in /tracker/data_objects/overrides

    • Tried using extlib hooks, but needed access to local variables in the DO CRUD methods
  • Namespace the Models so they dont interfere with anyones code if they have the same model names.

  • Tracker hooks trackers should automatically be stored in DataMapper::Cutie.repo instead of having to explicitly call it

  • Spot Tracking => Cutie.track...statements... #So you can track a specific area, rather than the whole system

CONSIDERATIONS

  • Should this be pushed back to the DO level?
    • I like using DM to access data... also the potential to track DM Sphinx, etc
    • Its highly geared toward SQL, can it be abstracted
    • use CRUD operations instead of :select, :insert, :update, :delete
  • Get self.default_respository_name to work so all CUTIE queries dont have to be done in Cutie.repo block
    • class_eval def self.default_repository_name; #THE_NAME; end;
  • Add ability to exclude/include repositories. You might have 3 MySQL repos, and are tracking the DO adapter, but are only interested in a particular database