Dynashard - Dynamic sharding for ActiveRecord
This package provides database sharding functionality for ActiveRecord models.
Sharding is disabled by default and is enabled with Dynashard.enable. This allows
sharding behavior to be enabled globally or only for specific environments; for example,
production environments could be sharded while development environments could
use a single database.
Models may be configured to determine the appropriate shard (database connection) to use based on context defined prior to performing queries. Different models may shard using different contexts.
class Widget < ActiveRecord::Base
  shard :by => :user
end

class Doohickie < ActiveRecord::Base
  shard :by => :vhost
end

class WidgetController < ApplicationController
  around_filter :set_shard_context

  def index
    # Widgets will be loaded using the connection for the current user's shard
    @widgets = Widget.find(:all)
    # Doohickies will be loaded using the connection for the vhost's shard
    @doohickies = Doohickie.find(:all)
  end

  private

  def set_shard_context
    Dynashard.with_context(:user => current_user.shard, :vhost => request.env['HTTP_HOST']) do
      yield
    end
  end
end
Sharded models are returned as objects of a shard-specific subclass.
> new_widget = Dynashard.with_context(:user => 'shard1') {Widget.new(:name => 'New widget')}
=> #<Dynashard::Shard0::Widget id: nil, name: "New widget">
> created_widget = Dynashard.with_context(:user => 'shard2') {Widget.create(:name => 'Created widget')}
=> #<Dynashard::Shard1::Widget id: 1, name: "Created widget">
> found_widget = Dynashard.with_context(:user => 'shard3') {Widget.find(:first)}
=> #<Dynashard::Shard2::Widget id: 4, name: "Found widget">
> found_widgets = Dynashard.with_context(:user => 'shard3') {Widget.find(:all)}
=> [#<Dynashard::Shard2::Widget id: 4, name: "Found widget">, #<Dynashard::Shard2::Widget id: 5, name: "Other found widget">]
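One way to picture shard-specific subclasses is a registry that creates a subclass of the model the first time a shard is seen and reuses it afterwards. The sketch below is not Dynashard's implementation: SubclassSketch and subclass_for are hypothetical names, the Widget class is a plain Ruby stand-in for a model, and numbering the ShardN namespaces in order of first use is an assumption based on the output above.

```ruby
# Hypothetical sketch, not Dynashard's internals.
module SubclassSketch
  @shard_modules = {}

  # Return the subclass of +base+ tied to +spec+, creating it on first use.
  def self.subclass_for(base, spec)
    # One namespace module per distinct spec: Shard0, Shard1, ...
    shard_mod = (@shard_modules[spec] ||=
      const_set("Shard#{@shard_modules.size}", Module.new))
    name = base.name.split('::').last
    if shard_mod.const_defined?(name, false)
      shard_mod.const_get(name, false)
    else
      shard_mod.const_set(name, Class.new(base))
    end
  end
end

class Widget; end # plain Ruby stand-in for an ActiveRecord model

first  = SubclassSketch.subclass_for(Widget, 'shard1')
second = SubclassSketch.subclass_for(Widget, 'shard2')
puts first.name   # SubclassSketch::Shard0::Widget
puts second.name  # SubclassSketch::Shard1::Widget
```

Because each subclass inherits from the model, checks such as is_a?(Widget) continue to hold for objects of the shard-specific class.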
New objects are saved on the shard with the context that was active when the object was initialized.
> new_widget.save
=> #<Dynashard::Shard0::Widget id: 1, name: "New widget"> # saved on 'shard1'
Created and found objects are updated on the shard with the context that was active when they were created or found.
> created_widget.update_attribute(:name, 'New name')
=> true # updated on 'shard2'
> found_widget.update_attributes(:name => 'Updated name')
=> true # updated on 'shard3'
Shard context values may be any valid argument to establish_connection(), such as a string naming a configuration from config/database.yml or a hash of database connection parameters. A value may also be an object that responds to :call and returns a valid argument to establish_connection().
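The three kinds of context value can be thought of as being normalized to a single connection spec before use. The helper below is a minimal sketch of that idea, not part of Dynashard's API: resolve_shard_spec is a hypothetical name, and the set of accepted types is an assumption.

```ruby
# Hypothetical helper: reduce a shard context value to something that could
# be handed to establish_connection.
def resolve_shard_spec(value)
  # Callables are invoked at lookup time, so the result may differ per call.
  value = value.call if value.respond_to?(:call)
  case value
  when String, Symbol, Hash
    value
  else
    raise ArgumentError, "unsupported shard spec: #{value.inspect}"
  end
end

# A named configuration, a hash of parameters, and a callable returning a hash.
by_name = resolve_shard_spec('shard1')
by_hash = resolve_shard_spec({:adapter => 'sqlite3', :database => 'db/shard3.sqlite3'})
by_proc = resolve_shard_spec(lambda { {:adapter => 'sqlite3', :database => 'db/dayslice0131'} })
```

Resolving the callable at lookup time rather than at assignment is what would let a value such as a date-based lambda pick a different database on different days.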
Load widgets from a shard defined in database.yml
$ cat config/database.yml
development:
  database: db/development.sqlite3
  <<: *defaults
shard1:
  database: db/shard1.sqlite3
  <<: *defaults
shard2:
  database: db/shard2.sqlite3
  <<: *defaults
> @widgets = Dynashard.with_context(:user => 'shard1') { Widget.find(:all) }
=> [#<Dynashard::Shard0::Widget id:1>, #<Dynashard::Shard0::Widget id:2>]
Load widgets from a shard using a hash of connection params
> conn = {:adapter => 'sqlite3', :database => 'db/shard3.sqlite3'}
> @widgets = Dynashard.with_context(:user => conn) { Widget.find(:all) }
=> [#<Dynashard::Shard2::Widget id:1>, #<Dynashard::Shard2::Widget id:2>]
Create a widget using a method to determine the shard
widget_shard = lambda do
  # Store widgets by month/day
  {:adapter => 'sqlite3', :database => "db/dayslice#{Time.now.strftime("%m%d")}"}
end
> Time.now
=> Mon Jan 31 17:37:23 -0800 2011
> widget_shard.call
=> {:database=>"db/dayslice0131", :adapter=>"sqlite3"}
> new_widget = Dynashard.with_context(:user => widget_shard) do
    Widget.create(:name => 'The newest of the widgets')
  end
=> #<Dynashard::Shard4::Widget id:3>
Use a Rails initializer for one-time configuration of shard context
$ cat config/initializers/dynashard.rb
# Put user-sharded data on the smallest shard
Dynashard.shard_context[:user] = lambda do
  Shard.order(:size).find(:first).dsn
end
> new_widget = Widget.create(:name => 'Put this on the smallest shard')
=> #<Dynashard::Shard5::Widget id:4>
Use with_context to override an earlier context setting
> Dynashard.shard_context[:user] = 'shard1'
> new_widget = Widget.create(:name => 'Put this on shard1')
=> #<Dynashard::Shard0::Widget id:5>
> new_widget = Dynashard.with_context(:user => 'shard2') do
    Widget.create(:name => 'Put this on shard2')
  end
=> #<Dynashard::Shard1::Widget id:6>
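The override behaviour shown above can be modelled as a block-scoped merge that restores the previous context when the block exits. This is a minimal sketch under that assumption, not Dynashard's code: ContextSketch is a hypothetical stand-in, and the sketch makes no attempt at thread safety.

```ruby
# Hypothetical stand-in for block-scoped shard context handling.
module ContextSketch
  # Long-lived settings, comparable to assigning Dynashard.shard_context[:user].
  def self.shard_context
    @shard_context ||= {}
  end

  # Apply overrides for the duration of the block, then restore the old state,
  # even if the block raises.
  def self.with_context(overrides)
    previous = shard_context.dup
    shard_context.merge!(overrides)
    yield
  ensure
    shard_context.replace(previous)
  end
end

ContextSketch.shard_context[:user] = 'shard1'
inside = ContextSketch.with_context(:user => 'shard2') do
  ContextSketch.shard_context[:user]
end
puts inside                              # shard2
puts ContextSketch.shard_context[:user]  # shard1
```

Restoring in an ensure clause means an exception inside the block would not leave a temporary override in place for later queries.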
Associated models may be configured to use different shards determined by the association's owner.
class Company < ActiveRecord::Base
  shard :associated, :using => :shard
  has_many :customers

  def shard
    # logic to find the company's shard
  end
end

class Customer < ActiveRecord::Base
  belongs_to :company
  shard :by => :company
end
Load a Company using the default ActiveRecord connection.
> c = Company.find(:first)
=> #<Company id:1>
Load Customers using the connection for the Company's shard. Associated models are returned as shard-specific subclasses of the association class.
> c.customers
=> [#<Dynashard::Shard0::Customer id: 1>, #<Dynashard::Shard0::Customer id: 2>]
Save new associations on the Company's shard.
> c.customers.create(:name => 'Always right')
=> #<Dynashard::Shard0::Customer id: 3>
TODO: add gotcha section, e.g.:
- many-to-many associations can only be used across shards in one direction, where the association target and the join table exist on the same database connection (otherwise joins don't work)
- uniqueness validations should be scoped by whatever the model is sharded by
- ways to shoot yourself in the foot with non-sharding association owners of sharded models
- investigate proxy extend for association proxy