ZD
ZD is a zero-downtime data migration framework that sits on top of the data store (or stores) of your choice. It achieves zero downtime by moving your data through a series of states:
- Unrun: The migration is implemented, but has not been run yet.
- Prepared: Any necessary creation of a "space" for the migrated data has been done. Typically not needed for schema-less stores. Your application code now writes to both the old and new data structures.
- Migrated: All pre-existing data has been copied and/or mutated into the new locations while continuing to exist as-is in the old locations. The code base is still operating off of the old locations (and new data continues to be written to both locations).
- Switched: The code base now uses the new locations and ignores the old locations. Data is still written to both old and new locations. This is the point at which you would want to test the system to make sure the migration worked as expected. If something isn't right, the migration can still be rolled all the way back to the unrun state.
- Completed: The migration has been verified to be working, and new data is no longer written to the old locations. Migration-specific code can be stripped out of the code base now. Once a migration is completed, it cannot be rolled back. Any references to the migration in the codebase will generate a warning.
- Destroyed: The old locations for data have been removed from the data store. Any references to the migration in the codebase will raise an error.
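The state progression above can be sketched as a small lookup module. This is only an illustration of the rules in the list, not ZD's actual implementation: the MigrationStates module and its method names are made up, and it assumes rollback is available in every state before completed.

```ruby
# Hypothetical sketch of the state ordering described above; not ZD's real API.
module MigrationStates
  ORDER = %i[unrun prepared migrated switched completed destroyed].freeze

  # The state a migration moves to next, or nil once it is destroyed.
  def self.next_state(state)
    ORDER[position(state) + 1]
  end

  # Assumption: rollback is possible in any state before completed.
  def self.rollback_allowed?(state)
    position(state) < position(:completed)
  end

  # What a reference to the migration in the codebase does in a given state.
  def self.reference_behavior(state)
    case state
    when :completed then :warn
    when :destroyed then :error
    else :ok
    end
  end

  # Index of a state in ORDER; raises for anything that is not a known state.
  def self.position(state)
    ORDER.index(state) or raise ArgumentError, "unknown state: #{state.inspect}"
  end
end
```

For example, MigrationStates.rollback_allowed?(:switched) is true, while MigrationStates.rollback_allowed?(:completed) is false.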
Installation
Just add zd to your Gemfile:
gem 'zd'
And run bundle install.
Usage
To show how ZD works, let's walk through a simple example. Let's say you have a Person class, which used to store separate first names and last names. You've since expanded internationally and realized what a bad idea this is in general, and so now you need to fix your mistake down to the data level without taking your application down (though rolling restarts are OK). Your Person class looks like this initially:
class Person
  include AwesomeDB
  def first_name
    read(:first_name)
  end
  def last_name
    read(:last_name)
  end
  def first_name=(value)
    write(:first_name, value)
  end
  def last_name=(value)
    write(:last_name, value)
  end
  def name
    [first_name, last_name].compact.join(" ")
  end
  def name=(value)
    first_name, *rest = value.split(/\s+/)
    write(:first_name, first_name)
    write(:last_name, rest.join(" "))
  end
end
(The read and write methods are made-up access methods for the made-up AwesomeDB data store.)
Currently you still have code using both name and first_name/last_name, but you're slowly cleaning it up. The key thing is that all the methods on the class continue to obey their contract throughout the data migration.
To get started, you'll want to generate a new migration with zd new <name>. Migrations go in the db/migrate folder in your project, and use a timestamped filename (similar to ActiveRecord migrations). A fresh migration looks something like this:
class Migrations::MergeFirstAndLastName < ZD::Migration
  register! depends_on: :nothing
  def prepare
  end
  def migrate
  end
  def destroy
  end
end
And here is what it might look like after the migration is filled out:
class Migrations::MergeFirstAndLastName < ZD::Migration
  register! depends_on: :nothing
  def prepare
    Person.add_field :name
  end
  def migrate
    Person.each do |person|
      person.name = [person.first_name, person.last_name].compact.join(" ")
    end
  end
  def destroy
    Person.remove_field :first_name
    Person.remove_field :last_name
  end
end
The Person.add_field and Person.remove_field methods are made up; you would just use whatever your data store provides (if necessary; many schemaless datastores won't even need the prepare step).
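To make the effect of the migrate step concrete, here is a rough sketch of what it does to a single record, using a plain Hash as a stand-in for a stored Person. The merge_name helper is hypothetical and exists only for this illustration. The point to notice is that the old fields are left untouched, which is what keeps rollback possible later on.

```ruby
# Hypothetical helper: applies the merge to one record represented as a Hash.
# The old fields are kept as-is; only the new :name field is added.
def merge_name(record)
  full_name = [record[:first_name], record[:last_name]].compact.join(" ")
  record.merge(name: full_name)
end

before = { first_name: "Ada", last_name: "Lovelace" }
after  = merge_name(before)
# after holds first_name, last_name, and name side by side; before is unchanged.
```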
This is all well and good, but how does the model handle the fact that the data format is shifting around underneath it? ZD provides state-based methods that mark which code should run in which migration state:
class Person
  include AwesomeDB
  def first_name
    ZD[:merge_first_and_last_name].HANDLE do |m|
      m.UNTIL_SWITCHED { read(:first_name) }
      m.ONCE_SWITCHED { @first_name ||= name.split(/\s+/).first }
    end
  end
  def last_name
    ZD[:merge_first_and_last_name].HANDLE do |m|
      m.UNTIL_SWITCHED { read(:last_name) }
      m.ONCE_SWITCHED { @last_name ||= name.split(/\s+/)[1..-1].join(" ") }
    end
  end
  def first_name=(value)
    ZD[:merge_first_and_last_name].HANDLE do |m|
      m.ONCE_PREPARED { write(:name, [value, last_name].compact.join(" ")) }
      m.UNTIL_COMPLETED { write(:first_name, value) }
    end
  end
  def last_name=(value)
    ZD[:merge_first_and_last_name].HANDLE do |m|
      m.ONCE_PREPARED { write(:name, [first_name, value].compact.join(" ")) }
      m.UNTIL_COMPLETED { write(:last_name, value) }
    end
  end
  def name
    ZD[:merge_first_and_last_name].HANDLE do |m|
      m.UNTIL_SWITCHED { return [first_name, last_name].compact.join(" ") }
      m.ONCE_SWITCHED { read(:name) }
    end
  end
  def name=(value)
    ZD[:merge_first_and_last_name].HANDLE do |m|
      m.ONCE_PREPARED { write(:name, value) }
      m.UNTIL_COMPLETED do
        first_name, *rest = value.split(/\s+/)
        write(:first_name, first_name)
        write(:last_name, rest.join(" "))
      end
    end
  end
end
The first thing you're probably thinking after seeing that is, "Who hit my code with the ugly stick!?!" But that's actually a feature of ZD: migration-specific code sticks out like a sore thumb so that there will be lots of motivation to strip it out once the migration is complete. Migration code should be robust but temporary.
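If the state-based methods look like magic, here is one way they could work. This is a minimal sketch under assumptions, not ZD's real implementation: the StateHandler class and the handle helper are made up, and it assumes that ONCE_X runs its block when the migration has reached state X, that UNTIL_X runs its block while it has not, and that the value of the last block to run is returned.

```ruby
# Hypothetical sketch of state-gated blocks; not ZD's actual code.
STATES = %i[unrun prepared migrated switched completed destroyed].freeze

class StateHandler
  attr_reader :result

  def initialize(current_state)
    @current = STATES.index(current_state) or
      raise ArgumentError, "unknown state: #{current_state.inspect}"
  end

  STATES.each_with_index do |state, index|
    # ONCE_<STATE>: run the block when the migration has reached <state>.
    define_method("ONCE_#{state.upcase}") do |&block|
      @result = block.call if @current >= index
    end

    # UNTIL_<STATE>: run the block while the migration has not reached <state>.
    define_method("UNTIL_#{state.upcase}") do |&block|
      @result = block.call if @current < index
    end
  end
end

# Stand-in for ZD[:some_migration].HANDLE; the state is passed in directly.
def handle(current_state)
  handler = StateHandler.new(current_state)
  yield handler
  handler.result
end

# While prepared, a writer hits both the new and the old location.
writes = []
handle(:prepared) do |m|
  m.ONCE_PREPARED { writes << :new_location }
  m.UNTIL_COMPLETED { writes << :old_location }
end
```

Under these assumptions a reader always runs exactly one of its UNTIL_SWITCHED and ONCE_SWITCHED blocks, while a writer runs both of its blocks in the prepared, migrated, and switched states.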
Once your migration and migration-specific code are in place, you can start walking your data through the migration states using zd:
$ zd prepare
All migrations in the unrun state will be transitioned to the prepared state via the prepare action.
$ zd migrate
All migrations in the prepared state will be transitioned to the migrated state via the migrate action.
$ zd switch
All migrations in the migrated state will flip over to switched. This triggers all code to start using the new code paths.
This is the point at which you should verify that your migrations have been successful and all the new code is working as expected in production. Getting back to the old state is as easy as zd switchoff [name].
$ zd complete
All migrations in the switched state will flip over to completed. This triggers all code to stop writing to old locations, and puts you past the point of no return for an easy rollback. Once you get here, it's time to go through your codebase and rip out the migration-specific code blocks, just leaving the code that deals with the new data structure.
$ zd destroy
All migrations in the completed state will be transitioned to the destroyed state via the destroy action. Typically this is the point at which old data gets cleaned up. Note that once your migration gets to this state, continued references to it in your code will raise an error.
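The five commands above all follow the same pattern: pick up every migration in one state, optionally run an action on it, and record the next state. The sketch below illustrates that pattern only; TRANSITIONS and run_command are hypothetical names, and a plain Hash stands in for wherever ZD actually tracks migration state.

```ruby
# Hypothetical sketch of the command-to-transition mapping; not ZD's real code.
# switch and complete only flip the state, so they have no action to call.
TRANSITIONS = {
  prepare:  { from: :unrun,     to: :prepared,  action: :prepare },
  migrate:  { from: :prepared,  to: :migrated,  action: :migrate },
  switch:   { from: :migrated,  to: :switched,  action: nil },
  complete: { from: :switched,  to: :completed, action: nil },
  destroy:  { from: :completed, to: :destroyed, action: :destroy }
}.freeze

# states:     Hash of migration name => current state.
# migrations: Hash of migration name => object responding to
#             prepare, migrate, and destroy.
def run_command(command, states, migrations)
  transition = TRANSITIONS.fetch(command)
  states.each do |name, state|
    # Only migrations sitting in the command's starting state are touched.
    next unless state == transition[:from]

    action = transition[:action]
    migrations.fetch(name).public_send(action) if action
    states[name] = transition[:to]
  end
  states
end
```

Because each command only touches migrations in its own starting state, running a command twice in a row is harmless in this sketch: the second run finds nothing left to transition.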
And that's all there is to it! You can either leave old migration files in db/migrate, or delete them once you're done with them - the overhead for each one is very small. Oh, and here's what the Person class looks like once you're done:
class Person
  include AwesomeDB
  def first_name
    @first_name ||= name.split(/\s+/).first
  end
  def last_name
    @last_name ||= name.split(/\s+/)[1..-1].join(" ")
  end
  def first_name=(value)
    write(:name, [value, last_name].compact.join(" "))
  end
  def last_name=(value)
    write(:name, [first_name, value].compact.join(" "))
  end
  def name
    read(:name)
  end
  def name=(value)
    write(:name, value)
  end
end
No more ugly!