Mongo Delta
Coordinated transfer between MongoDB clusters
Mongo Delta is a command-line tool that tails a MongoDB replica set's oplog (using mongoriver) and, based on a configured set of outlets, transfers documents to other MongoDB instances.
Installation
Install from RubyGems:
$ gem install mongo_delta
Or build the gem from source:
$ gem build mongo_delta.gemspec
Then install the built gem.
Configuration
Mongo Delta requires a configuration where you set up your source, various targets and outlets. This can be stored in a YAML file or in the source database.
Here's an example:
db: mongo_delta
service: mongo_delta
source: mongodb://mongorsa1:27017,mongorsa2:27017
targets:
  archive: mongodb://mongoarch:27017
outlets:
  event_archiver:
    outlet: Replicator
    target: archive
    db: db_name
    collection: events
The db and service options are optional and do the same as their
command line counterparts. The default for both is 'mongo_delta'.
These settings tell Mongo Delta where to persist the optime, which
tracks the point in time up to which the oplog has been processed. The
service option makes it possible to run multiple Mongo Delta processes
using the same source.
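The role of the persisted optime can be illustrated with a small in-memory sketch. This is not Mongo Delta's implementation: the Checkpoint class, the integer timestamps and the per-service hash are all hypothetical, chosen only to show why a restart resumes where it left off and why separate service names keep separate positions.

```ruby
# Hypothetical in-memory model of optime checkpointing; not Mongo Delta's code.
# Each oplog entry is modelled as a Hash with an integer :ts (timestamp).
class Checkpoint
  attr_reader :optime

  def initialize(optime = 0)
    @optime = optime
  end

  # Yields every entry newer than the saved optime, oldest first, and
  # advances the optime once each entry has been handled.
  def process(entries)
    entries.sort_by { |entry| entry[:ts] }.each do |entry|
      next if entry[:ts] <= @optime

      yield entry
      @optime = entry[:ts]
    end
  end
end

# One checkpoint per service name, so two processes tailing the same
# source keep independent positions.
checkpoints = Hash.new { |hash, service| hash[service] = Checkpoint.new }

oplog = [{ ts: 1, op: 'i' }, { ts: 2, op: 'u' }, { ts: 3, op: 'd' }]
seen = []

# First run handles the first two entries.
checkpoints['mongo_delta'].process(oplog.first(2)) { |entry| seen << entry[:ts] }
# A later run is handed the whole oplog but only handles what is new.
checkpoints['mongo_delta'].process(oplog) { |entry| seen << entry[:ts] }
```

In this model a second service name starts from its own checkpoint, which is the behaviour the service option is described as enabling.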
The source is the replica set whose oplog Mongo Delta is going to tail.
Under targets, several target connections can be listed.
Use MongoDB URIs for both options.
Finally, list the outlets, which handle the incoming data and pass it on to a target. Configure each outlet with the following options:
- outlet: name of one of the outlet implementations (see below)
- target: name of one of the targets
- db and collection: specify the namespace to which the outlet applies
- target_db and target_collection: optional, send data at the target to a different db and collection
- some outlets can have further options
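To make the relationship between targets and outlets concrete, here is a minimal Ruby sketch that loads a configuration like the one above and works out where each outlet reads from and writes to. The resolve_outlets helper is hypothetical and not part of Mongo Delta, the target_db value archive_db is made up for the example, and the fallback to the source db and collection when target_db or target_collection is absent is an assumption drawn from those options being optional.

```ruby
require 'yaml'

# Hypothetical helper, not part of Mongo Delta: checks that every outlet
# names a declared target and works out where its documents would land.
def resolve_outlets(config)
  targets = config.fetch('targets', {})

  config.fetch('outlets', {}).map do |name, outlet|
    target = outlet.fetch('target')
    unless targets.key?(target)
      raise ArgumentError, "outlet #{name}: unknown target #{target}"
    end

    # Assumption: without target_db / target_collection the source
    # namespace is reused on the target.
    to_db = outlet['target_db'] || outlet.fetch('db')
    to_collection = outlet['target_collection'] || outlet.fetch('collection')

    {
      name: name,
      uri: targets[target],
      from: "#{outlet.fetch('db')}.#{outlet.fetch('collection')}",
      to: "#{to_db}.#{to_collection}"
    }
  end
end

config = YAML.safe_load(<<~CONFIG)
  source: mongodb://mongorsa1:27017,mongorsa2:27017
  targets:
    archive: mongodb://mongoarch:27017
  outlets:
    event_archiver:
      outlet: Replicator
      target: archive
      db: db_name
      collection: events
      target_db: archive_db
CONFIG

resolved = resolve_outlets(config)
resolved.each { |o| puts "#{o[:name]}: #{o[:from]} -> #{o[:to]} at #{o[:uri]}" }
```

A check like this catches a misspelled target name before any process is started, which is the kind of mistake the two-level targets/outlets layout makes easy.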
Storing configuration in the source database
You can store this configuration in the source database. Use the
--source command line option and Mongo Delta will assume that the
configuration is located in the config collection of the mongo_delta
database with _id: 'mongo_delta'. The database and the service ID can
be overridden with the --db and --service options respectively.
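The lookup rule described above can be summarised in a few lines of Ruby. This is only a sketch of the documented defaults, not the tool's actual option handling: the config_location helper and the shape of the hash it returns are hypothetical, and only the three flags named in this section are modelled.

```ruby
require 'optparse'

# Hypothetical helper: given command line arguments, report where the
# configuration document would be looked up according to the rules above.
def config_location(argv)
  options = { db: 'mongo_delta', service: 'mongo_delta' }

  parser = OptionParser.new do |o|
    o.on('--source URI') { |value| options[:source] = value }
    o.on('--db NAME') { |value| options[:db] = value }
    o.on('--service NAME') { |value| options[:service] = value }
  end
  parser.parse(argv)

  {
    source: options[:source],
    database: options[:db],
    collection: 'config',
    id: options[:service]
  }
end

defaults = config_location(['--source', 'mongodb://localhost:27017'])
custom = config_location(
  ['--source', 'mongodb://localhost:27017', '--db', 'ops', '--service', 'archiver']
)
```

With only --source given, the sketch points at the config collection of the mongo_delta database and the document whose _id is 'mongo_delta'; passing --db and --service changes the database and the _id respectively.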
Example:
$ mongo mongo_delta
rs0:PRIMARY> db.config.save({
...   _id: 'mongo_delta',
...   outlets: {
...     event_archiver: {
...       outlet: 'Replicator',
...       target: 'live',
...       db: 'sourcedb',
...       collection: 'events',
...       target_db: 'archive'
...     }
...   },
...   targets: {live: 'mongodb://localhost:27017'}
... })
$ mongo_delta --source mongodb://localhost:27017
2013-06-10 21:24:29 - INFO: Registering event_archiver Replicator outlet for sourcedb.events
2013-06-10 21:24:29 - INFO: Starting stream
Usage
mongo_delta --config path/to/config.yml [options]
or if the configuration is stored in the source database:
mongo_delta --source mongodb://mongorsa1:27017,mongorsa2:27017 [options]
Run mongo_delta --help for more options.
Outlets
Replicator
This outlet simply repeats insert, remove and update operations on
the configured target. You can use this to keep a remote collection
in sync with your main MongoDB cluster. Keep in mind that the
replication is one-way.
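As a rough illustration of what repeating operations on a target means, the sketch below replays a list of operations against an in-memory collection. It is a toy model and not the Replicator's code: the operation hashes, the replicate helper and the Hash standing in for a collection are all hypothetical, and an update is simplified to merging changed fields into an existing document.

```ruby
# Hypothetical in-memory model: a collection is a Hash keyed by _id and
# each operation is a Hash with a :type of :insert, :update or :remove.
def replicate(target, operation)
  case operation[:type]
  when :insert
    doc = operation[:doc]
    target[doc['_id']] = doc
  when :update
    id = operation[:selector]['_id']
    # Simplification: merge the changed fields into an existing document.
    target[id] = target[id].merge(operation[:changes]) if target.key?(id)
  when :remove
    target.delete(operation[:selector]['_id'])
  end
  target
end

operations = [
  { type: :insert, doc: { '_id' => 1, 'name' => 'signup' } },
  { type: :insert, doc: { '_id' => 2, 'name' => 'login' } },
  { type: :update, selector: { '_id' => 1 }, changes: { 'seen' => true } },
  { type: :remove, selector: { '_id' => 2 } }
]

# Replaying the operations in order rebuilds the source's state on the target.
target = operations.reduce({}) { |collection, op| replicate(collection, op) }
```

Nothing in this model flows from the target back to the source, which mirrors the one-way nature of the replication.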
Sharded clusters
Mongo Delta does not have special support for sharded MongoDB clusters
at this time. It should be possible to run a separate mongo_delta
instance against each of the individual backend shard replica sets,
with the configuration otherwise unchanged.
Development
Patches and contributions are welcome! Please fork the project and open a pull request on GitHub, or just report issues.
Mongo Delta assumes the source MongoDB to be a replica set member. You
can create a single-member replica set on your development machine
by running mongod with the --replSet rs0 option, and then running
the following command in the mongo shell:
rs.initiate({_id: 'rs0', members: [{ _id: 0, host: '127.0.0.1:27017'}]})