Mongify
A data translator from SQL databases to MongoDB.
Supports MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2 (basically anything ActiveRecord has built in). However, I’ve only tested it with MySQL and SQLite.
Supports any version of MongoDB
Learn more about MongoDB at: www.mongodb.org
Install
gem install mongify
NOTICE
You might have to install the ActiveRecord database gems for your SQL database (you'll know if you get an error message while running the mongify command).
Usage
Creating a configuration file
In order for Mongify to do its job, it needs to know where your databases are located.
Building a config file is as simple as this:
sql_connection do
  adapter "mysql"
  host "localhost"
  username "root"
  password "passw0rd"
  database "my_database"
  batch_size 10000 # Defaults to 10000; lower it on machines with less RAM
  # Uncomment the following line if you get a "String not valid UTF-8" error.
  # encoding "utf8"
end
mongodb_connection do
  host "localhost"
  database "my_database"
  # Uncomment the following line if you get a "String not valid UTF-8" error.
  # encoding "utf8"
end
You can check your configuration by running:
mongify check database.config
Options
Currently the only supported option is for the mongodb_connection.
mongodb_connection :force => true do # Forcing a mongodb_connection will drop the database before processing
  # ...
end
You can omit the mongodb connection until you’re ready to process your translation.
Generating or creating a translation
Generating a translation
If your database is large and complex, it might be a bit too much work to write the translation file by hand. Mongify’s translation command can help with this:
mongify translation database.config
Or redirect the output straight into a file by running:
mongify translation database.config > translation_file.rb
Creating a translation
Creating a translation is pretty straightforward. It looks something like this:
table "users" do
  column "id", :key
  column "first_name", :string
  column "last_name", :string
  column "created_at", :datetime
  column "updated_at", :datetime
end
table "posts" do
  column "id", :key
  column "title", :string
  column "owner_id", :integer, :references => :users
  column "body", :text
  column "published_at", :datetime
  column "created_at", :datetime
  column "updated_at", :datetime
end
table "comments", :embed_in => :posts, :on => :post_id do
  column "id", :key
  column "body", :text
  column "post_id", :integer, :references => :posts
  column "user_id", :integer, :references => :users
  column "created_at", :datetime
  column "updated_at", :datetime
end
table "preferences", :embed_in => :users, :as => :object do
  column "id", :key, :as => :string
  column "user_id", :integer, :references => "users"
  column "notify_by_email", :boolean
end
table "notes", :embed_in => true, :polymorphic => 'notable' do
  column "id", :key
  column "user_id", :integer, :references => "users"
  column "notable_id", :integer
  column "notable_type", :string
  column "body", :text
  column "created_at", :datetime
  column "updated_at", :datetime
end
Save the file as "translation_file.rb"
and run the command:
mongify process database.config translation_file.rb
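To make the :embed_in options in the translation above more concrete, here is a minimal plain-Ruby sketch of the idea behind embedding as an array versus embedding with :as => :object. It does not use Mongify: the embed helper and the sample rows are hypothetical, and the exact document shape Mongify produces (for example, whether the foreign key is kept) may differ.

```ruby
# Hypothetical illustration only: plain Hashes stand in for SQL rows.
# embed() is NOT part of Mongify; it just mimics the idea of :embed_in.
def embed(parents, children, name, on:, as: :array)
  parents.map do |parent|
    # Find the child rows whose foreign key points at this parent.
    matches = children.select { |child| child[on] == parent["id"] }
    # :object keeps a single child (one to one); :array keeps them all.
    value = as == :object ? matches.first : matches
    parent.merge(name => value)
  end
end

posts = [{ "id" => 1, "title" => "Hello" }, { "id" => 2, "title" => "Empty" }]
comments = [
  { "id" => 10, "post_id" => 1, "body" => "First!" },
  { "id" => 11, "post_id" => 1, "body" => "Nice post" }
]
users = [{ "id" => 7, "first_name" => "Ann" }]
preferences = [{ "id" => 3, "user_id" => 7, "notify_by_email" => true }]

posts_with_comments = embed(posts, comments, "comments", on: "post_id")
users_with_prefs    = embed(users, preferences, "preferences", on: "user_id", as: :object)

puts posts_with_comments.first["comments"].length
puts users_with_prefs.first["preferences"]["notify_by_email"]
```

In this sketch a post without comments simply gets an empty array, while a user's preferences become a single nested hash.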
Commands
Usage: mongify command database_config [database_translation.rb]
Commands:
"check" or "ck" >> Checks connection for sql and no_sql databases [configuration_file]
"process" or "pr" >> Takes a translation and processes it to mongodb [configuration_file, translation_file]
"sync" or "sy" >> Takes a translation and processes it to mongodb, only syncing (insert/update) new or updated records based on the updated_at column [configuration_file, translation_file]
"translation" or "tr" >> Outputs a translation file from a sql connection [configuration_file]
Examples:
mongify translation database.config
mongify tr database.config
mongify check database.config
mongify process database.config database_translation.rb
mongify sync database.config database_translation.rb
Common options:
-h, --help Show this message
-v, --version Show version
Translation Layout and Options
When dealing with a translation, there are a few options you can set.
Table
Structure
Structure for defining a table is as follows:
table "table_name", {options} do
# columns go here...
end
Options
Table options are as follows:
table "table_name" # Does a straight copy of the table
table "table_name", :embed_in => 'users' # Embeds table_name into users, assuming a user_id is present in table_name.
# This will also assume you want the table embedded as an array.
table "table_name", # Embeds table_name into users, linking it via an owner_id
:embed_in => 'users', # This will also assume you want the table embedded as an array.
:on => 'owner_id'
table "table_name", # Embeds table_name into users as a one to one relationship
:embed_in => 'users', # This also assumes you have a user_id present in table_name
:on => 'owner_id', # You can also specify both :on and :as options when embedding
:as => 'object' # NOTE: If you rename the owner_id column, make sure you
# update the :on to the new column name
table "table_name", :rename_to => 'my_table' # This will allow you to rename the table as it's being processed
# Just remember that columns that use :references need to
# reference the new name.
table "table_name", :ignore => true # This will ignore the whole table (like it doesn't exist)
# This option is good for tables like: schema_migrations
table "table_name", # This allows you to specify the table as being polymorphic
:polymorphic => 'notable', # and provide the name of the polymorphic relationship.
:embed_in => true # Setting embed_in => true allows the relationship to be
# embedded directly into the parent class.
# If you do not embed it, the polymorphic table will be copied in to
# MongoDB and the notable_id will be updated to the new BSON::ObjectID
table "table_name" do # A table can take a before_save block that will be called just
before_save do |row| # before the row is saved to the no sql database.
row.admin = row.delete('permission').to_i > 50 # This gives you the ability to do very powerful things like:
end # Moving records around, renaming records, changing values in row based on
end # some values! Check out Mongify::Database::DataRow to learn more
table "users" do # Here is how to set new ID using the old id
before_save do |row|
row._id = row.delete('pre_mongified_id')
end
end
table "preferences", :embed_in => "users" do # As of version 0.2, embedded tables with a before_save will take an
before_save do |pref_row, user_row, unset_user_row| # extra argument which is the parent row of the embedded table.
user_row.email_me = pref_row.delete('email_me') # This gives you the ability to move things from an embedded table row
# to the parent row.
if pref_row['one_name']
unset_user_row['last_name'] = true # This will delete/unset the last name for the parent row
end
end
end
More documentation can be found at Mongify::Database::Table
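The first before_save example above boils down to ordinary Ruby applied to a row's values. As a rough sketch, a plain Hash can stand in for the row; the real object is a Mongify::Database::DataRow, whose interface may differ from a Hash, and the promote_admin helper is hypothetical.

```ruby
# Plain-Ruby sketch: a Hash stands in for Mongify's DataRow.
# promote_admin is a hypothetical helper, not part of Mongify.
def promote_admin(row)
  # delete returns the removed value (or nil), so the 'permission'
  # key disappears and its value decides the new 'admin' flag.
  row["admin"] = row.delete("permission").to_i > 50
  row
end

admin  = promote_admin({ "name" => "Ann", "permission" => "75" })
member = promote_admin({ "name" => "Bob", "permission" => "10" })
nobody = promote_admin({ "name" => "Cy" }) # nil.to_i is 0, so not an admin

p admin
```

Because delete both removes the key and returns its value, the old column never reaches the saved row in this sketch.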
Columns
Structure
Structure for defining a column is as follows:
column "name", :type, {options}
Columns with no type given will be set to :string
Notes
As of version 0.2: Leaving a column out when defining a table will result in the column being ignored.
Types
Before we cover the options, you need to know what types of columns are supported:
:key # Columns that are primary keys need to be marked as :key type. You can provide an :as if your :key is not an integer column
:integer # Will be converted to an integer
:float # Will be converted to a float
:decimal # Will be converted to a string (due to MongoDB Ruby Drivers not supporting BigDecimal, read details in Mongify::Database::Column under Decimal Storage)
:string # Will be converted to a string
:text # Will be converted to a string
:datetime # Will be converted to a Time format (DateTime is not currently supported in the Mongo Ruby driver)
:date # Will be converted to a Time format (Date is not currently supported in the Mongo Ruby driver)
:timestamps # Will be converted to a Time format
:time # Will be converted to a Time format (the date portion of the Time object will be 2000-01-01)
:binary # Will be converted to a string
:boolean # Will be converted to a true or false value
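As a small illustration of the :date and :time notes above, the following plain-Ruby sketch shows what turning a Date or a time of day into a Time object can look like. The helper names are hypothetical and UTC is an assumption; Mongify's own conversion and time zone handling may differ.

```ruby
require "date"
require "time"

# Hypothetical helpers, not Mongify's implementation.

# A Date becomes a Time at midnight (UTC assumed here).
def date_to_time(date)
  Time.utc(date.year, date.month, date.day)
end

# A time of day is pinned to the placeholder date 2000-01-01.
def time_of_day(hour, min, sec = 0)
  Time.utc(2000, 1, 1, hour, min, sec)
end

birthday = date_to_time(Date.new(2011, 5, 17))
opens_at = time_of_day(9, 30)

puts birthday.iso8601
puts opens_at.iso8601
```

When reading a :time value back, only the hour, minute, and second are meaningful; the date part is just a placeholder.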
Options
column "post_id", :integer, :references => :posts # Referenced columns need to be marked as such so that they are updated
# with the new BSON::ObjectID.
# NOTE: if you rename the table 'posts', you should set the :references to the new name
column "name", :string, :ignore => true # Ignoring a column will make the column NOT copy over to the new database
column "surname",
:string,
:rename_to => 'last_name' # Rename_to allows you to rename the column
For decimal columns you can specify a few options:
column "total", # This is a default conversion setting.
:decimal,
:as => 'string'
column "total", # You can specify to convert your decimal to integer
:decimal, # specifying scale will define how many decimal places to keep
:as => 'integer', # Example: :scale => 2 will convert 123.4567 to 12346 before saving
:scale => 2
More documentation can be found at Mongify::Database::Column
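The :scale arithmetic in the decimal options above can be checked with a few lines of plain Ruby. This is a sketch of the calculation only, with hypothetical helper names. It uses Rational for exact arithmetic and assumes rounding to the nearest integer, which matches the 123.4567 example; it is not Mongify's implementation.

```ruby
# Hypothetical helpers illustrating :as => 'integer' with :scale.

# Shift the decimal point right by `scale` places, then round.
# `value` is expected to be a decimal String such as "123.4567".
def decimal_to_scaled_integer(value, scale)
  (Rational(value) * 10**scale).round
end

# Reverse the shift when reading a non-negative value back in
# application code.
def scaled_integer_to_decimal_string(int, scale)
  whole, fraction = int.divmod(10**scale)
  "#{whole}.#{fraction.to_s.rjust(scale, '0')}"
end

stored = decimal_to_scaled_integer("123.4567", 2)
puts stored
puts scaled_integer_to_decimal_string(stored, 2)
```

Storing the scaled integer keeps the value numeric in MongoDB, at the cost of every reader needing to know the scale.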
Notes
Mongify has been tested on Ruby 1.8.7-p330, 1.9.2-p136, 1.9.3-p327, 2.0.0-p353, 2.1.1
If you have any issues, please feel free to report them here: issue tracker
Sync Note
If you want to continually sync your SQL database with MongoDB using Mongify, you MUST run the sync feature from the start. Using the process command will not permit future syncing. The sync feature leaves the 'pre_mongify_id' in your documents and creates its own collection called 'mongify_sync_helper' to record the dates and times of your last sync.
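As a rough mental model of the sync behaviour described above, the sketch below selects the rows whose updated_at is newer than the last recorded sync time. It is plain Ruby with hypothetical names and sample data, and it is not how Mongify itself is implemented.

```ruby
# Hypothetical sketch of "only sync new or updated records".
def rows_to_sync(rows, last_sync_at)
  # Before the first sync there is no timestamp, so every row qualifies.
  return rows if last_sync_at.nil?

  rows.select { |row| row["updated_at"] > last_sync_at }
end

rows = [
  { "id" => 1, "updated_at" => Time.utc(2013, 5, 1, 12, 0, 0) },
  { "id" => 2, "updated_at" => Time.utc(2013, 5, 3, 8, 30, 0) }
]
last_sync_at = Time.utc(2013, 5, 2)

changed_ids = rows_to_sync(rows, last_sync_at).map { |row| row["id"] }
p changed_ids
```

This is also why a table needs a reliable updated_at column for syncing to pick up changes.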
Additional Tools
As of May 2013, we’ve released an add-on tool to Mongify called Mongify-Mongoid which lets you use the translation.rb file to generate Mongoid models. It generates the model fields, relations (most of the way) and timestamps. This should save you some time from having to re-type all the models by hand. Check out the gem at: rubygems.org/gems/mongify-mongoid
TODO
- Allow deeper embedding
- Test in different databases
- Give an ability to mark source DB rows as imported (allowing Mongify to run as an ongoing converter)
Known Issues
- Can’t do anything to an embedded table
Development
Requirements
You just need bundler >= 1.0.8
gem install bundler
bundle install
Copy spec/support/database.example to spec/support/database.yml and fill in the required info
rake test:setup:mysql
rake test:setup:postgresql
rake test
Special Thanks
I just want to thank my wife (Alessia), who not only puts up with me working in my free time but sometimes helps out by listening to my craziness or helping me name classes or functions.
I would also like to thank Mon_Ouie on the Ruby IRC channel for helping me figure out how to set up the internal configuration file reading.
Another thanks goes out to eimermusic for his feedback on the gem and for the few bugs he helped flush out. And to Pranas Kiziela, who committed some bug and typo fixes that make Mongify even better!
About
This gem was made by Andrew Kalek from Anlek Consulting
Reach me at:
- Twitter: @anlek
- Email: andrew.kalek@anlekcom
The sync functionality was made by Hossam Hammady from Qatar Computing Research Institute
Reach me at:
- Twitter: @hammady
- Email: hhammady@qf[dot]orgqa
License
Copyright © 2011 - 2013 Andrew Kalek, Anlek Consulting
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.