ROD – Ruby Object Database

WARNING

The 0.7.x branch is a development branch – incompatibilities between library releases might be introduced (both in API and data schema). You are advised to use the latest release of 0.6.x branch.

DESCRIPTION

ROD (Ruby Object Database) is library which aims at providing fast access for data, which rarely changes.

FEATURES:

  • object-oriented Ruby interface

  • Ruby-to-C on-the-fly translation based on mmap and RubyInline

  • optimized for (reading) speed

  • weak reference collections for easy memory reclaims

  • Berkeley DB hash index for the best index performance

  • immediate updates of hash indices

  • compatibility check of library version

  • compatibility check of data model

  • auto-generation of model (based on the database meta-data)

  • automatic model migrations (limited to addition/removal of properties and indexes)

  • full update of the database (removal of objects not available yet)

  • databases interlinking (via direct links or inverted indices)

  • data portability between big and little-endian systems

  • works on Linux and BSD

PROBLEMS

  • tested mostly on 64-bit systems

  • doesn’t work on Windows

  • concurrent writes not supported

  • batch-only data input/update

  • data removal not supported

SYNOPSIS:

ROD is designed for storing and accessing data which rarely changes. It is an opposite of RDBMS as the data is not normalized, while “joins” are much faster. It is an opposite of in-memory databases, since it is designed to cover out of core data sets (10 GB and more).

The primary reason for designing it was to create storage facility for natural language dictionaries and corpora. The data in a fully fledged dictionary is interconnected in many ways, thus the relational model (joins) introduces unacceptable performance hit. The size of corpora forces them to be kept on disks. The in-memory data bases are unacceptable for larg corpora. They would also require the data of a dictionary to be kept mostly in the operational memory, which is not needed (in most cases only a fraction of the data is used at the same time). That’s why a storage facility which minimizes the number of disk reads was designed. The Ruby interface facilitates it’s usage.

REQUIREMENTS:

  • Ruby 1.9

  • RubyInline

  • english

  • ActiveModel

  • bsearch

  • Berkeley DB

INSTALL

  1. Install Berkeley DB

http://www.oracle.com/technetwork/database/berkeleydb/downloads/index.html
  1. Install rod gem from rubygems:

gem install rod

TROUBLESHOOTING

If you get the following error:

error: db.h: No such file or directory

then you don’t have Berkeley DB installed or its header fiels are not available on the default path. Make sure that the library is installed and the headers are available.

If you get the following error:

.ruby: symbol lookup error: /home/vagrant/.ruby_inline/... undefined symbol: db_env_create

then you have to provide system-specific linker information. By default the library is linked with ‘-ldb’ linker flag. To change it you have to set up ROD_BDB_LINK_FLAGS environment variable, e.g.

ROD_BDB_LINK_FLAGS='-ldb-4.8'
export ROD_BDB_LINK_FLAGS

This configuration option will select the libdb-4.8.so library.

BASIC USAGE:

class MyDatabase < Rod::Database
end

class Model < Rod::Model
  database_class MyDatabase
end

class User < Model
  field :name, :string
  field :surname, :string, :index => :hash
  field :age, :integer
  has_one :account
  has_many :files
end

class Account < Model
  field :email, :string
  field :login, :string, :index => :hash
  field :password, :string
end

class File < Model
  field :title, :string, :index => :hash
  field :data, :string
end

MyDatabase.instance.create_database("data")
user = User.new(:name => 'Fred',
                :surname => 'Smith',
                :age => 22)
 = Account.new(:email => "[email protected]",
                      :login => "fred",
                      :password => "password")
file1 = File.new(:title => "Lady Gaga video")
file2.data = "0012220001..."
file2 = File.new(:title => "Pink Floyd video")
file2.data = "0012220001..."

user. = 
user.files << file1
user.files << file2

user.store
.store
file1.store
file2.store
MyDatabase.instance.close_database

MyDatabase.instance.open_database("data")
User.each do |user|
  puts "Name: #{user.name} surname: #{user.surname}"
  puts "login: #{user..} e-mail: #{user..email}"
  user.files.each do |file|
    puts "File: #{file.title}"
  end
end

User[0]                           # gives first user
User.find_by_surname("Smith")     # gives Fred
User.find_all_by_surname("Smith") # gives [Fred]
File[0].user                      # won't work - the data is not normalized

DEVELOPMENT

You’ll need bundler installed:

gem install bundler

Then you have to fetch all the dependencies:

bundle

To run all the test simple type rake:

rake

This might take several minutes, since for each scenario a whole set of C files have to compiled and linked.

During development you should watch your ~/.ruby_inline directory. If there are thousands of files there, you should fill a bug, since most of them should be automatically destroyed.

If you want to implement a feature/fix a bug, first be sure that it is documented on the bug tracker: github.com/apohllo/rod/issues. Then be sure to write tests covering the added feature/fixed bug. Include the tests in Rakefile, run the tests and if everything is fine send me a pull request.

BENCHMARKS:

There is a separate project available under github.com/apohllo/rod-benchmark that shows how ROD behave compared to the other storage engines. It should be noted that this benchmarks are biased towards the design goals of ROD. So don’t expect general purpose database tests.

LICENSE:

(The MIT/X11 License)

Copyright © 2008-2012 Aleksander Pohl

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

FEEDBACK