Class: Lite3::DBM

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/internal_lite3/dbm.rb

Overview

Lite3::DBM encapsulates a single table in a single SQLite3 database file and lets you access it as easily as a Hash. Multiple instances may be opened on different tables in the same database.

Note that instances do not explicitly own their database connection; instead, they are managed internally and shared across DBM instances.

Constructor Details

#initialize(filename, tablename, serializer = :yaml) ⇒ DBM

Create a new Lite3::DBM object that opens the database file filename and performs subsequent operations on the table tablename. Both the database file and the table will be created if they do not yet exist. The table name must be a valid identifier (i.e. it matches /^[a-zA-Z_]\w*$/).
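
The identifier rule can be tried out on its own. The sketch below applies the documented pattern to a few candidate names; valid_table_name is a hypothetical helper and strict is an assumed alternative, neither is part of Lite3. Because ^ and $ are line anchors in Ruby, the last two calls show a multi-line string passing the documented pattern while a \A...\z variant rejects it.

```ruby
# The pattern quoted in the documentation above.
documented = /^[a-zA-Z_]\w*$/
# A stricter variant anchored to the whole string (an assumption, not Lite3 code).
strict = /\A[a-zA-Z_]\w*\z/

# Hypothetical helper: true when name matches the given pattern.
valid_table_name = ->(name, pattern) { !!(name =~ pattern) }

valid_table_name.call('users', documented)       # accepted
valid_table_name.call('_cache2', documented)     # accepted
valid_table_name.call('2fast', documented)       # rejected: leading digit
valid_table_name.call('my-table', documented)    # rejected: hyphen
valid_table_name.call("ok\nnot ok", documented)  # accepted: ^ and $ match per line
valid_table_name.call("ok\nnot ok", strict)      # rejected: \A and \z span the string
```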

The optional third argument serializer is used to choose the serialization method for converting Ruby values into storable strings. There are three options:

  • :yaml uses the Psych module.
  • :marshal uses the Marshal module.
  • :string simply uses the default to_s method, just like the stock DBM.

Each has its pros and cons. The default is :yaml because it is the most portable. :marshal tends to be faster but is incompatible across minor Ruby versions.
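
As a rough illustration of the trade-off, the following sketch round-trips one value through each scheme using only the Ruby standard library. It does not call Lite3, and the exact Psych and Marshal calls the library makes internally are an assumption here.

```ruby
require 'yaml'

# A sample value built from core types only, so YAML.safe_load accepts it.
value = { 'name' => 'widget', 'tags' => %w[a b], 'count' => 3 }

# :yaml -- portable text; core types survive the round trip.
yaml_text = YAML.dump(value)
from_yaml = YAML.safe_load(yaml_text)

# :marshal -- compact binary tied to Ruby's Marshal format.
marshal_bytes = Marshal.dump(value)
from_marshal  = Marshal.load(marshal_bytes)

# :string -- to_s is one-way; what comes back is just a String.
plain_text = value.to_s

puts from_yaml == value
puts from_marshal == value
puts plain_text.class
```

With :string, the caller is responsible for turning the stored text back into a structured value.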

Your serializer choice is registered in a metadata table when tablename is created in the SQLite3 file. Afterward, attempting to open the table with a different serializer is an error and raises a Lite3::Error exception.
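
The registration rule can be pictured with a plain Hash standing in for the metadata table. Everything in this sketch is hypothetical: metadata and register_scheme are illustrative names, and ArgumentError stands in for Lite3::Error.

```ruby
# Hypothetical in-memory stand-in for the metadata table: table name => serializer.
metadata = {}

# Record the serializer on first use; reject a different one afterwards.
register_scheme = lambda do |table, serializer|
  recorded = (metadata[table] ||= serializer)
  unless recorded == serializer
    raise ArgumentError,
          "table '#{table}' uses :#{recorded}, not :#{serializer}"
  end
  recorded
end

register_scheme.call('settings', :yaml)   # first open records :yaml
register_scheme.call('settings', :yaml)   # same serializer again: accepted

mismatch = nil
begin
  register_scheme.call('settings', :marshal)
rescue ArgumentError => e
  mismatch = e.message                    # different serializer: rejected
end
```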

Caveats:

  1. Both YAML and Marshal serialization have the usual security issues as described in the documentation for Marshal and Psych. If you are going to let an untrusted entity modify the database, you should not use these methods and instead stick to string conversion.

  2. DBM does not check your Marshal version; a mismatch will fail dramatically at exactly the wrong time.

  3. filename is normalized using File.realpath and this path is used to look up an existing database handle if one exists. Using hard links or other trickery to defeat this mechanism and open a second handle to the same database is probably still harmless but is not something this API guarantees will work correctly.
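
The third caveat can be seen with a toy pool keyed on File.realpath. This is a sketch only: pool and handle_for are hypothetical stand-ins for the internal handle lookup, and it assumes a filesystem that supports both symbolic and hard links.

```ruby
require 'tmpdir'

# Hypothetical stand-in for the internal pool: one object per real path.
pool = {}
handle_for = ->(filename) { pool[File.realpath(filename)] ||= Object.new }

same_via_symlink  = nil
same_via_hardlink = nil

Dir.mktmpdir do |dir|
  real = File.join(dir, 'data.sqlite3')
  soft = File.join(dir, 'soft.sqlite3')
  hard = File.join(dir, 'hard.sqlite3')
  File.write(real, '')
  File.symlink(real, soft)  # File.realpath resolves this back to real
  File.link(real, hard)     # a second, equally real path to the same file

  same_via_symlink  = handle_for.call(real).equal?(handle_for.call(soft))
  same_via_hardlink = handle_for.call(real).equal?(handle_for.call(hard))
end

puts same_via_symlink    # the symlink shares the pooled handle
puts same_via_hardlink   # the hard link ends up with its own handle
```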



# File 'lib/internal_lite3/dbm.rb', line 67

def initialize(filename, tablename, serializer = :yaml)
  @filename = filename
  @tablename = tablename
  @valenc, @valdec = value_encoders(serializer)
  @handle = HandlePool.get(filename)

  @handle.addref(self)

  check("Malformed table name '#{tablename}'; must be a valid identifer") {
    tablename =~ /^[a-zA-Z_]\w*$/
  }

  transaction {
    register_serialization_scheme(serializer)
    @handle.create_key_value_table( actual_tbl() )
  }
rescue Error => e
  self.close if @handle
  raise e
end

Class Method Details

.open(filename, tablename, serializer = :yaml, &block) ⇒ Object

Identical to new except that if a block is provided, it is evaluated with a new Lite3::DBM which is then closed afterward. This is analogous to File.open. See initialize for an explanation of the arguments and caveats.
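
The block form behaves like the following sketch. SketchStore is a hypothetical in-memory stand-in that models only the open/close protocol, so it runs without a database; it is not the Lite3 implementation.

```ruby
# Hypothetical stand-in; only the open/close protocol is modelled.
class SketchStore
  def self.open(*args)
    instance = new(*args)
    return instance unless block_given?

    begin
      yield instance
    ensure
      instance.close   # runs even if the block raises
    end
  end

  def initialize(*_args)
    @closed = false
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

# Block form: the block's value is returned and the store is closed afterward.
seen = nil
result = SketchStore.open('file.sqlite3', 'tbl') { |db| seen = db; :done }

# The store is also closed when the block raises.
failed = nil
begin
  SketchStore.open('file.sqlite3', 'tbl') { |db| failed = db; raise 'boom' }
rescue RuntimeError
  # the exception still propagates to the caller
end

# Without a block the caller owns the instance and must close it.
manual = SketchStore.open('file.sqlite3', 'tbl')
open_before_close = !manual.closed?
manual.close
```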



# File 'lib/internal_lite3/dbm.rb', line 94

def self.open(filename, tablename, serializer = :yaml, &block)
  instance = self.new(filename, tablename, serializer)
  return instance unless block

  begin
    return block.call(instance)
  ensure
    instance.close
  end
end

Instance Method Details

#[](key) ⇒ Object

Retrieve the value associated with key from the database or nil if it is not present.



# File 'lib/internal_lite3/dbm.rb', line 263

def [](key)
  return fetch(key, nil)
end

#[]=(key, value) ⇒ Object Also known as: store

Store value at key in the database.

key must be a String or a Symbol; Symbols are transparently converted to Strings.

value must be convertible to a string by whichever serialization method you have chosen.
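
The key rule amounts to a small normalization step. In this sketch normalize_key is a hypothetical stand-in for the library's key check, a plain Hash plays the table, and TypeError is an assumed error class.

```ruby
# Hypothetical stand-in for the key check; the real error class may differ.
normalize_key = lambda do |key|
  case key
  when String then key
  when Symbol then key.to_s
  else raise TypeError, "key must be a String or Symbol, got #{key.class}"
  end
end

table = {}
table[normalize_key.call(:color)]  = 'red'
table[normalize_key.call('color')] = 'blue'   # same entry as :color

rejected = false
begin
  normalize_key.call(42)
rescue TypeError
  rejected = true                             # other key types are refused
end
```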



# File 'lib/internal_lite3/dbm.rb', line 251

def []=(key, value)
  key = check_key(key)
  valstr = @valenc.call(value)

  @handle.upsert(actual_tbl(), key, valstr)

  return value
end

#clear ⇒ Object

Delete all entries from the table.



# File 'lib/internal_lite3/dbm.rb', line 354

def clear
  @handle.clear_table(actual_tbl())
end

#close ⇒ Object

Disassociate self from the underlying database. If this is the last DBM using it, the handle will (probably) also be closed.

Subsequent attempts to use self will fail with an error; the only exception to this is the method closed? which will return true.
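
One way to picture the shared handle is as a reference-counted object. The sketch below is an assumption about the general shape, not the library's code: SketchRefHandle is hypothetical, and only the addref and delref names are taken from the source listings on this page.

```ruby
# Hypothetical shared handle that closes itself when its last user detaches.
class SketchRefHandle
  def initialize
    @owners = []
    @open = true
  end

  def addref(owner)
    @owners << owner unless @owners.include?(owner)
  end

  def delref(owner)
    @owners.delete(owner)
    @open = false if @owners.empty?
  end

  def open?
    @open
  end
end

handle = SketchRefHandle.new
handle.addref(:settings_dbm)
handle.addref(:cache_dbm)

handle.delref(:settings_dbm)
open_with_one_user = handle.open?   # another user remains

handle.delref(:cache_dbm)
open_with_no_users = handle.open?   # last user gone
```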



# File 'lib/internal_lite3/dbm.rb', line 186

def close
  @handle.delref(self)
  @handle = ClosedHandle.new(@filename, @tablename)
end

#closed? ⇒ Boolean

Test if this object has been closed. This is safe to call on a closed DBM.

Returns:

  • (Boolean)


# File 'lib/internal_lite3/dbm.rb', line 193

def closed?
  return @handle.is_a? ClosedHandle
end

#delete(key) ⇒ Object

Remove key and its associated value from self. If key is not present, does nothing.



# File 'lib/internal_lite3/dbm.rb', line 440

def delete(key)
  @handle.delete(actual_tbl(), key)
end

#delete_if {|key, value| ... } ⇒ Object Also known as: reject!

Evaluate the block on each key-value pair in self and delete each entry for which the block returns true.

Yields:

  • (key, value)

    The block to evaluate



# File 'lib/internal_lite3/dbm.rb', line 448

def delete_if(&block)
  transaction {
    self.each{ |k, v| block.call(k,v) and delete(k) }
  }
end

#each {|key, value| ... } ⇒ Object Also known as: each_pair

Calls the given block with each key-value pair in the usual order, then returns self. The entire call takes place in its own transaction.

It is safe to modify self inside the block.

If no block is given, returns an Enumerator instead. The Enumerator does not start a transaction but individual accesses of it (e.g. calling next) each take place in their own transaction.

Yields:

  • (key, value)

    The block to evaluate
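
The block and block-less forms follow the usual Ruby each protocol. The sketch below shows that protocol on SketchTable, a hypothetical in-memory class; it models neither transactions nor the database.

```ruby
# Hypothetical in-memory table following the same each protocol.
class SketchTable
  include Enumerable

  def initialize(pairs)
    @pairs = pairs
  end

  def each(&block)
    return to_enum(:each) unless block

    @pairs.each { |k, v| block.call(k, v) }
    self
  end
end

table = SketchTable.new({ 'a' => 1, 'b' => 2 })

# With a block: every pair is visited and the receiver is returned.
seen = []
returned = table.each { |k, v| seen << [k, v] }

# Without a block: an Enumerator that can be stepped manually.
enum = table.each
first_pair = enum.next

# Including Enumerable builds map, select and friends on top of each.
labels = table.map { |k, v| "#{k}=#{v}" }
```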



# File 'lib/internal_lite3/dbm.rb', line 392

def each(&block)
  return self.to_enum(:nt_each) unless block
  transaction { nt_each(&block) }
  return self
end

#each_key {|key| ... } ⇒ Object

Calls the given block with each key; returns self. Exactly like each except for the block argument.

Yields:

  • (key)

    The block to evaluate



# File 'lib/internal_lite3/dbm.rb', line 415

def each_key(&block)
  return Enumerator.new{|y| nt_each{ |k,v| y << k  } } unless block
  return each{ |k,v| block.call(k) }
end

#each_value {|value| ... } ⇒ Object

Calls the given block with each value; returns self. Exactly like each except for the block argument.

Yields:

  • (value)

    The block to evaluate



# File 'lib/internal_lite3/dbm.rb', line 424

def each_value(&block)
  return Enumerator.new{|y| nt_each{ |k,v| y << v  } } unless block
  return each{ |k,v| block.call(v) }
end

#empty? ⇒ Boolean

Test if self is empty.

Returns:

  • (Boolean)


# File 'lib/internal_lite3/dbm.rb', line 462

def empty?
  return size == 0
end

#fast_each {|key, value| ... } ⇒ Object

Behaves like each with a block, calling it for each key/value pair, but (probably) executes faster.

The downside is that there is no guarantee of reentrance or safety. The block MUST NOT access the database in any way. In addition, no guarantee is made about element order.

(You might be able to infer some ways to safely bend the rules by seeing what the underlying database libraries allow, but your code won't be future-proof if you do that.)

Yields:

  • (key, value)

    The block to evaluate



# File 'lib/internal_lite3/dbm.rb', line 371

def fast_each(&block)
  transaction {
    @handle.tbl_each_fast( actual_tbl() ) { |row|
      block.call(row[:key], @valdec.call(row[:value]))
    }
  }
end

#fetch(key, *args) {|key| ... } ⇒ Object

Retrieve the value associated with key.

key must be a String or a Symbol; Symbols are transparently converted to Strings.

If key is not present and a block is given, the block is evaluated with the key as its argument and its result is returned. If no block was given but one extra argument was, that value is returned instead. Finally, if neither was given, an IndexError is raised.

All database accesses occur within a transaction, so it is safe to consider fetch atomic. This includes evaluating a block argument.

It is an error if fetch is called with more than two arguments.

Yields:

  • (key)

    The fallback block.

Raises:

  • (IndexError)
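
The fallback order can be sketched against a plain Hash. sketch_fetch is a hypothetical helper that mirrors the documented precedence (stored value, then block, then default argument, then IndexError); it models no transactions, and ArgumentError stands in for whatever error the library raises on extra arguments.

```ruby
# Hypothetical helper mirroring the documented lookup order on a plain Hash.
def sketch_fetch(table, key, *args)
  raise ArgumentError, 'expected 1 or 2 arguments' if args.size > 1

  key = key.to_s                          # Symbols become Strings
  return table[key] if table.key?(key)    # 1. the stored value
  return yield(key) if block_given?       # 2. the fallback block
  return args[0] unless args.empty?       # 3. the default argument

  raise IndexError, "key '#{key}' not found."
end

table = { 'color' => 'red' }

found     = sketch_fetch(table, :color)
via_block = sketch_fetch(table, 'size') { |k| "no #{k}" }
via_arg   = sketch_fetch(table, 'size', 'medium')
both      = sketch_fetch(table, 'size', 'medium') { |k| "no #{k}" }

missing = begin
  sketch_fetch(table, 'size')
rescue IndexError => e
  e.message
end
```

When both a block and a default are supplied, the block wins, as the both case shows.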


# File 'lib/internal_lite3/dbm.rb', line 285

def fetch(key, *args, &default_block)

  # Ensure there are no extra arguments
  nargs = args.size + 1
  check("Too many arguments for 'fetch'; expected 1 or 2; got #{nargs}") {
    nargs <= 2
  }

  # Retrieve the value
  key = check_key(key)

  # We do the lookup-and-maybe-replace in a transaction so that
  # it's atomic.
  transaction do
    # Return the value if found.  (nil will always mean the entry
    # isn't present because values are encoded in strings.)
    value = @handle.lookup(actual_tbl(), key)
    return @valdec.call(value) if value

    # Not found.  If a block was given, evaluate it and return its
    # result.
    return default_block.call(key) if default_block

    # Next, see if we have a default value we can return
    return args[0] if args.size > 0
  end

  # And if all else fails, raise an IndexError.
  raise IndexError.new("key '#{key}' not found.")
end

#has_key?(key) ⇒ Boolean Also known as: include?, member?, key?

Return true if the table contains key; otherwise, return false.

Returns:

  • (Boolean)


# File 'lib/internal_lite3/dbm.rb', line 344

def has_key?(key)
  return false unless key.class == String || key.class == Symbol
  fetch( key ) { return false }
  return true
end

#has_value?(val) ⇒ Boolean Also known as: value?

Test if val is one of the values in this table.

Potentially very slow, especially on large tables.

Returns:

  • (Boolean)


# File 'lib/internal_lite3/dbm.rb', line 505

def has_value?(val)
  fast_each{|k,v| return true if v == val }
  return false
end

#invert ⇒ Object

Return a Hash whose keys are the table's values and whose values are the table's keys.

WARNING: it is possible for tables to be significantly larger than available RAM; in that case, this will likely crash your program.



# File 'lib/internal_lite3/dbm.rb', line 517

def invert
  result = {}
  fast_each{|k,v| result[v] = k}
  return result
end

#keys ⇒ Object

Return an Array of all of the keys in the table.

WARNING: since this list is being read from disk, it is possible that the result could exceed available memory.



# File 'lib/internal_lite3/dbm.rb', line 326

def keys
  keys = []
  fast_each { |k, v| keys.push k }
  return keys
end

#shift ⇒ Object

Remove the first key/value pair from self and return it. "First" is defined by self's row order, which is the order of insertion as determined by SQLite3.



# File 'lib/internal_lite3/dbm.rb', line 526

def shift
  transaction {
    return nil if empty?

    key, value = self.each.first
    delete(key)

    return [key, value]
  }
end

#size ⇒ Object Also known as: length

Return the number of entries (key-value pairs) in self.



# File 'lib/internal_lite3/dbm.rb', line 456

def size
  return @handle.get_size(actual_tbl())
end

#to_a ⇒ Object

Returns an Array of 2-element Array objects each containing a key-value pair from self.

WARNING: it is possible for tables to be significantly larger than available RAM; in that case, this will likely crash your program.



# File 'lib/internal_lite3/dbm.rb', line 490

def to_a
  result = []
  fast_each { |k,v| result.push [k,v] }
  return result
end

#to_hash ⇒ Object

Copies the table into a Hash and returns it.

WARNING: it is possible for tables to be significantly larger than available RAM; in that case, this will likely crash your program.



# File 'lib/internal_lite3/dbm.rb', line 477

def to_hash
  result = {}
  fast_each{|k,v| result[k] = v}
  return result
end

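#to_s ⇒ Object Also known as: inspect

Return a String describing self: its class, object id, file name, table name and whether it is open or closed.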


# File 'lib/internal_lite3/dbm.rb', line 171

def to_s
  openstr = closed? ? 'CLOSED' : 'OPEN'
  return "<#{self.class}:0x#{object_id.to_s(16)} file='#{@filename}'" +
         " tablename='#{@tablename}' #{openstr}>"
end

#transaction {|db| ... } ⇒ obj

Begins a transaction, evaluates the given block and then ends the transaction. If no error occurred (i.e. no exception was raised), the transaction is committed; otherwise, it is rolled back. Returns the block's result.

It is safe to call DBM#transaction within another DBM#transaction block's call chain because DBM will not start a new transaction on a database handle that already has one in progress. (It may be possible to trick DBM into trying via fibers or other flow control trickery; don't do that.)

Note that it's probably not a good idea to assume too much about the precise semantics; I can't guarantee that the underlying library(es) won't change or be replaced outright.

That being said, at present, this is simply a wrapper around Sequel::Database.transaction with the default options and so is subject to the quirks therein. In version 1.0.0, transactions were always executed in :deferred mode via the sqlite3 gem.

Yields:

  • (db)

    The block takes a reference to the receiver as an argument.

Returns:

  • (obj)

    Returns the block's result.
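
The nesting rule can be sketched without a database. SketchTxHandle is a hypothetical stand-in whose log records what a real handle would send to the database; it is not how Sequel or Lite3 implement transactions.

```ruby
# Hypothetical handle: only the outermost call begins, commits or rolls back.
class SketchTxHandle
  attr_reader :log

  def initialize
    @active = false
    @log = []
  end

  def transaction_active?
    @active
  end

  def transaction
    return yield if @active   # already inside one: just run the block

    @active = true
    @log << :begin
    begin
      result = yield
      @log << :commit
      result
    rescue StandardError
      @log << :rollback
      raise
    ensure
      @active = false
    end
  end
end

handle = SketchTxHandle.new

# A nested call joins the outer transaction instead of starting another.
value = handle.transaction { handle.transaction { 42 } }

# An exception rolls the transaction back and then propagates.
begin
  handle.transaction { raise 'boom' }
rescue RuntimeError
  # handled by the caller
end
```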



# File 'lib/internal_lite3/dbm.rb', line 229

def transaction(&block)
  return @handle.transaction { block.call(self) }
end

#transaction_active? ⇒ Boolean

Test if there is currently a transaction in progress.

Returns:

  • (Boolean)


# File 'lib/internal_lite3/dbm.rb', line 234

def transaction_active?
  return @handle.transaction_active?
end

#update(hash) ⇒ Object

Updates the database with multiple values from the specified object. Takes any object which implements the each_pair method, including Hash and DBM objects.



# File 'lib/internal_lite3/dbm.rb', line 432

def update(hash)
  transaction {
    hash.each{|k, v| self[k] = v }
  }
end

#values ⇒ Object

Return an array of all values in the table.

WARNING: since this list is being read from disk, it is possible that the result could exceed available memory.



# File 'lib/internal_lite3/dbm.rb', line 336

def values
  values = []
  fast_each { |k, v| values.push v }
  return values
end

#values_at(*keys) ⇒ Object

Return a new Array containing the values corresponding to the given keys.



# File 'lib/internal_lite3/dbm.rb', line 318

def values_at(*keys)
  return keys.map{|k| self[k]}
end