Class: Bio::REBASE

Inherits:
Object
Defined in:
lib/bio/db/rebase.rb

Overview

bio/db/rebase.rb - Interface for EMBOSS formatted REBASE files

Author

Trevor Wennblom <[email protected]>

Copyright

Copyright © 2005-2007 Midwinter Laboratories, LLC (midwinterlabs.com)

License

The Ruby License

Description

Bio::REBASE provides utilities for interacting with REBASE data in EMBOSS format. REBASE is the Restriction Enzyme Database; more information can be found here:

EMBOSS formatted files located at:

These files are the same as the “emboss_?.???” files located at:

To get started with the data, run this command at your shell prompt:

% wget ftp://ftp.neb.com/pub/rebase/emboss*

Usage

require 'bio'
require 'pp'

enz = File.read('emboss_e')
ref = File.read('emboss_r')
sup = File.read('emboss_s')

# When creating a new instance of Bio::REBASE
# the contents of the enzyme file must be passed.
# The references and suppliers file contents
# may also be passed.
rebase = Bio::REBASE.new( enz )
rebase = Bio::REBASE.new( enz, ref )
rebase = Bio::REBASE.new( enz, ref, sup )

# The 'read' class method allows you to read in files
# that are REBASE EMBOSS formatted
rebase = Bio::REBASE.read( 'emboss_e' )
rebase = Bio::REBASE.read( 'emboss_e', 'emboss_r' )
rebase = Bio::REBASE.read( 'emboss_e', 'emboss_r', 'emboss_s' )

# The data loaded may be saved in YAML format
rebase.save_yaml( 'enz.yaml' )
rebase.save_yaml( 'enz.yaml', 'ref.yaml' )
rebase.save_yaml( 'enz.yaml', 'ref.yaml', 'sup.yaml' )

# YAML formatted files can also be read with the
# class method 'load_yaml'
rebase = Bio::REBASE.load_yaml( 'enz.yaml' )
rebase = Bio::REBASE.load_yaml( 'enz.yaml', 'ref.yaml' )
rebase = Bio::REBASE.load_yaml( 'enz.yaml', 'ref.yaml', 'sup.yaml' )

pp rebase.enzymes[0..4]                     # ["AarI", "AasI", "AatI", "AatII", "Acc16I"]
pp rebase.enzyme_name?('aasi')              # true
pp rebase['AarI'].pattern                   # "CACCTGC"
pp rebase['AarI'].blunt?                    # false
pp rebase['AarI'].organism                  # "Arthrobacter aurescens SS2-322"
pp rebase['AarI'].source                    # "A. Janulaitis"
pp rebase['AarI'].primary_strand_cut1       # 11
pp rebase['AarI'].primary_strand_cut2       # 0
pp rebase['AarI'].complementary_strand_cut1 # 15
pp rebase['AarI'].complementary_strand_cut2 # 0
pp rebase['AarI'].suppliers                 # ["F"]
pp rebase['AarI'].supplier_names            # ["Fermentas International Inc."]

pp rebase['AarI'].isoschizomers             # Currently none stored in the references file
pp rebase['AarI'].methylation               # ""

pp rebase['EcoRII'].methylation             # "2(5)"
pp rebase['EcoRII'].suppliers               # ["F", "J", "M", "O", "S"]
pp rebase['EcoRII'].supplier_names  # ["Fermentas International Inc.", "Nippon Gene Co., Ltd.",
                                    # "Roche Applied Science", "Toyobo Biochemicals",
                                    # "Sigma Chemical Corporation"]

# Number of enzymes in the database
pp rebase.size                              # 673
pp rebase.enzymes.size                      # 673

rebase.each do |name, info|
  pp "#{name}:  #{info.methylation}" unless info.methylation.empty?
end

Defined Under Namespace

Classes: DynamicMethod_Hash, EnzymeEntry

Constructor Details

#initialize(enzyme_lines, reference_lines = nil, supplier_lines = nil, yaml = false) ⇒ REBASE

Constructor


Arguments

  • enzyme_lines: (required) contents of EMBOSS formatted enzymes file

  • reference_lines: (optional) contents of EMBOSS formatted references file

  • supplier_lines: (optional) contents of EMBOSS formatted suppliers file

  • yaml: (optional, default false) enzyme_lines, reference_lines, and supplier_lines are read as YAML if set to true

Returns

Bio::REBASE



# File 'lib/bio/db/rebase.rb', line 174

def initialize( enzyme_lines, reference_lines = nil, supplier_lines = nil, yaml = false )
  # All your REBASE are belong to us.

  if yaml
    @enzyme_data = enzyme_lines
    @reference_data = reference_lines
    @supplier_data = supplier_lines
  else
    @enzyme_data = parse_enzymes(enzyme_lines)
    @reference_data = parse_references(reference_lines)
    @supplier_data = parse_suppliers(supplier_lines)
  end

  EnzymeEntry.supplier_data = @supplier_data
  setup_enzyme_data
end

Dynamic Method Handling

This class handles dynamic methods through the method_missing method

#method_missing(method_id, *args) ⇒ Object

Makes the instantiated class act like a Hash on @data. Does the equivalent, and more, of this:

def []( key ); @data[ key ]; end
def size; @data.size; end


# File 'lib/bio/db/rebase.rb', line 158

def method_missing(method_id, *args) #:nodoc:
  self.class.class_eval do
    define_method(method_id) { |a| Hash.instance_method(method_id).bind(@data).call(a) }
  end
  Hash.instance_method(method_id).bind(@data).call(*args)
end
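
The delegation pattern described above can be sketched without BioRuby. The following is a minimal illustration, not the library's implementation: HashBacked is a hypothetical stand-in class, and the splat arguments, the guard on Hash.method_defined?, and respond_to_missing? are additions of this sketch. The splat is used because on current Ruby versions a method defined from a block with a single |a| parameter checks its argument count, so a zero-argument call such as size would likely fail once the method has been defined.

```ruby
# Hypothetical stand-in class for illustration; not part of BioRuby.
class HashBacked
  def initialize(data)
    @data = data
  end

  # Forward unknown calls to @data when Hash implements them.
  # A real method is defined on first use, so later calls bypass method_missing.
  def method_missing(method_id, *args, &block)
    return super unless Hash.method_defined?(method_id)

    self.class.class_eval do
      define_method(method_id) do |*a, &b|
        @data.public_send(method_id, *a, &b)
      end
    end
    public_send(method_id, *args, &block)
  end

  # Keep respond_to? consistent with what method_missing accepts.
  def respond_to_missing?(method_id, include_private = false)
    Hash.method_defined?(method_id) || super
  end
end

store = HashBacked.new('AarI' => 'CACCTGC', 'EcoRI' => 'GAATTC')
pattern = store['AarI']  # delegated to Hash#[]
count   = store.size     # delegated to Hash#size, no arguments
```

Calls that Hash does not implement still raise NoMethodError, because the guard hands them back to the default method_missing.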

Class Method Details

.load_yaml(f_enzyme, f_reference = nil, f_supplier = nil) ⇒ Object

Read YAML formatted files

rebase = Bio::REBASE.load_yaml( 'enz.yaml' )
rebase = Bio::REBASE.load_yaml( 'enz.yaml', 'ref.yaml' )
rebase = Bio::REBASE.load_yaml( 'enz.yaml', 'ref.yaml', 'sup.yaml' )

Arguments

  • f_enzyme: (required) Filename to read YAML-formatted enzyme data

  • f_reference: (optional) Filename to read YAML-formatted reference data

  • f_supplier: (optional) Filename to read YAML-formatted supplier data

Returns

Bio::REBASE object



# File 'lib/bio/db/rebase.rb', line 261

def self.load_yaml( f_enzyme, f_reference=nil, f_supplier=nil )
  e = YAML.load_file(f_enzyme)
  r = f_reference ? YAML.load_file(f_reference) : nil
  s = f_supplier ? YAML.load_file(f_supplier) : nil
  self.new(e,r,s,true)
end

.read(f_enzyme, f_reference = nil, f_supplier = nil) ⇒ Object

Read REBASE EMBOSS-formatted files

rebase = Bio::REBASE.read( 'emboss_e' )
rebase = Bio::REBASE.read( 'emboss_e', 'emboss_r' )
rebase = Bio::REBASE.read( 'emboss_e', 'emboss_r', 'emboss_s' )

Arguments

  • f_enzyme: (required) Filename to read enzyme data

  • f_reference: (optional) Filename to read reference data

  • f_supplier: (optional) Filename to read supplier data

Returns

Bio::REBASE object



# File 'lib/bio/db/rebase.rb', line 243

def self.read( f_enzyme, f_reference=nil, f_supplier=nil )
  e = IO.readlines(f_enzyme)
  r = f_reference ? IO.readlines(f_reference) : nil
  s = f_supplier ? IO.readlines(f_supplier) : nil
  self.new(e,r,s)
end
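
Note that .read hands the constructor the result of IO.readlines (an Array of lines), while the Usage section passes the result of File.read (a single String). Both forms appear on this page, so the parser presumably accepts either. The sketch below only shows the difference in shape between the two calls; the file content is a made-up placeholder, not real REBASE data.

```ruby
require 'tempfile'

# Placeholder content for illustration; not real REBASE data.
sample = "# placeholder header\nFakeI\tGATC\n"

file = Tempfile.new('emboss_e')
file.write(sample)
file.close

as_string = File.read(file.path)     # one String, as in the Usage section
as_lines  = IO.readlines(file.path)  # Array of lines, as in Bio::REBASE.read

file.unlink
```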

Instance Method Details

#each ⇒ Object

Calls block once for each element in @data hash, passing that element as a parameter.


Arguments

  • Accepts a block

Returns

the @data hash itself (the return value of Hash#each), not the block results



# File 'lib/bio/db/rebase.rb', line 150

def each
  @data.each { |item| yield item }
end
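
Because @data is a hash, each element yielded is a [key, value] pair, which is why the Usage section can write |name, info| in its block. A minimal sketch of that behavior, using a hypothetical PairStore class and plain strings in place of enzyme entries:

```ruby
# Hypothetical stand-in class for illustration; not part of BioRuby.
class PairStore
  def initialize(data)
    @data = data
  end

  # Same shape as the method above: each hash element is yielded as one item.
  def each
    @data.each { |item| yield item }
  end
end

store = PairStore.new('AarI' => '', 'EcoRII' => '2(5)')

# The yielded [key, value] array is destructured into two block parameters.
methylated = []
store.each { |name, methylation| methylated << name unless methylation.empty? }

# With a single block parameter, the whole pair arrives as an Array.
pairs = []
store.each { |item| pairs << item }
```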

#enzyme_name?(name) ⇒ Boolean

Check if supplied name is the name of an available enzyme


Arguments

  • name: Enzyme name

Returns

Boolean: true if name matches an available enzyme name (the comparison ignores case), false otherwise


# File 'lib/bio/db/rebase.rb', line 207

def enzyme_name?(name)
  enzymes.each do |e|
    return true if e.downcase == name.downcase
  end
  return false
end
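
The same case-insensitive check can be written more compactly. This is an equivalent sketch rather than the library's code: known_name? is a hypothetical helper, and it assumes Ruby 2.4 or later for String#casecmp?.

```ruby
# First five names from the Usage section, used here as sample data.
names = %w[AarI AasI AatI AatII Acc16I]

# Hypothetical helper: true if any name equals the query, ignoring case.
def known_name?(names, query)
  names.any? { |n| n.casecmp?(query) }
end

found   = known_name?(names, 'aasi')   # matches "AasI"
missing = known_name?(names, 'EcoRI')  # not in the sample list
```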

#enzymes ⇒ Object

List the enzymes available


Arguments

  • none

Returns

Array of sorted enzyme names



# File 'lib/bio/db/rebase.rb', line 197

def enzymes
  @data.keys.sort
end

#save_yaml(f_enzyme, f_reference = nil, f_supplier = nil) ⇒ Object

Save the current data

rebase.save_yaml( 'enz.yaml' )
rebase.save_yaml( 'enz.yaml', 'ref.yaml' )
rebase.save_yaml( 'enz.yaml', 'ref.yaml', 'sup.yaml' )

Arguments

  • f_enzyme: (required) Filename to save YAML formatted output of enzyme data

  • f_reference: (optional) Filename to save YAML formatted output of reference data

  • f_supplier: (optional) Filename to save YAML formatted output of supplier data

Returns

nil



# File 'lib/bio/db/rebase.rb', line 225

def save_yaml( f_enzyme, f_reference=nil, f_supplier=nil )
  File.open(f_enzyme, 'w') { |f| f.puts YAML.dump(@enzyme_data) }
  File.open(f_reference, 'w') { |f| f.puts YAML.dump(@reference_data) } if f_reference
  File.open(f_supplier, 'w') { |f| f.puts YAML.dump(@supplier_data) } if f_supplier
  return
end
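
The save and load steps rely on the same two calls, YAML.dump and YAML.load_file, so a round trip can be sketched with the standard library alone. The hash below is placeholder data: the real structure of @enzyme_data is not shown on this page.

```ruby
require 'yaml'
require 'tmpdir'

# Placeholder data for illustration; the real @enzyme_data structure may differ.
enzyme_data = { 'AarI' => { 'pattern' => 'CACCTGC' } }

restored = Dir.mktmpdir do |dir|
  path = File.join(dir, 'enz.yaml')
  # Same write pattern as save_yaml.
  File.open(path, 'w') { |f| f.puts YAML.dump(enzyme_data) }
  # Same read call as load_yaml.
  YAML.load_file(path)
end
```

Plain hashes and strings survive the round trip unchanged. Newer Psych releases restrict which classes YAML.load_file will instantiate by default, so whether real REBASE data loads without extra options depends on the objects it contains.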