Class: Ensembl::Core::SeqRegion

Inherits:
DBConnection
Defined in:
lib/bio-ensembl/core/activerecord.rb

Overview

The SeqRegion class describes a part of a coordinate system. It is an interface to the seq_region table of the Ensembl MySQL database.

This class uses ActiveRecord to access data in the Ensembl database. See the general documentation of the Ensembl module for more information on what this means and what methods are available.

Examples:

chr4 = SeqRegion.find_by_name('4')
puts chr4.coord_system.name     #--> 'chromosome'
chr4.genes.each do |gene|
  puts gene.biotype
end

Instance Method Summary

Methods inherited from DBConnection

connect, ensemblgenomes_connect

Methods inherited from DBRegistry::Base

generic_connect, get_info, get_name_from_db

Instance Method Details

#assembled_seq_regions(coord_system_name = nil) ⇒ Array<SeqRegion>

The SeqRegion#assembled_seq_regions method returns the sequence regions onto which the current region is assembled. For example, calling this method on a contig sequence region might return the chromosome that the contig is part of. Optionally, the method takes a coordinate system name so that only regions in that coordinate system are returned.

Parameters:

  • coord_system_name (String) (defaults to: nil)

    Name of coordinate system

Returns:

  • (Array<SeqRegion>)

    Array of SeqRegion objects



# File 'lib/bio-ensembl/core/activerecord.rb', line 400

def assembled_seq_regions(coord_system_name = nil)
  if coord_system_name.nil?
    return self.asm_seq_regions
  else
    answer = Array.new
    coord_system = CoordSystem.find_by_name(coord_system_name)
    self.asm_seq_regions.each do |asr|
      if asr.coord_system_id == coord_system.id
        answer.push(asr)
      end
    end
    return answer
  end
end
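
The filtering step can be tried without a database. The following is a minimal sketch, not the library's code: CoordSys and Region are hypothetical stand-ins for the CoordSystem and SeqRegion models, and filter_by_coord_system is an illustrative helper name. Unlike the listing above, the sketch returns an empty array when the name matches no coordinate system.

```ruby
# Hypothetical stand-ins for the CoordSystem and SeqRegion models.
CoordSys = Struct.new(:id, :name)
Region   = Struct.new(:name, :coord_system_id)

COORD_SYSTEMS = [CoordSys.new(1, 'chromosome'), CoordSys.new(2, 'scaffold')]

# Keep only the regions whose coordinate system has the given name.
# Returns all regions when no name is given, and an empty array when
# the name matches no coordinate system.
def filter_by_coord_system(regions, coord_system_name = nil)
  return regions if coord_system_name.nil?
  coord_system = COORD_SYSTEMS.find { |cs| cs.name == coord_system_name }
  return [] if coord_system.nil?
  regions.select { |r| r.coord_system_id == coord_system.id }
end

assembled = [Region.new('4', 1), Region.new('Chr4.003', 2)]
chromosomes = filter_by_coord_system(assembled, 'chromosome')
puts chromosomes.map(&:name).inspect
```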

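#assembly_links_as_assembly(coord_system = nil) ⇒ Array<AssemblyLink>

The SeqRegion#assembly_links_as_assembly method queries the assembly table to find those rows (i.e. AssemblyLink objects) for which this seq_region is the assembly.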

Examples:

my_seq_region = SeqRegion.find_by_name('4')
first_link = my_seq_region.assembly_links_as_assembly[0]
puts first_link.asm_start.to_s + "\t" + first_link.asm_end.to_s

Parameters:

  • coord_system (CoordSystem) (defaults to: nil)

    Coordinate system object that the components should belong to

Returns:

  • (Array<AssemblyLink>)

    Array of AssemblyLink objects



# File 'lib/bio-ensembl/core/activerecord.rb', line 450

def assembly_links_as_assembly(coord_system = nil)
  # Without a coordinate system, return every row in which this region is the assembly.
  if coord_system.nil?
    return AssemblyLink.find_by_sql("SELECT * FROM assembly a WHERE a.asm_seq_region_id = #{self.id}")
  end
  # Cache the coordinate system in the session the first time it is seen.
  unless Ensembl::SESSION.coord_system_ids.has_key?(coord_system.name)
    Ensembl::SESSION.coord_systems[coord_system.id] = coord_system
    Ensembl::SESSION.coord_system_ids[coord_system.name] = coord_system.id
  end
  return AssemblyLink.find_by_sql("SELECT * FROM assembly a WHERE a.asm_seq_region_id = #{self.id} AND a.cmp_seq_region_id IN (SELECT sr.seq_region_id FROM seq_region sr WHERE sr.coord_system_id = #{coord_system.id})")
end

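#assembly_links_as_component(coord_system = nil) ⇒ Array<AssemblyLink>

The SeqRegion#assembly_links_as_component method queries the assembly table to find those rows (i.e. AssemblyLink objects) for which this seq_region is the component.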

Examples:

my_seq_region = SeqRegion.find_by_name('Chr4.003.1')
first_link = my_seq_region.assembly_links_as_component[0]
puts first_link.asm_start.to_s + "\t" + first_link.asm_end.to_s

Parameters:

  • coord_system (CoordSystem) (defaults to: nil)

    Coordinate system object that the assembly should belong to

Returns:

  • (Array<AssemblyLink>)

    Array of AssemblyLink objects



# File 'lib/bio-ensembl/core/activerecord.rb', line 473

def assembly_links_as_component(coord_system = nil)
  if coord_system.nil?
    return self.asm_links_as_cmp
  else
    return self.asm_links_as_cmp.select{|alac| alac.asm_seq_region.coord_system_id == coord_system.id}
  end
end
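
The difference between a region acting as the assembly and acting as the component can be illustrated with an in-memory table. This is a sketch under stated assumptions, not the library's implementation: Link is a hypothetical stand-in for AssemblyLink, the rows are invented for illustration, and links_as_assembly and links_as_component are illustrative helper names. Only the column names (asm_seq_region_id, cmp_seq_region_id, asm_start, asm_end) are taken from the text above.

```ruby
# Hypothetical in-memory stand-in for rows of the assembly table.
Link = Struct.new(:asm_seq_region_id, :cmp_seq_region_id, :asm_start, :asm_end)

ASSEMBLY = [
  Link.new(1, 10, 1, 5000),     # region 10 is a component of region 1
  Link.new(1, 11, 5001, 9000),  # region 11 is a component of region 1
  Link.new(2, 12, 1, 3000)      # region 12 is a component of region 2
]

# Rows in which the given region is the assembled (larger) region.
def links_as_assembly(seq_region_id)
  ASSEMBLY.select { |link| link.asm_seq_region_id == seq_region_id }
end

# Rows in which the given region is the component (smaller) region.
def links_as_component(seq_region_id)
  ASSEMBLY.select { |link| link.cmp_seq_region_id == seq_region_id }
end

# Print the assembly coordinates of every component of region 1.
links_as_assembly(1).each do |link|
  puts "#{link.asm_start}\t#{link.asm_end}"
end
```

A region that appears only on the assembly side, such as region 1 here, has no rows as a component, so links_as_component(1) is empty.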

#component_seq_regions(coord_system_name = nil) ⇒ Array<SeqRegion>

The SeqRegion#component_seq_regions method returns the sequence regions contained within the current region (in other words, the pieces used to assemble the current region). For example, calling this method on a chromosome sequence region might return the contigs that were assembled into this chromosome. Optionally, the method takes a coordinate system name so that only regions in that coordinate system are returned.

Parameters:

  • coord_system_name (String) (defaults to: nil)

    Name of coordinate system

Returns:

  • (Array<SeqRegion>)

    Array of SeqRegion objects



# File 'lib/bio-ensembl/core/activerecord.rb', line 424

def component_seq_regions(coord_system_name = nil)
  if coord_system_name.nil?
    return self.cmp_seq_regions
  else
    answer = Array.new
    coord_system = CoordSystem.find_by_name(coord_system_name)
    self.cmp_seq_regions.each do |csr|
      if csr.coord_system_id == coord_system.id
        answer.push(csr)
      end
    end
    return answer
  end
end

#sequence ⇒ String Also known as: seq

The SeqRegion#sequence method returns the sequence of this seq_region. At the moment, it will only return the sequence if the region belongs to the seqlevel coordinate system.

Returns:

  • (String)

    DNA sequence



# File 'lib/bio-ensembl/core/activerecord.rb', line 486

def sequence
  return self.dna.sequence
end

#slice ⇒ Ensembl::Core::Slice

The SeqRegion#slice method returns a slice object that covers the whole of the seq_region.

Returns:

  • (Ensembl::Core::Slice)

    Slice covering the whole seq_region



# File 'lib/bio-ensembl/core/activerecord.rb', line 388

def slice
  return Ensembl::Core::Slice.new(self)
end

#subsequence(start, stop) ⇒ String Also known as: subseq

The SeqRegion#subsequence method returns a subsequence of this seq_region. At the moment, it will only return the sequence if the region belongs to the seqlevel coordinate system.

Parameters:

  • start (Integer)

    Start position (1-based, inclusive)

  • stop (Integer)

    Stop position (1-based, inclusive)

Returns:

  • (String)

    DNA sequence



# File 'lib/bio-ensembl/core/activerecord.rb', line 498

def subsequence(start, stop)
  return self.seq.slice(start - 1, (stop - start) + 1)
end
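
The coordinate arithmetic in the listing above treats start and stop as 1-based, inclusive positions and converts them to the zero-based offset and length that String#slice expects. A small sketch on a plain string shows this; subsequence_of is a hypothetical helper name and the DNA string is made up for illustration.

```ruby
# Extract the bases from start to stop, both 1-based and inclusive,
# using the same arithmetic as the listing above.
def subsequence_of(sequence, start, stop)
  sequence.slice(start - 1, (stop - start) + 1)
end

dna = 'ACGTACGTAC'
puts subsequence_of(dna, 3, 6)
```

With this convention, start == stop yields a single base, and passing 1 and the sequence length yields the whole sequence.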