Class: Bio::Blast::Fastacmd
- Includes:
- Enumerable
- Defined in:
- lib/bio/io/fastacmd.rb
Overview
DESCRIPTION
Retrieves FASTA-formatted sequences from a BLAST database using the NCBI fastacmd command.
This class requires the ‘fastacmd’ command and a BLAST database
(formatted using the ‘-o T’ option of ‘formatdb’).
USAGE
require 'bio'
fastacmd = Bio::Blast::Fastacmd.new("/db/myblastdb")
entry = fastacmd.get_by_id("sp:128U_DROME")
fastacmd.fetch("sp:128U_DROME")
fastacmd.fetch(["sp:1433_SPIOL", "sp:1432_MAIZE"])
fastacmd.fetch(["sp:1433_SPIOL", "sp:1432_MAIZE"]).each do |fasta|
  puts fasta
end
REFERENCES
-
NCBI tool ftp.ncbi.nih.gov/blast/executables/LATEST/ncbi.tar.gz
-
fastacmd.html biowulf.nih.gov/apps/blast/doc/fastacmd.html
Instance Attribute Summary
-
#database ⇒ Object
Database file path.
-
#fastacmd ⇒ Object
fastacmd command file path.
Instance Method Summary
-
#each_entry ⇒ Object
(also: #each)
Iterates over all sequences in the database.
-
#fetch(list) ⇒ Object
Get the sequences for a list of IDs in the database.
-
#get_by_id(entry_id) ⇒ Object
Get the sequence of a specific entry in the BLASTable database.
-
#initialize(blast_database_file_path) ⇒ Fastacmd
constructor
This method provides a handle to a BLASTable database, which you can then use to retrieve sequences.
Constructor Details
#initialize(blast_database_file_path) ⇒ Fastacmd
This method provides a handle to a BLASTable database, which you can then use to retrieve sequences.
Prerequisites:
-
You have created a BLASTable database with the ‘-o T’ option.
-
You have the NCBI fastacmd tool installed.
For example, suppose the original input file looks like:
>my_seq_1
ACCGACCTCCGGAACGGATAGCCCGACCTACG
>my_seq_2
TCCGACCTTTCCTACCGCACACCTACGCCATCAC
...
and you’ve created a BLASTable database from that with the command
cd /my_dir/
formatdb -i my_input_file -t Test -n Test -o T
then you can get a handle to this database with the command
fastacmd = Bio::Blast::Fastacmd.new("/my_dir/Test")
Arguments:
-
blast_database_file_path: path and name of the BLASTable database
# File 'lib/bio/io/fastacmd.rb', line 81
def initialize(blast_database_file_path)
  @database = blast_database_file_path
  @fastacmd = 'fastacmd'
end
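Because the constructor stores the bare command name 'fastacmd', the tool presumably has to be reachable through PATH. The following is a minimal sketch of a pre-flight check for that prerequisite. The helper find_executable_on_path is hypothetical and not part of BioRuby.

```ruby
# Hypothetical helper (not part of BioRuby): returns the full path of an
# executable found in one of the PATH directories, or nil when it is missing.
def find_executable_on_path(name, path_env = ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?

    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

fastacmd_path = find_executable_on_path('fastacmd')
if fastacmd_path
  puts "fastacmd found at #{fastacmd_path}"
else
  warn 'fastacmd not found on PATH; sequence retrieval is likely to fail'
end
```

Running a check like this before creating the handle gives a clearer message than a failure from inside a later fetch or each_entry call.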
Instance Attribute Details
#database ⇒ Object
Database file path.
# File 'lib/bio/io/fastacmd.rb', line 55
def database
  @database
end
#fastacmd ⇒ Object
fastacmd command file path.
# File 'lib/bio/io/fastacmd.rb', line 58
def fastacmd
  @fastacmd
end
Instance Method Details
#each_entry ⇒ Object Also known as: each
Iterates over all sequences in the database.
fastacmd.each_entry do |fasta|
  p [ fasta.definition[0..30], fasta.seq.size ]
end
- Returns
-
yields a Bio::FastaFormat object on each iteration; the method itself returns self
# File 'lib/bio/io/fastacmd.rb', line 130
def each_entry
  cmd = [ @fastacmd, '-d', @database, '-D', '1' ]
  Bio::Command.call_command(cmd) do |io|
    io.close_write
    Bio::FlatFile.open(Bio::FastaFormat, io) do |f|
      f.each_entry do |entry|
        yield entry
      end
    end
  end
  self
end
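Since the class includes Enumerable and #each is an alias of #each_entry, the usual Enumerable methods should be available on a handle. The sketch below illustrates the pattern with stand-ins: FakeEntry and FakeFastacmd are hypothetical and yield in-memory records instead of running fastacmd. The sequences are the ones from the constructor example.

```ruby
# Stand-in for a Bio::FastaFormat entry; only the two fields used here.
FakeEntry = Struct.new(:definition, :seq)

# Stand-in for Bio::Blast::Fastacmd: yields entries from an in-memory
# array rather than from the output of the fastacmd command.
class FakeFastacmd
  include Enumerable

  def initialize(entries)
    @entries = entries
  end

  # Same contract as the real method: yield each entry, return self.
  def each_entry
    @entries.each { |entry| yield entry }
    self
  end
  alias each each_entry
end

db = FakeFastacmd.new([
  FakeEntry.new('my_seq_1', 'ACCGACCTCCGGAACGGATAGCCCGACCTACG'),
  FakeEntry.new('my_seq_2', 'TCCGACCTTTCCTACCGCACACCTACGCCATCAC')
])

# Enumerable methods are built on top of #each.
names   = db.map(&:definition)
longest = db.max_by { |entry| entry.seq.size }

p names
puts "longest: #{longest.definition}"
```

With the real class, each such call runs fastacmd over the whole database, so combining the work into a single pass is likely cheaper on large databases.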
#fetch(list) ⇒ Object
Get the sequences for a list of IDs in the database.
For example:
p fastacmd.fetch(["sp:1433_SPIOL", "sp:1432_MAIZE"])
This method always returns an array of Bio::FastaFormat objects, even when the result is a single entry.
Arguments:
-
list: an array of IDs (or a single ID string) to retrieve from the database
- Returns
-
array of Bio::FastaFormat objects
# File 'lib/bio/io/fastacmd.rb', line 109
def fetch(list)
  if list.respond_to?(:join)
    entry_id = list.join(",")
  else
    entry_id = list
  end

  cmd = [ @fastacmd, '-d', @database, '-s', entry_id ]
  Bio::Command.call_command(cmd) do |io|
    io.close_write
    Bio::FlatFile.new(Bio::FastaFormat, io).to_a
  end
end
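The argument handling in the source above can be tried out without a BLAST database. This sketch copies the join logic into a hypothetical helper, fastacmd_fetch_command (not part of BioRuby), which only builds the command array and does not run it.

```ruby
# Hypothetical helper mirroring the argument handling in #fetch:
# an array of IDs is joined with commas, a single ID is passed through.
def fastacmd_fetch_command(fastacmd, database, list)
  entry_id = list.respond_to?(:join) ? list.join(',') : list
  [fastacmd, '-d', database, '-s', entry_id]
end

many = fastacmd_fetch_command('fastacmd', '/db/myblastdb',
                              ['sp:1433_SPIOL', 'sp:1432_MAIZE'])
one  = fastacmd_fetch_command('fastacmd', '/db/myblastdb', 'sp:128U_DROME')

p many
p one
```

Both call styles end up as a single value for the '-s' option, which is why #fetch accepts either an array or a plain string.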
#get_by_id(entry_id) ⇒ Object
Get the sequence of a specific entry in the BLASTable database. For example:
entry = fastacmd.get_by_id("sp:128U_DROME")
Arguments:
-
entry_id: ID of an entry in the BLAST database
- Returns
-
a Bio::FastaFormat object
# File 'lib/bio/io/fastacmd.rb', line 94
def get_by_id(entry_id)
  fetch(entry_id).shift
end
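Because #get_by_id takes the first element of the array returned by #fetch, it yields nil whenever that array is empty. The sketch below shows a nil check using a hypothetical stand-in class, StubFastacmd. That an unknown ID produces an empty array is an assumption of this sketch, not documented behavior of the fastacmd tool.

```ruby
# Stand-in for Bio::Blast::Fastacmd (hypothetical, for illustration only):
# records live in a Hash, so no BLAST database or fastacmd binary is needed.
class StubFastacmd
  def initialize(records)
    @records = records
  end

  # Like the real #fetch, always returns an Array. Returning [] for an
  # unknown ID is an assumption of this sketch.
  def fetch(list)
    Array(list).map { |id| @records[id] }.compact
  end

  # Same body as the real method: first element of the fetched Array.
  def get_by_id(entry_id)
    fetch(entry_id).shift
  end
end

# The record text is a placeholder, not real sequence data.
db = StubFastacmd.new('sp:128U_DROME' => 'placeholder record')

found   = db.get_by_id('sp:128U_DROME')
missing = db.get_by_id('sp:NO_SUCH_ID')

puts(found.nil? ? 'not found' : "found: #{found}")
puts(missing.nil? ? 'not found' : "found: #{missing}")
```

Checking for nil before calling methods such as definition or seq on the result avoids a NoMethodError when nothing was retrieved.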