Class: Bio::Blast
Defined in:
lib/bio/appl/blast.rb,
lib/bio/io/fastacmd.rb,
lib/bio/appl/blast/rexml.rb,
lib/bio/appl/blast/remote.rb,
lib/bio/appl/blast/report.rb,
lib/bio/appl/bl2seq/report.rb,
lib/bio/appl/blast/format0.rb,
lib/bio/appl/blast/format8.rb,
lib/bio/appl/blast/wublast.rb,
lib/bio/appl/blast/rpsblast.rb,
lib/bio/appl/blast/xmlparser.rb,
lib/bio/appl/blast/ncbioptions.rb
Overview
Description
The Bio::Blast class contains methods for running local or remote BLAST searches and for parsing their output (the BLAST reports). For more information on similarity searches and the BLAST program, see www.ncbi.nlm.nih.gov/Education/BLASTinfo/similarity.html.
Usage
require 'bio'
# To run an actual BLAST analysis:
# 1. create a BLAST factory
remote_blast_factory = Bio::Blast.remote('blastp', 'SWISS',
'-e 0.0001', 'genomenet')
#or:
local_blast_factory = Bio::Blast.local('blastn','/path/to/db')
# 2. run the actual BLAST by querying the factory
report = remote_blast_factory.query(sequence_text)
# Then, to parse the report, see Bio::Blast::Report
See also
- Bio::Blast::Report
- Bio::Blast::Report::Hit
- Bio::Blast::Report::Hsp
Defined Under Namespace
Modules: Default, RPSBlast, Remote, WU
Classes: Bl2seq, Fastacmd, NCBIOptions, Report, Report_tab
Instance Attribute Summary
- #blastall ⇒ Object
  Full path for blastall.
- #db ⇒ Object
  Database name (-d option for blastall).
- #filter ⇒ Object
  Filter option for blastall -F (T or F).
- #format ⇒ Object
  Output report format for blastall -m.
- #matrix ⇒ Object
  Substitution matrix for blastall -M.
- #options ⇒ Object
  Options for blastall.
- #output ⇒ Object (readonly)
  Returns a String containing the BLAST execution output as is, in the format given by Bio::Blast#format.
- #parser ⇒ Object (writeonly)
  Selects the report parser: :xmlparser, :rexml or :tab.
- #program ⇒ Object
  Program name (-p option for blastall): blastp, blastn, blastx, tblastn or tblastx.
- #server ⇒ Object
  Server to submit the BLASTs to.
Class Method Summary
- .local(program, db, options = '', blastall = nil) ⇒ Object
  This is a shortcut for Bio::Blast.new: Bio::Blast.local(program, database, options) is equivalent to Bio::Blast.new(program, database, options, ‘local’). Arguments: program (required): ‘blastn’, ‘blastp’, ‘blastx’, ‘tblastn’ or ‘tblastx’; db (required): name of the local database; options: blastall options (see www.genome.jp/dbget-bin/show_man?blast2); blastall: full path to blastall program (e.g. “/opt/bin/blastall”; DEFAULT: “blastall”). Returns a Bio::Blast factory object.
- .remote(program, db, option = '', server = 'genomenet') ⇒ Object
  Bio::Blast.remote does exactly the same as Bio::Blast.new, but sets the remote server ‘genomenet’ as its default.
- .reports(input, parser = nil) ⇒ Object
  Bio::Blast.reports parses given data, and returns an array of report (Bio::Blast::Report or Bio::Blast::Default::Report) objects, or yields each report object when a block is given.
- .reports_xml(input, parser = nil) ⇒ Object
  Note that this is the old implementation of Bio::Blast.reports.
Instance Method Summary
- #initialize(program, db, opt = [], server = 'local') ⇒ Blast (constructor)
  Creates a Bio::Blast factory object.
- #option ⇒ Object
  Returns options of blastall.
- #option=(str) ⇒ Object
  Set options for blastall.
- #query(query) ⇒ Object
  This method submits a sequence to a BLAST factory, which performs the actual BLAST.
Constructor Details
#initialize(program, db, opt = [], server = 'local') ⇒ Blast
Creates a Bio::Blast factory object.
To run any BLAST searches, a factory has to be created that describes a certain BLAST pipeline: the program to use, the database to search, any options and the server to use. E.g.
blast_factory = Bio::Blast.new('blastn','dbsts', '-e 0.0001 -r 4', 'genomenet')
Arguments:
- program (required): ‘blastn’, ‘blastp’, ‘blastx’, ‘tblastn’ or ‘tblastx’
- db (required): name of the (local or remote) database
- options: blastall options (see www.genome.jp/dbget-bin/show_man?blast2)
- server: server to use (e.g. ‘genomenet’; DEFAULT = ‘local’)
Returns:
- Bio::Blast factory object
# File 'lib/bio/appl/blast.rb', line 317
def initialize(program, db, opt = [], server = 'local')
  @program  = program
  @db       = db

  @blastall = 'blastall'
  @matrix   = nil
  @filter   = nil

  @output   = ''
  @parser   = nil
  @format   = nil

  @options = (opt, program, db)
  self.server = server
end
Instance Attribute Details
#blastall ⇒ Object
Full path for blastall. (default: ‘blastall’).
# File 'lib/bio/appl/blast.rb', line 280
def blastall
  @blastall
end
#db ⇒ Object
Database name (-d option for blastall)
# File 'lib/bio/appl/blast.rb', line 249
def db
  @db
end
#filter ⇒ Object
Filter option for blastall -F (T or F).
# File 'lib/bio/appl/blast.rb', line 286
def filter
  @filter
end
#format ⇒ Object
Output report format for blastall -m
0, pairwise; 1; 2; 3; 4; 5; 6; 7, XML Blast output; 8, tabular; 9, tabular with comment lines; 10, ASN text; 11, ASN binary [integer].
# File 'lib/bio/appl/blast.rb', line 295
def format
  @format
end
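As an illustration of what format 8 (tabular) output looks like, the sketch below splits one tabular line into its twelve standard columns. It is a minimal example using plain Ruby only; the `TabHit` struct, its field names and the `parse_tab_line` helper are hypothetical names chosen here and are not BioRuby's Bio::Blast::Report_tab API.

```ruby
# Column order of NCBI BLAST tabular output (-m 8). The Struct and its
# field names are illustrative only, not part of BioRuby.
TabHit = Struct.new(:query_id, :subject_id, :percent_identity,
                    :alignment_length, :mismatches, :gap_openings,
                    :query_start, :query_end, :subject_start, :subject_end,
                    :evalue, :bit_score)

# Hypothetical helper: converts one tab-separated line into a TabHit.
def parse_tab_line(line)
  fields = line.chomp.split("\t")
  unless fields.size == 12
    raise ArgumentError, "expected 12 fields, got #{fields.size}"
  end
  query_id, subject_id, identity, *ints, evalue, score = fields
  TabHit.new(query_id, subject_id, identity.to_f,
             *ints.map(&:to_i), evalue.to_f, score.to_f)
end

line = "q1\ts1\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-100\t370\n"
hit = parse_tab_line(line)
puts "#{hit.query_id} vs #{hit.subject_id}: " \
     "#{hit.percent_identity}% identity over #{hit.alignment_length} columns"
```

Format 9 adds comment lines starting with `#`, which a helper like this would need to skip before splitting.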
#matrix ⇒ Object
Substitution matrix for blastall -M
# File 'lib/bio/appl/blast.rb', line 283
def matrix
  @matrix
end
#options ⇒ Object
Options for blastall
# File 'lib/bio/appl/blast.rb', line 252
def options
  @options
end
#output ⇒ Object (readonly)
Returns a String containing the BLAST execution output as is, in the format given by Bio::Blast#format.
# File 'lib/bio/appl/blast.rb', line 289
def output
  @output
end
#parser=(value) ⇒ Object (writeonly)
Selects the report parser: :xmlparser, :rexml or :tab.
# File 'lib/bio/appl/blast.rb', line 298
def parser=(value)
  @parser = value
end
#program ⇒ Object
Program name (-p option for blastall): blastp, blastn, blastx, tblastn or tblastx
# File 'lib/bio/appl/blast.rb', line 246
def program
  @program
end
#server ⇒ Object
Server to submit the BLASTs to
# File 'lib/bio/appl/blast.rb', line 260
def server
  @server
end
Class Method Details
.local(program, db, options = '', blastall = nil) ⇒ Object
This is a shortcut for Bio::Blast.new:
Bio::Blast.local(program, database, options)
is equivalent to
Bio::Blast.new(program, database, options, 'local')
Arguments:
- program (required): ‘blastn’, ‘blastp’, ‘blastx’, ‘tblastn’ or ‘tblastx’
- db (required): name of the local database
- options: blastall options (see www.genome.jp/dbget-bin/show_man?blast2)
- blastall: full path to blastall program (e.g. “/opt/bin/blastall”; DEFAULT: “blastall”)
Returns:
- Bio::Blast factory object
# File 'lib/bio/appl/blast.rb', line 79
def self.local(program, db, options = '', blastall = nil)
  f = self.new(program, db, options, 'local')
  if blastall then
    f.blastall = blastall
  end
  f
end
.remote(program, db, option = '', server = 'genomenet') ⇒ Object
Bio::Blast.remote does exactly the same as Bio::Blast.new, but sets the remote server ‘genomenet’ as its default.
Arguments:
- program (required): ‘blastn’, ‘blastp’, ‘blastx’, ‘tblastn’ or ‘tblastx’
- db (required): name of the remote database
- options: blastall options (see www.genome.jp/dbget-bin/show_man?blast2)
- server: server to use (DEFAULT = ‘genomenet’)
Returns:
- Bio::Blast factory object
# File 'lib/bio/appl/blast.rb', line 97
def self.remote(program, db, option = '', server = 'genomenet')
  self.new(program, db, option, server)
end
.reports(input, parser = nil) ⇒ Object
Bio::Blast.reports parses given data, and returns an array of report (Bio::Blast::Report or Bio::Blast::Default::Report) objects, or yields each report object when a block is given.
Supported formats: NCBI default (-m 0), XML (-m 7), tabular (-m 8).
Arguments:
- input (required): input data
- parser: type of parser. See Bio::Blast::Report.new
Returns:
- Undefined when a block is given. Otherwise, an Array containing report (Bio::Blast::Report or Bio::Blast::Default::Report) objects.
# File 'lib/bio/appl/blast.rb', line 114
def self.reports(input, parser = nil)
  begin
    istr = input.to_str
  rescue NoMethodError
    istr = nil
  end
  if istr then
    input = StringIO.new(istr)
  end
  raise 'unsupported input data type' unless input.respond_to?(:gets)

  # if proper parser is given, emulates old behavior.
  case parser
  when :xmlparser, :rexml
    ff = Bio::FlatFile.new(Bio::Blast::Report, input)
    if block_given? then
      ff.each do |e|
        yield e
      end
      return []
    else
      return ff.to_a
    end
  when :tab
    istr = input.read unless istr
    rep = Report.new(istr, parser)
    if block_given? then
      yield rep
      return []
    else
      return [ rep ]
    end
  end

  # preparation of the new format autodetection rule if needed
  if !defined?(@@reports_format_autodetection_rule) or
      !@@reports_format_autodetection_rule then
    regrule = Bio::FlatFile::AutoDetect::RuleRegexp
    blastxml = regrule[ 'Bio::Blast::Report',
                        /\<\!DOCTYPE BlastOutput PUBLIC / ]
    blast    = regrule[ 'Bio::Blast::Default::Report',
                        /^BLAST.? +[\-\.\w]+ +\[[\-\.\w ]+\]/ ]
    tblast   = regrule[ 'Bio::Blast::Default::Report_TBlast',
                        /^TBLAST.? +[\-\.\w]+ +\[[\-\.\w ]+\]/ ]
    tab      = regrule[ 'Bio::Blast::Report_tab',
                        /^([^\t]*\t){11}[^\t]*$/ ]
    auto = Bio::FlatFile::AutoDetect[ blastxml, blast, tblast, tab ]
    # sets priorities
    blastxml.is_prior_to blast
    blast.is_prior_to tblast
    tblast.is_prior_to tab
    # rehash
    auto.rehash
    @@report_format_autodetection_rule = auto
  end

  # Creates a FlatFile object with dummy class
  ff = Bio::FlatFile.new(Object, input)
  ff.dbclass = nil

  # file format autodetection
  3.times do
    break if ff.eof? or
      ff.autodetect(31, @@report_format_autodetection_rule)
  end
  # If format detection failed, assumed to be tabular (-m 8)
  ff.dbclass = Bio::Blast::Report_tab unless ff.dbclass

  if block_given? then
    ff.each do |entry|
      yield entry
    end
    ret = []
  else
    ret = ff.to_a
  end
  ret
end
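The detection rules in the listing above can be tried in isolation. The following is a minimal sketch, not BioRuby's implementation: it reuses the four regular expressions from the source and checks them in the same priority order (XML, default, TBLAST, tabular), falling back to tabular when nothing matches. The `detect_blast_format` helper, the `BLAST_FORMAT_RULES` constant, the returned symbols and the sample strings are all hypothetical and exist only for illustration.

```ruby
# Patterns copied from Bio::Blast.reports, listed in priority order.
# The constant, helper and symbols are illustrative, not part of BioRuby.
BLAST_FORMAT_RULES = [
  [:xml,     /\<\!DOCTYPE BlastOutput PUBLIC /],
  [:default, /^BLAST.? +[\-\.\w]+ +\[[\-\.\w ]+\]/],
  [:tblast,  /^TBLAST.? +[\-\.\w]+ +\[[\-\.\w ]+\]/],
  [:tab,     /^([^\t]*\t){11}[^\t]*$/]
].freeze

# Hypothetical helper: returns the first rule name whose pattern matches.
def detect_blast_format(text)
  BLAST_FORMAT_RULES.each do |name, pattern|
    return name if text =~ pattern
  end
  :tab # same fallback as Bio::Blast.reports: assume tabular (-m 8)
end

# Made-up samples that only imitate the first lines of each format.
xml_sample = %(<?xml version="1.0"?>\n) +
             %(<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" ) +
             %("NCBI_BlastOutput.dtd">\n)
default_sample = "BLASTN 2.2.26 [Sep-21-2011]\n\nQuery= my_query\n"
tab_sample = %w[q1 s1 98.5 200 3 0 1 200 5 204 1e-100 370].join("\t") + "\n"

puts detect_blast_format(xml_sample)
puts detect_blast_format(default_sample)
puts detect_blast_format(tab_sample)
```

Checking the XML rule first matters because an XML report may contain text that a looser pattern would also accept; the real method expresses the same ordering with `is_prior_to`.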
.reports_xml(input, parser = nil) ⇒ Object
Note that this is the old implementation of Bio::Blast.reports. The aim of this method is keeping compatibility for older BLAST XML documents which might not be parsed by the new Bio::Blast.reports nor Bio::FlatFile. (Though we are not sure whether such documents exist or not.)
Bio::Blast.reports_xml parses given data, and returns an array of Bio::Blast::Report objects, or yields each Bio::Blast::Report object when a block is given.
It can be used only for XML format. For default (-m 0) format, consider using Bio::FlatFile, or Bio::Blast.reports.
Arguments:
- input (required): input data
- parser: type of parser. See Bio::Blast::Report.new
Returns:
- Undefined when a block is given. Otherwise, an Array containing Bio::Blast::Report objects.
# File 'lib/bio/appl/blast.rb', line 220
def self.reports_xml(input, parser = nil)
  ary = []
  input.each_line("</BlastOutput>\n") do |xml|
    xml.sub!(/[^<]*(<?)/, '\1') # skip before <?xml> tag
    next if xml.empty?          # skip trailing no hits
    rep = Report.new(xml, parser)
    if rep.reports then
      if block_given?
        rep.reports.each { |r| yield r }
      else
        ary.concat rep.reports
      end
    else
      if block_given?
        yield rep
      else
        ary.push rep
      end
    end
  end
  return ary
end
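The record splitting in the listing above relies on `each_line` accepting a custom separator, so that each chunk ends at a closing `</BlastOutput>` tag. A minimal sketch of just that step, using StringIO and made-up placeholder XML rather than real BLAST output:

```ruby
require 'stringio'

# Two placeholder documents concatenated in one stream, as when several
# BLAST XML outputs are written to a single file. Content is invented.
input = StringIO.new(<<~XML)
  <?xml version="1.0"?>
  <BlastOutput><id>first</id>
  </BlastOutput>
  <?xml version="1.0"?>
  <BlastOutput><id>second</id>
  </BlastOutput>
XML

documents = []
input.each_line("</BlastOutput>\n") do |xml|
  xml = xml.sub(/[^<]*(<?)/, '\1') # drop anything before the first "<"
  next if xml.empty?               # skip empty trailing chunks
  documents << xml
end

puts documents.size
documents.each { |doc| puts doc.lines.first }
```

Each element of `documents` is then a complete XML document that could be handed to a parser on its own.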
Instance Method Details
#option ⇒ Object
Returns options of blastall
# File 'lib/bio/appl/blast.rb', line 374
def option
  # backward compatibility
  Bio::Command.make_command_line(options)
end
#option=(str) ⇒ Object
Set options for blastall
# File 'lib/bio/appl/blast.rb', line 380
def option=(str)
  # backward compatibility
  self.options = Shellwords.shellwords(str)
end
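To see what the string-to-array conversion in `#option=` does, the sketch below runs `Shellwords.shellwords` from Ruby's standard library on a sample option string. `Shellwords.join` is used here only as a stand-in for the reverse direction; it is an assumption for illustration, and the quoting produced by `Bio::Command.make_command_line` may differ. The option string itself is an arbitrary example.

```ruby
require 'shellwords'

# Arbitrary example options; the quoted value contains a space on purpose.
option_string = "-e 0.0001 -r 4 -F 'm S'"

# Split the way #option= does: shell-like tokenisation that respects quotes.
options = Shellwords.shellwords(option_string)
p options

# Stand-in for the array-to-string direction (not BioRuby's own code).
rejoined = Shellwords.join(options)
puts rejoined

# Splitting the rejoined string yields the same array again.
p Shellwords.shellwords(rejoined) == options
```

Keeping options as an array avoids re-quoting problems when a value contains spaces, which is presumably why the string form survives only for backward compatibility.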
#query(query) ⇒ Object
This method submits a sequence to a BLAST factory, which performs the actual BLAST.
# example 1
seq = Bio::Sequence::NA.new('agggcattgccccggaagatcaagtcgtgctcctg')
report = blast_factory.query(seq)
# example 2
str = <<END_OF_FASTA
>lcl|MySequence
MPPSAISKISNSTTPQVQSSSAPNLTMLEGKGISVEKSFRVYSEEENQNQHKAKDSLGF
KELEKDAIKNSKQDKKDHKNWLETLYDQAEQKWLQEPKKKLQDLIKNSGDNSRVILKDS
END_OF_FASTA
report = blast_factory.query(str)
Bug note: When multi-FASTA is given and the format is 7 (XML) or 8 (tab), it should return an array of Bio::Blast::Report objects, but it returns a single Bio::Blast::Report object. This is a known bug and should be fixed in the future.
Arguments:
- query (required): single- or multiple-FASTA formatted sequence(s)
Returns:
- a Bio::Blast::Report (or Bio::Blast::Default::Report) object when a single query is given. When multiple sequences are given as the query, it returns an array of Bio::Blast::Report (or Bio::Blast::Default::Report) objects. If the result cannot be parsed, nil is returned.
# File 'lib/bio/appl/blast.rb', line 358
def query(query)
  case query
  when Bio::Sequence
    query = query.output(:fasta)
  when Bio::Sequence::NA, Bio::Sequence::AA, Bio::Sequence::Generic
    query = query.to_fasta('query', 70)
  else
    query = query.to_s
  end

  @output = self.__send__("exec_#{@server}", query)
  report = parse_result(@output)
  return report
end
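The listing above picks the execution method from the server name via `__send__("exec_#{@server}", ...)`. The sketch below shows that dispatch pattern on a toy class. `ToyFactory` and its `exec_local` and `exec_genomenet` methods are hypothetical stand-ins written for this example; they run no BLAST search and are not BioRuby code.

```ruby
# Toy stand-in for a BLAST factory, illustrating name-based dispatch only.
class ToyFactory
  attr_reader :server

  def initialize(server)
    @server = server
  end

  # Mirrors the shape of Bio::Blast#query: normalise the input to a
  # String, then call the private method named after the server.
  def query(query)
    text = query.to_s
    __send__("exec_#{@server}", text)
  end

  private

  def exec_local(text)
    "local run of #{text.bytesize} bytes"
  end

  def exec_genomenet(text)
    "remote run of #{text.bytesize} bytes"
  end
end

puts ToyFactory.new('local').query(">q\nACGT\n")
puts ToyFactory.new('genomenet').query(">q\nACGT\n")
```

With this pattern, a server name that has no matching `exec_` method fails with NoMethodError at query time, not when the factory is built.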