Class: Bio::NCBI::REST
Overview
Description
The Bio::NCBI::REST class provides a REST client for the NCBI E-Utilities.
Entrez utilities index:
Defined Under Namespace
Constant Summary
- NCBI_INTERVAL = 1.0 / 3.0
  Wait for 1/3 second between requests. NCBI's restriction is: "Make no more than 3 requests every 1 second."
  NCBI also asks that any series of more than 100 requests be run on weekends or between 9 pm and 5 am Eastern Time on weekdays; this is not implemented yet in BioRuby.
- @@last_access = nil
- @@last_access_mutex = nil
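The three constants above suggest a simple throttle: remember the time of the last request, guard it with a mutex, and sleep until the interval has passed. The following is only a minimal sketch of that idea. The class name `IntervalThrottle` and its `wait` method are hypothetical and are not part of BioRuby, whose actual implementation is not shown on this page.

```ruby
# Hypothetical helper illustrating interval-based throttling.
# Not BioRuby API; names are assumptions made for this sketch.
class IntervalThrottle
  def initialize(interval)
    @interval = interval      # minimum seconds between two requests
    @last_access = nil        # monotonic timestamp of the previous request
    @mutex = Mutex.new        # serializes access from multiple threads
  end

  # Sleeps just long enough that consecutive calls are at least
  # @interval seconds apart, then yields to the optional block.
  def wait
    @mutex.synchronize do
      if @last_access
        elapsed = monotonic_now - @last_access
        sleep(@interval - elapsed) if elapsed < @interval
      end
      @last_access = monotonic_now
    end
    yield if block_given?
  end

  private

  def monotonic_now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

# At most three "requests" per second, matching NCBI_INTERVAL = 1.0 / 3.0.
throttle = IntervalThrottle.new(1.0 / 3.0)
3.times { |i| throttle.wait { puts "request #{i}" } }
```

A monotonic clock is used here so that system clock adjustments cannot shorten the wait; that is a design choice of the sketch, not a statement about BioRuby.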
Class Method Summary
Instance Method Summary
- (Object) efetch(ids, hash = {}, step = 100)
  Retrieve database entries for the given IDs using the E-Utils (efetch) service.
- (Object) einfo
  List the NCBI database names using the E-Utils (einfo) service.
- (Object) esearch(str, hash = {}, limit = nil, step = 10000)
  Search the NCBI database for the given keywords using the E-Utils (esearch) service and return an array of entry IDs.
- (Object) esearch_count(str, hash = {})
  Arguments: same as the esearch method. Returns: the number of results (integer).
Class Method Details
+ (Object) efetch(*args)
    # File 'lib/bio/io/ncbirest.rb', line 351
    def self.efetch(*args)
      self.new.efetch(*args)
    end
+ (Object) einfo
    # File 'lib/bio/io/ncbirest.rb', line 339
    def self.einfo
      self.new.einfo
    end
+ (Object) esearch(*args)
    # File 'lib/bio/io/ncbirest.rb', line 343
    def self.esearch(*args)
      self.new.esearch(*args)
    end
+ (Object) esearch_count(*args)
    # File 'lib/bio/io/ncbirest.rb', line 347
    def self.esearch_count(*args)
      self.new.esearch_count(*args)
    end
Instance Method Details
- (Object) efetch(ids, hash = {}, step = 100)
Retrieve database entries for the given IDs using the E-Utils (efetch) service.
For information on the possible arguments, see
Usage
ncbi = Bio::NCBI::REST.new
ncbi.efetch("185041", {"db"=>"nucleotide", "rettype"=>"gb", "retmode" => "xml"})
ncbi.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb", "retmode"=>"xml"})
ncbi.efetch("AAA52805", {"db"=>"protein", "rettype"=>"gb"})
Bio::NCBI::REST.efetch("185041", {"db"=>"nucleotide", "rettype"=>"gb", "retmode" => "xml"})
Bio::NCBI::REST.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb"})
Bio::NCBI::REST.efetch("AAA52805", {"db"=>"protein", "rettype"=>"gb"})
Arguments:
- ids: list of NCBI entry IDs (required)
- hash: hash of E-Utils options, e.g. {"db" => "nuccore", "rettype" => "gb"}
  - db: "sequences", "nucleotide", "protein", "pubmed", "omim", ...
  - retmode: "text", "xml", "html", ...
  - rettype: "gb", "gbc", "medline", "count", ...
- step: maximum number of entries retrieved at a time
Returns: String
    # File 'lib/bio/io/ncbirest.rb', line 315
    def efetch(ids, hash = {}, step = 100)
      serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
      opts = default_parameters.merge({ "retmode" => "text" })
      opts.update(hash)

      case ids
      when Array
        list = ids
      else
        list = ids.to_s.split(/\s*,\s*/)
      end

      result = ""
      0.step(list.size, step) do |i|
        opts["id"] = list[i, step].join(',')
        unless opts["id"].empty?
          response = ncbi_post_form(serv, opts)
          result += response.body
        end
      end
      return result.strip
      #return result.strip.split(/\n\n+/)
    end
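The listing above accepts either an array or a comma-separated string of IDs and sends them in groups of at most `step`. As a rough illustration of that batching, without any network access, the sketch below computes the `id` parameter values that would be sent. The helper name `id_batches` is hypothetical and not part of BioRuby; it uses `each_slice` instead of the `0.step` loop in the original.

```ruby
# Hypothetical helper mirroring efetch's ID handling; not BioRuby API.
# Returns the comma-joined "id" value for each request that would be made.
def id_batches(ids, step = 100)
  # Accept an Array as-is; otherwise split a comma-separated string.
  list = ids.is_a?(Array) ? ids : ids.to_s.split(/\s*,\s*/)
  # Group into chunks of at most `step` IDs and join each chunk.
  list.each_slice(step).map { |batch| batch.join(",") }
end

id_batches("185041, J00231,AAA52805", 2)
# returns ["185041,J00231", "AAA52805"]
```

With the default `step` of 100, a list of 250 IDs would therefore produce three requests under this sketch.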
- (Object) einfo
List the NCBI database names using the E-Utils (einfo) service:
pubmed protein nucleotide nuccore nucgss nucest structure genome
books cancerchromosomes cdd gap domains gene genomeprj gensat geo
gds homologene journals mesh ncbisearch nlmcatalog omia omim pmc
popset probe proteinclusters pcassay pccompound pcsubstance snp
taxonomy toolkit unigene unists
Usage
ncbi = Bio::NCBI::REST.new
ncbi.einfo
Bio::NCBI::REST.einfo
Returns: array of strings (database names)
    # File 'lib/bio/io/ncbirest.rb', line 179
    def einfo
      serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi"
      opts = default_parameters.merge({})
      response = ncbi_post_form(serv, opts)
      result = response.body
      list = result.scan(/<DbName>(.*?)<\/DbName>/m).flatten
      return list
    end
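The listing extracts database names from the response body with a regular expression rather than an XML parser. The sketch below applies the same expression to a small hand-written sample. The sample XML is an assumption made for illustration and is not a recorded NCBI response.

```ruby
# Hand-written sample body (assumed shape, for illustration only).
sample = <<~XML
  <eInfoResult>
    <DbList>
      <DbName>pubmed</DbName>
      <DbName>protein</DbName>
      <DbName>nuccore</DbName>
    </DbList>
  </eInfoResult>
XML

# scan returns one single-element array per match; flatten yields
# a plain array of strings.
db_names = sample.scan(/<DbName>(.*?)<\/DbName>/m).flatten
# db_names is ["pubmed", "protein", "nuccore"]
```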
- (Object) esearch(str, hash = {}, limit = nil, step = 10000)
Search the NCBI database for the given keywords using the E-Utils (esearch) service and return an array of entry IDs.
For information on the possible arguments, see:
- eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html
- www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helppubmed.section.pubmedhelp.Search_Field_Descrip
Usage
ncbi = Bio::NCBI::REST.new
ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
ncbi.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})
Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
Bio::NCBI::REST.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})
Arguments:
- str: query string (required)
- hash: hash of E-Utils options, e.g. {"db" => "nuccore", "rettype" => "gb"}
  - db: "sequences", "nucleotide", "protein", "pubmed", "taxonomy", ...
  - retmode: "text", "xml", "html", ...
  - rettype: "gb", "medline", "count", ...
  - retmax: integer (default 100)
  - retstart: integer
  - field:
    - "titl": Title [TI]
    - "tiab": Title/Abstract [TIAB]
    - "word": Text words [TW]
    - "auth": Author [AU]
    - "affl": Affiliation [AD]
    - "jour": Journal [TA]
    - "vol": Volume [VI]
    - "iss": Issue [IP]
    - "page": First page [PG]
    - "pdat": Publication date [DP]
    - "ptyp": Publication type [PT]
    - "lang": Language [LA]
    - "mesh": MeSH term [MH]
    - "majr": MeSH major topic [MAJR]
    - "subh": MeSH subheadings [SH]
    - "mhda": MeSH date [MHDA]
    - "ecno": EC/RN Number [rn]
    - "si": Secondary source ID [SI]
    - "uid": PubMed ID (PMID) [UI]
    - "fltr": Filter [FILTER] [SB]
    - "subs": Subset [SB]
  - reldate: 365
  - mindate: 2001
  - maxdate: 2002/01/01
  - datetype: "edat"
- limit: maximum number of entries to be returned (0 for unlimited; nil for the "retmax" value in the hash or the internal default value (=100))
- step: maximum number of entries retrieved at a time
Returns: array of entry IDs or a number of results
    # File 'lib/bio/io/ncbirest.rb', line 246
    def esearch(str, hash = {}, limit = nil, step = 10000)
      serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
      opts = default_parameters.merge({ "term" => str })
      opts.update(hash)

      case opts["rettype"]
      when "count"
        count = esearch_count(str, opts)
        return count
      else
        retstart = 0
        retstart = hash["retstart"].to_i if hash["retstart"]

        limit ||= hash["retmax"].to_i if hash["retmax"]
        limit ||= 100 # default limit is 100
        limit = esearch_count(str, opts) if limit == 0 # unlimit

        list = []
        0.step(limit, step) do |i|
          retmax = [step, limit - i].min
          opts.update("retmax" => retmax, "retstart" => i + retstart)
          response = ncbi_post_form(serv, opts)
          result = response.body
          list += result.scan(/<Id>(.*?)<\/Id>/m).flatten
        end
        return list
      end
    end
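The paging arithmetic in the listing can be hard to follow inline. The sketch below reproduces only the `retstart`/`retmax` computation, with no network access, to show which windows would be requested for a given `limit`, `step` and starting offset. The helper name `esearch_windows` is hypothetical and not part of BioRuby. Unlike the original loop, it skips a trailing window whose `retmax` would be 0, which occurs when `limit` is an exact multiple of `step`.

```ruby
# Hypothetical helper; BioRuby computes these values inline in esearch.
# Returns one hash of paging parameters per request.
def esearch_windows(limit, step = 10000, retstart = 0)
  windows = []
  0.step(limit, step) do |i|
    retmax = [step, limit - i].min
    next if retmax <= 0 # skip the empty trailing window (sketch-only change)
    windows << { "retstart" => i + retstart, "retmax" => retmax }
  end
  windows
end

esearch_windows(25, 10, 5)
# returns three windows:
#   { "retstart" => 5,  "retmax" => 10 }
#   { "retstart" => 15, "retmax" => 10 }
#   { "retstart" => 25, "retmax" => 5 }
```

With the defaults (`limit` of 100 and `step` of 10000), a single window of 100 entries is produced.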
- (Object) esearch_count(str, hash = {})
Arguments: same as the esearch method
Returns: the number of results (integer)
    # File 'lib/bio/io/ncbirest.rb', line 277
    def esearch_count(str, hash = {})
      serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
      opts = default_parameters.merge({ "term" => str })
      opts.update(hash)
      opts.update("rettype" => "count")
      response = ncbi_post_form(serv, opts)
      result = response.body
      count = result.scan(/<Count>(.*?)<\/Count>/m).flatten.first.to_i
      return count
    end
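Two details of the count extraction are easy to miss: only the first `<Count>` element is used, and a body without any `<Count>` element yields 0 because `nil.to_i` is 0. The sketch below demonstrates both on hand-written input. The helper name `parse_count` and the sample XML are assumptions made for illustration, not BioRuby API or a recorded NCBI response.

```ruby
# Hypothetical helper applying the same expression as esearch_count.
def parse_count(xml)
  # first is nil when nothing matches, and nil.to_i is 0.
  xml.scan(/<Count>(.*?)<\/Count>/m).flatten.first.to_i
end

# Made-up sample with two Count elements; only the first is used.
sample = <<~XML
  <eSearchResult>
    <Count>42</Count>
    <TermSet><Count>7</Count></TermSet>
  </eSearchResult>
XML

parse_count(sample)         # returns 42
parse_count("<error/>")     # returns 0
```

A caller that needs to distinguish "no results" from "unparseable response" would have to inspect the body itself under this approach.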