Class: Bio::NCBI::REST
- Inherits: Object
  - Object
  - Bio::NCBI::REST
- Defined in: lib/bio/io/ncbirest.rb
Overview
Description
The Bio::NCBI::REST class provides a REST client for the NCBI E-Utilities.
Entrez Programming Utilities Help:
- ( redirected from www.ncbi.nlm.nih.gov/entrez/utils/ )
Constant Summary
- NCBI_INTERVAL = 1.0 / 3.0
  Wait for 1/3 second between requests. NCBI’s restriction is: “Make no more than 3 requests every 1 second.”
  NCBI also asks users to run retrieval scripts on weekends or between 9 pm and 5 am Eastern Time on weekdays for any series of more than 100 requests. This is not yet implemented in BioRuby.
- @@last_access = nil
- @@last_access_mutex = nil
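The interval constant together with a last-access timestamp and a mutex suggests a simple throttle. The following is a minimal sketch of that idea, not BioRuby's actual implementation: the `Throttle` class and its `wait` method are hypothetical names introduced here for illustration.

```ruby
# Hypothetical helper (not part of BioRuby): spaces out calls so that
# consecutive requests are at least `interval` seconds apart.
class Throttle
  def initialize(interval)
    @interval = interval
    @last_access = nil        # time of the previous request, if any
    @mutex = Mutex.new        # serializes access across threads
  end

  # Sleeps for the remainder of the interval if needed, then runs the block.
  def wait
    @mutex.synchronize do
      if @last_access
        remaining = @interval - (now - @last_access)
        sleep(remaining) if remaining > 0
      end
      @last_access = now
    end
    yield if block_given?
  end

  private

  # A monotonic clock is unaffected by system clock adjustments.
  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

# Three calls at 1/3 second apart: at most three requests per second.
throttle = Throttle.new(1.0 / 3.0)
3.times { |i| throttle.wait { puts "request #{i}" } }
```

Holding the mutex while sleeping is a deliberate choice in this sketch: it makes concurrent threads queue up rather than fire together once the interval has elapsed.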
Class Method Summary
Instance Method Summary
- #efetch(ids, hash = {}, step = 100) ⇒ Object
  Retrieve database entries by the given IDs using the E-Utils (efetch) service.
- #einfo ⇒ Object
  List the NCBI database names using the E-Utils (einfo) service.
- #esearch(str, hash = {}, limit = nil, step = 10000) ⇒ Object
  Search the NCBI database by the given keywords using the E-Utils (esearch) service and return an array of entry IDs.
- #esearch_count(str, hash = {}) ⇒ Object
  Arguments: same as the esearch method. Returns the number of results.
Class Method Details
.efetch(*args) ⇒ Object
# File 'lib/bio/io/ncbirest.rb', line 391
def self.efetch(*args)
  self.new.efetch(*args)
end
.einfo ⇒ Object
# File 'lib/bio/io/ncbirest.rb', line 379
def self.einfo
  self.new.einfo
end
.esearch(*args) ⇒ Object
# File 'lib/bio/io/ncbirest.rb', line 383
def self.esearch(*args)
  self.new.esearch(*args)
end
.esearch_count(*args) ⇒ Object
# File 'lib/bio/io/ncbirest.rb', line 387
def self.esearch_count(*args)
  self.new.esearch_count(*args)
end
Instance Method Details
#efetch(ids, hash = {}, step = 100) ⇒ Object
Retrieve database entries by the given IDs using the E-Utils (efetch) service.
For information on the possible arguments, see
Usage
ncbi = Bio::NCBI::REST.new
ncbi.efetch("185041", {"db"=>"nucleotide", "rettype"=>"gb", "retmode" => "xml"})
ncbi.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb", "retmode"=>"xml"})
ncbi.efetch("AAA52805", {"db"=>"protein", "rettype"=>"gb"})
Bio::NCBI::REST.efetch("185041", {"db"=>"nucleotide", "rettype"=>"gb", "retmode" => "xml"})
Bio::NCBI::REST.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb"})
Bio::NCBI::REST.efetch("AAA52805", {"db"=>"protein", "rettype"=>"gb"})
Arguments:
- ids: list of NCBI entry IDs (required)
- hash: hash of E-Utils options, e.g. {"db" => "nuccore", "rettype" => "gb"}
  - db: “sequences”, “nucleotide”, “protein”, “pubmed”, “omim”, …
  - retmode: “text”, “xml”, “html”, …
  - rettype: “gb”, “gbc”, “medline”, “count”, …
- step: maximum number of entries retrieved at a time
Returns:
- String
# File 'lib/bio/io/ncbirest.rb', line 355
def efetch(ids, hash = {}, step = 100)
  serv = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
  opts = default_parameters.merge({ "retmode" => "text" })
  opts.update(hash)

  case ids
  when Array
    list = ids
  else
    list = ids.to_s.split(/\s*,\s*/)
  end

  result = ""
  0.step(list.size, step) do |i|
    opts["id"] = list[i, step].join(',')
    unless opts["id"].empty?
      response = ncbi_post_form(serv, opts)
      result += response.body
    end
  end
  return result.strip
  #return result.strip.split(/\n\n+/)
end
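The listing accepts either an Array or a comma-separated string of IDs and sends them in groups of at most `step`. The sketch below isolates just that normalization and batching so it can be tried without network access. `id_batches` is a hypothetical helper, and it uses `each_slice` rather than the `0.step` loop in the listing, which yields the same non-empty batches.

```ruby
# Hypothetical helper (not part of BioRuby): returns the comma-joined
# "id" parameter values that would be sent, one string per request.
def id_batches(ids, step = 100)
  # Accept an Array as-is; otherwise split a string on commas,
  # ignoring whitespace around each comma.
  list = ids.is_a?(Array) ? ids : ids.to_s.split(/\s*,\s*/)
  list.each_slice(step).map { |batch| batch.join(",") }
end

p id_batches("185041, J00231,AAA52805", 2)
# => ["185041,J00231", "AAA52805"]
```

Each returned string corresponds to one POST, so 250 IDs with the default `step` of 100 would result in three requests.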
#einfo ⇒ Object
List the NCBI database names using the E-Utils (einfo) service.
pubmed protein nucleotide nuccore nucgss nucest structure genome
books cancerchromosomes cdd gap domains gene genomeprj gensat geo
gds homologene journals mesh ncbisearch nlmcatalog omia omim pmc
popset probe proteinclusters pcassay pccompound pcsubstance snp
taxonomy toolkit unigene unists
Usage
ncbi = Bio::NCBI::REST.new
ncbi.einfo
Bio::NCBI::REST.einfo
Returns:
- array of strings (database names)
# File 'lib/bio/io/ncbirest.rb', line 218
def einfo
  serv = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi"
  opts = default_parameters.merge({})
  response = ncbi_post_form(serv, opts)
  result = response.body
  list = result.scan(/<DbName>(.*?)<\/DbName>/m).flatten
  return list
end
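The database names are extracted from the response body with a regular expression rather than an XML parser. The sketch below applies the same pattern to a hand-written sample. The XML shown is an assumption made for illustration, not a captured NCBI response, and `db_names` is a hypothetical helper.

```ruby
# Hypothetical helper (not part of BioRuby): pulls every <DbName> value
# out of an einfo-style XML document.
def db_names(xml)
  # scan returns one single-element array per match; flatten unwraps them.
  xml.scan(/<DbName>(.*?)<\/DbName>/m).flatten
end

# Hand-written sample body, assumed to resemble an einfo response.
sample = <<~XML
  <eInfoResult>
    <DbList>
      <DbName>pubmed</DbName>
      <DbName>protein</DbName>
      <DbName>nuccore</DbName>
    </DbList>
  </eInfoResult>
XML

p db_names(sample)
# => ["pubmed", "protein", "nuccore"]
```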
#esearch(str, hash = {}, limit = nil, step = 10000) ⇒ Object
Search the NCBI database by the given keywords using the E-Utils (esearch) service and return an array of entry IDs.
For information on the possible arguments, see
- ( redirected from eutils.ncbi.nlm.nih.gov/books/n/helpeutils/chapter4/#chapter4.ESearch )
- ( redirected from eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html )
Usage
ncbi = Bio::NCBI::REST.new
ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
ncbi.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})
Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
Bio::NCBI::REST.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})
Arguments:
- str: query string (required)
- hash: hash of E-Utils options, e.g. {"db" => "nuccore", "rettype" => "gb"}
  - db: “sequences”, “nucleotide”, “protein”, “pubmed”, “taxonomy”, …
  - retmode: “text”, “xml”, “html”, …
  - rettype: “gb”, “medline”, “count”, …
  - retmax: integer (default 100)
  - retstart: integer
  - field:
    - “titl”: Title [TI]
    - “tiab”: Title/Abstract [TIAB]
    - “word”: Text words [TW]
    - “auth”: Author [AU]
    - “affl”: Affiliation [AD]
    - “jour”: Journal [TA]
    - “vol”: Volume [VI]
    - “iss”: Issue [IP]
    - “page”: First page [PG]
    - “pdat”: Publication date [DP]
    - “ptyp”: Publication type [PT]
    - “lang”: Language [LA]
    - “mesh”: MeSH term [MH]
    - “majr”: MeSH major topic [MAJR]
    - “subh”: MeSH subheadings [SH]
    - “mhda”: MeSH date [MHDA]
    - “ecno”: EC/RN Number [RN]
    - “si”: Secondary source ID [SI]
    - “uid”: PubMed ID (PMID) [UI]
    - “fltr”: Filter [FILTER] [SB]
    - “subs”: Subset [SB]
  - reldate: 365
  - mindate: 2001
  - maxdate: 2002/01/01
  - datetype: “edat”
- limit: maximum number of entries to be returned (0 for unlimited; nil to use the “retmax” value in the hash or, if that is absent, the internal default of 100)
- step: maximum number of entries retrieved at a time
Returns:
- array of entry IDs, or the number of results when “rettype” is “count”
# File 'lib/bio/io/ncbirest.rb', line 286
def esearch(str, hash = {}, limit = nil, step = 10000)
  serv = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
  opts = default_parameters.merge({ "term" => str })
  opts.update(hash)

  case opts["rettype"]
  when "count"
    count = esearch_count(str, opts)
    return count
  else
    retstart = 0
    retstart = hash["retstart"].to_i if hash["retstart"]

    limit ||= hash["retmax"].to_i if hash["retmax"]
    limit ||= 100 # default limit is 100
    limit = esearch_count(str, opts) if limit == 0   # unlimit

    list = []
    0.step(limit, step) do |i|
      retmax = [step, limit - i].min
      opts.update("retmax" => retmax, "retstart" => i + retstart)
      response = ncbi_post_form(serv, opts)
      result = response.body
      list += result.scan(/<Id>(.*?)<\/Id>/m).flatten
    end
    return list
  end
end
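The interplay of `limit`, `step`, and `retstart` is easier to follow when the request windows are computed on their own. The sketch below lists the `retstart`/`retmax` pairs a paged search would send. `search_windows` is a hypothetical helper, and it deliberately iterates up to `limit - 1` instead of `limit`, so unlike the loop in the listing it produces no trailing window with a `retmax` of 0 when `limit` is an exact multiple of `step`.

```ruby
# Hypothetical helper (not part of BioRuby): computes the paging
# parameters for fetching `limit` results in chunks of `step`,
# starting at offset `retstart`.
def search_windows(limit, step = 10_000, retstart = 0)
  windows = []
  0.step(limit - 1, step) do |i|
    windows << {
      "retstart" => i + retstart,        # offset of this chunk
      "retmax"   => [step, limit - i].min # never ask for more than remains
    }
  end
  windows
end

p search_windows(25, 10)
# => [{"retstart"=>0, "retmax"=>10},
#     {"retstart"=>10, "retmax"=>10},
#     {"retstart"=>20, "retmax"=>5}]
```

With the defaults in the listing (`limit` of 100, `step` of 10000) this yields a single window, so paging only matters for large or unlimited searches.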
#esearch_count(str, hash = {}) ⇒ Object
Arguments:
- same as the esearch method
Returns:
- number of results
# File 'lib/bio/io/ncbirest.rb', line 317
def esearch_count(str, hash = {})
  serv = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
  opts = default_parameters.merge({ "term" => str })
  opts.update(hash)
  opts.update("rettype" => "count")
  response = ncbi_post_form(serv, opts)
  result = response.body
  count = result.scan(/<Count>(.*?)<\/Count>/m).flatten.first.to_i
  return count
end
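The count is taken from the first `<Count>` element in the response body. The sketch below shows that extraction on a hand-written sample containing more than one `<Count>` element. The XML is an assumption for illustration rather than a captured response, and `parse_count` is a hypothetical helper.

```ruby
# Hypothetical helper (not part of BioRuby): returns the first <Count>
# value as an Integer, or 0 when no <Count> element is present
# (because nil.to_i is 0).
def parse_count(xml)
  xml.scan(/<Count>(.*?)<\/Count>/m).flatten.first.to_i
end

# Hand-written sample: the first <Count> is the overall total,
# a later one belongs to a nested element.
sample = <<~XML
  <eSearchResult>
    <Count>42</Count>
    <TranslationStack>
      <TermSet><Count>7</Count></TermSet>
    </TranslationStack>
  </eSearchResult>
XML

p parse_count(sample)
# => 42
```

Because a missing element also yields 0, a caller that needs to distinguish "no hits" from "unexpected response" would have to inspect the body separately.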