Entrez

Entrez is a simple Ruby API for making HTTP requests to the NCBI Entrez utilities (eutils: eutils.ncbi.nlm.nih.gov/).

Installation

gem install entrez

or if you use Bundler:

# Gemfile
gem 'entrez'

It requires httparty.

See the ‘Email & Tool’ section below for setup.

Usage

Supported Utilities

  • EFetch

  • EInfo

  • ESearch

  • ESummary

Not yet implemented

  • EPost

  • ELink

  • EGQuery

  • ESpell

You can copy and paste the resulting request URLs into a browser to see what the response would be.

EFetch, ESummary

args:

ESummary takes the same arguments.

ESearch

args:

The response has a convenience method to retrieve the parsed ids.

response = Entrez.ESearch('genomeprj', {WORD: 'hapmap', SEQS: 'inprogress'}, retmode: :xml)
response.ids #=> [1, 2, ...]
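To illustrate what a convenience method like ids has to do, here is a minimal sketch that pulls the <Id> values out of an ESearch XML body. The helper name extract_ids and the sample ids are hypothetical, and this is not the gem's implementation; it only assumes the standard eSearchResult/IdList/Id layout that ESearch returns in XML mode.

```ruby
# Hypothetical helper, not part of the entrez gem: collects the numeric
# <Id> values from an ESearch XML response body.
def extract_ids(xml)
  xml.scan(%r{<Id>(\d+)</Id>}).flatten.map(&:to_i)
end

# Illustrative response body; the ids are made up for this example.
sample = <<~XML
  <eSearchResult>
    <Count>2</Count>
    <IdList>
      <Id>28911</Id>
      <Id>9586</Id>
    </IdList>
  </eSearchResult>
XML

p extract_ids(sample)
```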

Customized Queries

If your query is more complex, with ANDs and ORs, you can build it yourself. Entrez.convert_search_term_hash() helps here: it converts a hash into a valid Entrez search string, joined with the operator of your choosing. If you pass in the OR operator, the returned search string is wrapped in a set of parentheses.
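The behaviour described above can be sketched with a small standalone helper. The name build_search_term is hypothetical and this is not the gem's code; the value[FIELD] form is standard Entrez syntax, while joining with "+" (a URL-encoded space) is an assumption made for this example.

```ruby
# Hypothetical helper, not part of the entrez gem: joins field/value pairs
# into an Entrez-style search string using the given operator.
def build_search_term(terms, operator = 'AND')
  joined = terms.map { |field, value| "#{value}[#{field}]" }.join("+#{operator}+")
  # Wrap OR groups in parentheses so they can be combined with AND clauses.
  operator == 'OR' ? "(#{joined})" : joined
end

and_part = build_search_term({ WORD: 'hapmap', SEQS: 'inprogress' })
or_part  = build_search_term({ WORD: 'low coverage', SEQS: 'inprogress' }, 'OR')

puts and_part
puts or_part
```

The parentheses matter once an OR group is combined with other clauses, since they keep the group together as a single operand.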

EInfo

args:

Email & Tool

NCBI asks that you supply the tool you are using and your email. The Entrez gem uses ‘ruby’ as the tool. The email is read from the ENTREZ_EMAIL environment variable on your computer. I set mine in my ~/.bash_profile:

export ENTREZ_EMAIL='you@example.com'
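As a rough sketch of how these two values end up in a request, the snippet below merges them into a set of query parameters. The helper name with_identity is hypothetical and this is not the gem's code; tool and email are the parameter names E-utilities accepts, and the env argument exists only so the example runs without touching your real environment.

```ruby
require 'uri'

# Hypothetical helper, not part of the entrez gem: adds NCBI's `tool` and
# `email` parameters to a hash of query parameters.
def with_identity(params, env = ENV)
  # fetch raises KeyError when ENTREZ_EMAIL is missing (a choice made for
  # this sketch, not necessarily what the gem does).
  params.merge(tool: 'ruby', email: env.fetch('ENTREZ_EMAIL'))
end

query = with_identity({ db: 'genomeprj', retmode: 'xml' },
                      { 'ENTREZ_EMAIL' => 'you@example.com' })
puts URI.encode_www_form(query)
```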

NCBI query limits

NCBI recommends no more than 3 URL requests per second: www.ncbi.nlm.nih.gov/books/NBK25497/#chapter2.Usage_Guidelines_and_Requiremen

This gem respects this limit. It will delay the next request if the last 3 were made within 1 second. The delay is no longer than necessary to keep the next request “respectful”.
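One way to picture this kind of limit is a small sliding-window limiter that remembers the timestamps of the last 3 requests and sleeps only for the remainder of the 1 second window. The class QueryLimiter and its clock/sleeper arguments are hypothetical, introduced for illustration; this is a sketch of the idea, not the gem's implementation.

```ruby
# Hypothetical sketch, not part of the entrez gem: allows at most `max`
# requests per `window` seconds, sleeping only as long as needed.
class QueryLimiter
  def initialize(max: 3, window: 1.0,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @max = max
    @window = window
    @clock = clock
    @sleeper = sleeper
    @times = [] # timestamps of the most recent requests, oldest first
  end

  # Seconds to wait before another request would stay within the limit.
  def delay_needed
    return 0.0 if @times.size < @max
    elapsed = @clock.call - @times.first
    [@window - elapsed, 0.0].max
  end

  # Waits if necessary, records the request time, then runs the block.
  def request
    wait = delay_needed
    @sleeper.call(wait) if wait > 0
    @times << @clock.call
    @times.shift while @times.size > @max
    yield
  end
end

limiter = QueryLimiter.new
4.times { |i| limiter.request { puts "request #{i + 1}" } }
```

The first three requests run immediately; the fourth waits until one second has passed since the first. The clock and sleeper are injectable so the behaviour can be checked in tests without real delays.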

Ignore query limits for testing

If you use something like FakeWeb for testing, and you don’t want to slow down your tests, tell Entrez to ignore the query limit:

require 'entrez/spec_helpers'
it 'does something that I promise will not bother NCBI' do
  Entrez.ignore_query_limit do
    # Anything that happens within this block will ignore the query limit.
    # So make sure you do not actually request queries from NCBI.
    # For example:
    FakeWeb.allow_net_connect = false
  end
  # Query limits are respected again outside of the block.
end

Compatibility

test.rubygems.org/gems/entrez