Who are these Chimps?

Infochimps is an online data marketplace and repository where anyone can find, share, and sell data.

Infochimps offers two APIs for users to access and modify data

  • a Catalog API to list, show, create, update, and destroy datasets and associated resources on Infochimps

  • a Query API to query data from particular rows of these datasets

Chimps is a Ruby library that makes interacting with Infochimps’ APIs as easy as

require 'rubygems'
require 'chimps'

# Sign up for an Infochimps account and get your keys from
# http://www.infochimps.com/me
Chimps.config[:catalog][:key]    = "Your Catalog API key"
Chimps.config[:catalog][:secret] = "Your Catalog API secret"

# list datasets in JSON
Chimps::Request.new("/datasets").get.print

You can use Chimps into your web application or into any other Ruby software that talks to Infochimps.

If you’re interested in a command line client built on top of Chimps, try Chimps CLI.

First Steps

Installing Chimps

Chimps is hosted as a gem on RubyGems. You can see your current gem sources with

$ gem sources

If you don’t see http://rubygems.org you’ll have to add it with

$ gem sources -a http://rubygems.org

Then you can install Chimps with

$ sudo gem install chimps

Configuring Chimps

You’ll need a Dataset API key and secret from Infochimps before you can start adding or modifying datasets via the Dataset API. Sign up for an Infochimps account and get your Catalog API key.

To query particular rows from a dataset, you’ll also need to get a Query API key.

You can always explicitly set values in Chimps.config like in the first example at the top of this README but you may find it more convenient to keep your keys in a configuration file.

Chimps will look for configuration in two places: /etc/chimps/chimps.yaml and ~/.chimps. Once you’ve registered for the API(s) you can create one of these files. The configuration file looks like

# -*-yaml-*-
# ~/.chimps
:catalog:
  :key:      xxxxxxxxxxxxxxxx
  :secret:   xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
:query:
  :key:      xxxxxxxxxxxxxxxxx

Your personal configuration file (~/.chimps) will override the site-configuration file (/etc/chimps/chimps.yaml).

You have to explicitly tell Chimps to read your configuration file when you require it:

require 'rubygems'
require 'chimps'

# reads configuration files
Chimps.boot!

# ..do stuff

Making Requests

Catalog API

You can use the Catalog API to search, list, show, create, update, or destroy datasets and other resources at Infochimps.

If instead of creating a dataset with 100,000 baby names you want to query a dataset of 100,000 baby names then you should skip down to the Query API section below.

You can find a complete list of Catalog API endpoints, expected parameters, return codes, documentation, and authentication requirements at www.infochimps.com/catalog-api.

A Simple GET Request

The Chimps::Request class makes requests against the Catalog API. Create a request by specifying a path on the Infochimps server. The extension of the path determines the MIME type that Infochimps will respond with:

# list datasets
request  = Chimps::Request.new('/datasets.json')
response = request.get
response.print
# lotsa JSON...

Understanding the Response

The response above is an instance of Chimps::Response. You can examine response.body<tt>, <tt>response.code, response.headers, &c.

response.code    #=> 200
response.headers #=> Hash of headers
response.body    #=> JSON text

Since the response is a structured data format, you can parse it and look inside:

response.parse!
response.size #=> 20
response.each do |dataset|
  # do something ...
end

A Signed GET Request

Many Catalog API calls at Infochimps (like listing datasets, above) do not require the request to be signed in any way. Other requests, which reveal your private information or create new datasets, &c. on Infochimps will require your Catalog API secret to sign. You can see a full list of the Catalog API endpoints available and whether or not requests to each need to be signed at www.infochimps.com/catalog-api.

Assuming you’ve properly configured Chimps (see above) you ask the request to sign itself.

# list only your datasets -- required to be signed
request  = Chimps::Request.new('/my/datasets.json', :sign => true)
response = request.get
response.print
# lotsa JSON...but only about *your* datasets

The authentication mechanism uses the Catalog API secret (which is shared between you and Infochimps) to sign either the query string (for the case of GET and DELETE requests) or the request body (for POST and PUT requests).

PUT, POST, and DELETE Requests

All POST, PUT, and DELETE requests to Infochimps are required to be signed.

Here’s how to you might create a new dataset via a POST request:

request  = Chimps::Request('/datasets.json', :body => { :dataset => { :title => "My Awesome Dataset", :description => "An amazing description." }}, :sign => true)
response = request.post
response.code #=> 201
response.print
# your new dataset in JSON...
response.parse!
response['dataset']['id'] #=> 20876

You can find a complete list of what Catalog API endpoints are available and what parameters they take at www.infochimps.com/catalog-api.

Using the Query API

The Chimps::QueryRequest class makes requests against the Query API. It works just the similarly to the Chimps::Request except that the path supplied is the path to the corresponding dataset on the Query API.

All QueryRequests will automatically be signed.

request  = Chimps::QueryRequest.new('soc/net/tw/trstrank.json', :query_params => { :screen_name => 'infochimps' } )
response = request.get
response.print
#=> {"trstrank":1.75,"user_id":15748351,"tq":96,"screen_name":"infochimps"}

Downloading Data

You can download the data for a dataset on Infochimps by making a signed POST request to obtain a download token and then making a GET request to the signed and expired URL contained in the token.

Chimps provides a Download class to simplify this for you. Here’s an example.

download = Chimps::Download.new('my-awesome-dataset')
# save your data directory
download.download('/data')

Uploading Data

Coming soon! For now you will have to upload your data manually through the Infochimps website.

Contributing

Chimps is an open source project created by the Infochimps team to encourage adoption of Infochimps’ Catalog & Query APIs. The official repository is hosted on GitHub

http://github.com/infochimps/chimps

Infochimps encourages you to contribute by cloning Chimps, adding your feature or bugfix, writing a spec, and sending us a pull request.