Who are these Chimps?
Infochimps is an online data marketplace and repository where anyone can find, share, and sell data.
Infochimps offers two APIs for users to access and modify data:
- a Catalog API to list, show, create, update, and destroy datasets and associated resources on Infochimps
- a Query API to query data from particular rows of these datasets
Chimps is a Ruby library that makes interacting with Infochimps’ APIs as easy as:
require 'rubygems'
require 'chimps'
# Sign up for an Infochimps account and get your keys from
# http://www.infochimps.com/me
Chimps.config[:catalog][:key] = "Your Catalog API key"
Chimps.config[:catalog][:secret] = "Your Catalog API secret"
# list datasets in JSON
Chimps::Request.new("/datasets").get.print
You can integrate Chimps into your web application or into any other Ruby software that talks to Infochimps.
If you’re interested in a command line client built on top of Chimps, try Chimps CLI.
First Steps
Installing Chimps
Chimps is hosted as a gem on RubyGems. You can see your current gem sources with
$ gem sources
If you don’t see http://rubygems.org you’ll have to add it with
$ gem sources -a http://rubygems.org
Then you can install Chimps with
$ sudo gem install chimps
Configuring Chimps
You’ll need a Catalog API key and secret from Infochimps before you can start adding or modifying datasets via the Catalog API. Sign up for an Infochimps account and get your Catalog API key and secret.
To query particular rows from a dataset, you’ll also need to get a Query API key.
You can always explicitly set values in Chimps.config (as in the first example at the top of this README), but you may find it more convenient to keep your keys in a configuration file.
Chimps will look for configuration in two places: /etc/chimps/chimps.yaml and ~/.chimps. Once you’ve registered for the API(s) you can create one of these files. The configuration file looks like
# -*-yaml-*-
# ~/.chimps
:catalog:
  :key: xxxxxxxxxxxxxxxx
  :secret: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
:query:
  :key: xxxxxxxxxxxxxxxxx
Your personal configuration file (~/.chimps) will override the site-configuration file (/etc/chimps/chimps.yaml).
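The override behavior can be pictured with plain Ruby and the standard YAML library. This is only a sketch and not how Chimps itself is implemented: the file contents, the load_config and deep_merge helpers, and the choice to merge nested sections key by key (rather than replace them wholesale) are all assumptions made for illustration.

```ruby
require 'yaml'

# Hypothetical contents of /etc/chimps/chimps.yaml (site-wide settings).
SITE_YAML = <<~CONFIG
  :catalog:
    :key: site-catalog-key
    :secret: site-catalog-secret
  :query:
    :key: site-query-key
CONFIG

# Hypothetical contents of ~/.chimps (personal settings).
USER_YAML = <<~CONFIG
  :catalog:
    :key: my-catalog-key
CONFIG

# Recursively merge two hashes; values from `override` win on conflict.
def deep_merge(base, override)
  base.merge(override) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Parse both documents (keys are Ruby symbols) and layer the personal
# settings on top of the site-wide ones.
def load_config(site_yaml, user_yaml)
  site = YAML.safe_load(site_yaml, permitted_classes: [Symbol])
  user = YAML.safe_load(user_yaml, permitted_classes: [Symbol])
  deep_merge(site, user)
end

config = load_config(SITE_YAML, USER_YAML)
puts config[:catalog][:key]    # taken from the personal file
puts config[:catalog][:secret] # falls back to the site file
puts config[:query][:key]      # falls back to the site file
```

Under these assumptions, a personal file only needs to list the values that differ from the site-wide ones.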
You have to explicitly tell Chimps to read your configuration file when you require it:
require 'rubygems'
require 'chimps'
# reads configuration files
Chimps.boot!
# ..do stuff
Making Requests
Catalog API
You can use the Catalog API to search, list, show, create, update, or destroy datasets and other resources at Infochimps.
If you want to query a dataset of 100,000 baby names rather than create one, skip down to the Query API section below.
You can find a complete list of Catalog API endpoints, expected parameters, return codes, documentation, and authentication requirements at www.infochimps.com/catalog-api.
A Simple GET Request
The Chimps::Request class makes requests against the Catalog API. Create a request by specifying a path on the Infochimps server. The extension of the path determines the MIME type that Infochimps will respond with:
# list datasets
request = Chimps::Request.new('/datasets.json')
response = request.get
response.print
# lotsa JSON...
Understanding the Response
The response above is an instance of Chimps::Response. You can examine response.body, response.code, response.headers, &c.
response.code #=> 200
response.headers #=> Hash of headers
response.body #=> JSON text
Since the response is a structured data format, you can parse it and look inside:
response.parse!
response.size #=> 20
response.each do |dataset|
# do something ...
end
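For a feel of what parsing involves, here is a minimal sketch that uses Ruby's standard JSON library directly instead of Chimps. The sample body and its field names are invented for illustration (the real Catalog API returns more fields, and its exact layout is not shown in this README), and dataset_titles is a hypothetical helper.

```ruby
require 'json'

# Invented stand-in for response.body; real records will look different.
SAMPLE_BODY = <<~JSON
  [
    {"dataset": {"id": 1, "title": "Baby Names"}},
    {"dataset": {"id": 2, "title": "Word Counts"}}
  ]
JSON

# Parse the JSON text and collect the title of every dataset in the list.
def dataset_titles(body)
  JSON.parse(body).map { |item| item['dataset']['title'] }
end

dataset_titles(SAMPLE_BODY).each do |title|
  puts title
end
```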
A Signed GET Request
Many Catalog API calls at Infochimps (like listing datasets, above) do not require the request to be signed in any way. Other requests, such as those that reveal your private information or create new datasets on Infochimps, must be signed with your Catalog API secret. You can see a full list of the available Catalog API endpoints, and whether requests to each need to be signed, at www.infochimps.com/catalog-api.
Assuming you’ve properly configured Chimps (see above), you can ask the request to sign itself:
# list only your datasets -- required to be signed
request = Chimps::Request.new('/my/datasets.json', :sign => true)
response = request.get
response.print
# lotsa JSON...but only about *your* datasets
The authentication mechanism uses the Catalog API secret (which is shared between you and Infochimps) to sign either the query string (for GET and DELETE requests) or the request body (for POST and PUT requests).
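The general idea of signing with a shared secret can be sketched as follows. This is not the scheme Infochimps uses, which is not described in this README: HMAC-SHA256, the signature parameter name, and the sign_query and valid_signature? helpers are all assumptions chosen only to illustrate the technique. Chimps takes care of the real signing when you pass :sign => true.

```ruby
require 'openssl'
require 'uri'

# Build a query string with the keys in a stable order and append an
# HMAC-SHA256 signature computed with the shared secret (illustrative scheme).
def sign_query(params, secret)
  query = URI.encode_www_form(params.sort_by { |key, _value| key.to_s })
  signature = OpenSSL::HMAC.hexdigest('SHA256', secret, query)
  "#{query}&signature=#{signature}"
end

# The receiving side recomputes the signature over the same query string.
# Production code would use a constant-time comparison here instead of ==.
def valid_signature?(signed_query, secret)
  query, _separator, signature = signed_query.rpartition('&signature=')
  OpenSSL::HMAC.hexdigest('SHA256', secret, query) == signature
end

signed = sign_query({ q: 'names', page: 2 }, 'my-shared-secret')
puts signed
puts valid_signature?(signed, 'my-shared-secret')
puts valid_signature?(signed, 'someone-elses-secret')
```

Because both sides hold the secret, the server can verify that the parameters were not altered in transit without the secret itself ever being sent.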
PUT, POST, and DELETE Requests
All POST, PUT, and DELETE requests to Infochimps are required to be signed.
Here’s how you might create a new dataset via a POST request:
# the request body carries the new dataset's attributes; POSTs must be signed
request = Chimps::Request.new('/datasets.json',
  :body => { :dataset => { :title => "My Awesome Dataset", :description => "An amazing description." } },
  :sign => true)
response = request.post
response.code #=> 201
response.print
# your new dataset in JSON...
response.parse!
response['dataset']['id'] #=> 20876
You can find a complete list of what Catalog API endpoints are available and what parameters they take at www.infochimps.com/catalog-api.
Using the Query API
The Chimps::QueryRequest class makes requests against the Query API. It works much like Chimps::Request, except that the path supplied is the path to the corresponding dataset on the Query API.
All QueryRequests will automatically be signed.
request = Chimps::QueryRequest.new('soc/net/tw/trstrank.json', :query_params => { :screen_name => 'infochimps' } )
response = request.get
response.print
#=> {"trstrank":1.75,"user_id":15748351,"tq":96,"screen_name":"infochimps"}
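As a rough picture of what travels over the wire, the sketch below builds a path with a query string from the parameters and parses the sample response shown above using only Ruby's standard library. That :query_params ends up as a URL query string is an assumption, the query_path and parse_record helpers are hypothetical, and the API key and signature that Chimps adds are left out.

```ruby
require 'json'
require 'uri'

# Append URL-encoded parameters to a dataset path (assumed request shape).
def query_path(path, query_params)
  "#{path}?#{URI.encode_www_form(query_params)}"
end

# Turn the JSON text of a Query API response into a Ruby Hash.
def parse_record(body)
  JSON.parse(body)
end

puts query_path('soc/net/tw/trstrank.json', { :screen_name => 'infochimps' })

# The sample response body from the example above.
record = parse_record('{"trstrank":1.75,"user_id":15748351,"tq":96,"screen_name":"infochimps"}')
puts record['screen_name']
puts record['trstrank']
```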
Downloading Data
You can download the data for a dataset on Infochimps by making a signed POST request to obtain a download token and then making a GET request to the signed, expiring URL contained in the token.
Chimps provides a Download class to simplify this for you. Here’s an example:
download = Chimps::Download.new('my-awesome-dataset')
# save the data into the /data directory
download.download('/data')
Uploading Data
Coming soon! For now you will have to upload your data manually through the Infochimps website.
Contributing
Chimps is an open source project created by the Infochimps team to encourage adoption of Infochimps’ Catalog & Query APIs. The official repository is hosted on GitHub at http://github.com/infochimps/chimps
Infochimps encourages you to contribute by cloning Chimps, adding your feature or bugfix, writing a spec, and sending us a pull request.