Chimps on the Command Line
Infochimps is an online data marketplace and repository where anyone can find, share, and sell data.
Infochimps offers two APIs for users to access and modify data:
- a Dataset API to list, show, create, update, and destroy datasets and associated resources on Infochimps
- a Query API to query data from particular rows of these datasets
Chimps is a Ruby wrapper for these APIs that makes interacting with them simple. You can embed Chimps inside your web application or any other software you write.
But if you find yourself wishing that you could make queries, create datasets, &c. from your command line, where you already live, where you already keep your data…then Chimps CLI is for you:
# See your datasets
$ chimps list --my
...
# Create a new dataset
$ chimps create title="A Brand New Dataset" description="That I created in 2 minutes. But that doesn't mean it's not awesome."
...
# Take a look
$ chimps show a-brand-new-dataset
# Check out your competition
$ chimps search awesome data
...
# Hmmm. Better do some more work.
$ chimps update a-brand-new-dataset tag_list="awesome,new,data"
First Steps
Installing Chimps CLI
Assuming you’ve already set up your Gem sources, just run
gem install chimps-cli
This will also install Chimps if it’s not already present on your system.
Configuring Chimps
Chimps CLI is just a command-line wrapper for the Chimps library. If Chimps is already properly configured with your API credentials then Chimps CLI will read them just fine without you having to do anything.
If you need to obtain API keys for either the Dataset API or the Query API then sign up at Infochimps.
You’ll need to put your API keys into one of two files, either /etc/chimps/chimps.yaml or ~/.chimps. See the README for Chimps for more details on how to set up these configuration files.
Usage
Try running
$ chimps help
to make sure you can run the chimps command and to see an overview of what subcommands are available. You can get more detailed help as well as example usage on COMMAND by running
$ chimps help COMMAND
You can check whether your credentials are valid using the test command:
$ chimps test
Authenticated as user 'Infochimps' for Infochimps Dataset API at http://www.infochimps.com
Authenticated for Infochimps Query API at http://api.infochimps.com
If you get messages about missing keys and so on, go back and read the Chimps installation instructions.
If you get messages about not being able to authenticate, double-check that the API keys in your configuration file (either ~/.chimps or /etc/chimps/chimps.yaml) match the credentials listed in your profile.
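Before digging into authentication errors, it can help to confirm that one of the two configuration files actually exists. The following is only a small sketch: the function name find_chimps_config is made up for this example, and the only things taken from this README are the two file paths.

```shell
# Report which of the two configuration files named above exist.
# find_chimps_config is a hypothetical helper, not part of Chimps itself.
find_chimps_config() {
  found=1
  for path in "$HOME/.chimps" /etc/chimps/chimps.yaml; do
    if [ -f "$path" ]; then
      echo "found: $path"
      found=0
    fi
  done
  if [ "$found" -ne 0 ]; then
    echo "no Chimps configuration file found" >&2
  fi
  # Exit status 0 if at least one file exists, 1 otherwise.
  return "$found"
}
```

If the function reports no file, create one as described in the Chimps README before running chimps test again.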
Options
Commands to chimps accept arguments as well as options. Options always begin with two dashes and some options have single-letter flags as well.
Some options work for every chimps command. --verbose (-v), for example, is a great way to see what underlying HTTP request(s) a given command is making.
Operating on a Dataset, Source, License, &c.
Many requests can operate on a particular resource. The show command, for example, can be used to show a dataset (the default choice), a license, a source, or a user.
You can see what resources COMMAND can operate on with chimps help COMMAND. Some examples:
# Will attempt to show the Dataset 'an-example'
$ chimps show an-example
# Will attempt to show the Source 'an-example'
$ chimps show source an-example
# Will search datasets for 'stocks'
$ chimps search stocks
# Will list all licenses
$ chimps list licenses
Providing Data to a Command
Some commands (typically those that result in HTTP GET and DELETE requests) don’t require you to pass any data to Infochimps.
Other commands (typically those that result in HTTP POST and PUT requests) do. These commands usually create or modify a dataset or other resource at Infochimps.
Say you wanted to create a new dataset on Infochimps with the title “List of hottest Salsas” and with description “All salsas were tried personally by me.”
There are two methods you can use to pass this data to a Chimps CLI command:
1) You can put the data you need to pass into a file on disk. Chimps understands the YAML and JSON file formats and will automatically parse and serialize them properly when making a request. You could create the following file
# in salsa_dataset.yml
---
title: "List of hottest Salsas"
description: |-
All salsas were tried personally by me.
and you can create the dataset with
$ chimps create --data=salsa_dataset.yml
2) You can pass parameters and values directly on the command line. You could create the same dataset as above with
$ chimps create title="List of hottest Salsas" description="All salsas were tried personally by me."
This will only work for a flat collection of parameters and values, as in this example. If you need to pass a nested data structure you should use a file and the --data option above.
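Since Chimps is described above as understanding JSON as well as YAML, the same dataset could presumably be written as a JSON file instead. This is a sketch under assumptions: the file name salsa_dataset.json is arbitrary, and it assumes Chimps recognizes the JSON format on its own when given the file through --data.

```shell
# Write the same two fields as salsa_dataset.yml, but as JSON.
# The file name is an arbitrary choice for this example.
cat > salsa_dataset.json <<'EOF'
{
  "title": "List of hottest Salsas",
  "description": "All salsas were tried personally by me."
}
EOF

# Only attempt the request when the chimps command is installed.
if command -v chimps >/dev/null 2>&1; then
  chimps create --data=salsa_dataset.json
else
  echo "chimps not found; wrote salsa_dataset.json only"
fi
```

Writing the file with a quoted heredoc ('EOF') keeps the shell from expanding anything inside the JSON, which matters once descriptions contain characters like $ or backticks.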
Here is another example, which sends a request to the Query API and returns demographics on an IP address:
$ chimps query web/an/ip_census ip=67.78.118.7
Basic HTTP Verbs
Infochimps’ Dataset API is RESTful so it respects the semantics of HTTP verbs. You can use this “lower-level” interface to make simple GET, POST, PUT, and DELETE requests.
Here’s how to return information on a Yahoo! Stocks dataset
$ chimps get /datasets/yahoo-stock-search
The default response will be in JSON but you can change the response format by explicitly passing one of xml, json, or yaml. This works for (almost) all Dataset API requests.
$ chimps get /datasets/yahoo-stock-search --response_format=yaml
Try running chimps help for the get command (chimps help get) as well as for the post, put, and delete commands.
Signed vs. Unsigned Requests
Some requests, like the GET request above, don’t need to be signed in any way: using chimps to make a simple unsigned GET request is no different from doing it with curl.
All POST, PUT, and DELETE requests, however, need to be signed, and Chimps makes that easy. Here’s how you might create a dataset
$ chimps post --sign /datasets title="My dataset" description="Some text..."
If you leave out the --sign option then the request will fail with a 401 Authentication error.
The above request is really just the same as
$ chimps create title="My dataset" description="Some text..."
which is a little simpler because create understands what you’re trying to do and internally constructs the appropriate POST request.
You can find a list of all available requests, the correct HTTP verb to use, whether the request needs to be signed, and what parameters it accepts at www.infochimps.com/apis.
To round out this section, here’s an example of a PUT request and a DELETE request (both of which must be signed):
# Update your existing dataset
$ chimps put --sign /datasets/my-dataset title="A New Title"
# Let's delete this dataset now because we're fickle little monkeys...
$ chimps delete --sign /datasets/my-dataset
Most things you might want to do with this “low-level” HTTP verb interface can be done with specialized chimps commands. Read on.
Core REST Actions
Since the Infochimps Dataset API is RESTful, it implements list, show, create, update, and destroy actions for all resources. Each of these actions has a corresponding Chimps command.
Here’s how to list datasets:
$ chimps list
The list command is one of a few (search being another) that accept the --my (-m) option. This will restrict the output to only datasets (or whatever resource you’re listing) that are owned by you.
$ chimps list --my
$ chimps list --my licenses
Here’s how to show a dataset:
$ chimps show my-dataset
This returns YAML by default but you can specify a different response format by passing the --response_format option
$ chimps show my-dataset --response_format=json
You’ve already seen create in action a few times so here’s update instead
$ chimps update my-dataset title="A new title"
And of course destroy
$ chimps destroy my-dataset
If you’re curious about the underlying HTTP requests being sent, try running these commands with the --verbose (-v) flag.
Special Requests
Chimps CLI has a few special commands which aren’t HTTP verbs or core REST actions.
Search
Here’s how to search Infochimps for datasets about music:
$ chimps search music
Here’s the same search restricted to only datasets you own and pretty-printed:
$ chimps search --my music --pretty
Download
If a dataset on Infochimps has a downloadable package then the download command can be used to download the data:
$ chimps download daily-1970-2010-open-close-hi-low-and-volume-nyse-exchange
The dataset must be free, you must own it, or you must have purchased it (through the website) before you can download it with Chimps.
You may want to include the --verbose (-v) flag so that you can see the progress of the download, especially if it is a large file.
Upload
Infochimps does not presently allow you to upload data by using an API. Please create a dataset first (you can do this with Chimps) and then go to that dataset’s page in a browser and upload any data you wish.
This feature will be coming very, very soon!
Help/Test
chimps help and chimps help COMMAND should carry you a good ways with the examples and usage they output.
chimps test should confirm that your API keys are properly configured.
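If you script several chimps commands together, one option is to gate them behind chimps test so a misconfigured key fails early. The wrapper below is a sketch only: the name chimps_checked is invented for this example, and it assumes that chimps test exits with a non-zero status when credentials are missing or invalid, which this README does not state.

```shell
# Run a chimps subcommand only if `chimps test` succeeds first.
# chimps_checked is a hypothetical helper, not part of Chimps CLI.
# Assumption: `chimps test` exits non-zero when credentials are bad.
chimps_checked() {
  if ! chimps test >/dev/null 2>&1; then
    echo "chimps test failed; check your API keys" >&2
    return 1
  fi
  # Credentials look fine, so pass all arguments through unchanged.
  chimps "$@"
}
```

You would then call it the same way as chimps itself, for example chimps_checked list --my.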
Contributing
Chimps CLI is an open source project created by the Infochimps team to encourage adoption of the Infochimps APIs. The official repository is hosted on GitHub at http://github.com/infochimps/chimps-cli
Feel free to clone it and send pull requests.