shp2geocouch

This is an executable rubygem that creates GeoCouch databases from ESRI Shapefiles in one easy step.

Installation

This has been verified to work on Unix/MacOSX. Your mileage may vary if you are on Windows.

You will need access to a GeoCouch server. For instructions on setting up your own local GeoCouch, check out my related blog post. The easiest way to get GeoCouch is the binary distribution from Couchbase.

Install dependencies: unzip, sed, ogr2ogr

If you are on Mac OS X, you will already have unzip and sed installed. To get ogr2ogr, install Homebrew and run brew install gdal.

Note: ogr2ogr is usually included with the gdal package (sometimes called gdal-bin).
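If you want to confirm that the three command-line dependencies are reachable before running the gem, you can look for them on your PATH. The helper below is a rough sketch and is not part of shp2geocouch; the names REQUIRED_TOOLS, executable_on_path? and missing_tools are made up for illustration.

```ruby
# Hypothetical helper, not part of shp2geocouch: reports which of the
# required command-line tools cannot be found on the given PATH.
REQUIRED_TOOLS = %w[unzip sed ogr2ogr].freeze

# True if an executable file called `name` exists in one of the PATH entries.
def executable_on_path?(name, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Returns the subset of `tools` that could not be found.
def missing_tools(tools = REQUIRED_TOOLS, path = ENV.fetch("PATH", ""))
  tools.reject { |tool| executable_on_path?(tool, path) }
end

if __FILE__ == $PROGRAM_NAME
  missing = missing_tools
  if missing.empty?
    puts "All dependencies found."
  else
    puts "Missing: #{missing.join(', ')}"
  end
end
```

If ogr2ogr shows up as missing, installing the gdal package as described above should fix it.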

shp2geocouch depends on the following gems:

  • httparty for sending data to CouchDB
  • couchrest for interacting with CouchDB

Install the shp2geocouch gem:

gem install shp2geocouch

Usage

shp2geocouch <path_to_shapefile> [geocouch-url (optional, default: http://localhost:5984/zip_filename)]

example: shp2geocouch /sweet_geo_data/Portland_Oregon_Public_Libraries.shp

another possible example: shp2geocouch /zipped_shapfiles/US_Railroads.zip http://admin:password@helloworld.couchone.com/railroads
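To make the default URL concrete, here is one way a database URL could be derived from the input filename. This is only a guess at the idea, not the gem's actual code: default_geocouch_url is a hypothetical name, and the lowercasing is an assumption I added because CouchDB only accepts lowercase database names.

```ruby
# Hypothetical sketch: derive a default GeoCouch URL from the input filename.
# Not taken from the shp2geocouch source; the downcase step is an assumption
# motivated by CouchDB's rule that database names must be lowercase.
def default_geocouch_url(path, host = "http://localhost:5984")
  name = File.basename(path, File.extname(path)).downcase
  "#{host}/#{name}"
end

puts default_geocouch_url("/zipped_shapfiles/US_Railroads.zip")
```

Passing an explicit URL as the second argument to shp2geocouch sidesteps any question about what the default will be.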

Once completed, you can run bounding box queries against your data! Visit the following link to receive a dump of your entire dataset as GeoJSON (a bounding box that covers the entire planet): http://localhost:5984/your_db_name/_design/geo/_spatial/full?bbox=-180,-90,180,90
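The same kind of query can be issued from Ruby with nothing but the standard library. The sketch below assumes the design document is called geo and the spatial function is called full, as in the whole-planet example; bbox_url and bbox_query are hypothetical helper names, and the query itself needs a running GeoCouch server.

```ruby
require "uri"
require "net/http"
require "json"

# Builds a bounding box query URL. The "geo" design document and "full"
# spatial function names are assumptions based on the whole-planet example.
# The bbox order is west, south, east, north.
def bbox_url(base, db, west, south, east, north)
  bbox = [west, south, east, north].join(",")
  URI("#{base}/#{db}/_design/geo/_spatial/full?bbox=#{bbox}")
end

# Runs the query and returns the parsed JSON response.
# Requires a reachable GeoCouch server at `base`.
def bbox_query(base, db, west, south, east, north)
  response = Net::HTTP.get_response(bbox_url(base, db, west, south, east, north))
  JSON.parse(response.body)
end

puts bbox_url("http://localhost:5984", "your_db_name", -180, -90, 180, 90)
```

Shrinking the four numbers to the area you care about is what lets GeoCouch hand back only the relevant features.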

shp2geocouch will try to install geocouch-utils into your database by replicating it from my public couch. geocouch-utils includes a basic web based map browser that will allow you to view the data from your Shapefile.

If you uploaded a large dataset, you may experience a delay on the first ever spatial query as GeoCouch builds its spatial index. You can check the status of the indexer by visiting http://localhost:5984/_utils/status.html

Arguments:

  • You can feed in either a .shp file or a .zip that contains a .shp
  • -v for verbose
  • --no-cleanup if you want shp2geocouch to leave you a copy of the unzipped Shapefile and the converted JSON.
  • --chunksize to customize the number of docs that it sends in each POST to CouchDB’s _bulk_docs interface (default is 50)
  • -i to set the CouchDB IDs (_id) based on the given column in the Shapefile. The column is assumed to be unique; no error checking is performed.
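To illustrate what --chunksize and -i control, here is a small sketch of batching documents for _bulk_docs and copying a column into _id. It is not lifted from the gem: bulk_docs_payloads and assign_ids are hypothetical names, and it assumes each document keeps its Shapefile attributes under a "properties" key, the way GeoJSON features do.

```ruby
require "json"

# Splits documents into _bulk_docs request bodies of at most chunk_size docs.
# Each body has the {"docs": [...]} shape that CouchDB's _bulk_docs expects.
def bulk_docs_payloads(docs, chunk_size = 50)
  docs.each_slice(chunk_size).map { |chunk| JSON.generate("docs" => chunk) }
end

# Copies the value of id_column into _id as a string. Assumes attributes live
# under "properties" (an assumption, not confirmed from the gem's source).
# Like -i, this performs no uniqueness check.
def assign_ids(docs, id_column)
  docs.map { |doc| doc.merge("_id" => doc.fetch("properties").fetch(id_column).to_s) }
end

docs = (1..120).map { |i| { "properties" => { "STOP_ID" => i } } }
payloads = bulk_docs_payloads(assign_ids(docs, "STOP_ID"))
puts "#{payloads.size} requests"
```

A larger chunk size means fewer requests but bigger request bodies, so the right value depends on how large your individual features are.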

What?

Raw geographic data is awesome, especially when it is available online from entities like local governments. Unfortunately, a lot of agencies store and release their data in a proprietary format called a Shapefile, which was developed by a large for-profit corporation called ESRI, whose closed source desktop GIS software is ubiquitous in academic and government GIS circles.

GeoCouch is a free and open source geospatial data store that is built on the popular Apache CouchDB database. With GeoCouch, you serve your geographic data ‘a la carte’, or in small relevant chunks, instead of making people download the entire dataset at once. This pattern works especially well on the web, as you wouldn’t want to go giving the list of all 10,000 bus stops to someone on a cellphone that’s requesting just the 5 bus stops that are nearest to their location.

shp2geocouch is a utility that will take a Shapefile and create a new GeoCouch database on your GeoCouch server which will contain the geographic data from the Shapefile.
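The conversion from Shapefile to JSON is where ogr2ogr comes in. The sketch below shows roughly what such a call could look like from Ruby; it is an illustration rather than the gem's actual invocation. ogr2ogr_command is a hypothetical name, and reprojecting to EPSG:4326 (plain longitude/latitude) is my assumption, chosen because the bounding box queries above use degrees.

```ruby
# Hypothetical sketch of a Shapefile to GeoJSON conversion command.
# ogr2ogr takes the destination before the source. The -t_srs EPSG:4326
# reprojection is an assumption, not confirmed from the gem's source.
def ogr2ogr_command(shapefile, geojson)
  ["ogr2ogr", "-f", "GeoJSON", "-t_srs", "EPSG:4326", geojson, shapefile]
end

command = ogr2ogr_command("libraries.shp", "libraries.json")
puts command.join(" ")
# To actually run it (requires ogr2ogr on your PATH):
#   system(*command) or abort("ogr2ogr failed")
```

Passing the arguments to system as separate strings avoids shell quoting problems with filenames that contain spaces.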