ckanext-datajson
A CKAN extension to generate the /data.json file and to harvest data
sources from a remote /data.json file according to the U.S. Project
Open Data metadata specification (http://project-open-data.github.io/).
This plugin creates a new view at /data.json (or other configurable
path) that outputs the contents of the data catalog in the Project
Open Data JSON metadata format. It also creates a view at /data.jsonld
which outputs the same in JSON-LD format.
The plugin also provides a harvester to import datasets from other
remote /data.json files. See below for setup instructions.
Finally, the plugin provides a view for validating /data.json files
at http://ckanhostname/pod/validate.
This module assumes metadata is stored in CKAN in the way we do it
on http://hub.healthdata.gov. If you’re storing metadata under different
key names, you’ll have to revise ckanext/datajson/plugin.py accordingly.
Installation
To install, activate your CKAN virtualenv, install dependencies, and
install the module in develop mode, which just puts the directory in your
Python path.
. path/to/pyenv/bin/activate
pip install -r pip-requirements.txt
python setup.py develop
Then in your CKAN .ini file, add “datajson” to your ckan.plugins line:
ckan.plugins = (other plugins here...) datajson
That enables the /data.json output. To make the harvester available,
also add the harvest and datajson_harvest plugins:
ckan.plugins = (other plugins here...) harvest datajson_harvest
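To confirm that the plugin names ended up on the ckan.plugins line, a small check like the one below may help. It is only a sketch and not part of the extension: the function name enabled_plugins and the sample text are hypothetical, and it assumes the setting lives in the [app:main] section, as in a standard CKAN .ini file.

```python
import configparser


def enabled_plugins(ini_text, section="app:main"):
    """Return the plugin names listed on the ckan.plugins line."""
    # Interpolation is disabled because CKAN .ini files often contain
    # %(here)s placeholders that configparser cannot resolve on its own.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    # An absent section or option yields an empty list instead of an error.
    return parser.get(section, "ckan.plugins", fallback="").split()


# Illustrative .ini fragment (hypothetical); read your real file instead.
sample = """
[app:main]
ckan.plugins = stats harvest datajson datajson_harvest
"""

plugins = enabled_plugins(sample)
wanted = ("datajson", "harvest", "datajson_harvest")
missing = [name for name in wanted if name not in plugins]
print("missing plugins:", missing or "none")
```

To run it against a real configuration, pass the contents of your .ini file (for example open(path).read()) in place of sample.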
If you’re running CKAN via WSGI, we found a strange Python dependency
bug that might only affect development environments. The fix was to
revise wsgi.py and add:
import ckanext
before
from paste.deploy import loadapp
Then restart your server and check out:
http://yourdomain.com/data.json
and
http://yourdomain.com/data.jsonld
and
http://yourdomain.com/pod/validate
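If you would rather script that check than open each URL by hand, something like the following could work. It is a minimal sketch under a few assumptions: the helper names endpoint_urls and check_endpoints are hypothetical, the paths are the defaults listed above (adjust /data.json if you configured another path), and only the Python standard library is used.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Default paths from this README; change them if your site uses other paths.
PATHS = ["/data.json", "/data.jsonld", "/pod/validate"]


def endpoint_urls(base_url):
    """Build the URLs to check from the site's base URL."""
    base = base_url.rstrip("/")
    return [base + path for path in PATHS]


def check_endpoints(base_url, timeout=10):
    """Request each URL and map it to an HTTP status code or an error message."""
    results = {}
    for url in endpoint_urls(base_url):
        try:
            with urlopen(url, timeout=timeout) as response:
                results[url] = response.status
        except URLError as exc:
            # HTTPError (4xx/5xx responses) is a subclass of URLError.
            results[url] = str(exc)
    return results


for url in endpoint_urls("http://yourdomain.com"):
    print(url)
```

Calling check_endpoints("http://yourdomain.com") after a restart performs the actual requests; it is kept separate so that building the URLs has no network side effects.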
Caching The Response
If you’re deploying inside Apache, some caching would be a good idea
because