ckanext-datajson
A CKAN extension to generate the /data.json file and to harvest data sources from a remote /data.json file according to the U.S. Project Open Data metadata specification (http://project-open-data.github.io/).
This plugin creates a new view at /data.json (or other configurable path) that outputs the contents of the data catalog in the Project Open Data JSON metadata format. It also creates a view at /data.jsonld which outputs the same in JSON-LD format.
The plugin also provides a harvester to import datasets from other remote /data.json files. See below for setup instructions.
Finally, the plugin adds a view at http://ckanhostname/pod/validate for validating /data.json files.
This module assumes metadata is stored in CKAN the same way it is stored on http://hub.healthdata.gov. If you’re storing metadata under different key names, you’ll have to revise ckanext/datajson/plugin.py accordingly.
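As a rough illustration of consuming the /data.json output described above, here is a minimal Python sketch that parses a catalog and lists its datasets. It assumes the catalog is either a bare JSON array of dataset objects or a JSON object with a "dataset" array, the two layouts used by different versions of the Project Open Data schema. The function name load_datasets and the sample records are made up for this example and are not part of the plugin.

```python
import json


def load_datasets(text):
    """Return the list of dataset dicts from a data.json document.

    Hypothetical helper: accepts a bare JSON array of datasets or an
    object that wraps the datasets in a "dataset" array.
    """
    catalog = json.loads(text)
    if isinstance(catalog, dict):
        return catalog.get("dataset", [])
    return catalog


# Made-up sample records, for illustration only.
SAMPLE = """[
  {"title": "Example Dataset A", "identifier": "example-a", "keyword": ["health"]},
  {"title": "Example Dataset B", "identifier": "example-b", "keyword": ["health", "survey"]}
]"""

for dataset in load_datasets(SAMPLE):
    print(dataset["identifier"], dataset["title"])
```

To run this against a live site, you would fetch the document first (for example with urllib.request.urlopen) and pass the decoded body to load_datasets.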
Installation
To install, activate your CKAN virtualenv, install dependencies, and install the module in develop mode, which just puts the directory in your Python path.
. path/to/pyenv/bin/activate
pip install -r pip-requirements.txt
python setup.py develop
Then in your CKAN .ini file, add “datajson” to your ckan.plugins line:
ckan.plugins = (other plugins here...) datajson
To make the harvester available, also add “harvest” and “datajson_harvest” to the same ckan.plugins line:
ckan.plugins = (other plugins here...) harvest datajson_harvest
Options
The following settings can be added to your CKAN .ini file:
ckanext.datajson.path = /data.json
ckanext.datajsonld.path = /data.jsonld
ckanext.datajsonld.id = http://www.youragency.gov/data.json
ckanext.datajson.default_contactpoint = Health Data Initiative
ckanext.datajson.default_mbox = Healthdata@example.hhs.gov
ckanext.datajson.default_keywords = health
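To double-check an .ini file after editing it, something like the following Python sketch could list the enabled plugins and the datajson settings. It assumes the settings live in the [app:main] section, as in a standard CKAN configuration file. The function name datajson_settings and the sample file contents are hypothetical.

```python
import configparser

# Made-up excerpt of a CKAN .ini file, for illustration only.
SAMPLE_INI = """
[app:main]
ckan.plugins = stats harvest datajson datajson_harvest
ckanext.datajson.path = /data.json
ckanext.datajsonld.path = /data.jsonld
ckanext.datajson.default_keywords = health
"""

PREFIXES = ("ckanext.datajson.", "ckanext.datajsonld.")


def datajson_settings(ini_text):
    """Return (plugin names, datajson options) from CKAN .ini text.

    Hypothetical helper; assumes an [app:main] section exists.
    """
    # Disable interpolation so values containing '%' are read verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    main = parser["app:main"]
    plugins = main.get("ckan.plugins", "").split()
    options = {k: v for k, v in main.items() if k.startswith(PREFIXES)}
    return plugins, options


plugins, options = datajson_settings(SAMPLE_INI)
print("datajson enabled:", "datajson" in plugins)
for key, value in sorted(options.items()):
    print(key, "=", value)
```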
Caching The Response
If you’re deploying inside Apache, caching the response is a good idea, because generating the /data.json file can be slow. Enable the cache modules:
a2enmod cache
a2enmod disk_cache
(On Apache 2.4 the disk cache module is named cache_disk instead of disk_cache.)
The Harvester
To use the data.json harvester, you’ll also need to set up the CKAN harvester extension (ckanext-harvest) and initialize its database tables:
paster --plugin=ckanext-harvest harvester initdb --config=/path/to/ckan.ini
Credit / Copying
Written by the HealthData.gov team.
As a work of the United States Government, this package is in the public domain within the United States. Additionally, we waive copyright and related rights in the work worldwide through the CC0 1.0 Universal public domain dedication.