Extension Data Preview


Extension Basics

Title
Data Preview
Name
ckanext-datapreview
Type
Public extension
Description
Data preview extension for CKAN that supplies data from local storage or remote download, parses CSV/XLS and provides it at a URL for Recline or other data previewers.
CKAN versions
Download-Url (zip)
Last commit
10 years ago (2015-08-04 09:06:04)
Url to repo
Category
Visualization & Analytics


Background Info

Description (long)

ckanext-datapreview

This CKAN extension supplies data from local storage or via a remote download, parses CSV/XLS, and provides it at a URL that can be called by Recline or another CKAN data previewer.

For example, for a spreadsheet it returns a JSON dict such as:

{
    "fields": ["Name", "Age"],
    "data": [["Bob", 42], ["Jill", 54]],
    "extra_text": "This preview shows only the first 10 rows",
    "max_results": 10,
    "length": 435,
    "url": "http://data.com/file.csv"
}
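
A consumer of this response can pair the column names in "fields" with each row in "data". The following is a minimal sketch that uses only the sample response above; the helper name rows_as_dicts is hypothetical and is not part of the extension.

```python
import json

# The sample response shown above, copied verbatim.
SAMPLE = '''{
    "fields": ["Name", "Age"],
    "data": [["Bob", 42], ["Jill", 54]],
    "extra_text": "This preview shows only the first 10 rows",
    "max_results": 10,
    "length": 435,
    "url": "http://data.com/file.csv"
}'''


def rows_as_dicts(preview):
    """Pair each row in "data" with the column names in "fields".

    Hypothetical helper for illustration; not part of ckanext-datapreview.
    """
    fields = preview["fields"]
    return [dict(zip(fields, row)) for row in preview["data"]]


preview = json.loads(SAMPLE)
records = rows_as_dicts(preview)
```

Each record is then a plain dict keyed by column name, which is often easier to hand to a table widget than parallel lists.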

This extension is a modified, local implementation of the OKFN dataproxy that runs as a CKAN extension rather than on Google App Engine. It was written to improve performance on data.gov.uk and to increase the maximum file size that can be processed.

The interface to the extension:

/data/preview/<resource_id>?max-results=N&encoding=utf-8

is not exactly the same as the dataproxy interface, which requires the URL instead of the resource id, but the data returned is identical. Rather than always fetching the data from the remote site, the new controller at the above route first attempts to find the data in ckanext-archiver’s local archive.
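
As an illustration of the route above, a client could assemble the preview URL as follows. This is a sketch only: the function name preview_url, the base URL and the resource id are placeholders, and only the path and the two query parameters come from the route shown above.

```python
from urllib.parse import quote, urlencode


def preview_url(base_url, resource_id, max_results=10, encoding="utf-8"):
    """Build a /data/preview/<resource_id> URL for a CKAN site.

    Hypothetical helper; base_url and resource_id are supplied by the caller.
    """
    query = urlencode({"max-results": max_results, "encoding": encoding})
    return "{}/data/preview/{}?{}".format(
        base_url.rstrip("/"), quote(resource_id, safe=""), query
    )


# Placeholder site and resource id, for illustration only.
url = preview_url("http://localhost:5000", "abc-123", max_results=10)
```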

Installation

The most straightforward method of installation is:

git clone git://github.com/datagovuk/ckanext-datapreview.git
cd ckanext-datapreview
python setup.py develop

Or alternatively install directly using pip:

pip install -e git+https://github.com/datagovuk/ckanext-datapreview.git#egg=ckanext-datapreview

Once installation is complete, add datapreview to the ckan.plugins setting in the appropriate .ini file.

Config

In your CKAN config file, configure the following options:

limit

The ‘limit’ is the maximum size of a file that is downloaded or loaded into memory. It stops the extension from waiting on a long download of data that is not stored locally just to be able to proxy it.

The limit is expressed in bytes, so the default of 5 MB (5 × 1024 × 1024 bytes) would be:

ckan.datapreview.limit = 5242880

Local CSV files are not subject to this limit because the first 100 rows can be loaded without loading the whole file into memory.
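
The reason a row cap avoids loading the whole file can be sketched with a lazy reader. The extension itself depends on messytables for parsing; the example below swaps in Python's standard csv module purely to illustrate the idea, and the names first_rows and LIMIT_BYTES are assumptions.

```python
import csv
import io
from itertools import islice

# 5 MB expressed in bytes, matching the default value quoted above.
LIMIT_BYTES = 5 * 1024 * 1024


def first_rows(fileobj, max_rows=100):
    """Return at most max_rows parsed rows, reading the file lazily.

    csv.reader pulls lines on demand, so rows after the cap are never parsed.
    """
    return list(islice(csv.reader(fileobj), max_rows))


# Stand-in for a large local CSV file: one header row plus 1000 data rows.
big_csv = io.StringIO("Name,Age\n" + "Bob,42\n" * 1000)
rows = first_rows(big_csv, max_rows=100)
```

In this sketch the header row counts towards the 100 rows that are returned.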

Requirements

  • ckanext-archiver - for the resource cache
  • messytables - (in setup.py)

Improvements

  • Increases the limit on download size (doesn’t have the App Engine download limit)
  • Uses the local archive cache if it exists rather than hitting the remote site (only if ckanext-archiver has retrieved the file).
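
The cache-first behaviour described in the second point could look roughly like the following. This is a sketch under stated assumptions: how ckanext-archiver records the path of a cached file is not shown here, so cached_path is simply a parameter supplied by the caller, and choose_source is a hypothetical name.

```python
import os
import tempfile


def choose_source(cached_path, remote_url):
    """Prefer a locally archived copy and fall back to the remote URL.

    cached_path may be None when the archiver has not retrieved the file.
    """
    if cached_path and os.path.isfile(cached_path):
        return ("local", cached_path)
    return ("remote", remote_url)


# Demonstration with a temporary file standing in for an archived resource.
with tempfile.NamedTemporaryFile(suffix=".csv") as tmp:
    hit = choose_source(tmp.name, "http://data.com/file.csv")

miss = choose_source(None, "http://data.com/file.csv")
```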

Note: This repository was archived by the owner on Jun 19, 2023. It is now read-only.

Version
Version release date
(not set)
Contact name
datagovuk
Contact email
(not set)
Contact Url
(not set)


Installation Guide

Configuration hints

Requires ckanext-archiver for local resource cache. Repository archived on Jun 19, 2023.

Plugins to configure (ckan.ini)
datapreview
CKAN Settings (ckan.ini)
# ckan.datapreview.limit = 5242880
DB migration to be executed
(not set)