
ckanext-stadtzh-harvest
Harvester for the City of Zurich.
The data is loaded from so-called “dropzones”, which are mounted network drives.
Each folder in a dropzone is considered a dataset.
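As a rough illustration of "each folder is a dataset", the sketch below lists the immediate subfolders of a dropzone. It is not the harvester's actual code: the helper name list_dataset_folders and the folder name example_dataset are made up for this example, and a temporary directory stands in for the mounted network drive.

```python
import os
import tempfile


def list_dataset_folders(data_path):
    """Return the names of the immediate subfolders of a dropzone, sorted.

    Hypothetical helper: only directories count, loose files are ignored.
    """
    return sorted(
        entry
        for entry in os.listdir(data_path)
        if os.path.isdir(os.path.join(data_path, entry))
    )


# A temporary directory stands in for the mounted dropzone.
with tempfile.TemporaryDirectory() as dropzone:
    os.mkdir(os.path.join(dropzone, "velowege"))
    os.mkdir(os.path.join(dropzone, "example_dataset"))  # hypothetical name
    # A file at the top level is not a folder, so it is not a dataset.
    open(os.path.join(dropzone, "notes.txt"), "w").close()

    datasets = list_dataset_folders(dropzone)
    print(datasets)
```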
Configuration
The harvest configuration must be provided as a valid JSON string.
Example:
{
    "data_path": "/home/liip/dropzones/GEO",
    "metafile_dir": "DEFAULT",
    "update_datasets": false,
    "update_date_last_modified": true,
    "dataset_prefix": "",
    "delete_missing_datasets": false
}
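Because the configuration has to be valid JSON, it can help to check a config string before saving it. The following is a minimal sketch, not part of the harvester: parse_config and EXPECTED_TYPES are hypothetical names, and the expected types are simply read off the example above.

```python
import json

# Keys and types as they appear in the example configuration.
# Whether the harvester itself enforces these types is an assumption.
EXPECTED_TYPES = {
    "data_path": str,
    "metafile_dir": str,
    "update_datasets": bool,
    "update_date_last_modified": bool,
    "dataset_prefix": str,
    "delete_missing_datasets": bool,
}


def parse_config(config_str):
    """Parse a harvest config string and check the types of known keys."""
    config = json.loads(config_str)  # raises ValueError on invalid JSON
    if not isinstance(config, dict):
        raise ValueError("the configuration must be a JSON object")
    for key, expected in EXPECTED_TYPES.items():
        if key in config and not isinstance(config[key], expected):
            raise ValueError(
                "%s must be of type %s" % (key, expected.__name__)
            )
    return config


EXAMPLE = """
{
    "data_path": "/home/liip/dropzones/GEO",
    "metafile_dir": "DEFAULT",
    "update_datasets": false,
    "update_date_last_modified": true,
    "dataset_prefix": "",
    "delete_missing_datasets": false
}
"""

config = parse_config(EXAMPLE)
print(config["data_path"])
```

Note that JSON booleans are written as true and false without quotes; the string "false" would be rejected by this check.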
data_path
The path to the dropzone.
metafile_dir
The name of the directory where the meta.xml is located.
The GEO dropzone has a subdirectory for the meta.xml; all other dropzones should provide an empty string here.
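To illustrate how an empty metafile_dir differs from a named one, here is a small sketch that builds the expected location of meta.xml. It assumes the layout data_path/dataset folder/metafile_dir/meta.xml, which is an interpretation of the description above and not confirmed by the harvester's code; meta_xml_path and the path /path/to/dropzone are hypothetical.

```python
import posixpath


def meta_xml_path(data_path, dataset_folder, metafile_dir):
    """Build the assumed location of a dataset's meta.xml.

    posixpath.join drops an empty component, so an empty metafile_dir
    places meta.xml directly inside the dataset folder.
    """
    return posixpath.join(data_path, dataset_folder, metafile_dir, "meta.xml")


# GEO dropzone: meta.xml lives in a subdirectory of the dataset folder.
geo_path = meta_xml_path("/home/liip/dropzones/GEO", "velowege", "DEFAULT")

# Any other dropzone: metafile_dir is the empty string.
other_path = meta_xml_path("/path/to/dropzone", "velowege", "")

print(geo_path)
print(other_path)
```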
update_datasets
Boolean flag (true/false) that determines whether this harvester updates existing datasets.
If the flag is false, no updates are performed; only new datasets are added.
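The effect of this flag can be summarized as a small decision function. This is only a sketch of the behavior described above; harvest_action and its return values are hypothetical and do not correspond to names in the harvester.

```python
def harvest_action(dataset_exists, update_datasets):
    """Decide what happens to a dataset folder found in the dropzone."""
    if not dataset_exists:
        # New datasets are always added, regardless of the flag.
        return "create"
    # Existing datasets are only touched when update_datasets is true.
    return "update" if update_datasets else "skip"


print(harvest_action(dataset_exists=True, update_datasets=False))
```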
update_date_last_modified
Boolean flag (true/false) that determines whether the harvester updates the date_last_modified field of a dataset.
If the flag is true, the date is updated whenever the content of any resource of the dataset has changed.
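One way to picture the "content has changed" check is a comparison of content hashes per resource. The sketch below assumes exactly that; how the harvester really detects changes is not stated here. The names content_hash and should_update_date, the resource name velowege.csv, and the choice to treat a newly added resource as a change are all assumptions made for illustration.

```python
import hashlib


def content_hash(data):
    """Return a hex digest of a resource's bytes (SHA-256 chosen arbitrarily)."""
    return hashlib.sha256(data).hexdigest()


def should_update_date(update_date_last_modified, old_hashes, new_hashes):
    """Return True if date_last_modified should be refreshed.

    old_hashes and new_hashes map resource names to content hashes.
    """
    if not update_date_last_modified:
        return False
    # Any resource whose hash differs from the stored one counts as changed;
    # a resource missing from old_hashes is treated as changed (assumption).
    return any(
        old_hashes.get(name) != digest for name, digest in new_hashes.items()
    )


old = {"velowege.csv": content_hash(b"a,b\n1,2\n")}
unchanged = {"velowege.csv": content_hash(b"a,b\n1,2\n")}
changed = {"velowege.csv": content_hash(b"a,b\n1,3\n")}

print(should_update_date(True, old, changed))
```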
dataset_prefix
Defines a prefix for all dataset names harvested by this harvester.
This is useful if a test harvester imports the same datasets as a regular harvester and the two must co-exist without overwriting each other's datasets.
For example, with an empty dataset_prefix a dataset “velowege” is imported as “velowege”; if dataset_prefix is set to "test_", it is imported as “test_velowege”.
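The naming rule can be sketched in a few lines. The function prefixed_name is hypothetical, and the sketch assumes the prefix is simply prepended to the folder name; any further name normalization the harvester may apply is left out.

```python
def prefixed_name(folder_name, dataset_prefix=""):
    """Return the dataset name for a dropzone folder, with the prefix applied."""
    return dataset_prefix + folder_name


regular = prefixed_name("velowege")
test = prefixed_name("velowege", dataset_prefix="test_")

print(regular, test)
```

Because the two names differ, a regular and a test harvester pointing at the same dropzone would create separate datasets.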