CKAN Metadata

From WPRDC Wiki
Revision as of 14:20, 1 March 2022 by DRW (talk | contribs) (Add to "Onboarding" category)

Metadata overview

Metadata is a structured framework for documenting data. Some people like to say it's data about data. It's essential if anyone hopes to find and use your data.

The metadata standard we're using is adapted from those used by the City of San Francisco and data.gov (the U.S. federal government's open data repository).

Each CKAN dataset has default metadata fields. Some of these are filled in automatically by CKAN when a dataset is created or updated. Others are set by the publisher when the dataset is created and can later be updated by the publisher or WPRDC staff. In some cases a field is updated programmatically, as when our watchdog utility updates the Temporal Coverage metadata field.

WPRDC custom metadata

After we upgraded our data portal to CKAN 2.7, it became easy to create new metadata subfields within the 'extras' metadata field for any dataset. This can be done through API calls or through the CKAN web interface (by editing the dataset package).
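As an illustration of the API route, here is a minimal sketch in Python that uses the standard CKAN action endpoints package_show and package_patch. The helper names (merge_extras, call_action, set_extras) are hypothetical, and the choice to JSON-encode non-string values is an assumption, not a documented WPRDC convention. Because patching 'extras' replaces the whole list, the sketch first reads the existing extras and merges the new keys into them.

```python
import json
import urllib.request


def merge_extras(existing, updates):
    """Return a new CKAN-style extras list with `updates` applied.

    `existing` is a list of {"key": ..., "value": ...} dicts (the shape CKAN
    uses for a package's 'extras'); `updates` is a plain dict.
    """
    merged = {item["key"]: item["value"] for item in existing}
    for key, value in updates.items():
        # Assumption: non-string values (lists, dicts) are stored JSON-encoded.
        merged[key] = value if isinstance(value, str) else json.dumps(value)
    return [{"key": k, "value": v} for k, v in merged.items()]


def call_action(site, action, payload, api_key):
    """POST a JSON payload to a CKAN action endpoint and return its 'result'."""
    request = urllib.request.Request(
        f"{site}/api/3/action/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["result"]


def set_extras(site, package_id, updates, api_key):
    """Hypothetical helper: merge `updates` into a dataset's extras."""
    package = call_action(site, "package_show", {"id": package_id}, api_key)
    extras = merge_extras(package.get("extras", []), updates)
    return call_action(
        site, "package_patch", {"id": package_id, "extras": extras}, api_key
    )
```

A call such as set_extras("https://data.wprdc.org", "<package id>", {"no_updates_on": ["weekends"]}, "<API key>") would then add or overwrite that one key while leaving the other extras in place; the package ID and API key are placeholders.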

Below is a partial list of 'extras' metadata fields in use on https://data.wprdc.org:

Field name | Use | Used by
last_etl_update | Indicates when the ETL job last finished. | rocket-etl
time_field | Dict specifying the field name in a table that stores each record's timestamp (used for determining dataset freshness). | pocket-watch
no_updates_on | List of days (e.g., "weekends") coding for when a table is not expected to update. | pocket-watch
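
To read these fields back, a tool can flatten the 'extras' list that CKAN returns for a dataset into a plain dictionary. The sketch below assumes that structured values such as no_updates_on are stored as JSON-encoded strings. The function name extras_to_dict and the sample values are hypothetical and are not taken from rocket-etl or pocket-watch.

```python
import json


def extras_to_dict(package):
    """Flatten a CKAN package's 'extras' list into a plain dict.

    Values that parse as JSON are decoded; anything else is kept as the
    original string.
    """
    result = {}
    for item in package.get("extras", []):
        value = item["value"]
        try:
            result[item["key"]] = json.loads(value)
        except (TypeError, ValueError):
            # Not valid JSON (for example, a bare timestamp): keep it as-is.
            result[item["key"]] = value
    return result


# Hypothetical package metadata, shaped like a package_show result.
sample_package = {
    "extras": [
        {"key": "last_etl_update", "value": "2022-03-01T14:20:00"},
        {"key": "no_updates_on", "value": '["weekends"]'},
    ]
}
fields = extras_to_dict(sample_package)
```

One caveat of this approach is that a string which happens to be valid JSON, such as "2022", is decoded to a number, so a stricter reader might decode only the keys it knows to be structured.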