CKAN Administration
Changes that can be made through the frontend
There's a lot of documentation on publishing data on our CKAN portal here.
A few samples (to eventually migrate over):
- Our documentation for publishers on publishing data on the WPRDC
- How to create data dictionaries ==> Data Dictionaries
- Some of our standard extra metadata fields
Canonical Views
To show a map or data table on the dataset landing page, create a corresponding "view" under one of the resources in the dataset and then click the "Canonical View" button for that view. The catch is that CKAN does not enforce that only one view may be canonical: if several views have their "Canonical View" button selected, CKAN chooses one of them to display, and you will have to deselect the others to get the one you want on the dataset landing page.
Writing dataset descriptions
The description field supports some limited markup, which appears to be a subset of Markdown.
- Starting a line with a single pound sign (#) indicates that the line should be in bigger, title text, but two pound signs (##) do not give a different font size, as they do in standard Markdown.
- Dashes can be used to denote elements in an unordered list (though I haven't been able to get nested lists to work).
- Use backticks to indicate that a sans serif font should be used to represent code, like `this`.
Images, links, bold and italic text all work.
It seems to be limited to the original, very basic specification of Markdown [1].
Changes that can be made through the backend
Configuring the CKAN server
(The contents of this section were initially taken from the ORIENTATION file in /home/ubuntu on the CKAN production server.)
- The main CKAN config file is at /etc/ckan/default/production.ini
- To monitor HTTP requests in real-time:
> tail -f /var/log/nginx/access.log
- Service-worker activity (like the Express Loader uploading files to the datastore and background geocoding) can be found in:
/var/log/ckan-worker.log
- Edit templates here (changes to templates should show up when reloading the relevant web pages):
/usr/lib/ckan/default/src/ckanext-wprdctheme/ckanext/wprdc/templates
templates/terms.html is the source for the pop-up version of the Terms of Use. There appears to be no template linked to the "Terms" hyperlink.
- Create a file templates/foo.html and then run
> sudo service supervisor restart
and THEN load data.wprdc.org/foo.html in your browser, and the page will be there.
- Presumably data.wprdc.org/foo/ can be populated by creating a file at templates/foo/index.html.
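When watching the access log for problems, it can help to narrow the stream down to server errors. The following is a minimal sketch, assuming nginx is writing its default "combined" log format (in which the HTTP status code is the ninth whitespace-separated field); the function name only_5xx is made up for illustration.

```shell
# Print only the access-log lines whose HTTP status is 5xx.
# Assumes nginx's default "combined" log format, in which the status
# code is the 9th whitespace-separated field.
only_5xx() {
    awk '$9 ~ /^5[0-9][0-9]$/'
}
```

It could be used as
> tail -f /var/log/nginx/access.log | only_5xx
If the server uses a custom log_format, the field number will need to be adjusted.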
Managing the CKAN server
- To restart the Express Loader:
> sudo supervisorctl restart ckan-worker:*
- To edit the background worker configuration (including increasing the number of background workers):
- Edit the config file:
> vi /etc/supervisor/conf.d/supervisor-ckan-worker.conf
- Tell Supervisor to use the new configuration:
> sudo supervisorctl reread
- Update the deployed configuration to start the desired number of workers:
> sudo supervisorctl update
- Activate the virtual environment that lets you run paster commands:
> . /usr/lib/ckan/default/bin/activate
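After changing the worker configuration and running the reread and update steps, one way to confirm that the desired number of workers actually started is to count the RUNNING entries reported by supervisorctl status. This is a sketch under the assumption that the worker processes are listed under the ckan-worker group (as the restart command above suggests); the function name is hypothetical.

```shell
# Count the processes in the ckan-worker group that supervisor reports
# as RUNNING. Reads the output of `supervisorctl status` on stdin.
# The "ckan-worker:" prefix is assumed from the group name used above.
count_running_workers() {
    awk '$1 ~ /^ckan-worker:/ && $2 == "RUNNING" { n++ } END { print n + 0 }'
}
```

It could be used as
> sudo supervisorctl status | count_running_workers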
Managing the Docker containers
> cd docker-ckan
> docker ps
should return a list of running containers, which should include the following container names: datapusher-plus, ckan, solr, and redis.
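The check above can be scripted so that a missing container is reported by name. This is only a sketch: it assumes the container names match the four names listed above exactly (Docker Compose sometimes adds a project prefix or numeric suffix, in which case the match would need loosening), and the function name is made up.

```shell
# Print every expected container name that is absent from the list on stdin.
# Feed it the output of: docker ps --format '{{.Names}}'
# Prints nothing when all four expected containers are running.
missing_containers() {
    running=$(cat)
    for name in datapusher-plus ckan solr redis; do
        # -x matches whole lines only, -F treats the name as a fixed string
        printf '%s\n' "$running" | grep -qxF -- "$name" || printf '%s\n' "$name"
    done
}
```

It could be used as
> docker ps --format '{{.Names}}' | missing_containers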
Use
> sudo docker-compose logs --tail=100 ckan
to show the last 100 lines of the log for the ckan Docker instance.
Adding/changing departments of publishers
To add or change the departments belonging to a particular publisher organization, edit the dataset_schema.json file:
> vi /usr/lib/ckan/default/src/ckanext-scheming/ckanext/scheming/dataset_schema.json
Then run
> sudo service apache2 reload
The extra tricky part is that the GitHub repository that includes this JSON file is installed in a different directory, /usr/lib/ckan/default/src/ckanext-wprdctheme/, but changes to the files in that directory (and its subdirectories) have no effect.
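Because a stray comma or bracket in dataset_schema.json is easy to introduce in a hand edit, it may be worth checking that the file still parses before reloading. This is a sketch that assumes python3 is available on the server; it uses Python's standard-library json.tool module purely as a parser, and the function name is hypothetical.

```shell
# Succeed only if the given file parses as JSON.
# Uses Python's standard-library json.tool module as the parser
# (assumes python3 is installed); all output is discarded.
is_valid_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}
```

It could be used as
> is_valid_json /usr/lib/ckan/default/src/ckanext-scheming/ckanext/scheming/dataset_schema.json && sudo service apache2 reload
This only checks JSON syntax, not whether the schema is meaningful to ckanext-scheming.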
Dealing with inadequate disk space
Once, we were seeing OSError: write error in the CKAN Docker logs and had to increase disk space to make CKAN function again.
Steve on increasing volume size on AWS:
> if you ever have to increase a volume, here’s what i followed to make the filesystem use the new space: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/recognize-expanded-volume-linux.html
that server was a Xen instance
Other changes
Using CKAN metadata instead of local caches
To avoid keeping local databases about datasets (for instance, when writing code to track some aspect of datasets), store such information (such as the last time an ETL job was run on a given package) in the 'extras' metadata field of the CKAN package, as much as possible. This stores information in a centralized location so ETL jobs can be run from multiple computers without any other coordination. The extras metadata fields are cataloged on the CKAN Metadata page.
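As an illustration, an ETL job could record its last run time through the CKAN Action API. The sketch below makes several assumptions: jq is installed, the key name last_etl_run is hypothetical, and CKAN_URL, CKAN_API_KEY, and PACKAGE_ID are placeholder variables. Since package_patch replaces the whole extras list when that field is supplied, the function merges the new pair into the existing list rather than sending it alone.

```shell
# Merge one key/value pair into a package's "extras" list.
# stdin:  the JSON response of CKAN's package_show action
# stdout: a JSON body for package_patch carrying the full, updated extras list
# Requires jq. An existing entry with the same key is replaced.
build_extras_patch() {
    jq -c --arg key "$1" --arg value "$2" '{
        id: .result.id,
        extras: ((.result.extras // [] | map(select(.key != $key)))
                 + [{key: $key, value: $value}])
    }'
}

# Hypothetical usage (CKAN_URL, CKAN_API_KEY, PACKAGE_ID are placeholders):
# curl -s "$CKAN_URL/api/3/action/package_show?id=$PACKAGE_ID" \
#   | build_extras_patch last_etl_run "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
#   | curl -s -X POST "$CKAN_URL/api/3/action/package_patch" \
#       -H "Authorization: $CKAN_API_KEY" \
#       -H "Content-Type: application/json" -d @-
```

Two jobs updating the same package at the same moment could still overwrite each other's extras, so this is best suited to jobs that do not run concurrently against one package.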
Hacky workaround for adding new users to publishers
In addition to adding the users to the organizations through the CKAN front-end, you also have to add them to groups, using this URL: [2]