Location in CKAN

From OSGeo
Revision as of 06:54, 12 January 2010 by Wiki-JoWalsh (talk | contribs) (rough notes version of ckan vision for location)

The Comprehensive Knowledge Archive Network (CKAN) is a project of the Open Knowledge Foundation. It consists of some free software, a web-based service, and an API, but more importantly a network of package contributors and maintainers.

CKAN has a lot of listings for packages of geodata - see the 'geo' and 'geodata' tags. There is no location metadata, though there is an option to add new key-value-pair "tags" to the minimal default metadata.

CKAN aims to encourage and support the emergence of a culture where knowledge packages can be easily discovered and plugged together as is currently possible with software.

The purpose of these notes is to talk through the CKAN vision and think about different ways in which location information could make it easier to find and download data.

[Image: Ckan-vision.png]

CKAN is a registry, not a repository - it stores minimal metadata about data sources. Essentially, a title, a project URL, a download URL, and author/maintainer contact information. Entries can be extended with "tags".
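To make the shape of such a registry entry concrete, here is a minimal sketch in Python. The class and field names are hypothetical and chosen only for illustration; they are not CKAN's actual schema or API. It shows the handful of default fields listed above, plus free-form key-value pairs as the only place a location could currently go.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PackageEntry:
    """Hypothetical registry record; field names are illustrative only."""
    title: str
    project_url: str
    download_url: str
    author: str
    maintainer: str
    # Free-form key-value pairs extending the minimal default metadata.
    extras: Dict[str, str] = field(default_factory=dict)


entry = PackageEntry(
    title="Example boundaries dataset",
    project_url="http://example.org/project",
    download_url="http://example.org/project/data.zip",
    author="author@example.org",
    maintainer="maintainer@example.org",
)

# Location is not one of the default fields, so it can only be
# recorded as an extra key-value pair.
entry.extras["location"] = "Edinburgh"
```

Nothing in this sketch validates the "location" value, which is part of the problem these notes are trying to think through.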

A design goal of CKAN is to convert download links into packages of data which can be automatically installed via an application. Links may be to pages listing individual files, or a big dump of XML, or the interface to a web service.

This is being worked on in the separate datapkg project. Many entries in CKAN will require a bit of custom glue to turn them into packages. Interfaces and formats may change over versions, so everything is versioned.


Location in CKAN

Some packages have definite locations - the data is about Spain, or about Boston, etc.

Other packages have global scope.

Some packages are "geographic information", or "GIS data", that is, primarily points, lines and polygons or 3D objects, annotated with bits of text or links to other data sets.

But many more packages have a location component within their data.

Simplest useful thing?

Registries and repositories that store GIS data tend to have "bounding box" metadata for each dataset or series of datasets. This isn't always appropriate. Without some idea of the scale of the data, a bounding box alone is not that helpful for search. It also requires data entry that is both time-consuming and quite "specialist", even if the user is given a UI to draw a box on a map.
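The scale problem can be illustrated with a rough sketch of bounding-box search. Everything here is an assumption made for illustration: the dataset names, the coordinates (approximate, in degrees), and the use of box area as a crude stand-in for scale. It also assumes no box crosses the antimeridian.

```python
from typing import Dict, List, NamedTuple


class BBox(NamedTuple):
    west: float
    south: float
    east: float
    north: float


def intersects(a: BBox, b: BBox) -> bool:
    """True if the two boxes overlap or touch."""
    return (a.west <= b.east and b.west <= a.east
            and a.south <= b.north and b.south <= a.north)


def area(b: BBox) -> float:
    """Area in square degrees; a crude proxy for the scale of a dataset."""
    return (b.east - b.west) * (b.north - b.south)


def search(query: BBox, datasets: Dict[str, BBox]) -> List[str]:
    """Names of datasets whose box intersects the query, smallest extent first."""
    hits = [name for name, box in datasets.items() if intersects(query, box)]
    return sorted(hits, key=lambda name: area(datasets[name]))


# Hypothetical datasets with approximate extents.
datasets = {
    "world-countries": BBox(-180.0, -90.0, 180.0, 90.0),
    "edinburgh-streets": BBox(-3.45, 55.85, -3.05, 56.0),
    "spain-rivers": BBox(-9.4, 35.9, 3.4, 43.8),
}

# A query box roughly around central Edinburgh.
query = BBox(-3.3, 55.9, -3.1, 55.98)
results = search(query, datasets)
```

A plain intersection test returns the global dataset for every query. Sorting by extent pushes the city-level data to the top here, but that is only a guess at scale, not a measure of it.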

The same thing could be done using links to entries in a gazetteer, such as "UK" or "Edinburgh".

Those links could then be used to create collections of packages: "apt-get install London" or "apt-get install London 1973".
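One way the gazetteer idea might work is sketched below. The gazetteer, the package names, and the years are all made up for illustration, and the containment rule (a package belongs to a collection if its linked place is the requested place or lies inside it) is just one possible design choice.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical gazetteer: each place name maps to its parent place.
GAZETTEER: Dict[str, Optional[str]] = {
    "UK": None,
    "England": "UK",
    "Scotland": "UK",
    "London": "England",
    "Edinburgh": "Scotland",
}

# Hypothetical packages, each linked to one gazetteer entry and a year.
PACKAGES: Dict[str, Tuple[str, int]] = {
    "london-bus-stops": ("London", 2009),
    "london-land-use": ("London", 1973),
    "edinburgh-streets": ("Edinburgh", 2009),
    "uk-postcodes": ("UK", 2009),
}


def within(place: Optional[str], ancestor: str) -> bool:
    """True if place is the ancestor itself or sits somewhere below it."""
    while place is not None:
        if place == ancestor:
            return True
        place = GAZETTEER.get(place)
    return False


def collection(place: str, year: Optional[int] = None) -> List[str]:
    """Packages linked to a place (or anywhere inside it), optionally for one year."""
    return sorted(
        name
        for name, (pkg_place, pkg_year) in PACKAGES.items()
        if within(pkg_place, place) and (year is None or pkg_year == year)
    )


london = collection("London")
london_1973 = collection("London", 1973)
uk = collection("UK")
```

Under this rule a "London" collection does not pick up the UK-wide package, even though that data covers London. Whether it should is an open question for the design.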

Other services already handle metadata for GIS data at a great level of detail - Go-Geo!, the AGMAP profile, national geoportals, etc. The aim is not to turn CKAN into a catalogue service for geodata, because there is already a lot of investment in that area through INSPIRE.

Rather, the focus here is on using locations to help make more sense of other, related datasets.


Notes: I should probably have put this on the OKF wiki, but I have forgotten how to work MoinMoin.