Difference between revisions of "Location in CKAN"

From OSGeo
m
 
(5 intermediate revisions by the same user not shown)
Line 1: Line 1:
The Comprehensive Knowledge Archive Network is a project of the Open Knowledge Foundation. It is made up of some free software, a web-based service, an API, but more importantly a network of package contributors and maintainers.
+
The Comprehensive Knowledge Archive Network is a project of the [[Open Knowledge Foundation]]. It is a free and open source software package, also a web-based service, an API, but more importantly a network of package contributors and maintainers. [http://ckan.net/ CKAN.net] has a lot of listings for packages of geodata - see the [http://www.ckan.net/tag/read/geo geo] and [http://www.ckan.net/tag/read/geodata geodata] tags. There is no location metadata, though there is an option to add new key-value-pair "tags" to the minimal default metadata.  
  
CKAN has a lot of listings for packages of geodata - see the 'geo' and 'geodata' tags. There is no location metadata, though there is an option to add new key-value-pair "tags" to the minimal default metadata.
+
''CKAN aims to encourage and support the emergence of a culture where knowledge packages can be easily discovered and plugged together as is currently possible with software.''  
  
''CKAN aims to encourage and support the emergence of a culture where knowledge packages can be easily discovered and plugged together as is currently possible with software.''
+
The function of this writing is to talk through the CKAN vision and think about different ways in which location information could make it easier to find and download data.  
  
The function of this writing is to talk through the CKAN vision and think about different ways in which location information could make it easier to find and download data.
+
[[Image:Ckan-vision.png]]  
 
[[Image:Ckan-vision.png]]
 
  
 
CKAN is a registry, not a repository - it stores minimal metadata about data sources. Essentially, a title, a project URL, a download URL, and author/maintainer contact information. Entries can be extended with "tags".  
 
CKAN is a registry, not a repository - it stores minimal metadata about data sources. Essentially, a title, a project URL, a download URL, and author/maintainer contact information. Entries can be extended with "tags".  
Line 15: Line 13:
 
This is being worked on in the separate datapkg project. Many entries in CKAN will require a bit of custom glue to turn them into packages. Interfaces and formats may change over versions, so everything is versioned.  
 
This is being worked on in the separate datapkg project. Many entries in CKAN will require a bit of custom glue to turn them into packages. Interfaces and formats may change over versions, so everything is versioned.  
  
 +
<br>
  
== Location in CKAN ==
+
== Location in CKAN ==
  
Some packages have definite locations - the data is about Spain, or about Boston, etc.
+
Some packages have definite locations - the data is about Spain, or about Boston, etc.  
  
Other packages have global scope.
+
Other packages have global scope.  
  
Some packages are "geographic information", or "GIS data", that is, primarily points lines and polygons or 3D objects, annotated with bits of text or links to other data sets.
+
Some packages are "geographic information", or "GIS data", that is, primarily points lines and polygons or 3D objects, annotated with bits of text or links to other data sets.  
  
But many more packages have a location component within their data.
+
But many more packages have a location component within their data.  
  
== Simplest useful thing? ==
+
== Simplest useful thing? ==
  
 
Registries/Repositories that store GIS data tend to have "bounding box" metadata for each dataset or series of datasets.  
 
Registries/Repositories that store GIS data tend to have "bounding box" metadata for each dataset or series of datasets.  
Without idea of data scale/resolution, bbox alone not that helpful for search. Adding a bbox requires data entry which is already quite "specialist", even if user is given a UI to draw a box on a map, and time-consuming.
 
  
We could get similar value by creating links from CKAN packages to entries in a gazetteer of place names and approximate locations.
+
Without a description of data scale/resolution, bbox alone not that helpful for search. Adding a bbox requires data entry which is already quite "specialist", even if user is given a UI to draw a box on a map, and time-consuming.
 +
 
 +
We could get similar value by creating links from CKAN packages to entries in a gazetteer of place names and approximate locations.  
  
Then using the links to help create collections of packages. "apt-get install London" or "apt-get install London 1973".
+
Then using the links to help create collections of packages. "apt-get install London" or "apt-get install London 1973".  
  
=== Shouldn't we just do this with tags? ===
+
=== Shouldn't we just do this with tags? ===
  
User-created 'tags' partly serve this purpose already: for example UK-related packages can be found by looking at the 'uk' tag: http://www.ckan.net/tag/read/uk  
+
User-created 'tags' partly serve this purpose already: for example UK-related packages can be found by looking at the 'uk' tag: http://www.ckan.net/tag/read/uk These are not all data that describes the UK - some are global-scope environmental data sets that have been produced by UK researchers.  
These are not all data that describes the UK - some environmental data sets that have been produced by UK researchers.  
 
  
There are also tag-spaces that need gardening, or connecting together. Looking at tag/read/london finds only 2 packages. where a fulltext search reveals 3 different tags used for London data: [city-london greater-london-authority london]
+
There are also tag-spaces that need gardening, or connecting together. Looking at tag/read/london finds only 2 packages. where a fulltext search reveals 3 different tags used for London data: [city-london greater-london-authority london]  
  
Nice to have URLs to provide extra metadata about places and connections between them - see geonames.org semantic web service
+
Nice to have URLs to provide extra metadata about places and connections between them - see geonames.org semantic web service  
  
=== Shouldn't we just do this with an OGC Catalogue Service? ===
+
=== Shouldn't we just do this with an OGC Catalogue Service? ===
  
 
Other services handle detailed metadata for GIS data at a great level of detail - GeoNetwork, Go-Geo!, INSPIRE metadata, AGMAP profile, national geoportals, etc.  
 
Other services handle detailed metadata for GIS data at a great level of detail - GeoNetwork, Go-Geo!, INSPIRE metadata, AGMAP profile, national geoportals, etc.  
  
This is not about turning CKAN into a catalogue service for geodata, because there is already a lot of investment in that area through INSPIRE.
+
This is not about turning CKAN into a catalogue service for geodata, because there is already a lot of investment in that area through INSPIRE.  
  
 
Rather, the focus here is on using location data to help make more sense of other, related datasets.  
 
Rather, the focus here is on using location data to help make more sense of other, related datasets.  
  
* http://knowledgeforge.net/ckan/trac/wiki
+
*http://knowledgeforge.net/ckan/trac/wiki  
* http://sciencecommons.org/weblog/archives/2008/08/18/voices-from-the-future-of-science-rufus-pollock-of-the-open-knowledge-foundation/
+
*http://sciencecommons.org/weblog/archives/2008/08/18/voices-from-the-future-of-science-rufus-pollock-of-the-open-knowledge-foundation/  
* http://events.ccc.de/congress/2009/Fahrplan/events/3647.en.html
+
*http://events.ccc.de/congress/2009/Fahrplan/events/3647.en.html  
* http://www.data.gov.uk/ uses CKAN as a knowledge registry
+
*http://www.data.gov.uk/ uses CKAN as a knowledge registry  
* http://developer.yahoo.com/geo/geoplanet/data/ - Yahoo! WOEID concept
+
*http://developer.yahoo.com/geo/geoplanet/data/ - Yahoo! WOEID concept  
* http://www.geonames.org/ontology/ - Geonames semantic web service  
+
*http://www.geonames.org/ontology/ - Geonames semantic web service  
* http://www.gogeo.ac.uk/cgi-bin/index.cgi - a more traditional geographic data portal service
+
*http://www.gogeo.ac.uk/cgi-bin/index.cgi - a more traditional geographic data portal service  
* http://www.gigateway.org.uk/metadata/pdf/GEMINI2.pdf - the gory detail of the UK-specific metadata spec which is an INSPIRE profile.  
+
*http://www.gigateway.org.uk/metadata/pdf/GEMINI2.pdf - the gory detail of the UK-specific metadata spec which is an INSPIRE profile.
  
 
----
 
----
  
 
Notes: I should probably have put this on the OKF wiki, but I have forgotten how to work MoinMoin.
 
Notes: I should probably have put this on the OKF wiki, but I have forgotten how to work MoinMoin.

Latest revision as of 14:42, 14 January 2010

The Comprehensive Knowledge Archive Network (CKAN) is a project of the Open Knowledge Foundation. It is a free and open source software package, a web-based service and an API, but more importantly a network of package contributors and maintainers. CKAN.net has a lot of listings for packages of geodata - see the geo and geodata tags. There is no location metadata, though there is an option to add new key-value-pair "tags" to the minimal default metadata.

CKAN aims to encourage and support the emergence of a culture where knowledge packages can be easily discovered and plugged together as is currently possible with software.

The function of this writing is to talk through the CKAN vision and think about different ways in which location information could make it easier to find and download data.

Ckan-vision.png

CKAN is a registry, not a repository - it stores minimal metadata about data sources. Essentially, a title, a project URL, a download URL, and author/maintainer contact information. Entries can be extended with "tags".
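
To make the size of such an entry concrete, here is a minimal sketch of the fields named above. The class name `RegistryEntry`, the field names and the example values are all illustrative assumptions; this is not CKAN's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    """Hypothetical model of a minimal registry entry (not CKAN's real schema)."""

    title: str
    project_url: str
    download_url: str
    author: str
    maintainer: str
    # Free-form key-value "tags" that extend the minimal default metadata.
    extras: dict = field(default_factory=dict)


entry = RegistryEntry(
    title="Example dataset",
    project_url="http://example.org/project",
    download_url="http://example.org/project/data.zip",
    author="A. Author",
    maintainer="M. Maintainer",
)

# Location is not one of the default fields, so it can only be attached
# as an extra key-value pair.
entry.extras["place"] = "London"
```

Nothing in the default fields says where the data is about; any location hint has to travel in the extras.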

A design goal of CKAN is to convert download links into packages of data which can be automatically installed via an application. Links may be to pages listing individual files, or a big dump of XML, or the interface to a web service.

This is being worked on in the separate datapkg project. Many entries in CKAN will require a bit of custom glue to turn them into packages. Interfaces and formats may change over versions, so everything is versioned.


Location in CKAN

Some packages have definite locations - the data is about Spain, or about Boston, etc.

Other packages have global scope.

Some packages are "geographic information", or "GIS data", that is, primarily points, lines and polygons or 3D objects, annotated with bits of text or links to other data sets.

But many more packages have a location component within their data.

Simplest useful thing?

Registries/Repositories that store GIS data tend to have "bounding box" metadata for each dataset or series of datasets.

Without a description of data scale/resolution, a bbox alone is not that helpful for search. Adding a bbox also requires data entry that is quite "specialist" and time-consuming, even if the user is given a UI to draw a box on a map.
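
A small sketch of why a bbox on its own is a weak search key. The `BBox` type, the `intersects` helper, the dataset names and the rough extents are all assumptions made up for illustration.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box in decimal degrees (hypothetical helper type)."""

    west: float
    south: float
    east: float
    north: float


def intersects(a: BBox, b: BBox) -> bool:
    """Return True if the two boxes overlap or touch."""
    return (
        a.west <= b.east
        and b.west <= a.east
        and a.south <= b.north
        and b.south <= a.north
    )


# Rough, illustrative extents only.
datasets = [
    ("global climate grid", BBox(-180.0, -90.0, 180.0, 90.0)),
    ("London street map", BBox(-0.51, 51.28, 0.34, 51.70)),
    ("Boston parcels", BBox(-71.19, 42.23, -70.92, 42.40)),
]

# A search box drawn somewhere over central London.
query = BBox(-0.2, 51.4, 0.0, 51.6)

matches = [name for name, box in datasets if intersects(box, query)]
```

Both the global grid and the street map come back as matches, and the boxes alone give no way to say which is the more useful hit; that would need scale or resolution information.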

We could get similar value by creating links from CKAN packages to entries in a gazetteer of place names and approximate locations.

The links could then be used to help create collections of packages: "apt-get install London" or "apt-get install London 1973".
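
One way this could look, as a sketch only: the gazetteer identifiers, package names, approximate coordinates and the `collection_for` function are all hypothetical, and no real gazetteer service is being called.

```python
# Hypothetical gazetteer: identifier -> place name and approximate location.
gazetteer = {
    "place:london": {"name": "London", "lat": 51.5, "lon": -0.1},
    "place:boston": {"name": "Boston", "lat": 42.4, "lon": -71.1},
}

# Hypothetical links from package names to gazetteer entries.
package_places = {
    "london-bus-stops": ["place:london"],
    "boston-crime-reports": ["place:boston"],
    "london-boroughs-1973": ["place:london"],
}


def collection_for(place_name: str) -> list:
    """Return the sorted names of packages linked to the named place."""
    wanted = {
        place_id
        for place_id, place in gazetteer.items()
        if place["name"].lower() == place_name.lower()
    }
    return sorted(
        package
        for package, place_ids in package_places.items()
        if wanted.intersection(place_ids)
    )


# Roughly what "apt-get install London" would need to resolve.
london_collection = collection_for("London")
```

Because packages point at a gazetteer entry rather than carrying their own geometry, the only data entry needed is picking a place name.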

Shouldn't we just do this with tags?

User-created 'tags' partly serve this purpose already: for example, UK-related packages can be found by looking at the 'uk' tag (http://www.ckan.net/tag/read/uk). Not all of these packages describe the UK, though: some are global-scope environmental data sets that have been produced by UK researchers.

There are also tag-spaces that need gardening, or connecting together. Looking at tag/read/london finds only 2 packages, whereas a full-text search reveals 3 different tags used for London data: [city-london greater-london-authority london]
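
The gardening problem can be sketched as an alias table from tags to a single place. The three London tags are the ones found above; the package names, their other tags and the `packages_for_place` function are invented for illustration.

```python
# The three London tags found by full-text search, mapped to one place.
TAG_TO_PLACE = {
    "city-london": "London",
    "greater-london-authority": "London",
    "london": "London",
}

# Hypothetical packages and their user-created tags.
packages = {
    "pkg-a": ["london", "transport"],
    "pkg-b": ["city-london"],
    "pkg-c": ["greater-london-authority", "budget"],
    "pkg-d": ["uk"],
}


def packages_for_place(place: str) -> list:
    """Return sorted package names carrying any tag that maps to the place."""
    return sorted(
        name
        for name, tags in packages.items()
        if any(TAG_TO_PLACE.get(tag) == place for tag in tags)
    )


by_single_tag = sorted(name for name, tags in packages.items() if "london" in tags)
by_place = packages_for_place("London")
```

Looking up the single 'london' tag finds one package here, while the alias table finds three; the cost is that someone has to maintain the table.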

It would be nice to have URLs that provide extra metadata about places and the connections between them - see the geonames.org semantic web service.

Shouldn't we just do this with an OGC Catalogue Service?

Other services handle metadata for GIS data at a great level of detail - GeoNetwork, Go-Geo!, INSPIRE metadata, AGMAP profile, national geoportals, etc.

This is not about turning CKAN into a catalogue service for geodata, because there is already a lot of investment in that area through INSPIRE.

Rather, the focus here is on using location data to help make more sense of other, related datasets.


Notes: I should probably have put this on the OKF wiki, but I have forgotten how to work MoinMoin.