Training Material for UN Open GIS OpenData

From OSGeo
Revision as of 10:00, 20 January 2019 by Codrina (talk | contribs)



The following educational material has been drafted within the framework of the OSGeo UN Committee Educational Challenge - Open Geospatial Data and software for UN sustainable development goals. The overarching goal is to show that the combination of globally available open (geo)data and the significant progress of free and open source software for geospatial is now sufficient to initiate geospatial analysis worldwide, at small and intermediate scales, to better understand our ecosystem. To that end, we have used OSGeo software to process global open geospatial datasets and compute one selected indicator of a sustainable development goal. The selected indicator is 9.1.1, Proportion of the rural population who live within 2 km of an all-season road (C0901010). It supports the target of developing quality, reliable, sustainable and resilient infrastructure, including regional and transborder infrastructure, to support economic development and human well-being, with a focus on affordable and equitable access for all. The indicator was chosen after a close analysis of all SDGs and their indicators, so as to meet the following criteria:

  1. to have a spatial dimension;
  2. to not be an indicator that is already addressed through another initiative, such as the GEO Wetlands Initiative, WHO Interactive Air Pollution Maps, GEO AquaWatch, ESA CoastColour etc.;
  3. if possible, to not be yet the subject of a published methodology.


We have prepared this educational material for researchers, educators and professionals in local, regional, national or international agencies with minimal to intermediate geospatial information knowledge. We assume our audience already has basic knowledge of geospatial data structures and formats, and has already used GIS software, so as to have basic skills in and an understanding of working with geospatial and tabular data. For that reason, we have limited interaction with the command line, although we have inserted references to it.

Data and software used

For the calculation of the SDG indicator, we have used only QGIS 3.4. We have also taken into consideration the changes that might occur from one version to another, and have thus focused on the functions used rather than on a step-by-step guide. The datasets used for our exercise are listed in the table under Preparing the geospatial data.

Acquired knowledge

After going through the entire educational material, one will be able to:

  • Have a broader view of the types of geospatial data that are open at global scale, as well as of their limitations.
  • Have a deeper understanding of working with geospatial data using dedicated software.
  • Have a consistent knowledge of QGIS software fundamentals.
  • Create cartographic representations of the obtained results.

Educational Material

Open geospatial information and its role in answering UN Sustainable Development Goals

Population dataset description
Global Administrative Units Dataset description
For road-related data, we have decided to use OpenStreetMap data, as it is the only homogeneously designed dataset that is available globally. Without doubt, the amount and the quality of the available data can vary considerably between regions of the world. However, given the clear and consistent definition of each map element and tag, this exercise should be reproducible in any other part of the world.
Yet, given our area of interest, the Tabora county in Tanzania, we must take into consideration specific developments for Africa, more precisely the Highway Tag Africa - Topology of Road Network in African countries and, furthermore, the East Africa Tagging Guidelines.
However, with the global replicability of our educational material in mind, we will also include specifications on a more general scale. Of course, it must be acknowledged that the workflow presented here could require other adjustments, depending on the specifics of the road dataset used in the calculation.
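
To illustrate how tag definitions drive the selection of roads, the following Python sketch filters OSM-style road records by their highway, surface and seasonal tags. The sets of highway classes and surfaces treated as "all-season" here are a hypothetical assumption made for demonstration only, not a rule taken from the tagging guidelines, and the sample records are made up. The rule should be adjusted to the region of interest.

```python
# Hypothetical selection rule: which highway classes and surfaces count as
# "all-season" is an assumption for this sketch, not an official definition.
ALL_SEASON_HIGHWAY = {"motorway", "trunk", "primary", "secondary", "tertiary"}
ALL_SEASON_SURFACE = {"paved", "asphalt", "concrete", "gravel", "compacted"}


def is_all_season(tags):
    """Return True if a road's OSM tags satisfy the assumed all-season rule."""
    if tags.get("highway") not in ALL_SEASON_HIGHWAY:
        return False
    # Roads explicitly tagged as seasonal are excluded.
    if tags.get("seasonal") == "yes":
        return False
    # A missing surface tag is accepted here; a stricter rule may reject it.
    surface = tags.get("surface")
    return surface is None or surface in ALL_SEASON_SURFACE


# Made-up sample records standing in for features downloaded from OSM.
roads = [
    {"id": 1, "tags": {"highway": "primary", "surface": "asphalt"}},
    {"id": 2, "tags": {"highway": "track", "surface": "dirt"}},
    {"id": 3, "tags": {"highway": "tertiary"}},
    {"id": 4, "tags": {"highway": "secondary", "surface": "gravel", "seasonal": "yes"}},
]

all_season_ids = [r["id"] for r in roads if is_all_season(r["tags"])]
print(all_season_ids)
```

Keeping the rule in two small sets makes it easy to swap in the classes prescribed by a regional tagging guideline without touching the filtering logic.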

Preparing the geospatial data

For the scope of this exercise we have chosen the Tabora county of Tanzania. As we strive to create educational material that can be applied regardless of the region of interest, we decided to use datasets that are available at global level. The following table presents the datasets used:

Topic | Name collection/dataset | Abstract | Indicators | Producer/collector | Owner | License | Type of data | Format | Scale/spatial resolution | Edition | CRS | Other | URL
Administrative units | Database of Global Administrative Areas | GADM provides maps and spatial data for all countries and their sub-divisions. | administrative units | University of California, Berkeley, Museum of Vertebrate Zoology, and the International Rice Research Institute (Global Administrative Areas 2009) | GADM | The data are freely available for academic use and other non-commercial use. Redistribution or commercial use is not allowed without prior permission. | vector | GeoPackage, shapefile, geodatabase, KMZ, R formats | n/a | April 2018 | Geographic WGS84 | |
World Population | WorldPop | Alpha version 2010 and 2015 estimates of numbers of people per grid square, with national totals adjusted to match UN population division estimates, and remaining unadjusted. | Settlements, population numbers, birth and pregnancy, age structures, poverty spatial distribution etc. | GeoData Institute, University of Southampton | GeoData Institute, University of Southampton | CC BY 4.0 | raster | GeoTIFF | 100 m | July 2013 | Geographic WGS84 | |
World Population | Global Rural-Urban Mapping project (GRUMP), v1 | To provide a polygon representation of urban areas with city or agglomeration name and time series population estimates. | urban geometries | Socioeconomic Data and Applications Center (sedac) | Socioeconomic Data and Applications Center (sedac) | CC BY 4.0 | vector | shapefile | 30 arc-second | 2006 | Geographic WGS84 | n/a |
World Population | Global Human Built-up And Settlement Extent (HBASE) Dataset From Landsat, v1 (2010) | To provide high spatial resolution estimates of global urban extent derived from global 30m Landsat satellite data for the target year 2010 and a companion dataset to the Global Man-made Impervious Surface (GMIS) dataset. | urban extent | Socioeconomic Data and Applications Center (sedac) | Socioeconomic Data and Applications Center (sedac) | CC BY 4.0 | raster | GeoTIFF | 30 m | 2017 | Geographic WGS84, UTM | n/a |

Step 1

Add all listed data to your project. Create group layers as you bring in the data, so that it is easier to navigate through the datasets once you start processing. Create and save your project, so that you can pick up the work where you left off.
[Layer]-->[Add layer] opens the Data Source Manager, which allows you to load the data: the administrative units (GADM dataset), the population numbers (WorldPop; we will use the TZA_popmap15adj_v2b.tif file) and the urban extent (we will use the global_urban_extent_polygons_v1.01.shp file).
As mentioned, we will use OpenStreetMap data for the road geometry and condition. Bringing in OSM data requires installing a new plugin, OSM Downloader:

[Plugins]-->[Manage and install plugins].

Step 2

The datasets come in different coordinate reference systems, either the geographic WGS 84 (EPSG:4326) or the projected Pseudo-Mercator (EPSG:3857). A geographic coordinate system is based on a spheroid and uses angular units (degrees). Thus the QGIS calculator, for example, returns values in decimal degrees and not in meters. The units used are given in a projection's description, which can be retrieved from [1]. As we will work with road geometries, we must reproject all the datasets to a projected coordinate system, which is based on a 2D plane (the spheroid projected onto a plane) and uses linear units, such as meters. For our study, we need to identify a suitable CRS [2] for our region of interest, the Tabora county in Tanzania. After a quick search, we find WGS 84 / UTM zone 36S (EPSG:32736) to be appropriate for our region.
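
The choice of UTM zone 36S can be cross-checked with a little arithmetic: standard UTM zones are 6 degrees of longitude wide and numbered from 1 starting at 180° W, and the EPSG codes of the WGS 84 / UTM zones are 326xx in the northern hemisphere and 327xx in the southern. The Python sketch below applies this rule. The function name is our own, and the coordinates used for Tabora are approximate values assumed for illustration.

```python
import math


def utm_epsg_for(lon_deg, lat_deg):
    """Return (zone, EPSG code) of the WGS 84 / UTM zone containing a point.

    Standard 6-degree zones only; the special zones around Norway and
    Svalbard are not handled.
    """
    zone = int(math.floor((lon_deg + 180.0) / 6.0)) + 1
    zone = min(zone, 60)  # a longitude of exactly 180 is assigned to zone 60
    # EPSG codes 326xx are northern-hemisphere zones, 327xx southern.
    base = 32600 if lat_deg >= 0 else 32700
    return zone, base + zone


# Approximate coordinates of Tabora town (assumed for illustration).
zone, epsg = utm_epsg_for(32.8, -5.0)
print(zone, epsg)  # 36 32736
```

The same function can be used to pick a candidate CRS when the exercise is repeated for another region, although a national or regional CRS may be preferable where one exists.
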
To reproject vector data in QGIS, we save the layer to a new file with the desired CRS.
Right-click the vector layer you want to reproject and choose [Export]-->[Save features as..]
For raster datasets, we will use gdalwarp, which is available as a processing tool in the Processing toolbox.
[Processing]-->[Toolbox] We can find it by typing the keyword 'reproject' in the search bar.
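
For readers who prefer the command line, the same reprojection can be sketched with the GDAL utilities ogr2ogr (vector) and gdalwarp (raster), both of which accept a -t_srs option for the target CRS. The Python sketch below only assembles and prints the commands. The helper names and the output file names are hypothetical, and actually running the commands assumes that GDAL is installed and on the PATH.

```python
import shlex

TARGET_CRS = "EPSG:32736"  # WGS 84 / UTM zone 36S, the CRS chosen for Tabora


def reproject_vector_cmd(src, dst, t_srs=TARGET_CRS):
    """Build an ogr2ogr call; note that ogr2ogr takes the destination first."""
    return ["ogr2ogr", "-t_srs", t_srs, dst, src]


def reproject_raster_cmd(src, dst, t_srs=TARGET_CRS):
    """Build a gdalwarp call; gdalwarp takes the source first."""
    return ["gdalwarp", "-t_srs", t_srs, src, dst]


# Input names are the files loaded in Step 1; output names are hypothetical.
vector_cmd = reproject_vector_cmd(
    "global_urban_extent_polygons_v1.01.shp", "urban_extent_utm36s.shp"
)
raster_cmd = reproject_raster_cmd(
    "TZA_popmap15adj_v2b.tif", "TZA_popmap15adj_utm36s.tif"
)

print(shlex.join(vector_cmd))
print(shlex.join(raster_cmd))

# To execute a command (requires GDAL on the PATH):
# import subprocess
# subprocess.run(raster_cmd, check=True)
```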

Then, we will clip all layers to the boundary of the selected county, Tabora.
First, export Tabora county from the administrative units level 1 layer. Second, clip all layers by its geometry.
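
As a command-line counterpart to these two operations, the Python sketch below assembles GDAL calls that extract one administrative unit by attribute (ogr2ogr -where), clip a vector layer to it (ogr2ogr -clipsrc) and crop a raster to it (gdalwarp -cutline with -crop_to_cutline). All file names are hypothetical, and the attribute name NAME_1 for the level 1 unit name is an assumption that should be checked against the attribute table of the administrative units layer.

```python
import shlex


def extract_unit_cmd(src, dst, field, value):
    """ogr2ogr call that keeps only the features where field = value."""
    where = f"{field} = '{value}'"
    return ["ogr2ogr", "-where", where, dst, src]


def clip_vector_cmd(src, dst, boundary):
    """ogr2ogr call that clips src to the polygons stored in boundary."""
    return ["ogr2ogr", "-clipsrc", boundary, dst, src]


def clip_raster_cmd(src, dst, boundary):
    """gdalwarp call that crops src to the boundary polygons (cutline)."""
    return ["gdalwarp", "-cutline", boundary, "-crop_to_cutline", src, dst]


# Hypothetical file names; NAME_1 is an assumed attribute name.
commands = [
    extract_unit_cmd("admin_level1_utm36s.shp", "tabora.shp", "NAME_1", "Tabora"),
    clip_vector_cmd("urban_extent_utm36s.shp", "urban_extent_tabora.shp", "tabora.shp"),
    clip_raster_cmd("population_utm36s.tif", "population_tabora.tif", "tabora.shp"),
]

for cmd in commands:
    print(shlex.join(cmd))
```

The commands are printed rather than executed, so the sketch runs without GDAL; each list can be passed to subprocess.run once GDAL is available.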

Step 3

Step 3 produces the rural areas of the Tabora county. According to Wikipedia, Tanzania is divided into regions (GADM administrative level 1), districts (GADM administrative level 2) and wards (GADM administrative level 3).
We will calculate the Rural Access Index on wards, thus we will extract the rural areas from the administrative units level 3 by subtracting the urban extent from the ward polygons. The resulting dataset will be of vector type. [Vector]-->[Geoprocessing tools]-->[Difference]
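
Once the rural population totals per ward are known, the index itself is a simple ratio: the rural population living within 2 km of an all-season road divided by the total rural population. The Python sketch below shows this calculation. The ward names and population figures are made up for illustration, and returning no value for a ward without rural population is our own design choice.

```python
def rural_access_index(rural_pop_near_road, rural_pop_total):
    """Share of the rural population living within 2 km of an all-season road."""
    if rural_pop_total <= 0:
        return None  # no rural population: the index is undefined
    return rural_pop_near_road / rural_pop_total


# Made-up ward totals, however they are obtained from the population raster.
wards = {
    "Ward A": {"near": 1500.0, "total": 6000.0},
    "Ward B": {"near": 900.0, "total": 1200.0},
    "Ward C": {"near": 0.0, "total": 0.0},
}

rai = {name: rural_access_index(w["near"], w["total"]) for name, w in wards.items()}
for name, value in rai.items():
    print(name, "n/a" if value is None else f"{value:.0%}")
```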

Step 4

Step 4 prepares the dataset from which we will extract the

  1. An open-source web service with a database of coordinate systems used in maps worldwide; it allows the discovery of coordinate reference systems used all over the world for creating maps and geodata and for identifying geo-position.
  2. CRS stands for Coordinate Reference System