MapGuide RFC 5 - Enhanced Join Support

This page contains a change request (RFC) for MapGuide Open Source.
More MapGuide RFCs can be found on the MapGuide RFCs page.

Status

  • Submission Date: November 2, 2006
  • Last Modified Date: November 2, 2006
  • Author: Ronnie Louie
  • RFC Status: draft
  • Implementation Status: pending
  • Voting History:
  • Assigned PSC guide(s):

Overview

This proposal extends feature source joins with support for chained joins, inner joins, and 1-to-many joins.

Current join support is limited to the left outer join type. The result therefore contains every feature record from the primary source, even when no secondary record matches and the joined attributes are null. An inner join, by contrast, returns only the primary records that have matching secondary records.
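The difference between the two join types can be illustrated with a minimal sketch in plain Python. It does not use any MapGuide API; the sample records, the key name parcel_id, and the helper join_first_match are all hypothetical and exist only for illustration.

```python
# Hypothetical sample data; not taken from any real feature source.
parcels = [{"parcel_id": "A01"}, {"parcel_id": "A02"}]
owners = [{"parcel_id": "A01", "owner": "Smith"}]


def join_first_match(primary, secondary, key, inner=False):
    """Join each primary record to its first matching secondary record."""
    result = []
    for p in primary:
        match = next((s for s in secondary if s[key] == p[key]), None)
        if match is None:
            if inner:
                continue  # inner join: drop primaries without a match
            # left outer join: keep the primary, secondary attribute is null
            result.append({**p, "owner": None})
        else:
            result.append({**p, "owner": match["owner"]})
    return result


left_outer = join_first_match(parcels, owners, "parcel_id")
inner = join_first_match(parcels, owners, "parcel_id", inner=True)
```

In this sketch the left outer result keeps parcel "A02" with a null owner, while the inner result contains only parcel "A01".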

A chained join is one in which the secondary source of a previously defined join is used as the primary source for another join. This situation occurs when there is a need to join a secondary table to another secondary table, rather than hanging all secondary tables off the primary one.

It is possible that a secondary source within a join might have multiple records that match one primary record. This is a 1 : many case. Current functionality limits the joined records to only the first one, effectively enforcing a 1 : 1 case, such that the additional matching records are not available in the join result. By supporting the 1 : many case, all the joined secondary data is available for further analysis.

Motivation

(This section is currently being edited)

The existing feature join functionality requires the feature source to define an extension to a primary feature class. This extension is defined by identifying the secondary feature source and feature class, along with the attributes from both classes to join on. The extension does not specify the type of join to perform; the join is always a left outer join. There are situations where an inner join on the data is desired, but the current implementation provides no way to choose a join type.

Joins in MapGuide currently consist of one primary feature class joined to one or more secondary feature classes. The secondary class cannot be extended via a join in MapGuide, even though it may make sense to perform such a chained join. For example, a primary table “parcels” is joined to a secondary table of “owners”, but an analysis of age, income level, or family size also requires joining “owners” to “demographics”.
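The parcels, owners, and demographics example can be sketched in plain Python as follows. This is only an illustration of the chaining idea, not MapGuide code; the key names owner_id and household_id, the sample values, and the helper first_match are assumptions made for this sketch.

```python
# Hypothetical sample data and key names, for illustration only.
parcels = [
    {"parcel_id": "A01", "owner_id": 10},
    {"parcel_id": "A02", "owner_id": 11},
]
owners = [{"owner_id": 10, "name": "Smith", "household_id": 500}]
demographics = [{"household_id": 500, "family_size": 4}]


def first_match(records, key, value):
    """Return the first record whose key equals value, or None."""
    return next((r for r in records if r[key] == value), None)


chained = []
for parcel in parcels:
    # First join: primary "parcels" to secondary "owners".
    owner = first_match(owners, "owner_id", parcel["owner_id"])
    # Chained join: the key comes from the secondary record (owner),
    # not from the primary record (parcel).
    demo = None
    if owner is not None:
        demo = first_match(demographics, "household_id", owner["household_id"])
    chained.append({"parcel": parcel, "owner": owner, "demographics": demo})
```

The second lookup depends on an attribute that exists only on the "owners" record, which is why it cannot be expressed as another join hanging directly off "parcels".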

When a primary feature record is joined to a secondary source, there may be one or more matching secondary records. MapGuide retrieves only the first matching secondary record and ignores the rest. Users would need to query the secondary source separately to obtain the additional data. This is cumbersome and possibly error prone, as there is no indication that additional records may exist. We need to add an option to the extension definition to facilitate a 1 : many join result. This result will contain a duplicate feature geometry for each additional matching secondary record. For example, a primary table "parcels" is joined to a secondary table of "owners". A particular parcel "A01" can have 3 different owners. MapGuide currently implements the 1 : 1 case: it retrieves the first encountered owner and displays a single parcel geometry on the map. The other two owners are never displayed in any fashion. If MapGuide supports a 1 : many option, it will be able to provide the details for all three owners, should the need arise.
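The parcel "A01" example can be sketched in plain Python to contrast the 1 : 1 and 1 : many results. This is a rough illustration under assumed data, not the proposed implementation; the geometry strings, owner names, and the helper left_outer_join are hypothetical.

```python
# Hypothetical sample data; geometry is a placeholder string.
parcels = [
    {"parcel_id": "A01", "geometry": "POLYGON(A01)"},
    {"parcel_id": "A02", "geometry": "POLYGON(A02)"},
]
owners = [
    {"parcel_id": "A01", "owner": "Owner 1"},
    {"parcel_id": "A01", "owner": "Owner 2"},
    {"parcel_id": "A01", "owner": "Owner 3"},
]


def left_outer_join(primary, secondary, key, one_to_many=False):
    """Left outer join; optionally emit one row per matching secondary."""
    rows = []
    for p in primary:
        matches = [s for s in secondary if s[key] == p[key]]
        if not matches:
            rows.append({**p, "owner": None})  # no match: null attribute
            continue
        if not one_to_many:
            matches = matches[:1]  # 1 : 1 case keeps the first match only
        for m in matches:
            # Each row repeats the primary record, including its geometry.
            rows.append({**p, "owner": m["owner"]})
    return rows


one_to_one = left_outer_join(parcels, owners, "parcel_id")
one_to_many = left_outer_join(parcels, owners, "parcel_id", one_to_many=True)
```

In the 1 : 1 result parcel "A01" appears once with its first owner; in the 1 : many result it appears three times, each row carrying the same geometry and a different owner.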

Funding/Resources

The effort to implement the proposed changes will be sponsored by Autodesk.

Proposed Changes

Chained joins will be specified using the name of a primary join as a prefix in the FeatureClassProperty attribute value using the following notation: <Primary_Join_name.Attribute_name>, where the dot symbol "." is used as a delimiter.
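As an illustration of this notation, the sketch below splits a FeatureClassProperty value into its join name and attribute name. It is not part of MapGuide; the function name parse_feature_class_property and the sample values are hypothetical, and the sketch assumes that the first "." is the delimiter and that a value without a "." refers to the primary feature class.

```python
def parse_feature_class_property(value):
    """Split 'JoinName.AttributeName' into (join_name, attribute_name).

    Assumption: a value without '.' is an attribute of the primary
    feature class, so the join name is returned as None.
    """
    join_name, sep, attribute = value.partition(".")
    if not sep:
        return None, value
    return join_name, attribute


# Hypothetical values, for illustration only.
chained_ref = parse_feature_class_property("OwnersJoin.OwnerId")
plain_ref = parse_feature_class_property("ParcelId")
```

Here chained_ref identifies an attribute reached through the join named "OwnersJoin", while plain_ref carries no join name.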

The notation for extended properties will use a delimiter to separate property names from their qualifiers in joins. The default symbol will be a vertical bar "|". This delimiter is a customizable string and is not limited to a single character.
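A minimal sketch of delimited property names follows. The RFC does not state the order of qualifier and property name, so this sketch assumes the qualifier comes first; the helper names qualify and split_qualified and the sample names are hypothetical.

```python
DEFAULT_DELIMITER = "|"  # default symbol named in the proposal


def qualify(qualifier, property_name, delimiter=DEFAULT_DELIMITER):
    """Build an extended property name (assumed order: qualifier first)."""
    return f"{qualifier}{delimiter}{property_name}"


def split_qualified(name, delimiter=DEFAULT_DELIMITER):
    """Split an extended property name; qualifier is None if absent."""
    qualifier, sep, property_name = name.partition(delimiter)
    if not sep:
        return None, name
    return qualifier, property_name


default_name = qualify("Owners", "Name")
# The delimiter is a string, so it may be longer than one character.
custom_name = qualify("Owners", "Name", delimiter="::")
```

Because the delimiter is a customizable string, any code that parses extended property names needs to know which delimiter the feature source uses rather than assuming "|".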

Enhanced join support will make use of schema changes already incorporated into Bond for specifying the join type, 1 : many support, and the delimiter symbol for identifying extended attributes.

More details TBD.

Implications

These changes will impact existing join functionality. Existing applications will need to be updated for the new delimited extended property names. Older feature sources may need to be updated or migrated to the new schema. Documentation will need to be updated for the new join functionality.

Test Plan

In addition to the existing tests for left outer joins, new test cases for inner join, chained joins, and 1 : many support will need to be created. These tests should cover joins to file-based sources as well as connection-based sources.