Difference between revisions of "Point Clustering"

From OSGeo
 
(2 intermediate revisions by the same user not shown)
Line 3: Line 3:
 
Please fill this in with any approaches that you have tried for Point Clustering along with code snippets.  Please include discussion on why a particular method worked well or didn't work well and what circumstances it may be good for.
 
Please fill this in with any approaches that you have tried for Point Clustering along with code snippets.  Please include discussion on why a particular method worked well or didn't work well and what circumstances it may be good for.
  
=== Possible Approaches: ===
+
=== Possible Approaches ===
 
* Coordinate interleaving: (1) round the input coordinates, (2) group the points whose rounded coordinates coincide, and (3) average the original coordinates within each group, so that each cluster is positioned at the mean coordinate of its input geometries.
 
* Coordinate interleaving: (1) round the input coordinates, (2) group the points whose rounded coordinates coincide, and (3) average the original coordinates within each group, so that each cluster is positioned at the mean coordinate of its input geometries.
 
* K-means Clustering
 
* K-means Clustering
 
* Hierarchical Clustering
 
* Hierarchical Clustering
 
* Distance calculation for each coordinate pair
 
* Distance calculation for each coordinate pair
 +
 +
=== Input Parameters ===
 +
Depending on algorithm...
 +
 +
Partitioning methods
 +
* Map grid width ("square / Manhattan world", see coordinate interleaving/rounding)
 +
* Some self-correlation threshold (see e.g. k-means)
 +
* Predefined irregular polygons (e.g. zip code boundaries) 
  
 
=== Implementations ===
 
=== Implementations ===
Line 14: Line 22:
 
=== References ===
 
=== References ===
 
* [http://en.wikipedia.org/wiki/Data_clustering Wikipedia] Article on Data Clustering
 
* [http://en.wikipedia.org/wiki/Data_clustering Wikipedia] Article on Data Clustering
* [http://www.nabble.com/%27clustering%27-of-points-t1261780.html#a3347923 PostGIS Mailing List] thread on clustering points
+
* [http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm#pycluster PyCluster]: Python Cluster Functions (2013)
* [http://trac.osgeo.org/postgis/ticket/174 Point Clustering Utility Trigger] enhancement idea reported as ticket to PostGIS Trac.
+
* [http://trac.osgeo.org/postgis/ticket/174 Point Clustering Utility Trigger] enhancement idea reported as ticket to PostGIS Trac (2012).
* [http://www.nabble.com/clustering-points-t1404935.html#a3781371 Here] & [http://www.nabble.com/Visualizing-Point-Data-t1052056.html#a2741608 here]: Mapserver Mailing List threads on clustering points
+
* [http://gis.stackexchange.com/questions/11567/spatial-clustering-with-postgis "Spatial Clustering with PostGIS"] from gis.stackexchange.com (2011)
* [http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm#pycluster PyCluster]: Python Cluster Functions
+
* [http://www.geocomputation.org/2000/GC015/Gc015.htm Using Genetic Algorithms in Clustering Problems]: paper from [http://www.geocomputation.org/ GeoComputation] 2000 conference (2000)
* [http://www.geocomputation.org/2000/GC015/Gc015.htm Using Genetic Algorithms in Clustering Problems]: paper from [http://www.geocomputation.org/ GeoComputation] 2000 conference
+
* [http://www.geocomputation.org/2000/GC024/Gc024.htm Automatic clustering via boundary extraction for mining massive point-data sets]: paper from [http://www.geocomputation.org/ GeoComputation] 2000 conference (2000)
* [http://www.geocomputation.org/2000/GC024/Gc024.htm Automatic clustering via boundary extraction for mining massive point-data sets]: paper from [http://www.geocomputation.org/ GeoComputation] 2000 conference
+
* <s>[http://www.nabble.com/%27clustering%27-of-points-t1261780.html#a3347923 PostGIS Mailing List] thread on clustering points</s>
 +
* <s>[http://www.nabble.com/clustering-points-t1404935.html#a3781371 Here] & [http://www.nabble.com/Visualizing-Point-Data-t1052056.html#a2741608 here]: Mapserver Mailing List threads on clustering points</s>

Latest revision as of 16:16, 12 October 2014

Point Clustering: Various Approaches

Please fill this in with any approaches that you have tried for Point Clustering along with code snippets. Please include discussion on why a particular method worked well or didn't work well and what circumstances it may be good for.

Possible Approaches

  • Coordinate interleaving: (1) round the input coordinates, (2) group the points whose rounded coordinates coincide, and (3) average the original coordinates within each group, so that each cluster is positioned at the mean coordinate of its input geometries.
  • K-means Clustering
  • Hierarchical Clustering
  • Distance calculation for each coordinate pair
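
The three steps of the rounding approach listed first can be sketched in a few lines of Python. This is only an illustration, not code from the original page: the function name grid_cluster and the parameter cell_size are hypothetical, points are assumed to be plain (x, y) tuples in a projected coordinate system, and every point is given equal weight.

```python
import math
from collections import defaultdict


def grid_cluster(points, cell_size):
    """Cluster (x, y) points by grid cell; return one centroid per cell.

    Hypothetical helper: cell_size is the grid width in map units.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")

    cells = defaultdict(list)
    for x, y in points:
        # Step 1: round each coordinate down to an integer cell index.
        # floor() keeps negative coordinates in the correct cell.
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        # Step 2: group the points that share a cell.
        cells[key].append((x, y))

    clusters = []
    for members in cells.values():
        # Step 3: average the ORIGINAL coordinates of each group.
        n = len(members)
        cx = sum(p[0] for p in members) / n
        cy = sum(p[1] for p in members) / n
        clusters.append({"x": cx, "y": cy, "count": n})
    return clusters
```

The approach needs a single pass over the data, but two points that lie close together on opposite sides of a cell boundary end up in different clusters, so the result depends on where the grid happens to fall.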

Input Parameters

Depending on algorithm...

Partitioning methods

  • Map grid width ("square / Manhattan world", see coordinate interleaving/rounding)
  • Some self-correlation threshold (see e.g. k-means)
  • Predefined irregular polygons (e.g. zip code boundaries)
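
For comparison with the parameters above, here is a minimal k-means sketch in pure Python. It assumes the textbook formulation (Lloyd's algorithm), whose main input is the number of clusters k rather than a threshold; the names kmeans, iterations and seed are hypothetical, and the random choice of starting centres is just one of several possible initialisations.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm on (x, y) tuples; returns (centers, groups)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    rng = random.Random(seed)          # fixed seed for repeatable runs
    centers = rng.sample(points, k)    # assumption: random initial centres

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centre
        # (squared Euclidean distance is enough for comparison).
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centers[i][0]) ** 2
                + (p[1] - centers[i][1]) ** 2,
            )
            groups[nearest].append(p)

        # Update step: move each centre to the mean of its group.
        new_centers = []
        for i, group in enumerate(groups):
            if group:
                n = len(group)
                new_centers.append(
                    (sum(p[0] for p in group) / n, sum(p[1] for p in group) / n)
                )
            else:
                new_centers.append(centers[i])  # leave an empty cluster in place

        if new_centers == centers:     # assignments are stable: converged
            break
        centers = new_centers

    return centers, groups
```

Unlike the grid width, k does not depend on map scale, so it would normally have to be re-chosen for each zoom level or extent. If the iteration limit is reached before convergence, the returned groups reflect the assignment made before the final centre update.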

Implementations
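
As a starting point for this section, the following sketch illustrates the "distance calculation for each coordinate pair" approach. It is not taken from the original page: it assumes that two points belong to the same cluster whenever they are connected by a chain of points that are each within max_distance of the next (single-linkage grouping), and the names threshold_cluster and max_distance are hypothetical. It requires Python 3.8 or later for math.dist.

```python
import math


def threshold_cluster(points, max_distance):
    """Group (x, y) points linked by pairwise distances <= max_distance."""
    parent = list(range(len(points)))  # union-find forest, one node per point

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair of points once: O(n^2) distance calculations.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max_distance:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i  # merge the two clusters

    # Collect the points under their root.
    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())
```

Because every pair is compared, the cost grows quadratically with the number of points, which may make this approach practical only for small data sets or after a spatial index has narrowed down the candidate pairs.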

References