Difference between revisions of "Point Clustering"

From OSGeo
 
(6 intermediate revisions by 2 users not shown)
Line 3: Line 3:
 
Please fill this in with any approaches that you have tried for Point Clustering along with code snippets.  Please include discussion on why a particular method worked well or didn't work well and what circumstances it may be good for.
 
  
=== Possible Approaches: ===
+
=== Possible Approaches ===
* Coordinate interleaving
+
* Coordinate interleaving (i.e. 1. rounding input coordinates, 2. grouping/aggregating them by the rounded value, and then 3. averaging their original coordinates so that the cluster position is the mean coordinate of all input geometries in the group).
* K means Clustering
+
* K-means Clustering
* Heirarchical Clustering
+
* Hierarchical Clustering
* distance calculation for each coordinate pair
+
* Distance calculation for each coordinate pair
 +
 
 +
=== Input Parameters ===
 +
Depending on algorithm...
 +
 
 +
Partitioning methods
 +
* Map grid width ("square / Manhattan world", see coordinate interleaving/rounding)
 +
* Some self-correlation threshold (see e.g. k-means)
 +
* Predefined irregular polygons (e.g. zip code boundaries) 
 +
 
 +
=== Implementations ===
 +
* [http://mapserver.org/mapfile/cluster.html MapServer CLUSTER]
  
 
=== References ===
 
 
* [http://en.wikipedia.org/wiki/Data_clustering Wikipedia] Article on Data Clustering
 
* [http://www.nabble.com/%27clustering%27-of-points-t1261780.html#a3347923 PostGIS Mailing List] thread on clustering points
+
* [http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm#pycluster PyCluster]: Python Cluster Functions (2013)
* [http://www.nabble.com/clustering-points-t1404935.html#a3781371 Here] & [http://www.nabble.com/Visualizing-Point-Data-t1052056.html#a2741608 here]: Mapserver Mailing List threads on clustering points
+
* [http://trac.osgeo.org/postgis/ticket/174 Point Clustering Utility Trigger] enhancement idea reported as ticket to PostGIS Trac (2012).
* [http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm#pycluster PyCluster]: Python Cluster Functions
+
* [http://gis.stackexchange.com/questions/11567/spatial-clustering-with-postgis Spatial Clustering with PostGIS] from gis.stackexchange.com (2011)
 +
* [http://www.geocomputation.org/2000/GC015/Gc015.htm Using Genetic Algorithms in Clustering Problems]: paper from [http://www.geocomputation.org/ GeoComputation] 2000 conference (2000)
 +
* [http://www.geocomputation.org/2000/GC024/Gc024.htm Automatic clustering via boundary extraction for mining massive point-data sets]: paper from [http://www.geocomputation.org/ GeoComputation] 2000 conference (2000)
 +
* <s>[http://www.nabble.com/%27clustering%27-of-points-t1261780.html#a3347923 PostGIS Mailing List] thread on clustering points</s>
 +
* <s>[http://www.nabble.com/clustering-points-t1404935.html#a3781371 Here] & [http://www.nabble.com/Visualizing-Point-Data-t1052056.html#a2741608 here]: Mapserver Mailing List threads on clustering points</s>

Latest revision as of 16:16, 12 October 2014

Point Clustering: Various Approaches

Please fill this in with any approaches that you have tried for Point Clustering along with code snippets. Please include discussion on why a particular method worked well or didn't work well and what circumstances it may be good for.

Possible Approaches

  • Coordinate interleaving (i.e. 1. rounding input coordinates, 2. grouping/aggregating them by the rounded value, and then 3. averaging their original coordinates so that the cluster position is the mean coordinate of all input geometries in the group).
  • K-means Clustering
  • Hierarchical Clustering
  • Distance calculation for each coordinate pair
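
The three steps of the coordinate rounding approach above can be sketched roughly as follows. This is only an illustration, not a reference implementation: the function name `grid_cluster`, the `cell_size` parameter, and the choice of `math.floor` for snapping (rather than rounding to the nearest cell) are assumptions made for this example, and points are taken to be plain (x, y) tuples in a projected coordinate system.

```python
import math
from collections import defaultdict


def grid_cluster(points, cell_size):
    """Group (x, y) points by grid cell and return one mean position per cell.

    Hypothetical helper for illustration; cell_size is the grid width
    in the same units as the coordinates.
    """
    cells = defaultdict(list)
    for x, y in points:
        # Step 1: snap the point to the index of the grid cell it falls in.
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        # Step 2: group the original (unrounded) coordinates by cell.
        cells[key].append((x, y))

    clusters = []
    for members in cells.values():
        # Step 3: average the original coordinates of each group.
        n = len(members)
        cx = sum(p[0] for p in members) / n
        cy = sum(p[1] for p in members) / n
        clusters.append({"x": cx, "y": cy, "count": n})
    return clusters
```

A likely trade-off of this approach: it needs only a single pass over the points, but two points that lie close together on opposite sides of a cell boundary end up in different clusters.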

Input Parameters

Depending on algorithm...

Partitioning methods

  • Map grid width ("square / Manhattan world", see coordinate interleaving/rounding)
  • Some self-correlation threshold (see e.g. k-means)
  • Predefined irregular polygons (e.g. zip code boundaries)
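
To illustrate how a single input parameter drives the result, here is a minimal sketch of the pairwise distance approach from the list of possible approaches. It assumes a distance threshold as the only parameter, which this page does not specify; the names `distance_cluster` and `max_dist` are made up for the example. Points closer than the threshold are merged transitively using a small union-find structure, so this behaves like single-linkage clustering cut at `max_dist`.

```python
import math


def distance_cluster(points, max_dist):
    """Merge points whose pairwise Euclidean distance is <= max_dist.

    Hypothetical helper for illustration; returns a list of clusters,
    each cluster being a list of the original points.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every coordinate pair once: O(n^2) distance calculations.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max_dist:
                parent[find(i)] = find(j)

    # Collect points by the root of their set.
    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())
```

Because every pair is compared, the cost grows quadratically with the number of points, so this is probably only practical for small inputs or after a coarser pre-partitioning step such as a map grid. `math.dist` requires Python 3.8 or later.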

Implementations

References