Difference between revisions of "Point Clustering"

From OSGeo
(bootsrap)
 
 
(11 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 +
== Point Clustering: Various Approaches ==
 +
 +
Please fill this in with any approaches that you have tried for Point Clustering along with code snippets.  Please include discussion on why a particular method worked well or didn't work well and what circumstances it may be good for.
 +
 +
=== Possible Approaches ===
 +
* Coordinate interleaving, i.e. (1) rounding the input coordinates, (2) grouping/aggregating the points that share the same rounded coordinates, and then (3) averaging the original coordinates of each group, so that the cluster position is the mean position of its input geometries.
 +
* K-means Clustering
 +
* Hierarchical Clustering
 +
* Distance calculation for each coordinate pair
  
== Point Clustering: Various Approaches ==
+
=== Input Parameters ===
 +
Depending on algorithm...
 +
 
 +
Partitioning methods
 +
* Map grid width ("square / Manhattan world", see coordinate interleaving/rounding)
 +
* Some self-correlation threshold (see e.g. k-means)
 +
* Predefined irregular polygons (e.g. zip code boundaries) 
 +
 
 +
=== Implementations ===
 +
* [http://mapserver.org/mapfile/cluster.html MapServer CLUSTER]
 +
 
 +
=== References ===
 +
* [http://en.wikipedia.org/wiki/Data_clustering Wikipedia] Article on Data Clustering
 +
* [http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm#pycluster PyCluster]: Python Cluster Functions (2013)
 +
* [http://trac.osgeo.org/postgis/ticket/174 Point Clustering Utility Trigger] enhancement idea reported as ticket to PostGIS Trac (2012).
 +
* [http://gis.stackexchange.com/questions/11567/spatial-clustering-with-postgis "Spatial Clustering with PostGIS"] from gis.stackexchange.com (2011)
 +
* [http://www.geocomputation.org/2000/GC015/Gc015.htm Using Genetic Algorithms in Clustering Problems]: paper from [http://www.geocomputation.org/ GeoComputation] 2000 conference (2000)
 +
* [http://www.geocomputation.org/2000/GC024/Gc024.htm Automatic clustering via boundary extraction for mining massive point-data sets]: paper from [http://www.geocomputation.org/ GeoComputation] 2000 conference (2000)
 +
* <s>[http://www.nabble.com/%27clustering%27-of-points-t1261780.html#a3347923 PostGIS Mailing List] thread on clustering points</s>
 +
* <s>[http://www.nabble.com/clustering-points-t1404935.html#a3781371 Here] & [http://www.nabble.com/Visualizing-Point-Data-t1052056.html#a2741608 here]: MapServer Mailing List threads on clustering points</s>

Latest revision as of 16:16, 12 October 2014

Point Clustering: Various Approaches

Please fill this in with any approaches that you have tried for Point Clustering along with code snippets. Please include discussion on why a particular method worked well or didn't work well and what circumstances it may be good for.

Possible Approaches

  • Coordinate interleaving, i.e. (1) rounding the input coordinates, (2) grouping/aggregating the points that share the same rounded coordinates, and then (3) averaging the original coordinates of each group, so that the cluster position is the mean position of its input geometries.
  • K-means Clustering
  • Hierarchical Clustering
  • Distance calculation for each coordinate pair
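
The three steps of the rounding approach listed above can be sketched in Python as follows. This is a minimal sketch rather than a reference implementation: it assumes planar (projected) coordinates given as (x, y) tuples, and the names grid_cluster and cell_size are hypothetical, chosen only for illustration.

```python
import math
from collections import defaultdict


def grid_cluster(points, cell_size):
    """Group (x, y) points by grid cell and return one mean position per cell.

    cell_size is a hypothetical parameter: the grid width in map units.
    """
    cells = defaultdict(list)
    for x, y in points:
        # Step 1: "round" each coordinate to the index of its grid cell.
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        # Step 2: group the points that fall into the same cell.
        cells[key].append((x, y))

    clusters = []
    for members in cells.values():
        # Step 3: average the ORIGINAL coordinates of the cell's members.
        n = len(members)
        cx = sum(x for x, _ in members) / n
        cy = sum(y for _, y in members) / n
        clusters.append({"x": cx, "y": cy, "count": n})
    return clusters


points = [(0.1, 0.2), (0.3, 0.4), (5.2, 5.1), (5.4, 5.3), (9.9, 0.1)]
clusters = grid_cluster(points, cell_size=1.0)
```

A larger cell_size yields fewer, coarser clusters. One known limitation of any fixed grid is that two points lying close together on opposite sides of a cell border end up in different clusters.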

Input Parameters

Depending on algorithm...

Partitioning methods

  • Map grid width ("square / Manhattan world", see coordinate interleaving/rounding)
  • Some self-correlation threshold (see e.g. k-means)
  • Predefined irregular polygons (e.g. zip code boundaries)
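
For the approach that calculates the distance of each coordinate pair, the natural input parameter is a distance threshold. The sketch below is one possible reading of that approach, not a method documented on this page: it links every pair of points that lie within a hypothetical max_dist of each other and reports the connected groups (single-linkage grouping). It assumes planar coordinates and compares all pairs, so the cost grows quadratically with the number of points.

```python
import math
from itertools import combinations


def threshold_cluster(points, max_dist):
    """Group points so that any two points within max_dist share a cluster.

    max_dist is a hypothetical parameter, in the same units as the coordinates.
    """
    parent = list(range(len(points)))  # union-find forest, one node per point

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Distance calculation for each coordinate pair.
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= max_dist:
            parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())


groups = threshold_cluster([(0, 0), (0, 1), (0, 2), (10, 10)], max_dist=1.5)
```

Because groups are formed by chaining, two points in the same group can be farther apart than max_dist as long as intermediate points connect them. For large inputs a spatial index would be needed to avoid comparing every pair.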

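Implementations

  • MapServer CLUSTER (http://mapserver.org/mapfile/cluster.html)

References

  • Wikipedia: Article on Data Clustering (http://en.wikipedia.org/wiki/Data_clustering)
  • PyCluster: Python Cluster Functions (2013) (http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm#pycluster)
  • Point Clustering Utility Trigger: enhancement idea reported as ticket to PostGIS Trac (2012) (http://trac.osgeo.org/postgis/ticket/174)
  • "Spatial Clustering with PostGIS" from gis.stackexchange.com (2011) (http://gis.stackexchange.com/questions/11567/spatial-clustering-with-postgis)
  • Using Genetic Algorithms in Clustering Problems: paper from GeoComputation 2000 conference (2000) (http://www.geocomputation.org/2000/GC015/Gc015.htm)
  • Automatic clustering via boundary extraction for mining massive point-data sets: paper from GeoComputation 2000 conference (2000) (http://www.geocomputation.org/2000/GC024/Gc024.htm)
  • PostGIS Mailing List thread on clustering points (struck through on the page) (http://www.nabble.com/%27clustering%27-of-points-t1261780.html#a3347923)
  • MapServer Mailing List threads on clustering points (struck through on the page) (http://www.nabble.com/clustering-points-t1404935.html#a3781371 and http://www.nabble.com/Visualizing-Point-Data-t1052056.html#a2741608)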