Point Clustering
Revision as of 10:57, 12 October 2014 by Wiki-Sfkeller
Point Clustering: Various Approaches
Please add any approaches you have tried for point clustering, along with code snippets. Include a discussion of why a particular method worked well or poorly, and of the circumstances it may be good for.
Possible Approaches
- Coordinate interleaving (i.e. 1. rounding input coordinates to a grid, 2. grouping/aggregating the points that share the same rounded coordinates, and then 3. averaging their original coordinates so that the cluster position is the mean position of all input geometries in the group).
- K-means Clustering
- Hierarchical Clustering
- Distance calculation for each coordinate pair
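The three steps of the rounding approach listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: points are planar (x, y) tuples, the grid is square, and the names `grid_cluster` and `cell_size` are hypothetical, not part of PostGIS or any library.

```python
from collections import defaultdict
import math


def grid_cluster(points, cell_size):
    """Cluster (x, y) points by snapping them to a square grid.

    Returns one (mean_x, mean_y, count) tuple per occupied grid cell.
    `cell_size` is a hypothetical parameter: the grid width in map units.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")

    cells = defaultdict(list)
    for x, y in points:
        # Step 1: round each coordinate down to its grid cell index.
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        # Step 2: group points that fall into the same cell.
        cells[key].append((x, y))

    clusters = []
    for members in cells.values():
        n = len(members)
        # Step 3: average the original (unrounded) coordinates.
        mean_x = sum(p[0] for p in members) / n
        mean_y = sum(p[1] for p in members) / n
        clusters.append((mean_x, mean_y, n))
    return clusters


points = [(0.1, 0.2), (0.3, 0.4), (5.2, 5.1), (5.4, 5.3), (9.9, 0.1)]
clusters = grid_cluster(points, cell_size=1.0)
for cx, cy, count in clusters:
    print(f"cluster at ({cx:.2f}, {cy:.2f}) with {count} point(s)")
```

The approach needs only a single pass over the input, but two points that lie close together on opposite sides of a cell boundary end up in different clusters.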
Input Parameters
Depending on algorithm...
- Partitioning methods
- Map grid width ("square / Manhattan world", see coordinate interleaving/rounding)
- Some self-correlation threshold (see e.g. k-means)
- Predefined irregular polygons (e.g. zip code boundaries)
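To show how a single threshold parameter can drive the result, here is a minimal sketch of the pairwise-distance approach: every pair of points is compared, and points within the threshold of each other are merged (transitively) into one cluster. This is equivalent to single-linkage hierarchical clustering cut at that distance. The names `threshold_cluster` and `max_dist` are hypothetical, and planar Euclidean distance is assumed, so longitude/latitude data would need to be projected first.

```python
import math


def threshold_cluster(points, max_dist):
    """Group points whose pairwise distance is at most max_dist.

    Membership is transitive: if A is near B and B is near C, all three
    share a cluster. Every pair is compared, so the cost is O(n^2).
    """
    parent = list(range(len(points)))  # union-find forest over point indices

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max_dist:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i  # merge the two clusters

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())


points = [(0, 0), (1, 0), (2, 0), (10, 10), (10, 11)]
loose = threshold_cluster(points, max_dist=1.5)
tight = threshold_cluster(points, max_dist=0.5)
print(len(loose), "clusters at 1.5;", len(tight), "clusters at 0.5")
```

A larger threshold merges more points into fewer clusters. Because membership is transitive, a chain of closely spaced points can produce one long cluster whose end points are far apart.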
Implementations
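As a starting point for this section, a minimal pure-Python sketch of k-means (Lloyd's algorithm) follows. It is not taken from any contributed implementation. It assumes planar (x, y) tuples and, purely to keep the run reproducible, seeds the centroids with the first k input points; practical implementations usually choose random or k-means++ seeds instead. The function name `kmeans` and its parameters are hypothetical.

```python
import math


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm on 2D points. Returns (centroids, labels)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # Simplification (assumption): use the first k points as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)

    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])

        if new_centroids == centroids:
            break  # no centroid moved: converged
        centroids = new_centroids

    return centroids, labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

Unlike the grid and threshold approaches, k-means requires the number of clusters up front, and its result can depend on the initial centroids.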
References
- Wikipedia Article on Data Clustering
- PostGIS Mailing List thread on clustering points
- Point Clustering Utility Trigger enhancement idea, reported as a ticket on the PostGIS Trac.
- Two MapServer Mailing List threads on clustering points
- PyCluster: Python Cluster Functions
- Using Genetic Algorithms in Clustering Problems: paper from GeoComputation 2000 conference
- Automatic clustering via boundary extraction for mining massive point-data sets: paper from GeoComputation 2000 conference