# Point Clustering

From OSGeo

## Point Clustering: Various Approaches

Please add any approaches you have tried for point clustering, along with code snippets. Include a discussion of why a particular method worked well or poorly, and of the circumstances it may suit.

### Possible Approaches

- Coordinate interleaving, i.e. (1) rounding the input coordinates, (2) grouping the points that share the same rounded coordinates, and (3) averaging the original coordinates of each group, so that the cluster position is the mean position of all its input geometries.
- K-means Clustering
- Hierarchical Clustering
- Distance calculation for each coordinate pair
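
As a starting point for discussion, the three steps of the coordinate rounding approach can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `grid_cluster`, the `cell_size` parameter, and the sample points are all hypothetical, and the sketch assumes planar (projected) coordinates. It snaps coordinates with `floor` rather than round-to-nearest, which only shifts the grid origin.

```python
import math
from collections import defaultdict

def grid_cluster(points, cell_size):
    """Group (x, y) points by grid cell; return one cluster per occupied cell.

    Hypothetical helper for illustration; assumes planar coordinates.
    """
    cells = defaultdict(list)
    for x, y in points:
        # Step 1: snap each coordinate to the index of its grid cell.
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        # Step 2: group the points that fall into the same cell.
        cells[key].append((x, y))

    clusters = []
    for members in cells.values():
        n = len(members)
        # Step 3: place the cluster at the mean of the ORIGINAL coordinates,
        # not at the cell centre.
        cx = sum(p[0] for p in members) / n
        cy = sum(p[1] for p in members) / n
        clusters.append({"x": cx, "y": cy, "count": n})
    return clusters

# Made-up sample data: two tight pairs and one isolated point.
points = [(0.1, 0.2), (0.3, 0.4), (5.2, 5.1), (5.4, 5.3), (9.9, 0.1)]
clusters = grid_cluster(points, cell_size=1.0)
for c in clusters:
    print(c)
```

The approach needs only a single pass over the points. Its main weakness is that two points lying very close together but on opposite sides of a cell boundary end up in different clusters.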

### Input Parameters

Depending on the algorithm...

Partitioning methods

- Map grid width ("square / Manhattan world", see coordinate interleaving/rounding)
- Some self-correlation threshold (see e.g. k-means)
- Predefined irregular polygons (e.g. zip code boundaries)
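
The predefined-polygon option can be sketched as follows: each point is assigned to the first polygon that contains it, and each polygon then yields one cluster at the mean of its points. This is only a sketch under stated assumptions. The names `point_in_polygon`, `cluster_by_polygon`, and `zones` are hypothetical, the polygons are simple (no holes) and given as vertex lists in planar coordinates, and the containment test is a textbook ray-casting check whose handling of points exactly on an edge is not specified. In practice a spatial library or database would normally do the containment test.

```python
from collections import defaultdict

def point_in_polygon(x, y, polygon):
    """Ray-casting test. `polygon` is a list of (x, y) vertices, not closed."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be hit
        # by a ray cast from the point towards +x.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def cluster_by_polygon(points, zones):
    """Return {zone name: (mean x, mean y, count)}; unmatched points are dropped."""
    groups = defaultdict(list)
    for x, y in points:
        for name, polygon in zones.items():
            if point_in_polygon(x, y, polygon):
                groups[name].append((x, y))
                break  # assign each point to the first matching zone only
    return {
        name: (
            sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts),
            len(pts),
        )
        for name, pts in groups.items()
    }

# Made-up zones standing in for e.g. zip code boundaries.
zones = {
    "west": [(0, 0), (5, 0), (5, 10), (0, 10)],
    "east": [(5, 0), (10, 0), (8, 10), (5, 10)],
}
points = [(1, 1), (2, 3), (6, 2), (7, 4), (20, 20)]
result = cluster_by_polygon(points, zones)
print(result)
```

Unlike a regular grid, this partitioning needs no width parameter, but the cluster count and shape are fixed entirely by the supplied boundaries.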

### Implementations

### References

- Wikipedia Article on Data Clustering
- PostGIS Mailing List thread on clustering points
- Point Clustering Utility Trigger: enhancement idea reported as a ticket in the PostGIS Trac
- Two MapServer mailing list threads on clustering points
- PyCluster: Python Cluster Functions
- Using Genetic Algorithms in Clustering Problems: paper from GeoComputation 2000 conference
- Automatic clustering via boundary extraction for mining massive point-data sets: paper from GeoComputation 2000 conference