I describe an implementation of recent results that provide high-quality k-means clustering at very high speed. For well-clusterable data, this algorithm provides good bounds on quality. More importantly in practice, it makes clustering feasible in many applications by running roughly three orders of magnitude faster than the standard algorithm based on Lloyd's original work. In addition, the algorithm is highly amenable to implementation using map-reduce and shows essentially linear speedup.
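For reference, the standard Lloyd-style baseline that the speedup is measured against can be sketched in a few lines. This is only a minimal illustration of the classic assign-and-update loop, not the fast algorithm described in the talk; the function names, the random initialization, and the fixed iteration count are assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd_kmeans(points, k, iterations=20, seed=0):
    """Baseline k-means: alternate assignment and centroid updates.

    Assumptions for this sketch: initial centroids are k distinct input
    points chosen at random, and the loop runs a fixed number of passes
    rather than testing for convergence.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for j, members in enumerate(clusters):
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids
```

Every pass compares each point against each centroid, so the cost grows with the number of points times the number of clusters times the number of passes. That repeated full scan is what makes this baseline slow on large data.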

Just as significant, this new algorithm allows clustering with a very large number of clusters, which makes it practical to use as a feature extraction algorithm or as a setup step for nearest-neighbor search.
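One common way to use a set of centroids for these two purposes is sketched below: the distances from a point to every centroid act as a feature vector, and the index of the nearest centroid acts as a bucket key that narrows a nearest-neighbor search. This is a generic illustration under those assumptions, not necessarily how the talk does it, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def centroid_features(point, centroids):
    """Feature vector: Euclidean distance from the point to each centroid."""
    return [math.dist(point, c) for c in centroids]


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point."""
    feats = centroid_features(point, centroids)
    return min(range(len(centroids)), key=feats.__getitem__)


def build_buckets(points, centroids):
    """Group points by nearest centroid so a search can skip other groups."""
    buckets = defaultdict(list)
    for p in points:
        buckets[nearest_centroid(p, centroids)].append(p)
    return buckets


def approximate_nearest(query, buckets, centroids):
    """Approximate search: scan only the bucket the query falls into."""
    candidates = buckets.get(nearest_centroid(query, centroids), [])
    return min(candidates, key=lambda p: math.dist(query, p), default=None)
```

With many clusters, each bucket holds a small fraction of the data, so a query touches far fewer points than a full scan. The result is approximate, because the true nearest neighbor can sit just across a bucket boundary.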
