
Be the first to like this
Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.
Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.
Published on
Ted Dunning describes an implementation of recent results that provide high quality kmeans clustering at very high speed.
"For well clusterable data, this algorithm provides good bounds on quality, but practically speaking, it makes clustering practical in many applications by providing roughly 3 orders of magnitude speedup relative to the standard algorithm based on Lloyd's initial efforts. In addition, the algorithm is highly amenable to implementation using mapreduce and shows essentially linear speedup.
Just as significant, this new algorithm allows clustering with a very large number of clusters which makes it practical to use as a feature extraction algorithm or set up for a nearest neighbor search. "  Ted Dunning
Be the first to like this
Be the first to comment