Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.
Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.
Published on
The book "Probabilistic Data Structures and Algorithms in Big Data Applications" is now available at Amazon and from local bookstores. More details at https://pdsa.gakhov.com
In this presentation, I described popular algorithms that employed Locality Sensitive Hashing (LSH) to solve similarity-related problems. I started with LSH in general and then switched to such algorithms as MinHash (LSH for Jaccard similarity) and SimHash (LSH for cosine similarity). Each approach came with some math that was behind it and simple examples to clarify the theory statements.