Tools and Technologies for Large Scale Data Mining

  1. Tools and Technologies for Large Scale Data Mining. Jaganadh G, Project Lead NLP R&D, 365Media Pvt. Ltd., jaganadhg@gmail.com. DRDO Sponsored National Level Seminar on Challenging Issues on Data Mining Semantic Web, Sri Krishna College of Engineering and Technology, Coimbatore, 27th Jan 2012.
  2. About me: Software Engineer specializing in Text Analytics Research & Development. When free, teaches Python, speaks about FOSS, and blogs at http://jaganadhg.in. Working as Project Lead (NLP) at 365Media Pvt. Ltd., Coimbatore. I am a computational linguist, linguist and Indologist, and book reviewer. Master's Degree holder in Sanskrit from University of Kerala.
  6. Machine Learning: Machine learning is a subfield of artificial intelligence (AI) concerned with algorithms that allow computers to learn. This talk is not aimed at giving an introduction to Machine Learning. Don't expect mathy equations here.
  11. Machine Learning and Our Life: Do you think that Machine Learning has any impact on our life? Yes. In our day-to-day life we may use many Machine Learning powered tools: e-mail spam filtering, product recommendations, fraud detection, etc.
  14. Examples
  15. Tool for building Machine Learning powered products and services: Apache Mahout. Apache Mahout is a scalable machine learning library that supports large data sets. Apache Mahout's goal is to build scalable machine learning libraries. Commercially friendly licence; well documented; healthy community; targeted to developers.
  26. Algorithms in Apache Mahout: Collaborative Filtering; User and Item based recommenders; K-Means and Fuzzy K-Means clustering; Mean Shift clustering; Dirichlet process clustering; Latent Dirichlet Allocation; Singular value decomposition; Parallel Frequent Pattern mining; Complementary Naive Bayes classifier; Random forest decision tree based classifier.
  27. Demo: Building recommendation engines with Mahout; Document classification with Mahout; Some Python stuff on Machine Learning.
  29. Reference: "Mahout in Action", book by Sean Owen and Robin Anil, published by Manning Publications. "Taming Text", by Grant Ingersoll and Tom Morton, published by Manning Publications. "Introducing Apache Mahout", by Grant Ingersoll, an intro to Apache Mahout focused on clustering, classification and collaborative filtering, https://www.ibm.com/developerworks/java/library/j-mahout/index.html. "Programming Collective Intelligence: Building Smart Web 2.0 Applications", http://www.amazon.com/Programming-Collective-Intelligence-Building-Applications/dp/0596529325.
  30. Useful Resources: Apache Mahout site, http://mahout.apache.org/. Apache Mahout mailing list, user@mahout.apache.org. The code which I used for the Mahout demo is available at http://bitbucket.org/jaganadhg/blog/src/tip/bck9/java/. Twenty News Group data set, http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz.
  31. Questions?
  32. Acknowledgments: Thanks to Manning Publications for the review copy of the book "Mahout in Action"; Apache Mahout mailing list members Ted Dunning and Robin Anil for suggestions; Sreejith S and Biju B for Java help; @chelakkandupoda for review and criticism; Mukundhanchari, R&D Director, 365Media Pvt. Ltd., for support and encouragement.
  33. Finally
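
The user-based recommenders listed among the Mahout algorithms above rest on a simple idea: find users whose ratings resemble yours, then suggest items they liked that you have not rated. The following is a minimal sketch of that idea in plain Python, not the code from the talk's demo and not Mahout's API. The ratings data, the function names `cosine_similarity` and `recommend`, and the choice of cosine similarity are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not taken from the talk.
RATINGS = {
    "alice": {"book_a": 5.0, "book_b": 3.0, "book_c": 4.0},
    "bob": {"book_a": 4.0, "book_b": 3.0, "book_c": 5.0, "book_d": 5.0},
    "carol": {"book_a": 1.0, "book_b": 5.0, "book_d": 2.0},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated by similarity-weighted mean rating."""
    scores = {}   # item -> sum of (similarity * rating)
    weights = {}  # item -> sum of similarities that contributed
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0.0:
            continue  # ignore users with no positive overlap
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only suggest items the user has not rated
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    ranked = [(item, scores[item] / weights[item]) for item in scores]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    print(recommend("alice", RATINGS))
```

Calling `recommend("alice", RATINGS)` scores `book_d`, the only item Alice has not rated, by blending Bob's and Carol's ratings according to how closely each of them agrees with Alice. A library such as Mahout is aimed at the same kind of computation on data sets far too large for in-memory dictionaries like these.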