Practical Machine Learning
  A Tutorial on Apache Mahout


               Biju B
         NLP R&D Division
         365Media Pvt. Ltd.
         bijub@365Media.in

             FOSSMEET NITC,
                 Calicut


          4-6 February 2011




   Biju B & Jaganadh G   Practical Machine Learning
nlp r d $ whoweare




     Working in Natural Language Processing (NLP), Machine Learning,
     and Data Mining
     Passionate about Free and Open Source software :-)
     Teach Python and blog at http://jaganadhg.freeflux.net/blog in our
     free time, and contribute to OpenStreetMap
     Work for 365Media Pvt. Ltd., Coimbatore, India
     Twitter handles: @jaganadhg, @bijub




Machine Learning




  Machine Learning
  Machine learning is a subfield of artificial intelligence (AI) concerned with
  algorithms that allow computers to learn.

      This talk is not meant as an introduction to Machine Learning
      Don't expect any mathy equations here




Machine Learning and Our Life



     Does Machine Learning have any impact on our lives?
     Yes
     In our day-to-day life we may use many Machine Learning-powered
     tools
     Recommendation Engines
     Clustering
     Classification, Spam Filtering
     Sentiment Analysis
     Fraud Detection




Mahout



  Mahout
  Open source project of the Apache Software Foundation
  The goal of the project is to build scalable machine learning libraries




Mahout




  Mahout
  Mahout: a person who drives an elephant ;-)
  The name comes from the project’s use of Apache Hadoop, whose logo is an elephant.




Why a new library ?



  There are more than 30 Java libraries and tools available for Machine
  Learning:
  Weka, MALLET, Classifier4J, RapidMiner, ...
      Processing large amounts of data is not an easy task
      Machine Learning tools are expected to produce results quickly
      If the amount of data is too large, it is not easy to process on a
      single machine (even a powerful one)
      Mahout is scalable: the core algorithms in Mahout are implemented
      on top of Apache Hadoop using the map/reduce paradigm
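
The map/reduce paradigm mentioned above can be illustrated with a tiny single-machine sketch. This is plain Python, not Hadoop or Mahout code; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this illustration. On a real cluster the map and reduce steps would run in parallel on many machines, and the framework would do the grouping in between.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key (a framework does this between map and reduce)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values collected for one key."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on one machine."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    docs = ["Mahout runs on Hadoop", "Hadoop runs map and reduce jobs"]
    print(word_count(docs))
```

Because each document is mapped independently and each key is reduced independently, the work can be split across machines, which is what makes the approach scale.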




Algorithms in Apache Mahout



     Collaborative Filtering
     User- and item-based recommenders
     K-Means, Fuzzy K-Means clustering
     Mean Shift clustering
     Dirichlet process clustering
     Latent Dirichlet Allocation
     Singular value decomposition
     Parallel Frequent Pattern mining
     Complementary Naive Bayes classifier
     Random forest (decision-tree-based) classifier
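
To give a feel for one of the algorithms listed above, here is a minimal single-machine sketch of K-Means in plain Python. It is not Mahout's implementation; the function name `kmeans`, its parameters, and the sample points are assumptions made for illustration. Initial centroids are passed in explicitly so the result is reproducible.

```python
def kmeans(points, centroids, iterations=10):
    """Plain K-Means: assign points to the nearest centroid, then recompute centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]
    final_centroids, final_clusters = kmeans(data, [(0.0, 0.0), (10.0, 10.0)])
    print(final_centroids)
```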




Recommendation




    Filter information based on user preferences
    Search a large set of people to find a smaller set with tastes
    similar to yours
    e.g. Amazon’s book recommendations, Netflix’s movie
    recommendations
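
The idea of finding people with similar tastes can be sketched as a small user-based recommender. This is a plain Python illustration, not the Mahout API; the names `cosine_similarity` and `recommend`, the choice of cosine similarity, and the sample ratings are all assumptions made for the example.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two {item: rating} dictionaries."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(ratings, user, k=2, top_n=3):
    """Suggest items the k most similar users rated but `user` has not."""
    similarities = [
        (cosine_similarity(ratings[user], ratings[other]), other)
        for other in ratings
        if other != user
    ]
    neighbours = sorted(similarities, reverse=True)[:k]

    scores, weights = {}, {}
    for similarity, other in neighbours:
        if similarity <= 0:
            continue
        for item, rating in ratings[other].items():
            if item in ratings[user]:
                continue  # skip items the user already rated
            scores[item] = scores.get(item, 0.0) + similarity * rating
            weights[item] = weights.get(item, 0.0) + similarity

    # Rank by similarity-weighted average rating.
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [item for _, item in ranked[:top_n]]


if __name__ == "__main__":
    sample = {
        "alice": {"book_a": 5, "book_b": 4},
        "bob": {"book_a": 5, "book_b": 4, "book_c": 5},
        "carol": {"book_a": 1, "book_d": 5},
    }
    print(recommend(sample, "alice", k=1))
```

With `k=1` only the single most similar user is consulted; a larger neighbourhood trades precision for coverage.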




Document Classification




     Classify documents based on their content
     e.g. spam filtering, priority inbox
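
Content-based classification can be sketched with a tiny spam filter. The example below is a plain multinomial Naive Bayes with add-one smoothing written in Python; it is not Mahout's Complementary Naive Bayes, and the class name `NaiveBayes`, its methods, and the training sentences are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.doc_counts = Counter()               # documents seen per label
        self.word_counts = defaultdict(Counter)   # word frequencies per label
        self.vocabulary = set()

    def train(self, text, label):
        self.doc_counts[label] += 1
        for word in text.lower().split():
            self.word_counts[label][word] += 1
            self.vocabulary.add(word)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.doc_counts.items():
            # Log prior plus log likelihood of each word (add-one smoothing).
            score = log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                score += log((count + 1) / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    model = NaiveBayes()
    model.train("win cash prize now", "spam")
    model.train("cheap cash offer", "spam")
    model.train("meeting agenda for monday", "ham")
    model.train("project status report", "ham")
    print(model.classify("cash prize offer"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied.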




Demo


       Building recommendation engines with Mahout
       Document Classification with Mahout




Reference


     Mahout in Action, by Sean Owen and Robin Anil, published by
     Manning Publications.
     Taming Text, by Grant Ingersoll and Tom Morton, published by
     Manning Publications.
     Introducing Apache Mahout, by Grant Ingersoll: an introduction to
     Apache Mahout focused on clustering, classification and
     collaborative filtering.
     https://www.ibm.com/developerworks/java/library/j-mahout/index.html
     Programming Collective Intelligence: Building Smart Web 2.0
     Applications
     http://www.amazon.com/Programming-Collective-Intelligence-Building-Applications/dp/0596529325




Useful Resources




     Apache Mahout site: http://mahout.apache.org/
     Apache Mahout mailing list: user@mahout.apache.org
     The code used for the Mahout demo is available at
     http://bitbucket.org/jaganadhg/blog/src/tip/bck9/java/
     20 Newsgroups data set:
     http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz




Questions?




Acknowledgments



  Thanks to:
      Manning Publications for a review copy of the book "Mahout in
      Action"
      Apache Mahout mailing list members
      Ted Dunning and Robin Anil for suggestions
      @chelakkandupoda for review and criticism
      Mukundhanchari, R&D Director, 365Media Pvt. Ltd., for support and
      encouragement





Knowledge engineering: from people to machines and back
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 

Mahout Tutorial FOSSMEET NITC

  • 1. Practical Machine Learning: A Tutorial on Apache Mahout. Biju B, NLP R&D Division, 365Media Pvt. Ltd. (bijub@365Media.in). FOSSMEET NITC, Calicut, 4-6 February 2011. Biju B & Jaganadh G, Practical Machine Learning
  • 2. nlp r d $ whoweare: Working in Natural Language Processing (NLP), machine learning, and data mining. Passionate about free and open source :-) Teaches Python in free time, blogs at http://jaganadhg.freeflux.net/blog, and contributes to OpenStreetMap. Works for 365Media Pvt. Ltd., Coimbatore, India. Twitter handles: @jaganadhg, @bijub
  • 6. Machine Learning: Machine learning is a subfield of artificial intelligence (AI) concerned with algorithms that allow computers to learn. This talk does not aim to give an introduction to machine learning. Don't expect mathy equations here.
  • 14. Machine Learning and Our Life: Do you think that machine learning has any impact on our life? Yes. In our day-to-day life we may use many tools powered by machine learning: recommendation engines, clustering, classification and spam filtering, sentiment analysis, and fraud detection.
  • 15. Mahout: An open source project of the Apache Software Foundation. The goal of the project is to build scalable machine learning libraries.
  • 16. Mahout: A mahout is a person who drives an elephant ;-) The name comes from the project's use of Apache Hadoop, whose logo is an elephant.
  • 17. Why a new library? More than 30 Java libraries and tools are available for machine learning: Weka, Mallet, Classifier4J, RapidMiner, and others. Processing a large amount of data is not an easy task, yet machine learning tools are expected to produce quick results. If the amount of data is too large, it is not easy to process on a single machine, even a powerful one. Mahout is scalable: its core algorithms are implemented on top of Apache Hadoop using the map/reduce paradigm.
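The map/reduce paradigm mentioned on this slide can be illustrated with a word count. The following is a minimal single-process sketch in plain Python, not Hadoop or Mahout code; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. On a real cluster the map and reduce calls would run on different machines, which is what makes the approach scale.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values seen for one key."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different machine.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    # Each key could be reduced on a different machine.
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["mahout drives elephant", "hadoop elephant"])
print(counts)
```

Because every map call sees only one document and every reduce call sees only one key, neither step needs the whole data set in memory.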
  • 28. Algorithms in Apache Mahout: collaborative filtering; user- and item-based recommenders; K-Means and Fuzzy K-Means clustering; Mean Shift clustering; Dirichlet process clustering; Latent Dirichlet Allocation; singular value decomposition; parallel frequent pattern mining; Complementary Naive Bayes classifier; random forest decision-tree-based classifier.
  • 29. Recommendation: Filter information based on user preference by searching a large set of people and finding a smaller set with tastes similar to yours, e.g. Amazon's book recommendations and Netflix's movie recommendations.
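The idea of finding a smaller set of people with similar tastes can be sketched as a user-based recommender. This is an illustrative plain-Python sketch, not Mahout's API: the ratings, the user and item names, and the choice of cosine similarity over co-rated items are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical preference data: user -> {item: rating}.
ratings = {
    "alice": {"book_a": 5.0, "book_b": 3.0, "book_c": 4.0},
    "bob": {"book_a": 4.0, "book_b": 3.0, "book_c": 5.0, "book_d": 5.0},
    "carol": {"book_a": 1.0, "book_b": 5.0, "book_e": 4.0},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, k=2):
    """Score unseen items by the similarity-weighted ratings of the k nearest users."""
    others = [(cosine_similarity(ratings[user], ratings[o]), o)
              for o in ratings if o != user]
    neighbours = sorted(others, reverse=True)[:k]
    totals, weights = {}, {}
    for sim, other in neighbours:
        if sim <= 0:
            continue
        for item, rating in ratings[other].items():
            if item in ratings[user]:
                continue  # only recommend items the user has not rated
            totals[item] = totals.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    scored = [(item, totals[item] / weights[item]) for item in totals]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


recommendations = recommend("alice", ratings)
print(recommendations)
```

Here bob's ratings are closest to alice's, so the item only bob has rated ranks first. An item-based recommender would instead compare items by the users who rated them.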
  • 30. Document Classification: Classify documents based on their content, e.g. spam filtering and priority inbox.
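Content-based classification such as spam filtering can be sketched with a small multinomial Naive Bayes classifier. This is a plain-Python illustration under stated assumptions, not Mahout code: it uses the standard (not the Complementary) Naive Bayes variant with add-one smoothing, and the four training messages and the `train` and `classify` names are invented for the example.

```python
from collections import Counter
from math import log


def train(labelled_docs):
    """Count words per label and documents per label."""
    word_counts = {}
    doc_counts = Counter()
    vocab = set()
    for text, label in labelled_docs:
        words = text.lower().split()
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocab.update(words)
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in doc_counts:
        score = log(doc_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += log((word_counts[label][word] + 1)
                         / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data.
model = train([
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status meeting notes", "ham"),
])
print(classify("free money prize", model))
print(classify("monday meeting notes", model))
```

Log-probabilities are summed instead of multiplying raw probabilities so that long documents do not underflow to zero.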
  • 31. Demo: Building recommendation engines with Mahout; document classification with Mahout.
  • 33. Reference: (1) Mahout in Action, by Sean Owen and Robin Anil, Manning Publications. (2) Taming Text, by Grant Ingersoll and Tom Morton, Manning Publications. (3) Introducing Apache Mahout, by Grant Ingersoll, an introduction to Apache Mahout focused on clustering, classification, and collaborative filtering: https://www.ibm.com/developerworks/java/library/j-mahout/index.html (4) Programming Collective Intelligence: Building Smart Web 2.0 Applications: http://www.amazon.com/Programming-Collective-Intelligence-Building-Applications/dp/0596529325
  • 34. Useful Resources: Apache Mahout site: http://mahout.apache.org/ Apache Mahout mailing list: user@mahout.apache.org The code used for the Mahout demo is available at http://bitbucket.org/jaganadhg/blog/src/tip/bck9/java/ Twenty Newsgroups data set: http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz
  • 35. Questions?
  • 36. Acknowledgments: Thanks to Manning Publications for the review copy of the book "Mahout in Action"; the Apache Mahout mailing list members Ted Dunning and Robin Anil for suggestions; @chelakkandupoda for review and criticism; and Mukundhanchari, R&D Director, 365Media Pvt. Ltd., for support and encouragement.
  • 37. Finally