Machine Learning
             What, Where and How
Narinder Kumar
  (nkumar@mercris.com)
Mercris Technologies
  (www.mercris.com)
Agenda


    Definition

    Types of Machine Learning

    Under the Hood

    Languages & Libraries

What is Machine Learning?




Definition

     Field of Study that gives Computers the ability
    to learn without being explicitly programmed
    --Arthur Samuel

              A more Mathematical one


     A Computer program is said to learn from
    Experience E with respect to some class of Tasks T and
    Performance measure P, if its Performance at
    Tasks in T, as measured by P, improves with
    Experience E --Tom M. Mitchell
Related Disciplines

    Sub-Field of Artificial Intelligence

    Deals with Design and Development of Algorithms
   Closely related to Data Mining

    Uses techniques from Statistics, Probability Theory
    and Pattern Recognition


     Not new but growing fast because of Big Data
Types of Machine Learning

    Supervised Machine Learning

          Provide the right answers for a set of example
          questions

          Underlying algorithm learns/infers over a period
          of time

          Tries to return correct answers for similar
          questions


    Unsupervised Machine Learning

          Provide data &

          Let the underlying algorithm find some structure
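
To make the two types concrete, here is a minimal sketch. The algorithm choices are assumptions made for illustration and are not taken from the slides: a 1-nearest-neighbour lookup stands in for supervised learning, and a two-cluster k-means loop on one-dimensional data stands in for unsupervised learning. All function names and data values are hypothetical.

```python
# Supervised: labelled examples (question -> right answer) are provided.
def nearest_neighbour_predict(training_set, x):
    """Return the label of the training example closest to x (1-nearest neighbour)."""
    _, closest_label = min(training_set, key=lambda pair: abs(pair[0] - x))
    return closest_label


# Unsupervised: only data is provided; the algorithm looks for structure.
def two_means(values, iterations=10):
    """Split 1-D values into two groups with a basic k-means loop (k = 2)."""
    low, high = min(values), max(values)  # initial cluster centres
    group_low, group_high = list(values), []
    for _ in range(iterations):
        # Assign every value to the nearer centre.
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each centre to the mean of its group.
        low = sum(group_low) / len(group_low)
        if group_high:
            high = sum(group_high) / len(group_high)
    return group_low, group_high


# Hypothetical data: (feature value, label) pairs.
labelled = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.0, "large")]
print(nearest_neighbour_predict(labelled, 2.0))  # answer inferred from similar examples
print(two_means([1.0, 1.5, 8.0, 9.0]))           # groups found without any labels
```

The supervised function needs the labels to answer anything; the unsupervised one never sees a label and only reports which values belong together.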
Popular Use Cases

    Recommendation Systems
          
              Amazon, Netflix, iTunes Genius, IMDb...

    Up-Selling & Churn Analysis

    Customer Sentiment Analysis

    Market Segmentation

    ...

Understanding Regression




Problem Context




Typical Machine Learning Algorithm

    Training Set -> Learning Algorithm -> Hypothesis

    Input Features -> Hypothesis -> Expected Output
Let's Simplify a bit
[Figure: "House Sizes vs Prices", Prices (1000 USD) plotted against House Sizes (Sq Yards)]

➢
    Goal is to draw a Straight line which covers our Data-Set reasonably
➢
    Our Hypothesis can be   hθ(x) = θ0 + θ1 x
➢
    Such that   hθ(x) ≃ y
In Mathematical Terms
➢
    Hypothesis      hθ ( x)=θ0+θ1 x
➢
    Parameters       θ0 ,θ1
➢
    Cost Function

➢
    We would like to minimize   J (θ0 ,θ1 )
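
The cost function formula did not survive in this transcript. As a sketch only, the code below assumes the halved mean squared error that is commonly used for linear regression, J(θ0, θ1) = (1 / 2m) Σ (hθ(x) − y)², summed over the m training examples. The function names and the data are hypothetical.

```python
def hypothesis(theta0, theta1, x):
    """Straight-line hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Assumed cost: halved mean of squared prediction errors over the training set."""
    m = len(xs)
    squared_errors = ((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return sum(squared_errors) / (2 * m)


# Hypothetical training set: house sizes and prices.
sizes = [100, 200, 300]
prices = [1000, 2000, 3000]

print(cost(0.0, 10.0, sizes, prices))  # the line passes through every point, so the cost is 0.0
print(cost(0.0, 5.0, sizes, prices))   # a worse line gives a larger cost
```

Minimizing J means searching for the pair (θ0, θ1) whose line leaves the smallest total squared gap to the training data.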


Solution : Gradient Descent
➢
    Start with initial
    values of θ0 , θ1

➢
    Keep Changing θ0 , θ1
    until we end up at
    minimum




Mathematically
Repeat Until Convergence




For Our Scenario



Generic Formula
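
The formulas on this slide are missing from the transcript. The sketch below assumes the usual batch update for a straight-line hypothesis with a squared-error cost: both parameters are moved simultaneously against their gradients, scaled by a learning rate. The names `alpha`, `tol` and `max_iterations`, and the sample data, are hypothetical.

```python
def gradient_descent(xs, ys, alpha=0.1, tol=1e-9, max_iterations=10000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    theta0, theta1 = 0.0, 0.0  # initial values
    m = len(xs)
    for _ in range(max_iterations):
        # Prediction error on every training example with the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Treat very small gradients as convergence.
        if abs(grad0) < tol and abs(grad1) < tol:
            break
        # Simultaneous update: both new values use the old parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical data generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
print(round(theta0, 3), round(theta1, 3))
```

If `alpha` is too large the updates can overshoot and diverge; if it is too small the loop needs many more iterations to settle.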

Let's see all this in Action




Extending Regression
➢
    Quadratic Model

➢
    Cubic Model

➢
    Square Root Model

➢
    We can create multiple new Features like

        X2 = X^2        X3 = X^3        X4 = √X
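
A small sketch of how derived features keep the model linear in its parameters even when the fitted curve is not a straight line. The helper names and the example coefficients are hypothetical.

```python
import math


def expand_features(x):
    """Return [x, x^2, x^3, sqrt(x)] for a single non-negative input x."""
    return [x, x ** 2, x ** 3, math.sqrt(x)]


def predict(thetas, x):
    """Linear model over the expanded features; thetas[0] is the intercept."""
    features = [1.0] + expand_features(x)
    return sum(t * f for t, f in zip(thetas, features))


print(expand_features(4.0))                     # [4.0, 16.0, 64.0, 2.0]
print(predict([1.0, 0.0, 0.5, 0.0, 0.0], 4.0))  # quadratic model: 1 + 0.5 * 16 = 9.0
```

Because each new feature is just another input column, the same fitting procedure used for the straight line can be reused unchanged.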
Additional Pointers

➢
    Mean Normalization

➢
    Feature Scaling

➢
    Learning Rate

➢
    Gradient Descent vs Others
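
The slide only names these topics. As an illustration of the first two, the sketch below assumes one common definition of mean normalization, (value − mean) / (max − min), which centres a feature on zero and brings its range to width 1. The function name and the data are hypothetical.

```python
def mean_normalize(values):
    """Rescale values to (v - mean) / (max - min): zero mean, total range of 1."""
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    if spread == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / spread for v in values]


# Hypothetical house sizes in square yards.
house_sizes = [50.0, 150.0, 250.0, 350.0]
scaled = mean_normalize(house_sizes)
print(scaled)
```

Features on similar scales generally let gradient descent work with a single learning rate without one parameter dominating the updates.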



HOW-TO
Languages & Libraries



Languages




Libraries, Tools and Products




A Short Introduction



What is WEKA?

    Developed by Machine Learning Group,
    University of Waikato, New Zealand

    Collection of Machine Learning Algorithms

    Contains tools for
       
           Data Pre-Processing
       
           Classification & Regression
       
           Clustering
       
           Visualization

    Can be embedded inside your application

    Implemented in Java
Main Components


    Explorer


    Experimenter


    Knowledge Flow


    CLI



Terminology

    Training DataSet == Instances

    Each Row in DataSet == Instance

    Instance is Collection of Attributes (Features)

    Types of Attributes
      
          Nominal (True, False, Malignant, Benign,
          Cloudy...)
      
          Real values (6, 2.34, 0...)
      
          String (“Interesting”, “Really like it”, “Hate
          It” ...)
      
          ...
Sample DataSets
@RELATION house

@ATTRIBUTE houseSize real
@ATTRIBUTE lotSize real
@ATTRIBUTE bedrooms real
@ATTRIBUTE granite real
@ATTRIBUTE bathroom real
@ATTRIBUTE sellingPrice real

@DATA
3529,9191,6,0,0,205000
3247,10061,5,1,1,224900
4032,10150,5,0,1,197900
2397,14156,4,1,0,189900
2200,9600,4,0,1,195000
3536,19994,6,1,1,325000
2983,9365,5,0,1,230000


@RELATION CPU

@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}

@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
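
To show how such a file maps onto instances and attributes, here is a minimal reader written for this illustration only. It is not WEKA's own loader, and it handles just the subset of the format seen in the samples: `@relation`, `@attribute` with `real` or a nominal value list, and comma-separated `@data` rows. The function name `parse_arff` is hypothetical.

```python
def parse_arff(text):
    """Parse a small ARFF document into (relation, attributes, instances)."""
    relation, attributes, instances = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            if kind.startswith("{"):
                # Nominal attribute: keep the list of allowed values.
                kind = [v.strip() for v in kind.strip("{}").split(",")]
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
        elif in_data:
            # Each data row is one instance: a value per declared attribute.
            row = {}
            for (name, kind), cell in zip(attributes, line.split(",")):
                row[name] = cell.strip() if isinstance(kind, list) else float(cell)
            instances.append(row)
    return relation, attributes, instances


sample = """@RELATION CPU
@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,85,85,FALSE,no
overcast,83,86,FALSE,yes
"""

relation, attributes, instances = parse_arff(sample)
print(relation, len(attributes), len(instances))
print(instances[0])
```

Nominal values stay as strings and real values become floats, which mirrors the attribute types listed under Terminology.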
WEKA Demo




Apache Mahout
➢
    Collection of Machine Learning Algorithms
➢
    Map-Reduce Enabled (most cases)
➢
    DataSources
       ➢
           Database
       ➢
           File-System
       ➢
           Lucene Integration
➢
    Very Active Community
➢
    Apache License


WEKA vs Apache Mahout
WEKA
➢
    Lots of Algorithms
➢
    Tools for
       ➢
           Modeling
       ➢
           Comparison
       ➢
           Data-Flow
➢
    May need work for running on large data-sets
➢
    License Issues

Apache Mahout
➢
    Fewer Algorithms, but growing
➢
    Lack of tools for Modeling
➢
    Ready by Design for Large Scale
➢
    Vibrant Community
➢
    Apache License
&

An Overview



Google Prediction API 101
➢
    Cloud Based Web Service for Machine Learning
➢
    Exposed as REST API
➢
    Does not require any Machine Learning
    knowledge
➢
    Capabilities
       ➢
           Categorical &
       ➢
           Regression
Working with Google Prediction API




Let's see in Action




Analysis
Very Promising Concept
Can be a powerful tool for SMEs



Not configurable
Data Security
Not Yet Production Ready (IMHO)
Recap
➢
    Very vast
➢
    Huge demand
➢
    Has an Initial Steep Learning Curve
➢
    Several libraries available
➢
    Lot of Innovative work going on currently



nkumar@mercris.com

     @kumar_narinder

     www.mercris.com

http://mercris.wordpress.com

Resources
➢
    Online Machine Learning Course - Prof. Andrew
     Ng, Stanford University
➢
    WEKA Wiki and API docs
➢
    Apache Mahout Wiki
➢
    IBM Developer Works Articles
➢
    Google Prediction API Web Site
➢
    Data Mining : Practical Machine Learning Tools &
     Techniques – Ian H. Witten, Eibe Frank, Mark Hall
➢
    Machine Learning Forums

Scanning the Internet for External Cloud Exposures via SSL Certs
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 

Machine Learning - What, Where and How

  • 1. Machine Learning What, Where and How Narinder Kumar (nkumar@mercris.com) Mercris Technologies (www.mercris.com)
  • 2. Agenda ➢ Definition ➢ Types of Machine Learning ➢ Under the Hood ➢ Languages & Libraries
  • 3. What is Machine Learning?
  • 4. Definition ➢ Field of study that gives computers the ability to learn without being explicitly programmed (Arthur Samuel) ➢ A more mathematical one: a computer program is said to learn from experience E with respect to some task T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E (Tom M. Mitchell)
  • 5. Related Disciplines ➢ Sub-Field of Artificial Intelligence ➢ Deals with Design and Development of Algorithms ➢ Closely related to Data Mining ➢ Uses techniques from Statistics, Probability Theory and Pattern Recognition ➢ Not new but growing fast because of Big Data
  • 6. Types of Machine Learning ➢ Supervised Machine Learning: provide right set of answers for different set of questions; underlying algorithm learns/infers over a period of time; tries to return correct answers for similar questions ➢ Unsupervised Machine Learning: provide data and let underlying algorithm find some structure
  • 7. Popular Use Cases ➢ Recommendation Systems (Amazon, Netflix, iTunes Genius, IMDb...) ➢ Up-Selling & Churn Analysis ➢ Customer Sentiment Analysis ➢ Market Segmentation ➢ ...
  • 10. Typical Machine Learning Algorithm [Diagram: a Training Set feeds a Learning Algorithm, which produces a Hypothesis; the Hypothesis maps Input Features to an Expected Output]
  • 11. Let's Simplify a bit ➢ Goal is to draw a straight line which covers our Data-Set reasonably ➢ Our Hypothesis can be hθ(x) = θ0 + θ1 x ➢ Such that hθ(x) ≃ y [Plot: House Sizes vs Prices; x-axis: House Sizes (Sq Yards), y-axis: Prices (1000 USD)]
  • 12. In Mathematical Terms ➢ Hypothesis hθ(x) = θ0 + θ1 x ➢ Parameters θ0, θ1 ➢ Cost Function J(θ0, θ1) ➢ We would like to minimize J(θ0, θ1)
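A minimal sketch of the cost function named on this slide, assuming the squared-error form that is commonly paired with a straight-line hypothesis (for example in the Andrew Ng course listed under Resources). The symbol m (number of training examples) and the superscript (i) (index of a training example) are notation introduced here for illustration.

```latex
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right)^2,
\qquad h_\theta(x) = \theta_0 + \theta_1 x
```

Under this assumption, J is zero only when the line passes through every training point, and it grows with the squared vertical distance between the line and the data.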
  • 13. Solution: Gradient Descent ➢ Start with initial values of θ0, θ1 ➢ Keep changing θ0, θ1 until we end up at a minimum
  • 14. Mathematically ➢ Repeat Until Convergence ➢ For Our Scenario ➢ Generic Formula
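The "repeat until convergence" loop can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the presenter's code: it assumes a squared-error cost, the usual update θj := θj − α · ∂J/∂θj applied to both parameters simultaneously, and a stopping rule based on step size. The names `gradient_descent`, `alpha`, `tol` and `max_iters` are hypothetical.

```python
def gradient_descent(xs, ys, alpha=0.05, tol=1e-12, max_iters=100_000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent.

    Assumes the squared-error cost J = (1 / 2m) * sum((h(x) - y)^2).
    Stops when both parameter updates are smaller than `tol`
    or after `max_iters` iterations, whichever comes first.
    """
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # initial values
    for _ in range(max_iters):
        # Prediction error for every training example.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to theta0 and theta1.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both steps use the old parameter values.
        step0, step1 = alpha * grad0, alpha * grad1
        theta0 -= step0
        theta1 -= step1
        if abs(step0) < tol and abs(step1) < tol:
            break  # converged: parameters have effectively stopped moving
    return theta0, theta1


if __name__ == "__main__":
    # Toy data lying exactly on y = 1 + 2x (made up for illustration).
    sizes = [0.0, 1.0, 2.0, 3.0, 4.0]
    prices = [1.0, 3.0, 5.0, 7.0, 9.0]
    print(gradient_descent(sizes, prices))
```

The learning rate `alpha` matters: too large and the updates overshoot and diverge, too small and convergence is slow. With real house sizes in the thousands, the inputs would normally be scaled first.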
  • 15. Let's see all this in Action
  • 16. Extending Regression ➢ Quadratic Model ➢ Cubic Model ➢ Square Root Model ➢ We can create multiple new Features like X2 = X², X3 = X³, X4 = √X
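Creating the extra features is a one-line transformation per input value. The sketch below assumes a single non-negative numeric input (such as house size); the function name `expand_features` is hypothetical.

```python
import math


def expand_features(x):
    """Return [X, X^2, X^3, sqrt(X)] for one non-negative input value.

    A linear model fitted on these columns can represent quadratic,
    cubic and square-root relationships in the original input.
    """
    return [x, x ** 2, x ** 3, math.sqrt(x)]


if __name__ == "__main__":
    print(expand_features(4.0))  # [4.0, 16.0, 64.0, 2.0]
```

Because X² and X³ grow much faster than X, the new columns end up on very different scales, which is one reason scaling is usually applied before gradient descent.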
  • 17. Additional Pointers ➢ Mean Normalization ➢ Feature Scaling ➢ Learning Rate ➢ Gradient Descent vs Others
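Mean normalization and feature scaling can be sketched together. This version assumes one common convention, (value − mean) / (max − min); dividing by the standard deviation instead is an equally common alternative. The function name `mean_normalize` is hypothetical.

```python
def mean_normalize(values):
    """Rescale a feature column to zero mean and a range of at most 1.

    Uses (value - mean) / (max - min). A constant column has no spread,
    so it is mapped to all zeros instead of dividing by zero.
    """
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    if spread == 0:
        return [0.0 for _ in values]
    return [(v - mean) / spread for v in values]


if __name__ == "__main__":
    print(mean_normalize([1.0, 2.0, 3.0]))  # [-0.5, 0.0, 0.5]
```

Keeping all features in a similar range lets one learning rate work reasonably for every parameter, so gradient descent tends to need fewer iterations.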
  • 19. Languages
  • 20. Libraries, Tools and Products
  • 22. What is WEKA? ➢ Developed by Machine Learning Group, University of Waikato, New Zealand ➢ Collection of Machine Learning Algorithms ➢ Contains tools for Data Pre-Processing, Classification & Regression, Clustering, Visualization ➢ Can be embedded inside your application ➢ Implemented in Java
  • 23. Main Components ➢ Explorer ➢ Experimenter ➢ Knowledge Flow ➢ CLI
  • 24. Terminology ➢ Training DataSet == Instances ➢ Each Row in DataSet == Instance ➢ Instance is a Collection of Attributes (Features) ➢ Types of Attributes: Nominal (True, False, Malignant, Benign, Cloudy...); Real values (6, 2.34, 0...); String ("Interesting", "Really like it", "Hate It"...); ...
  • 25. Sample DataSets
      @RELATION house
      @ATTRIBUTE houseSize real
      @ATTRIBUTE lotSize real
      @ATTRIBUTE bedrooms real
      @ATTRIBUTE granite real
      @ATTRIBUTE bathroom real
      @ATTRIBUTE sellingPrice real
      @DATA
      3529,9191,6,0,0,205000
      3247,10061,5,1,1,224900
      4032,10150,5,0,1,197900
      2397,14156,4,1,0,189900
      2200,9600,4,0,1,195000
      3536,19994,6,1,1,325000
      2983,9365,5,0,1,230000

      @RELATION CPU
      @attribute outlook {sunny, overcast, rainy}
      @attribute temperature real
      @attribute humidity real
      @attribute windy {TRUE, FALSE}
      @attribute play {yes, no}
      @data
      sunny,85,85,FALSE,no
      sunny,80,90,TRUE,no
      overcast,83,86,FALSE,yes
      rainy,70,96,FALSE,yes
      rainy,68,80,FALSE,yes
      rainy,65,70,TRUE,no
      overcast,64,65,TRUE,yes
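To make the file layout concrete, here is a minimal reader for the subset of ARFF used in these samples. It is a sketch, not WEKA's parser: it assumes one declaration per line, comma-separated data rows, and no quoted values, sparse rows or missing-value handling. The function name `parse_arff` is hypothetical.

```python
def parse_arff(text):
    """Parse a simple ARFF document into (relation, attributes, rows).

    attributes is a list of (name, type_spec) pairs, where type_spec is
    the raw text after the name, e.g. 'real' or '{yes, no}'.
    rows is a list of lists of strings, one per data line.
    """
    relation = None
    attributes = []
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        keyword = line.lower()  # ARFF keywords are case-insensitive
        if in_data:
            rows.append([field.strip() for field in line.split(",")])
        elif keyword.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif keyword.startswith("@attribute"):
            _, name, type_spec = line.split(None, 2)
            attributes.append((name, type_spec))
        elif keyword.startswith("@data"):
            in_data = True
    return relation, attributes, rows


if __name__ == "__main__":
    sample = """@RELATION house
@ATTRIBUTE houseSize real
@ATTRIBUTE sellingPrice real
@DATA
3529,205000
3247,224900
"""
    print(parse_arff(sample))
```

Each data row becomes one instance, and the position of a value in the row ties it to the attribute declared at the same position in the header.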
  • 26. WEKA Demo
  • 28. Apache Mahout ➢ Collection of Machine Learning Algorithms ➢ Map-Reduce Enabled (most cases) ➢ DataSources: Database, File-System, Lucene Integration ➢ Very Active Community ➢ Apache License
  • 29. WEKA vs Apache Mahout
      WEKA: ➢ Lot of Algorithms ➢ Tools for Modeling, Comparison, Data-Flow ➢ May need work for running on large data-sets ➢ License Issues
      Apache Mahout: ➢ Lesser number of Algorithms but growing ➢ Lack of tools for Modeling ➢ Ready by Design for Large Scale ➢ Vibrant Community ➢ Apache License
  • 31. Google Prediction API 101 ➢ Cloud Based Web Service for Machine Learning ➢ Exposed as REST API ➢ Does not require any Machine Learning knowledge ➢ Capabilities: Categorical & Regression
  • 32. Working with Google Prediction API
  • 33. Let's see in Action
  • 34. Analysis ➢ Very Promising Concept ➢ Can be a powerful tool for SMEs ➢ Not configurable ➢ Data Security ➢ Not Yet Production Ready (IMHO)
  • 35. Recap ➢ Very vast ➢ Huge demand ➢ Has an Initial Steep Learning Curve ➢ Several libraries available ➢ Lot of Innovative work going on currently
  • 36. nkumar@mercris.com @kumar_narinder www.mercris.com http://mercris.wordpress.com
  • 37. Resources ➢ Online Machine Learning Course, Prof. Andrew Ng, Stanford University ➢ WEKA Wiki and API docs ➢ Apache Mahout Wiki ➢ IBM Developer Works Articles ➢ Google Prediction API Web Site ➢ Data Mining: Practical Machine Learning Tools & Techniques, Ian H. Witten, Eibe Frank, Mark Hall ➢ Machine Learning Forums