Introduction to Machine Learning
Tiberio Caetano
NICTA and Australian National University




Friday, 24 February 2012
Quick calibration:


                           Who has heard of Machine Learning?


                            Who has used Machine Learning?


              Who has built new Machine Learning tools?




PROBLEM:

DATA -> ACTIONABLE KNOWLEDGE

That’s roughly the problem Machine Learning addresses


DATA vs. KNOWLEDGE

- Is this email (data) spam or not spam (knowledge)?

- Is there a face (knowledge) in this picture (data)?

- Should I lend money to this customer (knowledge) given his spending behaviour (data)?



Knowledge is not concrete


                                “Face” is an abstraction
                                “Spam” is an abstraction
                           “Who to lend to” is an abstraction


You don’t find faces, spam or financial advice in datasets; you just find bits


We have data, but we want abstractions. How do we get from one to the other?

What is an abstraction anyway?

• Anything whose description does not depend exclusively on the bits you have

• Notion of generalisation is fundamental

• Abstraction always involves assumptions
Ready to define Machine Learning:

• Machine Learning is the science of automating the process of abstraction from raw data and assumptions

(Raw Data + Assumptions) -> Machine Learning -> Abstraction
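A minimal sketch of the definition above: the same raw data fed through two different assumptions gives two different abstractions. The data, the function names and the two assumptions are illustrative only and do not come from the talk.

```python
# Same raw data, two different assumptions, two different abstractions.
# All names and numbers are illustrative assumptions.

data = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]  # raw (x, y) observations


def fit_constant(points):
    """Assumption A: y does not depend on x. Abstraction: the mean of y."""
    mean_y = sum(y for _, y in points) / len(points)
    return lambda x: mean_y


def fit_nearest_neighbour(points):
    """Assumption B: nearby x values have similar y values."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


constant_model = fit_constant(data)
neighbour_model = fit_nearest_neighbour(data)

# Generalisation: both models answer a question about x = 10,
# a value that never appears in the data, and they disagree.
print(constant_model(10.0))   # 1.0
print(neighbour_model(10.0))  # 2.0
```

Neither answer is contained in the bits of `data`; each one follows only once an assumption is added.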

Data: (painted image) + (dataset of normal images)

Assumption: the non-painted parts of the painted image behave like the images in the dataset

Abstraction: corrected image
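A toy sketch of this example, assuming images are flat lists of grey levels and `None` marks a painted-over pixel. It is not the method behind the slide's picture: it simply copies the missing pixels from the dataset image that best matches the non-painted part. All names and values are hypothetical.

```python
# Images are flat lists of grey levels; None marks a painted-over pixel.
# Toy data, purely illustrative.

dataset = [
    [0, 0, 0, 0],
    [9, 9, 9, 9],
    [0, 9, 0, 9],
]
painted = [9, None, 9, None]


def distance(image, reference):
    """Squared distance, computed on the non-painted pixels only."""
    return sum((a - b) ** 2 for a, b in zip(image, reference) if a is not None)


def correct(image, references):
    """Fill painted pixels from the reference that best matches the rest."""
    best = min(references, key=lambda ref: distance(image, ref))
    return [b if a is None else a for a, b in zip(image, best)]


print(correct(painted, dataset))  # [9, 9, 9, 9]
```

The assumption does the work here: the painted pixels are filled in only because the visible pixels are taken to behave like a dataset image.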



Several forms of abstraction


                                  Cluster data
                                  Classify data
                               Predict from data
                                Summarise data
                              Decide based on data
                                      etc...




Clustering

http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/
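As one possible illustration of clustering, here is a small sketch of k-means (Lloyd's algorithm) in one dimension. The slide does not say which algorithm its figure used, so the algorithm, the points and the starting centres are all assumptions made for the example.

```python
def kmeans_1d(points, centres, iterations=10):
    """Lloyd's algorithm in one dimension, starting from the given centres."""
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        # Update step: each centre moves to the mean of its cluster.
        # An empty cluster keeps its previous centre.
        centres = [
            sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
    return centres, clusters


points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centres, clusters = kmeans_1d(points, centres=[0.0, 6.0])
print(centres)
print(clusters)
```

No labels are given: the two groups are an abstraction produced from the points plus the assumption that groups are compact around a centre.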
Dimensionality Reduction and Visualization




                 http://isomap.stanford.edu/datasets.html
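The link above points to Isomap, a nonlinear method. As a much simpler stand-in, the sketch below uses linear PCA (computed by power iteration) to reduce 2-D points to one coordinate each. The data and function names are illustrative assumptions.

```python
import math


def top_principal_direction(points, iterations=100):
    """Leading eigenvector of the 2x2 covariance matrix via power iteration."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centred = [(x - mean_x, y - mean_y) for x, y in points]
    cxx = sum(x * x for x, _ in centred) / n
    cxy = sum(x * y for x, y in centred) / n
    cyy = sum(y * y for _, y in centred) / n
    vx, vy = 1.0, 0.0  # arbitrary starting direction
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalise.
        vx, vy = cxx * vx + cxy * vy, cxy * vx + cyy * vy
        norm = math.hypot(vx, vy)
        vx, vy = vx / norm, vy / norm
    return (vx, vy), centred


points = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]
direction, centred = top_principal_direction(points)

# Dimensionality reduction: one coordinate per point instead of two.
projected = [x * direction[0] + y * direction[1] for x, y in centred]
print(direction)
print(projected)
```

The points lie close to a diagonal line, so a single coordinate along that line keeps most of the variation.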
Regression
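A minimal regression sketch, assuming a straight-line model fitted by ordinary least squares. The slide itself only shows a picture, so the model choice and the toy data are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1

slope, intercept = fit_line(xs, ys)
print(slope, intercept)          # 2.0 1.0
print(slope * 4.0 + intercept)   # 9.0, a prediction for an unseen x
```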




Classification




                           {spam; not spam}   {0,1,2,3,4,5,6,7,8,9}
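A sketch of the {spam; not spam} case using a tiny naive Bayes classifier with add-one smoothing. The training messages are made up, and naive Bayes is only one of many possible classifiers; the talk does not name one.

```python
import math
from collections import Counter

# Hypothetical training set: (message, label).
train = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting at noon", "not spam"),
    ("lunch meeting tomorrow", "not spam"),
]


def train_naive_bayes(examples):
    """Count words per label and messages per label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability score."""
    word_counts, label_counts, vocabulary = model
    total = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        n_words = sum(counts.values())
        score = math.log(label_counts[label] / total)
        for word in text.split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (n_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train_naive_bayes(train)
print(classify("cheap money", model))       # spam
print(classify("meeting tomorrow", model))  # not spam
```

Neither test message appears in the training set, so the classifier is generalising rather than looking anything up.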




Structured Prediction


                       Image Understanding




                           Protein Structure
                              Prediction




                           Machine Translation




                                                 Image credit: S. Gould



                               Chess, NY, Kasparov, WTC




                              Kangaroo, Sun, Sea, Australia
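In structured prediction the output is a whole object whose parts are chosen jointly, not one label at a time. The slides use images, proteins and translation; as a smaller stand-in, the sketch below labels a two-word sequence with Viterbi decoding. The states, words and probabilities are invented for illustration.

```python
def viterbi(observations, states, start, trans, emit):
    """Most probable state sequence under a first-order chain model."""
    # best[s] = (probability of the best path ending in state s, that path)
    best = {s: (start[s] * emit[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            prob, path = max(
                ((best[prev][0] * trans[prev][s], best[prev][1]) for prev in states),
                key=lambda candidate: candidate[0],
            )
            new_best[s] = (prob * emit[s][obs], path + [s])
        best = new_best
    return max(best.values(), key=lambda candidate: candidate[0])[1]


# Hypothetical model parameters.
states = ["noun", "verb"]
start = {"noun": 0.8, "verb": 0.2}
trans = {
    "noun": {"noun": 0.3, "verb": 0.7},
    "verb": {"noun": 0.6, "verb": 0.4},
}
emit = {
    "noun": {"dogs": 0.7, "bark": 0.3},
    "verb": {"dogs": 0.1, "bark": 0.9},
}

print(viterbi(["dogs", "bark"], states, start, trans, emit))  # ['noun', 'verb']
```

The label chosen for each word depends on its neighbours through `trans`, which is what makes the output structured.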



What Machine Learning IS NOT

                                       Find 01001000:




                Machine Learning is not exact pattern matching

This is “just” classical computer science: a classical “database query”, i.e. deduction

Machine Learning involves induction
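A small sketch of the contrast, with made-up bit strings and labels. The first part is exact pattern matching: the answer is fully determined by the data. The second part is induction: a nearest-neighbour rule, chosen here only for brevity, labels a byte that never appears among the examples.

```python
# Deduction: exact pattern matching. The answer follows from the bits alone.
bits = "1101001000110"
print(bits.find("01001000"))  # 2

# Induction: generalise from labelled examples to an unseen input.
# The examples and the labels "A" and "B" are hypothetical.
examples = [
    ("00000000", "A"),
    ("00010000", "A"),
    ("11111111", "B"),
    ("11101111", "B"),
]


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(byte):
    """Label of the closest example (a nearest-neighbour assumption)."""
    return min(examples, key=lambda example: hamming(byte, example[0]))[1]


query = "01001000"
print(query in dict(examples))  # False: an exact lookup has no answer
print(predict(query))           # A
```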
But Machine Learning IS NOT classical statistics either




- Complex rather than simple models (forget Gaussianity, forget linearity)

- Numerical rather than analytical solutions (forget pencil-and-paper: need hardcore numerical optimization)

- VERY high- rather than low-dimensional data (p >> n rather than n >> p, with p features and n samples)
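One consequence of having more features than samples can be seen in a toy linear model. With n = 2 samples and p = 3 features, the two weight vectors below (hand-picked for illustration) both fit the training data exactly, yet disagree on a new point. The data alone cannot choose between them; an extra assumption is needed.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# n = 2 samples, p = 3 features (toy numbers).
X = [[1, 0, 1],
     [0, 1, 1]]
y = [1, 1]

w_a = [1, 1, 0]
w_b = [0, 0, 1]

# Both weight vectors reproduce the training targets exactly.
print([dot(row, w_a) for row in X])  # [1, 1]
print([dot(row, w_b) for row in X])  # [1, 1]

# They disagree on an unseen input.
x_new = [1, 1, 0]
print(dot(x_new, w_a))  # 2
print(dot(x_new, w_b))  # 0
```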

Some popular technologies driven by Machine Learning

     Recommender Systems
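A rough sketch of one common recommender idea, user-based collaborative filtering with cosine similarity. The talk does not describe a specific method; the users, films and ratings below are invented.

```python
import math

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "ann": {"film_a": 5, "film_b": 4, "film_c": 1},
    "bob": {"film_a": 4, "film_b": 5, "film_d": 5},
    "cat": {"film_c": 5, "film_d": 1, "film_e": 4},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors."""
    common = set(u) & set(v)
    dot = sum(u[item] * v[item] for item in common)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user):
    """Unseen item with the highest similarity-weighted rating."""
    scores = {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        weight = cosine(ratings[user], their_ratings)
        for item, rating in their_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + weight * rating
    return max(scores, key=scores.get)


print(recommend("ann"))  # film_d
```

Ann's tastes resemble Bob's far more than Cat's, so Bob's high rating for `film_d` dominates the score.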




        Social media




Big Data and Machine Learning

Parallelism is crucial

- Linear algebraic approaches favoured (matrix multiplication-based)

- Much of Feature Extraction can be parallelised

- Model Training is another story: it usually needs syncing
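The last two points can be sketched as follows. Feature extraction is a plain parallel map because each document is independent. Training by gradient descent needs a synchronisation point at every step because all workers share the same parameter. The documents, shards, learning rate and the use of `ThreadPoolExecutor` (chosen only to keep the sketch self-contained) are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

documents = ["win money now", "meeting at noon", "cheap money offer"]


def extract_features(text):
    """Per-document features, independent of every other document."""
    words = text.split()
    return {"n_words": len(words), "has_money": int("money" in words)}


# Feature extraction: no communication between workers is needed.
with ThreadPoolExecutor(max_workers=3) as pool:
    features = list(pool.map(extract_features, documents))
print(features)

# Training: fit y = w * x on data split across two shards.
shards = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0)],
]  # (x, y) pairs generated from y = 2x


def shard_gradient(args):
    """Gradient of the mean squared error on one shard for the shared w."""
    w, shard = args
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


w = 0.0
with ThreadPoolExecutor(max_workers=2) as pool:
    for _ in range(50):
        grads = list(pool.map(shard_gradient, [(w, shard) for shard in shards]))
        # Synchronisation point: every worker must finish before w changes.
        w -= 0.05 * sum(grads) / len(grads)
print(w)
```

The extraction step would scale by simply adding workers, while each training step has to wait for the slowest shard before the shared weight can be updated.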

Machine Learning and Data Mining


Data Mining is a buzzword, and in that sense it includes Machine Learning.

In a stricter sense, Data Mining is often associated with data analysis that does not necessarily involve predictive analytics (the hallmark of Machine Learning).




When is Machine Learning helpful?

DATA -> ACTIONABLE KNOWLEDGE

When you don’t really know how to find an explicit (bit-level) description for your abstraction or “actionable knowledge”

                                 And this is common!!

http://tiberiocaetano.com


           http://www.nicta.com.au/research/machine_learning



