Introduction to Machine Learning



Tiberio Caetano, NICTA
Meetup #1, 23 Feb 2012 - http://sydney.bigdataaustralia.com.au/events/49103992/

Published in: Technology, Education


  1. 1. Introduction to Machine Learning. Tiberio Caetano, NICTA and Australian National University. Friday, 24 February 2012
  6. 6. Quick calibration: Who has heard of Machine Learning? Who has used Machine Learning? Who has built new Machine Learning tools?
  7. 7. PROBLEM: DATA → ACTIONABLE KNOWLEDGE. That’s roughly the problem Machine Learning addresses.
  8. 8. BLUE: DATA. RED: KNOWLEDGE. - Is this email spam or not spam? - Is there a face in this picture? - Should I lend money to this customer given his spending behaviour?
  9. 9. Knowledge is not concrete: “Face” is an abstraction, “Spam” is an abstraction, “Who to lend to” is an abstraction. You don’t find faces, spam or financial advice in datasets; you just find bits.
  10. 10. ? We have data, but we want abstractions.
  11. 11. What is an abstraction anyway? • Anything whose description does not depend exclusively on the bits you have • The notion of generalisation is fundamental • Abstraction always involves assumptions
  12. 12. Ready to define Machine Learning: • Machine Learning is the science of automating the process of abstraction from raw data and assumptions. Diagram: (Raw Data + Assumptions) → Machine Learning → Abstraction
  14. 14. Data: (painted image) + (dataset of normal images). Assumption: the non-painted parts of the painted image behave like the images in the dataset. Abstraction: corrected image.
  15. 15. Several forms of abstraction: cluster data, classify data, predict from data, summarise data, decide based on data, etc.
  16. 16. Clustering. [Figure: clustering results shown against ground truth] http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/
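As an illustration of the clustering slide, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The talk does not name a specific clustering method, so the choice of k-means, the toy points, and the seeding rule (the first k points become the initial centroids) are all assumptions made for this example.

```python
import math

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on a list of (x, y) tuples; returns (centroids, labels)."""
    # Assumption for this sketch: seed the centroids with the first k points.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels

# Toy data (hypothetical): two loose groups, around (1, 1) and (5, 5).
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (0.8, 1.1), (5.1, 4.9), (4.9, 5.2)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

No labels are supplied to the algorithm: the grouping is the abstraction, and the assumption behind it is that nearby points belong together.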
  17. 17. Dimensionality Reduction and Visualization. http://isomap.stanford.edu/datasets.html
  18. 18. Regression
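The regression slide shows only a picture, so as a rough illustration here is the simplest possible case: fitting a straight line by ordinary least squares. The linear model and the toy numbers are assumptions for this sketch; a later slide in the talk stresses that practical Machine Learning models are usually more complex than this.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data (hypothetical): points lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)

# Predict at an input that was never observed.
prediction = slope * 4.0 + intercept
print(slope, intercept, prediction)
```

The prediction at x = 4 is a generalisation beyond the observed data, and it is only as good as the assumption that the relationship is a line.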
  19. 19. Classification: {spam; not spam}, {0,1,2,3,4,5,6,7,8,9}
  20. 20. Structured Prediction: Image Understanding, Protein Structure Prediction, Machine Translation. Image credit: S. Gould
  21. 21. Structured Prediction: {Chess, NY, Kasparov, WTC}, {Kangaroo, Sun, Sea, Australia}
  26. 26. What Machine Learning IS NOT. Find 01001000: Machine Learning is not exact pattern matching. This is “just” classical computer science: classical “database query”, deduction. Machine Learning involves induction.
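The contrast between exact matching (deduction) and induction can be sketched in a few lines of Python. The bit strings, the labels "low" and "high", and the use of a nearest-neighbour rule with Hamming distance are all hypothetical choices for illustration; the talk does not prescribe a particular method.

```python
def exact_find(haystack, needle):
    """Deduction: the pattern is either present at some index or it is not."""
    return haystack.find(needle)

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_label(query, labelled):
    """Induction: guess the label of an unseen pattern from its closest example."""
    _, label = min(labelled, key=lambda item: hamming(query, item[0]))
    return label

# Exact pattern matching: "Find 01001000".
position = exact_find("1101001000", "01001000")

# Hypothetical labelled examples.
labelled = [
    ("00000000", "low"),
    ("00000011", "low"),
    ("11111111", "high"),
    ("11111100", "high"),
]

# This query appears nowhere in the examples, so an exact lookup has no answer.
query = "11101111"
seen_before = any(pattern == query for pattern, _ in labelled)
guess = nearest_label(query, labelled)
print(position, seen_before, guess)
```

The exact search can only report what is literally in the data. The nearest-neighbour guess goes beyond the data, and it rests on an assumption: patterns that differ in few bits tend to share a label.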
  28. 28. But Machine Learning IS NOT classical statistics either: - Complex rather than simple models (forget Gaussianity, forget linearity) - Numerical rather than analytical solutions (forget pencil-and-paper: need hardcore numerical optimization) - VERY high rather than low dimensional (p >> n rather than n >> p, where p is the number of features and n the number of samples)
  29. 29. Some popular technologies driven by Machine Learning: Recommender Systems
  30. 30. Some popular technologies driven by Machine Learning: Social media
  31. 31. Big Data and Machine Learning: Parallelism is crucial - Linear algebraic approaches favoured (matrix multiplication-based) - Much of Feature Extraction can be parallelised - Model Training is another story: usually needs syncing
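The difference between feature extraction and model training on this slide can be sketched as follows. Everything here is an assumption for illustration: the two toy features, the example messages, and the choice of a perceptron as the model. Threads are used only to keep the sketch short; CPU-bound extraction in CPython would more likely use processes or a cluster.

```python
from concurrent.futures import ThreadPoolExecutor

def extract_features(text):
    """Hypothetical per-record features: word count and exclamation-mark count."""
    return (len(text.split()), text.count("!"))

def train_perceptron(features, labels, epochs=10):
    """Sequential training: every update reads the weights left by the previous one."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = y - (1 if score > 0 else 0)
            weights = [w + error * xi for w, xi in zip(weights, x)]
            bias += error
    return weights, bias

def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# Hypothetical messages; label 1 means spam, 0 means not spam.
texts = [
    "Win money now!!!",
    "Meeting moved to Friday",
    "Free prize!!! Click!",
    "See you at lunch",
]
labels = [1, 0, 1, 0]

# Feature extraction: each record is independent, so a plain parallel map works.
with ThreadPoolExecutor(max_workers=4) as pool:
    features = list(pool.map(extract_features, texts))

# Model training: shared state (the weights) is updated step by step.
weights, bias = train_perceptron(features, labels)
predictions = [predict(weights, bias, x) for x in features]
print(features, predictions)
```

The map over records needs no coordination between workers, whereas the training loop carries state from one update to the next. Running that loop on several machines would require the workers to synchronise their weights, which is the "syncing" the slide refers to.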
  32. 32. Machine Learning and Data Mining: Data Mining is a buzzword, and in that sense it includes Machine Learning. In a stricter sense, Data Mining is often associated with data analysis that does not necessarily involve predictive analytics (which is the hallmark of Machine Learning).
  34. 34. When is Machine Learning helpful? DATA → ACTIONABLE KNOWLEDGE. When you don’t really know how to find an explicit (bit-level) description for your abstraction or “actionable knowledge”. And this is common!!
  35. 35. http://tiberiocaetano.com http://www.nicta.com.au/research/machine_learning
