Introduction to Machine Learning

489 views
414 views

Published on

0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
489
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
23
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Introduction to Machine Learning

  1. 1. Introduction to Machine Learning CAP 4630 Xingquan (Hill) Zhu
  2. 2. Outline <ul><li>What is Machine Learning? </li></ul><ul><ul><li>What is Pattern Recognition? </li></ul></ul><ul><ul><li>Typical Pattern Recognition Systems </li></ul></ul><ul><ul><li>Resource & References </li></ul></ul><ul><li>Decision Trees </li></ul><ul><li>Neural Networks </li></ul>
  3. 3. Face Detection Demo <ul><li>http://demo.pittpatt.com/ </li></ul><ul><ul><li>http://graphics8.nytimes.com/images/2008/03/17/us/17bush-600.jpg </li></ul></ul><ul><ul><li>http://img.timeinc.net/time/daily/special/photo/siralec/faces.jpg </li></ul></ul><ul><ul><li>http://demo.pittpatt.com/detection_demo/ </li></ul></ul>
  4. 4. More Complicated Examples <ul><li>Given The following Examples </li></ul><ul><li>Which of the following person belong to this small group? </li></ul>Machine Learning Yes or No! How much confidence? Fst: Yes, 90% Snd: No, 70% Thd: No, 90% Fth: No, 80%
  5. 5. Neural Networks Applications
  6. 6. ALVINN drives 70mph on highways CMU
  7. 7. More Complicated Example Regression <ul><li>Example: Price of a used car </li></ul><ul><li>x : car attributes </li></ul><ul><li>y : price </li></ul><ul><li>y = g ( x | θ  ) </li></ul><ul><li>g ( ) model, </li></ul><ul><li> θ parameters </li></ul>y = wx + w 0 http://www.theparticle.com/applets/ml/index.html
  8. 8. Outline <ul><li>What is Machine Learning? </li></ul><ul><ul><li>What is Pattern Recognition? </li></ul></ul><ul><ul><li>Typical Pattern Recognition Systems </li></ul></ul><ul><ul><li>Resource & References </li></ul></ul>
  9. 9. What is Machine Learning? <ul><li>Humans have developed highly sophisticated skills for sensing their environment and taking actions according to what they observe, e.g., </li></ul><ul><ul><li>Recognizing a face </li></ul></ul><ul><ul><li>Understanding spoken words </li></ul></ul><ul><ul><li>Reading handwriting </li></ul></ul><ul><ul><li>Distinguishing fresh food from its smell </li></ul></ul><ul><li>We would like to give similar capabilities to machines </li></ul>
  10. 10. What is Machine Learning <ul><li>Programming computers to use example data or past experience </li></ul><ul><ul><li>Needed in cases where we cannot directly write a computer program but have example data </li></ul></ul><ul><li>Learning is used when: </li></ul><ul><ul><li>Human expertise does not exist (navigating on Mars), </li></ul></ul><ul><ul><li>Humans are unable to explain their expertise (speech recognition) </li></ul></ul><ul><ul><li>Solution changes in time (routing on a computer network) </li></ul></ul><ul><ul><li>Solution needs to be adapted to particular cases (user biometrics) </li></ul></ul><ul><li>Are all problems learnable? </li></ul>
  11. 11. “ Learning” … <ul><li>Learning general models from the data of particular examples </li></ul><ul><ul><li>Data is usually cheap and abundant (data warehouses, data marts); knowledge is expensive and scarce. </li></ul></ul><ul><ul><li>Data scarcity, learning is possible but knowledge is less reliable </li></ul></ul><ul><li>Example in retail: Customer transactions to consumer behavior: </li></ul><ul><ul><li>People who bought “Da Vinci Code” also bought “The Five People You Meet in Heaven” (www.amazon.com) </li></ul></ul><ul><li>Build a model that is a good and useful approximation to the data. </li></ul>
  12. 12. Machine Learning: Creating a Classifier Adaptively <ul><li>Supervised learning </li></ul><ul><ul><li>Decision Trees </li></ul></ul><ul><ul><li>Feedforward neural network and backpropagation </li></ul></ul><ul><li>Unsupervised learning </li></ul><ul><ul><li>Clustering </li></ul></ul><ul><ul><ul><li>Grouping similar instances </li></ul></ul></ul><ul><ul><li>Association analysis </li></ul></ul><ul><ul><ul><li>People who bought “Da Vinci Code” also bought “The Five People You Meet in Heaven” (www.amazon.com) </li></ul></ul></ul><ul><li>Reinforcement learning </li></ul>
  13. 13. Machine Learning Output <ul><li>The output of Machine learning </li></ul><ul><ul><li>Patterns </li></ul></ul><ul><ul><ul><li>Decision trees, data summarization, data generative models </li></ul></ul></ul><ul><ul><li>Discriminative machines </li></ul></ul><ul><ul><ul><li>Neural networks (explicit rules are not available) </li></ul></ul></ul><ul><li>The requirements of the output crucially determine the underlying learning models </li></ul><ul><ul><li>Class category </li></ul></ul><ul><ul><li>Confidence (probability) </li></ul></ul><ul><ul><li>The comprehensibility of the decision model </li></ul></ul><ul><li>The process of Learning is the process of pattern discovery </li></ul>
  14. 14. Outline <ul><li>What is Machine Learning? </li></ul><ul><ul><li>What is Pattern Recognition? </li></ul></ul><ul><ul><li>Typical Pattern Recognition Systems </li></ul></ul><ul><ul><li>Resource & References </li></ul></ul>
  15. 15. What is Pattern Recognition <ul><li>A pattern is an entity, vaguely defined, that could be given a name, e.g., </li></ul><ul><ul><li>fingerprint image </li></ul></ul><ul><ul><li>handwritten word </li></ul></ul><ul><ul><li>human face </li></ul></ul><ul><ul><li>speech signal </li></ul></ul><ul><ul><li>DNA sequence </li></ul></ul><ul><li>Pattern recognition is the study of how machines can </li></ul><ul><ul><li>observe the environment </li></ul></ul><ul><ul><li>learn to distinguish patterns of interest </li></ul></ul><ul><ul><li>make sound and reasonable decisions about the categories of the patterns </li></ul></ul>
  16. 16. A Simple Example
  17. 17. Typical Decision Process <ul><li>Fish Face Recognition? </li></ul><ul><li>Salmon tastes better? </li></ul><ul><li>What kind of information can distinguish one species from the others? </li></ul><ul><ul><li>length, width, weight, number and shape of fins, tail shape, etc. </li></ul></ul><ul><li>What can cause problems during sensing? </li></ul><ul><ul><li>lighting conditions, position of fish on the conveyor belt, camera noise, etc. </li></ul></ul><ul><li>What are the steps in the process? </li></ul><ul><ul><li>capture image -> isolate fish -> take measurements -> make decision </li></ul></ul>
  18. 18. Example: Feature Selection <ul><li>Assume a fisherman (domain knowledge) told us that a sea bass is generally longer than a salmon. </li></ul><ul><li>We can use length as a feature and decide between sea bass and salmon according to a threshold on length. </li></ul><ul><li>How can we choose this threshold? </li></ul>
  19. 19. Example: Feature Selection
  20. 20. Example: Feature Selection <ul><li>Even though sea bass is longer than salmon on the average, there are many examples of fish where this observation does not hold. </li></ul><ul><li>Try another feature: average lightness of the fish scales. </li></ul>
  21. 21. Example: Feature Selection
  22. 22. Example: Multiple Features <ul><li>Assume we also observed that sea bass are typically wider than salmon. </li></ul><ul><li>We can use two features in our decision: </li></ul><ul><ul><li>lightness: x1 </li></ul></ul><ul><ul><li>width: x2 </li></ul></ul><ul><li>Each fish image is now represented as a point ( feature vector ) </li></ul><ul><li>in a two-dimensional feature space </li></ul>
  23. 23. Example: Multiple Features
  24. 24. Example: Multiple Features Feature vector - A vector of observations (measurements). - is a point in feature space . Hidden state - Cannot be directly measured. - Patterns with equal hidden state belong to the same class. Task - To design a classifer (decision rule) which decides about a hidden state based on an onbservation. Pattern
  25. 25. Example: Multiple Features <ul><li>Does adding more features always improve the results? </li></ul><ul><ul><li>Avoid unreliable features. </li></ul></ul><ul><ul><li>Be careful about correlations with existing features. </li></ul></ul><ul><ul><li>Be careful about measurement costs. </li></ul></ul><ul><ul><li>Be careful about noise in the measurements. </li></ul></ul><ul><li>Is there some curse for working in very high dimensions? </li></ul>
  26. 26. Example: Decision Boundaries
  27. 27. Overfitting and underfitting underfitting overfitting good fit
  28. 28. Example: Decision Boundaries
  29. 29. Example: Cost of Error <ul><li>We should also consider costs of different errors we make in our decisions. For example, if the fish packing company knows that: </li></ul><ul><ul><li>Customers who buy salmon will object vigorously if they see sea bass in their cans. </li></ul></ul><ul><ul><li>Customers who buy sea bass will not be unhappy if they occasionally see some expensive salmon in their cans. </li></ul></ul><ul><li>How does this knowledge affect our decision? </li></ul>
  30. 30. The actual forms of the decision model (Patterns) <ul><li>Decision Rules </li></ul><ul><ul><li>IF lightness>8 AND length>14 THEN Salmon </li></ul></ul><ul><li>Decision Trees </li></ul><ul><ul><li>A tree structure to make decision </li></ul></ul><ul><li>Probability Models </li></ul><ul><ul><li>Gaussian models </li></ul></ul><ul><ul><li>Bayesian decision models </li></ul></ul><ul><li>Weight functions </li></ul><ul><ul><li>Neural Networks </li></ul></ul><ul><ul><li>Linear Discriminate Analysis </li></ul></ul>
  31. 31. Outline <ul><li>What is Machine Learning? </li></ul><ul><ul><li>What is Pattern Recognition? </li></ul></ul><ul><ul><li>Typical Pattern Recognition Systems </li></ul></ul><ul><ul><li>Resource & References </li></ul></ul><ul><li>Interesting Demo </li></ul><ul><li>Story picturing: http://alipr.com/spe/ </li></ul>
  32. 32. Resources: Datasets <ul><li>UCI Repository: http://www.ics.uci.edu/~mlearn/MLRepository.html </li></ul><ul><li>UCI KDD Archive: http ://kdd.ics.uci.edu/summary.data.application.html </li></ul><ul><li>Statlib: http://lib.stat.cmu.edu/ </li></ul><ul><li>Delve: http://www.cs.utoronto.ca/~ delve/ </li></ul>
  33. 33. Resources: Journals <ul><li>Journal of Machine Learning Researc </li></ul><ul><li>Machine Learning </li></ul><ul><li>Neural Computation </li></ul><ul><li>Neural Networks </li></ul><ul><li>IEEE Transactions on Neural Networks </li></ul><ul><li>IEEE Transactions on Pattern Analysis and Machine Intelligence </li></ul><ul><li>Annals of Statistics </li></ul><ul><li>Journal of the American Statistical Association </li></ul><ul><li>IEEE Trans. On Knowledge and Data Engineering </li></ul><ul><li>Data Mining and Knowledge Discovery </li></ul><ul><li>... </li></ul>
  34. 34. Resources: Conferences <ul><li>International Joint Conference on Artificial Intelligence (IJCAI) </li></ul><ul><li>I nternational Conference on Machine Learning (ICML) </li></ul><ul><li>Neural Information Processing Systems (NIPS) </li></ul><ul><li>American Association for Artificial Intelligence (AAAI) </li></ul><ul><li>Uncertainty in Artificial Intelligence (UAI) </li></ul><ul><li>International Conference on Neural Networks (Europe) </li></ul><ul><li>ACM Knowledge Discovery and Data Mining (KDD) </li></ul><ul><li>IEEE International Conference on Data Mining (ICDM) </li></ul><ul><li>... </li></ul>
  35. 35. Outline <ul><li>What is Machine Learning? </li></ul><ul><ul><li>What is Pattern Recognition? </li></ul></ul><ul><ul><li>Typical Pattern Recognition Systems </li></ul></ul><ul><ul><li>Resource & References </li></ul></ul><ul><li>Decision Trees </li></ul><ul><li>Neural Networks </li></ul>

×