CSTalks - On machine learning - 2 Mar

  1. On Machine Learning <ul><li>at CSTalks </li></ul><ul><ul><ul><li>by Vlad Hosu </li></ul></ul></ul>
  2. Introduction
  3. Fundamental Questions <ul><li>What are the fundamental laws that govern all learning processes? </li></ul><ul><li>How can we build computer systems that automatically improve with experience? </li></ul>
  4. Learning: Method <ul><ul><li>a process of adaptation </li></ul></ul><ul><ul><li>by which a parametric model is automatically adjusted </li></ul></ul><ul><ul><li>so that some fitness criterion is more readily met </li></ul></ul>
  5. Before Learning I’m learning, hence I need to adapt!
  6. After Result: Liony adjusts his diet.
  7. Biological Learning <ul><li>Model: nervous system (neuron connectivity, chemical changes, etc.) </li></ul><ul><li>Fitness: improved behavior (skills, memory, knowledge) </li></ul>
  8. Machine Learning <ul><li>a mathematical model </li></ul><ul><li>with adjustable parameters </li></ul><ul><li>optimizing some fitness function </li></ul>
  9. Motivation
  10. Why? <ul><li>some things are hard to code </li></ul><ul><li>too much data </li></ul><ul><li>automatic learning works better </li></ul><ul><li>is easier to customize/personalize </li></ul>
  11. Learning: Purpose <ul><li>estimation </li></ul><ul><ul><li>function - stock market </li></ul></ul><ul><ul><li>class - recognition </li></ul></ul><ul><ul><li>structure - grouping </li></ul></ul>
  12. Requirements <ul><li>good learning ability </li></ul><ul><li>scalability to large problems </li></ul><ul><li>simple and easy algorithm implementation </li></ul>
  13. Things Ahead <ul><li>Problems </li></ul><ul><ul><li>Clustering </li></ul></ul><ul><ul><li>Classification </li></ul></ul><ul><ul><li>Regression </li></ul></ul><ul><li>Learning issues </li></ul><ul><ul><li>importance of domain knowledge </li></ul></ul><ul><ul><li>learning/generalization ability </li></ul></ul><ul><ul><li>model complexity issues </li></ul></ul><ul><li>Optimization </li></ul>
  14. Important Problems
  15. Clustering
  16. Classification x1 x2
  17. Classification <ul><li>Types </li></ul><ul><ul><li>discriminative </li></ul></ul><ul><ul><li>generative </li></ul></ul>x1 x2
  18. Classification <ul><li>Types </li></ul><ul><ul><li>discriminative </li></ul></ul><ul><ul><li>generative </li></ul></ul>x1 x2 1 0
  19. Classification
  20. Regression
  21. Making Connections <ul><li>discrete value regression => generative classification </li></ul><ul><li>regression on boundary space => discriminative classification </li></ul><ul><li>clustering + labels => classification </li></ul>
  22. Learning Issues
  23. Domain Knowledge <ul><li>exploitation of problem structure </li></ul><ul><ul><li>human abstractions are better </li></ul></ul><ul><ul><li>important for picking the right model </li></ul></ul>
  24. Grouping in Images <ul><li>groups together similar parts of an image </li></ul><ul><ul><li>select objects </li></ul></ul><ul><ul><li>find patterns </li></ul></ul><ul><li>features = pixel values (or a function of them) </li></ul>
  25. Segmentation
  26. Color Space RGB space RGB space
  27. Color Space (cont)
  28. Suitable Clustering
  29. Generalization Ability <ul><ul><li>training data generalizes to new data </li></ul></ul><ul><ul><li>important for classification accuracy </li></ul></ul>
  30. Support Vector Machines (SVM) <ul><li>linear classifier on distorted space </li></ul>
  31. Learning Ability: over-fitting
  32. Problems with Over-fitting
  33. SVM vs Decision Trees
  34. Complexity Issues <ul><li>models should be </li></ul><ul><ul><li>as simple as possible </li></ul></ul><ul><ul><li>but representative of the training data </li></ul></ul>
  35. Neural Networks <ul><li>model: weights </li></ul><ul><li>fitness: output error </li></ul><ul><li>general function </li></ul>∑
  36. Training a Network
  37. Non-trivial Functions
  38. Optimization
  39. Optimizing Fitness <ul><li>find extrema </li></ul><ul><li>strategies </li></ul><ul><ul><li>gradient descent </li></ul></ul><ul><ul><li>convex optimization </li></ul></ul>
  40. Optimization <ul><li>finding extrema </li></ul><ul><li>local/global </li></ul>
  41. Gradient Descent
  42. Problem: Local Extrema
  43. Problem: Speed
  44. Linear Programming (axes x1, x2): lines define a convex function; planes in 3D, etc.
  45. Considerations <ul><li>scaling to large feature spaces </li></ul><ul><ul><li>feature selection </li></ul></ul><ul><ul><li>dimensionality reduction </li></ul></ul>
  46. Open Problems
  47. Open Problems <ul><li>unlabeled data for regression </li></ul><ul><li>exploiting sparsity in high dimensional spaces for non-parametric learning </li></ul><ul><li>transferring learnt information from one task to simplify learning another </li></ul>
  48. Open Problems (cont) <ul><li>algorithms for learning control strategies from delayed rewards and other inputs </li></ul><ul><li>best “active learning” strategies for different learning problems </li></ul><ul><li>degree to which one can preserve data privacy while obtaining the benefits of data mining </li></ul>
  49. The end Questions?
  50. Types of Regression <ul><li>parametric </li></ul><ul><li>non-parametric </li></ul>
  51. Linear vs Non-linear <ul><li>linear </li></ul><ul><ul><li>smooth </li></ul></ul><ul><ul><li>under-fitting </li></ul></ul><ul><ul><li>good enough for some processes (biz) </li></ul></ul><ul><li>non-linear </li></ul><ul><ul><li>complex </li></ul></ul><ul><ul><li>over-fitting </li></ul></ul><ul><ul><li>works on most data-sets </li></ul></ul>
  52. Naive Bayes good spam write people free π π No. Good No. Spam * *
  53. Graph Clustering
  54. Mean Shift
  55. Problems in CV <ul><li>What are the physical and geometric processes that govern (digital) imaging? </li></ul><ul><li>What are the “informative” areas of an image and how do we detect them? </li></ul><ul><li>What portions of an image pertain to one another and to relevant physical phenomena? </li></ul><ul><li>From one (or more) images, how can we determine the geometry of the scene? </li></ul>
  56. Linear Regression <ul><li>model: straight line </li></ul><ul><li>2 adjustable parameters </li></ul><ul><li>fitness function: root mean squared error </li></ul>
  57. Solution Stability y-shift slope
  58. Some Issues with Model Selection normal outliers wrong model
  59. Real Photo in Color Space EM KMeans
  60. Conjugate Gradient
  61. Newton’s Method
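
The slides describe machine learning as a mathematical model with adjustable parameters that optimizes a fitness function, and give linear regression (a straight line, 2 adjustable parameters, root mean squared error) and gradient descent as examples. A minimal sketch of how those pieces might fit together is shown here. The function names, the toy data, the learning rate, and the step count are all assumptions made for illustration and do not come from the talk. The sketch descends on the mean squared error rather than its square root, since both have the same minimiser and the former has a simpler gradient.

```python
import math


def rmse(a, b, xs, ys):
    """Root mean squared error of the line y = a*x + b on the data."""
    n = len(xs)
    return math.sqrt(sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / n)


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit slope a and y-shift b by plain gradient descent.

    The quantity being minimised is the mean squared error; its
    minimiser is also the minimiser of the RMSE.
    """
    a, b = 0.0, 0.0  # the 2 adjustable parameters, started at zero
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the mean squared error w.r.t. a and b.
        grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Hypothetical toy data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

a, b = fit_line(xs, ys)
print(f"slope={a:.4f} y-shift={b:.4f} rmse={rmse(a, b, xs, ys):.6f}")
```

Because this error surface is convex, the descent has no local extrema to get stuck in. The speed problem named in the slides still applies: a learning rate that is too small converges slowly, and one that is too large diverges.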
