Computer vision, machine, and deep learning

This talk is about computer vision, machine learning, and deep learning using Python.

Published in: Engineering

  1. 1. COMPUTER VISION, MACHINE, AND DEEP LEARNING WITH PYTHON Dr.Eng. Igi Ardiyanto
  2. 2. PROFILE Igi Ardiyanto • Fields of interest: Robotics, Computer Vision, Intelligent Transportation Systems, Embedded Systems, Parallel Computing, Deep Learning • More information: http://te.ugm.ac.id/~igi
  3. 3. What is Computer Vision? Computer Vision, Machine, and Deep Learning with Python
  4. 4. COMPUTER VISION Make computers understand images and video: What kind of scene? Where are the people? How far is the building? Where is Waldo? Like when a human “sees” something…
  5. 5. VISION IS REALLY HARD • Vision is an amazing feat of natural intelligence • The visual cortex occupies about 50% of the macaque brain • More of the human brain is devoted to vision than to anything else • “Wait… wait… Is this a toy or food, man?”
  6. 6. OPTICAL CHARACTER RECOGNITION (OCR) Technology to convert scanned docs to text • If you have a scanner, it probably came with OCR software • Digit recognition, AT&T labs http://www.research.att.com/~yann/ • License plate readers http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
  7. 7. FACE DETECTION • Many new digital cameras now detect faces • Canon, Sony, Fuji, …
  8. 8. SMILE DETECTION Sony Cyber-shot® T70 Digital Still Camera
  9. 9. What is Machine Learning? Computer Vision, Machine, and Deep Learning with Python
  10. 10. MACHINE LEARNING • Machine learning is programming computers to optimize a performance criterion using example data or past experience • There is no need to “learn” to calculate payroll • Learning is used when: human expertise does not exist (navigating on Mars); humans are unable to explain their expertise (speech recognition); the solution changes over time (routing on a computer network); the solution needs to be adapted to particular cases (user biometrics)
  11. 11. COMPUTER VISION MEETS MACHINE LEARNING [Diagram] Train: image features + training labels (Dog, Cat, Raccoon) → training → learned model. Deploy: image features → learned model → prediction (Dog).
  12. 12. IMAGE FEATURES ?? • Color • Histograms • Shape • … Slide credit: L. Lazebnik
  13. 13. VERY BRIEF TOUR OF SOME CLASSIFIERS • K-nearest neighbor • SVM • Boosted Decision Trees • Neural networks • Naïve Bayes • Bayesian network • Gaussian Logistic regression • Random Forests • RBMs • Etc.
  14. 14. FACIAL ATTRACTIVENESS PREDICTION Yoona: Score 3.6 Yuri: Score 3.4 Tiffany: Score 3.8
  15. 15. FACIAL ATTRACTIVENESS PREDICTION https://github.com/avisingh599/face-rating Yoona: Score 3.6 Yuri: Score 3.4 Tiffany: Score 3.8
  16. 16. What is Deep Learning? Computer Vision, Machine, and Deep Learning with Python
  17. 17. DEEP LEARNING 1) A host of statistical machine learning techniques 2) Enables the automatic learning of feature hierarchies 3) Generally based on artificial neural networks
  18. 18. BAIDU DEEP SPEECH 2: END-TO-END DEEP LEARNING FOR ENGLISH AND MANDARIN SPEECH RECOGNITION • English and Mandarin speech recognition • Transition from English to Mandarin made simpler by end-to-end DL • No feature engineering or Mandarin-specifics required • More accurate than humans • Error rate 3.7% vs. 4% for human tests http://arxiv.org/abs/1512.02595
  19. 19. ALPHA-GO: FIRST COMPUTER PROGRAM TO BEAT A HUMAN GO PROFESSIONAL • Training DNNs: 3 weeks, 340 million training steps on 50 GPUs • Play: asynchronous multi-threaded search; simulations on CPUs, policy and value DNNs in parallel on GPUs • Single machine: 40 search threads, 48 CPUs, and 8 GPUs • Distributed version: 40 search threads, 1202 CPUs, and 176 GPUs • Outcome: beat both European and World Go champions in best-of-5 matches
  20. 20. DEEP LEARNING EVERYWHERE • INTERNET & CLOUD: Image Classification, Speech Recognition, Language Translation, Language Processing, Sentiment Analysis, Recommendation • MEDIA & ENTERTAINMENT: Video Captioning, Video Search, Real Time Translation • AUTONOMOUS MACHINES: Pedestrian Detection, Lane Tracking, Recognize Traffic Sign • SECURITY & DEFENSE: Face Detection, Video Surveillance, Satellite Imagery • MEDICINE & BIOLOGY: Cancer Cell Detection, Diabetic Grading, Drug Discovery
  21. 21. So what’s the f*** there for Python? Computer Vision, Machine, and Deep Learning with Python
  22. 22. WHAT IS PYTHON? • General-purpose interpreted programming language • Widely used by scientists and programmers of all stripes • Supported by many 3rd-party libraries (currently 21,054 on the main Python package website) • Free!
  23. 23. WHY IS IT WELL-SUITED TO SCIENCE? • NumPy: numerical library for Python; written in C, wrapped by Python; fast • SciPy: built on top of NumPy (i.e. also fast!); common maths, science, and engineering routines • Matplotlib: hugely flexible plotting library; similar syntax to Matlab; produces publication-quality output
  24. 24. WHY IS PYTHON BETTER THAN WHAT I USE NOW? • It can do everything: fast mathematical operations; easy file manipulation; format conversion; plotting; scripting; command line • OK, not everything: it won't write your thesis for you
  25. 25. Python has a wide range of deep learning-related libraries available, from low level to high level [library logos]: • (efficient GPU-powered math) • (Theano wrapper, models in Python code, abstracts Theano away) • (wrapper for Theano, YAML, experiment-oriented) • (computer-vision-oriented DL framework, model zoo, prototxt model definitions), pythonification ongoing! • (Theano extension, models in Python code, Theano not hidden) • and of course:
  26. 26. HOW EASY TO PROGRAM??
  27. 27. HOW EASY TO PROGRAM??
  28. 28. DEMO
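
As a rough illustration of the train/deploy pipeline on slide 11, here is a minimal sketch. It is not the demo from the talk. It assumes color histograms (slide 12) as the image features and k-nearest neighbor (slide 13) as the classifier. The function names, the synthetic "images" (plain lists of RGB tuples), and the labels "reddish" and "bluish" are all hypothetical, and the code sticks to the standard library (Python 3.8+ for `math.dist`) so it runs without extra packages.

```python
import math
import random
from collections import Counter


def color_histogram(pixels, bins=4):
    """Image feature: normalized joint RGB histogram.

    pixels is a list of (r, g, b) tuples with values in 0..255.
    """
    step = 256 // bins
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
    total = len(pixels)
    return [h / total for h in hist]


def knn_predict(train_features, train_labels, query, k=3):
    """Classifier: majority vote among the k nearest training features."""
    dists = sorted(
        (math.dist(feature, query), label)
        for feature, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def synthetic_image(base_color, rng, n_pixels=200, noise=20):
    """Hypothetical stand-in for a real image: noisy pixels around one color."""
    return [
        tuple(min(255, max(0, c + rng.randint(-noise, noise))) for c in base_color)
        for _ in range(n_pixels)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Train: training images + training labels -> image features
train_images, train_labels = [], []
for label, color in [("reddish", (200, 40, 40)), ("bluish", (40, 40, 200))]:
    for _ in range(5):
        train_images.append(synthetic_image(color, rng))
        train_labels.append(label)
train_features = [color_histogram(img) for img in train_images]

# Deploy: new image -> image features -> prediction
test_image = synthetic_image((190, 50, 50), rng)
prediction = knn_predict(train_features, train_labels, color_histogram(test_image))
print(prediction)
```

With k-nearest neighbor the "learned model" is simply the stored training features and labels. In practice the inner loops would typically be replaced by NumPy array operations, and the synthetic pixels by images loaded from disk.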
