This document contains summaries of several topics:
1) It discusses deep learning and how stacking many simple functions can lead to complex models for tasks like image classification.
2) It notes some growing trends in deep learning research like an increasing number of papers being submitted to top conferences.
3) It lists some major companies investing heavily in deep learning research like Google, Facebook, Microsoft and Amazon.
4) It provides an overview of the author's research which aims to synthesize optical flow from pre-trained disparity estimation networks since optical flow data can be difficult to obtain.
7. What is machine learning?
• Data -> Learn -> Knowledge (Make predictions on unseen data)
• Examples
• Support Vector Machines
• Gaussian Process Regression
10. Unsupervised learning is cool but…
• In many cases some form of supervision is necessary
• “Fully unsupervised” learning is impossible -> Some assumptions
about the input data must be given!
• No free lunch theorem
• Let’s only focus on supervised learning for now!
11. Supervised learning
• Goal: Find a function “f” that maps input data to the output domain
• Input & output domain could be anything
• Examples
• Input: Images -> Output: Label
• Input: Korean -> Output: English
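As a minimal sketch of "finding f", the snippet below fits the simplest possible model, a line y = a*x + b, to a handful of input/output pairs and then predicts on an unseen input. The training values and the helper name `fit_line` are made up for illustration; real tasks such as image labeling or translation need far richer function families.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b to paired training data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


# Hypothetical training data generated from y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.0, 3.0, 5.0, 7.0]

a, b = fit_line(train_x, train_y)


def f(x):
    """The learned function: maps an input to a predicted output."""
    return a * x + b


# Prediction on an input that was not in the training set.
prediction = f(10.0)
```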
12. What is deep learning?
• Consider a supervised learning problem where we want to predict
whether an image is a cat or not
• Again, we want to find “f”
• The relationship between an image and its label will be very complex
• Probably not “y = ax + b”
• We can build “complex” functions using composition!
• Stack “a lot” of functions -> deep learning
13. What is deep learning?
y = f(a * f(c * f(e * f(…(x)…) + g) + d) + b)
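The nested expression can be read as a stack of simple "layers", each computing f(weight * input + bias). The sketch below builds such a stack by composition. The choice of tanh for f and the numeric weights are assumptions for illustration only; the slides do not specify them.

```python
import math


def f(z):
    """Simple nonlinearity (tanh is an assumed example, not from the slides)."""
    return math.tanh(z)


def make_layer(weight, bias):
    """Return one simple function: x -> f(weight * x + bias)."""
    return lambda x: f(weight * x + bias)


def compose(layers):
    """Stack layers; the first layer in the list is applied first (innermost)."""
    def model(x):
        for layer in layers:
            x = layer(x)
        return x
    return model


# Hypothetical parameters, named after the letters in the formula.
e, g = 0.5, 0.1   # innermost layer
c, d = -1.2, 0.3  # middle layer
a, b = 2.0, -0.4  # outermost layer

deep_model = compose([make_layer(e, g), make_layer(c, d), make_layer(a, b)])
y = deep_model(0.7)
```

Adding more entries to the list makes the model "deeper" without changing any of the individual building blocks.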
19. Number of submissions
• CVPR – top tier conference in computer vision
• Acceptance rate: approx. 23%
20. Number of submissions
• NIPS (now NeurIPS) – top tier machine learning conference
• Acceptance rate: approx. 20%
21. A typical review process
• Most machine learning conferences use a peer-review system
• Papers are submitted to different “areas”
• Example: a paper on classifying cats will go under the category of
“image classification”
• Each “area” has an “area chair” – a person who is in charge of all
submissions to that “area”
• “Area chairs” distribute the submitted papers to reviewers, who will
eventually decide whether to accept or reject
22. More papers are good but…
• It means we need more reviewers
• However, the total number of reviewers is limited
• As a result, many low-quality reviewers end up reviewing papers
34. Disparity
• Finding disparity leads to depth estimation in stereo vision
• In order to find disparity, we need rectified stereo pairs
• Matching points should be on the ‘same’ horizontal line
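For a rectified pair, the usual pinhole-camera relation between disparity and depth is depth = focal length × baseline / disparity. The slides do not state this formula or any camera parameters, so the sketch below uses that standard relation with hypothetical values.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters for one pixel of a rectified stereo pair.

    disparity_px: horizontal offset between matching points, in pixels
    focal_length_px: camera focal length, in pixels
    baseline_m: distance between the two camera centers, in meters
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 700 px focal length, 0.5 m baseline.
near = depth_from_disparity(50.0, 700.0, 0.5)  # large disparity -> close
far = depth_from_disparity(5.0, 700.0, 0.5)    # small disparity -> far away
```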
38. Disparity -> Optical flow
• Optical flow training data is hard to find
• In many cases synthetic data is used
• Could we use pre-trained models for disparity estimation to find
optical flow?
39. Disparity -> Optical flow
• Input: Image pair
• Output: Optical flow
• Novelty: Obtain optical flow estimation using networks trained on
stereo disparity estimation
• Method overview
• Pre-train a neural network (DispNet-Horizontal) on stereo disparity data
• Using basic rotation operations, obtain another neural network
(DispNet-Vertical) on rotated stereo disparity data
• Train a domain transfer network to map optical flow target images to stereo
images (automatic rectification)
• Fine-tune the domain transfer network using optical flow ground truth
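The role of the rotation can be illustrated with a toy example: rotating both images by 90° turns a vertical matching problem into a horizontal one, so a horizontal-only estimator can recover vertical motion. The sketch below is not the author's training procedure. A brute-force per-pixel matcher (`horizontal_shift`, a hypothetical stand-in for DispNet-Horizontal) is wrapped with rotations, and all function names are illustrative.

```python
def rot90_ccw(img):
    """Rotate a list-of-rows image 90 degrees counterclockwise."""
    height, width = len(img), len(img[0])
    return [[img[j][width - 1 - i] for j in range(height)] for i in range(width)]


def rot90_cw(img):
    """Rotate a list-of-rows image 90 degrees clockwise (inverse of rot90_ccw)."""
    height, width = len(img), len(img[0])
    return [[img[height - 1 - j][i] for j in range(height)] for i in range(width)]


def horizontal_shift(img_a, img_b, max_shift=2):
    """Toy horizontal matcher: for each pixel of img_a, the signed column
    offset d that minimizes |img_a[r][c] - img_b[r][c + d]|."""
    height, width = len(img_a), len(img_a[0])
    # Try small offsets first so that ties resolve to the smallest |d|.
    offsets = sorted(range(-max_shift, max_shift + 1), key=abs)
    result = []
    for r in range(height):
        row = []
        for c in range(width):
            best_d, best_cost = 0, float("inf")
            for d in offsets:
                if 0 <= c + d < width:
                    cost = abs(img_a[r][c] - img_b[r][c + d])
                    if cost < best_cost:
                        best_d, best_cost = d, cost
            row.append(best_d)
        result.append(row)
    return result


def vertical_shift(img_a, img_b, max_shift=2):
    """Reuse the horizontal matcher on rotated inputs, then rotate back."""
    rotated = horizontal_shift(rot90_ccw(img_a), rot90_ccw(img_b), max_shift)
    return rot90_cw(rotated)


def toy_flow(img_a, img_b, max_shift=2):
    """Per-pixel (horizontal, vertical) displacement maps."""
    return (horizontal_shift(img_a, img_b, max_shift),
            vertical_shift(img_a, img_b, max_shift))


# Hypothetical 4x4 frame with unique intensities, and a copy moved down one row.
frame_a = [[10 * r + c for c in range(4)] for r in range(4)]
frame_b = [[-100] * 4] + [row[:] for row in frame_a[:3]]

_, v = toy_flow(frame_a, frame_b)
```

With this rotation convention, original rows map to columns in the same direction, so the recovered vertical displacement keeps its sign.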