
Towards Set Learning and Prediction - Laura Leal-Taixe - UPC Barcelona 2018


https://telecombcn-dl.github.io/2018-dlcv/

Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.

Published in: Data & Analytics

Towards Set Learning and Prediction - Laura Leal-Taixe - UPC Barcelona 2018

  1. 1. Towards Set Learning and Prediction using deep neural networks Prof. Laura Leal-Taixé Slides by S. Hamid Rezatofighi 1
  2. 2. Introduction Deep neural networks ✓ Benefits: Remarkable performance on many real-world problems ✗ Limitation: built to deal with structured input and output data, i.e. vectors, matrices or tensors • Ordered • Fixed in size Many real-world problems, however, are naturally described as sets, rather than vectors 2
  3. 3. Set vs. Vector As opposed to a vector, a set • is not fixed in size • is invariant to the ordering of entities within it For example, the sets Y1 = {1, 2, 3, 4} and Y2 = {3, 1, 2, 4} are identical, Y1 ≜ Y2, but the vectors y1 = (1, 2, 3, 4) and y2 = (3, 1, 2, 4) are not, y1 ≠ y2 3
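A minimal Python illustration of this distinction (my own example, not from the slides): a built-in set compares by membership regardless of order, while a tuple compares position by position.

```python
# Sets: size is not fixed and ordering does not matter.
Y1 = {1, 2, 3, 4}
Y2 = {3, 1, 2, 4}
assert Y1 == Y2          # the two sets are identical

# Vectors (tuples here): every position counts.
y1 = (1, 2, 3, 4)
y2 = (3, 1, 2, 4)
assert y1 != y2          # different vectors, even though the entries match as a set
```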
  4. 4. Conventional Approach Conventional machine learning approaches learn the optimal parameters w∗ of the distribution p(Y | x, w∗), which maps a structured input, e.g. a tensor, to a structured output, e.g. a fixed-length vector [Diagram: x → W∗ → Y = (y1, y2, y3, y4, y5)] 4
  5. 5. Conventional Approach However, the output of some problems, e.g. image tagging, is indeed a set of elements (e.g. tags) rather than a vector [Diagram: x → W∗ → Y = (y1, y2, y3, y4, y5)] 5
  6. 6. Conventional Approach However, the output of some problems, e.g. image tagging, is indeed a set of elements (e.g. tags) rather than a vector [Diagram: x → W∗ → Y = {…}] 6
  7. 7. Conventional Approach However, the output of some problems, e.g. image tagging, is indeed a set of elements (e.g. tags) rather than a vector [Diagram: x → W∗ → Y = {Dog, Person, Bicycle}] 7
  8. 8. Conventional Approach However, the output of some problems, e.g. image tagging, is indeed a set of elements (e.g. tags) rather than a vector [Diagram: x → W∗ → Y = {…}] 8
  9. 9. Existing (Approx.) Solutions Existing techniques that allow for sequential data, e.g. RNN or LSTM ✗ do not guarantee a valid solution 9 [Predicted sequence: Dog, Bicycle, Dog]
  10. 10. Existing (Approx.) Solutions Existing techniques that allow for sequential data, e.g. RNN or LSTM ✗ do not guarantee a valid solution ✗ heavily depend on the input and output order 10 [Predicted sequence: Person, Bicycle, Dog ?]
  11. 11. Existing (Approx.) Solutions Existing techniques that allow for sequential data, e.g. RNN or LSTM ✗ do not guarantee a valid solution ✗ heavily depend on the input and output order 11 [Predicted sequence: Dog, Person, Bicycle ?]
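To make the order dependence concrete, here is a small hypothetical sketch (not from the slides): a position-wise sequence loss marks a prediction as completely wrong even when it contains exactly the right tags, just in a different order.

```python
def positionwise_errors(pred, target):
    """Count position-by-position mismatches, as a sequence loss would."""
    return sum(p != t for p, t in zip(pred, target))

target = ["Person", "Bicycle", "Dog"]
pred = ["Dog", "Person", "Bicycle"]       # same tags, different order

print(positionwise_errors(pred, target))  # 3 -> every position penalized
print(set(pred) == set(target))           # True -> the predicted set is correct
```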
  12. 12. Application Examples ✓ Object Detection and Localization State-of-the-art detectors: Faster-RCNN, SSD & YOLO 12
  13. 13. Application Examples ✓ Object Detection and Localization State-of-the-art detectors: Faster-RCNN, SSD & YOLO 13
  14. 14. Application Examples ✓ Object Detection and Localization State-of-the-art detectors: Faster-RCNN, SSD & YOLO ✗ Do not perform well in cluttered scenes (partial occlusions) 14 Heuristics (NMS)
  15. 15. Application Examples ✓ Object Detection and Localization State-of-the-art detectors: Faster-RCNN, SSD & YOLO ✗ Do not perform well in cluttered scenes (partial occlusions) 15 Heuristics (NMS)
  16. 16. Application Examples ✓ Object Detection and Localization State-of-the-art detectors: Faster-RCNN, SSD & YOLO ✗ Do not perform well in cluttered scenes (strong occlusions) 16 Heuristics (NMS)
  17. 17. Application Examples ✓ Object Detection and Localization State-of-the-art detectors: Faster-RCNN, SSD & YOLO 17 Heuristics (NMS) These two boxes will be suppressed even if they belong to an object and their detection score ≈ 1.0
  18. 18. Application Examples ✓ Object Detection and Localization State-of-the-art detectors: Faster-RCNN, SSD & YOLO 18 Heuristics (NMS) These two boxes will be suppressed even if they belong to an object and their detection score ≈ 1.0
  19. 19. Application Examples ✓ Object Detection and Localization State-of-the-art detectors: Faster-RCNN, SSD & YOLO ✗ Do not perform well in cluttered scenes (partial occlusions) 19 Heuristics (NMS)
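A minimal sketch of the greedy NMS heuristic (a textbook version, not the detectors' actual code), showing how the second of two heavily overlapping, high-scoring boxes is discarded once their IoU exceeds the threshold, which is precisely what fails in crowded scenes:

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: keep the best box, drop its overlaps."""
    order = list(np.argsort(scores)[::-1])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(int(i))
        order = [j for j in order if iou(boxes[i], boxes[j]) < thresh]
    return keep

# Two heavily overlapping pedestrians, both detected with score ~1.0:
boxes = np.array([[10, 10, 60, 110], [25, 12, 75, 112]], dtype=float)
scores = np.array([0.99, 0.98])
print(nms(boxes, scores))  # [0] -> the second true object is suppressed
```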
  20. 20. Application Examples ✓ Object Detection and Localization Objective: Train a network to detect 5 bounding boxes, e.g. using regression 20
  21. 21. Application Examples ✓ Object Detection and Localization 1st Training instance 21
  22. 22. Application Examples ✓ Object Detection and Localization 1st Training instance 22
  23. 23. Application Examples ✓ Object Detection and Localization 1st Training instance Y = [x1 y1 w1 h1; x2 y2 w2 h2; x3 y3 w3 h3; x4 y4 w4 h4; x5 y5 w5 h5] 23
  24. 24. Application Examples ✓ Object Detection and Localization 2nd Training instance Y = [x1 y1 w1 h1; x2 y2 w2 h2; x3 y3 w3 h3; x4 y4 w4 h4; x5 y5 w5 h5] 24
  25. 25. Application Examples ✓ Object Detection and Localization 2nd Training instance Y = [x1 y1 w1 h1; x2 y2 w2 h2; x3 y3 w3 h3; x4 y4 w4 h4; x5 y5 w5 h5] 25
  26. 26. Application Examples ✓ Object Detection and Localization 3rd Training instance Y = [x1 y1 w1 h1; x2 y2 w2 h2; x3 y3 w3 h3; x4 y4 w4 h4; x5 y5 w5 h5] ? 26
  27. 27. Application Examples ✓ Object Detection and Localization 4th Training instance Y = [x1 y1 w1 h1; x2 y2 w2 h2; x3 y3 w3 h3; x4 y4 w4 h4; x5 y5 w5 h5] + [x6 y6 w6 h6]? 27
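A tiny sketch of why the fixed-size regression target breaks down (shapes and padding are my assumptions, not the slide's code): a head that always emits 5 boxes of 4 numbers needs arbitrary padding when there are fewer objects, and simply has no room for a 6th one.

```python
import numpy as np

MAX_BOXES = 5  # output size fixed at design time

def to_fixed_target(boxes):
    """Pack a variable number of (x, y, w, h) boxes into a fixed 5x4 target."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    if len(boxes) > MAX_BOXES:
        raise ValueError(f"{len(boxes)} objects, but the head only outputs {MAX_BOXES}")
    target = np.zeros((MAX_BOXES, 4))   # padding value is arbitrary...
    target[:len(boxes)] = boxes         # ...and so is the row order
    return target

print(to_fixed_target([[10, 10, 50, 100], [70, 20, 40, 90]]))  # 2 objects: padded rows
try:
    to_fixed_target(np.random.rand(6, 4))                      # 6 objects: no room
except ValueError as err:
    print(err)
```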
  28. 28. Application Examples ✓ Object Detection and Localization ✓ Instance Level Segmentation 28
  29. 29. Application Examples ✓ Object Detection and Localization ✓ Instance Level Segmentation ✓ Multi-Object Tracking ✓ Data Clustering ✓ Feature/Graph Matching or any Graph problem 29
  30. 30. Set Learning It is intuitive that conventional machine learning approaches cannot deal with set inputs and/or outputs, whose dimensions are unknown and not fixed and whose elements can be permuted. Set learning is an emerging field of study • O. Vinyals et al. Order Matters, ICLR 2015: using an RNN to input and output an ordered set • Rezatofighi et al. DeepSetNet, ICCV 2017 (Spotlight): using a CNN to output a set • A. Smola et al. Deep Sets, NIPS 2017 (Oral): using a CNN to input a set • Rezatofighi et al. Joint DeepSetNet, AAAI 2018: using a CNN to output a set • Rezatofighi et al. Deep Perm-Set Net, submitted to NIPS 2018: using a CNN to output a set with unknown permutation 30
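For the set-input direction (Deep Sets), the core recipe is: embed every element with the same function, pool with a symmetric operation such as a sum, then process the pooled code, so the output cannot depend on element order. A minimal NumPy sketch with made-up weights (not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(3, 8))     # per-element embedding (phi)
W_rho = rng.normal(size=(8, 2))     # head applied after pooling (rho)

def deep_set_encoder(X):
    """X: (n_elements, 3) array treated as a set -> order-invariant output."""
    H = np.maximum(X @ W_phi, 0.0)  # phi applied to every element independently
    pooled = H.sum(axis=0)          # symmetric aggregation over the set
    return pooled @ W_rho           # rho on the pooled representation

X = rng.normal(size=(5, 3))
perm = rng.permutation(5)
print(np.allclose(deep_set_encoder(X), deep_set_encoder(X[perm])))  # True
```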
  31. 31. Technical Difficulties & Developments 1. Structured Input, Set output Potential Applications: Object Detection, Instance-level Segmentation, etc. 31
  32. 32. Technical Difficulties & Developments 1. Structured Input, Set output Potential Applications: Object Detection, Instance-level Segmentation, etc. 2. Set Input, Set output Potential Applications: Graph problems, e.g. matching 32
  33. 33. Technical Difficulties & Developments 1. Structured Input, Set output Potential Applications: Object Detection, Instance-level Segmentation, etc. 2. Set Input, Set output Potential Applications: Graph problems, e.g. matching 3. Recurrent Set Network Potential Application: Real-time multiple object tracking 33
  34. 34. Technical Difficulties & Developments 1. Structured Input, Set output ✓ Learning set cardinality as a discrete distribution, using the backpropagation algorithm [1] ✓ Joint learning of set cardinality and state distributions [2] ✓ Learning to make the outputs permutation invariant [3] 34 1. Rezatofighi et al. DeepSetNet: Predicting Sets With Deep Neural Networks, ICCV 2017 2. Rezatofighi et al. Joint Learning of Set Cardinality and State Distribution, AAAI 2018 3. Rezatofighi et al. Deep Perm-Set Net, submitted to NIPS 2018
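For point 1, the simplest reading is that the cardinality head acts as a classifier over possible set sizes and is trained with cross-entropy, so ordinary backpropagation applies (the cited papers discuss richer count distributions; the sizes below are assumptions for illustration):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

MAX_CARDINALITY = 10                           # assumed upper bound on the set size
logits = np.random.randn(MAX_CARDINALITY + 1)  # would come from the CNN's cardinality head

p_m = softmax(logits)        # discrete distribution over m = 0, ..., 10
true_m = 3                   # ground-truth number of elements for this image
loss = -np.log(p_m[true_m])  # cross-entropy term, differentiable w.r.t. the logits
m_hat = int(np.argmax(p_m))  # cardinality used at test time

print(round(float(loss), 3), m_hat)
```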
  35. 35. Deep Perm-Set Net Some Descriptions 35 a set with unknown cardinality m and permutation
  36. 36. Deep Perm-Set Net Some Descriptions 36 a set with known cardinality m but unknown permutation
  37. 37. Deep Perm-Set Net Some Descriptions 37 a set with known cardinality m and permutation π, where
  38. 38. Deep Perm-Set Net Some Descriptions 38 a set with known cardinality m and permutation π, where It is simply a tensor!
  39. 39. Deep Perm-Set Net Some Descriptions 39 It can be expressed by an ordered set with any arbitrary permutation, where
  40. 40. Deep Perm-Set Net The probability density function for a set 40 Output as a set
  41. 41. Deep Perm-Set Net The probability density function for a set 41 Input as a tensor, e.g. Image
  42. 42. Deep Perm-Set Net The probability density function for a set 42 Sets of parameters, e.g. CNN parameters
  43. 43. Deep Perm-Set Net The probability density function for a set 43 Cardinality Distribution
  44. 44. Deep Perm-Set Net The probability density function for a set 44 The normalizer to avoid the unit mismatch across the different dimensions
  45. 45. Deep Perm-Set Net The probability density function for a set 45 Joint distribution of m elements of the set
  46. 46. Deep Perm-Set Net The probability density function for a set 46 Marginalization over the permutations
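Putting the annotations on slides 40-46 together, the density they describe has roughly this shape (a reconstruction; the exact notation is my assumption, since the formula itself only appears as an image on the slides):

```latex
p(\mathcal{Y} \mid \mathbf{x}, \mathbf{w}) =
\underbrace{p(m \mid \mathbf{x}, \mathbf{w})}_{\text{cardinality distribution}}
\times
\underbrace{U^{m}}_{\text{unit normalizer}}
\times
\underbrace{\sum_{\pi \in \Pi_m}
p\!\left(y_{\pi_1}, \dots, y_{\pi_m}, \pi \mid \mathbf{x}, \mathbf{w}\right)}_{\text{joint over the } m \text{ elements, marginalized over permutations}}
```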
  47. 47. Deep Perm-Set Net 47 [Network outputs: Permutation, State (bounding boxes), Cardinality]
  48. 48. Deep Perm-Set Net 48 Training: you compute the loss on the state ordered by a given permutation → Hungarian matching algorithm
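The Hungarian step can be reproduced with scipy.optimize.linear_sum_assignment: build a pairwise cost between predicted and ground-truth boxes, solve for the minimum-cost permutation, then compute the state loss in that order. A minimal sketch (my own, using an L1 box cost as an assumed choice):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def matched_state_loss(pred_boxes, gt_boxes):
    """pred_boxes, gt_boxes: (m, 4) arrays of (x, y, w, h) for the same image."""
    # Pairwise L1 distance between every predicted and every ground-truth box.
    cost = np.abs(pred_boxes[:, None, :] - gt_boxes[None, :, :]).sum(axis=-1)
    rows, cols = linear_sum_assignment(cost)   # Hungarian: minimum-cost permutation
    return cost[rows, cols].mean(), cols       # loss plus the permutation that was used

pred = np.array([[12.0, 10, 50, 100], [70, 22, 40, 88]])
gt = np.array([[70.0, 20, 40, 90], [10, 10, 50, 100]])  # same objects, listed in another order
loss, perm = matched_state_loss(pred, gt)
print(loss, perm)  # small loss; perm tells which ground-truth box each prediction matched
```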
  49. 49. Deep Perm-Set Net 49 Testing: you only need to use the estimated cardinality and state.
  50. 50. Experiments Pedestrian Detection - MOTChallenge 50
  51. 51. Experiments Pedestrian Detection - MOTChallenge 51
  52. 52. Experiments Pedestrian Detection - MOTChallenge 52
  53. 53. Experiments Complex CAPTCHA test – De-Sum 53
  54. 54. Experiments Complex CAPTCHA test – De-Sum 54 A far more interesting outcome: the ability to mimic arithmetic without any rules being coded.
  55. 55. Experiments 55 Deep Perm-Set Net vs. Faster-RCNN
  56. 56. Questions? 56
