
Deep Learning on iOS #360iDev

5,454 views

Published at 360|iDev on Aug 15, 2017

- https://360idev.com/sessions/deep-learning-ios/
- https://360idev.com/speakers/shuichi-tsutsumi/
- https://github.com/shu223



  1. 1. 360|iDev 2017 Shuichi Tsutsumi @shu223 Deep Learning on iOS
  2. 2. Overview • How to implement “Deep Learning” on iOS Metal Performance Shaders (MPSCNN) Accelerate (BNNS) Core ML Vision Your App
  3. 3. Why so exciting?
  4. 4. AlphaGo Cancer Detection Self-driving Car
  5. 5. AutoDraw
  6. 6. Pose Estimation http://qiita.com/nnn112358/items/a4490d85dac5827db53b
  7. 7. Frontal View Synthesis
  8. 8. AutoHair
  9. 9. AutoDraw Pose Estimation Frontal View AutoHair
  10. 10. AutoDraw Pose Estimation Frontal View AutoHair This evolutionary tech works on iOS
  11. 11. Deep Learning ON iOS = works on iOS devices
  12. 12. (Demo)
  13. 13. Image → Result, 60 times / sec, × users
  14. 14. TOOTHPASTE
  15. 15.
  16. 16. Train → Trained Params → Inference (iOS ML frameworks)
  17. 17. Implementation
  18. 18. Core ML
  19. 19. The largest hall was 100% full for the Core ML session.
  20. 20.
  21. 21. Metal Performance Shaders (MPSCNN) Accelerate (BNNS) Core ML Vision Your App iOS 11 iOS 10
  22. 22. Metal Performance Shaders (MPSCNN) Accelerate (BNNS) Core ML Vision Your App iOS 11 iOS 10
  23. 23. Your App Metal Performance Shaders (MPSCNN) Accelerate (BNNS) iOS 10 • Optimized for GPU (by Metal) and CPU (by Accelerate) • Available for iOS 10, too • Basically any ML tool can be used to train the models Still works!
  24. 24. Metal Performance Shaders (MPSCNN) Accelerate (BNNS) Your App GPU CPU
  25. 25. MPSCNN 900 results Core ML 160,000 results
  26. 26. 3 steps to implement w/ MPSCNN
  27. 27. How to implement w/ MPSCNN Step 1: Create the model
  28. 28. Train → Trained Params → Inference - Which tools can be used for the training? - What kind of formats can be used to pass the pre-trained params?
  29. 29. Which ML tools can be used?: ANY / What model format can be used?: ANY (ML Tools → Train → Model (Trained Params) → dat)
  30. 30. Which ML tools can be used?: ANY / What model format can be used?: ANY • The ".dat" files are plain binary files - Not specific to iOS or MPS • They contain the trained params - Weights / Biases • Any other format can be used as long as it can be read by the iOS app - e.g. hdf5
  31. 31. Which ML tools can be used?: ANY / What model format can be used?: ANY • Any tool that can train a CNN and export the params
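
      The slides stress that any format works as long as the iOS app can read it. A minimal sketch of what that reading could look like, assuming a hypothetical bundled "weights_conv1.dat" file holding a flat dump of Float32 values (file names and shapes are examples, not from the talk):

          import Foundation

          // A sketch of loading raw Float32 trained params from a bundled ".dat" file.
          func loadParams(_ name: String) -> [Float]? {
              guard let url = Bundle.main.url(forResource: name, withExtension: "dat"),
                    let data = try? Data(contentsOf: url) else { return nil }
              // The file is just a flat dump of Float32 values (weights or biases).
              return data.withUnsafeBytes { Array($0.bindMemory(to: Float.self)) }
          }

          // e.g. weights and biases for a hypothetical first convolution layer
          let conv1Weights = loadParams("weights_conv1")
          let conv1Biases  = loadParams("bias_conv1")
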
  32. 32. How to implement w/ MPSCNN Step 2: Implement the network
  33. 33. MPSCNNConvolution MPSCNNFullyConnected MPSCNNPooling MPSCNNConvolution MPSCNNConvolution MPSCNNPooling
  34. 34. MPSCNNConvolution MPSCNNFullyConnected MPSCNNPooling MPSCNNConvolution MPSCNNConvolution MPSCNNPooling Classes corresponding to each CNN layer are provided • Almost the same names -> easy to find • Complicated math and GPU optimization are encapsulated
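
      As a rough illustration of these layer classes, here is a sketch of building a single convolution layer with the iOS 10-era MPSCNN initializers. The kernel size, channel counts, and zero-filled weight arrays are placeholder assumptions; in practice the weights and biases come from the trained params loaded in Step 1:

          import Metal
          import MetalPerformanceShaders

          // A sketch of one MPSCNN layer (iOS 10-era API). All sizes are placeholders.
          let device = MTLCreateSystemDefaultDevice()!

          let convDesc = MPSCNNConvolutionDescriptor(
              kernelWidth: 5,
              kernelHeight: 5,
              inputFeatureChannels: 3,
              outputFeatureChannels: 32,
              neuronFilter: MPSCNNNeuronReLU(device: device, a: 0))

          let conv1Weights = [Float](repeating: 0, count: 5 * 5 * 3 * 32)  // placeholder
          let conv1Biases  = [Float](repeating: 0, count: 32)              // placeholder

          let conv1 = MPSCNNConvolution(device: device,
                                        convolutionDescriptor: convDesc,
                                        kernelWeights: conv1Weights,
                                        biasTerms: conv1Biases,
                                        flags: .none)
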
  35. 35. How to implement w/ MPSCNN Step 3: Implement the inference
  36. 36. Input image → MPSImage → CNN → Result (UInt, etc.) → Do something • CNN: implemented in Step 2 • Trained params (created in Step 1) are loaded
  37. 37. • Step 1: Create the model - Any ML tools can be used - Any file format can be used for the trained params • Step 2: Implement the network - Classes corresponding to each CNN layer are provided • Step 3: Implement the inference - Input to the CNN, and output from the CNN
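
      A sketch of what Step 3 can look like in code, reusing the `device` and `conv1` from the Step 2 sketch above; the 224x224 input size and the single encoded layer are placeholder assumptions:

          import Metal
          import MetalPerformanceShaders

          // A sketch of Step 3: encode the layers from Step 2 onto a command buffer.
          let commandQueue = device.makeCommandQueue()!
          let commandBuffer = commandQueue.makeCommandBuffer()!

          // Intermediate images the layers read from / write to.
          let inputDesc = MPSImageDescriptor(channelFormat: .float16,
                                             width: 224, height: 224, featureChannels: 3)
          let conv1Desc = MPSImageDescriptor(channelFormat: .float16,
                                             width: 224, height: 224, featureChannels: 32)
          let srcImage = MPSImage(device: device, imageDescriptor: inputDesc)
          let conv1Out = MPSImage(device: device, imageDescriptor: conv1Desc)

          // (Copy the input pixels into srcImage.texture here.)

          // Each MPSCNN layer encodes its GPU work onto the same command buffer.
          conv1.encode(commandBuffer: commandBuffer,
                       sourceImage: srcImage,
                       destinationImage: conv1Out)
          // ... encode the remaining pooling / convolution / fully connected layers ...

          commandBuffer.commit()
          commandBuffer.waitUntilCompleted()
          // Finally, read the result back from the last layer's texture (see slide 50).
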
  38. 38. Demo : Logo detection
  39. 39. Increased by 70 times
  40. 40. Trained Params dat f.write(session.run(w_conv1_p).tobytes())
  41. 41. (Demo)
  42. 42.
  43. 43. CNN for the logo detection
  44. 44. GoogLeNet (Inception v3)
  45. 45. GoogLeNet (Inception v3) Apple’s implementation w/ MPSCNN Inception3Net.swift
  46. 46. 2,000 lines! Only for the CNN
  47. 47. Core ML
  48. 48. Development Flow w/ MPSCNN: ML Tools: Train → Trained Params (some format) → Extract → dat → App: Parse dat → Implement Network (MPSCNNConvolution, MPSCNNFullyConnected, …) 2,000 lines!
  49. 49. Development Flow w/ Core ML: ML Tools: Train → Trained Params (some format) → 1) Convert w/ coremltools → 2) Drag & Drop into Xcode → Generate (no more extracting, parsing, or 2,000-line network implementation)
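
      For context on the "Generate" step, a sketch of calling the class Xcode generates from the dropped-in model. The class name Inceptionv3, its "image" input, and its output property names are hypothetical; the real names depend on the actual .mlmodel:

          import CoreML
          import CoreVideo

          // A sketch of calling a generated Core ML model class directly.
          func classify(_ pixelBuffer: CVPixelBuffer) -> String? {
              // pixelBuffer must already match the model's expected input size.
              guard let output = try? Inceptionv3().prediction(image: pixelBuffer) else {
                  return nil
              }
              return output.classLabel   // probabilities are in output.classLabelProbs
          }
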
  50. 50. Input image → MPSImage → CNN → Result (UInt, etc.) → Do something
          Need to know Metal to use MPSCNN:
          // Writing the input pixels into the source MPSImage's texture
          let size = MTLSize(width: inputWidth, height: inputHeight, depth: 1)
          let region = MTLRegion(origin: MTLOrigin(x: 0, y: 0, z: 0), size: size)
          network.srcImage.texture.replace(region: region,
                                           mipmapLevel: 0,
                                           slice: 0,
                                           withBytes: context.data!,
                                           bytesPerRow: inputWidth,
                                           bytesPerImage: 0)
          // Reading the results back out of the final layer's texture, slice by slice
          let origin = MTLOrigin(x: 0, y: 0, z: 0)
          let size = MTLSize(width: 1, height: 1, depth: 1)
          finalLayer.texture.getBytes(&(result_half_array[4 * i]),
                                      bytesPerRow: MemoryLayout<UInt16>.size * 4,
                                      bytesPerImage: MemoryLayout<UInt16>.size * 4,
                                      from: MTLRegion(origin: origin, size: size),
                                      mipmapLevel: 0,
                                      slice: i)
  51. 51. MPSCNN Accelerate (BNNS) Core ML Vision Your App
  52. 52. Input image → MPSImage → CNN → Results → Do something
          Don't need to touch Metal to use Vision:
          let ciImage = CIImage(cvPixelBuffer: imageBuffer)
          let handler = VNImageRequestHandler(ciImage: ciImage)
          try! handler.perform([self.coremlRequest])
          // In the request's completion handler:
          guard let results = request.results as? [VNClassificationObservation] else { return }
          guard let best = results.first?.identifier else { return }
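
      For completeness, a sketch of the one-time setup behind the `self.coremlRequest` the slide assumes (stored as a property in the real app), using the same hypothetical Inceptionv3 class as above:

          import Vision
          import CoreML

          // A sketch of creating the VNCoreMLRequest used by the slide's handler code.
          let coremlRequest: VNCoreMLRequest = {
              let visionModel = try! VNCoreMLModel(for: Inceptionv3().model)
              let request = VNCoreMLRequest(model: visionModel) { request, _ in
                  guard let results = request.results as? [VNClassificationObservation],
                        let best = results.first else { return }
                  print("\(best.identifier): \(best.confidence)")
              }
              // Vision resizes/crops the input image for the model automatically.
              request.imageCropAndScaleOption = .centerCrop
              return request
          }()
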
  53. 53. MPSCNN Accelerate (BNNS) Your App
  54. 54. MPSCNN Accelerate (BNNS) Core ML Vision Your App
  55. 55. MPSCNN Accelerate (BNNS) Core ML Vision Your App How about BNNS?
  56. 56. Deep Learning on CPU?!
  57. 57. CPU: 40 days GPU: 6 days Deep Learning on CPU?!
  58. 58. Asked @ Lab in WWDC17
  59. 59. “How can we utilize both MPSCNN and BNNS?”
  60. 60. "Basically use MPSCNN."
  61. 61. “OK, but, when should I choose BNNS?”
  62. 62. "I don't know."
  63. 63. • He added, "watchOS might be a case where you should use BNNS." • That's because watchOS doesn't support MPSCNN, but it does support BNNS. (I haven't tried it yet.)
  64. 64. My current understanding: • The cost of passing data between the CPU and the GPU is not small • When the network is small, that CPU <-> GPU transfer cost can outweigh the benefit of GPU parallel processing. BNNS might be better when the network is small.
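
      For reference, a sketch of what a single small fully connected layer looks like with the BNNS API in Accelerate (the iOS 10-era functions); sizes, weights, and biases are placeholder assumptions. For a network this tiny, staying on the CPU avoids the CPU <-> GPU transfer cost entirely:

          import Accelerate

          // A sketch of one fully connected layer with BNNS. All values are placeholders.
          let inSize = 4, outSize = 2
          let weights = [Float](repeating: 0.1, count: inSize * outSize)
          let biases  = [Float](repeating: 0.0, count: outSize)

          var inDesc  = BNNSVectorDescriptor(size: inSize,  data_type: BNNSDataTypeFloat32,
                                             data_scale: 0, data_bias: 0)
          var outDesc = BNNSVectorDescriptor(size: outSize, data_type: BNNSDataTypeFloat32,
                                             data_scale: 0, data_bias: 0)
          var params = BNNSFullyConnectedLayerParameters(
              in_size: inSize, out_size: outSize,
              weights: BNNSLayerData(data: weights, data_type: BNNSDataTypeFloat32,
                                     data_scale: 0, data_bias: 0, table: nil),
              bias:    BNNSLayerData(data: biases,  data_type: BNNSDataTypeFloat32,
                                     data_scale: 0, data_bias: 0, table: nil),
              activation: BNNSActivation(function: BNNSActivationFunctionRectifiedLinear,
                                         alpha: 0, beta: 0))

          let filter = BNNSFilterCreateFullyConnectedLayer(&inDesc, &outDesc, &params, nil)!
          let input: [Float] = [1, 2, 3, 4]
          var output = [Float](repeating: 0, count: outSize)
          if BNNSFilterApply(filter, input, &output) == 0 {
              print(output)   // the layer's activations, computed entirely on the CPU
          }
          BNNSFilterDestroy(filter)
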
  65. 65. Recap • Why is “Deep Learning on iOS” exciting? • How to implement “Deep Learning” on iOS - w/ MPSCNN (iOS 10) - w/ Core ML & Vision (iOS 11) - When to choose BNNS MPSCNN BNNS Core ML Vision
  66. 66. Thank you! https://github.com/shu223
