Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

How to apply deep learning to 3 d objects

3,217 views

Published on

EuroPython presentation

https://ep2017.europython.eu/conference/talks/how-to-appy-deep-learning-for-3d-object

Published in: Science
  • Be the first to comment

How to apply deep learning to 3 d objects

  1. 1. How to apply Deep Learning to 3D objects
  2. 2. The Goal▸ First Part(General Strategy): Get knowledge about how to approach making a deep learning product ▸ Second Part(Specific case): Get knowledge about deep learning applied to 3D objects 2
  3. 3. Table of Contents Strategy 1. Find the right problem 2. Find the right method a. Getting Information 3. Keep rechallenging a. Keep tries small b. A lot of challenges 4. Focus a. Focus on the right problem b. Raise priority of the right method 3 Reference: https://www.iconfinder.com/
  4. 4. Table of Contents Specific Case 1. Deep Learning Model: VoxNet a. 3D CNN b. 3D Max Pool c. Fully connected d. Output e. Train 2. Improve technique a. Accuracy b. Speed 3. Results 4
  5. 5. HELLO! I am Masaya Ohgushi I work at Kabuku Inc. I am an image processing developer. Twitter: @SnowGushiGit 5
  6. 6. Kabuku Inc. ・On-demand manufacturing service - Receive 3D data to manufacture by using 3D printers and others 6 https://www.kabuku.co.jp/
  7. 7. Strategy 7
  8. 8. Strategy Find the Right problem 8
  9. 9. “Would you like to use deep learning to solve your problem?
  10. 10. Find the right problem ▸ What are the best cases to use Deep learning? ▹ Most cases ▹ Image processing ▹ Speech recognition ▹ Some cases ▹ Natural language processing ▹ Time series analysis 10
  11. 11. Right problem ▸ What are the worst cases to use Deep learning? ▹ Not enough data ▹ Can’t prepare a pre-train model ▹ Need 100% accuracy 11
  12. 12. Strategy Find the right method 12
  13. 13. “How can you find the best way to solve your problem using deep learning?
  14. 14. “Let’s search Google !!
  15. 15. “ Not possible!!
  16. 16. Find the Right method ▸ How to research the best deep learning solution? (In my case) ▹ Google scholar ▹ It is a paper search engine ▹ You can find the following ▹ Good methods ▹ Good keywords ▹ Which university laboratories know about this problem 16
  17. 17. Find the Right method ▸ How to research the best deep learning solution? (In my case) ▹ University laboratory sites ▹ It is possible to get data ▹ It is possible to get code ▹ GitXiv ▹ You can find papers and code ▹ Follow twitter users ▹ It is possible to get the latest information 17
  18. 18. Find the Right method ▸ How to research the best deep learning solution? (In my case) ▹ Book ▹ You can get well structured knowledge ▹ ArXiv ▹ You can find the latest methods ▹ Github ▹ You can find code ▹ Google ▹ If you already know a good keyword !! 18
  19. 19. Strategy Keep rechallenging 19
  20. 20. “You gathered a lot of training data !! Now let’s train using the full data set.
  21. 21. “ Not possible!!
  22. 22. Keep rechallenging ▸ Keep tries small ▹ If you get a lot of data, first do the following things. ▹ Prepare a small data set ▹ Check module works correctly ▹ Prepare an easy to verify training data set ▹ Most models can be trained with data such as “mnist” , you have to check it works 22 Reference: https://www.tensorflow.org/get_started/mnist/beginners
  23. 23. Keep rechallenging ▸ A lot of challenges ▹ There are no obvious methods to improve accuracy ▹ You have to check the results ▹ If training and validation accuracies don’t improve, you have to stop it ▹ Check the results by visual boards such as TensorBoard ▹ You have to increasing challenge times by improving the calculation speed ▹ Using GPU ▹ Optimize CPU 23
  24. 24. Strategy Focus 24
  25. 25. “ Deep learning has a lot of methods to improve accuracy
  26. 26. “- Model - How deep - How it is structured - Adjusting the hyper parameters - Preprocess data - Data augmentation (If using Graphical data) - Optimizer - SGD, Adam, etc
  27. 27. Focus ▸ Focus on the right problem ▹ Depends on your situation ▹ Enough computation resource and enough data ▹ Try a deep and complex model ▹ Enough computation resource, but not enough data ▹ Find a good pre-train model ▹ Focus on the preprocess such as data augmentation 27
  28. 28. Focus ▸ Focus on the right problem ▹ Depending on your situation ▹ Not enough computation resource or data ▹ Consider other ways to solve your problem ▹ Logistic Regression, SVM, Random Forest ▹ Deep learning probably isn’t the best choice 28
  29. 29. Specific Case Deep Learning applied to 3D objects 29
  30. 30. Specific Case Deep Learning Model: VoxNet 30
  31. 31. “There are a lot of deep learning models… How to choose one
  32. 32. Deep Learning Model: VoxNet 32 ▸ In my case, I considered 3 things ▹ Resource ▹ Computation resources ▹ Human resources ▹ Performance ▹ accuracy ▹ Speed ▹ Speed of development
  33. 33. Deep Learning Model: VoxNet 33 Reference: http://ri.cmu.edu/pub_files/2015/9/voxnet_maturana_scherer_iros15.pdf.
  34. 34. Deep Learning Model: VoxNet 34 ▸ VoxNet Advantage ▹ Resource ▹ Computation resources ▹ Good ▹ Memory 32GB (In my environment) ▹ GPU GeForce GTX 1080 (In my environment) ▹ Performance ▹ Accuracy ▹ 83 % accuracy (Top model 95 %) ▹ Speed ▹ Open source, simple code
  35. 35. 35 Deep Learning Model: VoxNet ▸ Voxelize ▹ Maps 3D data to a 32 * 32 * 32 voxel ▹ Reduce data size
  36. 36. Deep Learning Model: VoxNet (3D CNN 3D objects) 36 ▸ Convolution 3D Reference: https://www.youtube.com/watch?v=ecbeIRVqD7g
  37. 37. Deep Learning Model: VoxNet (3D CNN 3D objects) 37 1 5 1 6 2 7 4 8 3 1 2 2 1 3 0 4 kernel: 2x2 stride: 2 Convolution 2D Input Image(4x4) 5 1 3 2 1*5+1*1 + 5*3+6*2 2*5+4*1 + 7*3+8*2 3*5+2*1 + 1*3+2*2 1*5+0*1 + 3*3+4*2 33 51 24 8 Convoluted Image(2x2)
  38. 38. Deep Learning Model: VoxNet (3D CNN 3D objects) 38 ▸ Convolution 3D
  39. 39. Deep Learning Model: VoxNet (3D CNN 3D objects) 39 ▸ Convolution 3D
  40. 40. Deep Learning Model: VoxNet (3D CNN 3D objects) 40 ▸ Convolution 3D
  41. 41. Deep Learning Model: VoxNet (3D CNN 3D objects) 41 ▸ Convolution 3D
  42. 42. Deep Learning Model: VoxNet (3D CNN 3D objects) 42 ▸ Convolution 3D
  43. 43. Deep Learning Model: VoxNet (3D CNN 3D objects) 43 ▸ Convolution 3D
  44. 44. Deep Learning Model: VoxNet (3D CNN 3D objects) 44 ▸ Convolution 3D
  45. 45. Deep Learning Model: VoxNet (3D CNN 3D objects) 45 7 times ▸ Convolution 3D
  46. 46. Deep Learning Model: VoxNet (3D CNN 3D objects) 46 32 filter : : 3DCNN Conv3D(input_shape=(32, 32, 32, 1), kernel_size=(5, 5, 5), strides=(2, 2, 2), data_format=”channels_last” ) ▸ Convolution 3D
  47. 47. Deep Learning Model: VoxNet (Max Pool3D) 47 1 5 1 6 2 7 4 8 3 1 2 2 1 3 0 4 6 8 3 4 Max pool Pooling size: 2 x 2 Stride: 2 ▸ Max Pool 2D Convolution Feature
  48. 48. Deep Learning Model: VoxNet (Max Pool3D) 48 ▸ Max Pool 3D Convolution Feature Pool window
  49. 49. Deep Learning Model: VoxNet (Max Pool3D) 49 Convolution Feature ▸ Max Pool 3D
  50. 50. Deep Learning Model: VoxNet (Max Pool3D) 50 Convolution Feature ▸ Max Pool 3D
  51. 51. Deep Learning Model: VoxNet (Max Pool3D) 51 MaxPooling3D(pool_size=(2, 2, 2), data_format='channels_last',) ▸ Max Pool 3D
  52. 52. Deep Learning Model: VoxNet (Max Pool3D) 52 ▸ Fully Connected and Output 3DCNN & 3DMaxPool : : Fully connected Dense 128 Dense number of class softmax
  53. 53. Deep Learning Model: VoxNet (Flatten, Dense) 53 ▸ Fully Connected and Output ▹ Softmax function ▹ It maps output as a probability distribution ▹ It is easy to differentiate
  54. 54. Deep Learning Model: VoxNet (Output) 54 ▸ Fully Connected and Output model.add(Flatten()) model.add(Dense(128, activation='linear',)) model.add(Dense(output_dim=number_class, activation='linear',)) model.add(Activation("softmax"))
  55. 55. Deep Learning Model: VoxNet 55 model = Sequential() model.add(Conv3D(input_shape=(32, 32, 32, 1), kernel_size=(5, 5, 5), strides=(2, 2, 2), data_format='channels_last',)) model.add(Conv3D(input_shape=(32, 32, 32, 1), kernel_size=(3, 3, 3), strides=(1, 1, 1), data_format='channels_last',)) model.add(MaxPooling3D(pool_size=(2, 2, 2), data_format='channels_last',)) model.add(Flatten()) model.add(Dense(128, activation='linear',)) model.add(Dense(output_dim=number_class, activation='linear',)) model.add(Activation(‘softmax’)) ▸ Model
  56. 56. Deep Learning Model: VoxNet 56 ▸ Train model.compile(loss='categorical_crossentropy', metrics=['accuracy']) model.fit(x_voxel_data, y_class_label)
  57. 57. Specific Case Improve technique accuracy 57
  58. 58. Improve technique 58 ▸ Improving accuracy has 2 approaches ▹ Model ▹ Advantage ▹ A variety of ways to improve accuracy ▹ Disadvantage ▹ A deep model takes a lot of resources ▹ It is not obvious which model is better ▹ Data ▹ Advantage ▹ The effect of changes are obvious ▹ Disadvantage ▹ Approaches are limited
  59. 59. Improve accuracy 59 ▸ Improve accuracy for validation data(In my case) ▹ Model ▹ RandomDropout ▹ LeakyRelu ▹ Data ▹ Data augmentation(3D data) ▹ Data increase ▹ Class weight for Unbalanced category data
  60. 60. Improve technique 60 ▸ Data Approach ▹ Data Augmentation has advantages over other methods ▹ The effects are obvious ▹ It does not increase calculation time unlike adding layers to the model
  61. 61. Improve technique 61 ▸ Data Augmentation ▹ Rotation ▹ Shift ▹ Shear ▹ etc...
  62. 62. Improve technique 62 ▸ Data Augmentation 3D ▹ Augmentation_matrix is changing channel_images = [ndi.interpolation.affine_transform(x, augmentation_matrix, )for x in x_voxel] x = np.stack(channel_images, axis=0)
  63. 63. Improve technique 63 ▸ Data Augmentation 3D Rotation rotation_matrix_y = np.array([[np.cos(theta), 0, np.sin(theta) , 0], [0 , 1, 0 , 0], [-np.sin(theta), 0, np.cos(theta), 0], [0 , 0 , 0 , 1]])
  64. 64. Improve technique 64 ▸ Data Augmentation 3D Shift shift_matrix = np.array([[1, 0, 0, shift_x], [0, 1, 0, shift_y], [0, 0, 1, shift_z], [0, 0, 0, 1 ]])
  65. 65. Improve technique 65 ▸ Data Augmentation 3D Shear shear_matrix = np.array([[1 , shear_x, shear_x, 0], [shear_y, 1 , shear_y, 0], [shear_z, shear_z, 1 , 0], [0 , 0 , 0 , 1] ])
  66. 66. Improve accuracy 66 ▸ Data increase Add Data augmented data to Training data
  67. 67. Specific Case Improve technique speed 67
  68. 68. Improve calculation speed 68 ▸ Deep Learning has a lot of ways to improve calculation speed (In my case) ▹ Use a GPU(GeForce GTX 1080: Memory 8GB) ▹ CPU optimization ▹ Multi thread ▹ Prepare feature set
  69. 69. Improve calculation speed 69 ▸ CPU optimize (TensorFlow build option) bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma
  70. 70. Specific Case Results 70
  71. 71. Results 71 ▸ Validation Accuracy
  72. 72. Result 72 Method Explanation Training (accuracy) Validation (accuracy) BaseLine BaseLine 90% 79% Shift_x_Shift_y Data augmentation(x-shift, y-shift) 80% 80% Shift_x_Shift_y_class _weight Data augmentation(x-shift, y-shift) + class weight 80% 83% Add_Shift_x_Shift_y_ class_weight Data augmentation(x-shift, y-shift) + class weight + ADD(x-shift, y-shift) 85% 85%
  73. 73. Conclusion 73
  74. 74. Conclusion Summary of this presentation 74
  75. 75. Conclusion 75 Ref: https://www.iconfinder.com/, https://github.com/, https://arxiv.org/, http://www.gitxiv.com/, https://scholar.google.co.jp/ Right problem laboratory Rechallenge Focus ▸ Strategy Right Problem Rechallenge FocusRight Method
  76. 76. Conclusion 76 ▸ Our case Right Problem Right Method Rechallenge Focus Data augmentation Customize model 3D object recognition VoxNet On-demand manufacturing service
  77. 77. We’re hiring Recruiting 77
  78. 78. Deep Learning For 3D objects It is a rare case, Implementing deep learning for 3D objects 78
  79. 79. “We are hiring!! - Deep Learning for 3D objects - Working in Japan https://www.wantedly.com/project s/111707 contact@kabuku.co.jp
  80. 80. 80 THANKS! Any questions? You can find me at @SnowGushiGit & masaya.ohgushi@kabuku.co.jp
  81. 81. References 81
  82. 82. References ▸ MNIST datasets ▹ https://www.tensorflow.org/get_started/mnist/beginners ▸ Flaticon ▹ www.flaticon.com ▸ 3D CNN-Action Recognition Part-1 ▹ https://www.youtube.com/watch?v=ecbeIRVqD7g&t=82s ▸ Bengio, Yoshua, et al. "Curriculum learning." Proceedings of the 26th annual international conference on machine learning. ACM, 2009. ▸ He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. ▸ Yann, Margot Lisa-Jing, and Yichuan Tang. "Learning Deep Convolutional Neural Networks for X-Ray Protein Crystallization Image Analysis." AAAI. 2016. ▸ Maturana, Daniel, and Sebastian Scherer. "Voxnet: A 3d convolutional neural network for real-time object recognition." Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on. IEEE, 2015. 82
  83. 83. References ▸ Deep Learning Book ▹ http://www.deeplearningbook.org/ ▸ IconFinder ▹ https://www.iconfinder.com/, ▸ Github ▹ https://github.com/, ▸ Arxiv ▹ https://arxiv.org/ ▸ GitXiv ▹ http://www.gitxiv.com/ ▸ GoogleSchlor ▹ https://scholar.google.co.jp/ ▸ 83

×