ICIP2014 Presentation

I presented "Hand Posture Recognition Based on Bottom-up Structured Deep Convolutional Neural Network with Curriculum Learning" at ICIP 2014.

Published in: Engineering

  1. Hand Posture Recognition Based on Bottom-up Structured Deep Convolutional Neural Network with Curriculum Learning. Takayoshi Yamashita (Chubu University), Taro Watasue (Tome RD)
  2. Can you figure this out? $\int \frac{1}{x^2+1}\,dx$
  3. Can a child figure this out? $\int \frac{1}{x^2+1}\,dx$
  4. To solve this equation, $\int \frac{1}{x^2+1}\,dx$, you need fundamental knowledge of math, studied in the order of the curriculum (together with knowledge from other classes): arithmetic, equations, differentials, square roots, integration, physics, ...
  5. Key idea of this presentation: inspired by human knowledge acquisition
     - Train a good feature representation using curriculum learning
     - Transfer the knowledge (networks) from a heterogeneous task
  6. Deep Conv. Nets
     - Deep architecture consisting of convolutional, sampling, and fully connected layers [LeCun 1998]
     - It is invariant to translation of the object
     - CNN + ReLU, dropout, normalization, etc. [Krizhevsky 2012]
     - Recognizes 1000 object categories
     - Top performance in the Large Scale Visual Recognition Challenge 2012
  7. Curriculum Learning
     - Train while changing the difficulty of the training dataset (similar to bootstrapping, but different)
     - [Network diagram] Initial training with a simple set (square size), then update with a complex set (various aspect ratios)
     - We propose a novel curriculum learning that updates the network from a heterogeneous task
     - Y. Bengio, J. Louradour, R. Collobert, J. Weston, "Curriculum Learning", ICML 2009.
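The easy-to-hard schedule on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: `curriculum_stages`, `aspect_difficulty`, and the crop sizes are hypothetical, and using cumulative stages (later stages keep the easy samples) is an assumption.

```python
def curriculum_stages(samples, difficulty, n_stages=2):
    """Return cumulative training sets ordered from easy to hard.

    Stage 1 holds only the easiest samples; the last stage holds all of them.
    (Cumulative stages are an assumption, not something the slides state.)
    """
    ordered = sorted(samples, key=difficulty)
    stages = []
    for s in range(1, n_stages + 1):
        cutoff = (len(ordered) * s) // n_stages
        stages.append(ordered[:cutoff])
    return stages


def aspect_difficulty(crop):
    """Hypothetical difficulty score: 0 for a square crop, larger otherwise."""
    width, height = crop
    return abs(width / height - 1.0)


# Hypothetical (width, height) pairs standing in for training crops.
crops = [(40, 40), (40, 20), (30, 30), (20, 50), (40, 25)]
stages = curriculum_stages(crops, aspect_difficulty)

for i, stage in enumerate(stages, start=1):
    print(f"stage {i}: {stage}")
```

The first stage contains only the square crops, and the second adds the crops with other aspect ratios.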
  8. Proposed method
     - Train a good feature representation using curriculum learning
     - Transfer the knowledge from a heterogeneous task
     - Target: hand gesture recognition
     - Main idea of the proposed method: we train the network with two curricula, and the two curricula are heterogeneous
  9. Proposed method: training (1)
     - Train the network on a segmentation task
     - Network: convolutional layer, pooling layer, convolutional layer, pooling layer, fully connected layer, binarization layer
     - Input data: grayscale image
     - Ground truth: hand-segmented image
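One possible reading of the binarization layer, as a sketch: the architecture slide gives 1600 output nodes for the segmentation task, and 40 x 40 = 1600, so each output may correspond to one pixel of the hand mask. The row-major layout, the 0.5 threshold, and the name `binarize` are assumptions for illustration.

```python
def binarize(outputs, size=40, threshold=0.5):
    """Turn size*size output activations into a size x size binary hand mask.

    Assumes one output node per pixel in row-major order, and a fixed
    threshold; neither detail is confirmed by the slides.
    """
    if len(outputs) != size * size:
        raise ValueError(f"expected {size * size} outputs, got {len(outputs)}")
    return [
        [1 if outputs[row * size + col] >= threshold else 0 for col in range(size)]
        for row in range(size)
    ]


# Hypothetical activations: a vertical band (columns 10..29) is "hand".
outputs = [0.9 if 10 <= i % 40 < 30 else 0.1 for i in range(1600)]
mask = binarize(outputs)
print(len(mask), len(mask[0]), sum(mask[0]))
```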
  10. Proposed method: training (2)
     - Transfer the network to the classification task
     - Use the parameters trained on segmentation as the initial parameters, then update them
     - Input data: grayscale image
     - Ground truth: class label
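The transfer step might look like the sketch below. The slides say the segmentation-trained parameters are used as initial values and that the output layer has 6 nodes for classification (1600 for segmentation). The dictionary layout, the layer names, and the small random initialization of the new output layer are assumptions.

```python
import copy
import random


def transfer_for_classification(seg_params, n_hidden=200, n_classes=6, seed=0):
    """Reuse segmentation-trained layers; replace only the output layer.

    seg_params is a hypothetical dict mapping layer name to parameters.
    The new output layer gets small random weights (an assumed choice).
    """
    rng = random.Random(seed)
    params = copy.deepcopy(seg_params)  # keep the segmentation network intact
    params["output"] = {
        "weights": [
            [rng.uniform(-0.01, 0.01) for _ in range(n_hidden)]
            for _ in range(n_classes)
        ],
        "bias": [0.0] * n_classes,
    }
    return params


# Hypothetical, heavily shortened parameter set from the segmentation stage.
seg_params = {
    "conv1": {"weights": [0.1, -0.2, 0.3]},
    "conv3": {"weights": [0.05, 0.07]},
    "fc5": {"weights": [0.4, -0.4]},
    "output": {"weights": [[0.0] * 200 for _ in range(1600)], "bias": [0.0] * 1600},
}
cls_params = transfer_for_classification(seg_params)
print(len(cls_params["output"]["weights"]), len(seg_params["output"]["weights"]))
```

All layers except the output then continue training on class labels, starting from the copied values.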
  11. Proposed method: prediction
     - Classify the object using only the updated network
     - Input data: grayscale image
     - Output: class label (the slide's example shows 5)
  12. Experiment (1)
     - Evaluation data
       - 6 classes: hand shape poses
       - # of training images: 20K images (by data augmentation)
       - # of testing images: 1600 (each class)
     - Comparison
       - Task: hand shape classification
       - Comparison methods: training without curriculum learning; training with the proposed curriculum learning
  13. Experiment (2): network architecture
     layer | setting
     input layer | 40x40 pixels (grayscale image)
     1st convolutional layer | kernel size: 5x5; # of kernels: 32; activation function: Maxout
     2nd pooling layer | pooling: max pooling; size: 2x2
     3rd convolutional layer | kernel size: 5x5; # of kernels: 32; activation function: Maxout
     4th pooling layer | pooling: max pooling; size: 2x2
     5th fully connected layer | # of nodes: 200; activation function: sigmoid
     output classification layer (binarization layer) | # of nodes: 6 (or 1600 for the segmentation task)
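As a rough check of the sizes in this table, the feature-map sizes can be traced layer by layer. The slides do not state padding or stride, so this sketch assumes unpadded ("valid") convolutions with stride 1, non-overlapping pooling, and 32 feature maps after the Maxout activation. The helper names are hypothetical.

```python
def conv_out(size, kernel):
    """Output width of an unpadded, stride-1 convolution (assumed setting)."""
    return size - kernel + 1


def pool_out(size, window):
    """Output width of non-overlapping pooling (assumed setting)."""
    return size // window


size = 40  # input is 40x40 pixels
trace = [("input", size)]
layers = [
    ("conv1", conv_out, 5),
    ("pool2", pool_out, 2),
    ("conv3", conv_out, 5),
    ("pool4", pool_out, 2),
]
for name, op, k in layers:
    size = op(size, k)
    trace.append((name, size))

# Features entering the 200-node fully connected layer, assuming 32 maps.
flat_features = 32 * size * size

for name, width in trace:
    print(f"{name}: {width}x{width}")
print("flattened features:", flat_features)
```

Under these assumptions the maps shrink 40, 36, 18, 14, 7, giving 32 x 7 x 7 = 1568 inputs to the fully connected layer.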
  14. Experiment (3): training parameters
     - # of updates: 200K
     - learning rate: 0.005 to 0.007
     - mini-batch size: 10
     - dropout: 50%
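To put these numbers in perspective, the sketch below converts the update count into passes over the training set. It assumes that one update processes one mini-batch and uses the 20K training images from Experiment (1); the function name is hypothetical.

```python
def passes_over_dataset(n_updates, batch_size, n_train):
    """Number of passes over the training set, assuming one mini-batch per update."""
    samples_seen = n_updates * batch_size
    return samples_seen / n_train


epochs = passes_over_dataset(n_updates=200_000, batch_size=10, n_train=20_000)
print(epochs)
```

Under that assumption, 200K updates with a mini-batch of 10 amount to 100 passes over the 20K training images.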
  15. Training error [plot; axis values not recoverable]
  16-18. [Figures; no recoverable text]
  19. Performance
  20. [Confusion matrix; axis label: ground truth class]
  21. [Confusion matrices without and with curriculum learning; axes: ground truth class vs. classification class]
  22. Visualization of kernels: 1st and 2nd convolutional layers
     - Without curriculum learning: total number of updates: 200000
     - With curriculum learning: total number of updates: 200000 (segmentation: 50000 + recognition: 15000)
  23. Intermediate parameters visualization: number of updates from 0 to 50000
  24. Intermediate parameters visualization: final parameters of the binarization layer [network diagram]
  25-27. Intermediate parameters visualization (continued): final parameters of the binarization layer [network diagram]
  28. Hand shape segmentation: extract the hand region from a grayscale image with a cluttered background
  29. Conclusion
     - We propose a training method for deep convolutional neural networks with curriculum learning
     - As the curriculum, the method transfers the network from a heterogeneous task (segmentation to classification)
     - The method is able to improve the feature representation
     - Future work: apply the method to other objects and new curricula
  30. Thank you for your attention
