Quoc Le, Stanford & Google - Tera Scale Deep Learning

Transcript

  • 1. Tera-scale deep learning. Quoc V. Le, Stanford University and Google. Joint work with Kai Chen, Greg Corrado, Jeff Dean, Matthieu Devin, Rajat Monga, Andrew Ng, Marc'Aurelio Ranzato, Paul Tucker, Ke Yang.
  • 2. Machine learning successes: face recognition, OCR, autonomous car, email classification, recommendation systems, web page ranking. Quoc Le
  • 3. The role of feature extraction in pattern recognition. Diagram: feature extraction (mostly hand-crafted features) feeding a classifier.
  • 4. Hand-crafted features. Computer vision: SIFT/HOG, SURF, … Speech recognition: MFCC, spectrogram, ZCR, …
  • 5. New feature-designing paradigm: unsupervised feature learning / deep learning. Shows promise for small datasets. Expensive and typically applied to small problems.
  • 6. The trend of Big Data.
  • 7. Brain simulation. Diagram: an image feeding a stack of three autoencoders. Watching 10 million YouTube video frames. Train on 2000 machines (16000 cores) for 1 week. 1.15 billion parameters: 100x larger than previously reported; small compared to visual cortex. Le, et al., Building high-level features using large-scale unsupervised learning. ICML 2012.
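To make the term "autoencoder" on slide 7 concrete, here is a minimal sketch in plain Python. It is not the network from the talk: it assumes a single linear layer with tied weights trained by per-example SGD on toy data, and the names `train_autoencoder`, `n_hidden`, `lr`, and `epochs` are hypothetical. It only illustrates the core idea of learning a code by minimizing reconstruction error without labels.

```python
import random


def train_autoencoder(data, n_hidden, lr=0.05, epochs=200, seed=0):
    """Train a tiny linear autoencoder with tied weights (illustrative only).

    Encoder: h = W x. Decoder: r = W^T h. Loss: 0.5 * ||r - x||^2.
    Returns (W, losses), where losses[k] is the mean squared
    reconstruction error measured during epoch k.
    """
    rng = random.Random(seed)
    n_in = len(data[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in data:
            # Encode, decode, and measure the reconstruction error.
            h = [sum(W[j][i] * x[i] for i in range(n_in)) for j in range(n_hidden)]
            r = [sum(W[j][i] * h[j] for j in range(n_hidden)) for i in range(n_in)]
            e = [r[i] - x[i] for i in range(n_in)]
            total += sum(v * v for v in e)
            # Backpropagate the error to the hidden code: g = W e.
            g = [sum(W[j][k] * e[k] for k in range(n_in)) for j in range(n_hidden)]
            # Tied weights get two gradient terms: decoder path and encoder path.
            for j in range(n_hidden):
                for i in range(n_in):
                    W[j][i] -= lr * (h[j] * e[i] + g[j] * x[i])
        losses.append(total / len(data))
    return W, losses


# Toy data: 3-D points lying on a single line, so a 1-D code suffices.
norm = 6 ** 0.5
direction = [1 / norm, 2 / norm, -1 / norm]
points = [[(-1 + 2 * k / 19) * d for d in direction] for k in range(20)]
W, losses = train_autoencoder(points, n_hidden=1)
print("first epoch loss:", losses[0], "last epoch loss:", losses[-1])
```

No labels are used anywhere in the loop; the input itself is the training target, which is what "unsupervised" refers to on the following slides.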
  • 8. Key results: face detector, human body detector, cat detector. Totally unsupervised! ~85% correct in classifying face vs. no face. Le, et al., Building high-level features using large-scale unsupervised learning. ICML 2012.
  • 9. ImageNet classification. Random guess: 0.005%. State-of-the-art (Weston, Bengio '11): 9.5%. Feature learning from raw pixels: 15.8%. ImageNet 2009 (10k categories): best published result: 17% (Sanchez & Perronnin '11); our method: 20%. Using only 1000 categories, our method > 50%.
  • 10. Scaling up deep learning (prior art vs. our work). # Examples: 100,000 vs. 10,000,000. # Dimensions: 1,000 vs. 10,000. # Parameters: 10,000,000 vs. 1,000,000,000. Data set size: Gbytes vs. Tbytes. Learned features: edge filters from images vs. high-level features (face, cat detectors).
  • 11. Summary of scaling up: local connectivity (model parallelism); asynchronous SGDs (clever optimization / data parallelism); RPCs; prefetching; single; removing slow machines; lots of optimization.
  • 12. Locally connected networks. Diagram: the image and the features computed from it are partitioned across Machine #1, Machine #2, Machine #3, Machine #4.
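The local-connectivity idea on slides 11 and 12 can be sketched as follows. This is a toy illustration under stated assumptions, not the system from the talk: the input is a 1-D list, the slices are contiguous and non-overlapping, and each "machine" is just a loop iteration. The helper names `partition` and `local_features` are hypothetical.

```python
def partition(n_inputs, n_machines):
    """Split input indices into contiguous, non-overlapping slices, one per machine."""
    size = n_inputs // n_machines
    slices = []
    for m in range(n_machines):
        start = m * size
        # The last machine also takes any leftover inputs.
        stop = (m + 1) * size if m < n_machines - 1 else n_inputs
        slices.append(range(start, stop))
    return slices


def local_features(image, weights_per_machine, slices):
    """Compute features where each machine sees only its own slice of the image.

    weights_per_machine[m] is a list of rows; each row holds one feature's
    weights over machine m's slice only, so no machine stores or needs the
    weights belonging to another machine.
    """
    features = []
    for weights, idx in zip(weights_per_machine, slices):
        patch = [image[i] for i in idx]
        for row in weights:
            features.append(sum(w * p for w, p in zip(row, patch)))
    return features


image = list(range(8))                      # toy 8-pixel "image"
slices = partition(len(image), 4)           # 4 machines, 2 pixels each
weights = [[[1.0, 1.0]] for _ in range(4)]  # one feature per machine
feats = local_features(image, weights, slices)

# Weight counts: a dense layer connects every feature to every pixel,
# while the local layer connects each feature only to its own slice.
dense_weights = len(feats) * len(image)
local_weights = sum(len(row) for w in weights for row in w)
print(feats, dense_weights, local_weights)
```

Because each feature depends only on its own slice, the weights can be stored and updated on separate machines with little cross-machine traffic. A real locally connected network would typically use overlapping receptive fields, which this sketch leaves out.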
  • 13. Asynchronous parallel SGDs (Alex Smola's talk). Diagram: parameter server.
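A minimal sketch of asynchronous SGD with a parameter server, as named on slide 13, might look like the following. It is an illustration under assumptions, not the talk's implementation: workers are Python threads rather than machines, the model is a single scalar weight fit to noise-free data from y = 3x, and `ParameterServer`, `worker`, and `train_async` are hypothetical names.

```python
import threading


class ParameterServer:
    """Holds the shared parameter; workers pull it and push gradients."""

    def __init__(self, w0=0.0, lr=0.05):
        self._w = w0
        self._lr = lr
        self._lock = threading.Lock()

    def pull(self):
        with self._lock:
            return self._w

    def push(self, grad):
        # Apply the gradient as soon as it arrives; no waiting for other workers.
        with self._lock:
            self._w -= self._lr * grad


def worker(server, shard, steps):
    """Run SGD on one data shard against the shared parameter."""
    for step in range(steps):
        x, y = shard[step % len(shard)]
        w = server.pull()        # may be stale by the time the gradient is pushed
        grad = (w * x - y) * x   # gradient of 0.5 * (w*x - y)^2 with respect to w
        server.push(grad)


def train_async(data, n_workers=4, steps=500):
    """Split the data into shards (data parallelism) and train all workers at once."""
    server = ParameterServer()
    shards = [data[i::n_workers] for i in range(n_workers)]
    threads = [threading.Thread(target=worker, args=(server, s, steps)) for s in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.pull()


data = [(x / 10, 3 * x / 10) for x in range(1, 11)]  # samples of y = 3x
w = train_async(data)
print("learned weight:", w)
```

The workers never synchronize with each other, only with the server, so a slow worker does not stall the rest. The price is that some gradients are computed from slightly out-of-date parameters, which a small learning rate tolerates in this toy setting.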
  • 14. Conclusions. Scale deep learning 100x larger using distributed training on 1000 machines. Brain simulation -> cat neuron. State-of-the-art performance on object recognition (ImageNet), action recognition, and cancer image classification. Other applications: speech recognition, machine translation. Figures: ImageNet results (random guess 0.005%, best published result 9.5%, our method 15.8%); model parallelism; data parallelism with parameter server; cat neuron; face neuron.
  • 15. References
    • Q.V. Le, M.A. Ranzato, R. Monga, M. Devin, G. Corrado, K. Chen, J. Dean, A.Y. Ng. Building high-level features using large-scale unsupervised learning. ICML, 2012.
    • Q.V. Le, J. Ngiam, Z. Chen, D. Chia, P. Koh, A.Y. Ng. Tiled Convolutional Neural Networks. NIPS, 2010.
    • Q.V. Le, W.Y. Zou, S.Y. Yeung, A.Y. Ng. Learning hierarchical spatio-temporal features for action recognition with independent subspace analysis. CVPR, 2011.
    • Q.V. Le, J. Ngiam, A. Coates, A. Lahiri, B. Prochnow, A.Y. Ng. On optimization methods for deep learning. ICML, 2011.
    • Q.V. Le, A. Karpenko, J. Ngiam, A.Y. Ng. ICA with Reconstruction Cost for Efficient Overcomplete Feature Learning. NIPS, 2011.
    • Q.V. Le, J. Han, J. Gray, P. Spellman, A. Borowsky, B. Parvin. Learning Invariant Features for Tumor Signatures. ISBI, 2012.
    • I.J. Goodfellow, Q.V. Le, A.M. Saxe, H. Lee, A.Y. Ng. Measuring invariances in deep networks. NIPS, 2009.
    http://ai.stanford.edu/~quocle