H2O World - H2O Deep Learning with Arno Candel


Published on

H2O World 2015

Tutorial scripts for R and Python are here:

- Powered by the open source machine learning software Contributors welcome at:
- To view videos on H2O open source machine learning software, go to:

Published in: Software

  1. H2O Deep Learning
     Arno Candel, PhD, Chief Architect, @ArnoCandel
  2. Why Deep Learning?
     • Deep Learning is trending (so it must be useful)
     [Chart: Google Trends timeline with axis ticks 2011, 2013, 2015; annotations: "Google starts Google Trends", "I join H2O", "Deep Learning is everywhere"]
  3. What is Deep Learning?
     • Deep Learning learns a hierarchy of non-linear transformations
     • Neurons transform their input in a non-linear way
     • Black-box, brute-force method, really good at pattern recognition
     • Deep Learning got a boost in the last decade due to faster hardware and algorithmic advances
     Deep Learning model = set of connecting weights + type of non-linearity
     [Diagram: inputs (Age, Income, FICO score) → Layer 1 hidden neurons → Layer 2 hidden neurons → outputs (Fraud, Legit)]
  4. Deep Learning: Practical Use
     Strengths:
     • non-linear
     • robust to correlated features
     • conceptually simple
     • learned features can be extracted
     • can stop training at any time
     • can be fine-tuned with more data
     • great ensemble member
     • efficient for multi-class problems
     • world-class at pattern recognition
     Weaknesses:
     • slow to train
     • slow to score
     • not interpretable
     • results not fully reproducible
     • theory not well understood
     • overfits, needs regularization
     • many hyper-parameters
     • expands categorical variables
     • must impute missing values
  5. H2O Deep Learning Features
     • H2O Eco-System Benefits:
       - Scalable to massive datasets on large clusters, fully parallelized
       - Low-latency Java ("POJO") scoring code is auto-generated
       - Easy to deploy on Laptop, Server, Hadoop cluster, Spark cluster, HPC
       - APIs include R, Python, Flow UI, Scala, Java, JavaScript, REST
     • Regularization techniques: Dropout, L1/L2
     • Early stopping, N-fold cross-validation, Grid search
     • Handling of categorical, missing and sparse data
     • Gaussian/Laplace/Poisson/Gamma/Tweedie regression with offsets, observation weights, various loss functions
     • Unsupervised mode for non-linear dimensionality reduction, outlier detection
  6. Learn More about H2O Deep Learning
     Tomorrow, 11:00 AM, Erdos Stage: Top 10 Deep Learning Tips & Tricks
  7. What do these stickers mean?
     • I have H2O installed
     • I have Python installed
     • I have R installed
     • I have the H2O World data sets
     Pick up stickers or get install help at the information booth
  8. Hands-On Tutorial
     • Introduction
     • Installation and Startup
     • Decision Boundaries
     • Cover Type Dataset
     • Exploratory Data Analysis
     • Deep Learning Model
     • Hyper-Parameter Search
     • Checkpointing
     • Cross-Validation
     • Model Save & Load
     • Regression and Binary Classification
     • Deep Learning Tips & Tricks (more tomorrow!)
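
Slide 3 describes a Deep Learning model as a set of connecting weights plus a type of non-linearity, with inputs (Age, Income, FICO score) passing through two hidden layers to the outputs Fraud and Legit. The following is a minimal sketch of that forward pass in plain Python, not H2O's implementation. The layer sizes, the weight values, the choice of ReLU and softmax, and the assumption that inputs are pre-scaled to roughly 0..1 are all illustrative assumptions that do not come from the slides.

```python
import math


def relu(values):
    """The non-linearity (assumed here to be ReLU): clip negatives to zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One layer of connecting weights: each row of `weights` feeds one neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical, hand-picked weights; a real model would learn these from data.
W1 = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]  # 3 inputs -> 2 hidden neurons
B1 = [0.0, 0.1]
W2 = [[1.0, -1.0], [0.5, 0.5]]             # 2 hidden -> 2 hidden neurons
B2 = [0.0, 0.0]
W_OUT = [[1.2, -0.7], [-1.2, 0.7]]         # 2 hidden -> Fraud, Legit
B_OUT = [0.0, 0.0]


def predict(features):
    """Apply the hierarchy of non-linear transformations layer by layer."""
    h1 = relu(dense(features, W1, B1))
    h2 = relu(dense(h1, W2, B2))
    return softmax(dense(h2, W_OUT, B_OUT))


# Assumed pre-scaled inputs: age, income, FICO score.
probs = predict([0.4, 0.6, 0.7])
print({"Fraud": probs[0], "Legit": probs[1]})
```

Swapping `relu` for another function changes the "type of non-linearity" without touching the weights, which is the separation the slide's formula points at.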
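
Slide 5 lists Dropout and L1/L2 as regularization techniques. As a rough illustration of what those terms usually mean, the sketch below adds L1 and L2 penalties to a loss value and applies "inverted" dropout to a list of activations. This is a generic textbook formulation in plain Python and is an assumption; H2O's own conventions (for example, scaling factors on the L2 term or how dropout is rescaled) may differ. The function names `penalized_loss` and `dropout` are hypothetical.

```python
import random


def penalized_loss(data_loss, weights, l1=0.0, l2=0.0):
    """Add L1 (sum of |w|) and L2 (sum of w^2) penalties to a data loss."""
    l1_term = l1 * sum(abs(w) for w in weights)
    l2_term = l2 * sum(w * w for w in weights)
    return data_loss + l1_term + l2_term


def dropout(activations, drop_ratio, rng):
    """Inverted dropout: zero each activation with probability `drop_ratio`
    and scale the survivors by 1 / (1 - drop_ratio)."""
    if not 0.0 <= drop_ratio < 1.0:
        raise ValueError("drop_ratio must be in [0, 1)")
    keep = 1.0 - drop_ratio
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Illustrative values only.
weights = [0.5, -1.0, 2.0]
loss = penalized_loss(1.0, weights, l1=0.01, l2=0.001)

rng = random.Random(0)  # seeded so the mask is repeatable
dropped = dropout([1.0] * 10, drop_ratio=0.5, rng=rng)
print(loss, dropped)
```

The penalties discourage large weights, while dropout randomly silences neurons during training so the network cannot rely on any single one; both address the "overfits, needs regularization" weakness listed on slide 4.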