Sri Ambati – CEO, 0xdata at MLconf ATL


"Comparing Variable Importance from Ensemble and Deep Learning Methods for AdTech Data"

Variable importance brings interpretability to popular black-box modeling techniques. In this talk we study the performance of popular ensemble techniques such as Random Forest and Gradient Boosting and compare them with GLM. We observe certain traits that are magnified by non-linear techniques like Deep Learning but are otherwise missed by GBM or Random Forest.
We describe H2O, an open-source, scalable machine learning package whose ease of use and speed make it more natural to compare models, pick the best of breed, and build ensembles. H2O's implementations of these algorithms closely track popular open-source and textbook implementations.
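
To make the idea of variable importance concrete, here is a minimal sketch of permutation importance, a model-agnostic technique: shuffle one feature at a time and measure how much accuracy drops. This is an illustration only and is not claimed to be the method H2O or the talk uses. The toy data, the `toy_model` function, and the `permutation_importance` helper are all hypothetical names introduced for this example.

```python
import random


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Return the accuracy drop observed when each feature is shuffled in turn."""
    rng = random.Random(seed)

    def accuracy(data):
        hits = sum(1 for x, y in zip(data, labels) if predict(x) == y)
        return hits / len(labels)

    baseline = accuracy(rows)
    importances = []
    for j in range(n_features):
        # Shuffle column j only, breaking its link to the label.
        column = [x[j] for x in rows]
        rng.shuffle(column)
        shuffled = [x[:j] + [v] + x[j + 1:] for x, v in zip(rows, column)]
        importances.append(baseline - accuracy(shuffled))
    return importances


# Hypothetical toy data: the label depends only on feature 0.
data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random(), data_rng.random()] for _ in range(500)]
labels = [1 if x[0] > 0.5 else 0 for x in rows]


def toy_model(x):
    """Stand-in for a trained black-box model (assumption for illustration)."""
    return 1 if x[0] > 0.5 else 0


scores = permutation_importance(toy_model, rows, labels, n_features=3)
print(scores)
```

Because the stand-in model ignores features 1 and 2, shuffling them leaves accuracy unchanged, while shuffling feature 0 costs a large share of the accuracy. Different model families can rank the same variables differently, which is the kind of comparison the talk is about.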

Published in: Technology

  1. Open Source Machine Learning for Intelligent Applications Machine Intelligence
  2. Time is the only non-renewable resource. Speed Matters!
  3. Sampling Law of Large Numbers
  4. On Premise On / Off Hadoop On EC2 Per Node 2M Row ingest/sec 50M Row Regression/sec 750M Row Aggregates / sec Tableau R JSON Scala Java Python H2O Prediction Engine SDK / API Nano Fast Scoring Engine Deep learning Regression Trees Boosting Forests Solvers Gradients ensembles Cluster Query Processor R-engine In-Mem Map Reduce Distributed fork/join Memory Manager Columnar Compression Classify HDFS S3 SQL NoSQL Excel
  5. Infrastructure Parallelism Data Parallel Chunking Express! Algorithm Parallel Parallel Code blocks Math Parallelism ADMM, HogWild Distribution Zero-Serialization – endian wars have ended
  6. Scalable Machine Learning For Smarter Applications
  7. Programmable Internet
  8. Programmable Devices
  9. AdSense Sense
  10. Correlation Causality
  11. Data Sensors Devices Events. Signals. TimeSeries Semi-structured data. json. High velocity. High dimensions.
  12. Streaming Data Historical Data Scoring from prediction Anomaly and Outliers Detection Unsupervised Learning
  13. Streaming Data Historical Data Anomaly and Outliers Detection model Scoring from prediction
  14. Streaming Data Historical Data Clustering / Unsupervised Learning model Scoring from prediction
  15. Machine Intelligence
  16. Take Models to Production in Java
  17. Onset of Rita
  18. Common ensemble techniques. Bayesian Classifiers: ensembles of all hypotheses in hypothesis-space. Bagging: each model votes with equal weight; bagging trains models on a randomly drawn subset. Boosting: incrementally build an ensemble of each new model
  19. Machine Intelligence
  20. Machine Intelligence
  21. Gradient Boosting Machine
  22. Machine Intelligence
  23. Machine Intelligence
  24. Variable Importance Comparison: Gradient Boosting Machine, 50 trees; Random Forest, 50 trees
  25. Generalized Linear Modeling – Variable Importance: GLM, Elastic Net (Binomial); GLM, Elastic Net (Binomial), Categorical expansion on Age
  26. Variable Importance Comparison: Deep Learning (Tanh / 4-layer); Deep Learning (Tanh / 3-layer)
  27. Every generation needs to invent its math. Our data, our tools!
  28. Power-Law
  29. Code is incomplete without Community! Open Source Matters!
  30. Community Committers 30 Meet ups 90 in 12 months Coverage Conference Speakers Curriculum Stanford, MIT, CSU, SUNY, SJSU, Purdue
  31. Data Driven Decision Making is hard! Courage Matters!
  32. Thanks Courtney, Nick & MLConf for bringing us to ATL
  33. Sparkling Water Application Life Cycle: (1) User submits App to Spark cluster Master node; (2) App distributed to Spark cluster Worker nodes; (3) Spark Executor JVMs start for App; (4) H2O instance starts within each Executor JVM; (5) App’s Scala main program runs. Diagram labels: Sparkling App jar file, spark-submit, Spark Master JVM, Spark Worker JVM, Sparkling Water Cluster, Spark Executor JVM, H2O
  34. Sparkling Water Data Distribution: (1) Use Spark SQL to read data into a Spark RDD; (2) Convert Spark RDD to H2O RDD; H2O RDD is column-based and highly compressed; (Not shown) Run modeling and prediction workflows with H2O; (3) Convert H2O RDD (e.g. predictions) back to Spark RDD. Diagram labels: Data Source (e.g. HDFS), Sparkling Water Cluster, Spark Executor JVM, H2O, Spark RDD, H2O RDD
  35. H2O HHDFS H2O YARN HHDFS H2O Hadoop MR HHDFS Standalone YARN H2O in MR H HortonWorks, Cloudera, MapR, Intel
  36. H2O – The Killer-App for Spark MLlib H2O SQL H2ORDD HDFS=DATA Sparkling Water In-Memory Big Data, Columnar ML 100x faster Algos R CRAN, API, fast engine API Spark API, Java MM Community Devs, Data Science
  37. examples
  38. Fraud / No-fraud 1/1000 unbalanced Click-Stream Browse / Click / Buy
  39. Propensity Models Merchants-to-Users Lifetime Value of Customer Pricing Engines
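
Slide 18 describes bagging as training models on randomly drawn subsets and letting each model vote with equal weight. The following is a minimal sketch of that idea, assuming decision stumps as the base learner and a small synthetic dataset. Both choices, and the helper names `train_stump` and `bagging`, are hypothetical and are not taken from the talk or from H2O.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a one-split classifier: predict 1 when feature j exceeds threshold t."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({x[j] for x in rows}):
            errors = sum((x[j] > t) != bool(y) for x, y in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, j, t)
    _, j, t = best
    return lambda x: 1 if x[j] > t else 0


def bagging(rows, labels, n_models=25, seed=0):
    """Train each stump on a bootstrap sample, then combine by equal-weight vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(rows) indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        models.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Hypothetical toy data: the label is 1 when feature 0 exceeds 0.5.
data_rng = random.Random(7)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [1 if x[0] > 0.5 else 0 for x in rows]

predict = bagging(rows, labels)
train_accuracy = sum(predict(x) == y for x, y in zip(rows, labels)) / len(rows)
print(train_accuracy)
```

Boosting differs in that models are added one after another rather than trained independently, so it is not covered by this sketch.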