
Meetup: Custom Machine Learning Recipes: Ingredients for Success



This meetup was recorded in Mountain View on November 12, 2019.

A recording of the presentation can be viewed here: https://www.youtube.com/watch?v=aRKZTVnyfPM&list=PLNtMya54qvOErCPus07wKDqHlTyMjgTbX&index=2&t=0s

Description:
H2O Driverless AI is H2O.ai's flagship platform for automatic machine learning. It fully automates the data science workflow, including some of the most challenging tasks in applied data science such as feature engineering, model tuning, model optimization, and model deployment. Driverless AI turns Kaggle Grandmaster recipes into a fully functioning platform that delivers "an expert data scientist in a box" from training to deployment. Driverless AI empowers data scientists to work on projects faster, using automation and state-of-the-art computing power from GPUs to accomplish in minutes tasks that used to take months.

We're excited to have recently added the ability for users, partners, and customers to extend the platform with Bring-Your-Own-Recipe. Domain experts and advanced data scientists can now write their own recipes and seamlessly extend Driverless AI with their favorite tools from the rich ecosystem of open-source data science and machine learning libraries.
----------------------------------------------------------------------------

Ana's Bio:
Ana is a Data Science Evangelist for H2O.ai. Before H2O.ai, she worked as an Evangelist for Hortonworks (Cloudera). She holds a B.S. in Electrical Engineering and is currently pursuing a Master's in Statistics with a concentration in Machine Learning at San Jose State University. When not at H2O.ai or school, she can be found in Fresno working with farmers to identify ML solutions for their agricultural challenges.

Published in: Technology

Meetup: Custom Machine Learning Recipes: Ingredients for Success

  1. Custom Machine Learning Recipes: Ingredients for Success. Get Started with Open Source Custom Recipes. Ana Castro (Ana.Castro@h2o.ai), Rafael Coss (Rafael@h2o.ai, @racoss)
  2. H2O Aquarium
     • aquarium.h2o.ai
       – H2O.ai’s cloud environment that provides access to various tools
       – Recommended for training, workshops, and tutorials
     • Driverless AI Test Drive Setup Instructions
       – https://h2oai.github.io/tutorials/getting-started-with-driverless-ai-test-drive/#0
  3. Custom Machine Learning Recipes
     • Automatic Machine Learning Workflow
     • Extending Automatic Machine Learning … Open?
     • What are custom recipes?
     • Tutorial: Using custom recipes
  4. H2O.ai Snapshot
     • Company: Founded in Silicon Valley in 2012. Funding: $147M (Series D). Investors: Goldman Sachs, Ping An, Wells Fargo, NVIDIA, Nexus Ventures
     • Products: H2O Open Source Machine Learning; H2O Driverless AI: Automatic Machine Learning
     • Community: 20,000 companies using open source; 160,000-strong meetup community
     • Team: 185 AI experts (expert data scientists, 13 Kaggle Grandmasters, distributed computing, visualization)
     • Global: Mountain View, NYC, London, Paris, Ottawa, Prague, Chennai, Singapore
  5. Driverless AI: Automates Data Science and ML Workflows. Diagram labels: Data Integration, Data Quality and Transformation, Modeling Table (Features, Target), Model Building, Model
  6. ML Solves Business Critical Problems Across Industries. Save Time. Save Money. Gain a Competitive Edge.
     • Financial Services
       – Wholesale / Commercial Banking: Know Your Customers (KYC), Anti-Money Laundering (AML)
       – Card / Payments Business: Transaction fraud, Collusion fraud, Real-time targeting, Credit risk scoring, In-context promotion
       – Retail Banking: Deposit fraud, Customer churn prediction, Auto-loan
     • Healthcare and Life Science: Early cancer detection, Product recommendations, Personalized prescription matching, Medical claim fraud detection, Flu season prediction, Drug discovery, ER and hospital management, Remote patient monitoring, Medical test predictions
     • Telecom: Predictive maintenance, Avoidable truck-rolls, Customer churn prediction, Improved customer viewing experience, Master data management, In-context promotions, Intelligent ad placements, Personalized program recommendations
     • Marketing and Retail: Funnel predictions, Personalized ads, Credit scoring, Fraud detection, Next best offer, Next best action, Customer segmentation, Customer churn, Customer recommendations, Ad predictions and fraud
  7. Key Capabilities of H2O Driverless AI
     • Automatic Feature Engineering
     • Automatic Visualization
     • Machine Learning Interpretability (MLI)
     • Automatic Scoring Pipelines
     • Natural Language Processing
     • Time Series Forecasting
     • Flexibility of Data & Deployment
     • NVIDIA GPU Acceleration
     • Bring-Your-Own Recipes
  8. Democratize AI: Make every company an AI company.
  9. Make Your Own AI
     • Automatic Model Optimization: Model Recipes (i.i.d. data, Time-series, NLP) + Advanced Feature Engineering + Algorithm + Model Tuning; Survival of the Fittest
     • New Capabilities
     • Challenge
       – Customize for domain use case: need additional algorithms, feature engineering, or optimization for a custom scorer
       – Leverage the company's own IP (secret sauce)
       – AI is a fast-moving innovation space, and users cannot wait for vendor updates
  10. Make Your Own AI via Bring Your Own Recipe Capability
      • Automatic Model Optimization: Model Recipes (i.i.d. data, Time-series, NLP) + Advanced Feature Engineering + Algorithm + Model Tuning; Survival of the Fittest
      • New Capabilities: Transformations, Algorithms, Scorers
      • Challenge
        – Customize for domain use case: need additional algorithms, feature engineering, or optimization for a custom scorer
        – Leverage the company's own IP (secret sauce)
        – AI is a fast-moving innovation space, and users cannot wait for vendor updates
      • Solution
        – Modular and extensible AutoML optimization
        – App Store for AI: open source catalog of recipes (100+); leverage company AI IP
        – Integrate the latest machine learning techniques
  11. Make Your Own AI via Bring Your Own Recipe Capability
      • Automatic Model Optimization: Data; Model Recipes (i.i.d. data, Time-series, NLP) + Advanced Feature Engineering + Algorithm + Model Tuning
      • New Capabilities: Transformations, Algorithms, Scorers
      • Bring Your Own
        ✔ Import from open source (100+)
        ✔ H2O company catalog/GitHub
        ✔ Develop and upload new recipe
      • Modular and extensible AutoML optimization
      • App Store for AI
        – Open source catalog of recipes (100+)
        – Leverage a company’s domain expertise
      • Integrate the latest machine learning techniques
      • Customize for domain use case
      • Import the latest algorithms and techniques without needing to upgrade the entire platform
  12. H2O Driverless AI: How It Works
      1. Drag and Drop Data: Bring data in from cloud, big data and desktop systems (SQL, Local, Amazon S3, HDFS, Google BigQuery, Azure Blob Storage, Snowflake)
      2. Automatic Visualization: Understand the data shape, outliers, missing values, etc.
      3. Automatic Model Optimization: Use best practice model recipes and the power of high performance computing to iterate across thousands of possible models, including advanced feature engineering and parameter tuning
         • Model Recipes (i.i.d. data, Time-series, more on the way) + Advanced Feature Engineering + Algorithm + Model Tuning
         • Survival of the Fittest
      4. Extensible and Open Recipes: Transformations, Algorithms, Scorers
      5. Automatic Scoring Pipelines: Deploy ultra-low latency Python or Java Automatic Scoring Pipelines that include feature transformations and models
      • Other diagram labels: Modelling Dataset (X, Y), Machine Learning Interpretability, Model Documentation, Deploy Low-latency Scoring to Production
  13. What is a Recipe…
      • Machine learning pipelines model prepped data to answer a business question
        – Transformations are applied to the original data to ensure it is clean and as predictive as possible
        – Additional datasets may be brought in to add insights
        – The data is modeled using an algorithm to find the optimal rules to solve the problem
        – The best model is determined using a specific metric, or scorer
      • BYOR stands for Bring Your Own Recipe. It allows domain scientists to solve their problems faster and with more precision by adding their expertise in the form of Python code snippets
      • By providing your own custom recipes, you can gain control over the optimization choices that Driverless AI makes to best solve your machine learning problems
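To make the roles named on this slide concrete, here is a minimal sketch in plain Python of a transformation, two candidate models, and a scorer that decides which model wins. It does not use the Driverless AI recipe API; the names `fill_missing`, `accuracy`, `pick_best`, and the toy data are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Row = List[float]
Model = Callable[[Row], int]


def fill_missing(rows: Sequence[Sequence[Optional[float]]],
                 default: float = 0.0) -> List[Row]:
    """Transformation: replace missing values (None) so the data is clean."""
    return [[default if v is None else float(v) for v in row] for row in rows]


def accuracy(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Scorer: fraction of rows where the prediction matches the label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def pick_best(models: Dict[str, Model],
              rows: Sequence[Row],
              actual: Sequence[int],
              scorer: Callable[[Sequence[int], Sequence[int]], float]
              ) -> Tuple[str, Dict[str, float]]:
    """Score every candidate model and return the name of the highest scorer."""
    scores = {name: scorer(actual, [model(row) for row in rows])
              for name, model in models.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores


# Toy data (hypothetical): two features per row, some values missing.
raw_rows = [[1.0, None], [3.0, 4.0], [None, 1.0], [5.0, 5.0]]
labels = [0, 1, 0, 1]

clean_rows = fill_missing(raw_rows)
candidates: Dict[str, Model] = {
    "always_zero": lambda row: 0,
    "first_feature_above_2": lambda row: int(row[0] > 2.0),
}
best_name, all_scores = pick_best(candidates, clean_rows, labels, accuracy)
print(best_name, all_scores)
```

Swapping `accuracy` for a different scorer changes which candidate is selected, which is the kind of control over optimization choices that the slide attributes to custom recipes.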
  14. https://github.com/h2oai/driverlessai-recipes (FAQ, architecture diagram, etc.)
  15. Custom Machine Learning Recipes
      • Automatic Machine Learning Workflow
      • Extending Automatic Machine Learning … Open?
      • What are custom recipes?
      • Tutorial: Using custom recipes
        – Transformer
        – Scorer
        – Model
  16. Tutorials
      • https://h2oai.github.io/tutorials/
      • https://h2oai.github.io/tutorials/get-started-with-open-source-custom-recipes-tutorial
  17. H2O Aquarium
      • aquarium.h2o.ai
        – H2O.ai’s cloud environment that provides access to various tools
        – Recommended for training, workshops, and tutorials
      • Driverless AI Test Drive
        – https://h2oai.github.io/tutorials/getting-started-with-driverless-ai-test-drive/#0
      • Your data will disappear after 2 hours
        – Run as many times as needed
  18. Launch a Base Experiment
      • About the dataset:
        – Kaggle’s customer churn Telco dataset: https://www.kaggle.com/becksddf/churn-in-telecoms-dataset
      • Add the data:
        – /data/Splunk/churn
      • Launch base experiment
        – Predict: Customer Churn
  19. Custom Transformer
      1. Experiments -> Exp1. Baseline -> New Model Same Parameters
      2. Expert Settings -> Official Recipes External
      3. Branch rel-1.8.0 -> Transformers -> Numeric -> sum.py
      4. Recipes -> Include Specific Transformer -> Select Values
      5. Verify Transformer -> Launch Experiment
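The file name `sum.py` suggests a transformer that adds numeric columns together row by row. Assuming that reading, the idea can be sketched in standalone Python as below. This is not the recipe from the repository and it does not subclass any Driverless AI base class; the class name `RowSumTransformer`, its method names, and the choice to skip missing values are assumptions made for illustration.

```python
from typing import List, Optional, Sequence

NumericRow = Sequence[Optional[float]]


class RowSumTransformer:
    """Hypothetical stand-in for a numeric transformer: one new feature per row."""

    def fit_transform(self, rows: Sequence[NumericRow], y=None) -> List[float]:
        # A row-wise sum has no state to learn, so fitting is a no-op here.
        return self.transform(rows)

    def transform(self, rows: Sequence[NumericRow]) -> List[float]:
        # Add up the numeric columns of each row, skipping missing values (None).
        return [sum(v for v in row if v is not None) for row in rows]


# Toy usage with made-up call-minute columns from a churn-style dataset.
minutes = [
    [120.0, 45.5, 30.0],   # day, evening, night minutes
    [200.0, None, 10.0],   # the evening value is missing
]
total_minutes = RowSumTransformer().fit_transform(minutes)
print(total_minutes)
```

A derived feature like this gives the model a single "total usage" signal that it would otherwise have to approximate from the separate columns.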
  20. Custom Scorer
      1. Experiments -> Exp1. Baseline -> New Model Same Parameters
      2. Expert Settings -> Official Recipes External
      3. Branch rel-1.8.0 -> Scorers -> Classification -> binary -> brier_loss.py
      4. Recipes -> Include Scorer -> Select Values
      5. Scorer -> Select Brier -> Launch Experiment
      • Learn more: https://en.wikipedia.org/wiki/Brier_score
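For a binary target, the Brier score is the mean squared difference between the predicted probability and the actual outcome (0 or 1), so lower is better and 0 is a perfect score. The function below is a minimal standalone sketch of that formula, not the contents of `brier_loss.py`; the function name and the sample values are assumptions made for illustration.

```python
from typing import Sequence


def brier_score(actual: Sequence[int], predicted: Sequence[float]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal length")
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Toy churn example: 1 = churned, 0 = stayed (made-up values).
churned = [1, 0, 1, 0]
confident_model = [0.9, 0.2, 0.6, 0.4]
uninformative_model = [0.5, 0.5, 0.5, 0.5]

print(brier_score(churned, confident_model))
print(brier_score(churned, uninformative_model))
```

Unlike accuracy, this metric rewards well-calibrated probabilities: a model that always predicts 0.5 scores 0.25, and any model that is confidently wrong scores worse than that.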
  21. Custom Model
      1. Experiments -> Exp1. Baseline -> New Model Same Parameters
      2. Expert Settings -> Official Recipes External
      3. Branch rel-1.8.0 -> Models -> algorithms -> extra_trees.py -> RAW
      4. Recipes -> Include Model -> Select Values
      5. Scorer -> ExtraTrees -> Launch Experiment
      • Learn more: https://scikit-learn.org/
  22. Resources
      • http://catalog.h2o.ai/
        – https://github.com/h2oai/driverlessai-recipes
      • https://h2oai.github.io/tutorials/
      • https://h2oai-community.slack.com/
  23. Thank You
  24. Giving Back to the Community
  26. H2O AI and ML Meetups Around the World
  27. Giving Back to the Community: Giving H2O talks at other meetups
  28. We are hiring! Find out more: h2o.ai/careers
