
DataRobot: Changes in the Analysis Process and Benefits When Applying Automated Analytics - Woonpyo Hong (홍운표), Data Scientist, DataRobot :: AWS Summit Seoul 2019


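Sponsor session | DataRobot: Changes in the Analysis Process and Benefits When Applying Automated Analytics
Woonpyo Hong (홍운표), Data Scientist, DataRobot

Unlike conventional analytics software, DataRobot is an automated analytics platform. Once the data has been defined, business users can apply AI to their own work and gain efficiency, and data scientists can work tens of times more efficiently than with conventional analysis. In this way DataRobot helps enterprises apply AI to their operations easily and realize business value. This session walks through the detailed capabilities of DataRobot's automated analytics and uses a product demo to show how automated analytics raises the quality of analysis results and makes the work far more efficient than conventional analysis.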


  1. SUMMIT SEOUL. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  2. DataRobot, Automated ML: changes in methodology and benefits
     Woonpyo Hong (홍운표), Customer Facing Data Scientist, DataRobot, Korea
  3. Agenda
     1. DataRobot Introduction: Corporate Vision, Introduction
     2. Data Science and Practices: Science Methodology, Data Scientist, Agony
     3. Automated ML: What & How, Benefits, Live Demo
     4. Future Direction: What & How
  5. Strategic Vision: Enabling the AI-Driven Enterprise
     Where AI is applied in every business process to predict outcomes. The AI-Driven Enterprise adapts to new conditions at incredible speeds and continually self-optimizes based on predicting the future.
     “If your competitor is rushing to build AI and you don’t, it will crush you.” -Elon Musk
  6. The Opportunity for Machine Learning in Any Business
     • Marketing: predicting customer Lifetime Value (LTV), churn, customer segmentation, product mix (best product mix to reduce churn), cross-selling/recommendation algorithms, up-selling, channel optimization, discount targeting, response rates, reactivation likelihood, AdWords optimization and ad buying, in-store traffic patterns, aircraft scheduling
     • Sales: lead prioritization, demand forecasting, pricing, market basket, inventory management / dynamic pricing, promos / upgrades / offers
     • Human Resources: resume screening, employee churn, training recommendation, talent management
     • Risk: credit risk, fraud detection, accounts payable recovery, anti-money laundering, insurance claims prediction, readmission risk, warranty analytics, claim prediction
     • Logistics: procurement, warehousing, cost analysis, product life cycle, demand forecast, assembly, turnover
     • Industries: Banking, Insurance, Healthcare, Media, Pharma, Telco, Retail, Government, Energy, Transportation
  7. Filling the Gap
     • Accelerate the process of researching, testing and deploying predictive algorithms.
     • Enable more people to help research, test and deploy predictive algorithms.
     • Chart key: demand for predictive models; supply of data scientists
     • Chart labels: turn Analysts & Engineers into Data Scientists; increase Data Scientist productivity; unmet demand for Data Science
  8. The world’s most advanced Enterprise Machine Learning Automation platform
     • 2012: founded, HQ in Boston, MA
     • $224M in funding
     • 1,000,000,000+ models built on DataRobot Cloud
     • 250+ Data Scientists & Engineers (of 600+)
     • 4 #1 ranked Data Scientists
     • 50+ Top 3 finishes
     • Industries: Insurance, Fintech, Healthcare, Marketing, Banking, many more
  9. Best Practices and Technology
     The top ranked Data Scientists in the world; the best technologies in the world
     • Owen Zhang, Product Advisor. Highest: 1st. MASTER
     • Xavier Conort, Chief Data Scientist. Highest: 1st. MASTER
     • Sergey Yurgenson, Data Scientist. Highest: 1st. MASTER
     • Amanda Schierz, Data Scientist. Current: 1st Female, 1st in UK. MASTER
     • Jeremy Achin, CEO & Co-Founder. Highest: 20th. MASTER
     • Tom de Godoy, CTO & Co-Founder. Highest: 20th. MASTER
  11. Data Scientists
      A Data Scientist combines Programming Skills, Math & Stats, and Domain Expertise.
      Required Capabilities
      1. Knowledge of the business
      2. Knowledge of the data
      3. Ability to write code to gather data
      4. Ability to write code to explore/inspect data
      5. Ability to write code to manipulate data
      6. Ability to write code to extract actionable intel
      7. Ability to write code to build models
      8. Ability to write code to implement models
      9. Foundational statistics
      10. Internals of algorithms
      11. Practical knowledge and experience
      12. Knowing how to interpret and explain models
  12. Scientific Methodology (Metaphysics)
      Karl Popper: Observation/Rationale, Hypothesis, Experiment, Theory
  13. Data Science Methodology
      Due to limited resources, the following call for improvement:
      • No target goal
      • A few algorithms, prone to overfitting
      • Aging of the model
      • Insufficient explanations
  14. Data Science Landscape
      Where do you mostly work?
      • Spaces: Business space, Feature space, Algorithm space
      • Roles: (LoB, DE, DS), (DE, DS)
      • Criteria: ROI, Availability, Accuracy
      • Actionable predictive model, valuable insights (LOB, DS)
      • LOB: Line of Business; DE: Data Engineer (ETL); DS: Data Scientist
  15. Best Practices
      Business-driven, feature-oriented analysis
      [Diagram: combinations of the Business, Feature and Algorithm spaces, labeled "Data Scientist Drives", "Business Drives", and "Balanced and Promising" (√)]
  16. Motivations for AutoML
      Value of a diverse set of algorithms: methodology driven vs. problem driven
      Source: http://statweb.stanford.edu/~tibs/ElemStatLearn/
  17. And trends
      • Most modelling software uses a 20th-century paradigm
      • Expert systems with modelling intelligence
  19. What is Automated Machine Learning
      • 10 steps to building models
      • An expert system that knows how to do each of these 10 steps, without human instructions
      • Human friendly, not a black box
      • Fast and accurate
      • Replicable data science
  20. What about DataRobot?
      Key Points
      • End-to-end automated machine learning: all 10 steps are automated
      • Hundreds of algorithms in the repository, with new algorithms being added regularly
      • Chooses the best algorithms for your data
      • Best-in-class human-friendly insights
      • Widest range of deployment options
      • Enterprise ready
      • Automatic model reports
      • Large support team around the world
  21. DataRobot Workflow
  22. Different but powerful way of analysis
      A few perspectives (many more), comparing a single-model workflow with a multiple-model workflow:
      • Interpretability
        Single model: only an interpretable algorithm is chosen, so a linear model is preferable.
        Multiple models: interpretability is model-agnostic.
      • Partitioning
        Single model: no need for a hold-out partition; just train/test or k-fold CV.
        Multiple models: a hold-out partition is used to evaluate several models.
      • Blending
        Single model: blending starts from an existing model.
        Multiple models: blending is on a fair basis, reflecting the performance of multiple models, with speed vs. accuracy data.
      • Interaction
        Single model: interactions must be considered for model performance (linear model).
        Multiple models: interactions are automatically reflected in tree-based algorithms. If interactions are important, DR has a GA2M model and R/Python API support for that.
      • Parameter tuning
        Single model: parameter tuning is limited to one model and time-consuming.
        Multiple models: parameter tuning is exhaustive for all candidate models. One can easily confine the search space and quickly get results.
  23. Benefits: safer model
      Robust model free from the risk of overfitting
      • The average of the 5 validation scores is the cross-validation score.
      • The holdout is completely hidden from the models during the training process. After you have selected your optimal model, you can score it on the holdout to get your holdout score.
      • CV-Fold #1: Partitions 1-4 training, Partition 5 validation, holdout withheld
      • CV-Fold #2: Partitions 1-3 and 5 training, Partition 4 validation, holdout withheld
      • CV-Fold #3: Partitions 1-2 and 4-5 training, Partition 3 validation, holdout withheld
      • CV-Fold #4: Partitions 1 and 3-5 training, Partition 2 validation, holdout withheld
      • CV-Fold #5: Partitions 2-5 training, Partition 1 validation, holdout withheld
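
The partitioning scheme on this slide can be sketched in a few lines of plain Python. This is only an illustration of 5-fold cross-validation with a separate holdout, not DataRobot's implementation: the function names, the 20% holdout fraction, and the toy "model" (the mean of the training targets) are all assumptions made for the example.

```python
import random
from statistics import mean

def partition_rows(n_rows, n_folds=5, holdout_fraction=0.2, seed=0):
    """Shuffle row indices, set aside a holdout, split the rest into folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_holdout = int(n_rows * holdout_fraction)  # assumed fraction, not from the slides
    holdout = indices[:n_holdout]
    remaining = indices[n_holdout:]
    # Deal the remaining rows round-robin into n_folds partitions.
    folds = [remaining[i::n_folds] for i in range(n_folds)]
    return folds, holdout

def cross_validation_score(folds, fit, score):
    """Each partition is the validation set once; return the average score."""
    scores = []
    for k, validation in enumerate(folds):
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        scores.append(score(fit(training), validation))
    return mean(scores)

# Toy data and a toy "model": predict the mean of the training targets.
targets = [float(i % 7) for i in range(100)]
folds, holdout = partition_rows(len(targets))

def fit(rows):
    return mean(targets[i] for i in rows)

def score(model, rows):
    return mean(abs(targets[i] - model) for i in rows)  # mean absolute error

cv_score = cross_validation_score(folds, fit, score)

# The holdout is touched only once, after model selection is finished.
all_training = [i for fold in folds for i in fold]
holdout_score = score(fit(all_training), holdout)
```

Because the holdout rows never appear in any fold, `holdout_score` is an estimate that model selection on `cv_score` could not have influenced.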
  24. Benefits: more effort on feature space
      “Feature engineering is the art of data science” (Sergey Yurgenson)
      • Spaces: Business space, Feature space, Algorithm space
      • Roles: (LoB, DE, DS), (DE, DS)
      • Criteria: ROI, Availability, Accuracy
      • Actionable predictive model, valuable insights (LOB, DS)
      • Tools: Feature Impact, Feature Effects, Prediction Explanations
  25. Benefits: Explainability
      Model-agnostic explanation
      • [Feature Impact] The importance of each feature. Does it coincide with domain knowledge? Any new insights?
      • [Feature Effect] The relationship between the target and a feature. Does the relationship reflect domain knowledge? Any new insights or feature transforms?
      • [Prediction Explanation] What is the basis of a prediction? Are the predictions reliable to business people?
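
To make "model-agnostic" concrete, here is a minimal sketch of permutation importance, one common model-agnostic way to measure how much each feature matters. The slides do not describe how DataRobot computes Feature Impact, so this technique is swapped in purely for illustration; the function names and the toy model (which uses only the first feature) are assumptions.

```python
import random
from statistics import mean

def mean_squared_error(predict, rows, targets):
    return mean((predict(r) - t) ** 2 for r, t in zip(rows, targets))

def permutation_importance(predict, rows, targets, n_features, seed=0):
    """Error increase when one feature column is shuffled.

    Only `predict` is called, so this works for any kind of model.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        shuffled_rows = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        shuffled_error = mean_squared_error(predict, shuffled_rows, targets)
        importances.append(shuffled_error - baseline)
    return importances

# Toy setup: the target depends on feature 0 only; feature 1 is noise.
data_rng = random.Random(1)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
targets = [3 * x0 for x0, _ in rows]

def predict(row):
    return 3 * row[0]  # hypothetical "trained model"

importances = permutation_importance(predict, rows, targets, n_features=2)
```

In this toy case the first importance is positive and the second is zero, which is the kind of result an analyst would then check against domain knowledge.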
  26. Benefits: effective blending
      Search over candidate models for blends that promise a tangible improvement
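
One way to read "search over candidates" is sketched below: try simple average blends of the candidate models' validation predictions and keep a blend only if it beats the best single model by more than a chosen margin. This is an assumed, simplified interpretation rather than DataRobot's blender; the model names, the `min_gain` parameter, and the toy predictions are hypothetical.

```python
from itertools import combinations
from statistics import mean

def mae(predictions, targets):
    return mean(abs(p - t) for p, t in zip(predictions, targets))

def best_average_blend(candidates, targets, min_gain=0.0):
    """candidates maps a model name to its validation predictions.

    Returns (names, error) for the best single model or average blend.
    A blend is accepted only if it improves on the best single model
    by more than min_gain.
    """
    best_names, best_error = min(
        (((name,), mae(preds, targets)) for name, preds in candidates.items()),
        key=lambda item: item[1],
    )
    single_error = best_error
    for size in range(2, len(candidates) + 1):
        for names in combinations(candidates, size):
            columns = zip(*(candidates[n] for n in names))
            blended = [mean(values) for values in columns]
            error = mae(blended, targets)
            if error < best_error and single_error - error > min_gain:
                best_names, best_error = names, error
    return best_names, best_error

# Toy validation set: model_a overshoots, model_b undershoots.
targets = [1.0, 2.0, 3.0, 4.0]
candidates = {
    "model_a": [1.5, 2.5, 3.5, 4.5],
    "model_b": [0.5, 1.5, 2.5, 3.5],
    "model_c": [1.0, 2.0, 3.0, 5.0],
}
names, error = best_average_blend(candidates, targets)
```

Here the two individually weaker models cancel each other's bias, so their blend beats the best single model. The search is exhaustive over subsets, which is only practical for a handful of candidates.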
  27. Benefits: Hyper-param Tuning
      Gradient-free and effective pattern search
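
A generic compass-style pattern search illustrates what "gradient-free" means here: the tuner only evaluates the loss at neighbouring points and shrinks its step when none of them is better. The slide gives no detail on DataRobot's tuner, so the function below, its parameters, and the quadratic stand-in for a validation loss are assumptions for illustration.

```python
def pattern_search(loss, start, step=1.0, min_step=1e-3, max_iter=1000):
    """Minimize loss(point) using only function evaluations, no gradients."""
    point = list(start)
    best = loss(point)
    for _ in range(max_iter):
        if step < min_step:
            break
        improved = False
        for axis in range(len(point)):
            for direction in (+1, -1):
                trial = list(point)
                trial[axis] += direction * step
                value = loss(trial)
                if value < best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2  # no neighbour is better: refine the search pattern
    return point, best

def validation_loss(params):
    # Hypothetical stand-in for "train a model with these hyperparameters
    # and return its validation error"; the minimum is at (-2, 6).
    log_learning_rate, max_depth = params
    return (log_learning_rate + 2) ** 2 + (max_depth - 6) ** 2

best_params, best_loss = pattern_search(validation_loss, start=(0, 1))
```

In practice each loss evaluation is a full model fit, so confining the search space, as the comparison slide suggests, directly reduces tuning time.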
  28. Benefits: API integration
      Data scientists and developers can use the API
      • Components: Application server, Prediction worker
      • Interfaces: REST API, R/Python pkg, Notebook, Web UI, Console
      • Uses: Model Factory, Automatic Model Refresh, Model Diags & Viz, Feature Engineering, App. Integration, Custom Analysis and various Analysis
      • US, EU region
  30. Demo: Bleedout prediction
      Binary classification for QA
      • Process: Coating of thin film by covering the surface with coating solution and drying, followed by polymerizing with UV light.
      • Problem: Unintended precipitation of powder, such as unpolymerized monomer or antioxidant, occurs, causing “bleedout”. It spoils the product and contaminates the production line.
      • Data:
        ● Material: length of film roll
        ● Project type: production vs experiment
        ● Control: winding tension, UV-exposure duration, O2 concentration, etc.
      • Diagram labels: Winding in, Coating, Drying, Tension, UV exposure (O2 purged)
  31. Demo: Bleedout MFG process
      1) Unwinding
      2) Coating
      3) Drying
      4) UV exposure
      5) Tension Control
      6) Winding
  32. Demo: Jupyter Notebook
      SageMaker Notebook: project created and run automatically
  34. Full-fledged Automation
      Expanded coverage of automation, full automation
      • Consumption: Consuming and application of advanced analytics in the form of dashboards, decisions, and analytics-powered applications.
      • Prep, Blend, Agg and ETL: Self-service BI, data prep, blending, transformation, feature engineering, and sharing of insights. Data pipeline and workflow execution.
      • Data Management: Data cataloging, organization, and collaboration. Automatic indexing and knowledge gathering made available to the entire organization.
      • Analytics (Advanced, Simple): Simple: self-service BI, charts, graphs, tables, queries. Advanced: automated data investigation for insights, predictions, and recommendations.
      • Deployment: Powering business applications by providing advanced analytics insights, predictions, monitoring, and refresh on new data. Hosted as an API, SDK, or code.
  35. We look forward to your feedback!
      • Share your impressions of the event on social media with the #AWSSummit hashtag.
      • Please take part in the session evaluation and survey through the AWS Summit Seoul 2019 mobile app and QR code.
      • We would appreciate your valuable input, which will shape next year's Summit.
  36. Thank you!
      Woonpyo Hong (홍운표), woonpyo.hong@datarobot.com
