Advanced Model Comparison and Automated Deployment Using MLflow
Charu Kalra, Sr Data Scientist
Connor McCambridge, Sr Data Scientist
Fraud Insights & Analytics Data Science Team
Charu Kalra
• Senior Data Scientist
• Master’s in Mathematical Finance from Rutgers University
• Previously worked at American Express as a Risk Manager and Commerce Bank as a Data Scientist
• 2+ years experience working with Spark, Databricks, and Big Data Architectures
Connor McCambridge
• Senior Data Scientist
• Master’s in Business Intelligence and Analytics from Rockhurst University
• Started data science career as an Intern for Sprint’s Prepaid Division
• 3+ years experience working with Spark, Databricks, and Big Data Architectures
Ted Burbidge
• Senior Data Scientist
• Master’s in Applied Statistics from the University of Kansas
• Working in telecom since 2000 in various roles including Performance Engineering and Application Design
• 3+ years experience working with Spark, Databricks, and Big Data Architectures
Agenda
§ Project Vision
§ Solution Design
§ Demo
§ Conclusion
Project Vision
• Problem Statement:
• Multiple Fraud Checks on New Accounts
• Missed Fraud falls into Delinquent Status.
• Objective:
1. Measure Fraud Rate
2. Identify Missed Fraud
• Pre-Activation Check
• Post-Activation Check
• Delinquent Status Check
Project Approach
• Random Sampling
• Machine Learning
• Outlier Detection
Automated Process Flow
• Historical Data
• Build Model
• Daily Data
• Production Model
• Collect Results
• Load Results
• Manual Review
• Examine Model Performance
• Refresh Historical Records
Data Science Stages
• Data Preparation
▪ Gather data
▪ Training data
▪ Test data
• Data Transformation
▪ Variable Creation
▪ Aggregations
▪ Scaling/Standardizing
• Model Building
▪ Machine Learning
▪ Outlier Detection
• Review Results
▪ Examine Metrics
▪ Productionalize Model (optional)
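The preparation and transformation stages above can be sketched in a few lines. This is a minimal illustration, assuming a single numeric column; the `amounts` values and the function names `train_test_split` and `fit_standardizer` are hypothetical and not taken from the talk. It shows the standardizer being fitted on training data only and then reused on the test data.

```python
import random
from statistics import mean, pstdev

# Hypothetical transaction amounts; not taken from the talk.
amounts = [12.0, 15.5, 9.0, 250.0, 14.0, 11.5, 13.0, 300.0]


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle reproducibly, then hold out the first test_fraction of rows."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_standardizer(train_values):
    """Learn the mean and standard deviation on training data only."""
    mu = mean(train_values)
    sigma = pstdev(train_values)

    def transform(x):
        # Guard against a constant column, where sigma is zero.
        return (x - mu) / sigma if sigma > 0 else 0.0

    return transform


train, test = train_test_split(amounts)
standardize = fit_standardizer(train)
train_scaled = [standardize(x) for x in train]
test_scaled = [standardize(x) for x in test]
```

Fitting on the training split alone keeps information from the test rows out of the scaling parameters, which is one common reason to build the transformer as its own step.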
Notebook Building
• Logistic Regression notebook: Data Prep, Build Transformer, Build Model (Logistic Regression), Review Results
• Neural Net notebook: Data Prep, Build Transformer, Build Model (Neural Net), Review Results
• XGBoost notebook: Data Prep, Build Transformer, Build Model (XGBoost), Review Results
Notebook Framework
• Data Prep
• Build Transformer
• Build Model (Logistic Regression)
• Build Model (Neural Net)
• Build Model (XGBoost)
• Review Results
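One way to picture the framework is a single data prep step and a single transformer shared by every model builder, with one review step at the end. The sketch below is an assumption-laden toy: the data, the min-max transformer, and the two threshold "models" are hypothetical stand-ins for the Logistic Regression, Neural Net, and XGBoost notebooks, and accuracy is measured on the training rows only to keep the example short.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[List[float], int]            # (features, fraud label)
Model = Callable[[List[float]], int]     # scaled features -> predicted label


def prep_data() -> List[Row]:
    """Data Prep: hypothetical toy rows standing in for the real dataset."""
    return [([1.0, 10.0], 0), ([2.0, 20.0], 0), ([9.0, 90.0], 1), ([8.0, 80.0], 1)]


def build_transformer(rows: List[Row]) -> Callable[[List[float]], List[float]]:
    """Build Transformer: fit a min-max scaler once so every model sees the same inputs."""
    columns = list(zip(*(features for features, _ in rows)))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]

    def transform(features: List[float]) -> List[float]:
        return [(x - lo) / (hi - lo) if hi > lo else 0.0
                for x, lo, hi in zip(features, lows, highs)]

    return transform


def build_mean_threshold_model(train: List[Row]) -> Model:
    """Toy model: flag a row when the mean of its scaled features exceeds 0.5."""
    return lambda features: int(sum(features) / len(features) > 0.5)


def build_first_feature_model(train: List[Row]) -> Model:
    """Toy model: flag a row when its first scaled feature exceeds 0.5."""
    return lambda features: int(features[0] > 0.5)


def review_results(model: Model, rows: List[Row]) -> float:
    """Review Results: fraction of rows where the prediction matches the label."""
    return sum(model(f) == y for f, y in rows) / len(rows)


def run_framework(builders: Dict[str, Callable[[List[Row]], Model]]) -> Dict[str, float]:
    rows = prep_data()
    transform = build_transformer(rows)
    scaled = [(transform(f), y) for f, y in rows]
    # Every builder receives the same transformed data.
    return {name: review_results(build(scaled), scaled) for name, build in builders.items()}


results = run_framework({
    "mean_threshold": build_mean_threshold_model,
    "first_feature": build_first_feature_model,
})
```

Adding another model under this layout means adding one builder to the dictionary; the prep, transformer, and review steps stay untouched.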
Framework Components
§ Create & Store Data
§ Build Uniform Transformer
§ Train Multiple Models
§ Hyper-Tune Parameters
§ Model Selection
§ Automated Deployment
Solution Design
• Data Prep & Transformer
• Transformer Model Registry
• Machine Learning: ML Model Building, ML Experiment, ML Model Compare, ML Model Registry
• Outlier Detection: Outlier Model Building, Outlier Experiment, Outlier Model Compare, Outlier Model Registry
• Daily Batch Scoring
Create & Store Data
Build Uniform Transformer
Train Multiple Models
Hyper-Tuning Parameters
{ρ₁: [ρ₁₁, ρ₁₂, …, ρ₁ₙ],
 ρ₂: [ρ₂₁, ρ₂₂, …, ρ₂ₙ],
 …,
 ρₙ: [ρₙ₁, ρₙ₂, …, ρₙₙ]}
fmin():
{ρ₁₁, ρ₂₁, …, ρₙ₁}
{ρ₁₂, ρ₂₁, …, ρₙ₁}
{ρ₁₃, ρ₂₁, …, ρₙ₁}
{…, …, …, …}
{ρ₁ₙ, ρ₂ₙ, …, ρₙₙ}
{Best Model}
Hyper-Tuning
1. Test & Train Data
2. Transform Data
3. Define Hyperopt
4. Build Models
5. Compare Results
6. Select Best Model
7. Build & Save Pipeline
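The idea of mapping each parameter to a list of candidate values, scoring combinations, and keeping the best one can be sketched with the standard library alone. Note the swap: this sketch enumerates the full grid with `itertools.product`, whereas Hyperopt's `fmin()` chooses which combinations to try with a search algorithm. The parameter names, candidate values, and the `objective` function are hypothetical; a real objective would train a model and return a validation loss.

```python
from itertools import product

# Hypothetical search space: parameter name -> candidate values.
search_space = {
    "learning_rate": [0.01, 0.1, 1.0],
    "max_depth": [2, 4, 8],
}


def objective(params):
    """Hypothetical loss with its minimum at learning_rate=0.1, max_depth=4."""
    return (params["learning_rate"] - 0.1) ** 2 + (params["max_depth"] - 4) ** 2


def grid_minimize(objective, space):
    """Score every combination in the space and return the lowest-loss one."""
    names = list(space)
    trials = []
    for values in product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        trials.append((objective(params), params))
    best_loss, best_params = min(trials, key=lambda trial: trial[0])
    return best_params, best_loss, trials


best_params, best_loss, trials = grid_minimize(objective, search_space)
```

Keeping the full `trials` list mirrors the "Compare Results" step: every combination and its loss stays available for review, not only the winner.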
Model Selection
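A minimal sketch of the selection step, assuming each training run is recorded with a single comparison metric. The `Run` record, the AUC values, and the `promote_if_better` rule are hypothetical illustrations; the slides do not state which metric or promotion rule the team used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Run:
    run_id: str
    model_type: str
    auc: float  # hypothetical comparison metric, higher is better


def select_best(runs: List[Run]) -> Run:
    """Pick the run with the highest metric across all model types."""
    return max(runs, key=lambda run: run.auc)


def promote_if_better(candidate: Run, production: Optional[Run]) -> Run:
    """Keep the current production model unless the candidate beats it."""
    if production is None or candidate.auc > production.auc:
        return candidate
    return production


# Hypothetical experiment results; not figures from the talk.
runs = [
    Run("run-1", "logistic_regression", 0.91),
    Run("run-2", "neural_net", 0.93),
    Run("run-3", "xgboost", 0.96),
]
current_production = Run("run-0", "logistic_regression", 0.94)

best = select_best(runs)
new_production = promote_if_better(best, current_production)
```

Comparing against the model already in production, rather than only among new runs, avoids replacing a good model with a weaker one after a poor training cycle.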
Automated Deployment
Azure Data Factory Implementation
Because all the notebooks are coded in Azure Databricks, Azure Data Factory can then orchestrate the notebook executions.
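Orchestration of this kind amounts to running notebooks in dependency order. The sketch below illustrates that ordering with Python's standard `graphlib` module (Python 3.9 or later); it does not call Azure Data Factory or Databricks. The notebook names and the dependency edges are assumptions inferred from the solution design boxes, and `run_notebook` is a placeholder for whatever the orchestrator would actually trigger.

```python
from graphlib import TopologicalSorter

# Hypothetical notebook names mapped to the notebooks that must finish first.
dependencies = {
    "data_prep_transformer": set(),
    "ml_model_building": {"data_prep_transformer"},
    "outlier_model_building": {"data_prep_transformer"},
    "ml_model_compare": {"ml_model_building"},
    "outlier_model_compare": {"outlier_model_building"},
    "daily_batch_scoring": {"ml_model_compare", "outlier_model_compare"},
}


def run_notebook(name, log):
    """Placeholder for a real notebook execution; here it only records the call."""
    log.append(name)


def run_pipeline(deps):
    """Run every notebook after all of its prerequisites have run."""
    log = []
    for name in TopologicalSorter(deps).static_order():
        run_notebook(name, log)
    return log


execution_order = run_pipeline(dependencies)
```

The two model-building branches have no dependency on each other, so an orchestrator is free to run them in parallel; this sketch simply runs them one after another.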
Demo
Dataset: Worldline and the Machine Learning Group at the Free University of Brussels.
(2018). Credit Card Fraud Detection (Version 3) [CSV]. Retrieved from
https://www.kaggle.com/mlg-ulb/creditcardfraud
Project Results
§ 3x higher fraud rate through Random Sampling
§ Outlier Detection captured 4x the random sample rate
§ Machine Learning captured 10x the random sample rate
Key Takeaways
• Leverage available tools in Azure
• Customization of solution using Databricks
• Complete model lifecycle utilizing MLflow
• Full automation with Azure Data Factory
Thank You

More Related Content

What's hot

SOLID principles
SOLID principlesSOLID principles
SOLID principles
Jonathan Holloway
 
DataOps introduction : DataOps is not only DevOps applied to data!
DataOps introduction : DataOps is not only DevOps applied to data!DataOps introduction : DataOps is not only DevOps applied to data!
DataOps introduction : DataOps is not only DevOps applied to data!
Adrien Blind
 
Deploy 22 microservices from scratch in 30 mins with GitOps
Deploy 22 microservices from scratch in 30 mins with GitOpsDeploy 22 microservices from scratch in 30 mins with GitOps
Deploy 22 microservices from scratch in 30 mins with GitOps
Opsta
 
Deployment Strategies Powerpoint Presentation Slides
Deployment Strategies Powerpoint Presentation SlidesDeployment Strategies Powerpoint Presentation Slides
Deployment Strategies Powerpoint Presentation Slides
SlideTeam
 
Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelines
jeykottalam
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use Cases
Max De Marzi
 
Domain Driven Design
Domain Driven DesignDomain Driven Design
Domain Driven Design
Araf Karsh Hamid
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AI
James Serra
 
Managing and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in PythonManaging and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in Python
Simon Frid
 
Solid Principles
Solid PrinciplesSolid Principles
Solid Principles
NexThoughts Technologies
 
Google Vertex AI
Google Vertex AIGoogle Vertex AI
Google Vertex AI
VikasBisoi
 
Open Closed Principle kata
Open Closed Principle kataOpen Closed Principle kata
Open Closed Principle kata
Paul Blundell
 
Enterprise Knowledge Graph
Enterprise Knowledge GraphEnterprise Knowledge Graph
Enterprise Knowledge Graph
Benjamin Raethlein
 
모델 서빙 파이프라인 구축하기
모델 서빙 파이프라인 구축하기모델 서빙 파이프라인 구축하기
모델 서빙 파이프라인 구축하기
SeongIkKim2
 
Solid principles
Solid principlesSolid principles
Solid principles
Declan Whelan
 
Open Metadata and Governance with Apache Atlas
Open Metadata and Governance with Apache AtlasOpen Metadata and Governance with Apache Atlas
Open Metadata and Governance with Apache Atlas
DataWorks Summit
 
Cloud-Native Observability
Cloud-Native ObservabilityCloud-Native Observability
Cloud-Native Observability
Tyler Treat
 
Productionizing Machine Learning with a Microservices Architecture
Productionizing Machine Learning with a Microservices ArchitectureProductionizing Machine Learning with a Microservices Architecture
Productionizing Machine Learning with a Microservices Architecture
Databricks
 
Introduction to CI/CD
Introduction to CI/CDIntroduction to CI/CD
Introduction to CI/CD
Steve Mactaggart
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data Factory
Slava Kokaev
 

What's hot (20)

SOLID principles
SOLID principlesSOLID principles
SOLID principles
 
DataOps introduction : DataOps is not only DevOps applied to data!
DataOps introduction : DataOps is not only DevOps applied to data!DataOps introduction : DataOps is not only DevOps applied to data!
DataOps introduction : DataOps is not only DevOps applied to data!
 
Deploy 22 microservices from scratch in 30 mins with GitOps
Deploy 22 microservices from scratch in 30 mins with GitOpsDeploy 22 microservices from scratch in 30 mins with GitOps
Deploy 22 microservices from scratch in 30 mins with GitOps
 
Deployment Strategies Powerpoint Presentation Slides
Deployment Strategies Powerpoint Presentation SlidesDeployment Strategies Powerpoint Presentation Slides
Deployment Strategies Powerpoint Presentation Slides
 
Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelines
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use Cases
 
Domain Driven Design
Domain Driven DesignDomain Driven Design
Domain Driven Design
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AI
 
Managing and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in PythonManaging and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in Python
 
Solid Principles
Solid PrinciplesSolid Principles
Solid Principles
 
Google Vertex AI
Google Vertex AIGoogle Vertex AI
Google Vertex AI
 
Open Closed Principle kata
Open Closed Principle kataOpen Closed Principle kata
Open Closed Principle kata
 
Enterprise Knowledge Graph
Enterprise Knowledge GraphEnterprise Knowledge Graph
Enterprise Knowledge Graph
 
모델 서빙 파이프라인 구축하기
모델 서빙 파이프라인 구축하기모델 서빙 파이프라인 구축하기
모델 서빙 파이프라인 구축하기
 
Solid principles
Solid principlesSolid principles
Solid principles
 
Open Metadata and Governance with Apache Atlas
Open Metadata and Governance with Apache AtlasOpen Metadata and Governance with Apache Atlas
Open Metadata and Governance with Apache Atlas
 
Cloud-Native Observability
Cloud-Native ObservabilityCloud-Native Observability
Cloud-Native Observability
 
Productionizing Machine Learning with a Microservices Architecture
Productionizing Machine Learning with a Microservices ArchitectureProductionizing Machine Learning with a Microservices Architecture
Productionizing Machine Learning with a Microservices Architecture
 
Introduction to CI/CD
Introduction to CI/CDIntroduction to CI/CD
Introduction to CI/CD
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data Factory
 

Similar to Advanced Model Comparison and Automated Deployment Using ML

MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in Production
Provectus
 
TestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data TestingTestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data Testing
RTTS
 
Transforming B2B Sales with Spark-Powered Sales Intelligence with Songtao Guo...
Transforming B2B Sales with Spark-Powered Sales Intelligence with Songtao Guo...Transforming B2B Sales with Spark-Powered Sales Intelligence with Songtao Guo...
Transforming B2B Sales with Spark-Powered Sales Intelligence with Songtao Guo...
Databricks
 
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligenceSpark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Wei Di
 
Transforming B2B Sales with Spark Powered Sales Intelligence
Transforming B2B Sales with Spark Powered Sales IntelligenceTransforming B2B Sales with Spark Powered Sales Intelligence
Transforming B2B Sales with Spark Powered Sales Intelligence
Songtao Guo
 
MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflow
Databricks
 
Using ML and Azure to improve Customer Lifetime Value
Using ML and Azure to improve Customer Lifetime ValueUsing ML and Azure to improve Customer Lifetime Value
Using ML and Azure to improve Customer Lifetime Value
Navin Albert
 
Vadlamudi saketh30 (ml)
Vadlamudi saketh30 (ml)Vadlamudi saketh30 (ml)
Vadlamudi saketh30 (ml)
Vadlamudi Saketh
 
Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015
Roger Barga
 
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
Databricks
 
GDG Cloud Southlake #3 Charles Adetiloye: Enterprise MLOps in Practice
GDG Cloud Southlake #3 Charles Adetiloye: Enterprise MLOps in PracticeGDG Cloud Southlake #3 Charles Adetiloye: Enterprise MLOps in Practice
GDG Cloud Southlake #3 Charles Adetiloye: Enterprise MLOps in Practice
James Anderson
 
Machine learning systems for engineers
Machine learning systems for engineersMachine learning systems for engineers
Machine learning systems for engineers
Cameron Joannidis
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabs
zekeLabs Technologies
 
Deploying Enterprise Scale Deep Learning in Actuarial Modeling at Nationwide
Deploying Enterprise Scale Deep Learning in Actuarial Modeling at NationwideDeploying Enterprise Scale Deep Learning in Actuarial Modeling at Nationwide
Deploying Enterprise Scale Deep Learning in Actuarial Modeling at Nationwide
Databricks
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to Production
Mostafa Majidpour
 
Data ops: Machine Learning in production
Data ops: Machine Learning in productionData ops: Machine Learning in production
Data ops: Machine Learning in production
Stepan Pushkarev
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
DATAVERSITY
 
AI & AWS DeepComposer
AI & AWS DeepComposerAI & AWS DeepComposer
AI & AWS DeepComposer
Amazon Web Services
 
Continuous delivery for machine learning
Continuous delivery for machine learningContinuous delivery for machine learning
Continuous delivery for machine learning
Rajesh Muppalla
 
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
DATAVERSITY
 

Similar to Advanced Model Comparison and Automated Deployment Using ML (20)

MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in Production
 
TestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data TestingTestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data Testing
 
Transforming B2B Sales with Spark-Powered Sales Intelligence with Songtao Guo...
Transforming B2B Sales with Spark-Powered Sales Intelligence with Songtao Guo...Transforming B2B Sales with Spark-Powered Sales Intelligence with Songtao Guo...
Transforming B2B Sales with Spark-Powered Sales Intelligence with Songtao Guo...
 
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligenceSpark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligence
 
Transforming B2B Sales with Spark Powered Sales Intelligence
Transforming B2B Sales with Spark Powered Sales IntelligenceTransforming B2B Sales with Spark Powered Sales Intelligence
Transforming B2B Sales with Spark Powered Sales Intelligence
 
MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflow
 
Using ML and Azure to improve Customer Lifetime Value
Using ML and Azure to improve Customer Lifetime ValueUsing ML and Azure to improve Customer Lifetime Value
Using ML and Azure to improve Customer Lifetime Value
 
Vadlamudi saketh30 (ml)
Vadlamudi saketh30 (ml)Vadlamudi saketh30 (ml)
Vadlamudi saketh30 (ml)
 
Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015
 
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
 
GDG Cloud Southlake #3 Charles Adetiloye: Enterprise MLOps in Practice
GDG Cloud Southlake #3 Charles Adetiloye: Enterprise MLOps in PracticeGDG Cloud Southlake #3 Charles Adetiloye: Enterprise MLOps in Practice
GDG Cloud Southlake #3 Charles Adetiloye: Enterprise MLOps in Practice
 
Machine learning systems for engineers
Machine learning systems for engineersMachine learning systems for engineers
Machine learning systems for engineers
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabs
 
Deploying Enterprise Scale Deep Learning in Actuarial Modeling at Nationwide
Deploying Enterprise Scale Deep Learning in Actuarial Modeling at NationwideDeploying Enterprise Scale Deep Learning in Actuarial Modeling at Nationwide
Deploying Enterprise Scale Deep Learning in Actuarial Modeling at Nationwide
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to Production
 
Data ops: Machine Learning in production
Data ops: Machine Learning in productionData ops: Machine Learning in production
Data ops: Machine Learning in production
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
AI & AWS DeepComposer
AI & AWS DeepComposerAI & AWS DeepComposer
AI & AWS DeepComposer
 
Continuous delivery for machine learning
Continuous delivery for machine learningContinuous delivery for machine learning
Continuous delivery for machine learning
 
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
Databricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
Databricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
Databricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Databricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
Databricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Databricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
Databricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
Bill641377
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
Social Samosa
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
ElizabethGarrettChri
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理


Advanced Model Comparison and Automated Deployment Using MLFlow

  • 1. Advanced Model Comparison and Automated Deployment Using MLFlow. Presenters: Charu Kalra, Sr Data Scientist; Connor McCambridge, Sr Data Scientist
  • 2. Fraud Insights & Analytics Data Science Team
      • Charu Kalra: Senior Data Scientist; Master’s in Mathematical Finance from Rutgers University; previously worked at American Express as a Risk Manager and at Commerce Bank as a Data Scientist; 2+ years of experience working with Spark, Databricks, and Big Data architectures
      • Connor McCambridge: Senior Data Scientist; Master’s in Business Intelligence and Analytics from Rockhurst University; started his data science career as an intern for Sprint’s Prepaid Division; 3+ years of experience working with Spark, Databricks, and Big Data architectures
      • Ted Burbidge: Senior Data Scientist; Master’s in Applied Statistics from the University of Kansas; working in telecom since 2000 in various roles, including Performance Engineering and Application Design; 3+ years of experience working with Spark, Databricks, and Big Data architectures
  • 3. Agenda: Project Vision; Solution Design; Demo; Conclusion
  • 4. Feedback Your feedback is important to us. Don’t forget to rate and review the sessions.
  • 5. Project Vision
      • Problem Statement: multiple fraud checks on new accounts; missed fraud falls into delinquent status.
      • Objective: (1) measure fraud rate; (2) identify missed fraud.
      • Check stages (diagram): Pre-Activation Check, Post-Activation Check, Delinquent Status Check
  • 6. Project Approach: Random Sampling; Machine Learning; Outlier Detection
  • 8. Data Science Stages
      • Data Preparation: gather data; training data; test data
      • Data Transformation: variable creation; aggregations; scaling/standardizing
      • Model Building: machine learning; outlier detection
      • Review Results: examine metrics; productionalize model (optional)
  • 9. Notebook Building (diagram): three separate notebooks, one per model (Logistic Regression, Neural Net, XGBoost), each repeating the same steps: Data Prep, Build Transformer, Build Model, Review Results
  • 10. Notebook Framework (diagram): Data Prep; Build Transformer; Build Model (Logistic Regression); Build Model (Neural Net); Build Model (XGBoost); Review Results
  • 11. Framework Components: Create & Store Data; Build Uniform Transformer; Train Multiple Models; Hyper-Tune Parameters; Model Selection; Automated Deployment
  • 12. Solution Design (diagram)
      • Shared: Data Prep & Transformer; Transformer Model Registry; Daily Batch Scoring
      • Machine Learning: ML Model Building; ML Experiment; ML Model Compare; ML Model Registry
      • Outlier Detection: Outlier Model Building; Outlier Experiment; Outlier Model Compare; Outlier Model Registry
  • 13. Solution Design diagram (repeated), step: Create & Store Data
  • 14. Solution Design diagram (repeated), step: Build Uniform Transformer
  • 15. Solution Design diagram (repeated), step: Train Multiple Models
  • 16. Solution Design diagram (repeated), step: Hyper-Tuning Parameters
  • 17. Hyper-Tuning
      • Search space: {ρ₁: [ρ₁₁, ρ₁₂, …, ρ₁ₙ], ρ₂: [ρ₂₁, ρ₂₂, …, ρ₂ₙ], …, ρₙ: [ρₙ₁, ρₙ₂, …, ρₙₙ]}
      • fmin() trials: {ρ₁₁, ρ₂₁, …, ρₙ₁}, {ρ₁₂, ρ₂₁, …, ρₙ₁}, {ρ₁₃, ρ₂₁, …, ρₙ₁}, …, {ρ₁ₙ, ρ₂ₙ, …, ρₙₙ}, resulting in {Best Model}
      • Steps: Test & Train Data; Transform Data; Define Hyperopt; Build Models; Compare Results; Select Best Model; Build & Save Pipeline
  • 18. Solution Design diagram (repeated), step: Model Selection
  • 19. Solution Design diagram (repeated), step: Automated Deployment
  • 20. Azure Data Factory Implementation: because all the notebooks are coded in Azure Databricks, Azure Data Factory can be used to orchestrate the notebook executions.
  • 21. Demo. Dataset: Worldline and the Machine Learning Group at the Free University of Brussels. (2018). Credit Card Fraud Detection (Version 3) [CSV]. Retrieved from https://www.kaggle.com/mlg-ulb/creditcardfraud
  • 22. Project Results
      • 3x higher fraud rate through Random Sampling
      • Outlier Detection captured 4x the random sample rate
      • Machine Learning captured 10x the random sample rate
  • 23. Key Takeaways
      • Leverage available tools in Azure
      • Customization of solution using Databricks
      • Complete model lifecycle utilizing MLFlow
      • Full automation with Azure Data Factory
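
As a rough illustration of the hyper-tuning, model selection, and automated deployment steps (slides 17 to 19), the sketch below walks a search space shaped like the one on slide 17, keeps the lowest-loss combination, and applies a simple promotion rule. It is plain Python, not the presenters' code and not Hyperopt itself: Hyperopt's fmin() samples the space with a search algorithm rather than enumerating every combination, and the hyperparameter names, the toy loss function, and the `should_promote` rule are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical search space in the shape shown on slide 17:
# each hyperparameter maps to its list of candidate values.
search_space = {
    "learning_rate": [0.01, 0.1, 0.3],
    "max_depth": [2, 4, 6],
}


def objective(params):
    """Toy stand-in for 'train a model and return its validation loss'.

    Assumed loss surface with its minimum at learning_rate=0.1, max_depth=4.
    """
    lr_term = (params["learning_rate"] - 0.1) ** 2
    depth_term = ((params["max_depth"] - 4) / 10) ** 2
    return lr_term + depth_term


def search(space, loss_fn):
    """Evaluate every combination; return (best_params, best_loss, all_runs)."""
    names = list(space)
    runs = []
    for values in product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        runs.append({"params": params, "loss": loss_fn(params)})
    # Model selection: the run with the lowest loss wins.
    best = min(runs, key=lambda run: run["loss"])
    return best["params"], best["loss"], runs


def should_promote(candidate_loss, production_loss):
    """Hypothetical deployment rule: promote only a strictly better candidate.

    production_loss is None when no production model exists yet.
    """
    return production_loss is None or candidate_loss < production_loss


best_params, best_loss, runs = search(search_space, objective)
print(best_params, should_promote(best_loss, production_loss=0.05))
```

In the deck's design the comparison would presumably run against metrics logged to the ML and Outlier experiments, with the winner written to the corresponding model registry; here an in-memory list of runs stands in for both.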