Pipelines for model deployment
2017-04-25
1. Digital Origin introduction
2. A recurrent problem moving to production
3. H2O
4. Digital Origin pipeline
5. Sometimes it is harder than usual to automate: Rejection Inference
Digital Origin – Introduction
1. Digital Origin is a leading Spanish fintech company focused on technology-enabled consumer finance.
2. Founded in 2011. €15 million A round in 2015. 80 employees, with offices in Barcelona and Madrid.
3. Uniquely positioned to address the mainstream consumer finance market with a wide portfolio of instant, real-time products and a fully online process.
4. Over €150 million lent to date.
5. ¡QuéBueno! was released in 2011: consumer finance microlending.
6. Paga+Tarde was released in 2015: consumer finance for eCommerce and in-store.
[Diagram: Credit Risk Engine (CRE)]
Areas: Fraud, Risk, Business, Monitoring
Concerns: Massive Fraud, Identity Fraud, Not Willing to Pay, Default Risk, Product/UX, AR vs DR tradeoff, Evaluation, Control & Alerts
Models and inputs: Device fingerprinting, User request, Graph relationships model, DNI images models, Geo fraud model, Basket model, Behavioural model, Identity fraud model, Risk model, Uplift models, CLTV models
Other labels: Marketing, Credit Cards / Returnings, QB, Configuration & Parameter Tuning, Reporting, Alerts, Design & Models params
A recurrent problem moving to production
There are different requirements in the development/design phase and once in production.
R&D Environment (Data Scientist profile)
• Interactive mode
• Friendly for discovery
• Fast development language
• Easy to save state and continue later
• Access to mathematical libraries
• …
Prod. Environment (Engineer profile)
• Scalable architecture
• Error handling
• High availability
• Load balancing
• Reliable and stable
• …
Different languages imply twice the work or more.
Solution A: Python is well suited to both needs.
Solution B: an API approach gives some flexibility.
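One way to picture Solution B: the data science side wraps a model behind a small HTTP endpoint, and the production system calls it with JSON. The sketch below uses only the Python standard library. The `/score` path, the `score_applicant` function and its weights are hypothetical placeholders for illustration, not Digital Origin's actual service.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for a trained model: a weighted sum of two features.
WEIGHTS = {"income": 0.5, "debt": -0.25}


def score_applicant(features: dict) -> float:
    """Return a toy score; features missing from the request count as 0."""
    return sum(w * float(features.get(name, 0.0)) for name, w in WEIGHTS.items())


def handle_request(body: bytes) -> tuple:
    """Pure handler: JSON bytes in, (HTTP status, JSON-serializable dict) out."""
    try:
        features = json.loads(body)
        if not isinstance(features, dict):
            raise ValueError("expected a JSON object")
        return 200, {"score": score_applicant(features)}
    except (ValueError, TypeError) as exc:
        # Malformed JSON or non-numeric feature values end up here.
        return 400, {"error": str(exc)}


class ScoreHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/score":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_request(self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8080) -> None:
    """Blocking call; run it only when starting the service."""
    HTTPServer(("127.0.0.1", port), ScoreHandler).serve_forever()
```

Keeping `handle_request` free of any socket code means the scoring logic can be tested without starting a server, while the production side only depends on the JSON contract and not on the language the model was built in.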
H2O - Architectures
• Open source API for Machine Learning
• Massively Scalable Big Data Analysis
• Easy-to-use WebUI (Jupyter – Python notebook)
• Familiar Interfaces: R, Python, Scala, Java, API, …
• Real-time Data Scoring
• Rapidly deploy models to production via POJO (plain old Java object) or MOJO (model object, optimized)
• Algorithms
  • GLM
  • Random Forest
  • GBM
  • “Deep Learning”
  • Deep Water: TensorFlow, MXNet, Caffe, … (not yet)
  • …
https://www.h2o.ai/h2o/
H2O - Architectures
• Local
• Cluster
• Cluster + HDFS
• Cluster + Spark (Node 1 … Node N)
H2O - Performance
Reproducible benchmark: https://github.com/szilard/benchm-ml
[Benchmark plots: GLM, RF, GBM (setup A), GBM (setup B)]
[Diagram: development-to-production flow on the Hadoop ecosystem (Node 1 … Node N)]
Development: Extract → Transform → Train models → Export POJO
Production: Transform → Scoring
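The flow in this diagram can be sketched in a few lines: a model is trained in development, exported as an artifact, and production applies the same transform before scoring. The sketch below is a toy under stated assumptions. It fits a tiny logistic regression by gradient descent and exports the parameters as JSON, which merely stands in for the exported POJO (H2O's POJO export is generated Java source, not JSON). The feature names `amount` and `returning` and all function names are hypothetical.

```python
import json
import math


def transform(raw: dict) -> list:
    """Feature transform shared by development and production."""
    return [raw["amount"] / 1000.0, 1.0 if raw["returning"] else 0.0]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(rows: list, labels: list, epochs: int = 1000, lr: float = 0.5) -> dict:
    """Development side: fit a tiny logistic regression by stochastic gradient descent."""
    xs = [transform(r) for r in rows]
    weights, bias = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return {"weights": weights, "bias": bias}


def export_model(model: dict) -> str:
    """Serialize the fitted parameters; plays the role of the exported artifact."""
    return json.dumps(model)


def make_scorer(artifact: str):
    """Production side: load the artifact and return a scoring function."""
    model = json.loads(artifact)

    def score(raw: dict) -> float:
        x = transform(raw)  # same transform as in training
        z = sum(w * v for w, v in zip(model["weights"], x)) + model["bias"]
        return sigmoid(z)

    return score
```

The point of the design is that `transform` exists once and is used on both sides. If development and production each kept their own copy, possibly in different languages, the two could drift apart and the scores in production would no longer match what was validated.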
Digital Origin – Introduction
Data Analytics activity
Production
Credit Risk Engine (CRE)
Digital Origin – Current Pipeline
[Diagram. Data stores: Production Databases, Replica Databases, Analytics and Reporting databases. Flows: streaming, batch, {{mustache}} query template. Data Science: Tools, Corporate Libraries, daily activity and recurrent processes. Outputs: Reporting, Alerts System, Services to other dep., CRE development, CRE param. tuning. Deployment labels: Front End, New Model, Back End, New Config.]
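The "{{mustache}} query template" label suggests that recurrent queries are written once as templates and filled in with parameters. A minimal sketch of that idea follows, assuming only plain `{{name}}` variable substitution rather than the full Mustache feature set (no sections, partials or escaping). The table, column and parameter names are made up for illustration.

```python
import re

# Matches {{ name }} placeholders; only plain variables are supported here.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render(template: str, params: dict) -> str:
    """Replace each {{name}} with str(params[name]); fail loudly on missing keys."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing template parameter: {name}")
        return str(params[name])

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical reporting query; table and column names are placeholders.
QUERY_TEMPLATE = (
    "SELECT product, COUNT(*) AS requests "
    "FROM {{schema}}.loan_requests "
    "WHERE created_at >= '{{start_date}}' "
    "GROUP BY product"
)
```

Plain text substitution like this is only suitable for trusted, internally generated parameters. Values that come from users should go through the database driver's bound parameters instead.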
THANKS!
Questions?
ralabern@digitalorigin.com
markus@digitalorigin.com

More Related Content

What's hot

RedisConf18 - Techniques for Synchronizing In-Memory Caches with Redis
RedisConf18 - Techniques for Synchronizing In-Memory Caches with RedisRedisConf18 - Techniques for Synchronizing In-Memory Caches with Redis
RedisConf18 - Techniques for Synchronizing In-Memory Caches with RedisRedis Labs
 
Application performance monitoring with Elastic APM and the ELK stack
Application performance monitoring with Elastic APM and the ELK stackApplication performance monitoring with Elastic APM and the ELK stack
Application performance monitoring with Elastic APM and the ELK stackAlain Lompo
 
新浪微博开放平台Redis实战
新浪微博开放平台Redis实战新浪微博开放平台Redis实战
新浪微博开放平台Redis实战mysqlops
 
Web application development - The past, the present, the future
Web application development - The past, the present, the futureWeb application development - The past, the present, the future
Web application development - The past, the present, the futureJuho Vepsäläinen
 
Not Just ORM: Powerful Hibernate ORM Features and Capabilities
Not Just ORM: Powerful Hibernate ORM Features and CapabilitiesNot Just ORM: Powerful Hibernate ORM Features and Capabilities
Not Just ORM: Powerful Hibernate ORM Features and CapabilitiesBrett Meyer
 
Delivering: from Kafka to WebSockets | Adam Warski, SoftwareMill
Delivering: from Kafka to WebSockets | Adam Warski, SoftwareMillDelivering: from Kafka to WebSockets | Adam Warski, SoftwareMill
Delivering: from Kafka to WebSockets | Adam Warski, SoftwareMillHostedbyConfluent
 
Machine Learning Models in Production
Machine Learning Models in ProductionMachine Learning Models in Production
Machine Learning Models in ProductionDataWorks Summit
 
Introducing Kafka's Streams API
Introducing Kafka's Streams APIIntroducing Kafka's Streams API
Introducing Kafka's Streams APIconfluent
 
Why Predictive Maintenance in Dairy Industry is a necessity? Know here
Why Predictive Maintenance in Dairy Industry is a necessity? Know hereWhy Predictive Maintenance in Dairy Industry is a necessity? Know here
Why Predictive Maintenance in Dairy Industry is a necessity? Know hereInfinite Uptime
 
Predictive Process Monitoring in Camunda
Predictive Process Monitoring in CamundaPredictive Process Monitoring in Camunda
Predictive Process Monitoring in Camundacamunda services GmbH
 
Jim Dowling - Multi-tenant Flink-as-a-Service on YARN
Jim Dowling - Multi-tenant Flink-as-a-Service on YARN Jim Dowling - Multi-tenant Flink-as-a-Service on YARN
Jim Dowling - Multi-tenant Flink-as-a-Service on YARN Flink Forward
 
Domain Driven Design com Python
Domain Driven Design com PythonDomain Driven Design com Python
Domain Driven Design com PythonFrederico Cabral
 
Foundry technical intro
Foundry technical introFoundry technical intro
Foundry technical introesseemme69
 
presentation.pdf
presentation.pdfpresentation.pdf
presentation.pdfcaa28steve
 
SmartDB Office Hours: Connection Pool Sizing Concepts
SmartDB Office Hours: Connection Pool Sizing ConceptsSmartDB Office Hours: Connection Pool Sizing Concepts
SmartDB Office Hours: Connection Pool Sizing ConceptsKoppelaars
 
Grokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking VN
 
Lecture 6: Infrastructure & Tooling (Full Stack Deep Learning - Spring 2021)
Lecture 6: Infrastructure & Tooling (Full Stack Deep Learning - Spring 2021)Lecture 6: Infrastructure & Tooling (Full Stack Deep Learning - Spring 2021)
Lecture 6: Infrastructure & Tooling (Full Stack Deep Learning - Spring 2021)Sergey Karayev
 
Real-Time Processing of Spatial Data Using Kafka Streams, Ian Feeney & Roman ...
Real-Time Processing of Spatial Data Using Kafka Streams, Ian Feeney & Roman ...Real-Time Processing of Spatial Data Using Kafka Streams, Ian Feeney & Roman ...
Real-Time Processing of Spatial Data Using Kafka Streams, Ian Feeney & Roman ...HostedbyConfluent
 

What's hot (20)

RedisConf18 - Techniques for Synchronizing In-Memory Caches with Redis
RedisConf18 - Techniques for Synchronizing In-Memory Caches with RedisRedisConf18 - Techniques for Synchronizing In-Memory Caches with Redis
RedisConf18 - Techniques for Synchronizing In-Memory Caches with Redis
 
Application performance monitoring with Elastic APM and the ELK stack
Application performance monitoring with Elastic APM and the ELK stackApplication performance monitoring with Elastic APM and the ELK stack
Application performance monitoring with Elastic APM and the ELK stack
 
新浪微博开放平台Redis实战
新浪微博开放平台Redis实战新浪微博开放平台Redis实战
新浪微博开放平台Redis实战
 
Web application development - The past, the present, the future
Web application development - The past, the present, the futureWeb application development - The past, the present, the future
Web application development - The past, the present, the future
 
Not Just ORM: Powerful Hibernate ORM Features and Capabilities
Not Just ORM: Powerful Hibernate ORM Features and CapabilitiesNot Just ORM: Powerful Hibernate ORM Features and Capabilities
Not Just ORM: Powerful Hibernate ORM Features and Capabilities
 
Delivering: from Kafka to WebSockets | Adam Warski, SoftwareMill
Delivering: from Kafka to WebSockets | Adam Warski, SoftwareMillDelivering: from Kafka to WebSockets | Adam Warski, SoftwareMill
Delivering: from Kafka to WebSockets | Adam Warski, SoftwareMill
 
Machine Learning Models in Production
Machine Learning Models in ProductionMachine Learning Models in Production
Machine Learning Models in Production
 
Introducing Kafka's Streams API
Introducing Kafka's Streams APIIntroducing Kafka's Streams API
Introducing Kafka's Streams API
 
Why Predictive Maintenance in Dairy Industry is a necessity? Know here
Why Predictive Maintenance in Dairy Industry is a necessity? Know hereWhy Predictive Maintenance in Dairy Industry is a necessity? Know here
Why Predictive Maintenance in Dairy Industry is a necessity? Know here
 
Predictive Process Monitoring in Camunda
Predictive Process Monitoring in CamundaPredictive Process Monitoring in Camunda
Predictive Process Monitoring in Camunda
 
Jim Dowling - Multi-tenant Flink-as-a-Service on YARN
Jim Dowling - Multi-tenant Flink-as-a-Service on YARN Jim Dowling - Multi-tenant Flink-as-a-Service on YARN
Jim Dowling - Multi-tenant Flink-as-a-Service on YARN
 
Domain Driven Design com Python
Domain Driven Design com PythonDomain Driven Design com Python
Domain Driven Design com Python
 
AI in the Enterprise at Scale
AI in the Enterprise at ScaleAI in the Enterprise at Scale
AI in the Enterprise at Scale
 
Foundry technical intro
Foundry technical introFoundry technical intro
Foundry technical intro
 
presentation.pdf
presentation.pdfpresentation.pdf
presentation.pdf
 
SmartDB Office Hours: Connection Pool Sizing Concepts
SmartDB Office Hours: Connection Pool Sizing ConceptsSmartDB Office Hours: Connection Pool Sizing Concepts
SmartDB Office Hours: Connection Pool Sizing Concepts
 
Grokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous Communications
 
Lecture 6: Infrastructure & Tooling (Full Stack Deep Learning - Spring 2021)
Lecture 6: Infrastructure & Tooling (Full Stack Deep Learning - Spring 2021)Lecture 6: Infrastructure & Tooling (Full Stack Deep Learning - Spring 2021)
Lecture 6: Infrastructure & Tooling (Full Stack Deep Learning - Spring 2021)
 
Real-Time Processing of Spatial Data Using Kafka Streams, Ian Feeney & Roman ...
Real-Time Processing of Spatial Data Using Kafka Streams, Ian Feeney & Roman ...Real-Time Processing of Spatial Data Using Kafka Streams, Ian Feeney & Roman ...
Real-Time Processing of Spatial Data Using Kafka Streams, Ian Feeney & Roman ...
 
Apache ZooKeeper
Apache ZooKeeperApache ZooKeeper
Apache ZooKeeper
 

Similar to Digital Origin - Pipelines for model deployment

Continuum Analytics and Python
Continuum Analytics and PythonContinuum Analytics and Python
Continuum Analytics and PythonTravis Oliphant
 
Machine Learning at Scale with MLflow and Apache Spark
Machine Learning at Scale with MLflow and Apache SparkMachine Learning at Scale with MLflow and Apache Spark
Machine Learning at Scale with MLflow and Apache SparkDatabricks
 
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Sri Ambati
 
Machine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville MeetupMachine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville MeetupSri Ambati
 
Global Open Source Development 2011-2014 Review and 2015 Forecast
Global Open Source Development 2011-2014 Review and 2015 ForecastGlobal Open Source Development 2011-2014 Review and 2015 Forecast
Global Open Source Development 2011-2014 Review and 2015 ForecastSammy Fung
 
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AIBig Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AIMatt Stubbs
 
Proud to be polyglot
Proud to be polyglotProud to be polyglot
Proud to be polyglotTugdual Grall
 
Latest Developments in H2O
Latest Developments in H2OLatest Developments in H2O
Latest Developments in H2OSri Ambati
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEJo-fai Chow
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMESri Ambati
 
A Tight Ship: How Containers and SDS Optimize the Enterprise
 A Tight Ship: How Containers and SDS Optimize the Enterprise A Tight Ship: How Containers and SDS Optimize the Enterprise
A Tight Ship: How Containers and SDS Optimize the EnterpriseEric Kavanagh
 
H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneJo-fai Chow
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSSri Ambati
 
Simplifying and Future-Proofing Hadoop
Simplifying and Future-Proofing HadoopSimplifying and Future-Proofing Hadoop
Simplifying and Future-Proofing HadoopPrecisely
 
ML Model Deployment and Scoring on the Edge with Automatic ML & DF
ML Model Deployment and Scoring on the Edge with Automatic ML & DFML Model Deployment and Scoring on the Edge with Automatic ML & DF
ML Model Deployment and Scoring on the Edge with Automatic ML & DFSri Ambati
 
DevOps and the cloud: all hail the (developer) king - Daniel Bryant, Steve Poole
DevOps and the cloud: all hail the (developer) king - Daniel Bryant, Steve PooleDevOps and the cloud: all hail the (developer) king - Daniel Bryant, Steve Poole
DevOps and the cloud: all hail the (developer) king - Daniel Bryant, Steve PooleJAXLondon_Conference
 
JAXLondon 2015 "DevOps and the Cloud: All Hail the (Developer) King"
JAXLondon 2015 "DevOps and the Cloud: All Hail the (Developer) King"JAXLondon 2015 "DevOps and the Cloud: All Hail the (Developer) King"
JAXLondon 2015 "DevOps and the Cloud: All Hail the (Developer) King"Daniel Bryant
 
Project "Deep Water"
Project "Deep Water"Project "Deep Water"
Project "Deep Water"Jo-fai Chow
 
ISV Showcase: End-to-end Machine Learning using H2O on Azure
ISV Showcase: End-to-end Machine Learning using H2O on AzureISV Showcase: End-to-end Machine Learning using H2O on Azure
ISV Showcase: End-to-end Machine Learning using H2O on AzureMicrosoft Tech Community
 
PyData Texas 2015 Keynote
PyData Texas 2015 KeynotePyData Texas 2015 Keynote
PyData Texas 2015 KeynotePeter Wang
 

Similar to Digital Origin - Pipelines for model deployment (20)

Continuum Analytics and Python
Continuum Analytics and PythonContinuum Analytics and Python
Continuum Analytics and Python
 
Machine Learning at Scale with MLflow and Apache Spark
Machine Learning at Scale with MLflow and Apache SparkMachine Learning at Scale with MLflow and Apache Spark
Machine Learning at Scale with MLflow and Apache Spark
 
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
 
Machine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville MeetupMachine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville Meetup
 
Global Open Source Development 2011-2014 Review and 2015 Forecast
Global Open Source Development 2011-2014 Review and 2015 ForecastGlobal Open Source Development 2011-2014 Review and 2015 Forecast
Global Open Source Development 2011-2014 Review and 2015 Forecast
 
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AIBig Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
 
Proud to be polyglot
Proud to be polyglotProud to be polyglot
Proud to be polyglot
 
Latest Developments in H2O
Latest Developments in H2OLatest Developments in H2O
Latest Developments in H2O
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIME
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIME
 
A Tight Ship: How Containers and SDS Optimize the Enterprise
 A Tight Ship: How Containers and SDS Optimize the Enterprise A Tight Ship: How Containers and SDS Optimize the Enterprise
A Tight Ship: How Containers and SDS Optimize the Enterprise
 
H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWS
 
Simplifying and Future-Proofing Hadoop
Simplifying and Future-Proofing HadoopSimplifying and Future-Proofing Hadoop
Simplifying and Future-Proofing Hadoop
 
ML Model Deployment and Scoring on the Edge with Automatic ML & DF
ML Model Deployment and Scoring on the Edge with Automatic ML & DFML Model Deployment and Scoring on the Edge with Automatic ML & DF
ML Model Deployment and Scoring on the Edge with Automatic ML & DF
 
DevOps and the cloud: all hail the (developer) king - Daniel Bryant, Steve Poole
DevOps and the cloud: all hail the (developer) king - Daniel Bryant, Steve PooleDevOps and the cloud: all hail the (developer) king - Daniel Bryant, Steve Poole
DevOps and the cloud: all hail the (developer) king - Daniel Bryant, Steve Poole
 
JAXLondon 2015 "DevOps and the Cloud: All Hail the (Developer) King"
JAXLondon 2015 "DevOps and the Cloud: All Hail the (Developer) King"JAXLondon 2015 "DevOps and the Cloud: All Hail the (Developer) King"
JAXLondon 2015 "DevOps and the Cloud: All Hail the (Developer) King"
 
Project "Deep Water"
Project "Deep Water"Project "Deep Water"
Project "Deep Water"
 
ISV Showcase: End-to-end Machine Learning using H2O on Azure
ISV Showcase: End-to-end Machine Learning using H2O on AzureISV Showcase: End-to-end Machine Learning using H2O on Azure
ISV Showcase: End-to-end Machine Learning using H2O on Azure
 
PyData Texas 2015 Keynote
PyData Texas 2015 KeynotePyData Texas 2015 Keynote
PyData Texas 2015 Keynote
 

Recently uploaded

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 

Recently uploaded (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 

Digital Origin - Pipelines for model deployment

  • 1. Pipelines for model deployment 2017-04-25
  • 2. 1. Digital Origin introduction; 2. A recurrent problem moving to production; 3. H2O; 4. Digital Origin pipeline; 5. Sometimes it is harder than usual to automate: Rejection Inference
  • 4. Digital Origin – Introduction. Digital Origin is a leading Spanish fintech company focused on technology-enabled consumer finance. Founded in 2011. €15 million A-round in 2015. 80 employees with offices in Barcelona and Madrid. Uniquely positioned to address the mainstream consumer finance market with a wide portfolio of instant, real-time products and a fully online process. Over €150 million lent to date. ¡QuéBueno! was released in 2011: consumer finance microlending. Paga+Tarde was released in 2015: consumer finance for eCommerce and InStore.
  • 8. Diagram labels: Fraud; Risk; Business; Monitoring; Massive Fraud; Identity Fraud; Not Willing to Pay; Default Risk; Product, UX, AR vs DR tradeoff; Evaluation; Control & Alerts; Marketing; Credit Cards / Returnings; QB; Device - Fingerprinting; User request; Graph relationships model; DNI Images models; Geo fraud model; Basket Model; Behavioural Model; Configuration & Parameter Tuning; Reporting; Uplift models; CLTV models; Identity fraud model; Alerts; CREDIT RISK ENGINE (CRE); Design & Models params; Risk Model
  • 9. 1. Digital Origin introduction; 2. A recurrent problem moving to production; 3. H2O; 4. Digital Origin pipeline; 5. Sometimes it is harder than usual to automate: Rejection Inference
  • 10. A recurrent problem moving to production. R&D (I+D) Environment (Data Scientist profile): Interactive mode; Friendly for discovery; Fast development language; Easy to save a state to continue later on; Access to mathematical libraries; … Prod. Environment (Engineer profile): Scalable architecture; Error handling; High availability; Load balancing; Reliable and stable; … There are different requirements in the development/design phase and once in production.
  • 11. A recurrent problem moving to production (same comparison as slide 10). Different languages imply twice the work or more.
  • 12. A recurrent problem moving to production (same comparison as slide 10). Solution A: Python is well suited to both needs.
  • 13. A recurrent problem moving to production (same comparison as slide 10). Solution B: an API approach to gain some flexibility.
  • 14. A recurrent problem moving to production (same comparison as slide 10). ...
  • 15. 1. Digital Origin introduction; 2. A recurrent problem moving to production; 3. H2O; 4. Digital Origin pipeline; 5. Sometimes it is harder than usual to automate: Rejection Inference
  • 16. H2O - Architectures. Open source API for Machine Learning; Massively scalable Big Data analysis; Easy-to-use WebUI (Jupyter – Python notebook); Familiar interfaces: R, Python, Scala, Java, API, …; Real-time data scoring; Rapidly deploy models to production via POJO or model-optimized Java objects (MOJO); Algorithms: GLM, Random Forest, GBM, “Deep Learning”, Deep Water: TensorFlow, MXNet, Caffe, … (not yet), … https://www.h2o.ai/h2o/
  • 18. H2O - Architectures (diagram labels): Cluster + Spark; Node 1 … Node N
  • 19. H2O - Performance. Reproducible benchmark: https://github.com/szilard/benchm-ml (chart panels: GLM; RF; GBM (setup A); GBM (setup B))
  • 20. 1. Digital Origin introduction; 2. A recurrent problem moving to production; 3. H2O; 4. Digital Origin pipeline; 5. Sometimes it is harder than usual to automate: Rejection Inference
  • 21. Diagram labels: Fraud; Risk; Business; Monitoring; Massive Fraud; Identity Fraud; Not Willing to Pay; Default Risk; Product, UX, AR vs DR tradeoff; Evaluation; Control & Alerts; Marketing; Credit Cards / Returnings; QB; Device - Fingerprinting; User request; Graph relationships model; DNI Images models; Geo fraud model; Behavioural Model; Configuration & Parameter Tuning; Reporting; Uplift models; CLTV models; Identity fraud model; Alerts; CREDIT RISK ENGINE (CRE); Design & Models params; Risk Model
  • 22. Digital Origin – Introduction (diagram labels): Development; Production; Node 1 … Node N; Hadoop ecosystem; Extract; Transform; Train models; Transform; Scoring; Export POJO
  • 23. Digital Origin – Actual Pipeline (diagram labels): Data Analytics activity; Production; Credit Risk Engine (CRE); Reporting; Replica Databases; {{mustache}}; streaming; Query template; Tools; Corporate Libraries; batch; Data Science; Daily activity and recurrent processes; Analytics and Reporting databases; Production Databases; Alerts System; Services to other dep.; CRE development; CRE param. tuning; Front End; New Model; Back End; New Config
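
Slide 23 pairs the label {{mustache}} with "Query template". A minimal sketch of that idea follows, assuming the label refers to Mustache-style `{{name}}` placeholders inside SQL query templates. The slides do not show the actual tooling, so the function `render_query`, the table name, and the column names are all hypothetical, and only the plain variable form of the Mustache syntax is handled.

```python
import re

# Matches {{ name }} placeholders (simple variable tags only; sections,
# partials and escaping rules of full Mustache are NOT implemented).
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_query(template, params):
    """Replace each {{name}} placeholder with str(params[name]).

    Raises KeyError if the template uses a name that params lacks.
    render_query is a hypothetical helper, not part of any library.
    """
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing template parameter: {name}")
        return str(params[name])

    return _PLACEHOLDER.sub(substitute, template)


# Hypothetical reporting query; table and column names are illustrative only.
TEMPLATE = (
    "SELECT request_id, score FROM {{ table }} "
    "WHERE created_at >= '{{ start_date }}' AND score >= {{ min_score }}"
)

query = render_query(
    TEMPLATE,
    {"table": "risk_scores", "start_date": "2017-04-01", "min_score": 0.5},
)
print(query)
```

Plain text substitution like this is only reasonable for trusted, internally defined parameters. Values that come from users should be passed through the database driver's bound parameters instead.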