Financial Predictions
with Machine Learning
Stefano Tempesta
AGENDA
Online Payment Fraud Detection
Credit Risk Prediction
Online Payment Fraud Detection
VISA handles 2000 transactions/sec (>170M transactions per day)
In 2017, financial fraud totaled a cost of € 900 million
Traditional Payment Fraud Detection
• Analyze transactions and human-review suspicious ones
• Use a combination of data, horizon-scanning and “gut-feel”
• Every attempted purchase that raises an alert is either declined or reviewed
• False positives turn away legitimate customers
• Fails to predict threats the business is not yet aware of
• Rules and risk scores need regular manual updates
ML Payment Fraud Detection
• Large, historical datasets across many clients and industries
• Also benefits small companies
• Non-binary → Optimize recall and precision
• Self-learning models not determined by a fraud analyst
• Feature contribution is driven by the training set
• Risk score is updated quickly
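The recall and precision trade-off behind a non-binary score can be sketched as follows. This is a minimal illustration, assuming a model that outputs a fraud score between 0 and 1; the scores, labels, and the `precision_recall` helper are hypothetical and not part of the original deck.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall) when scores >= threshold are flagged as fraud."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)  # fraud caught
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)  # false alarms
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)   # fraud missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical fraud scores in [0, 1] and true labels (1 = fraud).
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.9, 0.5, 0.2):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, lowering the threshold from 0.9 to 0.2 raises recall from 0.33 to 1.00 while precision falls from 1.00 to 0.60, which is the trade-off a business tunes when choosing where to decline or review.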
Feature Engineering
• Aggregated variables: aggregated transaction amount per account,
aggregated transaction count per account in last 24 hours and last 30
days.
• Mismatch values: mismatch between shipping country and billing
country.
• Risk tables: fraud risks are calculated using historical probability
grouped by country, IP address, etc.
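The three feature families above can be sketched in plain Python. This is an illustrative sketch only: the record fields (`account`, `shipping_country`, `billing_country`, `fraud`), the sample transactions, and the helper names are assumptions made for the example, and the fallback risk of 0.0 for unseen countries is an arbitrary choice.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def tx(account, time, amount, shipping, billing, fraud=0):
    """Build one hypothetical transaction record."""
    return {"account": account, "time": time, "amount": amount,
            "shipping_country": shipping, "billing_country": billing,
            "fraud": fraud}

history = [
    tx("A1", datetime(2018, 5, 1, 9, 0), 20.0, "IT", "IT"),
    tx("A1", datetime(2018, 5, 20, 10, 0), 35.0, "IT", "IT"),
    tx("A1", datetime(2018, 5, 21, 8, 0), 15.0, "IT", "IT"),
    tx("A2", datetime(2018, 5, 10, 14, 0), 50.0, "DE", "DE"),
    tx("A2", datetime(2018, 5, 21, 7, 0), 900.0, "FR", "DE", fraud=1),
]

def aggregates(history, account, now, window):
    """Total amount and count for one account within `window` before `now`."""
    recent = [t for t in history
              if t["account"] == account and now - window <= t["time"] < now]
    return sum(t["amount"] for t in recent), len(recent)

def risk_table(history, key):
    """Historical fraud rate per value of `key` (e.g. billing country)."""
    totals, frauds = defaultdict(int), defaultdict(int)
    for t in history:
        totals[t[key]] += 1
        frauds[t[key]] += t["fraud"]
    return {value: frauds[value] / totals[value] for value in totals}

def build_features(new_tx, history, risks):
    """Derive aggregated, mismatch, and risk-table features for one transaction."""
    amount_24h, count_24h = aggregates(
        history, new_tx["account"], new_tx["time"], timedelta(hours=24))
    amount_30d, count_30d = aggregates(
        history, new_tx["account"], new_tx["time"], timedelta(days=30))
    return {
        "amount_24h": amount_24h, "count_24h": count_24h,
        "amount_30d": amount_30d, "count_30d": count_30d,
        "country_mismatch": new_tx["shipping_country"] != new_tx["billing_country"],
        # Assumption: countries never seen before get a risk of 0.0.
        "billing_country_risk": risks.get(new_tx["billing_country"], 0.0),
    }

risks = risk_table(history, "billing_country")
new_tx = tx("A1", datetime(2018, 5, 21, 12, 0), 1000.0, "IT", "DE")
features = build_features(new_tx, history, risks)
print(features)
```

In practice the risk table would be computed on a training period only, so that the label of the transaction being scored never leaks into its own features.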
Anomaly Detection
• Anomaly Detection encompasses many important tasks in Machine
Learning:
• Identifying transactions that are potentially fraudulent
• Learning patterns that indicate an intrusion has occurred
• Finding abnormal clusters of patients
• Checking values input to a system
• Azure Machine Learning supports:
• PCA-Based Anomaly Detection → Unsupervised Learning
• One-Class Support Vector Machine → Supervised Learning (trained on the "normal" class only)
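The idea behind PCA-based anomaly detection, as the editor's notes describe it, is to learn the principal components of "normal" data and score each new input by its reconstruction error. The sketch below is a simplified stand-in, not the Azure Machine Learning module: it keeps a single component found by power iteration, uses the raw (not normalized) squared error as the score, and runs on made-up two-dimensional data. The function names are hypothetical.

```python
import math

def fit_pca_1d(rows, iterations=100):
    """Fit a one-component PCA: return (mean, unit direction of largest variance)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the leading eigenvector of the covariance.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def anomaly_score(x, mean, direction):
    """Squared reconstruction error after projecting x onto the component."""
    c = [xi - mi for xi, mi in zip(x, mean)]
    t = sum(ci * vi for ci, vi in zip(c, direction))  # coordinate on the component
    return sum((ci - t * vi) ** 2 for ci, vi in zip(c, direction))

# Hypothetical "normal" data lying close to the line y = 2x.
normal = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.0], [4.0, 8.1], [5.0, 9.9], [6.0, 12.0]]
mean, direction = fit_pca_1d(normal)

typical = anomaly_score([3.5, 7.0], mean, direction)
outlier = anomaly_score([3.0, 12.0], mean, direction)
print(f"typical={typical:.4f} outlier={outlier:.4f}")
```

Only valid transactions are needed to fit the model, which is why this approach suits cases where fraud examples are scarce: the outlier stands out because it cannot be reconstructed from the pattern the normal data follows.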
Anomaly Detection in Azure ML
Credit Risk Prediction
Credit Risk
The problem
Predict an individual’s credit risk based on the information they gave on
a credit application.
The solution
A predictive binary classification model based on publicly available
credit risk data.
UCI Statlog (German Credit Data) Data Set
from the UC Irvine Machine Learning repository
http://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data)
Dataset
• 20 variables (features) for 1000 past applicants for credit
• Financial information, credit history, employment status and personal
information
• Applicant's calculated credit risk
• 700 applicants identified as a low credit risk
• 300 applicants identified as a high credit risk
Model Training
Two-Class Boosted Decision Tree
• Supervised learning method → Classification model
• The second tree corrects for the errors of the first tree, the third tree
corrects for the errors of the first and second trees, and so forth…
• Predictions are based on the entire ensemble of trees together
• Produces better results when features are somewhat related
• Memory-intensive → holds everything in memory
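The way each tree corrects the errors of the trees before it can be sketched with a tiny boosting loop. This is a simplified illustration, not the Azure ML Two-Class Boosted Decision Tree module: it assumes depth-one trees (stumps), squared-error residuals, and a made-up dataset of eight applicants described by loan duration in months and credit amount in thousands. All names and values are hypothetical.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that best fits the residuals.

    Assumes at least one feature takes more than one value.
    """
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            error = (sum((r - lmean) ** 2 for r in left)
                     + sum((r - rmean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, lmean, rmean)
    return best[1:]

def stump_predict(stump, row):
    j, threshold, lmean, rmean = stump
    return lmean if row[j] <= threshold else rmean

def fit_boosted(X, y, n_trees=10, learning_rate=0.5):
    """Each new stump is fitted to the errors left by the ensemble so far."""
    base = sum(y) / len(y)
    scores = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - si for yi, si in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_predict(stump, row)
                  for s, row in zip(scores, X)]
    return base, stumps, learning_rate

def predict(model, row):
    """Sum the whole ensemble and threshold the score at 0.5."""
    base, stumps, learning_rate = model
    score = base + learning_rate * sum(stump_predict(s, row) for s in stumps)
    return 1 if score >= 0.5 else 0

# Hypothetical applicants: [duration in months, amount in thousands]; 1 = high risk.
X = [[6, 1], [12, 2], [12, 8], [24, 3], [36, 2], [36, 9], [48, 7], [48, 10]]
y = [0, 0, 0, 0, 0, 1, 1, 1]

model = fit_boosted(X, y)
print([predict(model, row) for row in X])
```

No single split on duration or amount separates the two classes here, so one stump alone misclassifies an applicant; it is the ensemble that gets every training example right.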
Model Evaluation
Credit Risk Prediction in Azure ML
Thank you!
@stefanotempesta
/in/stefanotempesta

Expert Network - Financial Predictions with Machine Learning


Editor's Notes

  1. The traditional approach to this problem is to use rules or logic statements to query transactions and to route suspicious transactions to human review. While there is some variation, over 90 percent of online fraud detection platforms still use this method, including platforms used by banks and payment gateways. This is effective to some degree where there is a sufficient gap between an order being received and goods being shipped, but it is also very costly and far slower than the alternatives. The "rules" in these platforms use a combination of data, horizon-scanning and gut-feel. The system is backed by manual reviews to confirm experts' decisions. Take the recent reports of an abundance of Turkish credit cards available on the darknet following the publicized data breach in Turkey: businesses recognize the increased risk that Turkish cards are fraudulent and can simply add a rule to review any transaction from a Turkish credit card. Following this, every attempted purchase made with such a card raises an alert and is declined or reviewed. However, this raises two significant issues. First, such a generalized rule may turn away millions of legitimate customers, ultimately losing the business money and jeopardizing customer relations. Second, while the rule can deter future threats after such fraud has been found, it fails to identify or predict potential threats that businesses are not aware of. These rules tend to produce binary results, deeming transactions either good or bad and failing to consider anything in between. And until the rules are manually reviewed, the system will continue to block transactions such as those from Turkish credit cards, even if the risk or threat is no longer prominent.
  2. Machine learning works on the basis of large, historical datasets that have been created by collecting data across many clients and industries. Even companies that process a relatively small number of transactions can take full advantage of the datasets for their vertical, allowing them to get accurate decisions on each transaction. This aggregation of data provides a highly accurate set of training data, and access to this information allows businesses to choose the right model to optimize the levels of recall and precision: out of all the transactions that actually are fraudulent, what proportion does the model catch (recall), and out of all the transactions the model predicts to be fraudulent, what proportion actually are (precision)? Within the datasets, features are constructed. These are data points such as the age and value of the customer account, as well as the origin of the credit card. There can be hundreds of features, and each contributes, to a varying extent, towards the fraud probability. Note that the degree to which each feature contributes to the fraud score is not determined by a fraud analyst, but is learned by the model from the training set. So, for the Turkish card fraud example, if the use of Turkish cards to commit fraud is proven to be high, the fraud weighting of a transaction that uses a Turkish credit card will be equally high. If that use were to diminish, the contribution level would diminish in parallel. Simply put, these models self-learn without the explicit programming that manually reviewed rules require.
  3. To enhance the predictive power of the ML algorithms, an important step is feature engineering, where additional features are created from raw data based on domain knowledge. For example, if an account has not made a big purchase in the last month, then a sudden thousand-dollar transaction could be suspicious. The new features generated in this scenario include: aggregated variables, such as aggregated transaction amount per account and aggregated transaction count per account in the last 24 hours and the last 30 days; mismatch variables, such as a mismatch between shippingCountry and billingCountry, which potentially indicates abnormal behavior; and risk tables, where fraud risks are calculated using historical probability grouped by country, state, IPAddress, etc.
  4. The PCA-Based Anomaly Detection module is designed for use in scenarios where it is easy to obtain training data from one class, such as valid transactions, but difficult to obtain sufficient samples of the targeted anomalies. For example, if you need to detect fraudulent transactions, you might not have enough examples of fraud to train the model, but have many examples of good transactions. With the PCA-Based Anomaly Detection module, you can train the model using the available features to determine what constitutes a "normal" class, and then use distance metrics to identify cases that represent anomalies. Principal Component Analysis, frequently abbreviated to PCA, is an established technique in machine learning that can be applied to feature selection and classification. PCA is frequently used in exploratory data analysis because it reveals the inner structure of the data and explains the variance in the data. PCA works by analyzing data that contains multiple variables, all possibly correlated, and determining some combination of values that best captures the differences in outcomes. It then outputs these combinations as a new set of values called principal components. In the case of anomaly detection, for each new input, the anomaly detector first computes its projection on the eigenvectors, and then computes the normalized reconstruction error. This normalized error is the anomaly score: the higher the error, the more anomalous the instance. Alternatively, you can use the One-Class Support Vector Machine module to create an anomaly detection model. This module is likewise useful in scenarios where you have a lot of "normal" data and not many cases of the anomalies you are trying to detect, so that a typical classification model cannot be trained on examples of fraud.
  5. Column names: Status of checking account, Duration in months, Credit history, Purpose, Credit amount, Savings account/bond, Present employment since, Installment rate in percentage of disposable income, Personal status and sex, Other debtors, Present residence since, Property, Age in years, Other installment plans, Housing, Number of existing credits, Job, Number of people providing maintenance for, Telephone, Foreign worker, Credit risk.
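
The PCA-based anomaly score described in note 4 (project onto the eigenvectors, then measure the reconstruction error) and the precision and recall mentioned in note 2 can be illustrated with a minimal NumPy sketch. This is not the Azure Machine Learning module itself. The synthetic data, the function names, the 99th-percentile threshold, and the choice to normalize the error by the number of features are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X_normal, n_components):
    """Learn the mean and top principal directions from normal-only data."""
    mean = X_normal.mean(axis=0)
    # Rows of vt are the eigenvectors of the covariance matrix,
    # ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(X_normal - mean, full_matrices=False)
    return mean, vt[:n_components]

def anomaly_score(X, mean, components):
    """Reconstruction error after projecting onto the principal directions."""
    centered = X - mean
    projected = centered @ components.T      # coordinates on the eigenvectors
    reconstructed = projected @ components   # back in the original feature space
    # Assumption: normalize by the number of features (mean squared error).
    return np.sum((centered - reconstructed) ** 2, axis=1) / X.shape[1]

def precision_recall(y_true, y_pred):
    """Precision and recall for boolean label and prediction arrays."""
    tp = np.sum(y_pred & y_true)
    fp = np.sum(y_pred & ~y_true)
    fn = np.sum(~y_pred & y_true)
    precision = float(tp / (tp + fp)) if tp + fp else 0.0
    recall = float(tp / (tp + fn)) if tp + fn else 0.0
    return precision, recall

# Hypothetical synthetic data: "normal" rows lie near a 2-D plane in 5-D space.
rng = np.random.default_rng(0)
W = np.array([[1.0, 0.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0, -1.0, 1.0]])
off_plane = np.array([1.0, 0.0, -1.0, 0.0, 1.0])  # orthogonal to both rows of W

def make_normal(n):
    return rng.normal(size=(n, 2)) @ W + 0.05 * rng.normal(size=(n, 5))

# Train on normal transactions only; the test set mixes in 10 anomalies
# that are pushed away from the plane.
X_train = make_normal(500)
X_test = np.vstack([make_normal(200), make_normal(10) + 5.0 * off_plane])
y_test = np.array([False] * 200 + [True] * 10)

mean, components = fit_pca(X_train, n_components=2)
# Assumed cutoff: flag anything scoring above 99% of the training data.
threshold = np.quantile(anomaly_score(X_train, mean, components), 0.99)
scores = anomaly_score(X_test, mean, components)
y_pred = scores > threshold

precision, recall = precision_recall(y_test, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Moving the threshold trades one metric for the other: a lower cutoff catches more fraud (higher recall) but flags more legitimate transactions (lower precision).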