To begin, the talk will briefly discuss how Rippleshot uses payment card transaction data and fraud records to trace back in time where and when cards were stolen by hackers. By analyzing common points of purchase, Rippleshot can tell issuer partners which locations have been breached, which cards are most likely to be used fraudulently in the near term and should therefore be reissued, and which transactions should be declined.
In addition, the presentation will showcase a technique Rippleshot uses extensively in its modeling: instead of feeding primary variables such as postal codes directly into models, it uses risk indices derived from those variables, which future-proofs the models.
For example, fraudsters might be working out of a postal code in Florida right now, but they will move soon in response to law enforcement or issuer declines. When that happens, a model that uses the postal code directly would quickly fail, requiring an expensive rebuild and quite possibly incurring costly model failures in the meantime.
Instead, Rippleshot's experts collect the fraud rates for each postal code and feed those into the model in place of the postal code itself. The fraud-rate table can then be updated continuously, without retraining the entire model nearly as often. This makes for a far more robust response to dynamic fraud behavior.
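To make the indirection concrete, here is a minimal sketch of the pattern in Python. It is purely illustrative: the field names, table layout, and featurize step are assumptions, not Rippleshot's actual pipeline. The model consumes only the risk index, so the lookup table can be refreshed on new fraud reports without retraining.

```python
from collections import defaultdict

def build_fraud_rate_table(transactions, default_rate=0.0):
    """Map each postal code to its observed fraud rate."""
    totals, frauds = defaultdict(int), defaultdict(int)
    for txn in transactions:
        totals[txn["postal_code"]] += 1
        frauds[txn["postal_code"]] += txn["is_fraud"]
    rates = {zc: frauds[zc] / totals[zc] for zc in totals}
    return defaultdict(lambda: default_rate, rates)  # unseen codes get a default

def featurize(txn, fraud_rate_table):
    """The model sees the postal code's risk index, never the code itself."""
    return [txn["amount"], fraud_rate_table[txn["postal_code"]]]

history = [
    {"postal_code": "33101", "amount": 120.0, "is_fraud": 1},
    {"postal_code": "33101", "amount": 35.0, "is_fraud": 0},
    {"postal_code": "60601", "amount": 80.0, "is_fraud": 0},
]
rates = build_fraud_rate_table(history)  # rebuilt continuously from fresh data
print(featurize({"postal_code": "33101", "amount": 50.0}, rates))  # [50.0, 0.5]
```

When fraud migrates to a new postal code, only the table changes; the trained model, which reads just the rate, keeps working.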
This document discusses methods for detecting click fraud through traffic mix adjustment algorithms. It provides two case studies where these algorithms were effective at uncovering click fraud scams. In the first case, a botnet launched a sophisticated attack against multiple advertisers. The mix adjustment algorithm revealed which advertisers were seeing valuable traffic versus those being inundated with fake clicks from the botnet. In the second case, an affiliate was using a conversion fraud scam to artificially inflate their reported conversion rates. Again, the mix adjustment algorithm accurately revealed the true conversion rates by analyzing traffic patterns elsewhere. The document concludes with a reference to a research paper on using mix adjustment to detect click fraud botnets.
BDW Chicago 2016 - Don DeLoach, CEO and President, Infobright - Rethinking Ar... (Big Data Week)
Analysts and industry projections alike suggest that the rate of data volume growth continues to accelerate. While it may seem we are bursting at the seams now, projections suggest we are on the cusp of hitting the wall with current architectural models, with no end to growth in sight. When gaining insight from the data requires one or more complex queries, simply applying more hardware and more people becomes unfeasible.
In this presentation, Infobright CEO Don DeLoach will discuss how high-value approximation can deliver insight equivalent to exact queries while overcoming the prohibitive time and cost of continuing with traditional models.
Rethinking the problem using statistical metadata offers a compelling opportunity to overcome the mounting scale barriers by drastically reducing the resource requirements and query times to enable previously unattainable opportunities.
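As a toy illustration of the statistical-metadata idea (an assumed scheme for exposition, not Infobright's actual engine), a column store can keep a few summary statistics per block and answer range queries from those summaries alone, never touching the raw rows:

```python
from dataclasses import dataclass

@dataclass
class BlockStats:
    count: int    # rows in the block
    min_v: float  # smallest value in the block
    max_v: float  # largest value in the block

def approx_count_in_range(blocks, lo, hi):
    """Estimate how many values fall in [lo, hi] using metadata only."""
    est = 0.0
    for b in blocks:
        if b.max_v < lo or b.min_v > hi:
            continue                            # block cannot contain matches
        elif lo <= b.min_v and b.max_v <= hi:
            est += b.count                      # fully covered block: exact
        else:
            width = (b.max_v - b.min_v) or 1.0
            overlap = max(0.0, min(hi, b.max_v) - max(lo, b.min_v))
            est += b.count * (overlap / width)  # partial block: uniform estimate
    return est

blocks = [BlockStats(65_536, 0.0, 99.0), BlockStats(65_536, 100.0, 499.0)]
print(approx_count_in_range(blocks, 0.0, 250.0))  # one exact + one estimated block
```

Only blocks that straddle the query boundary introduce any error, which is why summaries this small can stand in for full scans at large scale.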
BDW Chicago 2016 - Jessica Freaner, Data Scientist, Datascope Analytics - You... (Big Data Week)
The document discusses the history and development of artificial intelligence over several decades. Early research focused on symbolic approaches using rules and logic but progress was slow. More recently, machine learning techniques such as deep learning have seen increasing success by learning from large amounts of data without being explicitly programmed. These new approaches are being applied to many areas and fueling a new wave of innovation and development in AI.
BDW Chicago 2016 - Chris Gladwin, Founder Cleversafe & Ocient - How Big Is D... (Big Data Week)
This document discusses the rapid growth of data and the challenges posed by increasingly large data volumes. It notes that data is growing exponentially as technology improves, driven by factors like genome sequencing, medical imaging, video, and sensors in devices. The amount of enterprise data is doubling every two years, and will reach 20 exabytes of structured data stored annually by 2017. However, 90% of this data may never be accessed. Analyzing such enormous datasets, with potential queries in the billions per second, requires advanced computing techniques like machine learning to handle the scale.
BDW Chicago 2016 - John K. Thompson, GM for Advanced Analytics Dell Statisti... (Big Data Week)
It’s no secret that there’s a shortage of traditional data scientists. They’re hard to find, and even harder to afford when you do find them. And even if you can, you’ll still never feel like you have enough of them. That’s why the rise of the citizen data scientist is so critical to the ongoing analytics revolution. These non-technical but supremely ambitious line-of-business employees represent the future of analytics. Now, and for the foreseeable future, citizen data scientists will be the driving force behind the use of analytics to drive innovation.
Empowering them with the right tools is thus paramount to the long-term success of analytics, and collective intelligence holds the key to that empowerment. In this in-depth session, John K. Thompson, GM, Dell Statistica, will examine the concept of collective intelligence as it relates to analytics, and explain how organizations lacking the skills to build the right analytical models themselves can now leverage the work of those who do have the necessary skills – all without having to hire those experts directly.
FROM WHEELBARROWS TO MACBETH – BEHAVIOUR MODELLING FOR PUBLISHERS - MARTIN G... (Big Data Week)
Martin is VP of Data Science at Skimlinks, leading on large-scale machine learning for natural language processing and the modelling of consumer behaviour. He previously worked as a statistician at the University of Oxford, where he conducted research into the genetics of personality. As head of research at Qubit, he also built predictive models of behaviour for online personalisation.
Emma is the Commercial Data Manager for the Evening Standard, Independent, i100 and London Live websites. This summer the team won the AOP award for Best Use of Data for their innovative approach to using data-insight to shape commercial campaigns. Prior to ESI, Emma worked at some of the largest international media owners, including Yahoo! and the commercial arm of the BBC; providing data insights to inform, execute and evaluate advertising campaigns for global clients.
The document is a diagram mapping the mobile advertising ecosystem, showing relationships between different entities like app developers, mobile ad networks, carriers, agencies, and data providers. It depicts how advertisers, agencies, and data companies interact with platforms, tools, and each other across the mobile marketing lifecycle.
KDD Analytics provides expertise in marketing predictive analytics and insightful dashboards for management striving for better data-driven solutions. KDD is pioneering the use of ai-one's Analyst Toolbox for converting unstructured text documents into exciting BI visualizations.
The document describes the Monte Carlo method for modeling risk and uncertainty in complex systems. It was developed in the 1940s by scientists at Los Alamos National Laboratory to simulate neutron diffusion for nuclear weapons design. The method uses random numbers and distributions to generate multiple scenarios and outcomes. It captures variability that traditional single- or multi-point estimates cannot. The document provides an example of using Monte Carlo simulation to evaluate the financial risks and returns of a potential business acquisition under different leverage and exit multiple scenarios.
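For readers unfamiliar with the method, a toy Monte Carlo run in Python looks like the following; the distributions and figures are invented for illustration and are not taken from the document's acquisition model:

```python
import random

def simulate_deal():
    """One random scenario for the acquisition's equity value at exit."""
    ebitda = random.gauss(100.0, 15.0)         # uncertain operating earnings
    exit_multiple = random.uniform(6.0, 10.0)  # uncertain exit multiple
    debt = 400.0                               # fixed leverage assumption
    return ebitda * exit_multiple - debt

outcomes = sorted(simulate_deal() for _ in range(100_000))
print(f"median equity value: {outcomes[50_000]:.0f}")
print(f"5th-percentile downside: {outcomes[5_000]:.0f}")
```

The percentiles of the simulated distribution capture exactly the variability that a single-point estimate hides.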
The document discusses LoadCentral, a distribution solution that allows retailers to dispense electronic prepaid credits via SMS or web interface without carrying physical inventory. It then describes how a user can become an ALLKONTAK dealer by purchasing a dealer package for 3,999 PHP. As a dealer, they can register unlimited retailers and earn commissions of up to 5% per transaction or load consumed by their retailers. The document provides examples of how dealers can earn substantial income through commissions and overrides by growing their sales force network.
This document summarizes an audit fraud risk analysis project conducted by a group of students. They analyzed data from 776 government firms to develop predictive models for identifying financial fraud risk. Their key steps included visualizing the data to understand predictors, creating and testing three classification models (decision tree, random forest, k-NN), and concluding that the random forest model best identified fraudulent cases with high accuracy and recall.
Tangerine Concepts is considering launching the MasterCard MyCard to expand its market influence. The card would target those with no or poor credit histories. An analysis found that 2 million Canadians fit this profile, with 384,000 in the target Golden Horseshoe region. For the distribution strategy to break even, it would need an 11.6% market share, or 44,773 consumers. However, given high competition and a lack of competitive advantage, the decision was made not to launch MyCard.
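The break-even arithmetic in that summary checks out (a quick verification using its own figures):

```python
target_population = 384_000    # Golden Horseshoe consumers fitting the profile
break_even_customers = 44_773
share = break_even_customers / target_population
print(f"{share:.2%}")  # 11.66%, matching the stated 11.6% share to rounding
```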
The document discusses online retail payment fraud and risk management. It notes that online credit card fraud costs the industry $8.53 billion annually, with 4% of revenue lost to fraud. Common fraud prevention signals like CVC numbers and IP addresses can be inaccurate or easily spoofed. The document recommends preventive measures for merchants like velocity checks, positive and negative customer lists, and new technologies for device identification to help stop fraud and reduce chargebacks. It promotes a risk management solution that uses device identification, rules, and customizable reports to help identify fraudulent transactions.
Small Business Adoption of EMV Technology (Intuit Inc.)
Intuit surveyed small business owners to get their perspectives on EMV technology and the upcoming liability shift.
The survey data is based on an online multiple-choice questionnaire, administered to 504 U.S. small businesses, at the owner or manager level, with 1-100 employees. The survey was fielded by Ebiquity from April 22-27, 2015; all respondents accept credit cards through mobile swipers and/or physical point of sale terminals.
The document provides guidance for small merchants on protecting payment card data. It discusses understanding risks to payment card data, protecting business with basic security measures, and where to get help. The security basics are organized from easiest to most complex to implement, and include using strong unique passwords, protecting and limiting storage of card data, inspecting payment terminals for tampering, installing software updates, using trusted partners, and more. The goal is to help small businesses start with basic steps to enhance data security.
In Accounts Payable and Procurement departments, generating millions in early payment discounts is possible--but it isn't easy.
PayStream's 2014 AP & Working Capital report uncovered the latest research from large corporations that have already implemented eInvoicing and Dynamic Discounting, or are interested in implementing them in the next 6 months. In this webinar, you'll learn the top metrics on:
1. Why companies are missing discount opportunities
2. Top concerns with dynamic discounting
3. Companies' main benefits of ePayments
The document describes the marketing intelligence services of Eleventy Group, including their large consumer databases containing demographic, behavioral and lifestyle information. They offer profiling, predictive modeling, and data enhancement services to help clients better target, segment, and communicate with audiences. Case studies show how these services helped clients improve targeting, test different audience characteristics, and enhance fundraising appeals.
Designing for Financial Inclusion - Sending Money Home (Gabriel White)
Most people in the world do not have a bank account, let alone use any kind of formal financial services. Over recent years, more efforts have been made to extend financial services to the poor to increase financial stability and improve livelihoods.
What does it mean to design tools to support financial inclusion? How do you design for people who are not familiar with financial concepts? Or have difficulty reading?
Using real project examples in Myanmar, Pakistan, Nigeria and Ghana, Gabriel will highlight the considerations that are important in designing for financial services in developing countries.
Paul Accinno – Traditional vs Digital Advertising (Sean Bradley)
During this session we will focus on 3 key areas of developing a marketing plan.
1. How to develop an integrated marketing budget.
2. How to evaluate traditional and digital media
3. How to create impact with your marketing plan
We find that many dealers struggle with these 3 key steps, mainly due to the constantly increasing media options they face. In step one we discuss the fundamentals of creating a media budget that is appropriate…
• For Your Dealership size
• For Your Brand
• For Your Location and Market Size
Next we discuss how to evaluate traditional and digital media from a tactical standpoint including…
• What is the Function of Each
• How to measure audience delivery
• How to measure conversions
Finally we determine the proper mix of…
• Brand Advertising
• Promotional Activity
• Lower Funnel Marketing
The document discusses fraud trends and challenges in e-commerce payments. It notes that fraud is becoming more organized and complex as criminals collaborate in international rings. New payment methods like gift cards and private label cards are being increasingly targeted. While new fraud prevention tools are constantly developed, each new tool also adds complexity for merchants balancing fraud risk with customer experience. Effective fraud prevention requires tracking key metrics and finding the right balance between automated decision systems and manual reviews.
Optimising Payments for Strong Customer Authentication (SCA) (Elliott Barton)
Strong Customer Authentication threatens to add friction to the checkout process. Stripe will discuss what this really means for app users and how retailers can prepare for the change.
Know your Fraudster: Preparing for the Post EMV Card-Not-Present Fraud (Noam Inbar)
Presented as a session at the NRF Protect 2015 conference.
With EMV migration coming up, many retailers mistakenly think that they should no longer worry about fraud. But the reality is that while EMV solves many crucial weaknesses at the point of sale, it does not help with Card Not Present transactions. In fact, since fraudsters will still try to make a living somehow, EMV migration might make things worse for online retailers. The good news is that while fraudsters are fast, so is technology. This presentation describes the changing behavior patterns of fraudsters and offers a practical guide to preparing for the upcoming post-EMV fraud tsunami, balancing loss prevention against user experience without letting the fear of fraud create a spike in false positives or result in an over-conservative policy.
Know Your Fraudster: Leveraging everything you've got to prepare for post-EMV... (Forter)
EMV is nearly here - which is great news for fighting card-present fraud. It's not such great news for online retailers, who will be facing more fraud from more fraudsters as theft attempts move online.
How do you prepare your company for this threat? Make sure you're checking off these 5 crucial steps.
Payments Pulse Survey: Small Business Edition (October 2019) (Payments Canada)
This year’s survey, which focuses on the payment trends, interests, and views of Canadian small businesses, revealed the majority of small businesses do fewer than 25 per cent of their transactions in cash but still feel obligated to accept it.
The 2019 Payments Pulse: Small Business Edition was undertaken by Leger and Payments Canada between September 17, 2019 and September 24, 2019. An online survey of 300 Canadian small business owners of companies with less than 499 employees was completed using Leger’s online panel. The margin of error for this study was +/- 5.6%, 19 times out of 20.
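That margin of error is consistent with the standard formula for a simple random sample; here is a quick check assuming worst-case p = 0.5 and 95% confidence (z = 1.96):

```python
import math

n = 300
moe = 1.96 * math.sqrt(0.5 * 0.5 / n)  # half-width of the 95% interval
print(f"+/- {moe:.2%}")  # +/- 5.66%, in line with the reported +/- 5.6%
```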
The document describes VMobile, a company that allows dealers to earn income by selling prepaid load products through a network of referrals. Dealers can purchase a starter package for 3,988 pesos and earn commissions on their own sales as well as those referred through their network. The document promotes VMobile's business model as a way for dealers to generate significant income through building a sales team.
CRiskCo provides a platform that bridges the information gap for small and medium enterprises seeking credit. Their platform allows lenders and vendors to seamlessly input credit data on businesses in a standardized format, allowing for real-time risk analysis and monitoring. This helps reduce errors and biases compared to traditional methods. They are seeking partners to pilot their platform and eventually offer it as a white label solution to existing markets through a software development kit licensing model.
BDWW17 London - Steve Bradbury, GRSC - Big Data to the Rescue: A Fraud Case S... (Big Data Week)
In 2003, three criminals were jailed for nine years following the largest Card Fraud Case in Europe with a publicised loss to Card Companies of £2.21 million.
Find out how they were caught back then and how Big Data Technologies would have brought them to justice more quickly.
Steve Bradbury was the Prime Investigator and Evidence Provider whose work led to the convictions, using data from Floppy Discs!
More Related Content
Similar to BDW Chicago 2016 - Randal Cox, Chief Scientist & Co-Founder, Rippleshot - Energizer Bunny Models: Variable Indirection for Eternally Robust Models
BDW17 London - Totte Harinen, Uber - Why Big Data Didn’t End Causal Inference (Big Data Week)
Ten years ago there were rumours of the death of causal inference. Big data was supposed to enable us to rely on purely correlational data to predict and control the world. In this talk, I argue that the rumours were greatly exaggerated. Causal inference is becoming increasingly relevant thanks to improvements in inference methods and, ironically, the availability of data. Far from becoming marginalised, causal inference is today more relevant than it’s ever been.
BDW17 London - Rita Simoes, Boehringer Ingelheim - Big Data in Pharma: Sittin... (Big Data Week)
As far as data is concerned, pharmaceutical companies have always been clear-sighted and assertive about what insights to extract, how to extract them, and what to do with them. Then the Big Data Era came in with its frantic pace, transforming multiple industries all around but, for a number of reasons (privacy and data protection issues chief among them), leaving the Pharma Industry behind. How to run the extra mile to keep up with the powerful changes Big Data brings along has become a major concern. Strategic opportunities seem to be around the corner. Is the time to bridge gaps finally here?
BDW17 London - Mick Ridley, Exterion Media & Dale Campbell, TfL - Transformi... (Big Data Week)
Hello London, the ground-breaking media partnership between Transport for London (TfL) and Exterion Media, gives new opportunities for brands to talk to the London audience in innovative ways and generates vital revenue for London’s transport network.
TfL and Exterion have been working together in the Hello London partnership for a year. Part of the collaboration was around the utilisation of data collected by TfL to better inform advertising investment decisions.
This has led to ground-breaking work in the Out-of-Home advertising sector and the first example of this is Taps Segmentation. Developed by the TfL Data Science team, it allows Exterion to understand demographic patterns at stations based on aggregated contactless and Oyster card usage. This de-personalised data can be analysed for different times of the day and is a game changer – allowing Exterion to rethink how both their classic and digital inventory can be packaged and tailored specifically for clients.
The presentation will cover how TfL and Exterion have collaborated, the approach used by TfL and how Exterion are using it to generate revenue which is reinvested in the transport network.
BDW17 London - Abed Ajraou - First Utility - Putting Data Science in your Bus... (Big Data Week)
Data Science is now well established in our businesses, and everyone considers data a key asset, critical to competitiveness.
However, Data Science is not easy to manage; very often projects fail and the investment made is not seen as profitable.
The aim of this talk is to share knowledge in several areas:
* avoid classical mistakes in Data Science
* use the right Big Data technology
* apply the right methodology
* make the Data Science team more efficient
BDW17 London - Steve Bradbury - GRSC - Making Sense of the Chaos of Data (Big Data Week)
DISCOVER
UNDERSTAND
EVOLVE
Presenting a use case taking unstructured data into OCR, Entity Extraction, Case Management and simple to use Visualisations.
BDW17 London - Andy Boura - Thomson Reuters - Does Big Data Have to Mean Big ... (Big Data Week)
The document discusses some of the risks associated with big data, including the risk of data breaches getting more costly as data volumes and repositories increase. It notes that smaller breaches involving 10,000 to 100,000 records on average cost hundreds per record, while mega-breaches of millions of records can cost billions and be in the range of pounds per record. The main sources of risk are identified as user error, system glitches, and attacks, with malicious attacks being the costliest. It provides some recommendations around applying security controls like access management and automation while also considering dependencies and maintaining good data hygiene.
BDW17 London - Tom Woolrich, Financial Times - What Does Big Data Mean for th... (Big Data Week)
Content:
1. A brief history of the FT
2. What does Big Data mean to the FT?
3. The benefits of Big Data & how we use it
4. How we do it
5. What’s next for us?
BDW17 London - Andrew Fryer, Microsoft - Everybody Needs a Bit of Science in ... (Big Data Week)
Science is a way of thinking more than a body of knowledge. It involves asking why, how, and what questions. Artificial intelligence has advanced due to cloud computing, big data, and open source approaches which have enabled data-driven decision making and rapid learning from experiences. There are still issues around creativity, ethics, and replacing human experience with technologies.
BDW16 London - Alex Bordei, Bigstep - Building Data Labs in the Cloud (Big Data Week)
Building Data Labs in the Cloud summarizes how to build data labs in the cloud by connecting on-premise services through VPN or targeted firewalls, integrating identity services between on-premise and cloud realms, enabling single sign-on with two-factor authentication, using encryption with cloud or on-premise HSMs, leveraging Spark for data science, SQL, ETL, machine learning and graph processing, adopting a multi-context architecture for maintenance and efficiency, and ensuring real-time systems provide performance, stability, serviceability and fault tolerance.
BDW16 London - William Vambenepe, Google - 3rd Generation Data Platform (Big Data Week)
1. The document discusses Google Cloud's 3rd generation data platform and services for managing large-scale data and analytics workloads. It focuses on managed services that allow users to focus on insights rather than infrastructure maintenance.
2. The platform includes services for data ingestion, processing, storage and analytics including Cloud Pub/Sub, Dataflow, BigQuery, Dataproc, Bigtable and Cloud Storage. It aims to provide a serverless platform with auto-optimized usage and pay per use pricing model.
3. Over 15 years Google has developed technologies for tackling big data problems including papers, open source projects and cloud products. Core components of their data platform are discussed, including the Beam programming model and Dataflow for unified batch and streaming processing.
BDW16 London - Scott Krueger, skyscanner - Does More Data Mean Better Decisio... (Big Data Week)
We have seen vast improvements to data collection, storage, processing and transport in recent years. An increasing number of networked devices are emitting data and all of us are preparing to handle this wave of valuable data.
Have we, as data professionals, been too focused on the technical challenges and analytical results?
What about the data quality? Are we confident about it? How can we be sure we are making good decisions?
We need to revisit methods of assessing data quality on our modernized data platforms. The quality of our decision making depends on it.
BDW16 London - Nondas Sourlas, Bupa - Big Data in Healthcare (Big Data Week)
The document discusses Bupa's use of analytics in healthcare, including risk modelling and care management, and referral management. For risk modelling and care management, Bupa uses predictive modelling to identify high-risk patients for targeted outreach programs, which have led to reductions in outpatient visits, tests, and surgical procedures, saving 9-10% in care costs. For referral management, Bupa profiles over 18,000 consultants based on claims data to guide over 700,000 pre-authorizations, achieving estimated healthcare savings of 9-11% of guided spend.
BDW16 London - John Callan, Boxever - Data and Analytics - The Fuel Your Bran... (Big Data Week)
Unsuccessful marketing campaigns are leaving customers disgruntled, making them 40% less likely to return. Companies are casting aside useful data that can provide further insights into better products/better connections with customers. John Callan, VP of Marketing at Boxever will discuss how AI can change how businesses predict trends, reduce risks, and improve efficiency.
Audience will:
Gain expert-level understanding of data and machine learning that’s used in today’s market
Identify successful ways companies use machine learning to target customers with personalized content
Learn from major airlines' use cases how to skillfully target customers and show them exactly what they want to see.
BDW16 London - John Belchamber, Telefonica - New Data, New Strategies, New Op... (Big Data Week)
Through the experiences of supporting a Multi-Country roll out using data to drive more effective Network capability, we will explain how we have:
Created new internal capability to support local countries, developed skill sets in the country and provided technical infrastructure, algorithms and visualisations to drive the data culture and big data strategies across Telefonica business units.
Through this framework, we will explain how to blend technical and business needs to maximise the benefits and drive better business performance.
BDW16 London - Deenar Toraskar, Think Reactive - Fast Data Key to Efficient C...Big Data Week
The Basel Committee on Banking Supervision (BCBS) and local regulators have been focused on making banks safer and more resilient. A whole raft of new capital charges and constraints on liquidity and leverage have been introduced: Basel II.5, Basel III, Dodd-Frank, FRTB (“Basel IV”), etc. These have significantly increased the risk data management capabilities banks must have—capabilities that only big data tools can provide.
This talk will cover the challenges of building a position-aware risk management platform that properly aggregates all intra-day trading activity and monitors exposures and risk. The fast data stack can help banks create such a platform and provide a robust foundation to achieve compliance and, ultimately, a significant competitive edge by making efficient use of capital.
BDW16 London - Jonny Voon, Innovate UK - Smart Cities and the Buzz Word Bingo (Big Data Week)
With the United Nations predicting that 66 percent of the world population, including an extra 2.5 billion people, will be living in urban areas, our cities are getting extra attention. If we want to avoid dystopian megacities of the future, then we must begin the technology transformation in our cities now.
BDW16 London - Josh Partridge, Shazam - How Labels, Radio Stations and Brand... (Big Data Week)
“At Shazam, we think data can be beautiful and stunningly inspiring. The pictures we paint with our data tell stories about changing culture, tastes, and shared discoveries. A truly great new song can sweep across the globe in a wave of Shazams that transcends politics, language, or religion”, Greg Glanday, Chief Revenue Officer at Shazam.
This presentation will offer the audience a few examples of how they can use the data from Shazam to get fantastic insight into consumers' preferences, and how to take that insight and apply it to a brand.
Through 3 or 4 great examples of what we do at Shazam, anyone in the audience can understand what this data means, really see this data and then be able to leverage it to make smart marketing decisions.
Main takeaway: a clear understanding of what Shazam data is and how brands can use it.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/how-axelera-ai-uses-digital-compute-in-memory-to-deliver-fast-and-energy-efficient-computer-vision-a-presentation-from-axelera-ai/
Bram Verhoef, Head of Machine Learning at Axelera AI, presents the “How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-efficient Computer Vision” tutorial at the May 2024 Embedded Vision Summit.
As artificial intelligence inference transitions from cloud environments to edge locations, computer vision applications achieve heightened responsiveness, reliability and privacy. This migration, however, introduces the challenge of operating within the stringent confines of resource constraints typical at the edge, including small form factors, low energy budgets and diminished memory and computational capacities. Axelera AI addresses these challenges through an innovative approach of performing digital computations within memory itself. This technique facilitates the realization of high-performance, energy-efficient and cost-effective computer vision capabilities at the thin and thick edge, extending the frontier of what is achievable with current technologies.
In this presentation, Verhoef unveils his company’s pioneering chip technology and demonstrates its capacity to deliver exceptional frames-per-second performance across a range of standard computer vision networks typical of applications in security, surveillance and the industrial sector. This shows that advanced computer vision can be accessible and efficient, even at the very edge of our technological ecosystem.
In the realm of cybersecurity, offensive security practices act as a critical shield. By simulating real-world attacks in a controlled environment, these techniques expose vulnerabilities before malicious actors can exploit them. This proactive approach allows manufacturers to identify and fix weaknesses, significantly enhancing system security.
This presentation delves into the development of a system designed to mimic Galileo's Open Service signal using software-defined radio (SDR) technology. We'll begin with a foundational overview of both Global Navigation Satellite Systems (GNSS) and the intricacies of digital signal processing.
The presentation culminates in a live demonstration. We'll showcase the manipulation of Galileo's Open Service pilot signal, simulating an attack on various software and hardware systems. This practical demonstration serves to highlight the potential consequences of unaddressed vulnerabilities, emphasizing the importance of offensive security practices in safeguarding critical infrastructure.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/temporal-event-neural-networks-a-more-efficient-alternative-to-the-transformer-a-presentation-from-brainchip/
Chris Jones, Director of Product Management at BrainChip , presents the “Temporal Event Neural Networks: A More Efficient Alternative to the Transformer” tutorial at the May 2024 Embedded Vision Summit.
The expansion of AI services necessitates enhanced computational capabilities on edge devices. Temporal Event Neural Networks (TENNs), developed by BrainChip, represent a novel and highly efficient state-space network. TENNs demonstrate exceptional proficiency in handling multi-dimensional streaming data, facilitating advancements in object detection, action recognition, speech enhancement and language model/sequence generation. Through the utilization of polynomial-based continuous convolutions, TENNs streamline models, expedite training processes and significantly diminish memory requirements, achieving notable reductions of up to 50x in parameters and 5,000x in energy consumption compared to prevailing methodologies like transformers.
Integration with BrainChip’s Akida neuromorphic hardware IP further enhances TENNs’ capabilities, enabling the realization of highly capable, portable and passively cooled edge devices. This presentation delves into the technical innovations underlying TENNs, presents real-world benchmarks, and elucidates how this cutting-edge approach is positioned to revolutionize edge AI across diverse applications.
AppSec PNW: Android and iOS Application Security with MobSF (Ajin Abraham)
Mobile Security Framework - MobSF is a free and open source automated mobile application security testing environment designed to help security engineers, researchers, developers, and penetration testers to identify security vulnerabilities, malicious behaviours and privacy concerns in mobile applications using static and dynamic analysis. It supports all the popular mobile application binaries and source code formats built for Android and iOS devices. In addition to automated security assessment, it also offers an interactive testing environment to build and execute scenario based test/fuzz cases against the application.
This talk covers:
Using MobSF for static analysis of mobile applications.
Interactive dynamic security assessment of Android and iOS applications.
Solving Mobile app CTF challenges.
Reverse engineering and runtime analysis of Mobile malware.
How to shift left and integrate MobSF/mobsfscan SAST and DAST in your build pipeline.
Taking AI to the Next Level in Manufacturing.pdf (ssuserfac0301)
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
5. Ideas and approaches to help build your organization's AI strategy.
How information systems are built or acquired puts information, which is what they should be about, in a secondary place. Our language adapted accordingly, and we no longer talk about information systems but applications. Applications evolved in a way that breaks data into diverse fragments, tightly coupled with applications and expensive to integrate. The result is technical debt, which is repaid by taking even bigger "loans", resulting in an ever-increasing technical debt. Software engineering and procurement practices work in sync with market forces to maintain this trend. This talk demonstrates how natural this situation is. The question is: can something be done to reverse the trend?
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency (ScyllaDB)
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf (Chart Kalyan)
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application... (Alex Pruden)
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
"Choosing proper type of scaling", Olena SyrotaFwdays
Imagine an IoT processing system that is already quite mature and production-ready, whose client coverage is growing, and for which scaling and performance are life-and-death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, we will first analyze scaling approaches and then select the proper ones for our system.
5th LF Energy Power Grid Model Meet-up Slides (DanBrown980551)
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Microsoft Teams session or in person at TU/e, located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
-Insightful presentations covering two practical applications of the Power Grid Model.
-An update on the latest advancements in Power Grid Model technology during the first and second quarters of 2024.
-An interactive brainstorming session to discuss and propose new feature requests.
-An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, a free SAP software asset management tool for customers.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
Discover top-tier mobile app development services, offering innovative solutions for iOS and Android. Enhance your business with custom, user-friendly mobile applications.
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsDianaGray10
Join us to learn how UiPath Apps can directly and easily interact with prebuilt connectors via Integration Service--including Salesforce, ServiceNow, Open GenAI, and more.
The best part is you can achieve this without building a custom workflow! Say goodbye to the hassle of using separate automations to call APIs. By seamlessly integrating within App Studio, you can now easily streamline your workflow, while gaining direct access to our Connector Catalog of popular applications.
We’ll discuss and demo the benefits of UiPath Apps and connectors including:
Creating a compelling user experience for any software, without the limitations of APIs.
Accelerating the app creation process, saving time and effort
Enjoying high-performance CRUD (create, read, update, delete) operations, for seamless data management.
Speakers:
Russell Alfeche, Technology Leader, RPA at qBotic and UiPath MVP
Charlie Greenberg, host
32. MCC Fraud Rates

Risky MCCs:
MCC    Fraud Auths        Fraud Rate
5734   57 of 7210         79.06 bp
5542   256 of 203053      12.61 bp
5732   25 of 3951         63.28 bp
6011   47 of 20565        22.85 bp
5691   24 of 4199         57.16 bp
5999   39 of 18494        21.09 bp
5967   16 of 1320         121.21 bp
5972   9 of 43            2093.02 bp
0000   20 of 6141         32.57 bp
5964   15 of 2584         58.05 bp
5651   27 of 12836        21.03 bp

Safe MCCs:
MCC    Fraud Auths        Fraud Rate
9399   4 of 11786         3.39 bp
5921   7 of 20044         3.49 bp
4121   5 of 16751         2.98 bp
7230   3 of 11866         2.53 bp
5912   27 of 61428        4.40 bp
5411   134 of 255195      5.25 bp
4784   0 of 7360          0.00 bp
5812   55 of 128521       4.28 bp
7832   0 of 9076          0.00 bp
7841   1 of 10970         0.91 bp
33. Bulk: If Fraud Score > 600, Decline
LOW RISK: If Fraud Score > 900 and MCC == 9399, Decline
HIGH RISK: If Fraud Score > 200 and MCC == 5734, Decline
34. Can We Use the Fraud Rate Instead?
LOW RISK: If Fraud Score > 900 and MCC Fraud < 0.02%, Decline
HIGH RISK: If Fraud Score > 200 and MCC Fraud > 10%, Decline
Machine learning models are everywhere, doing everything. I’m guessing they are transforming the businesses of everyone here, right?
And these models work great.
Until they don’t.
No matter how good your model, pretty soon the real world works its way in and the model starts to tank.
That’s expensive, since a good model takes a bunch of your time and attention to build.
I’m Randal Cox, the chief scientist and co-founder at Rippleshot.
I want to talk today about ways to keep your models running a lot longer – for years even.
About us.
Rippleshot detects payment card data breaches, like Target or Home Depot.
We trace fraudulent purchases back in time
… to where those cards all visited the same location. That’s where the card was stolen.
Think of it like tracing food poisoning back to the greasy spoon.
Rippleshot builds a lot of machine learning models
We predict which CARDS are going to be used fraudulently soon, based on where and how they shop.
We make models that predict if a store is likely to be breached soon.
And in real-time, we build payment card decline rules to stop fraud spends right NOW.
Let’s consider the model that’s most important to card issuers.
This model stops suspicious transactions in real time, before the bank or merchant incurs any loss at all. That is very hard and really important to get right.
Let’s look at a concrete example.
We are a decision tree shop. You’re probably using tree-like rules all the time.
In this example, a payment far from home at a gas station is likely to be fraud, though even more likely on a weekday. Some nearby states with big dollar purchases are also risky.
That said, all of what I’m saying today is equally applicable to other modeling techniques like neural nets.
The reason I’m up here is I’ve been asked to share some of my big data insights. I’ve only got two, really. So this should be a short talk.
- I have some techniques for filling out feeble data
- and a way to use those variables indirectly that makes your models last longer
Models are only as good as their data, and sometimes the data is TERRIBLE
One of our clients gave us VERY LITTLE about each transaction.
It’s hard to make a great model out of that, so you’re going to have to augment this data.
Luckily we know something about fraudster behavior. Fraudsters do things card holders do not.
Basically:
- they often spend far from the consumer’s home.
- they shop at odd hours when there is less scrutiny.
- they like launderable goods, like big-screen TVs.
Let’s look at the where first. You and I usually shop close to our homes. But the fraudsters might not know where home is, or just don’t have a presence on the ground there. So,
Distance between home and the point of sale is incredibly predictive of fraud. It’s often my #1 variable in card-present models.
Distance is a little hard to compute.
You need clean country and postal codes for home and the POS. Then you need to look up the latitude and longitude of those postal codes, and then run some modestly complicated math to get the distance.
Luckily the lat/lons for all worldwide postal codes are available for free. And the Haversine formula is a Google search away.
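As a minimal sketch of that distance feature (the coordinate table here is a tiny illustration; in practice you’d load the full, freely available postal-code dataset, e.g. GeoNames):

    import math

    # Illustrative lookup: (country, postal code) -> (latitude, longitude).
    # In practice, load the full free GeoNames postal code table.
    POSTAL_COORDS = {
        ("US", "60601"): (41.8858, -87.6181),  # Chicago
        ("US", "33101"): (25.7743, -80.1937),  # Miami
    }

    def haversine_miles(lat1, lon1, lat2, lon2):
        # Great-circle distance between two points, in miles.
        lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
        a = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 3959 * 2 * math.asin(math.sqrt(a))  # Earth radius ~ 3959 miles

    def home_to_pos_distance(home, pos):
        # Distance feature: consumer's home postal code vs. the point of sale.
        return haversine_miles(*POSTAL_COORDS[home], *POSTAL_COORDS[pos])

    print(home_to_pos_distance(("US", "60601"), ("US", "33101")))  # ~1190 miles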
Distance was a big win, let’s look at time.
Here is the legitimate spend on a large cohort of cards. It’s almost like a pulse with a 1-week period and a 1-month automatic payments period.
More regular than my heartbeat.
But the fraud spend is often REALLY different.
Huge upswings over week-long periods and even in more regular periods, out of phase from the consumers.
The fraud signal is likely to be drowned out by legitimate Friday payments, for example. But the fraudsters are often busy on days when card holders are not.
Same thing with the time of day. Fraudsters seem to like the dark better than the sun.
So, we now have Day of Week and Hour of Day as new features.
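As a minimal sketch, assuming a pandas frame with a hypothetical auth_ts timestamp column, both features fall out directly:

    import pandas as pd

    # Hypothetical transaction frame with an authorization timestamp.
    txns = pd.DataFrame({"auth_ts": pd.to_datetime(
        ["2016-05-13 02:14:00", "2016-05-14 11:30:00"])})
    txns["day_of_week"] = txns["auth_ts"].dt.dayofweek  # 0 = Monday
    txns["hour_of_day"] = txns["auth_ts"].dt.hour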
The original data set included some things like states and postal codes and merchant types (groceries or gas stations)
A lot of modelers will use these variables DIRECTLY in the model like in that earlier example.
Don’t do that
There are two chief reasons for this: the problems of ordinality and of change.
Ordinality is just a fancy way of saying there are too many possible values for these variables to use directly.
If you feed any modeling tech a column with more than a million possible categories, it is going to barf. Like, game over, usually. If you’re lucky, it will just perform poorly.
There is another disadvantage here. Splitting on a large list like postal codes makes for HUGE rules. Some environments impose character limits on decline expressions. It would be much better to have some proxy to postal codes to make the expression shorter.
The other problem is change. The fraudsters know you are trying to catch them, so they change as fast as possible. Unfortunately, you usually don’t get the post-it-note about it.
If you model directly on the state or merchant category, you’re locked in until you can build the next model – and that might take months. Fraud is faster than that.
One way forward is to replace those primary variables – one layer of indirection.
Instead of a postal code, give the model the fraud rate at this postal code.
Here is a table of merchant categories. One MCC has an astounding 20% fraud rate in this data set. And another never has fraud.
How would you roll that information into a fraud model?
Let’s say the rest of your variables can be used to make a fraud score and you want to add this MCC data. For the data as a whole, you usually decline at a score of 600
You might be tempted to just decline more often in the risky MCC and maybe require a higher fraud score for the safe MCC.
But the fraudsters will move from 5734 to 5735 the week after you implement your model.
Better to use the fraud rate instead. Then you can update a table of fraud rates for all your MCCs and not change your model ITSELF.
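A minimal sketch of that indirection, reusing the slide-34 thresholds (the table values and the fallback rate here are hypothetical):

    # Hypothetical rolling table: MCC -> observed fraud rate,
    # refreshed on a schedule without touching the model itself.
    MCC_FRAUD_RATE = {"5734": 0.0079, "5972": 0.2093,
                      "5411": 0.000525, "9399": 0.000339}

    def decline(fraud_score: float, mcc: str) -> bool:
        rate = MCC_FRAUD_RATE.get(mcc, 0.0004)   # fall back to a global rate
        if fraud_score > 600:                    # bulk rule
            return True
        if fraud_score > 900 and rate < 0.0002:  # safe MCCs need a higher score
            return True
        if fraud_score > 200 and rate > 0.10:    # risky MCCs decline earlier
            return True
        return False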
So that’s a step up, but you need to be careful.
If the number of transactions is small, you can get a high fraud rate by chance.
Let’s say your real fraud rate is 25% - roll a 1 on a 4-sided die. But if you only have two records, you might get unlucky and roll two 1’s on the two dice. There is a 6% chance you get two ones and think your fraud rate is 100% - a huge error!
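For the record, the arithmetic behind that 6%:

    p_fraud = 0.25              # true fraud rate: a 1 on a 4-sided die
    p_all_fraud = p_fraud ** 2  # both of your two records come up fraud
    print(p_all_fraud)          # 0.0625 -> you'd estimate a 100% fraud rate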
There is a simple way around this.
If you hate math, close your eyes for the next slide
For everyone else, z-scores encapsulate how sure we are that our observed rate is different from background.
Really, we’re comparing the global rate (all MCCs) with the rate at this MCC.
So, if you’re running 4 bp in fraud overall, and this MCC is at 6 bp, just divide that 2 bp difference by the sum of the standard deviations for those two curves.
If the width of the curve is very large, then the z-score decreases a lot – you’re not so sure about the result. If the width is small (i.e., you have a large number of transactions), you get a small number in the denominator.
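A minimal sketch of that calculation. I’ve written the denominator as the combined standard error of the two binomial rate estimates; the talk describes the denominator loosely, so treat the exact form as an assumption:

    import math

    def rate_z_score(fraud, total, global_fraud, global_total):
        # How many standard errors this MCC's rate sits from the global rate.
        p_local = fraud / total
        p_global = global_fraud / global_total
        # Standard error of each binomial estimate; shrinks as counts grow.
        se_local = math.sqrt(p_local * (1 - p_local) / total)
        se_global = math.sqrt(p_global * (1 - p_global) / global_total)
        denom = math.sqrt(se_local ** 2 + se_global ** 2)
        return (p_local - p_global) / denom if denom else 0.0

    # MCC 5972 from the slide-32 table: 9 frauds in 43 auths, against a
    # hypothetical global book running at about 4 bp.
    print(rate_z_score(9, 43, 300, 750_000))  # ~3.4: very sure it's hot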
The math phobes can open their eyes now.
The bottom line is that a z-score above 3 means you can be very sure there is a lot of fraud going on at this MCC.
If the z-score is less than -3, then you can be sure that fraud is really avoiding this MCC.
So our tree model might pay a lot of attention to an MCC with a 10.7 z-score.
I set up z-score tables for lots of primary variables, and then discard the primary variables. I do the same for some of the added variables.
As fraud changes, update the z-scores – it’s like updating the model, but for less work. Usually, I keep a running fraud rate for, say, the POS state during the last month. I update this TABLE once a week.
There is another advantage here. Comparing against z-scores makes for more compact rule text.
Using this approach makes models last dramatically longer. Sure, lots of people will just retrain their models frequently, but that’s more work than updating a table.
Also, in my hands most modeling technologies actually perform better with z-scores even directly out of the gate. It’s easier to do numeric comparisons for splits than split by a bunch of categories. The cleaner split usually means better capture.
And this approach is not very hard. I keep a rolling calculator of the z-score for those frauds for, say, grocery stores over the last month.
Every two weeks, I update the table and leave the model alone.
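A minimal sketch of that refresh job (the record layout and window are hypothetical; it reuses rate_z_score from the earlier sketch):

    from datetime import datetime, timedelta

    def refresh_zscore_table(txns, now, window_days=30):
        # txns: hypothetical list of (timestamp, mcc, is_fraud) records.
        # Recomputes the per-MCC z-score table over a rolling window;
        # the model itself never changes -- only this table does.
        cutoff = now - timedelta(days=window_days)
        recent = [t for t in txns if t[0] >= cutoff]
        global_fraud = sum(t[2] for t in recent)
        global_total = len(recent)
        table = {}
        for mcc in {t[1] for t in recent}:
            rows = [t for t in recent if t[1] == mcc]
            table[mcc] = rate_z_score(sum(r[2] for r in rows), len(rows),
                                      global_fraud, global_total)
        return table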
So, in quick summary, we’ve
added space and time variables
removed specific features that might change
and replaced them with z-scores
The upshot is you get models that last years, not months.