Automated Retail Analytics: Omni-Channel and at Scale
William Komp
We are SmarterHQ
SmarterHQ is the leading multi-channel behavioral marketing
platform, empowering B2C marketers to personalize individual
customer interactions in real-time. We work with some of the
world’s largest brands, such as Bloomingdales, Santander Bank,
Carrentals.com and Finish Line, to drive phenomenal business
results. Forbes has recognized us as a technology pushing
B2C companies into a new era of personalization, and Forrester’s
Total Economic Impact study found that we deliver 667% ROI.
So let's build our models!
Easy enough: choose our favorite algorithm (in our case Logistic Regression, with an eye toward
eventual near-real-time scoring).
Model building and input-data filtering use standard deviation, correlation and
Lasso-LARS.
We use Python libraries (scikit-learn and pySQL libraries) to automate gathering
the data and delivering it to the server for model building!
This was all developed and perfected prior to Jan 2015 (a scant 6 months at
SmarterHQ).
Recently expanded to include Affinity Analysis for interaction-term building and
Product Recommendations.
So what is the problem? What have I not told you?
StoreFront Data Sources
• OMS (Retail)
• Products
• Digital Sources
StoreFront Building Blocks
Built on AWS
• EC2, Kinesis, Simple Queue Service (SQS), Lambda, S3, Glacier, Redshift
Data Gathering
Digital Sources:
• Tag a website, mobile app, etc. to collect:
  product views, customer ids, email addresses, products carted, products purchased, loyalty ids
• Streams to Redshift in as little as 5 minutes.
• Incremental batches run on Redshift in ~5 minutes, so data latency is as little as 10 minutes.
OMS:
• Daily feeds worked out with the client:
  customer ids, loyalty ids, products, order totals, email addresses, refunds, cancellations, shipping info
• Processed once a day.
Product:
• Product ids, client-based marketing categories
StoreFront Infrastructure Design
Properties:
• Modular in design
• Highly parallel
• Concurrent writing
• Processes are daemonized
• Python apps supporting infrastructure
A typical day for every customer:
• Web load (240x/day): web streaming → SQS → Kinesis → Lambda → S3 → Redshift
• OMS (1x/day): ETL from client → Informatica → S3 → Redshift
• Product feeds (1x/day): ETL from client → Informatica → S3 → Redshift
StoreFront Data Sources (revisited)
• Digital Sources: every 5 min
• OMS (Retail): 1/day
• Products: 1/day
Entities!
• Everyone has a definition of what a customer is! How do we represent that customer in the data
that we have? If I ask for all of the purchase information from customer X, how can I get it
reliably and quickly?
• Entities are data-driven constructs that are the data representation of a customer, location,
marketing campaign, etc.
• Defined by exact matching (really want to go to fuzzy land!) on:
email addresses, loyalty ids, order ids, customer names, other customer ids
• Require at least 2 pieces to match (except for web-only data, which yields email-based entities!)
Example: [figure]
Entity Mechanics
Build entities using graph theory:
• The set of all possible data elements to be linked is the vertex set.
• Use the data to build connections between vertices: these are the edges!
• The set of all connections between vertices is the edge set.
• Use a graph traversal algorithm (Breadth First Search or Depth First Search) to build out the graphs
(each connected graph is one entity).
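A minimal sketch of the graph step, assuming entities correspond to the connected components of the vertex/edge graph. The identifier strings and the function name `build_entities` are hypothetical, chosen only for illustration; this is not the production implementation.

```python
from collections import defaultdict, deque


def build_entities(edges, vertices=()):
    """Return the connected components of an undirected graph, found with BFS."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Include vertices that have no edges; they become single-vertex entities.
    all_vertices = set(vertices) | set(adjacency)
    seen = set()
    entities = []
    for start in sorted(all_vertices):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            vertex = queue.popleft()
            component.add(vertex)
            for neighbour in adjacency[vertex]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        entities.append(component)
    return entities


# Hypothetical identifiers, for illustration only.
edges = [
    ("email:sarah@example.com", "loyalty:L123"),
    ("loyalty:L123", "order:INV1215"),
]
entities = build_entities(edges, vertices=["email:bob@example.com"])
```

Here the three linked identifiers collapse into one entity, while the unlinked email address stays a single-vertex entity of its own.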
OMS:
1. Collect person identifier fields (name, email address, customer ids, order ids).
2. Parse the email field: use a regular expression based on the RFC 5322 standard to filter out
improperly formatted emails, and extract the email user id.
3. Exact match on at least 2 fields (common names and email user names make single-point
matches unreliable).
Could expand to 1-point matching by using a frequency analysis to allow 1-point matches only for
less common names or email addresses.
Digital:
Personal identifier fields (email address, order id, loyalty ids)
1. Exact match on at least two of order id, email address or loyalty id to the corresponding OMS entity.
2. Next, build digital email-based entities (1-point matches).
Entities with both OMS Retail and Digital vertices are CrossChannel Entities!
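The matching rules above can be sketched roughly as follows. This is an illustration under stated assumptions: the regular expression is a simplified stand-in that is much looser than full RFC 5322, the field names and sample records are hypothetical, and the pairwise comparison is quadratic, so at real scale this logic would more plausibly run as joins in the database.

```python
import re
from itertools import combinations

# Simplified pattern (an assumption): far looser than the full RFC 5322 grammar.
EMAIL_RE = re.compile(r"^([A-Za-z0-9._%+-]+)@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)$")

# Hypothetical identifier fields used for matching.
FIELDS = ("name", "email", "customer_id")


def email_user(email):
    """Return the user part of a well-formed email address, else None."""
    match = EMAIL_RE.match(email.strip().lower()) if email else None
    return match.group(1) if match else None


def clean(record):
    """Blank out improperly formatted emails and record the email user id."""
    cleaned = dict(record)
    user = email_user(record.get("email"))
    cleaned["email"] = record["email"].strip().lower() if user else None
    cleaned["email_user"] = user
    return cleaned


def link_records(records, fields=FIELDS, min_matches=2):
    """Return index pairs of records that agree exactly on >= min_matches fields."""
    edges = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        matches = [f for f in fields if a.get(f) and a.get(f) == b.get(f)]
        if len(matches) >= min_matches:
            edges.append((i, j))
    return edges


# Hypothetical sample records.
raw = [
    {"name": "sarah hall", "email": "SarahHall@example.com", "customer_id": "C1"},
    {"name": "sarah hall", "email": "sarahhall@example.com", "customer_id": "C2"},
    {"name": "sarah hall", "email": "not-an-email", "customer_id": "C3"},
]
records = [clean(r) for r in raw]
edges = link_records(records)
```

The first two records match on name and email and are linked; the third matches on name only, so the single-point match is rejected. The resulting edges are the input to the graph-building step.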
StoreFront Predictive Processes
• Asset Quality / Visit Quality / Engagement
• Product Recommendations
• Recency Frequency Monetary Latency (RFML)
• Predictive Models
Asset Quality/Visit Quality
Measures the expected value based on the history of products viewed online.
Suppose an entity “Sarah” views 3 products: X, Y and Z.
Asset Quality (AQ) is #purchases * Price / #views
Today Sarah’s AQ:

Product   Price     # views   # purchases   Asset Quality
X         $5.00     220       23            $0.52
Y         $10.00    342       45            $1.32
Z         $15.00    122       5             $0.61

Visit Quality (VQ) is the sum of Asset Quality for a visit, e.g. $2.45
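As a quick check of the arithmetic, the table can be reproduced with a few lines of Python. The function name and the zero-views guard are assumptions added for illustration.

```python
def asset_quality(price, views, purchases):
    """Asset Quality = #purchases * price / #views (0 when there are no views)."""
    if views == 0:
        return 0.0
    return purchases * price / views


# (price, views, purchases) for the three products Sarah viewed.
products = {
    "X": (5.00, 220, 23),
    "Y": (10.00, 342, 45),
    "Z": (15.00, 122, 5),
}
aq = {name: asset_quality(*row) for name, row in products.items()}

# Visit Quality is the sum of Asset Quality over the products viewed in the visit.
visit_quality = sum(aq.values())
```

Rounded to cents, `aq` gives 0.52, 1.32 and 0.61, and `visit_quality` gives 2.45, matching the slide.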
Engagement
A week-long Engagement with a 50% decay rate. Each day's Engagement is that day's Visit Quality
plus 50% of the previous day's Engagement:

Day   Visit Quality   Engagement
1     $10.98          $10.98
2     $0              $5.49
3     $0              $2.75
4     $0              $1.37
5     $3.46           $4.15
6     $0              $2.07
7     $2.45           $3.49

[Chart: VQ and Engagement in dollars ($) by day]
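The Engagement column is consistent with a simple recurrence: today's Engagement equals today's Visit Quality plus the decay rate times yesterday's Engagement. A minimal sketch, assuming that recurrence (the function name is hypothetical):

```python
def engagement_series(visit_qualities, decay=0.5):
    """Running engagement: E[t] = VQ[t] + decay * E[t-1], with E before day 1 = 0."""
    series = []
    previous = 0.0
    for vq in visit_qualities:
        previous = vq + decay * previous
        series.append(previous)
    return series


# Visit Quality for days 1 to 7, taken from the table.
week = [10.98, 0, 0, 0, 3.46, 0, 2.45]
engagement = engagement_series(week)
```

The resulting values agree with the Engagement column of the table to within rounding.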
Product Recommendations
Association Rules with monthly customer sessions
• N1: Count the number of times products appear in pairs (over a month for a customer)
• N2: Count the number of times each product (Antecedent, N2A, or Consequent, N2C) appears over a month for a
customer
• N3: Count the number of customers in a month
Compute
• Antecedent Support (N2A / N3)
• Consequent Support (N2C / N3)
• Rule Confidence (N1 / N2A)
• Lift ((N1 / N2A) / (N2C / N3))
All of this is done in-database, daily, for the most recent month!
Recommendation Example
Antecedent: Mens Air Jordan City Collection NYC T-Shirt, N2A = 384
Consequent: Mens Air Jordan Retro 10 NYC Basketball Shoes, N2C = 9,770
Rule Occurrence: N1 = 114
Transaction Count: N3 = 780,005
Antecedent Support (N2A / N3) = 384 / 780,005 = 0.00049
Consequent Support (N2C / N3) = 9,770 / 780,005 = 0.0125
Rule Confidence (N1 / N2A) = 114 / 384 = 0.297
Lift ((N1 / N2A) / (N2C / N3)) = Rule Confidence / Consequent Support = 23.7
Customers are 23.7x more likely to purchase the Air Jordan Retro 10 shoes after buying the Jordan City
Collection NYC T-Shirt.
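The four rule metrics follow directly from the counts. A small sketch that reproduces the example numbers (the function and key names are hypothetical; the slide notes that the real computation runs in the database):

```python
def rule_metrics(n1, n2a, n2c, n3):
    """Association-rule metrics from pair, antecedent, consequent and total counts."""
    antecedent_support = n2a / n3
    consequent_support = n2c / n3
    confidence = n1 / n2a
    lift = confidence / consequent_support
    return {
        "antecedent_support": antecedent_support,
        "consequent_support": consequent_support,
        "confidence": confidence,
        "lift": lift,
    }


# Counts from the T-shirt -> basketball shoes example.
metrics = rule_metrics(n1=114, n2a=384, n2c=9770, n3=780_005)
```

A lift well above 1 means the consequent is bought far more often alongside the antecedent than its overall frequency would suggest.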
RFML
Recency: the number of days since the last visit or purchase by a shopper.
Frequency: the number of visits or purchases within a time period of interest.
Monetary: the total dollar spend of a shopper within the time period of interest.
Latency: the average number of days between visits or purchases within the time period of interest.
Recency and Latency are computed once a day.
Frequency and Monetary are computed on demand.
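To make the four definitions concrete, here is a sketch that computes RFML for one shopper from a list of dated purchases. The dates are made up for illustration and the function name is hypothetical; the amounts are borrowed from the sample invoices shown later in the deck.

```python
from datetime import date


def rfml(events, as_of):
    """Compute RFML from (date, amount) events within the period of interest."""
    if not events:
        return None
    days = sorted(day for day, _ in events)
    gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
    return {
        "recency": (as_of - days[-1]).days,        # days since the last event
        "frequency": len(days),                    # number of events
        "monetary": sum(amount for _, amount in events),  # total spend
        # Average days between consecutive events; undefined for a single event.
        "latency": sum(gaps) / len(gaps) if gaps else None,
    }


# Hypothetical dates paired with illustrative amounts.
events = [
    (date(2015, 3, 1), 103.98),
    (date(2015, 3, 11), 50.45),
    (date(2015, 3, 25), 123.87),
]
result = rfml(events, as_of=date(2015, 3, 31))
```

For this shopper the sketch gives a recency of 6 days, a frequency of 3, and a latency of 12 days (gaps of 10 and 14 days).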
Predictive Models
GOAL: Predict Days to Next Purchase and Days to Next Visit for <= 1, 3, 7 and 15 days, and for the
intervals 15-30, 31-60 and 61-90 days.
216 input fields (Engagement, average order value, average session value, session count, asset
count, many more plus interactions)
Build models on 6M records at an entity level
Model Building Process:
1. 6M records (Redshift)
2. Python pyETL library
3. Variable reduction (variance, correlation and Lasso-LARS)
4. Build models (in parallel!)
5. Model tests (ROC AUC, regression coefficients)
6. Upload model & results to SQL
7. Models ready to deploy
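The first two variable-reduction filters (variance and correlation) can be sketched in plain Python as below. This is only an illustration: the thresholds, column names and sample values are assumptions, and the Lasso-LARS step is deliberately left out because it needs a modelling library.

```python
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def reduce_variables(columns, min_variance=1e-8, max_correlation=0.95):
    """Drop near-constant columns and columns highly correlated with a kept one."""
    kept = []
    for name, values in columns.items():
        if pvariance(values) <= min_variance:
            continue  # carries (almost) no information
        if any(abs(pearson(values, columns[k])) >= max_correlation for k in kept):
            continue  # redundant with a column we already kept
        kept.append(name)
    return kept


# Hypothetical input fields with made-up values.
columns = {
    "session_count": [1, 2, 3, 4, 5],
    "session_count_x2": [2, 4, 6, 8, 10],   # perfectly correlated duplicate
    "constant_flag": [1, 1, 1, 1, 1],       # zero variance
    "avg_order_value": [20.0, 35.0, 15.0, 40.0, 10.0],
}
kept = reduce_variables(columns)
```

In this toy input the duplicate and the constant column are dropped, leaving `session_count` and `avg_order_value` as candidates for the later Lasso-LARS and model-building steps.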
Model scoring is handled directly in SQL.
Can score hundreds of millions of records in minutes!
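One way such in-database scoring could look: since a logistic regression model is just an intercept and a set of coefficients, it can be rendered as a single SQL expression. The sketch below generates that statement; the table name, column names and coefficient values are all hypothetical, and the names are assumed to come from a trusted source since they are interpolated directly into the SQL text.

```python
import math


def scoring_sql(intercept, coefficients, table):
    """Build a SQL statement that applies a logistic regression model in-database."""
    terms = " + ".join(
        f"({weight!r} * {column})" for column, weight in coefficients.items()
    )
    linear = f"{intercept!r} + {terms}"
    return (
        f"SELECT entity_id, 1.0 / (1.0 + EXP(-({linear}))) AS score "
        f"FROM {table};"
    )


def score(intercept, coefficients, row):
    """Reference implementation of the same formula, for checking the SQL."""
    z = intercept + sum(weight * row[column] for column, weight in coefficients.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical model: made-up coefficients, column names and table name.
coefficients = {"engagement": 0.12, "session_count": 0.05}
sql = scoring_sql(-2.0, coefficients, "entity_features")
probability = score(-2.0, coefficients, {"engagement": 10.0, "session_count": 16.0})
```

Because the score is a single arithmetic expression per row, the database can evaluate it in one table scan, which fits the claim that very large record counts can be scored quickly.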
Example Big A$$ Client
Athletic retailer, 2 years of data, $1.6B in sales per year.
A typical day adds 50,000 transactions; a typical batch gives about 20,000 records every 6 min!
Database size: 866 GB (compressed), which equates to 2.5 TB (uncompressed)
Total daily run time: 3 hours (rebuilds from scratch); batch run time: 5 mins!
Vertex Set: 253,449,334
Entity Set: 203,531,275
There are 50 million non-atomic equivalence classes (entities that link more than one vertex)!
These amount to $850M or ~53% of the sales
(these customers are the known repeat customers).
These are the customers we can target, as we have richer information about their repeated
browsing.
This is StoreFront Personalization
SARAH
Annual Spend: $4,500
Sales Channel: Website, Mobile App, In-Store, Call Center, 3rd Party
Transactional History
• Online: INV 1215 $103.98
• Store: INV 4672 $50.45
• Store: INV 8500 $123.87 [etc]
Email Addresses
• Transactional: sarahhall@gmail.com
• Account: shall@home.com
• Promotional: sarahh@yahoo.com
Category, Brand, Product
• Category Affinity: Kid’s, Women’s, Running
• Brand Affinity: Nike
Cross-Channel: Email, Website, Mobile, Display, Social
Here’s what it delivers.
Brands personalizing interactions in real-time and email

More Related Content

Similar to Automated Retail Analytics at Scale

Deep.bi - Real-time, Deep Data Analytics Platform For Ecommerce
Deep.bi - Real-time, Deep Data Analytics Platform For EcommerceDeep.bi - Real-time, Deep Data Analytics Platform For Ecommerce
Deep.bi - Real-time, Deep Data Analytics Platform For EcommerceDeep.BI
 
Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...
Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...
Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...Amazon Web Services
 
[WSO2Con USA 2018] Patterns for Building Streaming Apps
[WSO2Con USA 2018] Patterns for Building Streaming Apps[WSO2Con USA 2018] Patterns for Building Streaming Apps
[WSO2Con USA 2018] Patterns for Building Streaming AppsWSO2
 
[WSO2Con Asia 2018] Patterns for Building Streaming Apps
[WSO2Con Asia 2018] Patterns for Building Streaming Apps[WSO2Con Asia 2018] Patterns for Building Streaming Apps
[WSO2Con Asia 2018] Patterns for Building Streaming AppsWSO2
 
MongoDB World 2019: re:Innovate from Siloed to Deep Insights on Your Data
MongoDB World 2019: re:Innovate from Siloed to Deep Insights on Your DataMongoDB World 2019: re:Innovate from Siloed to Deep Insights on Your Data
MongoDB World 2019: re:Innovate from Siloed to Deep Insights on Your DataMongoDB
 
Overview of business intelligence
Overview of business intelligenceOverview of business intelligence
Overview of business intelligenceAhsan Kabir
 
Analyzing Real-time Streaming Data with Amazon Kinesis
Analyzing Real-time Streaming Data with Amazon KinesisAnalyzing Real-time Streaming Data with Amazon Kinesis
Analyzing Real-time Streaming Data with Amazon KinesisAmazon Web Services
 
Analytics Patterns for Your Digital Enterprise
Analytics Patterns for Your Digital EnterpriseAnalytics Patterns for Your Digital Enterprise
Analytics Patterns for Your Digital EnterpriseSriskandarajah Suhothayan
 
WSO2Con USA 2017: Analytics Patterns for Your Digital Enterprise
WSO2Con USA 2017: Analytics Patterns for Your Digital EnterpriseWSO2Con USA 2017: Analytics Patterns for Your Digital Enterprise
WSO2Con USA 2017: Analytics Patterns for Your Digital EnterpriseWSO2
 
Take Action: The New Reality of Data-Driven Business
Take Action: The New Reality of Data-Driven BusinessTake Action: The New Reality of Data-Driven Business
Take Action: The New Reality of Data-Driven BusinessInside Analysis
 
How Retail Banks Use MongoDB
How Retail Banks Use MongoDBHow Retail Banks Use MongoDB
How Retail Banks Use MongoDBMongoDB
 
Prepare for Peak Holiday Season with MongoDB
Prepare for Peak Holiday Season with MongoDBPrepare for Peak Holiday Season with MongoDB
Prepare for Peak Holiday Season with MongoDBMongoDB
 
Telecom datascience master_public
Telecom datascience master_publicTelecom datascience master_public
Telecom datascience master_publicVincent Michel
 
Hadoop in the Cloud: Common Architectural Patterns
Hadoop in the Cloud: Common Architectural PatternsHadoop in the Cloud: Common Architectural Patterns
Hadoop in the Cloud: Common Architectural PatternsDataWorks Summit
 
Datawarehouse Overview
Datawarehouse OverviewDatawarehouse Overview
Datawarehouse Overviewashok kumar
 
Modern Data Architectures for Business Insights at Scale
Modern Data Architectures for Business Insights at Scale Modern Data Architectures for Business Insights at Scale
Modern Data Architectures for Business Insights at Scale Amazon Web Services
 
1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx
1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx
1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docxjackiewalcutt
 
Analytics what to look for sustaining your growing business-
Analytics   what to look for sustaining your growing business-Analytics   what to look for sustaining your growing business-
Analytics what to look for sustaining your growing business-Ajay Ohri
 

Similar to Automated Retail Analytics at Scale (20)

1030 track2 komp
1030 track2 komp1030 track2 komp
1030 track2 komp
 
Deep.bi - Real-time, Deep Data Analytics Platform For Ecommerce
Deep.bi - Real-time, Deep Data Analytics Platform For EcommerceDeep.bi - Real-time, Deep Data Analytics Platform For Ecommerce
Deep.bi - Real-time, Deep Data Analytics Platform For Ecommerce
 
Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...
Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...
Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...
 
Patterns for Building Streaming Apps
Patterns for Building Streaming AppsPatterns for Building Streaming Apps
Patterns for Building Streaming Apps
 
[WSO2Con USA 2018] Patterns for Building Streaming Apps
[WSO2Con USA 2018] Patterns for Building Streaming Apps[WSO2Con USA 2018] Patterns for Building Streaming Apps
[WSO2Con USA 2018] Patterns for Building Streaming Apps
 
[WSO2Con Asia 2018] Patterns for Building Streaming Apps
[WSO2Con Asia 2018] Patterns for Building Streaming Apps[WSO2Con Asia 2018] Patterns for Building Streaming Apps
[WSO2Con Asia 2018] Patterns for Building Streaming Apps
 
MongoDB World 2019: re:Innovate from Siloed to Deep Insights on Your Data
MongoDB World 2019: re:Innovate from Siloed to Deep Insights on Your DataMongoDB World 2019: re:Innovate from Siloed to Deep Insights on Your Data
MongoDB World 2019: re:Innovate from Siloed to Deep Insights on Your Data
 
Overview of business intelligence
Overview of business intelligenceOverview of business intelligence
Overview of business intelligence
 
Analyzing Real-time Streaming Data with Amazon Kinesis
Analyzing Real-time Streaming Data with Amazon KinesisAnalyzing Real-time Streaming Data with Amazon Kinesis
Analyzing Real-time Streaming Data with Amazon Kinesis
 
Analytics Patterns for Your Digital Enterprise
Analytics Patterns for Your Digital EnterpriseAnalytics Patterns for Your Digital Enterprise
Analytics Patterns for Your Digital Enterprise
 
WSO2Con USA 2017: Analytics Patterns for Your Digital Enterprise
WSO2Con USA 2017: Analytics Patterns for Your Digital EnterpriseWSO2Con USA 2017: Analytics Patterns for Your Digital Enterprise
WSO2Con USA 2017: Analytics Patterns for Your Digital Enterprise
 
Take Action: The New Reality of Data-Driven Business
Take Action: The New Reality of Data-Driven BusinessTake Action: The New Reality of Data-Driven Business
Take Action: The New Reality of Data-Driven Business
 
How Retail Banks Use MongoDB
How Retail Banks Use MongoDBHow Retail Banks Use MongoDB
How Retail Banks Use MongoDB
 
Prepare for Peak Holiday Season with MongoDB
Prepare for Peak Holiday Season with MongoDBPrepare for Peak Holiday Season with MongoDB
Prepare for Peak Holiday Season with MongoDB
 
Telecom datascience master_public
Telecom datascience master_publicTelecom datascience master_public
Telecom datascience master_public
 
Hadoop in the Cloud: Common Architectural Patterns
Hadoop in the Cloud: Common Architectural PatternsHadoop in the Cloud: Common Architectural Patterns
Hadoop in the Cloud: Common Architectural Patterns
 
Datawarehouse Overview
Datawarehouse OverviewDatawarehouse Overview
Datawarehouse Overview
 
Modern Data Architectures for Business Insights at Scale
Modern Data Architectures for Business Insights at Scale Modern Data Architectures for Business Insights at Scale
Modern Data Architectures for Business Insights at Scale
 
1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx
1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx
1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx
 
Analytics what to look for sustaining your growing business-
Analytics   what to look for sustaining your growing business-Analytics   what to look for sustaining your growing business-
Analytics what to look for sustaining your growing business-
 

More from Rising Media, Inc.

1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptop1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptopRising Media, Inc.
 
1620 keynote olson_using our laptop
1620 keynote olson_using our laptop1620 keynote olson_using our laptop
1620 keynote olson_using our laptopRising Media, Inc.
 
1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptop1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptopRising Media, Inc.
 
1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptop1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptopRising Media, Inc.
 
1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptop1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptopRising Media, Inc.
 
1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptop1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptopRising Media, Inc.
 
855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptop855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptopRising Media, Inc.
 
1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareable1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareableRising Media, Inc.
 
905 keynote peele_using our laptop
905 keynote peele_using our laptop905 keynote peele_using our laptop
905 keynote peele_using our laptopRising Media, Inc.
 

More from Rising Media, Inc. (20)

1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptop1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptop
 
Matt gershoff
Matt gershoffMatt gershoff
Matt gershoff
 
Keynote adam greco
Keynote adam grecoKeynote adam greco
Keynote adam greco
 
1620 keynote olson_using our laptop
1620 keynote olson_using our laptop1620 keynote olson_using our laptop
1620 keynote olson_using our laptop
 
1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptop1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptop
 
1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptop1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptop
 
1415 track 2 richardson
1415 track 2 richardson1415 track 2 richardson
1415 track 2 richardson
 
1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptop1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptop
 
1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptop1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptop
 
915 e metrics_claudia perlich
915 e metrics_claudia perlich915 e metrics_claudia perlich
915 e metrics_claudia perlich
 
855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptop855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptop
 
1615 plack using our laptop
1615 plack using our laptop1615 plack using our laptop
1615 plack using our laptop
 
1530 rimmele do not share
1530 rimmele do not share1530 rimmele do not share
1530 rimmele do not share
 
1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareable1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareable
 
1115 fiztgerald schuchardt
1115 fiztgerald schuchardt1115 fiztgerald schuchardt
1115 fiztgerald schuchardt
 
1000 kondic do not share
1000 kondic do not share1000 kondic do not share
1000 kondic do not share
 
905 keynote peele_using our laptop
905 keynote peele_using our laptop905 keynote peele_using our laptop
905 keynote peele_using our laptop
 
Stephen morse sharable
Stephen morse sharableStephen morse sharable
Stephen morse sharable
 
Elder shareable
Elder shareableElder shareable
Elder shareable
 
1115 ramirez using our laptop
1115 ramirez using our laptop1115 ramirez using our laptop
1115 ramirez using our laptop
 

Recently uploaded

Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...ThinkInnovation
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 

Recently uploaded (20)

Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 

Automated Retail Analytics at Scale

  • 2. We are SmarterHQ SmarterHQ is the leading multi-channel behavioral marketing platform, empowering B2C marketers to personalize individual customer interactions in real-time. We work with some of the world’s largest brands – such as Bloomingdales, Santander Bank, Carrentals.com and Finish Line to drive phenomenal business results. We’ve been recognized by Forbes as technology to push B2C companies into a new era of personalization and Forrester’s Total Economic Impact study to deliver 667% in ROI.
  • 3. So Lets build our models!!! Easy enough, choose our favorite algorithm (in our case going for eventual near real time scoring Logistic Regression). Model build and input data filtering using Standard Deviation, Correlation and Lasso LARS We use python libraries (SCIKIT and pySQL Libraries) to automate gathering the data and delivering to the server for model building! This was all developed and perfected prior to Jan 2015 (a scant 6 months at SmarterHQ) Recently, expanded to include Affinity Analysis for interaction term building and Product Recommendations 3 So what is the problem???!!!What have I not told you?
  • 5. StoreFront Building Blocks Built on AWS • EC2, Kinesis, Simple Queue Service (SQS), Lambda, S3, Glacier, Redshift 5
  • 6. Data Gathering Digital Sources: • Tag a website, mobile app, etc Product views, customer ids, email address, products carted, products purchased, loyalty ids • Streams to redshift in as little as 5 minutes. • Incremental batches run on redshift ~5 minutes, so data latency is as little as 10 minutes OMS: • Daily Feeds worked out with the Client: Customer ids, loyalty ids, products, order totals, email address, refunds, cancelations, shipping info • Processed once a day in a daily process Product: • Product ids, client based marketing categories 6
  • 7. StoreFront Infrastructure Design Properties: Modular in design highly Parallel Concurrent writing Processes are Daemonized Python Apps supporting infrastructure A typical day for every customer: Web load (240x/day): OMS (1x/day): Product Feeds(1x/day): 7 WEB streaming SQS Kinesis Lambda S3 Redshift ETL from Client Informatica S3 Redshift ETL from Client Informatica S3 Redshift
  • 8. StoreFront Data Sources (revisited): Digital Sources (5 min), OMS Retail (1/day), Products (1/day)
  • 9. Entities!
      • Everyone has a definition of what a customer is!!! How do we represent that customer in the data that we have? If I ask for all of the purchase information from customer X, how can I get it reliably and quickly?
      • Entities are data-driven constructs that are the data representation of a customer, location, marketing campaign, etc.
      • Defined by exact matching (really want to go to fuzzy land!) on email addresses, loyalty IDs, order ids, customer names and other customer ids. Require at least 2 pieces to match (except for web-only data, where email-only entities are built).
      Example: [figure]
  • 10. Entity Mechanics
      • Build entities using graph theory
      • The set of all possible data elements to be linked is the vertex set
      • Use the data to build connections between vertices: these are the edges, and the set of all such connections is the edge set
      • Use a graph traversal algorithm (Breadth First Search or Depth First Search) to build out the graphs; each set of connected vertices is one entity
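The graph construction on this slide can be sketched as a connected-components search. This is a minimal illustration, assuming edges arrive as pairs of identifier strings; the function name `build_entities` and the sample identifiers are hypothetical and not taken from the StoreFront code.

```python
from collections import defaultdict, deque


def build_entities(edges):
    """Group vertices into entities (connected components) with BFS."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    entities = []
    for start in list(adjacency):
        if start in seen:
            continue
        # Breadth-first walk collecting everything reachable from `start`.
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            vertex = queue.popleft()
            component.add(vertex)
            for neighbor in adjacency[vertex]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        entities.append(component)
    return entities


# Hypothetical identifiers, for illustration only.
edges = [
    ("email:sarah@example.com", "loyalty:L-1001"),
    ("loyalty:L-1001", "order:INV-1215"),
    ("email:bob@example.com", "order:INV-9999"),
]
entities = build_entities(edges)
```

Here the first two edges share a loyalty id, so they collapse into one three-vertex entity, while the third edge forms its own entity.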
  • 11. OMS:
      1. Person identifier fields (name, email address, customer ids, order ids)
      2. Parse the email field (filter out improperly formatted emails with a regular expression based on the RFC 5322 standard) and get the email user id
      3. Algorithm: exact match on at least 2 fields (common names and email user names make single-point matches unreliable). Could expand to 1-point matches by using a frequency analysis to restrict them to less common names or email addresses
      Digital: personal identifier fields (email address, order id, loyalty ids)
      1. Exact match on at least two of order id, email address or loyalty id to the corresponding OMS entity
      2. Next, build digital email-based entities (1-point matches)
      Entities with both OMS Retail and Digital vertices are CrossChannel Entities!
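A rough sketch of the "at least 2 fields" rule follows. The field names, the helper functions and the sample records are all assumptions made for illustration, and the email pattern is a deliberately simplified stand-in, not a full RFC 5322 validator.

```python
import re

# Simplified email pattern (assumption): NOT a complete RFC 5322 validator.
EMAIL_RE = re.compile(r"^([A-Za-z0-9._%+-]+)@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")


def email_user(email):
    """Return the lower-cased user part of a well-formed email, else None."""
    match = EMAIL_RE.match(email or "")
    return match.group(1).lower() if match else None


def shared_fields(rec_a, rec_b, fields):
    """List the fields on which both records hold the same non-empty value."""
    return [
        f for f in fields
        if rec_a.get(f) is not None and rec_a.get(f) == rec_b.get(f)
    ]


def is_match(rec_a, rec_b,
             fields=("name", "email", "customer_id", "order_id"),
             min_points=2):
    """Exact match on at least `min_points` identifier fields."""
    return len(shared_fields(rec_a, rec_b, fields)) >= min_points


# Hypothetical records.
rec_a = {"name": "sarah hall", "email": "sarah@example.com", "customer_id": "C1"}
rec_b = {"name": "sarah hall", "email": "sarah@example.com", "customer_id": "C2"}
rec_c = {"name": "sarah hall", "email": "other@example.com", "customer_id": "C3"}
```

With these records, `rec_a` and `rec_b` agree on name and email and so would be linked, while `rec_a` and `rec_c` share only a name, which is the single-point case the slide calls unreliable.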
  • 12. StoreFront Predictive Processes • Asset Quality / Visit Quality / Engagement • Product Recommendations • Recency Frequency Monetary Latency (RFML) • Predictive Models
  • 13. Asset Quality/Visit Quality
      Measures the expected value, based on history, of products viewed online.
      Suppose an Entity “Sarah” views 3 products X, Y and Z. Asset Quality (AQ) is #purchases * Price / #views.
      Today Sarah’s AQ:
      Product   Price    # views   # purchases   Asset Quality
      X         $5.00    220       23            $0.52
      Y         $10.00   342       45            $1.32
      Z         $15.00   122       5             $0.61
      Visit Quality (VQ) is the sum of Asset Quality for a visit, e.g. $2.45
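The arithmetic on this slide can be checked with a few lines of Python. The function name and the dictionary layout are illustrative assumptions; the prices, view counts and purchase counts are the ones given for products X, Y and Z.

```python
def asset_quality(price, views, purchases):
    """AQ = #purchases * price / #views (0.0 when a product has no views)."""
    return purchases * price / views if views else 0.0


# (price, views, purchases) for the three products Sarah viewed.
products = {
    "X": (5.00, 220, 23),
    "Y": (10.00, 342, 45),
    "Z": (15.00, 122, 5),
}

aq = {name: asset_quality(*row) for name, row in products.items()}

# Visit Quality is the sum of Asset Quality over the products in the visit.
visit_quality = sum(aq.values())
```

Rounded to cents, these reproduce the slide's per-product values and the $2.45 Visit Quality.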
  • 14. Engagement
      A week-long Engagement with a 50% decay rate:
      Day   Visit Quality   Engagement
      1     $10.98          $10.98
      2     $0              $5.49
      3     $0              $2.75
      4     $0              $1.37
      5     $3.46           $4.15
      6     $0              $2.07
      7     $2.45           $3.49
      [Chart: VQ and Engagement in dollars ($) by day]
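The slide does not spell out the Engagement formula. The Day 1 to 7 values are consistent with a simple recurrence, engagement today = today's Visit Quality + 0.5 × yesterday's engagement, so the sketch below assumes that form; the function name is hypothetical.

```python
def engagement_series(visit_quality, decay=0.5):
    """Running engagement: today's VQ plus the decayed previous engagement.

    Assumed recurrence (inferred from the slide's numbers, not stated there):
        E[t] = VQ[t] + decay * E[t-1],  with E[0] = 0
    """
    series = []
    previous = 0.0
    for vq in visit_quality:
        previous = vq + decay * previous
        series.append(previous)
    return series


# Visit Quality per day for the week shown on the slide.
vq = [10.98, 0, 0, 0, 3.46, 0, 2.45]
eng = engagement_series(vq)
```

Under this assumption the series matches the slide's Engagement column to within a cent.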
  • 15. Product Recommendations
      Association Rules with monthly customer sessions
      • N1: Count the number of times products appear in pairs (over a month for a customer)
      • N2: Count the number of times products (Antecedent or Consequent) appear over a month for a customer
      • N3: Count the number of customers in a month
      Compute
      • Antecedent Support (N2A / N3)
      • Consequent Support (N2C / N3)
      • Rule Confidence (N1 / N2A)
      • Lift ((N1 / N2A) / (N2C / N3))
      All of this is done in the database, daily, for the most recent month!
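The slide says these counts are computed in the database. As an illustration only, here is the same counting in Python, assuming one set of products per customer for the month; the function name and the toy data are hypothetical.

```python
from collections import Counter
from itertools import permutations


def count_rules(customer_products):
    """Count N1, N2 and N3 from a mapping of customer id -> set of products.

    Assumption: each customer contributes one product set per month.
    """
    n1 = Counter()  # (antecedent, consequent) -> customers with both products
    n2 = Counter()  # product -> customers with that product
    for products in customer_products.values():
        n2.update(products)
        # Ordered pairs, so both A -> B and B -> A are counted as rules.
        n1.update(permutations(sorted(products), 2))
    n3 = len(customer_products)  # number of customers in the month
    return n1, n2, n3


# Toy data, for illustration only.
month = {
    "c1": {"tee", "shoes"},
    "c2": {"tee"},
    "c3": {"shoes", "socks"},
}
n1, n2, n3 = count_rules(month)
```

Counting ordered pairs keeps antecedent and consequent distinct, which matters because Rule Confidence is not symmetric even though Lift is.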
  • 16. Recommendation Example
      Antecedent: Mens Air Jordan City Collection NYC T-Shirt, N2A = 384
      Consequent: Mens Air Jordan Retro 10 NYC Basketball Shoes, N2C = 9770
      Rule Occurrence: N1 = 114
      Transaction Count: N3 = 780,005
      Antecedent Support (N2A / N3) = 384 / 780,005 = 0.00049
      Consequent Support (N2C / N3) = 9770 / 780,005 = 0.0125
      Rule Confidence (N1 / N2A) = 114 / 384 = 0.297
      Lift ((N1 / N2A) / (N2C / N3)) = Rule Confidence / Consequent Support = 23.7
      Customers are 23.7x more likely to purchase Air Jordans after buying the Jordan City Collection NYC T-Shirt
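The numbers in this example can be reproduced directly from the four counts. The function and key names below are illustrative assumptions; the counts are the ones on the slide.

```python
def rule_metrics(n1, n2_antecedent, n2_consequent, n3):
    """Support, confidence and lift for one antecedent -> consequent rule."""
    antecedent_support = n2_antecedent / n3
    consequent_support = n2_consequent / n3
    confidence = n1 / n2_antecedent
    lift = confidence / consequent_support
    return {
        "antecedent_support": antecedent_support,
        "consequent_support": consequent_support,
        "confidence": confidence,
        "lift": lift,
    }


# T-shirt -> basketball shoes rule from the slide.
metrics = rule_metrics(n1=114, n2_antecedent=384, n2_consequent=9770, n3=780_005)
```

The lift comes out at roughly 23.7, which only agrees with Rule Confidence / Consequent Support when the consequent support is carried to more than two significant digits.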
  • 17. RFML
      • Recency: the number of days since the last visit or purchase by a shopper.
      • Frequency: the number of visits or purchases within a time period of interest.
      • Monetary: the total dollar spend of a shopper within the time period of interest.
      • Latency: the average number of days between visits or purchases within the time period of interest.
      Recency and Latency are computed 1/day. Frequency and Monetary are computed on demand.
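The four definitions translate into a short calculation. This is a sketch under assumptions: events are (date, amount) pairs already filtered to the period of interest, and the function name, the dates and the as-of date are hypothetical.

```python
from datetime import date


def rfml(events, as_of):
    """Compute Recency, Frequency, Monetary and Latency for one shopper.

    `events` is a list of (date, amount) pairs within the period of interest.
    Returns None when the shopper has no events.
    """
    if not events:
        return None
    dates = sorted(d for d, _ in events)
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    return {
        "recency": (as_of - dates[-1]).days,   # days since the last event
        "frequency": len(dates),               # number of events
        "monetary": sum(amount for _, amount in events),
        # Average days between consecutive events; undefined for one event.
        "latency": sum(gaps) / len(gaps) if gaps else None,
    }


# Hypothetical purchase dates and amounts.
events = [
    (date(2015, 6, 1), 103.98),
    (date(2015, 6, 11), 50.45),
    (date(2015, 6, 15), 123.87),
]
result = rfml(events, as_of=date(2015, 6, 20))
```

For this shopper the gaps are 10 and 4 days, so Latency is 7 days, and Recency is 5 days as of 20 June.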
  • 18. Predictive Models
      GOAL: Predict Days to Next Purchase and Days to Next Visit for <= 1, 3, 7, 15 and intervals 15-30, 31-60, 61-90
      216 input fields (Engagement, average order value, average session value, session count, asset count, many more plus interactions)
      Build models on 6M records at an entity level
      Model Building Process:
      1. 6M records (Redshift)
      2. Python pyETL library
      3. Variable Reduction (Variance, Correlation and Lasso-LARS variable reduction)
      4. Build Models (Parallel!!)
      5. Model Tests (ROC AUC, Regression Coefficients)
      6. Upload model & results to SQL
      7. Models ready to deploy
      Model scoring is handled directly in SQL using a SQL process. Can score 100M’s of records in minutes!
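Scoring a logistic regression model is only a weighted sum passed through the logistic function, which is why it can be pushed into SQL. The sketch below shows that arithmetic in Python rather than SQL; the feature names, coefficients and intercept are made up for illustration and are not the deck's model.

```python
import math


def score(features, coefficients, intercept):
    """Logistic regression score: 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(
        weight * features.get(name, 0.0)
        for name, weight in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical model for "purchase within 7 days".
coefficients = {"engagement": 0.08, "session_count": 0.15, "recency": -0.05}
intercept = -2.0

low = score({"engagement": 1.0, "session_count": 1, "recency": 30}, coefficients, intercept)
high = score({"engagement": 12.0, "session_count": 8, "recency": 1}, coefficients, intercept)
```

Because the score is a closed-form expression of the input columns, the same calculation can be written as a single SELECT over an entity table, with no model runtime needed at scoring time.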
  • 19. Example: Big A$$ Client
      • Athletic retailer, 2 years of data, $1.6B in sales / year
      • A typical day adds 50,000 transactions; a typical batch gives about 20,000 records every 6 min!
      • Database size: 866G (compressed), which equates to 2.5T (uncompressed)
      • Total daily run time 3 hours (rebuilds from scratch); batch runtime 5 mins!
      • Vertex Set: 253,449,334; Entity Set: 203,531,275
      • There are 50 million non-atomic equivalence classes! These amount to $850M or ~53% of the sales (these customers are the known repeat customers)
      • These are the customers we can target, as we have richer information about their repeated browsing.
  • 20. This is StoreFront Personalization
      Sales Channel: Website, Mobile App, In-Store, Call Center, 3rd Party
      Category, Brand, Product (SARAH):
      • Annual Spend: $4,500
      • Transactional History: Online: INV 1215 $103.98; Store: INV 4672 $50.45; Store: INV 8500 $123.87 [etc]
      • Email Addresses: Transactional: sarahhall@gmail.com; Account: shall@home.com; Promotional: sarahh@yahoo.com
      • Category Affinity: Kid’s, Women’s, Running
      • Brand Affinity: Nike
      Cross-Channel: Email, Website, Mobile, Display, Social
  • 21. Here’s what it delivers. PROMO
  • 22. Brands personalizing interactions in real-time and email