Driving Personalized Experiences
Using Customer Profiles
Matt Kalan
Sr. Solution Architect
MongoDB, Inc.
@matthewkalan
matt.kalan@mongodb.com
2
Big Data Analytics Track
1. Driving Personalized Experiences Using Customer Profiles
2. Leveraging Customer Behavior to Enhance Relevancy in
Personalization
3. Machine Learning to Engage the Customer, with Apache Spark,
IBM Watson, and MongoDB
3
Agenda For This Session
1. Benefits of Personalization
2. High level process
3. Data capture steps
4. Data analysis steps
5. Real-time personalization
6. Summary
7. Q&A
4
You Notice When Content is Personalized
When it looks like this outside
Left: from www.johnbyronkuhner.com via Google Images
Right: from www.steinmart.com via Google Images
Is this the best ad to show you?
5
Or Better This
When it looks like this outside
Left: from www.johnbyronkuhner.com via Google Images
Right: www.linkedin.com/pulse/20140729161519-34678510-take-note-time-to-move-beyond-personalization-to-contextualization
More relevant
6
Personalization Pays – Conversion Rates
7
Personalization Pays – ROI Impact
8
High Level Personalization Process
1. Profile created
2. Enrich with public data
3. Capture activity
4. Clustering analysis
5. Define Personas
6. Tag with personas
7. Personalize interactions
Batch analytics
Public data
Common
technologies
• R
• Hadoop
• Spark
• Python
• Java
• Many other
options
4 & 5 performed
much less often
than tagging
9
Why MongoDB for Personalization?
• Document model => customer profiles are rich structures, a natural fit for documents
• High throughput => profiles are read and written on every page view, so high performance is critical
• High scalability => that performance must scale easily with data size and request volume
• Rich querying & indexes => often only portions of the profile are queried, and ad hoc marketing in particular requires rich querying capabilities; geospatial indexes are critical for mobile
• Real-time analytics => can analyze directly on MongoDB, or prepare aggregated results for external analysis with the aggregation framework
• Strong consistency => profile changes and tracking should take effect immediately
• Hadoop/Spark integration => can run distributed analytics on data in MongoDB, or copy it to HDFS and run there, both with the MongoDB Hadoop Connector
• Low TCO => low-cost enterprise software license, commodity hardware, and management
10
Customer Example: Scratchpad
• Records all
activity in
researched trips
• Needed
– Document
model
– Dynamic
schema
– Rich querying
– Easy scaling
11
And Many Other Customers Personalizing with MongoDB
• Sailthru
• Sitecore
• Adobe (AEM)
• Expedia
• ADP
• Foursquare
• Otto
• Chico’s
and 100s more…
Data Capture
13
Anonymous user
Might just start with this if no cookie
{
"ipAddress" : "216.58.219.238",
"referrer" : "google.com"
}
Pretty useless, right?
14
More Than Just What You Collect
IP Address
Referrer
Information
Broker
Location
Company
Weather
Avg Income
Interests
Possible Interests
e.g. Kay Jewelers, Dick’s Sporting Goods
Budget Indication
e.g. Barney’s
Search term
15
Often User Creates a Profile
{
"_id" : ObjectId("553ea57b588ac9ef066428e1"),
"ipAddress" : "216.58.219.238",
"referrer" : "kay.com",
"firstName" : "John",
"lastName" : "Doe",
"email" : "johndoe@gmail.com"
}
16
Even Email Unlocks Useful Info
17
Available Early in Relationship
{
"_id" : ObjectId("553e7dca588ac9ef066428e0"),
"firstName" : "John",
"lastName" : "Doe",
"address" : "229 W. 43rd St.",
"city" : "New York",
"state" : "NY",
"zipCode" : "10036",
"age" : 30,
"email" : "johndoe@gmail.com",
"gender" : "male"
}
18
Often Users Even Volunteer Preferences
19
Easy to Store in Profile
{
"_id" : ObjectId("553e7dca588ac9ef066428e0"),
"firstName" : "John",
"lastName" : "Doe",
"address" : "229 W. 43rd St.",
"city" : "New York",
"state" : "NY",
"zipCode" : "10036",
"age" : 30,
"email" : "johndoe@gmail.com",
"gender" : "male",
"interests" : [
"dumplings",
"board games",
"rooftop",
"ginger beer",
"ahi tuna",
"healthy food"
]
}
20
In Return, User Gets Relevant Info
21
Customer Activity Valuable to Track
{
"_id" : ObjectId("553e7dca588ac9ef066428e0"),
"firstName" : "John",
"lastName" : "Doe",
"address" : "229 W. 43rd St.",
"city" : "New York",
"state" : "NY",
"zipCode" : "10036",
"age" : 30,
"email" : "johndoe@gmail.com",
"gender" : "male",
...
"visitedCounts" : {
"watches" : 3,
"shirts" : 1,
"sunglasses" : 1,
"bags" : 2
}
}
From gilt.com
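A minimal sketch of how a page view could bump the matching counter in the profile above. The helper name `buildVisitUpdate` and the `profiles` collection name are assumptions for illustration; `$inc` is the standard MongoDB update operator and creates the field if it does not exist yet.

```javascript
// Hypothetical helper: builds the update document for one page view.
// "visitedCounts" is the field name shown on the slide.
function buildVisitUpdate(category) {
  if (typeof category !== "string" || category.length === 0) {
    throw new Error("category must be a non-empty string");
  }
  // A "." or a leading "$" would be read as a nested path or an operator.
  if (category.includes(".") || category.startsWith("$")) {
    throw new Error("category must not contain '.' or start with '$'");
  }
  return { $inc: { ["visitedCounts." + category]: 1 } };
}

// In the mongo shell this could be applied as (collection name assumed):
//   db.profiles.updateOne({ _id: profileId }, buildVisitUpdate("watches"))
console.log(JSON.stringify(buildVisitUpdate("watches")));
// prints {"$inc":{"visitedCounts.watches":1}}
```

Incrementing in place keeps the write small, which matters when the profile is touched on every page view.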
22
Purchases Are Usually Even More Valuable
{
"_id" : ObjectId("553e7dca588ac9ef066428e0"),
"firstName" : "John",
"lastName" : "Doe",
"address" : "229 W. 43rd St.",
"city" : "New York",
"state" : "NY",
"zipCode" : "10036",
"age" : 30,
"email" : "johndoe@gmail.com",
"gender" : "male",
...
"purchases" : [
{
"id" : 1,
"desc" : "Power Oxford Dress Shoe",
"category" : "Mens shoes"
},
{
"id" : 2,
"desc" : "Striped Sportshirt",
"category" : "Mens shirts"
}
]
}
From gilt.com
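As a rough sketch, a completed purchase could be appended to the `purchases` array with the standard `$push` operator. The helper name `buildPurchaseUpdate` and the `profiles` collection name are assumptions; the three fields match the purchase documents shown above.

```javascript
// Hypothetical helper: builds the update document that records one purchase.
function buildPurchaseUpdate(purchase) {
  const { id, desc, category } = purchase;
  if (id === undefined || !desc || !category) {
    throw new Error("purchase needs id, desc and category");
  }
  // Copy only the known fields so stray properties are not stored.
  return { $push: { purchases: { id, desc, category } } };
}

// In the mongo shell (collection name assumed):
//   db.profiles.updateOne({ _id: profileId }, buildPurchaseUpdate({...}))
const shoeUpdate = buildPurchaseUpdate({
  id: 1,
  desc: "Power Oxford Dress Shoe",
  category: "Mens shoes",
});
console.log(JSON.stringify(shoeUpdate));
```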
23
Data Capture – Simple to Sophisticated
{
"_id" : ObjectId("553e7dca588ac9ef066428e0"),
"firstName" : "John",
"lastName" : "Doe",
"address" : "229 W. 43rd St.",
"city" : "New York",
"state" : "NY",
"zipCode" : "10036",
"age" : 30,
"email" : "john.doe@mongodb.com",
"twitterHandle" : "johndoe",
"gender" : "male",
"interests" : [
"electronics",
"basketball",
"weightlifting",
"ultimate frisbee",
"traveling",
"technology"
],
"visitedCounts" : {
"watches" : 3,
"shirts" : 1,
"sunglasses" : 1,
"bags" : 2
},
"purchases" : [
{
"id" : 1,
"desc" : "Power Oxford Dress Shoe",
"category" : "Mens shoes"
},
{
"id" : 2,
"desc" : "Striped Sportshirt",
"category" : "Mens shirts"
}
]
}
Additional behavior tracking
• How long on each page (e.g. publishing)?
• What is reaction to pop-up promotions?
• Looks at cross-sold items on page?
• What categories are clicked on?
• Does a certain price point drive buying?
• Purchases at certain times of year?
Data Analysis
25
Clustering Overview
• Think of each of your customers or users of your site as a data point
• How can we group similar users into sets, so each set can be treated alike for marketing, cross-sell, etc.?
• K-means is a common algorithm for clustering
Image from: http://pypr.sourceforge.net/kmeans.html
Original Unclustered Data
Clustered Data
26
Clustering Process for Personalization
Customer Profile Documents
=> Map to Vectors: [1, 3, 0, …]
=> Clustering Algo (input: vectors; iterate on inputs)
=> Clusters of customers
=> Define Personas
=> Tag Profiles with Personas (update profiles with persona)
27
Mapping Profile to Vector Input
{
"_id" : ObjectId("553e7dca588ac9ef066428e0"),
"firstName" : "John",
...
"visitedCounts" : {
"Mens watches" : 3,
"Mens shirts" : 1,
"Mens sunglasses" : 1,
"Mens bags" : 2
},
"purchases" : [
{
"id" : 1,
"desc" : "Power Oxford Dress Shoe",
"category" : "Mens shoes"
},
{
"id" : 2,
"desc" : "Striped Sportshirt",
"category" : "Mens shirts"
}
]
}
Mens shirts | Mens pants | Mens shoes | Mens ties | Mens Sunglass | Mens Watch | …
11          | 0          | 10         | 0         | 1             | 3          | …
[ 11, 0, 10, 0, 1, 3, ...]
(example vector)
e.g. 1 purchase = 10 visited counts
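The mapping above can be sketched as a small function. The column order in `CATEGORIES` and the function name `profileToVector` are assumptions for illustration (the slide shows six categories followed by "…"); the weight of 10 per purchase comes from the slide.

```javascript
// Assumed column order for the vector; extend with further categories as needed.
const CATEGORIES = [
  "Mens shirts", "Mens pants", "Mens shoes",
  "Mens ties", "Mens sunglasses", "Mens watches",
];
const PURCHASE_WEIGHT = 10; // from the slide: 1 purchase = 10 visited counts

// Hypothetical helper: one number per category, visits plus weighted purchases.
function profileToVector(profile, categories = CATEGORIES, weight = PURCHASE_WEIGHT) {
  const visited = profile.visitedCounts || {};
  const purchases = profile.purchases || [];
  return categories.map((category) => {
    const visits = visited[category] || 0;
    const bought = purchases.filter((p) => p.category === category).length;
    return visits + weight * bought;
  });
}

const john = {
  visitedCounts: { "Mens watches": 3, "Mens shirts": 1, "Mens sunglasses": 1, "Mens bags": 2 },
  purchases: [
    { id: 1, desc: "Power Oxford Dress Shoe", category: "Mens shoes" },
    { id: 2, desc: "Striped Sportshirt", category: "Mens shirts" },
  ],
};
console.log(profileToVector(john)); // [ 11, 0, 10, 0, 1, 3 ]
```

"Mens shirts" comes out as 11 because one visit and one weighted purchase are added together; "Mens bags" is not in the assumed column list, so it falls into the elided part of the vector.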
28
Aggregation Framework for Filtering Profiles
// Adds up the visited counts (vc) plus 10 per purchase, then keeps only profiles with a total of at least 20
db.profiles.aggregate([
  {$project: {
    vc: "$vc",
    purchases: "$purchases",
    total: {$add: [
      {$ifNull: ["$vc.mShirts", 0]},
      {$ifNull: ["$vc.mPants", 0]},
      {$ifNull: ["$vc.mShoes", 0]},
      {$ifNull: ["$vc.mTies", 0]},
      {$ifNull: ["$vc.mSunglass", 0]},
      {$ifNull: ["$vc.mWatch", 0]},
      {$ifNull: ["$vc.mBags", 0]},
      // $size fails on a missing field, so default purchases to an empty array
      {$multiply: [ {$size: {$ifNull: ["$purchases", []]}}, 10 ]}
    ]}
  }},
  {$match: {total: {$gte: 20}}}
])
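For checking the threshold logic without a database, the same filter can be approximated in plain JavaScript. This is only an in-memory sketch: the function names are hypothetical, and it assumes a missing `purchases` array counts as zero purchases.

```javascript
// Same abbreviated field names as the pipeline above.
const VC_FIELDS = ["mShirts", "mPants", "mShoes", "mTies", "mSunglass", "mWatch", "mBags"];

// Hypothetical helper: visited counts plus 10 per purchase.
function activityTotal(profile) {
  const vc = profile.vc || {};
  const visits = VC_FIELDS.reduce((sum, field) => sum + (vc[field] || 0), 0);
  const purchases = Array.isArray(profile.purchases) ? profile.purchases.length : 0;
  return visits + 10 * purchases;
}

// Keeps only profiles with enough activity to be worth clustering.
function filterActive(profiles, minTotal = 20) {
  return profiles.filter((p) => activityTotal(p) >= minTotal);
}

// Made-up sample data for illustration.
const sampleProfiles = [
  { _id: "a", vc: { mShirts: 1, mWatch: 3, mSunglass: 1, mBags: 2 }, purchases: [{ id: 1 }, { id: 2 }] },
  { _id: "b", vc: { mPants: 4 } },
];
console.log(filterActive(sampleProfiles).map((p) => p._id)); // [ 'a' ]
```

Profile "a" totals 7 visits plus 20 for two purchases; profile "b" totals 4 and is dropped.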
29
Input/Output for K-Means Algo
Clustering Algo
Iterate on inputs
Clusters of customers
Vectors: [
[11, 0, 10, 0, 1, 3, ...],
[ 0, 5, 10, 3, 0, 0, ...],
...
]
K = # of clusters
Driven by
marketing effort
or data analysis
N = # of iterations
{
Centers: [
{name: C1, vector:[..] },
{name: C2, vector:[..] },
...
],
Clusters: [
{C1: [[11, 0, 10, 0, 1, 3, ...],...]},
{C2: [[ 0, 5, 0, 0, 10, 0, ...],...]},
...
]
}
Vectors
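To make the inputs (vectors, K, N) and outputs (centers, clusters) concrete, here is a minimal K-means sketch using Lloyd's algorithm. It is not the implementation behind the slides: the function names are hypothetical, the initial centers are simply the first K vectors so the run is deterministic, and in practice one of the technologies listed earlier (R, Spark, etc.) would normally be used.

```javascript
// Squared Euclidean distance between two equal-length vectors.
function squaredDistance(a, b) {
  return a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0);
}

// Index of the center closest to the vector.
function nearestCenter(vector, centers) {
  let best = 0;
  for (let i = 1; i < centers.length; i++) {
    if (squaredDistance(vector, centers[i]) < squaredDistance(vector, centers[best])) best = i;
  }
  return best;
}

// Component-wise mean of a non-empty list of vectors.
function meanVector(vectors) {
  const total = new Array(vectors[0].length).fill(0);
  for (const v of vectors) v.forEach((x, i) => { total[i] += x; });
  return total.map((x) => x / vectors.length);
}

// k = number of clusters, n = number of iterations.
function kMeans(vectors, k, n) {
  // Simplification: seed with the first k vectors instead of random picks.
  let centers = vectors.slice(0, k).map((v) => v.slice());
  for (let iter = 0; iter < n; iter++) {
    const assignment = vectors.map((v) => nearestCenter(v, centers));
    centers = centers.map((center, c) => {
      const members = vectors.filter((v, i) => assignment[i] === c);
      return members.length > 0 ? meanVector(members) : center; // keep center of an empty cluster
    });
  }
  // Final assignment against the final centers.
  const finalAssignment = vectors.map((v) => nearestCenter(v, centers));
  return {
    centers: centers.map((vector, c) => ({ name: "C" + (c + 1), vector })),
    clusters: centers.map((center, c) => vectors.filter((v, i) => finalAssignment[i] === c)),
  };
}

// Made-up sample vectors (the first is the example vector from the slides).
const sampleVectors = [
  [11, 0, 10, 0, 1, 3],
  [0, 5, 0, 0, 10, 0],
  [12, 0, 9, 0, 0, 2],
  [0, 6, 0, 1, 9, 0],
];
const clustering = kMeans(sampleVectors, 2, 5);
console.log(clustering.clusters.map((members) => members.length)); // [ 2, 2 ]
```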
30
Original Unclustered Data
Clustered Data
Choosing Personas
• Each cluster would usually map to one persona you can identify, name, and target
• Common to name personas to be memorable, e.g. shoe fanatic, bargain hunter, researcher, etc.
C1
C2
C3 Shoe Fanatic?
31
Mapping Customer Profile to Persona
{
Centers: [
{name: C1, vector:[..] },
{name: C2, vector:[..] },
...
],
Clusters: [
{C1: [[11, 0, 10, 0, 1, 3, ...],...]},
{C2: [[ 0, 5, 0, 0, 10, 0, ...],...]},
...
]
}
{
"_id" : ObjectId("553e7dca588ac9ef066428e0"),
"firstName" : "John",
...
"visitedCounts" : {
"Mens watches" : 3,
"Mens shirts" : 1,
"Mens sunglasses" : 1,
"Mens bags" : 2
},
"purchases" : [
{
"id" : 1,
"desc" : "Power Oxford Dress Shoe",
"category" : "Mens shoes"
},
{
"id" : 2,
"desc" : "Striped Sportshirt",
"category" : "Mens shirts"
}
],
"persona" : "shoe-fanatic"
}
Loop through each
vector in cluster, map to
customer, and tag
customer with persona
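The tagging loop could look roughly like this. It assumes customer ids were kept alongside the vectors, so each cluster lists member ids rather than raw vectors; the cluster shape, the function name, and the `profiles` collection are illustrative assumptions. The operations use the `updateOne` form accepted by MongoDB's `bulkWrite`.

```javascript
// clusters: [{ name: "C1", members: [customerId, ...] }, ...]   (assumed shape)
// personaByCluster: { C1: "shoe-fanatic", ... }, chosen when personas are defined
function buildPersonaUpdates(clusters, personaByCluster) {
  const updates = [];
  for (const cluster of clusters) {
    const persona = personaByCluster[cluster.name];
    if (!persona) continue; // cluster not mapped to a persona yet
    for (const customerId of cluster.members) {
      updates.push({
        updateOne: { filter: { _id: customerId }, update: { $set: { persona } } },
      });
    }
  }
  return updates;
}

// Made-up ids for illustration.
const personaUpdates = buildPersonaUpdates(
  [{ name: "C1", members: ["cust-1", "cust-2"] }, { name: "C2", members: ["cust-3"] }],
  { C1: "shoe-fanatic" }
);
console.log(personaUpdates.length); // 2

// In the mongo shell (collection name assumed):
//   db.profiles.bulkWrite(personaUpdates)
```

Batching the updates keeps tagging cheap enough to rerun far more often than the clustering itself.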
Real-time Personalization
33
Easier with a Rich Customer Profile to Personalize
{
"_id" : ObjectId("553e7dca588ac9ef066428e0"),
"firstName" : "John",
"lastName" : "Doe",
"address" : "229 W. 43rd St.",
"city" : "New York",
"state" : "NY",
"zipCode" : "10036",
"age" : 30,
"email" : "john.doe@mongodb.com",
"twitterHandle" : "johndoe",
"gender" : "male",
"interests" : [
"electronics",
"basketball",
"weightlifting",
"ultimate frisbee",
"traveling",
"technology"
],
"visitedCounts" : {
"watches" : 3,
"shirts" : 1,
"sunglasses" : 1,
"bags" : 2
},
"purchases" : [
{
"id" : 1,
"desc" : "Power Oxford Dress Shoe",
"category" : "Mens shoes"
},
{
"id" : 2,
"desc" : "Striped Sportshirt",
"category" : "Mens shirts"
}
],
"persona" : "shoe-fanatic"
}
34
Example
Images from Target.com
Can cross-sell
based on
current page
Also to the real
person
35
Many Personalization Techniques to Mix & Match
• Related content
• Content history
• Next best offer
• Trigger-based
• Threshold
• Last behavior
• Time & event
• Offer matching
• Filter-based
• Crowd-sourcing
• Voice of customer
• User-directed
• Persona matching
Source: http://semphonic.blogs.com/semangel/2014/03/strategies-for-personalization-delivering-an-extra-unexpected-treat-.html
36
Alternatives Give Fewer Capabilities
Activity Logs
Customer Profiles
(no activity)
Application
Option - separate weblogs
Customer Profiles
with Activity Tracking
Application
Better option
Tag with Persona
Marketing
Clustering &
Analytics
Can market:
• On activity today
• With rich & specific
queries
37
Better Option Enables Real-time Persona Matching
1. Profile created
2. Enrich with public data
3. Capture activity
4. Clustering analysis
5. Define Personas
6. Tag with personas
7. Personalize interactions
Batch analytics
Public data
Can even match customer
to a persona while
customer is engaged
Logic is to calculate the
distance to each cluster
center and tag with the
closest one’s persona
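The distance logic described above can be sketched in a few lines. The center vectors below are made up, Euclidean distance is one reasonable choice rather than the only one, and the function names are hypothetical.

```javascript
// Euclidean distance between two equal-length vectors.
function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

// centers: [{ name, vector, persona }]; returns the persona of the closest center.
function matchPersona(vector, centers) {
  let best = null;
  let bestDistance = Infinity;
  for (const center of centers) {
    const distance = euclideanDistance(vector, center.vector);
    if (distance < bestDistance) {
      best = center;
      bestDistance = distance;
    }
  }
  return best ? best.persona : null;
}

// Illustrative centers, as if produced by an earlier batch clustering run.
const personaCenters = [
  { name: "C1", vector: [11.5, 0, 9.5, 0, 0.5, 2.5], persona: "shoe-fanatic" },
  { name: "C2", vector: [0, 5.5, 0, 0.5, 9.5, 0], persona: "bargain-hunter" },
];
console.log(matchPersona([11, 0, 10, 0, 1, 3], personaCenters)); // shoe-fanatic
```

Only the cluster centers need to be in memory, so this check is cheap enough to run while the customer is still browsing.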
Summary
39
Personalization Pays – ROI Impact
40
High Level Personalization Process
1. Profile created
2. Enrich with public data
3. Capture activity
4. Clustering analysis
5. Define Personas
6. Tag with personas
7. Personalize interactions
Batch analytics
Public data
Common
technologies
• R
• Hadoop
• Spark
• Python
• Java
• Many other
options
4 & 5 performed
much less often
than tagging
41
Big Data Analytics Track
1. Driving Personalized Experiences Using Customer Profiles
2. Leveraging Customer Behavior to Enhance Relevancy in
Personalization
3. Machine Learning to Engage the Customer, with Apache Spark,
IBM Watson, and MongoDB
Big Data Analytics 1: Driving Personalized Experiences Using Customer Profiles

2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Big Data Analytics 1: Driving Personalized Experiences Using Customer Profiles

  • 1. Driving Personalized Experiences Using Customer Profiles Matt Kalan Sr. Solution Architect MongoDB, Inc. @matthewkalan matt.kalan@mongodb.com
  • 2. Big Data Analytics Track 1. Driving Personalized Experiences Using Customer Profiles 2. Leveraging Customer Behavior to Enhance Relevancy in Personalization 3. Machine Learning to Engage the Customer, with Apache Spark, IBM Watson, and MongoDB
  • 3. Agenda For This Session 1. Benefits of Personalization 2. High level process 3. Data capture steps 4. Data analysis steps 5. Real-time personalization 6. Summary 7. Q&A
  • 4. You Notice When Content is Personalized When it looks like this outside Left: from www.johnbyronkuhner.com via Google Images Right: from www.steinmart.com via Google Images Is this the best ad to show you?
  • 5. Or Better This When it looks like this outside Left: from www.johnbyronkuhner.com via Google Images Right: www.linkedin.com/pulse/20140729161519-34678510-take-note-time-to-move-beyond-personalization-to-contextualization More relevant
  • 6. Personalization Pays – Conversion Rates
  • 8. High Level Personalization Process 1. Profile created 2. Enrich with public data 3. Capture activity 4. Clustering analysis 5. Define Personas 6. Tag with personas 7. Personalize interactions Batch analytics Public data Common technologies • R • Hadoop • Spark • Python • Java • Many other options 4 & 5 performed much less often than tagging
  • 9. Why MongoDB for Personalization? • Document model => customer profiles are rich structures perfect for documents • High throughput => profiles are read/written on every page, so high performance is critical • High scalability => high performance must scale easily for any data size & request volume • Rich querying & indexes => often only portions of the profile are queried, and ad hoc marketing especially requires rich querying capabilities. Geospatial indexes are critical for mobile • Real-time analytics => can analyze directly on MongoDB or prepare aggregated results for external analysis with the aggregation framework • Strong consistency => want profile changes & tracking to take effect immediately • Hadoop/Spark integration => can run distributed analytics on data in MongoDB or copy it to HDFS to run there, both with the MongoDB Hadoop Connector • Low TCO => low-cost enterprise software license, commodity hardware, & management
  • 10. Customer Example: Scratchpad • Records all activity in researched trips • Needed – Document model – Dynamic schema – Rich querying – Easy scaling
  • 11. And Many Other Customers Personalizing with MongoDB • Sailthru • Sitecore • Adobe (AEM) • Expedia • ADP • Foursquare • Otto • Chico’s and 100s more…
  • 13. Anonymous user Might just start with this if no cookie { "ipAddress" : "216.58.219.238", "referrer" : "google.com" } Pretty useless, right?
  • 14. More Than Just What You Collect IP Address Referrer Information Broker Location Company Weather Avg Income Interests Possible Interests e.g. Kay Jewelers, Dick’s Sporting Goods Budget Indication e.g. Barney’s Search term
  • 15. Often User Creates a Profile { "_id" : ObjectId("553ea57b588ac9ef066428e1"), "ipAddress" : "216.58.219.238", "referrer" : "kay.com", "firstName" : "John", "lastName" : "Doe", "email" : "johndoe@gmail.com" }
  • 16. Even Email Unlocks Useful Info
  • 17. Available Early in Relationship { "_id" : ObjectId("553e7dca588ac9ef066428e0"), "firstName" : "John", "lastName" : "Doe", "address" : "229 W. 43rd St.", "city" : "New York", "state" : "NY", "zipCode" : "10036", "age" : 30, "email" : "johndoe@gmail.com", "gender" : "male" }
  • 18. Often Users Even Volunteer Preferences
  • 19. Easy to Store in Profile { "_id" : ObjectId("553e7dca588ac9ef066428e0"), "firstName" : "John", "lastName" : "Doe", "address" : "229 W. 43rd St.", "city" : "New York", "state" : "NY", "zipCode" : "10036", "age" : 30, "email" : "johndoe@gmail.com", "gender" : "male", "interests" : [ "dumplings", "board games", "rooftop", "ginger beer", "ahi tuna", "healthy food" ] }
  • 20. In Return, User Gets Relevant Info
  • 21. Customer Activity Valuable to Track { "_id" : ObjectId("553e7dca588ac9ef066428e0"), "firstName" : "John", "lastName" : "Doe", "address" : "229 W. 43rd St.", "city" : "New York", "state" : "NY", "zipCode" : "10036", "age" : 30, "email" : "johndoe@gmail.com", "gender" : "male", ... "visitedCounts" : { "watches" : 3, "shirts" : 1, "sunglasses" : 1, "bags" : 2 } } From gilt.com
  • 22. Purchases Are Usually Even More Valuable { "_id" : ObjectId("553e7dca588ac9ef066428e0"), "firstName" : "John", "lastName" : "Doe", "address" : "229 W. 43rd St.", "city" : "New York", "state" : "NY", "zipCode" : "10036", "age" : 30, "email" : "johndoe@gmail.com", "gender" : "male", ... "purchases" : [ { "id" : 1, "desc" : "Power Oxford Dress Shoe", "category" : "Mens shoes" }, { "id" : 2, "desc" : "Striped Sportshirt", "category" : "Mens shirts" } ] } From gilt.com
  • 23. Data Capture – Simple to Sophisticated { "_id" : ObjectId("553e7dca588ac9ef066428e0"), "firstName" : "John", "lastName" : "Doe", "address" : "229 W. 43rd St.", "city" : "New York", "state" : "NY", "zipCode" : "10036", "age" : 30, "email" : "john.doe@mongodb.com", "twitterHandle" : "johndoe", "gender" : "male", "interests" : [ "electronics", "basketball", "weightlifting", "ultimate frisbee", "traveling", "technology" ], "visitedCounts" : { "watches" : 3, "shirts" : 1, "sunglasses" : 1, "bags" : 2 }, "purchases" : [ { "id" : 1, "desc" : "Power Oxford Dress Shoe", "category" : "Mens shoes" }, { "id" : 2, "desc" : "Striped Sportshirt", "category" : "Mens shirts" } ] } Additional behavior tracking • How long on each page (e.g. publishing)? • What is the reaction to pop-up promotions? • Looks at cross-sold items on page? • What categories are clicked on? • Does a certain price point drive buying? • Purchases at certain times of year?
  • 25. Clustering Overview • Think of each of your customers or users of your site as a data point • How can we group users into like sets, so they can be treated similarly for marketing, cross-sell, etc.? • K-means is a common algorithm for clustering Image from: http://pypr.sourceforge.net/kmeans.html Clustered Data Original Unclustered Data
  • 26. Clustering Process for Personalization Customer Profile Documents Map to Vectors [1, 3, 0, …] Clustering Algo Vectors Iterate on inputs Define Personas Clusters of customers Update profiles with persona Tag Profiles with Personas Clusters of customers
  • 27. Mapping Profile to Vector Input { "_id" : ObjectId("553e7dca588ac9ef066428e0"), "firstName" : "John", ... "visitedCounts" : { "Mens watches" : 3, "Mens shirts" : 1, "Mens sunglasses" : 1, "Mens bags" : 2 }, "purchases" : [ { "id" : 1, "desc" : "Power Oxford Dress Shoe", "category" : "Mens shoes" }, { "id" : 2, "desc" : "Striped Sportshirt", "category" : "Mens shirts" } ] } Mens shirts Mens pants Mens shoes Mens ties Mens Sunglass Mens Watch … 11 0 10 0 1 3 [ 11, 0, 10, 0, 1, 3, ...] (example vector) e.g. 1 purchase = 10 visited counts
  • 28. Aggregation Framework for Filtering Profiles /* Adds up the visited counts (vc) and purchases to filter out those below 20 counts; purchases defaults to [] so that $size does not fail on profiles without purchases */ db.profiles.aggregate( [ {$project: { vc: "$vc", purchases: "$purchases", total: {$add: [ {$ifNull: ["$vc.mShirts", 0]}, {$ifNull: ["$vc.mPants", 0]}, {$ifNull: ["$vc.mShoes", 0]}, {$ifNull: ["$vc.mTies", 0]}, {$ifNull: ["$vc.mSunglass", 0]}, {$ifNull: ["$vc.mWatch", 0]}, {$ifNull: ["$vc.mBags", 0]}, {$multiply: [ {$size: {$ifNull: ["$purchases", []]}}, 10 ]} ]} } }, {$match: {total: {$gte: 20}} } ])
  • 29. Input/Output for K-Means Algo Clustering Algo Iterate on inputs Clusters of customers Vectors: [ [11, 0, 10, 0, 1, 3, ...], [ 0, 5, 10, 3, 0, 0, ...], ... ] K = # of clusters (driven by marketing effort or data analysis) N = # of iterations { Centers: [ {name: C1, vector: [..]}, {name: C2, vector: [..]}, ... ], Clusters: [ {C1: [[11, 0, 10, 0, 1, 3, ...], ...]}, {C2: [[ 0, 5, 0, 0, 10, 0, ...], ...]}, ... ] } Vectors
  • 30. Choosing Personas • Each cluster would usually map to one persona you can identify, name, and target • Common to name personas to be memorable, e.g. shoe fanatic, bargain hunter, researcher, etc. Clustered Data Original Unclustered Data C1 C2 C3 Shoe Fanatic?
  • 31. Mapping Customer Profile to Persona { Centers: [ {name: C1, vector: [..]}, {name: C2, vector: [..]}, ... ], Clusters: [ {C1: [[11, 0, 10, 0, 1, 3, ...], ...]}, {C2: [[ 0, 5, 0, 0, 10, 0, ...], ...]}, ... ] } { "_id" : ObjectId("553e7dca588ac9ef066428e0"), "firstName" : "John", ... "visitedCounts" : { "Mens watches" : 3, "Mens shirts" : 1, "Mens sunglasses" : 1, "Mens bags" : 2 }, "purchases" : [ { "id" : 1, "desc" : "Power Oxford Dress Shoe", "category" : "Mens shoes" }, { "id" : 2, "desc" : "Striped Sportshirt", "category" : "Mens shirts" } ], "persona" : "shoe-fanatic" } Loop through each vector in cluster, map to customer, and tag customer with persona
  • 33. Easier with a Rich Customer Profile to Personalize { "_id" : ObjectId("553e7dca588ac9ef066428e0"), "firstName" : "John", "lastName" : "Doe", "address" : "229 W. 43rd St.", "city" : "New York", "state" : "NY", "zipCode" : "10036", "age" : 30, "email" : "john.doe@mongodb.com", "twitterHandle" : "johndoe", "gender" : "male", "interests" : [ "electronics", "basketball", "weightlifting", "ultimate frisbee", "traveling", "technology" ], "visitedCounts" : { "watches" : 3, "shirts" : 1, "sunglasses" : 1, "bags" : 2 }, "purchases" : [ { "id" : 1, "desc" : "Power Oxford Dress Shoe", "category" : "Mens shoes" }, { "id" : 2, "desc" : "Striped Sportshirt", "category" : "Mens shirts" } ], "persona" : "shoe-fanatic" }
  • 34. Example Images from Target.com Can cross-sell based on current page Also to the real person
  • 35. Many Personalization Techniques to Mix & Match • Related content • Content history • Next best offer • Trigger-based • Threshold • Last behavior • Time & event • Offer matching • Filter-based • Crowd-sourcing • Voice of customer • User-directed • Persona matching Source: http://semphonic.blogs.com/semangel/2014/03/strategies-for-personalization-delivering-an-extra-unexpected-treat-.html
  • 36. Alternatives Give Less Capabilities Activity Logs Customer Profiles (no activity) Application Option - separate weblogs Customer Profiles with Activity Tracking Application Better option Tag with Persona Marketing Clustering & Analytics Can market: • On activity today • With rich & specific queries
  • 37. Better Option Enables Real-time Persona Matching 1. Profile created 2. Enrich with public data 3. Capture activity 4. Clustering analysis 5. Define Personas 6. Tag with personas 7. Personalize interactions Batch analytics Public data Can even match customer to a persona while customer is engaged Logic is to calculate the distance to each cluster center and tag with the closest one’s persona
  • 40. High Level Personalization Process 1. Profile created 2. Enrich with public data 3. Capture activity 4. Clustering analysis 5. Define Personas 6. Tag with personas 7. Personalize interactions Batch analytics Public data Common technologies • R • Hadoop • Spark • Python • Java • Many other options 4 & 5 performed much less often than tagging
  • 41. Big Data Analytics Track 1. Driving Personalized Experiences Using Customer Profiles 2. Leveraging Customer Behavior to Enhance Relevancy in Personalization 3. Machine Learning to Engage the Customer, with Apache Spark, IBM Watson, and MongoDB
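
The mapping on slide 27 (profile document to input vector) can be sketched in plain JavaScript. This is only an illustration, not the speaker's code: the function name `profileToVector`, the constant names, and the fixed category order are assumptions, and the category list stops where the slide's elided list ("…") stops, so "Mens bags" is not part of the vector here. The weight of 10 comes from the slide's "1 purchase = 10 visited counts".

```javascript
// Assumed category order for the vector; names follow the profile keys on slide 27.
const CATEGORIES = [
  "Mens shirts", "Mens pants", "Mens shoes",
  "Mens ties", "Mens sunglasses", "Mens watches",
];

// From the slide: 1 purchase counts as much as 10 visits.
const PURCHASE_WEIGHT = 10;

// Hypothetical helper: one vector entry per category,
// visits + PURCHASE_WEIGHT * purchases in that category.
function profileToVector(profile, categories = CATEGORIES) {
  const visited = profile.visitedCounts || {};
  const purchases = profile.purchases || [];
  return categories.map((category) => {
    const visits = visited[category] || 0;
    const bought = purchases.filter((p) => p.category === category).length;
    return visits + PURCHASE_WEIGHT * bought;
  });
}

// The example profile from the slide (identifying fields omitted).
const john = {
  firstName: "John",
  visitedCounts: { "Mens watches": 3, "Mens shirts": 1, "Mens sunglasses": 1, "Mens bags": 2 },
  purchases: [
    { id: 1, desc: "Power Oxford Dress Shoe", category: "Mens shoes" },
    { id: 2, desc: "Striped Sportshirt", category: "Mens shirts" },
  ],
};

console.log(profileToVector(john)); // [ 11, 0, 10, 0, 1, 3 ]
```

The result matches the slide's example vector: one shirt visit plus one shirt purchase gives 11, and the single shoe purchase gives 10.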

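Slide 37 describes real-time persona matching as calculating the distance to each cluster center and tagging with the closest one's persona. A minimal sketch of that logic follows, assuming Euclidean distance (the slides do not name a metric). The center vectors, the `persona` field on each center, and the function names are hypothetical.

```javascript
// Euclidean distance between two equal-length numeric vectors (assumed metric).
function distance(a, b) {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

// Returns the persona of the center closest to the vector, or null if there are no centers.
function closestPersona(vector, centers) {
  let best = null;
  let bestDistance = Infinity;
  for (const center of centers) {
    const d = distance(vector, center.vector);
    if (d < bestDistance) {
      best = center;
      bestDistance = d;
    }
  }
  return best ? best.persona : null;
}

// Hypothetical cluster centers, as a batch clustering run might produce them.
const centers = [
  { name: "C1", persona: "shoe-fanatic", vector: [8, 1, 12, 0, 1, 2] },
  { name: "C2", persona: "bargain-hunter", vector: [1, 6, 2, 3, 0, 0] },
];

const persona = closestPersona([11, 0, 10, 0, 1, 3], centers);
console.log(persona); // shoe-fanatic
```

The resulting tag could then be stored on the profile, for example with an update along the lines of `db.profiles.updateOne({ _id: customerId }, { $set: { persona: persona } })`, where `customerId` stands in for the profile's `_id`.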
Editor's Notes

  1. P2P10 Driving Personalized Experiences Using Customer Profiles This session covers the end-to-end process of personalization and demonstrates a great example of combining operational data for an application in MongoDB with the ability to analyze that data and operationalize the results. We will discuss storing rich customer profiles in MongoDB, using clustering to develop a customer segmentation, and leveraging that as a filter for valuable personalization of your application. You'll walk away with a good idea of how to drive targeted experiences to customers for more relevant engagement and how personalization is accessible to companies large and small.
  2. This session is a broad end-to-end overview; the next 2 sessions go deeper. The goal is for everyone here to believe that personalization is achievable to build into their applications.
  3. Explain who did the survey and who was asked the questions. It is actually easy to get value incrementally by starting small and adding more complex personalization.
  4. It is actually easy to get value incrementally by starting small and adding more complex personalization.
  5. Mention that other parts of the track will cover the technologies used for batch analytics.
  6. 70% of marketers said user preferences give high ROI.
  7. 68% said user behavior.
  8. Point out that schema design might differ depending on requirements and on how the profile info is used. You would probably have a separate collection for order info, with the relevant info also stored in the profile.
  9. The 2 dimensions might be how many shoes were bought vs. how many tops (forgetting the axes). In reality there can be many more dimensions.
  10. Might filter out any counts lower than 20 or some other number, to run only on customers with enough information (frequent customers). Could have a different part of the vector for purchasing.
  11. There is a choice of what vectors to send. Might just choose counts larger than e.g. 5, or only those customers with at least 20 counts, because you judge you have enough samples.
  12. Marketing might decide they want to focus on 5 personas to start, or through data analysis you might find that one cluster exhibits very different behavior within it and you want to break it up. (Could mention the technology products that can be used for clustering, e.g. Spark, ML, language libraries.)
  13. Explain how k-means works at a high level: it iteratively moves the centers to define the nearby clusters. E.g. if the two axes were shoes vs. clothes, then green might be high-frequency buyers of everything, red high shoe buyers, and blue buyers of little of everything. Might name each persona by its cluster center, especially focusing on how it differs from the other cluster centers. Over time, you would learn whether these personas are stable or change frequently, in which case you might not focus on those, e.g. patterns in the month before Christmas (buying patterns are very different).
  14. A lot of work just for that little tag, but that tag represents a fast way to characterize that person and add to personalization rules.
  15. Even counts, and therefore a persona, are very helpful. A good problem to have is too much information to personalize with: start simple, measure, and add.
  16. Great juxtaposition of two approaches. Even though I’m looking at a woman’s dress, it uses Featured Products to market to me personally. Other sections cover items related to this dress, so it is the best of both worlds. Featured products could be items commonly bought by my persona, or trending today by persona. More advanced: could track products selling by persona today, or figure out whether things are gifts (e.g. clothes for women when I’m a man).
  17. Many of these are useful by themselves, but many are made better when you add a persona. Beyond just personalizing from customer profiles, rules-based techniques can suggest based on: what is already in the cart; what page was visited for a while; the weather in the area this weekend; whether the customer responds to discounts.
  18. Can do ad hoc marketing & promotions, e.g.: who looks at the swimwear or shoe category a lot; who shopped last year on Black Friday; who shopped a lot right before spring last year; who bought a suit and bought or looked at ties.
  19. Most importantly, can identify a persona while the person is shopping (once they have browsed enough) instead of waiting until the next time they come to your app. Mention that other parts of the track will cover the technologies used for batch analytics.
  20. It is actually easy to get value incrementally by starting small and adding more complex personalization.
  21. Mention that other parts of the track will cover the technologies used for batch analytics.
  22. If time, can ask people what algorithms they are using for personalization.
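
Note 13 describes k-means as iteratively moving the centers to define the nearby clusters. A minimal JavaScript sketch of that loop is below. It is a teaching illustration under stated assumptions, not the tooling the talk refers to: all function names are hypothetical, the initial centers are passed in explicitly (so K is their count and the run is deterministic), and the number of iterations is fixed, matching the K and N inputs on slide 29.

```javascript
// Squared Euclidean distance; enough for comparing which center is nearer.
function squaredDistance(a, b) {
  return a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0);
}

// Index of the center nearest to the vector.
function nearestIndex(vector, centers) {
  let best = 0;
  for (let i = 1; i < centers.length; i++) {
    if (squaredDistance(vector, centers[i]) < squaredDistance(vector, centers[best])) {
      best = i;
    }
  }
  return best;
}

// Component-wise mean of a non-empty list of vectors.
function mean(vectors) {
  return vectors[0].map((_, d) => vectors.reduce((s, v) => s + v[d], 0) / vectors.length);
}

// K = initialCenters.length, N = iterations.
function kMeans(vectors, initialCenters, iterations) {
  let centers = initialCenters.map((c) => c.slice());
  for (let iter = 0; iter < iterations; iter++) {
    // Assignment step: attach every vector to its nearest center.
    const assigned = vectors.map((v) => nearestIndex(v, centers));
    // Update step: move each center to the mean of its members (keep it if it has none).
    centers = centers.map((center, k) => {
      const members = vectors.filter((_, i) => assigned[i] === k);
      return members.length > 0 ? mean(members) : center;
    });
  }
  // Final assignment against the returned centers.
  const assignments = vectors.map((v) => nearestIndex(v, centers));
  return { centers, assignments };
}

// Two made-up dimensions (shoes, tops), as in note 9.
const result = kMeans(
  [[10, 1], [12, 0], [11, 2], [1, 9], [0, 11], [2, 10]],
  [[10, 1], [1, 9]],
  5
);
console.log(result.centers);     // [ [ 11, 1 ], [ 1, 10 ] ]
console.log(result.assignments); // [ 0, 0, 0, 1, 1, 1 ]
```

In practice the batch step would more likely run in one of the technologies listed on the process slide (R, Hadoop, Spark, Python, Java), with random or heuristic initial centers and a convergence check instead of a fixed iteration count.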