Audience segmentation with machine learning
Richard Lawrence
Rise at Seven
@richlawre
About me
SEO background, studying for a Data Science degree in my spare time.
Follow me on Twitter: @richlawre
What we’re going to cover
The agenda
1. A bit of context about machine learning
2. An overview of how audience segmentation works
3. Some detail about how to do it
4. How to take things further
A bit of context
What is supervised machine learning?
It learns with labelled data
What is unsupervised machine learning?
It finds patterns in data
Audience segmentation in a nutshell
We extract data about individual sessions from web analytics
Extracting the data
Instead of grouping sessions by channel or section...
CHANNEL SESSIONS TRANSACTIONS REVENUE
Organic search 1000 50 £12,000
Paid search 700 30 £3,000
Direct 500 25 £6,000
Referral 300 30 £4,000
...we extract details about individual sessions
SESSION ID PAGEVIEWS TIME PER PAGE REVENUE
Session 1 7 30 seconds £77.50
Session 2 10 20 seconds £27.50
Session 3 5 23 seconds £36.50
Session 4 8 18 seconds £45.30
We then use unsupervised machine learning to find interesting patterns
Finding patterns
Instead of analysing sessions grouped together in some way...
...we use machine learning to find patterns in user behaviour.
[Figure: sessions plotted by pageviews and transaction revenue, grouped into Audience 1, Audience 2 and Audience 3]
This results in actionable audience segments
The Gatherer
Example segment from a car manufacturer
Least time per page
Highest number of pages viewed
Highest number of conversions per session
Most likely to download a brochure
Likely onsite journey
Landing section: Homepage
Second section: Car Models
Exit section: Car Models
Description:
The Gatherer comes directly to the homepage, visits multiple car model pages and downloads a brochure for each to look at offline later.
Example CRO Test:
Link to a model comparison table from the homepage, with an option to download a brochure for each model.
The Skipper
Example segment from a train operator
Slightly more time per page than average
More likely to buy in the evening or at night
Fewest days since last session
Fewest pages per visit
Over-indexes for visiting via tablet
Over-indexes for visiting via email
Description:
The Skipper has likely already done their travel research (around when & where to travel) multiple times without buying and is simply returning - likely at the last minute - to finally finish the task.
Example CRO Test:
Use a cookie to add a banner to the homepage that takes a returning user back to where they left off in the transaction process.
Why do you need to do this?
1. Find behaviours you may not have realised existed
2. Generate test hypotheses for CRO
3. Track behaviours over time (more about this later)
How to do it
The key steps
1. Extract the data
2. Process the data
3. Select features
4. Cluster the data
5. Manually explore the segments
1. Extracting the data
Using Google Analytics API
Extract by Session ID or Client ID
Useful dimensions:
landingPagePath
secondPagePath
exitPagePath
Useful metrics:
pageviewsPerSession
revenuePerTransaction
goalXXCompletions
There is a limit on the number of metrics/dimensions: 10
There is also a limit on the number of rows per call: 25,000
The answer is to loop over days, metrics and dimensions, then merge!
https://www.jcchouinard.com/google-analytics-api-using-python/
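The loop-and-merge idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Google Analytics client itself: `fetch_report` is a hypothetical stand-in that returns fake rows, `session_id` assumes a session identifier is available in every call (for example through a custom dimension), and the field names are made up. Each call carries the key plus up to nine other fields so it stays within the limit of 10 quoted above.

```python
from datetime import date, timedelta

import pandas as pd

KEY = "session_id"          # assumed join key, requested in every call
MAX_FIELDS_PER_CALL = 10    # limit quoted on the slide


def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def days_between(start, end):
    """Yield every date from start to end inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


def fetch_report(day, fields):
    """Hypothetical stand-in for one API call: fake rows for two sessions."""
    rows = []
    for n in (1, 2):
        row = {KEY: f"{day.isoformat()}-{n}"}
        row.update({field: n for field in fields})
        rows.append(row)
    return pd.DataFrame(rows)


def extract(start, end, fields):
    """Loop over days and field batches, merge batches on the key, stack days."""
    daily_frames = []
    for day in days_between(start, end):
        merged = None
        # Reserve one slot per call for the join key.
        for batch in chunk(fields, MAX_FIELDS_PER_CALL - 1):
            part = fetch_report(day, batch)
            merged = part if merged is None else merged.merge(part, on=KEY)
        daily_frames.append(merged)
    return pd.concat(daily_frames, ignore_index=True)


fields = [f"field_{i}" for i in range(1, 13)]   # 12 made-up field names
sessions = extract(date(2021, 7, 1), date(2021, 7, 3), fields)
print(sessions.shape)
```

Looping one day at a time also keeps each call small, which helps with the row limit; a real version would still need to page through results whenever a single day exceeds it.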
Using BigQuery
Data is nested - I’ve found it makes things more difficult at the session level
However, it is possible to do and there is some great information around
Can also run the unsupervised machine learning algorithm directly in SQL
Previously used 1M sessions with Python & Google Colab - BigQuery wasn’t necessary
Choose days at random to ensure variation
https://adswerve.com/blog/google-analytics-queries-in-bigquery-part-two-users-sessions-unnesting-hits/
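Choosing days at random could look something like the sketch below. The date range, the sample size of 14 and the fixed seed are all assumptions for illustration; the seed is only there so the same sample can be reproduced later.

```python
import random
from datetime import date, timedelta


def sample_days(start, end, k, seed=42):
    """Return k distinct days chosen at random between start and end inclusive."""
    all_days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    rng = random.Random(seed)   # fixed seed so the sample is repeatable
    return sorted(rng.sample(all_days, k))


days = sample_days(date(2021, 1, 1), date(2021, 6, 30), k=14)
print(days[0], days[-1], len(days))
```

Sampling across several months, rather than taking one consecutive block, spreads the extract over weekdays, weekends and seasonal peaks.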
2. Processing the data
Useful data transformations
Change hours of the day to morning, afternoon, evening, night
Change days to weekday & weekend
SESSION ID DAY DAY TYPE
Session 1 Monday Weekday
Session 2 Tuesday Weekday
Session 3 Saturday Weekend
Session 4 Wednesday Weekday
Change pages to sections
Combine certain conversion points
Here is a useful link on how to do find and replace in Python & Pandas
You could use Google DataPrep instead
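The transformations listed above might be sketched in Pandas like this. The column names, the example paths, the hour cut-offs for each part of the day and the rule that a section is the first path segment are all assumptions made for illustration.

```python
import pandas as pd

sessions = pd.DataFrame({
    "session_id": ["Session 1", "Session 2", "Session 3", "Session 4"],
    "hour": [9, 14, 20, 2],
    "day": ["Monday", "Tuesday", "Saturday", "Wednesday"],
    "landing_page": ["/models/hatchback/specs", "/", "/offers/summer", "/models/suv"],
    "brochure_downloads": [1, 0, 0, 2],
    "test_drive_requests": [0, 0, 1, 1],
})


def part_of_day(hour):
    """Map an hour (0-23) to a coarse label; the cut-offs are an assumption."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 23:
        return "evening"
    return "night"


def section(path):
    """Use the first path segment as the section; '/' becomes 'homepage'."""
    first = path.strip("/").split("/")[0]
    return first or "homepage"


sessions["part_of_day"] = sessions["hour"].map(part_of_day)
sessions["day_type"] = sessions["day"].isin(["Saturday", "Sunday"]).map(
    {True: "Weekend", False: "Weekday"}
)
sessions["landing_section"] = sessions["landing_page"].map(section)
# Combine related conversion points into a single count.
sessions["conversions"] = (
    sessions["brochure_downloads"] + sessions["test_drive_requests"]
)
print(sessions[["session_id", "part_of_day", "day_type", "landing_section", "conversions"]])
```

Coarser categories like these mean fewer, better-populated columns once the data is one hot encoded.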
One hot encoding
Converts categories to 1s & 0s.
SESSION ID CHANNEL
Session 1 Organic Search
Session 2 Paid Search
Session 3 Direct
Session 4 Direct
SESSION ID ORGANIC SEARCH PAID SEARCH DIRECT
Session 1 1 0 0
Session 2 0 1 0
Session 3 0 0 1
Session 4 0 0 1
The values aren’t increasing, so they don’t skew the clustering algorithm
Use for numerical as well as categorical data
See here for how to do it with Python
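As a small sketch of the table above, `pandas.get_dummies` produces one 0/1 column per channel. The column names here are assumptions; Pandas prefixes each new column with the original column name and orders them alphabetically.

```python
import pandas as pd

sessions = pd.DataFrame({
    "session_id": ["Session 1", "Session 2", "Session 3", "Session 4"],
    "channel": ["Organic Search", "Paid Search", "Direct", "Direct"],
})

# One column per channel, holding 1 where the session used it and 0 elsewhere.
encoded = pd.get_dummies(sessions, columns=["channel"], dtype=int)
print(encoded)
```

Encoding channels as 1, 2, 3 instead would imply an order and a distance between them that does not exist, which is the skew the slide warns about.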
3. Selecting features
Best subset regression
Choose desired response variable & find potential explanatory variables
Runs regression analysis for combinations of variables at once to find correlation
This will help you narrow down features to find useful patterns within
See Python walkthrough here
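A minimal sketch of best subset regression, assuming ordinary least squares scored by adjusted R-squared (one common choice, not necessarily the one used in the talk). The data is synthetic: revenue is the response variable and is built from pageviews and time per page, while `is_weekend` is an unrelated column.

```python
from itertools import combinations

import numpy as np


def adjusted_r2(X, y):
    """Fit ordinary least squares with an intercept and return adjusted R-squared."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def best_subset(features, y):
    """Score every non-empty combination of columns and return (score, names)."""
    names = list(features)
    scored = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            X = np.column_stack([features[name] for name in subset])
            scored.append((adjusted_r2(X, y), subset))
    return max(scored)


rng = np.random.default_rng(0)
n = 200
features = {
    "pageviews": rng.integers(1, 20, n).astype(float),
    "time_per_page": rng.uniform(5, 60, n),
    "is_weekend": rng.integers(0, 2, n).astype(float),
}
# Made-up response: revenue driven by pageviews and time per page, plus noise.
revenue = (
    3.0 * features["pageviews"]
    + 0.5 * features["time_per_page"]
    + rng.normal(0, 1, n)
)

score, chosen = best_subset(features, revenue)
print(round(score, 3), chosen)
```

The number of combinations doubles with every extra column, so an exhaustive search like this is only practical for a modest number of candidate features.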
4. Clustering the data
Principal Component Analysis
Transforms a large set of variables into a smaller one without much loss of information
See walkthrough here.
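A rough sketch of Principal Component Analysis with scikit-learn, on synthetic data where six observed columns are driven by two hidden behaviours. The data, the choice of two components and the standardisation step are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic sessions: six observed columns driven by two hidden behaviours.
hidden = rng.normal(size=(300, 2))
mixing = np.array([
    [1.0, 0.5, 0.0, 2.0, 1.0, -1.0],
    [0.0, 1.0, 1.0, -1.0, 0.5, 1.0],
])
features = hidden @ mixing + rng.normal(scale=0.01, size=(300, 6))

# Standardise first so no column dominates purely because of its units.
scaled = StandardScaler().fit_transform(features)

pca = PCA(n_components=2)
reduced = pca.fit_transform(scaled)

print(reduced.shape)
print(pca.explained_variance_ratio_.sum())
```

On real session data the explained variance will be lower than in this contrived case, so it is worth checking `explained_variance_ratio_` before settling on the number of components.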
Using a KMeans algorithm
The
unsupervised
machine learning
algorithm to find
patterns
@richlawre
Using a KMeans algorithm
See a full
walkthrough here
with Python.
@richlawre
Using a KMeans algorithm
You can also do this
directly in BigQuery.
@richlawre
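A minimal sketch of the clustering step with scikit-learn's `KMeans`. The three behaviour groups are synthetic, and the two columns (pageviews and revenue), the scaling step and the choice of three clusters are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Three made-up behaviour groups: (pageviews, revenue) around different centres.
centres = np.array([[3.0, 10.0], [8.0, 40.0], [15.0, 90.0]])
features = np.vstack([
    centre + rng.normal(scale=[0.5, 2.0], size=(50, 2)) for centre in centres
])

# Scale so revenue does not outweigh pageviews in the distance calculation.
scaled = StandardScaler().fit_transform(features)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(scaled)

# Average behaviour per segment, back in the original units.
for segment in range(3):
    print(segment, features[labels == segment].mean(axis=0).round(1))
```

Printing the per-segment averages in the original units is a starting point for the manual exploration step: it shows what each numbered cluster actually looks like.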
Using a silhouette score
Way of finding the optimum number of clusters
Optimal number is at the elbow in the graph - not much gain after this
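One way to compare cluster counts is sketched below, using scikit-learn's `silhouette_score` on synthetic data with three well-separated groups. The sketch simply picks the highest score, which is an assumption about how to read the graph; plotting the scores and judging where the gains level off, as the slide describes, uses the same numbers.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
features = np.vstack([
    centre + rng.normal(scale=0.5, size=(50, 2)) for centre in centres
])

# Score each candidate number of clusters; silhouette needs at least 2.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
    scores[k] = silhouette_score(features, labels)

best_k = max(scores, key=scores.get)
for k, score in scores.items():
    print(k, round(score, 3))
print("best k:", best_k)
```

Real session data rarely separates this cleanly, so the scores are a guide rather than a verdict and the segments still need checking by hand.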
5. Always manually explore the segments!
Taking it to the next level
Classify any future session
Use the labelled data to train a supervised machine learning algorithm - we use deep learning
The better defined your segments, the better this will perform
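The idea of turning cluster labels into training data can be sketched as below. Note the swap: the talk uses deep learning, whereas this sketch uses scikit-learn's `LogisticRegression` as a simpler stand-in, and the sessions are synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
features = np.vstack([
    centre + rng.normal(scale=0.5, size=(100, 2)) for centre in centres
])

# Step 1: cluster past sessions; the cluster ids become the labels.
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)

# Step 2: train a supervised model to reproduce those labels.
X_train, X_test, y_train, y_test = train_test_split(
    features, segments, test_size=0.25, random_state=0, stratify=segments
)
classifier = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = classifier.score(X_test, y_test)

# Step 3: label a session the model has never seen.
new_session = np.array([[9.5, 10.2]])
predicted = classifier.predict(new_session)[0]
print(accuracy, predicted)
```

The held-out accuracy is a quick check on how well defined the segments are: overlapping clusters produce labels that no classifier can reproduce reliably.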
Push the labelled sessions back into Google Analytics via Data Import
Visualise in Streamlit
[Figure: Streamlit visualisation showing CRM segments 1 to 5]
Summary
Unsupervised machine learning finds interesting patterns in data.
Apply this to individual sessions from Google Analytics to create behaviour segments.
This can be a great source of ideas for CRO hypotheses.
There are 5 steps for the analysis: extract, process, feature selection, cluster, manually explore.
You can use Python or other toolsets (Google Cloud) to do the analysis.
You can use the segments to label any future session on the website.
Thanks!

MeasureFest July 2021 - Session Segmentation with Machine Learning

A6 big data_in_the_cloud
A6 big data_in_the_cloudA6 big data_in_the_cloud
A6 big data_in_the_cloud
Dr. Wilfred Lin (Ph.D.)
 
Creating a Single Source of Truth: Leverage all of your data with powerful an...
Creating a Single Source of Truth: Leverage all of your data with powerful an...Creating a Single Source of Truth: Leverage all of your data with powerful an...
Creating a Single Source of Truth: Leverage all of your data with powerful an...
Looker
 
Measuring What Really Matters: Search Engine Metrics & Tracking Tips - David ...
Measuring What Really Matters: Search Engine Metrics & Tracking Tips - David ...Measuring What Really Matters: Search Engine Metrics & Tracking Tips - David ...
Measuring What Really Matters: Search Engine Metrics & Tracking Tips - David ...
Energy Digital Summit
 
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
ALTER WAY
 
Empower customer success at LinkedIn with advanced analytics and great visual...
Empower customer success at LinkedIn with advanced analytics and great visual...Empower customer success at LinkedIn with advanced analytics and great visual...
Empower customer success at LinkedIn with advanced analytics and great visual...
Michael Li
 
Elasticsearch : petit déjeuner du 13 mars 2014
Elasticsearch : petit déjeuner du 13 mars 2014Elasticsearch : petit déjeuner du 13 mars 2014
Elasticsearch : petit déjeuner du 13 mars 2014
ALTER WAY
 
Big Data graph Clustering with Laurence O'Toole - Digital Marketing Show, Nov...
Big Data graph Clustering with Laurence O'Toole - Digital Marketing Show, Nov...Big Data graph Clustering with Laurence O'Toole - Digital Marketing Show, Nov...
Big Data graph Clustering with Laurence O'Toole - Digital Marketing Show, Nov...
Authoritas
 
Rdfa semtech2011
Rdfa semtech2011Rdfa semtech2011
Rdfa semtech2011
Barbara Starr
 
Transitioning to-lean-at-infochimps
Transitioning to-lean-at-infochimpsTransitioning to-lean-at-infochimps
Transitioning to-lean-at-infochimps
Ash Maurya
 
Free Basic SEO Course/Workshop - Anadigme
Free Basic SEO Course/Workshop - AnadigmeFree Basic SEO Course/Workshop - Anadigme
Free Basic SEO Course/Workshop - Anadigme
Joaquin Poggi
 
Mining Google Analytics for Marketing Insights
Mining Google Analytics for Marketing InsightsMining Google Analytics for Marketing Insights
Mining Google Analytics for Marketing Insights
Kash Dhanda
 
Online SEO Meetup
Online SEO MeetupOnline SEO Meetup
Online SEO Meetup
Semrush
 
Frank Bien Opening Keynote - Join 2016
Frank Bien Opening Keynote - Join 2016Frank Bien Opening Keynote - Join 2016
Frank Bien Opening Keynote - Join 2016
Looker
 
Frank Bien Opening Keynote - Join 2016
Frank Bien Opening Keynote - Join 2016Frank Bien Opening Keynote - Join 2016
Frank Bien Opening Keynote - Join 2016
Looker
 
Technical SEO Audit
Technical SEO AuditTechnical SEO Audit
Technical SEO Audit
Outreach Digital
 
Advanced Keyword Research for SEO - Training Deck
Advanced Keyword Research for SEO - Training DeckAdvanced Keyword Research for SEO - Training Deck
Advanced Keyword Research for SEO - Training Deck
Daniel Brooks
 
Search Analytics at Enterprise Search Summit Fall 2011
Search Analytics at Enterprise Search Summit Fall 2011Search Analytics at Enterprise Search Summit Fall 2011
Search Analytics at Enterprise Search Summit Fall 2011
Sematext Group, Inc.
 
Demand quest seo training 1 16x9 10.2018
Demand quest seo training 1 16x9 10.2018Demand quest seo training 1 16x9 10.2018
Demand quest seo training 1 16x9 10.2018
Nate Plaunt
 
SearchLove London | Dave Sottimano, 'Using Data to Win Arguments'
SearchLove London | Dave Sottimano, 'Using Data to Win Arguments' SearchLove London | Dave Sottimano, 'Using Data to Win Arguments'
SearchLove London | Dave Sottimano, 'Using Data to Win Arguments'
Distilled
 
Cross Device Optimisation - Google Analytics Shortcuts
Cross Device Optimisation - Google Analytics ShortcutsCross Device Optimisation - Google Analytics Shortcuts
Cross Device Optimisation - Google Analytics Shortcuts
Craig Sullivan
 

Similar to MeasureFest July 2021 - Session Segmentation with Machine Learning (20)

A6 big data_in_the_cloud
A6 big data_in_the_cloudA6 big data_in_the_cloud
A6 big data_in_the_cloud
 
Creating a Single Source of Truth: Leverage all of your data with powerful an...
Creating a Single Source of Truth: Leverage all of your data with powerful an...Creating a Single Source of Truth: Leverage all of your data with powerful an...
Creating a Single Source of Truth: Leverage all of your data with powerful an...
 
Measuring What Really Matters: Search Engine Metrics & Tracking Tips - David ...
Measuring What Really Matters: Search Engine Metrics & Tracking Tips - David ...Measuring What Really Matters: Search Engine Metrics & Tracking Tips - David ...
Measuring What Really Matters: Search Engine Metrics & Tracking Tips - David ...
 
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
 
Empower customer success at LinkedIn with advanced analytics and great visual...
Empower customer success at LinkedIn with advanced analytics and great visual...Empower customer success at LinkedIn with advanced analytics and great visual...
Empower customer success at LinkedIn with advanced analytics and great visual...
 
Elasticsearch : petit déjeuner du 13 mars 2014
Elasticsearch : petit déjeuner du 13 mars 2014Elasticsearch : petit déjeuner du 13 mars 2014
Elasticsearch : petit déjeuner du 13 mars 2014
 
Big Data graph Clustering with Laurence O'Toole - Digital Marketing Show, Nov...
Big Data graph Clustering with Laurence O'Toole - Digital Marketing Show, Nov...Big Data graph Clustering with Laurence O'Toole - Digital Marketing Show, Nov...
Big Data graph Clustering with Laurence O'Toole - Digital Marketing Show, Nov...
 
Rdfa semtech2011
Rdfa semtech2011Rdfa semtech2011
Rdfa semtech2011
 
Transitioning to-lean-at-infochimps
Transitioning to-lean-at-infochimpsTransitioning to-lean-at-infochimps
Transitioning to-lean-at-infochimps
 
Free Basic SEO Course/Workshop - Anadigme
Free Basic SEO Course/Workshop - AnadigmeFree Basic SEO Course/Workshop - Anadigme
Free Basic SEO Course/Workshop - Anadigme
 
Mining Google Analytics for Marketing Insights
Mining Google Analytics for Marketing InsightsMining Google Analytics for Marketing Insights
Mining Google Analytics for Marketing Insights
 
Online SEO Meetup
Online SEO MeetupOnline SEO Meetup
Online SEO Meetup
 
Frank Bien Opening Keynote - Join 2016
Frank Bien Opening Keynote - Join 2016Frank Bien Opening Keynote - Join 2016
Frank Bien Opening Keynote - Join 2016
 
Frank Bien Opening Keynote - Join 2016
Frank Bien Opening Keynote - Join 2016Frank Bien Opening Keynote - Join 2016
Frank Bien Opening Keynote - Join 2016
 
Technical SEO Audit
Technical SEO AuditTechnical SEO Audit
Technical SEO Audit
 
Advanced Keyword Research for SEO - Training Deck
Advanced Keyword Research for SEO - Training DeckAdvanced Keyword Research for SEO - Training Deck
Advanced Keyword Research for SEO - Training Deck
 
Search Analytics at Enterprise Search Summit Fall 2011
Search Analytics at Enterprise Search Summit Fall 2011Search Analytics at Enterprise Search Summit Fall 2011
Search Analytics at Enterprise Search Summit Fall 2011
 
Demand quest seo training 1 16x9 10.2018
Demand quest seo training 1 16x9 10.2018Demand quest seo training 1 16x9 10.2018
Demand quest seo training 1 16x9 10.2018
 
SearchLove London | Dave Sottimano, 'Using Data to Win Arguments'
SearchLove London | Dave Sottimano, 'Using Data to Win Arguments' SearchLove London | Dave Sottimano, 'Using Data to Win Arguments'
SearchLove London | Dave Sottimano, 'Using Data to Win Arguments'
 
Cross Device Optimisation - Google Analytics Shortcuts
Cross Device Optimisation - Google Analytics ShortcutsCross Device Optimisation - Google Analytics Shortcuts
Cross Device Optimisation - Google Analytics Shortcuts
 

Recently uploaded

Build applications with generative AI on Google Cloud
Build applications with generative AI on Google CloudBuild applications with generative AI on Google Cloud
Build applications with generative AI on Google Cloud
Márton Kodok
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
ElizabethGarrettChri
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
Sachin Paul
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
xclpvhuk
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
v7oacc3l
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Kaxil Naik
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
nuttdpt
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
hyfjgavov
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
VyNguyen709676
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
Bill641377
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
taqyea
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
Social Samosa
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
一比一原版巴斯大学毕业证(Bath毕业证书)学历如何办理
一比一原版巴斯大学毕业证(Bath毕业证书)学历如何办理一比一原版巴斯大学毕业证(Bath毕业证书)学历如何办理
一比一原版巴斯大学毕业证(Bath毕业证书)学历如何办理
y3i0qsdzb
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Fernanda Palhano
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 

MeasureFest July 2021 - Session Segmentation with Machine Learning

  • 1. Audience segmentation with machine learning. Richard Lawrence, Rise at Seven, @richlawre
  • 2. About me: SEO background, studying a Data Science degree in spare time.
  • 3. About me: Follow me on Twitter @richlawre
  • 4. What we’re going to cover
  • 5. The agenda: A bit of context about machine learning
  • 6. The agenda: An overview of how audience segmentation works
  • 7. The agenda: Some detail about how to do it
  • 8. The agenda: How to take things further
  • 9. A bit of context
  • 10. What is supervised machine learning? It learns with labelled data.
  • 11. What is unsupervised machine learning? It finds patterns in data.
  • 12. Audience segmentation in a nutshell
  • 13. We extract data about individual sessions from web analytics
  • 14. Extracting the data: Instead of grouping sessions by channel or section...

       CHANNEL          SESSIONS   TRANSACTIONS   REVENUE
       Organic search   1000       50             £12,000
       Paid search      700        30             £3,000
       Direct           500        25             £6,000
       Referral         300        30             £4,000

  • 15. Extracting the data: ...we extract details about individual sessions

       SESSION ID   PAGEVIEWS   TIME PER PAGE   REVENUE
       Session 1    7           30 seconds      £77.50
       Session 2    10          20 seconds      £27.50
       Session 3    5           23 seconds      £36.50
       Session 4    8           18 seconds      £45.30

  • 16. We then use unsupervised machine learning to find interesting patterns
  • 17. Finding patterns: Instead of analysing sessions grouped together in some way...
  • 18. Finding patterns: ...we use machine learning to find patterns in user behaviour. [Figure: Pageviews and Transaction revenue, with sessions grouped into Audience 1, Audience 2 and Audience 3]
  • 19. This results in actionable audience segments
  • 20. The Gatherer (example segment from a car manufacturer)
       Least time per page; most pages viewed; highest number of conversions per session; most likely to download a brochure.
       Likely onsite journey: landing section Homepage; second section Car Models; exit section Car Models.
       Description: The Gatherer comes directly to the website's homepage, then visits multiple car models and downloads a brochure for each to look at offline later.
       Example CRO test: Link to a model comparison table from the homepage, with the option to download a brochure for each model.
  • 21. The Skipper (example segment from a train operator)
       Slightly more time per page than average; more likely to buy in the evening or at night; fewest days since last session; fewest pages per visit; over-indexes for visiting via tablet; over-indexes for visiting via email.
       Description: The Skipper has likely already done their travel research (when and where to travel) multiple times without buying, and is simply returning, likely at the last minute, to finish the task.
       Example CRO test: Use a cookie to add a banner to the homepage that takes a returning user back to where they left off in the transaction process.
  • 22. Why do you need to do this?
  • 23. 1. Find behaviours you may not have realised existed
  • 25. 3. Track behaviours over time (more about this later)
  • 26. How to do it
  • 27. The key steps: 1. Extract the data; 2. Process the data; 3. Select features; 4. Cluster the data; 5. Manually explore the segments
  • 29. Using Google Analytics API: Extract by Session ID or Client ID. https://www.jcchouinard.com/google-analytics-api-using-python/
  • 30. Using Google Analytics API: Useful dimensions: landingPagePath, secondPagePath, exitPagePath
  • 31. Using Google Analytics API: Useful metrics: pageviewsPerSession, revenuePerTransaction, goalXXCompletions
  • 32. Using Google Analytics API: There is a limit on the number of metrics/dimensions: 10
  • 33. Using Google Analytics API: There is also a limit on the number of rows per call: 25,000
  • 34. Using Google Analytics API: The answer is to loop over days, metrics, dimensions & merge!
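
The loop-and-merge idea from slides 32 to 34 could be sketched in Python roughly as follows, assuming pandas is installed. `fetch_report` is a hypothetical stand-in for a real Google Analytics API call and only returns placeholder rows; `session_id` and the metric names are made up for illustration. Only the chunking and merging logic is the point here, and dimensions could be chunked the same way.

```python
from functools import reduce

import pandas as pd

MAX_METRICS_PER_CALL = 10  # limit quoted on slide 32


def chunk(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def fetch_report(day, metrics):
    """Hypothetical stand-in for one API call: one row per session for `day`.

    A real implementation would call the Google Analytics API here and
    page through the results 25,000 rows at a time.
    """
    session_ids = [f"{day}-s{i}" for i in range(3)]
    data = {"session_id": session_ids}
    for metric in metrics:
        data[metric] = [0.0] * len(session_ids)  # placeholder values only
    return pd.DataFrame(data)


def extract(days, metrics):
    """Loop over days and metric chunks, then merge everything per session."""
    daily_frames = []
    for day in days:
        parts = [fetch_report(day, group)
                 for group in chunk(metrics, MAX_METRICS_PER_CALL)]
        # Join the metric chunks for this day side by side on the session key.
        merged = reduce(lambda left, right: left.merge(right, on="session_id"), parts)
        daily_frames.append(merged)
    # Stack the days on top of each other.
    return pd.concat(daily_frames, ignore_index=True)


metrics = [f"metric_{n}" for n in range(1, 24)]  # 23 hypothetical metric names
sessions = extract(["2021-07-01", "2021-07-02"], metrics)
```

Merging on the session key is what makes extracting by Session ID or Client ID useful: each call returns a different slice of columns for the same sessions.
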
  • 35. Using BigQuery: Data is nested - I’ve found it makes things more difficult at the session level. https://adswerve.com/blog/google-analytics-queries-in-bigquery-part-two-users-sessions-unnesting-hits/
  • 36. Using BigQuery: However, it is possible to do and there is some great information around
  • 37. Using BigQuery: You can also run the unsupervised machine learning algorithm directly in SQL
  • 38. Using BigQuery: I previously used 1M sessions with Python & Google Colab - BigQuery wasn’t necessary
  • 39. Using BigQuery: Choose days at random to ensure variation
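
One way to choose days at random, as slide 39 suggests, is sketched below using only the Python standard library. The function name, the date range and the number of days are assumptions for illustration; the slides do not say how many days were sampled.

```python
import random
from datetime import date, timedelta


def random_days(start, end, n, seed=None):
    """Return `n` distinct days between `start` and `end` (inclusive), sorted."""
    all_days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    # A seeded generator keeps the sample reproducible between runs.
    return sorted(random.Random(seed).sample(all_days, n))


days = random_days(date(2021, 1, 1), date(2021, 6, 30), n=14, seed=42)
```

Each sampled day can then be used as the date filter for one extraction query.
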
  • 41. Useful data transformations: Change hours of the day to morning, afternoon, evening, night
  • 42. Useful data transformations: Change days to weekday & weekend

       SESSION ID   DAY         DAY TYPE
       Session 1    Monday      Weekday
       Session 2    Tuesday     Weekday
       Session 3    Saturday    Weekend
       Session 4    Wednesday   Weekday

  • 43. Useful data transformations: Change pages to sections
  • 44. Useful data transformations: Combine certain conversion points
  • 45. Useful data transformations: Here is a useful link on how to do find and replace in Python & Pandas
  • 46. Useful data transformations: You could use Google DataPrep instead
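
The transformations on slides 41 to 43 might look something like this in pandas. This is a minimal sketch on made-up sessions: the column names, the hour cut-offs for each part of the day, and the rule that a section is the first folder of the page path are all assumptions, not details from the talk.

```python
import pandas as pd

# Made-up sessions; column names are illustrative, not Google Analytics field names.
sessions = pd.DataFrame({
    "session_id": ["Session 1", "Session 2", "Session 3", "Session 4"],
    "hour": [9, 14, 19, 2],
    "day": ["Monday", "Tuesday", "Saturday", "Wednesday"],
    "landing_page": ["/", "/models/a", "/models/b", "/offers/summer"],
})


def day_part(hour):
    """Map an hour (0-23) to a part of the day; the cut-offs are an assumption."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 23:
        return "evening"
    return "night"


def section(path):
    """Reduce a page path to its first folder, e.g. /models/a -> models."""
    first = path.strip("/").split("/")[0]
    return first if first else "homepage"


sessions["day_part"] = sessions["hour"].apply(day_part)
sessions["day_type"] = sessions["day"].isin(["Saturday", "Sunday"]).map(
    {True: "Weekend", False: "Weekday"}
)
sessions["landing_section"] = sessions["landing_page"].apply(section)
```

Grouping raw values into a handful of categories like this keeps the number of columns manageable once the data is one hot encoded.
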
  • 47. One hot encoding: Converts categories to 1s & 0s.

       SESSION ID   CHANNEL
       Session 1    Organic Search
       Session 2    Paid Search
       Session 3    Direct
       Session 4    Direct

       SESSION ID   ORGANIC SEARCH   PAID SEARCH   DIRECT
       Session 1    1                0             0
       Session 2    0                1             0
       Session 3    0                0             1
       Session 4    0                0             1

  • 48. One hot encoding: Values aren’t increasing, so they don’t skew the clustering algorithm
  • 49. One hot encoding: Use for numerical as well as categorical data
  • 50. One hot encoding: See here for how to do it with Python
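
As a rough illustration of the table on slide 47, pandas can do the encoding with `get_dummies`. The link on the slide is not preserved in this transcript, so this is a sketch of one common approach rather than the walkthrough the speaker referenced; the column names are assumptions.

```python
import pandas as pd

sessions = pd.DataFrame({
    "session_id": ["Session 1", "Session 2", "Session 3", "Session 4"],
    "channel": ["Organic Search", "Paid Search", "Direct", "Direct"],
})

# One new 0/1 column per channel; the original "channel" column is dropped.
encoded = pd.get_dummies(sessions, columns=["channel"], dtype=int)
```

Each session ends up with exactly one 1 across the new channel columns, so no channel is treated as "larger" than another by the clustering step.
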
  • 52. Best subset regression: Choose the desired response variable & find potential explanatory variables
  • 53. Best subset regression: Runs regression analysis for combinations of variables at once to find correlation
  • 54. Best subset regression: This will help you narrow down features to find useful patterns within
  • 55. Best subset regression: See Python walkthrough here
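
The walkthrough linked on slide 55 is not preserved here, so the following is only a sketch of the general idea, assuming NumPy and scikit-learn: fit a linear regression on every combination of candidate features and keep the combination with the best score. Ranking by adjusted R² is an assumption of this sketch, and the feature names and data are made up.

```python
from itertools import combinations

import numpy as np
from sklearn.linear_model import LinearRegression


def best_subset(X, y, names):
    """Fit a linear regression on every non-empty subset of columns and
    return (adjusted R^2, subset names) for the best-scoring subset."""
    n = len(y)
    best = (-np.inf, ())
    for k in range(1, len(names) + 1):
        for cols in combinations(range(len(names)), k):
            Xs = X[:, list(cols)]
            r2 = LinearRegression().fit(Xs, y).score(Xs, y)
            # Adjusted R^2 penalises subsets that add features without much gain.
            adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
            if adj_r2 > best[0]:
                best = (adj_r2, tuple(names[c] for c in cols))
    return best


# Synthetic data: the response depends on two of the four features only.
rng = np.random.default_rng(0)
names = ["pageviews", "time_per_page", "days_since_last_session", "is_weekend"]
X = rng.normal(size=(200, 4))
revenue = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(scale=0.1, size=200)

score, chosen = best_subset(X, revenue, names)
```

The number of subsets doubles with every extra feature, so an exhaustive search like this is only practical for a modest number of candidates.
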
  • 57. Principal Component Analysis: Transforms a large set of variables into a smaller one without much loss of information
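
A minimal scikit-learn sketch of this step, on synthetic data, might look as follows. Standardising the features first and keeping enough components to explain 95% of the variance are both assumptions of the sketch, not settings given in the talk.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
base = rng.normal(size=(500, 2))
# Six made-up features that are noisy mixes of two underlying signals.
features = base @ rng.normal(size=(2, 6)) + rng.normal(scale=0.05, size=(500, 6))

# Put every feature on the same scale so none dominates the components.
scaled = StandardScaler().fit_transform(features)

# A float n_components keeps just enough components to reach that share of variance.
pca = PCA(n_components=0.95)
reduced = pca.fit_transform(scaled)
```

`pca.explained_variance_ratio_` shows how much of the original variance each retained component carries, which is a quick check on how much was lost.
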
  • 59. Using a KMeans algorithm The unsupervised machine learning algorithm to find patterns @richlawre
  • 60. Using a KMeans algorithm See a full walkthrough here with Python. @richlawre
  • 61. Using a KMeans algorithm You can also do this directly in BigQuery. @richlawre
  • 62. Using a silhouette score Way of finding the optimum number of clusters @richlawre
  • 63. Using a silhouette score Optimal number is at the elbow in the graph - not much gain after this @richlawre
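
The clustering step and the silhouette check could be sketched with scikit-learn as below, on synthetic two-feature data with three obvious groups. The slide reads the number of clusters off the graph; this sketch simply takes the number with the highest average silhouette score, which is an assumption. Real session data would normally be scaled before clustering.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic "sessions": three well-separated groups in two features.
rng = np.random.default_rng(0)
centres = np.array([[0, 0], [6, 6], [0, 6]])
X = np.vstack([c + rng.normal(scale=0.5, size=(100, 2)) for c in centres])

# Cluster with each candidate k and record the average silhouette score.
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
```

Plotting `scores` against k gives the graph described on slide 63, and the cluster labels for `best_k` are the candidate segments to explore manually.
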
  • 64. 5. Always manually explore the segments!
  • 65. Taking it to the next level
  • 66. Classify any future session: Use the labelled data to train a supervised machine learning algorithm - we use deep learning
  • 67. Classify any future session: The better defined your segments, the better this will perform
  • 68. Classify any future session: Push the labelled sessions back into Google Analytics via Data Import
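
The idea on slides 66 and 67 can be sketched as a two-stage pipeline: cluster first, then train a classifier on the cluster labels. The talk mentions deep learning without naming a model, so scikit-learn's `MLPClassifier` (a small neural network) is swapped in here purely for illustration, and the data is synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
centres = np.array([[0, 0], [6, 6], [0, 6]])
X = np.vstack([c + rng.normal(scale=0.5, size=(100, 2)) for c in centres])

# Step 1: unsupervised clustering gives each session a segment label.
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Step 2: treat those labels as the target for a supervised model.
X_train, X_test, y_train, y_test = train_test_split(
    X, segments, test_size=0.25, random_state=0, stratify=segments
)
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Step 3: label a new, unseen session.
new_session = np.array([[5.8, 6.1]])
predicted_segment = clf.predict(new_session)[0]
```

Because the segments here are well separated, the classifier has an easy job, which mirrors the point on slide 67 that better-defined segments make this step perform better.
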
  • 69. Visualise in Streamlit [Figure: CRM segment 1, CRM segment 2, CRM segment 3, CRM segment 4, CRM segment 5]
  • 71. Unsupervised machine learning finds interesting patterns in data.
  • 72. Apply this to individual sessions from Google Analytics to create behaviour segments.
  • 73. This can be a great source of ideas for CRO hypotheses.
  • 74. There are 5 steps for the analysis: extract, process, feature selection, cluster, manually explore.
  • 75. You can use Python or other toolsets (Google Cloud) to do the analysis.
  • 76. You can use the segments to label any future session on the website.