How to create your own search quality evaluation algorithms
Richard Lawrence
Sanity.io
@richlawre
Who the hell is this guy anyway?
● Principal SEO at Sanity
● Sanity is a headless CMS and more!
● Doing a Data Science degree in my spare time
Onto some context
The ‘helpful content update’ might have been a bit of a damp squib…
…but Google is always working towards ranking helpful content more highly
So wouldn’t it be great to know if your content is helping your audience - at scale?
The search rater guidelines hold the key
● A 167-page document that says what good looks like!
Google says it doesn’t directly use the ratings in its ranking algorithms
“We use responses from Raters to evaluate changes, but they don’t directly impact how our search results are ranked.”
bit.ly/ratings-answer
But it will likely use the rated content to help find features of what ‘good’ looks like
Similar methods have been used for years in various areas - like counterfeit notes
Features are found that best separate authentic and counterfeit notes
[Figure: scatter plot with axes ‘Distance between edge & watermark’ and ‘Width of shaded area’; counterfeit and authentic notes form separate groups]
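To make the banknote idea concrete, here is a minimal sketch of separating two groups on two measured features. It is not the method any bank or search engine uses: the measurements are made up for illustration, and the nearest-centroid rule is simply one of the easiest classifiers to read.

```python
from statistics import mean

# Hypothetical measurements in millimetres: (edge_to_watermark, shaded_width).
# These numbers are invented purely to illustrate the idea.
AUTHENTIC = [(8.1, 10.2), (7.9, 10.0), (8.3, 10.4), (8.0, 10.1)]
COUNTERFEIT = [(9.6, 11.5), (9.9, 11.9), (9.4, 11.2), (10.1, 11.8)]


def centroid(points):
    """Mean of each feature across a group of notes."""
    return tuple(mean(axis) for axis in zip(*points))


def classify(note, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: sq_dist(note, centroids[label]))


CENTROIDS = {
    "authentic": centroid(AUTHENTIC),
    "counterfeit": centroid(COUNTERFEIT),
}

print(classify((8.2, 10.3), CENTROIDS))
print(classify((9.8, 11.6), CENTROIDS))
```

The same shape of problem applies to content: swap the two note measurements for page features and the two labels for rater judgements.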
Features for high vs. low quality content will likely be more complex
Bing confirmed this is how it works in 2019
bit.ly/bing-confirmation
With 90% of its algorithms being ML based
bit.ly/bing-features
Plus it revealed its process
bit.ly/bing-process
So how can we harness this as an industry?
We can try to create our own!
What we need to do
1. Label the content
2. Create a ‘Needs Met’ algorithm
3. Create a ‘Page Quality’ algorithm
Labelling the content
Get a representative sample of searches
● 448 million search queries
bit.ly/448-million
Here’s how to play around with the file
bit.ly/large-file
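A file with hundreds of millions of queries will not fit comfortably in memory, so one option is to sample it in a single pass. The sketch below uses reservoir sampling and assumes one query per line; the function name and the in-memory stand-in file are illustrative, not taken from the linked guide.

```python
import io
import random


def sample_queries(lines, k, seed=0):
    """Keep a uniform random sample of k lines from a stream of unknown length."""
    rng = random.Random(seed)
    sample = []
    for index, line in enumerate(lines):
        query = line.strip()
        if len(sample) < k:
            # Fill the reservoir with the first k queries.
            sample.append(query)
        else:
            # Keep the new query with probability k / (index + 1).
            slot = rng.randint(0, index)
            if slot < k:
                sample[slot] = query
    return sample


# Stand-in for the large file; in practice pass an open file handle instead.
fake_file = io.StringIO("\n".join(f"query {i}" for i in range(10_000)))
picked = sample_queries(fake_file, k=5)
print(picked)
```

Because the function only iterates over lines, the same call works with `open(path, encoding="utf-8")` on a real file.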
Then gather the top 20 rankings for each sample query
● Likely an available feature of your favourite rank tracking software
Use some search raters to rate the content
● Steps: choose provider, create guidelines, collect labels
● Guidelines must not be identical to Google’s…
● Labels cover Needs Met & Page Quality
● 2 search raters, with a 3rd called in for disagreements
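The two-raters-plus-tiebreak rule can be sketched in a few lines. This assumes the third rater's label is treated as final, which the slides do not spell out; the URLs and the `ask_third_rater` callback are hypothetical.

```python
def adjudicate(ratings, ask_third_rater):
    """Resolve labels where `ratings` maps page URL -> (rater 1 label, rater 2 label).

    When the two raters agree, their label is kept. When they disagree,
    `ask_third_rater(url)` is called and its answer is used (an assumption).
    """
    final = {}
    for url, (first, second) in ratings.items():
        final[url] = first if first == second else ask_third_rater(url)
    return final


ratings = {
    "https://example.com/a": ("Good", "Good"),
    "https://example.com/b": ("Good", "Poor"),
}
# Hypothetical third-rater answers, only needed for the disagreement.
third_opinions = {"https://example.com/b": "Poor"}

labels = adjudicate(ratings, third_opinions.__getitem__)
print(labels)
```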
Creating a Needs Met algorithm
This measures fulfilling search intent
● Features will mainly relate to relevance and structure
GPT language models are perfect for this
● The open source option
GPT-3 became cheaper in September too
We need to create a pattern for GPT-J to learn
Content:
<h1>Compare car insurance quotes</h1>
<p>It's quick and easy to compare car insurance and find cheaper cover – we just need a few details about you and your vehicle.</p>
Target query: car insurance
Needs Met rating: Good
It will then rate new content
Content:
<h1>Car insurance</h1>
<p>From theft to write-offs and even lost keys, you'll be covered with us. Here's what you'll like about our comprehensive cover</p>
Target query: car insurance
Needs Met rating: ?????
We need to scrape content from each page to give to the language model - with the rating
Then use this info to train GPT-J
bit.ly/finetune-gptj
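One way to turn a scraped page plus its rating into a training example is sketched below, using only the Python standard library. It keeps just the `<h1>` and `<p>` elements, which is an assumption about what counts as content, and the `"text"` key in the output record is a placeholder: check the format your fine-tuning tool or service actually expects.

```python
import json
from html.parser import HTMLParser


class HeadingAndParagraphs(HTMLParser):
    """Collect <h1> and <p> elements and re-emit them as minimal HTML."""

    KEEP = {"h1", "p"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._tag = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.KEEP:
            self._tag, self._text = tag, []

    def handle_data(self, data):
        if self._tag:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace left over from the page markup.
            text = " ".join("".join(self._text).split())
            self.parts.append(f"<{tag}>{text}</{tag}>")
            self._tag = None


def to_training_record(html, query, rating):
    """Format one page in the Content / Target query / Needs Met rating pattern."""
    parser = HeadingAndParagraphs()
    parser.feed(html)
    parser.close()
    content = "\n".join(parser.parts)
    text = f"Content:\n{content}\nTarget query: {query}\nNeeds Met rating: {rating}"
    return {"text": text}  # hypothetical record shape


page = (
    "<html><body><nav>Home</nav>"
    "<h1>Compare car insurance quotes</h1>"
    "<p>It's quick and easy.</p></body></html>"
)
record = to_training_record(page, "car insurance", "Good")
print(json.dumps(record))
```

Writing one such JSON object per line gives a JSONL file, a common input format for fine-tuning tools.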
You can also use existing services
● NLP Cloud
● Forefront.ai
NLP Cloud also became cheaper!
Validate performance with a test set
Judge performance with a Confusion Matrix
                   Prediction: Correct   Prediction: Wrong
Actual: Correct    True positive         False negative
Actual: Wrong      False positive        True negative
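With more than two rating levels the same idea extends to a grid of (actual, predicted) counts. A minimal sketch, with made-up test-set labels standing in for real rater labels and model outputs:

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must be the same length")
    return Counter(zip(actual, predicted))


def accuracy(matrix):
    """Share of test items where the prediction matched the rater label."""
    total = sum(matrix.values())
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / total if total else 0.0


# Made-up labels for five held-out pages.
actual = ["Good", "Poor", "Good", "Excellent", "Poor"]
predicted = ["Good", "Good", "Good", "Excellent", "Poor"]

matrix = confusion_matrix(actual, predicted)
for (a, p), n in sorted(matrix.items()):
    print(f"actual={a:<10} predicted={p:<10} count={n}")
print("accuracy:", accuracy(matrix))
```

The off-diagonal cells show which ratings the model confuses, which a single accuracy figure hides.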
Few shot learning can help improve performance
[Figure: a prompt containing Example 1 (Rating: Excellent), Example 2 (Rating: Poor) and Example 3 (Rating: ????) is passed to GPT-J, which returns ‘Good’]
As can explaining to the model what it needs to do!
Consider the content to rate. Rate it according to how well it fits the search query.
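Putting the instruction and a few rated examples ahead of the unrated page might look like the sketch below. The `###` separator and the dictionary keys are assumptions chosen for illustration; the resulting string would then be sent to whichever model or hosted service you use.

```python
INSTRUCTION = (
    "Consider the content to rate. "
    "Rate it according to how well it fits the search query."
)


def build_prompt(rated_examples, content, query, separator="\n###\n"):
    """Assemble instruction + rated examples + one unrated example."""
    blocks = [INSTRUCTION]
    for example in rated_examples:
        blocks.append(
            f"Content:\n{example['content']}\n"
            f"Target query: {example['query']}\n"
            f"Needs Met rating: {example['rating']}"
        )
    # The final block stops at the rating label so the model completes it.
    blocks.append(f"Content:\n{content}\nTarget query: {query}\nNeeds Met rating:")
    return separator.join(blocks)


examples = [
    {"content": "<h1>Compare car insurance quotes</h1>", "query": "car insurance", "rating": "Excellent"},
    {"content": "<h1>Our office opening hours</h1>", "query": "car insurance", "rating": "Poor"},
]
prompt = build_prompt(examples, "<h1>Car insurance</h1>", "car insurance")
print(prompt)
```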
We’ve done this for you within Sanity Studio
And lots of other great features
Contact us for more info about the beta for these features:
bit.ly/sanity-beta
This isn’t perfect of course - though still very useful
● Only text content
● Useful indication only
● Great at scale
Creating a Page Quality algorithm
This is much more difficult!
It measures how well a page achieves its purpose
● This is about quality of content, independent of search queries
So features can relate to a large number of areas!
● ‘Main Content’ vs ‘Supplementary Content’: amount of Main Content, position of Main Content
● Website background information: depth of ‘about’ info, Wikipedia presence
And you have to work out how to measure them
● Amount of Main Content: length of Main Content area, or number of words in Main Content
It becomes a huge multivariate challenge
Page     Length of MC area   ‘About us’ word count   Clicks to ‘About us’
Page 1   17cm                500                     2
Page 2   20cm                300                     1
Page 3   15cm                1000                    2
Page 4   25cm                750                     3
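The columns in this table use different units (centimetres, words, clicks), so many models benefit from putting them on a common scale first. A minimal sketch using z-scores on the four rows above; the feature key names are invented for the example.

```python
from statistics import mean, pstdev

# The four example pages from the table; key names are illustrative.
PAGES = {
    "Page 1": {"mc_length_cm": 17, "about_words": 500, "clicks_to_about": 2},
    "Page 2": {"mc_length_cm": 20, "about_words": 300, "clicks_to_about": 1},
    "Page 3": {"mc_length_cm": 15, "about_words": 1000, "clicks_to_about": 2},
    "Page 4": {"mc_length_cm": 25, "about_words": 750, "clicks_to_about": 3},
}


def standardise(pages):
    """Convert each feature to z-scores: (value - mean) / standard deviation."""
    features = list(next(iter(pages.values())))
    stats = {}
    for feature in features:
        values = [row[feature] for row in pages.values()]
        stats[feature] = (mean(values), pstdev(values))
    scaled = {}
    for name, row in pages.items():
        scaled[name] = {}
        for feature in features:
            centre, spread = stats[feature]
            # A constant feature carries no information, so map it to 0.
            scaled[name][feature] = (row[feature] - centre) / spread if spread else 0.0
    return scaled


scaled = standardise(PAGES)
for name, row in scaled.items():
    print(name, {feature: round(value, 2) for feature, value in row.items()})
```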
Then we need to find features that best separate the groups
[Figure: scatter plot with axes ‘Number of words in ‘About’ section’ and ‘Length of ‘Main Content’ area’; high quality and low quality pages form separate groups]
But with a large number of features!
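One simple way to rank candidate features by how well each separates the two groups is a per-feature Fisher score: the squared gap between group means divided by the sum of the group variances. This is a swapped-in illustration rather than the approach from the talk, and the feature values below are made up.

```python
from statistics import mean, pvariance


def fisher_score(high, low):
    """Squared gap between group means divided by the summed group variances."""
    gap = (mean(high) - mean(low)) ** 2
    spread = pvariance(high) + pvariance(low)
    if spread == 0:
        return float("inf") if gap else 0.0
    return gap / spread


# Made-up feature values for pages that raters labelled high or low quality.
HIGH = {"about_words": [900, 1000, 850, 950], "mc_length_cm": [20, 25, 22, 24]}
LOW = {"about_words": [100, 150, 200, 120], "mc_length_cm": [18, 23, 21, 26]}

ranked = sorted(HIGH, key=lambda f: fisher_score(HIGH[f], LOW[f]), reverse=True)
for feature in ranked:
    print(feature, round(fisher_score(HIGH[feature], LOW[feature]), 2))
```

A score like this looks at one feature at a time, so it misses features that only separate the groups in combination; that is where multivariate models come in.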
This can be explored with a number of potential models
● Linear Discriminant Analysis
● Random Forest
● Neural Network
This is a huge challenge!
● Which features?
● How to measure them?
● Which model?
The work is ongoing here!
Let’s sum up
Google likely uses its raters to gather labelled data on content quality
It will then likely use that to find features of ‘good’ and ‘bad’ content
And then create algorithms to distinguish between the two
You can do the same!
Get your own labelled content and create your own scoring algorithms
We have created a ‘Needs Met’ score within Sanity Studio
So that you can get an indication of content calibre directly in your publishing workflow
Contact us to get more info about the beta here:
bit.ly/sanity-beta
Richard Lawrence
Principal SEO at Sanity.io
@richlawre

Creating Search Quality Algorithms - Richard Lawrence - BrightonSEO.pdf

  • 1. How to create your own search quality evaluation algorithms Richard Lawrence Sanity.io @richlawre
  • 2. @richlawre ● Principal SEO at Sanity Who the hell is this guy anyway?
  • 3. Who the hell is this guy anyway? @richlawre ● Sanity is a headless CMS and more!
  • 4. @richlawre ● Doing a Data Science degree in my spare time Who the hell is this guy anyway?
  • 6. The ‘helpful content update’ might have been a bit of a damp squib… @richlawre
  • 7. …but Google is always working towards ranking helpful content more highly @richlawre
  • 8. So wouldn’t it be great to know if your content is helping your audience - at scale? @richlawre
  • 9. The search rater guidelines hold the key @richlawre 167 page document that says what good looks like!
  • 10. Google says it doesn’t directly use the ratings in its ranking algorithms “We use responses from Raters to evaluate changes, but they don’t directly impact how our search results are ranked.” bit.ly/ratings-answer @richlawre
  • 11. But it will use the rated content to help find features of what ‘good’ looks like @richlawre
  • 12. Similar methods have been used for years in various areas - like counterfeit notes @richlawre
  • 13. Features are found that best separate authentic and counterfeit notes Distance between edge & watermark Width of shaded area Counterfeit Authentic @richlawre
  • 14. Features for high vs. low quality content will likely be more complex @richlawre
  • 15. Bing confirmed this is how it works in 2019 bit.ly/bing-confirmation @richlawre
  • 16. With 90% of its algorithms being ML based @richlawre bit.ly/bing-features
  • 17. Plus it revealed its process @richlawre bit.ly/bing-process
  • 18. So how can we harness this as an industry? @richlawre
  • 19. We can try to create our own! @richlawre
  • 20. 1. Label the content 2. Create a ‘Needs Met’ algorithm 3. Create a ‘Page Quality’ algorithm What we need to do @richlawre
  • 22. Get a representative sample of searches 448 million search queries bit.ly/448-million @richlawre
  • 23. Here’s how to play around with the file @richlawre bit.ly/large-file
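A file with hundreds of millions of queries will not fit comfortably in memory, so one way to draw a representative sample is reservoir sampling, which reads the file once and keeps a fixed-size random subset. This is only a sketch under assumptions: the slides do not say how the sample was drawn, the one-query-per-line layout is assumed, and the function name `sample_queries` is made up for illustration.

```python
import random
from typing import Iterable, List, Optional


def sample_queries(lines: Iterable[str], k: int, seed: Optional[int] = None) -> List[str]:
    """Return up to k non-blank lines chosen uniformly at random from a stream.

    Assumes one query per line (an assumption, not stated in the slides).
    """
    rng = random.Random(seed)
    reservoir: List[str] = []
    seen = 0  # number of non-blank queries read so far
    for line in lines:
        query = line.strip()
        if not query:
            continue
        seen += 1
        if len(reservoir) < k:
            # Fill the reservoir with the first k queries.
            reservoir.append(query)
        else:
            # Replace an existing entry with probability k / seen.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = query
    return reservoir
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly and the file is streamed rather than loaded whole.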
  • 24. Then gather the top 20 rankings for each sample query Likely available feature of your favourite rank tracking software @richlawre
  • 25. Use some search raters to rate the content Collect labels Choose provider Create guidelines Must not be identical to Google’s… Needs Met & Page Quality 2 search raters with 3rd called in for disagreements @richlawre
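The "2 search raters with 3rd called in for disagreements" rule can be sketched as a small helper. This is an illustration only: the names `resolve_rating` and `agreement_rate` are hypothetical, and the third rater is modelled as a callback so that it is only consulted when the first two disagree.

```python
from typing import Callable, List, Tuple


def resolve_rating(first: str, second: str, third_rater: Callable[[], str]) -> Tuple[str, bool]:
    """Return (final label, whether the third rater was needed)."""
    if first == second:
        return first, False
    # The first two raters disagree, so the third rater decides.
    return third_rater(), True


def agreement_rate(pairs: List[Tuple[str, str]]) -> float:
    """Share of items where the first two raters gave the same label."""
    if not pairs:
        return 0.0
    return sum(1 for a, b in pairs if a == b) / len(pairs)
```

Tracking the agreement rate alongside the labels gives a rough check on whether the rating guidelines are clear enough.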
  • 26. Creating a Needs Met algorithm @richlawre
  • 27. This measures fulfilling search intent Features will mainly be relating to relevance and structure @richlawre
  • 28. GPT language models are perfect for this The open source option @richlawre
  • 29. GPT-3 became cheaper in September too @richlawre
  • 30. We need to create a pattern for GPT-J to learn Content: <h1>Compare car insurance quotes</h1> <p>It's quick and easy to compare car insurance and find cheaper cover – we just need a few details about you and your vehicle.</p> Target query: car insurance Needs Met rating: Good @richlawre
  • 31. It will then rate new content Content: <h1>Car insurance</h1> <p>From theft to write-offs and even lost keys, you'll be covered with us. Here's what you'll like about our comprehensive cover </p> Target query: car insurance Needs Met rating: ????? @richlawre
  • 32. We need to scrape content from each page to give to the language model - with the rating @richlawre
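A minimal sketch of that scraping step, using only Python's standard-library `html.parser`: it keeps the `<h1>` and `<p>` text from a page and lays it out in the Content / Target query / Needs Met rating pattern shown on the slides. Which tags to keep is an assumption, and `HeadingParagraphExtractor` and `build_example` are hypothetical names, not part of any tool mentioned in the talk.

```python
from html.parser import HTMLParser
from typing import List, Optional


class HeadingParagraphExtractor(HTMLParser):
    """Collect the text of <h1> and <p> elements, dropping everything else."""

    KEEP = {"h1", "p"}  # assumed set of tags worth keeping

    def __init__(self) -> None:
        super().__init__()
        self.parts: List[str] = []
        self._tag: Optional[str] = None
        self._buf: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.KEEP:
            self._tag = tag
            self._buf = []

    def handle_data(self, data):
        if self._tag is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace left over from the markup.
            text = " ".join("".join(self._buf).split())
            if text:
                self.parts.append(f"<{tag}>{text}</{tag}>")
            self._tag = None


def build_example(html: str, query: str, rating: str = "") -> str:
    """Format one page in the pattern from the slides; leave rating empty for new content."""
    parser = HeadingParagraphExtractor()
    parser.feed(html)
    parser.close()
    content = " ".join(parser.parts)
    return f"Content: {content}\nTarget query: {query}\nNeeds Met rating: {rating}".rstrip()
```

Labelled pages get their rating filled in for training; pages to be scored are formatted with the rating left blank for the model to complete.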
  • 33. Then use this info to train GPT-J @richlawre bit.ly/finetune-gptj
  • 34. You can also use existing services @richlawre NLP Cloud Forefront.ai
  • 35. NLP Cloud also became cheaper! @richlawre
  • 36. Validate performance with a test set @richlawre
  • 37. Judge performance with a Confusion Matrix @richlawre
                        Prediction: Correct   Prediction: Wrong
      Actual: Correct   True positive         False negative
      Actual: Wrong     False positive        True negative
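The four cells of the confusion matrix can be counted from the test set in a few lines. This sketch assumes the ratings are collapsed to a binary question (is the label the `positive` one or not), which is a simplification of the multi-level Needs Met scale; the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def confusion_counts(actual: Sequence[str], predicted: Sequence[str], positive: str = "Good") -> Dict[str, int]:
    """Count true/false positives and negatives for one chosen positive label."""
    counts = Counter({"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for a, p in zip(actual, predicted):
        if a == positive:
            counts["tp" if p == positive else "fn"] += 1
        else:
            counts["fp" if p == positive else "tn"] += 1
    return dict(counts)


def accuracy(counts: Dict[str, int]) -> float:
    """Share of test items where the prediction matched the rater's label."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0
```

Looking at the false positives and false negatives separately shows whether the model tends to over-rate or under-rate content, which a single accuracy figure hides.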
  • 38. Few shot learning can help improve performance @richlawre Prompt Example 1 Rating: Excellent Example 2 Rating: Poor Example 3 Rating: ???? GPT-J Good
  • 39. As can explaining to the model what it needs to do! @richlawre Consider the content to rate. Rate it according to how well it fits the search query.
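Few-shot examples and an explicit instruction can be combined into a single prompt. The sketch below only builds the prompt text; sending it to GPT-J or a hosted service is left out because the exact API depends on the provider. The function name `few_shot_prompt` and the dictionary keys are assumptions made for illustration.

```python
from typing import Dict, List

# Instruction wording taken from the slide, with the grammar tidied.
INSTRUCTION = (
    "Consider the content to rate. "
    "Rate it according to how well it fits the search query."
)


def few_shot_prompt(examples: List[Dict[str, str]], content: str, query: str,
                    instruction: str = INSTRUCTION) -> str:
    """Build a prompt: instruction, rated examples, then the unrated item."""
    blocks = [instruction]
    for ex in examples:
        blocks.append(
            f"Content: {ex['content']}\n"
            f"Target query: {ex['query']}\n"
            f"Needs Met rating: {ex['rating']}"
        )
    # The final block ends at the rating label so the model completes it.
    blocks.append(f"Content: {content}\nTarget query: {query}\nNeeds Met rating:")
    return "\n\n".join(blocks)
```

Picking examples that cover both ends of the rating scale, as on the slide (one Excellent, one Poor), gives the model a clearer sense of the range it is expected to use.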
  • 40. We’ve done this for you within Sanity Studio @richlawre
  • 41. And lots of other great features @richlawre
  • 42. Contact us for more info about the beta for these features: bit.ly/sanity-beta @richlawre
  • 43. This isn’t perfect of course - though still very useful @richlawre ● Only text content ● Useful indication only ● Great at scale
  • 44. Creating a Page Quality algorithm @richlawre
  • 45. This is much more difficult! @richlawre
  • 46. It measures how well a page achieves its purpose @richlawre This is about quality of content, independent of search queries
  • 47. So features can relate to a large number of areas! @richlawre ‘Main Content’ vs ‘Supplementary Content’ Website background information Amount of Main Content Position of Main Content Depth of ‘about’ info Wikipedia presence
  • 48. And you have to work out how to measure them @richlawre Amount of Main Content Length of Main Content area Number of words in Main Content
  • 49. It becomes a huge multivariate challenge @richlawre
      Page     Length of MC area   ‘About us’ word count   Clicks to ‘About us’
      Page 1   17cm                500                     2
      Page 2   20cm                300                     1
      Page 3   15cm                1000                    2
      Page 4   25cm                750                     3
  • 50. Then we need to find features that best separate the groups Number of words in ‘About’ section Length of ‘Main Content’ area High quality Low quality @richlawre
  • 51. But with a large number of features! @richlawre
  • 52. This can be explored with a number of potential models @richlawre Linear Discriminant Analysis
  • 53. @richlawre This can be explored with a number of potential models Random Forest
  • 54. @richlawre This can be explored with a number of potential models Neural Network
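Of the models named above, linear discriminant analysis is the simplest to sketch by hand. The example below implements the two-class Fisher discriminant for two features in plain Python: it finds the direction that best separates the class means relative to the spread within each class. The feature pair (‘About’ word count, Main Content length in cm) follows the slides, but the data points in the usage note are invented purely for illustration, and a real project would more likely use a library implementation with many more features.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (about_word_count, main_content_length_cm)


def mean(points: Sequence[Point]) -> Point:
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points: Sequence[Point], centre: Point) -> Tuple[float, float, float]:
    """Within-class scatter matrix [[sxx, sxy], [sxy, syy]] as a 3-tuple."""
    sxx = sum((p[0] - centre[0]) ** 2 for p in points)
    sxy = sum((p[0] - centre[0]) * (p[1] - centre[1]) for p in points)
    syy = sum((p[1] - centre[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_direction(high: Sequence[Point], low: Sequence[Point]) -> Tuple[Point, float]:
    """Return (weights, threshold) for the two-class Fisher discriminant."""
    m_high, m_low = mean(high), mean(low)
    a1, b1, c1 = scatter(high, m_high)
    a2, b2, c2 = scatter(low, m_low)
    a, b, c = a1 + a2, b1 + b2, c1 + c2  # pooled scatter matrix
    det = a * c - b * b
    if det == 0:
        raise ValueError("pooled scatter matrix is singular")
    dx, dy = m_high[0] - m_low[0], m_high[1] - m_low[1]
    # w = inverse(pooled scatter) * (m_high - m_low)
    w = ((c * dx - b * dy) / det, (-b * dx + a * dy) / det)
    # Threshold halfway between the projected class means.
    proj_high = w[0] * m_high[0] + w[1] * m_high[1]
    proj_low = w[0] * m_low[0] + w[1] * m_low[1]
    return w, (proj_high + proj_low) / 2


def predict(w: Point, threshold: float, point: Point) -> str:
    score = w[0] * point[0] + w[1] * point[1]
    return "High quality" if score > threshold else "Low quality"
```

With made-up training pages such as `high = [(1000, 25), (750, 22), (900, 20)]` and `low = [(300, 15), (200, 17), (400, 12)]`, calling `fisher_direction(high, low)` returns the weights and threshold that `predict` then applies to a new page.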
  • 55. This is a huge challenge! @richlawre
  • 57. How to measure them? @richlawre
  • 59. The work is ongoing here! @richlawre
  • 61. Google likely uses its raters to gather labelled data on content quality @richlawre
  • 62. It will then likely use that to find features of ‘good’ and ‘bad’ content @richlawre
  • 63. And creates algorithms to distinguish between the two @richlawre
  • 64. You can do the same! @richlawre
  • 65. Get your own labelled content and create your own scoring algorithms @richlawre
  • 66. We have created a ‘Needs Met’ score within Sanity Studio @richlawre
  • 67. So that you can get an indication of content calibre directly in your publishing workflow @richlawre
  • 68. Contact us to get more info about the beta here: bit.ly/sanity-beta @richlawre
  • 69. Richard Lawrence Principal at Sanity.io @richlawre