Modern On Page Factors
SMX Advanced
Matthew Peters, PhD
matt@moz.com @mattthemathman
“philadelphia phillies”
“Relevance” vs “Ranking”
Conceptually, “relevance” determination and “ranking” can be thought of as two
different steps (even if they are implemented as one in a search engine):
1. Relevance
2. Ranking
Is this page relevant to “philadelphia phillies”?
query-body similarity: 0.74
query-title similarity: 0.8
query-H1 similarity: 1.0
etc …
Measuring query-document similarity
Goal: given query + document string, compute “similarity”
See “Introduction to Information Retrieval” by Manning et al.:
http://nlp.stanford.edu/IR-book/
> 700 papers
Measuring query-document similarity
Query: “philadelphia phillies”
Query Model: tokenization, normalization (stemming), query expansion, intent
Document Model: tokenization, normalization (stemming), vector space
representation, language model
Scoring function -> 0.74
In this context “document” can also refer to title tag, meta description, H1, etc.
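As a rough illustration of the query model, document model and scoring function described here, the sketch below tokenizes both strings, represents each as a vector of raw term counts, and scores them with cosine similarity. This is only one simple choice of scoring function, not the method used in the talk; the page fields and helper names are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(query, document):
    """Cosine of the angle between the raw term-count vectors of two strings."""
    q, d = Counter(tokenize(query)), Counter(tokenize(document))
    dot = sum(q[t] * d[t] for t in q)
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    d_norm = math.sqrt(sum(v * v for v in d.values()))
    norm = q_norm * d_norm
    return dot / norm if norm else 0.0


# Hypothetical page fields, invented for illustration only.
page = {
    "title": "Philadelphia Phillies official site",
    "h1": "Philadelphia Phillies",
    "body": "Schedule, scores and news for the Phillies baseball team in Philadelphia.",
}
query = "philadelphia phillies"

# One similarity score per "document" (title, H1, body).
scores = {field: cosine_similarity(query, text) for field, text in page.items()}
for field, score in scores.items():
    print(f"query-{field} similarity: {score:.2f}")
```

Because each field is scored separately, a short field that matches the query exactly (the H1 here) scores higher than a long body that only mentions the query terms once.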
Query representation
Language identification
Word segmentation (Japanese, Chinese)
Tokenization + normalization: {reviews, reviewer, reviewing} -> review
Query expansion
User intent (transactional, navigational, informational)
Local
Classification (images, video, news)
Topic Model (LDA)
Entity extraction
Spelling correction
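The tokenization + normalization step can be sketched in a few lines. The suffix list below is a toy chosen only to reproduce the {reviews, reviewer, reviewing} -> review example; it is not the Porter stemmer or any production normalizer, and the function names are hypothetical.

```python
import re

# Toy suffix list, chosen only to reproduce the slide's example.
# A real system would use a proper stemmer such as Porter's algorithm.
SUFFIXES = ("ing", "er", "s")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def normalize(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def query_terms(query):
    """Tokenize a query and normalize each token."""
    return [normalize(token) for token in tokenize(query)]


print({normalize(word) for word in ["reviews", "reviewer", "reviewing"]})
print(query_terms("Movie Reviews"))
```

After normalization, “movie reviews” and “movie review” map to the same terms, so they can be matched against the same documents.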
Document representation
TF-IDF
Language Model: P(optimization | search, engine) >> P(walking | search, engine)
Probability Ranking Principle: P(R = 1 | d, q) or P(R = 0 | d, q)
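To make the language model comparison concrete, here is a minimal sketch that estimates P(word | previous two words) from trigram counts. The three-sentence corpus is invented for illustration, and the unsmoothed maximum-likelihood estimate is an assumption of this sketch rather than the model used in the talk.

```python
from collections import Counter

# Tiny invented corpus, for illustration only.
corpus = [
    "search engine optimization is a marketing discipline",
    "search engine optimization improves rankings",
    "a search engine crawls the web",
]

trigrams = Counter()
contexts = Counter()  # two-word contexts that are followed by a third word
for sentence in corpus:
    words = sentence.split()
    for i in range(len(words) - 2):
        contexts[(words[i], words[i + 1])] += 1
        trigrams[(words[i], words[i + 1], words[i + 2])] += 1


def conditional_probability(word, context):
    """Maximum-likelihood estimate of P(word | context) from trigram counts."""
    seen = contexts[context]
    return trigrams[context + (word,)] / seen if seen else 0.0


p_optimization = conditional_probability("optimization", ("search", "engine"))
p_walking = conditional_probability("walking", ("search", "engine"))
print(p_optimization, p_walking)
```

In this toy corpus “optimization” follows “search engine” in two of three cases, while “walking” never does. The zero estimate for unseen words is one reason language models are smoothed in practice.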
Which method performs best?
What are the characteristics of sites that rank highly?
14,000+ keywords
Top 50 results
600,000 URLs
Google-US, no personalization
March 2013
Mean Spearman Correlation
Remember: “correlation is not causation”
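A mean Spearman correlation of this kind can be sketched as follows: compute the Spearman correlation between a factor and the ranking position within each keyword's results, then average over keywords. The data below is invented, and negating positions so that a positive value means "higher factor, better rank" is a sign convention assumed for this sketch.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5 if var_x and var_y else 0.0


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_spearman(serps):
    """Average the per-keyword correlations of factor value vs. position."""
    return mean(
        spearman(scores, [-position for position in positions])
        for scores, positions in serps
    )


# Invented data: (factor values, SERP positions) for two keywords.
serps = [
    ([0.9, 0.7, 0.4, 0.2], [1, 2, 3, 4]),
    ([0.1, 0.8, 0.5, 0.3], [1, 2, 3, 4]),
]
print(mean_spearman(serps))
```

Averaging per-keyword correlations keeps each keyword's results on their own scale, so a factor is only compared among pages competing for the same query.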
Which method performs best?
We tried a few different types of smoothing for the language model;
Dirichlet worked best (Zhai and Lafferty, SIGIR 2001).
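For reference, a Dirichlet-smoothed unigram language model scores a query by mixing each term's count in the document with its frequency in the whole collection. The sketch below is a minimal version under stated assumptions: the documents are invented, the function name is hypothetical, and the values of `mu` are illustrative rather than the settings used in the study.

```python
import math
from collections import Counter


def dirichlet_log_likelihood(query_terms, doc_terms, collection_counts, mu=2000.0):
    """log P(query | document) under a Dirichlet-smoothed unigram model.

    P(w | d) = (count(w, d) + mu * P(w | collection)) / (len(d) + mu)
    """
    doc_counts = Counter(doc_terms)
    collection_size = sum(collection_counts.values())
    score = 0.0
    for term in query_terms:
        p_collection = collection_counts.get(term, 0) / collection_size
        if p_collection == 0.0:
            continue  # term unseen in the whole collection: skipped in this sketch
        p_term = (doc_counts[term] + mu * p_collection) / (len(doc_terms) + mu)
        score += math.log(p_term)
    return score


# Invented two-document collection, for illustration only.
docs = {
    "a": "movie reviews and movie ratings".split(),
    "b": "philadelphia phillies baseball schedule".split(),
}
collection_counts = Counter(term for terms in docs.values() for term in terms)

query = ["movie", "reviews"]
scores = {
    name: dirichlet_log_likelihood(query, terms, collection_counts, mu=10.0)
    for name, terms in docs.items()
}
print(scores)
```

Document "b" contains neither query term, yet it still gets a finite score because the collection frequencies fill in for the missing counts. A larger `mu` pulls every document's estimate closer to the collection model.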
Impact of stemming
Porter stemmer provided a slight increase in correlations
These correlations are still relatively low compared to other factors
Query: “movie reviews”
For each query: 500 pages (50 results + 450 random pages)
10% relevant (the 50 results)
90% irrelevant (the 450 random pages)
Pages ranked by PA:
URL ID   PA   In SERP?
86       92   1
355      90   0
…        …    …
27       18   0
Pages ranked by Language Model:
URL ID   Language Model   In SERP?
213      0.97             1
156      0.95             1
…        …                …
355      0.06             0
P@50 is the “Precision of the top 50 results”. It is the percentage of the top 50
results, ranked by PA or by Language Model, that are actually in the SERP.
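The P@50 calculation can be sketched as a small function: sort the pages by score, keep the top k, and count how many are in the SERP. The rows below are just the sample rows visible in the tables, so the example uses k=2 instead of 50; the function name is hypothetical.

```python
def precision_at_k(pages, k=50):
    """Fraction of the k highest-scoring pages that are in the SERP.

    pages: list of (score, in_serp) pairs, where in_serp is 1 or 0.
    If fewer than k pages are given, all of them are used.
    """
    top = sorted(pages, key=lambda page: page[0], reverse=True)[:k]
    return sum(in_serp for _, in_serp in top) / len(top) if top else 0.0


# Sample rows from the two tables: (score, in SERP?).
pa_rows = [(92, 1), (90, 0), (18, 0)]
language_model_rows = [(0.97, 1), (0.95, 1), (0.06, 0)]

print(precision_at_k(pa_rows, k=2))
print(precision_at_k(language_model_rows, k=2))
```

With 50 SERP pages hidden among 500, a random ordering would give a P@50 of about 10%, which is the baseline any factor has to beat.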
Takeaways
Implication: Query-document similarity is based on decades of
research. It’s immune to algorithm change.
Action item: With sophisticated query and document models, no
need to optimize separately for similar words, e.g. “movie
reviews” vs “movie review”.
Action item: Each page is relevant to many different keywords,
so optimize each page for a broad set of related keywords,
instead of a single keyword.
Use case: Content creation. What keywords will this new blog
post target? Is it relevant to a set of queries?
Thanks for watching!
Matthew Peters
matt@moz.com @mattthemathman