Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
1
Presented by: Michael King
Contact: mike@ipullrank.com
Twitter: @iPullRank
Technical Content
Optimization
@IPULLRANKAGEN...
2
No one knows Google’s algorithms.
TECHNICALCONTENTOPTIMIZATION
Disclaimer.
3
No one knows Google’s algorithms.
Patents are not an indication that something is currently in use.
TECHNICALCONTENTOPTI...
4
No one knows Google’s algorithms.
Patents are not an indication that something is currently in use.
I’m not aiming to of...
5
Search Engine
Expectations
How do Search Engines See Content?
TECHNICALCONTENTOPTIMIZATION
6
My Hypothesis
Just my educated opinion…
TECHNICALCONTENTOPTIMIZATION
7
Our Correlation Studies
Showcase Our Bias, Not
Google’s
I believe that because we are biased by
the impact of links, we ...
8
PageRank
PageRank certainly changed the game.
However, at its simplest, it is largely just
about link volume.
TECHNICALC...
9
Google Cares About Content
Relevance and Parity
In addition to the volume of links, the
source and target content releva...
10
Our Metrics Mostly Measure
Volume
First, our link metrics are, at best, just
approximations. They are not necessarily
i...
11
Except for Majestic.
TECHNICALCONTENTOPTIMIZATION
Citation Trust Flow
12
Topical PageRank
It is not realistic to expect that Google
has been using pure PageRank. Rather, it
makes way more sens...
13
But we just want the link, no matter what?
TECHNICALCONTENTOPTIMIZATION
Disclaimer.
14
How Content is Understood in
Information Retrieval
Content is tokenized and then
relationships between terms are
determ...
15
Yet The Question is Always How
Long Should Content Be?
TECHNICALCONTENTOPTIMIZATION
16
How Content is Understood
by Google
Google has a more sophisticated and
living system for doing the same thing.
TECHNIC...
17
But we just “make great content?”
TECHNICALCONTENTOPTIMIZATION
Content.
18
Content Scoring and Storing
Search engines can score documents
based on a variety of features. Google’s
file system sug...
19
Google’s Scoring Functions
There’s more than one scoring function.
Google scores content and links a variety
of differe...
2020
Document Scoring Simplified
TECHNICALCONTENTOPTIMIZATION
Content Factor Content Factor Speed Factor Link Factor Link ...
2121
Each Component to the Equation Has a Weight
TECHNICALCONTENTOPTIMIZATION
Content Factor Content Factor Speed Factor L...
2222
The Weights May Look Like This
TECHNICALCONTENTOPTIMIZATION
Content Factor Content Factor Speed Factor Link Factor Li...
2323
This is What SEOs Do
TECHNICALCONTENTOPTIMIZATION
5 2 4 95 74
Content Factor Content Factor Speed Factor Link Factor ...
2424
So, Then, Google Turns Down the Weights on Links
TECHNICALCONTENTOPTIMIZATION
5 2 4 95 74
Content Factor Content Fact...
2525
When Really, Content Optimization May Go Further
TECHNICALCONTENTOPTIMIZATION
100 100 4 95 74
Content Factor Content ...
26
Google Constantly Evaluates
and Experiments
TECHNICALCONTENTOPTIMIZATION
27
There is Evidence of CTR
Expectations
Granted, the screenshot is from the Paid
side, but this suggests that Google has ...
28
But we don’t A/B test our metadata?
TECHNICALCONTENTOPTIMIZATION
Experiments.
29
Queries Become Entities
Google has explicitly indicated that they
break a query into a series of entities
first.
TECHNI...
30
Attributes of Entities Inform
Context
Google builds context around the query
leveraging the information about the entit...
31
Entities are Extracted From
Content to Inform Rankings
When Google is processing content, they
extract ad score entitie...
32
But our industry doesn’t develop entity
strategies?
TECHNICALCONTENTOPTIMIZATION
Entity Salience.
33
Intent for a given keyword can change
over time.
TECHNICALCONTENTOPTIMIZATION
Intent.
34
Intent for a given keyword can change
over time.
Yet we set and forget our keyword research?
TECHNICALCONTENTOPTIMIZATI...
35
Technical Content Optimization
is a Multi-dimensional Word
Game, but…
TECHNICALCONTENTOPTIMIZATION
3636
TECHNICALCONTENTOPTIMIZATION
..we don’t know
what we’re doing
37
Mind the Text Analysis
Knowledge Gap
What is missing from the SEO skillset
TECHNICALCONTENTOPTIMIZATION
38
We’re Entering Different Fields
We are no longer talking about SEO.
Rather, we are talking about natural
language proce...
39
Natural Language Processing
Use Cases
TECHNICALCONTENTOPTIMIZATION
40
A “Corpus” is Built from
Crawling
Googlebot crawls the web and builds a
corpus of web documents. This is how
the proces...
41
Documents are Parsed
Documents are broken down into words
and then tagged with parts of speech.
TECHNICALCONTENTOPTIMIZ...
42
Topics are Modeled
A series of transformations and analyses
are done on the content to understand
what they are about.
...
43
Language is Understood
Once the content is understood, and
features are extracted, it can then be
scored and used for r...
44
Google Has Something
Called BERT
BERT is Google’s technique for pre-
training NLP for classification, question
answerin...
4545
TECHNICALCONTENTOPITIMIZATION
46
In other words, it’s the thing that
does this.
TECHNICALCONTENTOPTIMIZATION
47
Learn More about BERT
(and ELMo)
Jay Alammar has a great article with
illustrations that walks you through BERT
and som...
48
BERT Q&A Demo
Get a preview of what Google might
extract from a page for a given question
with this BERT demo.
https://...
4949
TECHNICALCONTENTOPTIMIZATION
50
Keyword Usage
As you might imagine, leveraging Text
Statistics to inform rankings indicates
that you might want to use ...
51
Tokenization
Search Engines break paragraphs into
sentences and sentences into “tokens” or
individual words.
This bette...
52
N-grams
N-grams are terms or phrases of N-
length.
TECHNICALCONTENTOPTIMIZATION
5353
TECHNICALCONTENTOPTIMIZATION
5454
TECHNICALCONTENTOPTIMIZATION
55
Synonyms and Close Variants
Statistical relevance is computed using
both synonyms and close variants of
words.
TECHNICA...
56
Stemming
Search Engines break words into their
stems to better determine relationships
and relevance.
TECHNICALCONTENTO...
57
Lemmatization
Similar to stemming, lemmatization is the
grouping of inflected forms of the same
word or idea.
For examp...
58
Semantic Distance & Term
Relationships
Search engines look for how physically
close words are together to better
determ...
59
Latent Direchlet Allocation
LDA is a recursive algorithm for mapping
keywords to topics.
TECHNICALCONTENTOPTIMIZATION
6060
TECHNICALCONTENTOPTIMIZATION
61
Zipf’s Law
Zipf’s law is a theory that words will be
similarly distributed across documents in
the same corpus (or docu...
62
Zipf’s Law Applied
When run on an actual dataset, zipf’s law
tends to hold up in high rankings.
TECHNICALCONTENTOPTIMIZ...
63
Term Frequency Inverse
Document Frequency
Term frequency, inverse document
frequency (TF-IDF) identifies the
importance...
6464
TECHNICALCONTENTOPTIMIZATION
65
Phrase-based Indexing &
Co-occurence
Google Specifically looks for keywords
that co-occur with a target keyword in a
do...
66
But You Heard TF-IDF is Old And
You Shouldn’t Focus On it…
I actually agree with this, you should not
focus on TF-IDF i...
67
Heap’s Law
Heaps law indicates that vocabulary
within a corpus grows at a predictable
rate.
TECHNICALCONTENTOPTIMIZATION
68
Entity Salience
Search engines leverage the features of
the entity to understand the context in
which the entity is ref...
69
Hidden Markov Models
Hidden Markov Models allow search
engines to extract implicit entities.
TECHNICALCONTENTOPTIMIZATI...
70
Entities are Also Used in
Verification
Google double checks what you’re talking
about by using entities.
H/t @TomAnthon...
71
Page Segmentation
Modern Search engines determine both
prominence of content and review the
visual hierarchy to determi...
72
Search engines are recursive.
TECHNICALCONTENTOPTIMIZATION
How it all works.
73
We need to understand Google’s statistical
expectation of content and add it to our
content’s requirements.
TECHNICALCO...
74
What Should We
Be Doing?
Let’s Get Actionable
TECHNICALCONTENTOPTIMIZATION
75
Step 1 - Research & Create:
Create content that fulfills
expectations
We need to create or optimize content
that balanc...
7676
We need to do more of Column B
Our industry talks a lot about the
concepts in the column B, but in practice
we do not...
77
It Starts with Keyword Research
This cannot be the extent of your keyword
research in 2019.
7878
Let’s Take it back to 2013
At minimum you should be pulling,
keywords, search volume, ranking
difficulty, co-occurrin...
79
The ultimate goal is to determine a series of
co-occurring terms and entities to
incorporate into your content for each...
80
Meet Knime Analytics
https://www.knime.com/knime-
software/knime-analytics-platform
TECHNICALCONTENTOPTIMIZATION
81
Knime Allows for Drag & Drop
Text Analysis
More than just text analysis, Knime has a
series of features for.
TECHNICALC...
82
TextProcessing Extension
https://www.knime.com/knime-text-
processing
TECHNICALCONTENTOPTIMIZATION
83
Palladian HTTP Extension
https://www.knime.com/community/pall
adian
TECHNICALCONTENTOPTIMIZATION
84
Download a “Corpus” for a
Keyword
Ideally, we’d review all of the results for a
given keyword to achieve parity with
Go...
85
Here’s a Starter Workflow
This workflow allows you to get started
with LDA and computes Term Frequency.
TECHNICALCONTEN...
8686
TECHNICALCONTENTOPTIMIZATION
1. Put your domain into SEMRush.
8787
TECHNICALCONTENTOPTIMIZATION
2. See what you don’t rank well for in
organic research.
8888
TECHNICALCONTENTOPTIMIZATION
3. Pick a keyword.
8989
TECHNICALCONTENTOPTIMIZATION
4. Look at the SERP.
9090
TECHNICALCONTENTOPTIMIZATION
Really Google?
9191
9292
TECHNICALCONTENTOPTIMIZATION
This is the page that ranks #86?!
9393
TECHNICALCONTENTOPTIMIZATION
5. See what else could rank for our site.
9494
TECHNICALCONTENTOPTIMIZATION
Enter Ryte’s Content Success.
9595
TECHNICALCONTENTOPTIMIZATION
6. Put in the keyword and compare TF-IDF
to ranking pages.
9696
TECHNICALCONTENTOPTIMIZATION
You can see the corpus they pulled for review.
9797
TECHNICALCONTENTOPTIMIZATION
7. Put the copy in. See what important keywords
are missing and make edits.
9898
TECHNICALCONTENTOPTIMIZATION
Content
Optimization
implemented
Content
Optimization
implemented
Results with no Link B...
9999
TECHNICALCONTENTOPTIMIZATION
Results with No Link Building
(Competitive Medicare Terms Min MSV 8100)
Content
Optimiza...
100100
TECHNICALCONTENTOPTIMIZATION
101101
TECHNICALCONTENTOPTIMIZATION
The topic explorer helps you identify
semantically clustered keyword opps.
102102
TECHNICALCONTENTOPTIMIZATION
You can generate a new page around
the core topics.
103103
TECHNICALCONTENTOPTIMIZATION
It abstracts a lot of the topic modeling and gives
you detailed direction as you write.
104104
TECHNICALCONTENTOPTIMIZATION
105
A/B Test Metadata
A/B test your page titles and meta
descriptions by creating different buckets
of pages, making chang...
106
Analysis: Compare your CTR
AWR publishes CTR stats by vertical
https://advancedwebranking.com/ctrstu
dy
TECHNICALCONTE...
107
Hypothesis: Identify Specifically
Where You Underperform
Develop your hypotheses based
keywords that underperform base...
108
Use Wordnet to Scale Keyword
Variant Discovery
https://wordnet.princeton.edu/
TECHNICALCONTENTOPTIMIZATION
109
Automate Inclusion of Emerging
Keywords from GSC in Rankings
Monitor your queries on a page-level
week over week to id...
110110
TECHNICALCONTENTOPTIMIZATION
111
Review Competitors’ Usage
of Entities
When Google is processing content, they
extract and score entities. Compare your...
112112
Use Entities for Intelligent
Internal Link Building
You can use the same data on your own
site to determine how to ...
113113
TECHNICALCONTENTOPTIMIZATION
114114
TECHNICALCONTENTOPTIMIZATION
115115
TECHNICALCONTENTOPTIMIZATION
116116
TECHNICALCONTENTOPTIMIZATION
117
Uncomfortable with APIs?
If APIs sound daunting to you, here’s a
Knime Analytics workflow that will make
it easier for...
118118
TECHNICALCONTENTOPTIMIZATION
119
Natural Language Generation is
the Next Big Thing
I suspect that in the near future, a lot of
this will be available t...
120120
TECHNICALCONTENTOPTIMIZATION
121
Wrapping Up
Bringing it Home!
TECHNICALCONTENTOPTIMIZATION
122
How to do effective Technical
Content Optimization
A. Determine Search Engine
Expectations
B. Optimize your Content
C....
123
I’m Mike King.
(@IPULLRANK)
TECHNICALCONTENTOPTIMIZATION
FEATURED BY
124
We are iPullRank.
(@IPULLRANKAGENCY)
TECHNICALCONTENTOPTIMIZATION
125
This is Our CEO.
(And, I’m #ZORASDAD)
TECHNICALCONTENTOPTIMIZATION
126
At iPullRank, we’re incorporating
Natural Language Processing
into our workflows to better
understand the expectations...
127
Thank You | Q&A
Mike King
Managing Director
Twitter: @iPullRank
Email: mike@ipullrank.com
TECHNICALCONTENTOPTIMIZATION
Upcoming SlideShare
Loading in …5
×

SearchLove Boston 2019 - Michael King - Technical Content Optimization

413 views

Published on

Search engines have spent years telling marketers to just focus on "making great content," but what does that truly mean to a robot? When you dig into Information Retrieval, the Computer Science behind search engines, it becomes clear that there is a specific statistical expectation that drives the understanding of what content is relevant for which queries. In this talk, Mike will be discussing the text analysis concepts behind how content is processed and how we as marketers can harness it to develop more optimized content.

  • Be the first to comment

SearchLove Boston 2019 - Michael King - Technical Content Optimization

  1. 1. 1 Presented by: Michael King Contact: mike@ipullrank.com Twitter: @iPullRank Technical Content Optimization @IPULLRANKAGENCY
  2. 2. 2 No one knows Google’s algorithms. TECHNICALCONTENTOPTIMIZATION Disclaimer.
  3. 3. 3 No one knows Google’s algorithms. Patents are not an indication that something is currently in use. TECHNICALCONTENTOPTIMIZATION Disclaimer.
  4. 4. 4 No one knows Google’s algorithms. Patents are not an indication that something is currently in use. I’m not aiming to offend anyone with this presentation. TECHNICALCONTENTOPTIMIZATION Disclaimer.
  5. 5. 5 Search Engine Expectations How do Search Engines See Content? TECHNICALCONTENTOPTIMIZATION
  6. 6. 6 My Hypothesis Just my educated opinion… TECHNICALCONTENTOPTIMIZATION
  7. 7. 7 Our Correlation Studies Showcase Our Bias, Not Google’s I believe that because we are biased by the impact of links, we focus heavily on link acquisition. As a result our correlation studies reflect that. TECHNICALCONTENTOPTIMIZATION Link Analysis Correlations Text Analysis Correlations
  8. 8. 8 PageRank PageRank certainly changed the game. However, at its simplest, it is largely just about link volume. TECHNICALCONTENTOPTIMIZATION
  9. 9. 9 Google Cares About Content Relevance and Parity In addition to the volume of links, the source and target content relevance is examined. TECHNICALCONTENTOPTIMIZATION
  10. 10. 10 Our Metrics Mostly Measure Volume First, our link metrics are, at best, just approximations. They are not necessarily indicative of anything that Google is actually doing or measuring. Second, our link metrics are mostly some measure of link volume. TECHNICALCONTENTOPTIMIZATION
  11. 11. 11 Except for Majestic. TECHNICALCONTENTOPTIMIZATION Citation Trust Flow
  12. 12. 12 Topical PageRank It is not realistic to expect that Google has been using pure PageRank. Rather, it makes way more sense to expect they have defaulted to some form of Topical PageRank for several years. h/t @CyrusShepard TECHNICALCONTENTOPTIMIZATION
  13. 13. 13 But we just want the link, no matter what? TECHNICALCONTENTOPTIMIZATION Disclaimer.
  14. 14. 14 How Content is Understood in Information Retrieval Content is tokenized and then relationships between terms are determined statistically. TECHNICALCONTENTOPTIMIZATION
  15. 15. 15 Yet The Question is Always How Long Should Content Be? TECHNICALCONTENTOPTIMIZATION
  16. 16. 16 How Content is Understood by Google Google has a more sophisticated and living system for doing the same thing. TECHNICALCONTENTOPTIMIZATION
  17. 17. 17 But we just “make great content?” TECHNICALCONTENTOPTIMIZATION Content.
  18. 18. 18 Content Scoring and Storing Search engines can score documents based on a variety of features. Google’s file system suggests that they have the capability to store multiple versions of any document over time. TECHNICALCONTENTOPTIMIZATION
  19. 19. 19 Google’s Scoring Functions There’s more than one scoring function. Google scores content and links a variety of different ways and then chooses the best results. There is not just one “algorithm.” TECHNICALCONTENTOPTIMIZATION
  20. 20. 2020 Document Scoring Simplified TECHNICALCONTENTOPTIMIZATION Content Factor Content Factor Speed Factor Link Factor Link Factor Document Score + + + + =
  21. 21. 2121 Each Component to the Equation Has a Weight TECHNICALCONTENTOPTIMIZATION Content Factor Content Factor Speed Factor Link Factor Link Factor Document Score a b c d e+ + + + =
  22. 22. 2222 The Weights May Look Like This TECHNICALCONTENTOPTIMIZATION Content Factor Content Factor Speed Factor Link Factor Link Factor Document Score 3 6 1 2 2+ + + + =
  23. 23. 2323 This is What SEOs Do TECHNICALCONTENTOPTIMIZATION 5 2 4 95 74 Content Factor Content Factor Speed Factor Link Factor Link Factor 1483 6 1 2 2+ + + + =
  24. 24. 2424 So, Then, Google Turns Down the Weights on Links TECHNICALCONTENTOPTIMIZATION 5 2 4 95 74 Content Factor Content Factor Speed Factor Link Factor Link Factor 55.493 6 1 .25 .01+ + + + =
  25. 25. 2525 When Really, Content Optimization May Go Further TECHNICALCONTENTOPTIMIZATION 100 100 4 95 74 Content Factor Content Factor Speed Factor Link Factor Link Factor 928.493 6 1 .25 .01+ + + + =
  26. 26. 26 Google Constantly Evaluates and Experiments TECHNICALCONTENTOPTIMIZATION
  27. 27. 27 There is Evidence of CTR Expectations Granted, the screenshot is from the Paid side, but this suggests that Google has at least considered an expected CTR. This is also consistent with the idea of evaluating a successful search based on click activity. The hypothesis being that if a result does not consistently meet the expected CTR, it should be demoted. TECHNICALCONTENTOPTIMIZATION
  28. 28. 28 But we don’t A/B test our metadata? TECHNICALCONTENTOPTIMIZATION Experiments.
  29. 29. 29 Queries Become Entities Google has explicitly indicated that they break a query into a series of entities first. TECHNICALCONTENTOPTIMIZATION
  30. 30. 30 Attributes of Entities Inform Context Google builds context around the query leveraging the information about the entity to inform what documents to consider. TECHNICALCONTENTOPTIMIZATION
  31. 31. 31 Entities are Extracted From Content to Inform Rankings When Google is processing content, they extract ad score entities. TECHNICALCONTENTOPTIMIZATION
  32. 32. 32 But our industry doesn’t develop entity strategies? TECHNICALCONTENTOPTIMIZATION Entity Salience.
  33. 33. 33 Intent for a given keyword can change over time. TECHNICALCONTENTOPTIMIZATION Intent.
  34. 34. 34 Intent for a given keyword can change over time. Yet we set and forget our keyword research? TECHNICALCONTENTOPTIMIZATION Intent.
  35. 35. 35 Technical Content Optimization is a Multi-dimensional Word Game, but… TECHNICALCONTENTOPTIMIZATION
  36. 36. 3636 TECHNICALCONTENTOPTIMIZATION ..we don’t know what we’re doing
  37. 37. 37 Mind the Text Analysis Knowledge Gap What is missing from the SEO skillset TECHNICALCONTENTOPTIMIZATION
  38. 38. 38 We’re Entering Different Fields We are no longer talking about SEO. Rather, we are talking about natural language processing, computational linguistics, information retrieval and graph theory. All things that conceptually underpin the computer science behind how search engines model topics in content. TECHNICALCONTENTOPTIMIZATION
  39. 39. 39 Natural Language Processing Use Cases TECHNICALCONTENTOPTIMIZATION
  40. 40. 40 A “Corpus” is Built from Crawling Googlebot crawls the web and builds a corpus of web documents. This is how the process begins. TECHNICALCONTENTOPTIMIZATION
  41. 41. 41 Documents are Parsed Documents are broken down into words and then tagged with parts of speech. TECHNICALCONTENTOPTIMIZATION
  42. 42. 42 Topics are Modeled A series of transformations and analyses are done on the content to understand what they are about. TECHNICALCONTENTOPTIMIZATION
  43. 43. 43 Language is Understood Once the content is understood, and features are extracted, it can then be scored and used for rankings. This is how search engines develop their expectation of what should be featured in content. TECHNICALCONTENTOPTIMIZATION
  44. 44. 44 Google Has Something Called BERT BERT is Google’s technique for pre- training NLP for classification, question answering and named entity recognition. https://ai.googleblog.com/2018/11/open -sourcing-bert-state-of-art-pre.html TECHNICALCONTENTOPTIMIZATION
  45. 45. 4545 TECHNICALCONTENTOPITIMIZATION
  46. 46. 46 In other words, it’s the thing that does this. TECHNICALCONTENTOPTIMIZATION
  47. 47. 47 Learn More about BERT (and ELMo) Jay Alammar has a great article with illustrations that walks you through BERT and some of the other pre-existing NLP mechanisms http://jalammar.github.io/illustrated- bert/ TECHNICALCONTENTOPTIMIZATION
  48. 48. 48 BERT Q&A Demo Get a preview of what Google might extract from a page for a given question with this BERT demo. https://www.pragnakalp.com/demos/BE RT-NLP-QnA-Demo/ TECHNICALCONTENTOPTIMIZATION
  49. 49. 4949 TECHNICALCONTENTOPTIMIZATION
  50. 50. 50 Keyword Usage As you might imagine, leveraging Text Statistics to inform rankings indicates that you might want to use the keywords on the page. This is where having the opportunity to rank begins. TECHNICALCONTENTOPTIMIZATION
  51. 51. 51 Tokenization Search Engines break paragraphs into sentences and sentences into “tokens” or individual words. This better positions content for statistical analysis TECHNICALCONTENTOPTIMIZATION
  52. 52. 52 N-grams N-grams are terms or phrases of N- length. TECHNICALCONTENTOPTIMIZATION
  53. 53. 5353 TECHNICALCONTENTOPTIMIZATION
  54. 54. 5454 TECHNICALCONTENTOPTIMIZATION
  55. 55. 55 Synonyms and Close Variants Statistical relevance is computed using both synonyms and close variants of words. TECHNICALCONTENTOPTIMIZATION
  56. 56. 56 Stemming Search Engines break words into their stems to better determine relationships and relevance. TECHNICALCONTENTOPTIMIZATION
  57. 57. 57 Lemmatization Similar to stemming, lemmatization is the grouping of inflected forms of the same word or idea. For example, “better” and “good” have the same lemma. TECHNICALCONTENTOPTIMIZATION
  58. 58. 58 Semantic Distance & Term Relationships Search engines look for how physically close words are together to better determine their relationship in statistical models. TECHNICALCONTENTOPTIMIZATION
  59. 59. 59 Latent Direchlet Allocation LDA is a recursive algorithm for mapping keywords to topics. TECHNICALCONTENTOPTIMIZATION
  60. 60. 6060 TECHNICALCONTENTOPTIMIZATION
  61. 61. 61 Zipf’s Law Zipf’s law is a theory that words will be similarly distributed across documents in the same corpus (or document set). TECHNICALCONTENTOPTIMIZATION
  62. 62. 62 Zipf’s Law Applied When run on an actual dataset, zipf’s law tends to hold up in high rankings. TECHNICALCONTENTOPTIMIZATION
  63. 63. 63 Term Frequency Inverse Document Frequency Term frequency, inverse document frequency (TF-IDF) identifies the importance of keywords with respect to other keywords in documents in a corpus. TECHNICALCONTENTOPTIMIZATION
  64. 64. 6464 TECHNICALCONTENTOPTIMIZATION
  65. 65. 65 Phrase-based Indexing & Co-occurence Google Specifically looks for keywords that co-occur with a target keyword in a document set. Usage of those keywords is an indication of relevance for subsequent documents. TECHNICALCONTENTOPTIMIZATION
  66. 66. 66 But You Heard TF-IDF is Old And You Shouldn’t Focus On it… I actually agree with this, you should not focus on TF-IDF in isolation. I suspect that Google is using a series of text analysis functions to calculate relevance, but the age of the technique does not make it any less valuable. TECHNICALCONTENTOPTIMIZATION
  67. 67. 67 Heap’s Law Heaps law indicates that vocabulary within a corpus grows at a predictable rate. TECHNICALCONTENTOPTIMIZATION
  68. 68. 68 Entity Salience Search engines leverage the features of the entity to understand the context in which the entity is referenced. Implicit features of the entity may be used to understand the document. TECHNICALCONTENTOPTIMIZATION
  69. 69. 69 Hidden Markov Models Hidden Markov Models allow search engines to extract implicit entities. TECHNICALCONTENTOPTIMIZATION
  70. 70. 70 Entities are Also Used in Verification Google double checks what you’re talking about by using entities. H/t @TomAnthonySEO TECHNICALCONTENTOPTIMIZATION
  71. 71. 71 Page Segmentation Modern Search engines determine both prominence of content and review the visual hierarchy to determine how valuable the content is to the page. TECHNICALCONTENTOPTIMIZATION
  72. 72. 72 Search engines are recursive. TECHNICALCONTENTOPTIMIZATION How it all works.
  73. 73. 73 We need to understand Google’s statistical expectation of content and add it to our content’s requirements. TECHNICALCONTENTOPTIMIZATION What we’re missing.
  74. 74. 74 What Should We Be Doing? Let’s Get Actionable TECHNICALCONTENTOPTIMIZATION
  75. 75. 75 Step 1 - Research & Create: Create content that fulfills expectations We need to create or optimize content that balances user needs and search engine expectations. This requires researching the query space based on what is currently ranking in Google’s corpus. Step 2 - Build Relevant Links: Build Links with Content Parity Building links needs to be a more detailed process. You need to only queue up link prospects that are topically relevant. These are the links that will propel you further than links from random unrelated content. TECHNICALCONTENTOPTIMIZATION Step 3 - Test & Optimize: Measure performance and experiment with metadata You need to continually review the performance of your content against that of the average performance in your space. If you are performing below average, this an opportunity to perform experiments on your metadata.
  76. 76. 7676 We need to do more of Column B Our industry talks a lot about the concepts in the column B, but in practice we do not have enough SEO tools to facilitate the latter. TECHNICALCONTENTOPTIMIZATION
  77. 77. 77 It Starts with Keyword Research This cannot be the extent of your keyword research in 2019.
  78. 78. 7878 Let’s Take it back to 2013 At minimum you should be pulling, keywords, search volume, ranking difficulty, co-occurring keywords, target personas, need states, entities. Check out my Persona-driven Keyword research deck for how to pull most of this. https://www.slideshare.net/ipullrank/per sona-driven-keyword-research/
  79. 79. 79 The ultimate goal is to determine a series of co-occurring terms and entities to incorporate into your content for each target keyword. TECHNICALCONTENTOPTIMIZATION The goal.
  80. 80. 80 Meet Knime Analytics https://www.knime.com/knime- software/knime-analytics-platform TECHNICALCONTENTOPTIMIZATION
  81. 81. 81 Knime Allows for Drag & Drop Text Analysis More than just text analysis, Knime has a series of features for. TECHNICALCONTENTOPTIMIZATION
  82. 82. 82 TextProcessing Extension https://www.knime.com/knime-text- processing TECHNICALCONTENTOPTIMIZATION
  83. 83. 83 Palladian HTTP Extension https://www.knime.com/community/pall adian TECHNICALCONTENTOPTIMIZATION
  84. 84. 84 Download a “Corpus” for a Keyword Ideally, we’d review all of the results for a given keyword to achieve parity with Google, but in this case we’ll download the top 100 results for “seo tools” from SEMRush. TECHNICALCONTENTOPTIMIZATION
  85. 85. 85 Here’s a Starter Workflow This workflow allows you to get started with LDA and computes Term Frequency. TECHNICALCONTENTOPTIMIZATION
  86. 86. 8686 TECHNICALCONTENTOPTIMIZATION 1. Put your domain into SEMRush.
  87. 87. 8787 TECHNICALCONTENTOPTIMIZATION 2. See what you don’t rank well for in organic research.
  88. 88. 8888 TECHNICALCONTENTOPTIMIZATION 3. Pick a keyword.
  89. 89. 8989 TECHNICALCONTENTOPTIMIZATION 4. Look at the SERP.
  90. 90. 9090 TECHNICALCONTENTOPTIMIZATION Really Google?
  91. 91. 9191
  92. 92. 9292 TECHNICALCONTENTOPTIMIZATION This is the page that ranks #86?!
  93. 93. 9393 TECHNICALCONTENTOPTIMIZATION 5. See what else could rank for our site.
  94. 94. 9494 TECHNICALCONTENTOPTIMIZATION Enter Ryte’s Content Success.
  95. 95. 9595 TECHNICALCONTENTOPTIMIZATION 6. Put in the keyword and compare TF-IDF to ranking pages.
  96. 96. 9696 TECHNICALCONTENTOPTIMIZATION You can see the corpus they pulled for review.
  97. 97. 9797 TECHNICALCONTENTOPTIMIZATION 7. Put the copy in. See what important keywords are missing and make edits.
  98. 98. 9898 TECHNICALCONTENTOPTIMIZATION Content Optimization implemented Content Optimization implemented Results with no Link Building (Competitive Medicare Terms Min MSV 8100)
  99. 99. 9999 TECHNICALCONTENTOPTIMIZATION Results with No Link Building (Competitive Medicare Terms Min MSV 8100) Content Optimization implemented
  100. 100. 100100 TECHNICALCONTENTOPTIMIZATION
  101. 101. 101101 TECHNICALCONTENTOPTIMIZATION The topic explorer helps you identify semantically clustered keyword opps.
  102. 102. 102102 TECHNICALCONTENTOPTIMIZATION You can generate a new page around the core topics.
  103. 103. 103103 TECHNICALCONTENTOPTIMIZATION It abstracts a lot of the topic modeling and gives you detailed direction as you write.
  104. 104. 104104 TECHNICALCONTENTOPTIMIZATION
  105. 105. 105 A/B Test Metadata A/B test your page titles and meta descriptions by creating different buckets of pages, making changes, and reviewing their CTR performance. TECHNICALCONTENTOPTIMIZATION
  106. 106. 106 Analysis: Compare your CTR AWR publishes CTR stats by vertical https://advancedwebranking.com/ctrstu dy TECHNICALCONTENTOPTIMIZATION
  107. 107. 107 Hypothesis: Identify Specifically Where You Underperform Develop your hypotheses based keywords that underperform based on the expected CTR. TECHNICALCONTENTOPTIMIZATION
  108. 108. 108 Use Wordnet to Scale Keyword Variant Discovery https://wordnet.princeton.edu/ TECHNICALCONTENTOPTIMIZATION
  109. 109. 109 Automate Inclusion of Emerging Keywords from GSC in Rankings Monitor your queries on a page-level week over week to identify new emerging keyword opportunities and automatically add them to your rank tracking. TECHNICALCONTENTOPTIMIZATION
  110. 110. 110110 TECHNICALCONTENTOPTIMIZATION
  111. 111. 111 Review Competitors’ Usage of Entities When Google is processing content, they extract and score entities. Compare your page’s entity salience and ordering against your competitors to identify what you need to feature. Do the same as you do your keyword research. TECHNICALCONTENTOPTIMIZATION
  112. 112. 112112 Use Entities for Intelligent Internal Link Building You can use the same data on your own site to determine how to automate internal link building. TECHNICALCONTENTOPTIMIZATION
  113. 113. 113113 TECHNICALCONTENTOPTIMIZATION
  114. 114. 114114 TECHNICALCONTENTOPTIMIZATION
  115. 115. 115115 TECHNICALCONTENTOPTIMIZATION
  116. 116. 116116 TECHNICALCONTENTOPTIMIZATION
  117. 117. 117 Uncomfortable with APIs? If APIs sound daunting to you, here’s a Knime Analytics workflow that will make it easier for you to pull data from an API and format it into a table. https://www.knime.com/blog/a-restful- way-to-find-and-retrieve-data TECHNICALCONTENTOPTIMIZATION
  118. 118. 118118 TECHNICALCONTENTOPTIMIZATION
  119. 119. 119 Natural Language Generation is the Next Big Thing I suspect that in the near future, a lot of this will be available to automate. There are already solutions that are more than just sophisticated content spinners. TECHNICALCONTENTOPTIMIZATION
  120. 120. 120120 TECHNICALCONTENTOPTIMIZATION
  121. 121. 121 Wrapping Up Bringing it Home! TECHNICALCONTENTOPTIMIZATION
  122. 122. 122 How to do effective Technical Content Optimization A. Determine Search Engine Expectations B. Optimize your Content C. Test All the Things D. Win the SERPs TECHNICALCONTENTOPTIMIZATION
  123. 123. 123 I’m Mike King. (@IPULLRANK) TECHNICALCONTENTOPTIMIZATION FEATURED BY
  124. 124. 124 We are iPullRank. (@IPULLRANKAGENCY) TECHNICALCONTENTOPTIMIZATION
  125. 125. 125 This is Our CEO. (And, I’m #ZORASDAD) TECHNICALCONTENTOPTIMIZATION
  126. 126. 126 At iPullRank, we’re incorporating Natural Language Processing into our workflows to better understand the expectations of Google. TECHNICALCONTENTOPTIMIZATION
  127. 127. 127 Thank You | Q&A Mike King Managing Director Twitter: @iPullRank Email: mike@ipullrank.com TECHNICALCONTENTOPTIMIZATION

×