The Actionable Guide to Doing Better Semantic Keyword Research #BrightonSEO (with KNIME)

Paul Shapiro, Head of SEO, Catalyst
Who Am I?
Read Later: http://searchwilderness.com/
The Prevalence of Semantic Search (Unstructured)
• Search engines are coming to rely more and more on semantic search
technology to understand websites and how users search.
• As a result, SEOs need to better understand how language and
keywords relate to each other in order to do more effective
keyword research.
Do semantic keyword research!
What Is Semantic Search?
Strings can represent things:
• Search Engines are looking past exact match keyword occurrences
on web pages.
• They are learning the meaning behind keywords and examining how
they relate to each other conceptually.
• The strength of that conceptual connection is scored for
relevancy within search queries and on-page.
What is a mammal that has
vertebrae and lives in water?
[Figure: three items, each labeled "+1 Probability"]
Google Hummingbird
What’s up with Hummingbird?
“Hummingbird is paying more attention to
each word in a query, ensuring that the whole
query – the whole sentence or conversation or
meaning – is taken into account, rather than
particular words. The goal is that pages
matching the meaning do better, rather than
pages matching just a few words.”
Hummingbird improves semantic understanding of search queries AND
makes conversational search better, which is important for the future of
mobile and voice search.
Hummingbird Summarized
I like Gianluca Fiorelli’s analysis of the theoretical capabilities of a
post-Hummingbird Google search:
1. To better understand the intent of a query;
2. To broaden the pool of web documents that may answer that query;
3. To simplify how it delivers information, because if query A, query B, and query C
substantively mean the same thing, Google doesn't need to propose three
different SERPs, but just one;
4. To offer a better search experience, because expanding the query and better
understanding the relationships between search entities (also based on
direct/indirect personalization elements), Google can now offer results that have
a higher probability of satisfying the needs of the user.
5. As a consequence, Google may present better SERPs also in terms of better
ads, because in 99% of the cases, verbose queries were not presenting ads in
their SERPs before Hummingbird.
Source: http://pshapi.ro/mozingbird
How Can SEOs Optimize for Semantic Search?
1. Make sure our content delights our users
• Create quality content and use personas
2. Optimize for searcher intent and build topical authority using semantic
topic modeling
• Understand how users search and have command of your niche’s language
Now THIS is
great content.
Build Topical Authority for a Subject
When conducting keyword research, optimizing on-page, or creating
content, have a deep understanding of your niche’s language:
1. Understand how concepts relate to one another and which
keywords pertain to those concepts.
2. Ensure these concepts are well represented.
[Figure: diagram of keyword nodes]
Optimize for Searcher Intent
Have an exceptional understanding of consumer language and the
myriad ways users may search within your niche.
1. What are consumers looking for when they are familiar with your
niche?
• Language used should represent core keywords.
2. What are consumers looking for when they are not familiar with
your niche?
• Language tends to be more conversational. You may
uncover more related terms when exploring your niche
from this perspective.
3. What else do these two groups search for?
• These searches may be directly and/or indirectly related.
Actually Doing Semantic Keyword Research…
Social Media Is an Awesome Data Source
Social Media Is an AWESOME Data Source
for Semantic Keyword Research
1. Social media data helps us expand our collection of keyword
ideas—especially new, breaking keywords.
2. Social media language is inherently conversational and can help us
understand how conversational queries may be phrased.
3. We can use it to mimic the language of the customer, which has a
secondary CRO benefit.
#Awesome
Secondary CRO Benefit: The Echo Effect
While you’re at it, use social media language to mimic the language of
your consumer. Several studies indicate it may help build
trust and boost conversions.
• Study published in the International Journal of Hospitality
Management:
  • Waitresses who verbally mimicked a person’s order were more
  likely to receive higher tips.
• Study published in the Journal of Language and Social Psychology:
  • Mirroring people’s words can be very important in building
  likability, safety, rapport, and social cohesion.
http://pshapi.ro/echohospitality
http://pshapi.ro/echoinfluence
Once We Collect SERP and Social Media Data...
There are several ways we can break it down and analyze it.
Co-occurrence
• How often two or more words appear alongside each other in a
corpus of documents.
Latent Dirichlet Allocation (LDA)
• Finds semantically related keywords and groups them into topical
buckets.
TF-IDF (Term Frequency-Inverse Document Frequency)
• Reflects how important a keyword is to a document relative to a whole
collection of documents.
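To make the TF-IDF definition concrete, here is a minimal Python sketch outside of KNIME. It assumes one common formulation (term count divided by document length, multiplied by the unsmoothed log of total documents over documents containing the term); KNIME's own nodes may compute the score differently. The sample documents and the function name tf_idf are made up for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document.

    tf  = term count / total terms in the document
    idf = log(number of documents / documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical mini-corpus, for illustration only.
docs = [
    "running shoes for trail running",
    "best shoes for road running",
    "trail maps for hikers",
]
scores = tf_idf(docs)
```

A word such as "for" that appears in every document gets a score of zero, while a word found in only one document scores highest there. That is the property the later filtering steps rely on.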
The Ultimate Tool
KNIME Is the One Tool to Rule Them All
• Free and open source, running on every platform
• Allows you to do things using a drag-and-drop interface that you would
normally need a developer or programming background to accomplish.
• Synergizes data-oriented tasks and helps easily automate:
  • Data collection
  • Data manipulation
  • Analysis
  • Visualization
  • Reporting
http://pshapi.ro/downloadknime
Visualizations KNIME Produces
That Will Help Optimize for Semantic Search
Keyword Node Graphs
Segmented Word Clouds
Basics of KNIME
What’s a Node?
• Pre-built drag-and-drop boxes designed to do a single task.
• They are combined into “workflows”
to do larger, more complex tasks.
• Nodes can be grouped together into meta-nodes which can be
configured in unison.
How Do You Add Nodes and How Do They Connect?
How do you add nodes?
How do you connect nodes to one another?
Configuring Nodes and Running Workflows
Configuring Nodes
Running Workflows
OR
Accessing Data from SERP and Twitter +
Common Node Configurations We’ll Be Using
Get a Twitter API Key
Fill out the forms!
• Application “Name”,
“Description”, and “Website”
don’t matter for our
purposes.
Go to “Keys and Access Tokens”
tab and grab:
• Consumer Key (API Key)
• Consumer Secret (API Secret)
Click “Create my access token”
and grab:
• Access Token
• Access Token Secret
Go to: https://apps.twitter.com/
Accessing Social Data – Twitter API Nodes
Right-Click and “Configure”
to input API information
Right-Click and “Configure”
Twitter Search Query (and
type)
It’s Stupid Easy
Extract Only the Links from Twitter
A little trickier than it should be, since you have to expand t.co links and
other shortened URLs.
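For readers who want to see the idea outside of KNIME, the following Python sketch pulls URLs out of tweet text and follows redirects to get the final address. It is only an illustration, not the workflow from the deck: the regular expression, the function names, and the sample tweets (including the t.co paths) are all hypothetical, and expand_url makes a live network request, so it is defined but not called here.

```python
import re
import urllib.request

# Rough URL matcher; an assumption for this sketch, not a full URL grammar.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")

def extract_links(tweet_text):
    """Return the URLs in a tweet, with trailing punctuation stripped."""
    return [url.rstrip(".,;:!?)") for url in URL_PATTERN.findall(tweet_text)]

def expand_url(short_url, timeout=10):
    """Follow redirects and return the final URL (makes a network request)."""
    request = urllib.request.Request(short_url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()

# Hypothetical tweets, for illustration only.
tweets = [
    "Great read on semantic SEO https://t.co/abc123 #seo",
    "No links here, just words",
    "Two links: https://t.co/xyz789, and http://example.com/page.",
]
links = [url for tweet in tweets for url in extract_links(tweet)]
```

Some shorteners chain several redirects or reject HEAD requests, so a real workflow would likely need error handling and a fallback to GET.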
Accessing SERP Data – Inputting Data Manually
Manually input URLs with Excel Spreadsheet or CSV (Desktop Rank
Checkers)
Manually input URLs with “Table Creator” node (Right-Click Configure –
edit just like a spreadsheet)
Accessing SERP Data – Inputting Data via API (Better)
Example – GetSTAT
More-Complicated Meta Node Method
Make Webpages Plain Text (for Analysis)
Use Boilerpipe API (pre-made meta-node download to be provided)
http://boilerpipe-web.appspot.com/
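As a rough picture of what "plain text for analysis" means, the sketch below strips markup with Python's standard html.parser module. This is a crude stand-in and not Boilerpipe: it only drops tags plus the contents of a few elements (script, style, nav, footer, chosen here as an assumption), whereas Boilerpipe tries to detect and remove boilerplate blocks. The class name and sample HTML are hypothetical.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping elements that rarely hold content."""

    SKIPPED_TAGS = {"script", "style", "nav", "footer"}  # assumption

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)

def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()

# Hypothetical page, for illustration only.
sample = """
<html><head><title>Trail Shoes</title><style>p {color: red}</style></head>
<body><nav><a href="/">Home</a></nav>
<p>Trail running shoes need <b>grip</b>.</p>
<script>var x = 1;</script></body></html>
"""
plain = html_to_text(sample)
```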
Getting Things into a Text Analysis Format
Use the built-in “Strings To Document” node
A Few More Useful Base Nodes for Text Analysis
Parts of Speech Tagging (POS)
Calculate TF-IDF
Co-Occurrence Nodes
LDA (Latent Dirichlet Allocation) Node
Color Manager & Word Cloud
Network Graph
Process: Using KNIME for Semantic
Topic Modeling and Keyword Research
Bringing It All Together:
Applying Concepts to Visualizations
1. Search Twitter for keyword and collect all of the Tweet text
2. Search Twitter for keyword, extract links only, scrape text from links
3. Extract the top 10 ranking pages for the keyword and scrape text from links
4. Isolate single word keywords and/or multi-word N-grams
5. Calculate TF-IDF
THEN we can…
• Tag Parts of Speech (Nouns, Adjectives, Verbs, etc.) and display in
Word Cloud
• Do Co-Occurrence Analysis and display in Node Graph (remember the
earlier patent?)
• Identify semantic topic groupings with LDA and display in Node Graph
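Step 4 above (isolating single words and multi-word N-grams) can be sketched in a few lines of Python. This is an illustration only, not the KNIME node configuration; the tokenizer rule, the function names, and the sample texts are assumptions.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a single string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical collected text, for illustration only.
texts = [
    "Best trail running shoes for beginners",
    "Trail running shoes with good grip",
]
bigram_counts = Counter(
    gram for text in texts for gram in ngrams(tokenize(text), 2)
)
```

Counting the resulting phrases shows which multi-word keywords recur across the collected text; setting n to 1 gives single-word keywords.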
Analysis We Can Do Based on a Google Patent
Simplified with a smaller corpus, but easily replicable with KNIME:
1. Filter out overly common terms using TF-IDF
2. Take the top 20 or so terms that are above a certain threshold based
upon TF-IDF and remove the rest.
3. Calculate Co-occurrence of the remaining terms.
4. Optimize your site for these!
Bill Slawski Patent Analysis: http://pshapi.ro/cooccurencepatent
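The steps on this slide can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the patent's method or the KNIME workflow: each term is ranked by its highest TF-IDF score in any document, the threshold defaults to zero, and co-occurrence is counted at the document level. The function names, parameters, and sample documents are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def top_terms_by_tfidf(tokenized_docs, top_n=20, min_score=0.0):
    """Rank terms by their highest TF-IDF score in any document."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter(t for tokens in tokenized_docs for t in set(tokens))
    best = {}
    for tokens in tokenized_docs:
        for term, count in Counter(tokens).items():
            score = (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            best[term] = max(best.get(term, 0.0), score)
    ranked = sorted(best, key=lambda term: (-best[term], term))
    # Terms found in every document score 0 and are dropped as too common.
    return [term for term in ranked if best[term] > min_score][:top_n]

def cooccurrence(tokenized_docs, terms):
    """Count the documents in which each pair of kept terms appears together."""
    keep = set(terms)
    pairs = Counter()
    for tokens in tokenized_docs:
        present = sorted(keep & set(tokens))
        pairs.update(combinations(present, 2))
    return pairs

# Hypothetical mini-corpus, for illustration only.
docs = [
    "trail shoes need grip on the trail",
    "road shoes need cushioning on the road",
    "trail grip matters more than cushioning on the mud",
]
tokenized = [doc.split() for doc in docs]
terms = top_terms_by_tfidf(tokenized, top_n=20)
pairs = cooccurrence(tokenized, terms)
```

Pairs with the highest counts are the candidates to cover together on a page. Counting within a sliding window of words instead of whole documents is a common alternative when the documents are long.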
Bringing It All Together – Parts of Speech Output
Bringing It All Together – TF-IDF + Co-Occurrence Output
Bringing It All Together – TF-IDF + LDA Output
Now Start Building More Effective
Semantically Optimized Websites!
© 2015 by Catalyst Digital. All rights reserved.


Editor's Notes

  1. Google understands the query better, understands the meaning of text on pages, and uses query rewriting to be more efficient, returning the same results for searches that mean the same thing. It understands the connections between keywords and entity keywords.
  2. In addition to data from ranking webpages in the SERP…
  3. Better than a spreadsheet: it makes looking at the relation of keywords a less onerous task, and keyword relations are easier to identify.