The Actionable Guide to Doing Better Semantic Keyword Research #BrightonSEO (with KNIME)


For a detailed recap:
My BrightonSEO presentation...
1st Half: What is semantic search, and why does it matter to SEOs?
2nd Half: Using KNIME to do semantic keyword research using SERP and Twitter data.

Published in: Marketing
  • Google understands the query better,
    understands the meaning of text on pages,
    uses query rewriting to be more efficient (returning the same results for searches that mean the same thing),
    and understands the connections between keywords and entity keywords
  • In Addition to Data from Ranking Webpages in the SERP…
  • Better than a spreadsheet: it makes examining how keywords relate a less onerous task, and keyword relations are easier to identify

    1. 1. The Actionable Guide to Doing Better Semantic Keyword Research
    2. 2. Who Am I? Read Later:
    3. 3. The Prevalence of Semantic Search (Unstructured) • Search engines are coming to rely more and more on semantic search technology to understand websites and how users search. • As a result, SEOs need to better understand how language and keywords relate to each other in order to do more effective keyword research. Do semantic keyword research!
    4. 4. What Is Semantic Search? Strings can represent things: • Search Engines are looking past exact match keyword occurrences on web pages. • They are learning the meaning behind keywords and examining how they relate to each other conceptually • The strength of that conceptual connection being scored for relevancy within search queries and on-page.
    5. 5. What is a mammal that has vertebrae and lives in water?
    6. 6. +1 Probability +1 Probability +1 Probability
    7. 7. Google Hummingbird
    8. 8. What’s up with Hummingbird? “Hummingbird is paying more attention to each word in a query, ensuring that the whole query – the whole sentence or conversation or meaning – is taken into account, rather than particular words. The goal is that pages matching the meaning do better, rather than pages matching just a few words.” Hummingbird improves semantic understanding of search queries AND makes conversational search better, which is important for the future of mobile and voice search.
    9. 9. Hummingbird Summarized I like Gianluca Fiorelli’s analysis of the theoretical capabilities of a post-Hummingbird Google search: 1. To better understand the intent of a query; 2. To broaden the pool of web documents that may answer that query; 3. To simplify how it delivers information, because if query A, query B, and query C substantively mean the same thing, Google doesn't need to propose three different SERPs, but just one; 4. To offer a better search experience, because expanding the query and better understanding the relationships between search entities (also based on direct/indirect personalization elements), Google can now offer results that have a higher probability of satisfying the needs of the user. 5. As a consequence, Google may present better SERPs also in terms of better ads, because in 99% of the cases, verbose queries were not presenting ads in their SERPs before Hummingbird. Source:
    10. 10. How Can SEOs Optimize for Semantic Search? 1. Make sure our content delights our users • Create quality content and use personas 2. Optimize for searcher intent and build topical authority using semantic topic modeling • Understand how users search and have command of your niche’s language [Image caption: Now THIS is great content.]
    11. 11. Build Topical Authority for a Subject When conducting keyword research, optimizing on-page, or creating content, have a deep understanding of your niche’s language: 1. Understand how concepts relate to one another and which keywords pertain to those concepts. 2. Ensure these concepts are well represented. [Diagram: cluster of boxes each labeled "keyword"]
    12. 12. Optimize for Searcher Intent Have an exceptional understanding of consumer language and the myriad of ways users may search about your niche 1. What are consumers looking for when they are familiar with your niche? • Language used should represent core keywords. 2. What are consumers looking for when they are not familiar with your niche? • Language tends to be more conversational. You may uncover more related terms when exploring your niche from this perspective. 3. What else do these two groups search for? • These searches may be directly and/or indirectly related.
    13. 13. Actually doing Semantic Keyword Research… Social Media Is an Awesome Data Source
    14. 14. Social Media Is an AWESOME Data Source for Semantic Keyword Research 1. Social media data helps us expand our collection of keyword ideas, especially new, breaking keywords. 2. Social media language is inherently conversational and can help us understand how conversational queries may be phrased. 3. We can use it to mimic the language of the customer, which has a secondary CRO benefit. #Awesome
    15. 15. Secondary CRO Benefit: The Echo Effect While you’re at it, use social media language to mimic the language of your consumer. Several studies indicate it may help build trust and boost conversions: • Study published in the International Journal of Hospitality Management: Waitresses who verbally mimicked a person’s order were more likely to receive higher tips. • Study published in the Journal of Language and Social Psychology: Mirroring people’s words can be very important in building likability, safety, rapport, and social cohesion.
    16. 16. Once We Collect SERP and Social Media Data... There are several ways we can break it down and analyze it. Co-occurrence • How often two or more words appear alongside each other in a corpus of documents. Latent Dirichlet Allocation (LDA) • Finds semantically related keywords and groups them into topical buckets. TF-IDF (Term Frequency-Inverse Document Frequency) • Reflects how important a keyword is to a document within a whole collection of documents.
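For readers who want to see what the TF-IDF score on slide 16 measures before building it in KNIME, here is a minimal Python sketch. It is not the KNIME node's implementation: it assumes whitespace tokenization, term frequency as count divided by document length, and an unsmoothed `log(N / df)` for inverse document frequency. The function name `tf_idf` and the toy corpus are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    Assumed definitions (one common variant, not the only one):
      tf  = term count / number of tokens in the document
      idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy corpus standing in for scraped SERP or Tweet text.
corpus = [
    "semantic search understands meaning",
    "keyword research for semantic search",
    "twitter data for keyword research",
]
scores = tf_idf(corpus)
```

A term that appears in only one document ("understands") scores higher than one shared across documents ("semantic"), and a term present in every document scores zero, which is why TF-IDF works as a filter for overly common words.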
    17. 17. The Ultimate Tool
    18. 18. KNIME Is the One Tool to Rule Them All • Free and open source, running on every platform • Allows you to do things using a drag-and-drop interface that you would normally need a developer or programming background to accomplish. • Synergizes data-oriented tasks and helps easily automate: data collection, data manipulation, analysis, visualization, and reporting.
    19. 19. Visualizations KNIME Produces That Will Help Optimize for Semantic Search: Keyword Node Graphs, Segmented Word Clouds
    20. 20. Basics of KNIME
    21. 21. What’s a Node? • Pre-built drag-and-drop boxes designed to do a single task. • They are combined together into “workflows” to do larger, more complex tasks. • Nodes can be grouped together into meta-nodes which can be configured in unison.
    22. 22. How Do You Add Nodes and How Do They Connect? How do you add nodes? How do you connect nodes to one another?
    23. 23. Configuring Nodes and Running Workflows Configuring Nodes Running Workflows OR
    24. 24. Accessing Data from SERP and Twitter + Common Node Configurations We’ll Be Using
    25. 25. Get a Twitter API Key Fill out the forms! • Application “Name”, “Description”, and “Website” don’t matter for our purposes. Go to “Keys and Access Tokens” tab and grab: • Consumer Key (API Key) • Consumer Secret (API Secret) Click “Create my access token” and grab: • Access Token • Access Token Secret Go to:
    26. 26. Accessing Social Data – Twitter API Nodes Right-Click and “Configure” to input API information Right-Click and “Configure” Twitter Search Query (and type)
    27. 27. It’s Stupid Easy
    28. 28. Extract Only the Links from Twitter A little trickier than it should be since you have to expand links and URL shorteners.
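Outside KNIME, the two steps on slide 28 (pull the links out of each Tweet, then expand the shortened ones) might look like the Python sketch below. The regular expression, the function names `extract_links` and `expand_link`, and the sample Tweets are all illustrative assumptions; the slides do not show how the KNIME workflow does this internally. `expand_link` makes a live network request, so it is defined here but not called.

```python
import re
import urllib.request

# Simplified pattern: anything starting with http:// or https:// up to whitespace.
URL_PATTERN = re.compile(r"https?://[^\s]+")


def extract_links(tweet_text):
    """Return every http(s) URL in a Tweet, with trailing punctuation stripped."""
    return [url.rstrip(".,!?)") for url in URL_PATTERN.findall(tweet_text)]


def expand_link(short_url, timeout=10):
    """Follow redirects and return the final URL (performs a network request)."""
    with urllib.request.urlopen(short_url, timeout=timeout) as response:
        return response.geturl()


# Hypothetical Tweets for illustration only.
tweets = [
    "Great read on semantic search http://t.co/abc123 #seo",
    "No link in this one",
    "Two links: https://example.com/a, and http://t.co/xyz.",
]
links = [url for tweet in tweets for url in extract_links(tweet)]
```

Expanding each link before scraping matters because several shortened URLs can point at the same page, and the text analysis should count that page once.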
    29. 29. Accessing SERP Data – Inputting Data Manually Manually input URLs with Excel Spreadsheet or CSV (Desktop Rank Checkers) Manually input URLs with “Table Creator” node (Right-Click Configure – edit just like a spreadsheet)
    30. 30. Accessing SERP Data – Inputting Data via API (Better) Example – GetSTAT More-Complicated Meta Node Method
    31. 31. Make Webpages Plain Text (for Analysis) Use Boilerpipe API (pre-made meta-node download to be provided)
    32. 32. Getting Things into a Text Analysis Format Use the built-in “Strings To Document” node
    33. 33. A Few More Useful Base Nodes for Text Analysis
    34. 34. Parts of Speech Tagging (POS)
    35. 35. Calculate TF-IDF
    36. 36. Co-Occurrence Nodes
    37. 37. LDA (Latent Dirichlet Allocation) Node
    38. 38. Color Manager & Word Cloud
    39. 39. Network Graph
    40. 40. Process: Using KNIME for Semantic Topic Modeling and Keyword Research
    41. 41. Bringing It All Together: Applying Concepts to Visualizations 1. Search Twitter for the keyword and collect all of the Tweet text 2. Search Twitter for the keyword, extract links only, and scrape text from the links 3. Extract the top 10 ranking pages for the keyword and scrape text from the links 4. Isolate single-word keywords and/or multi-word N-grams 5. Calculate TF-IDF THEN we can… • Tag Parts of Speech (Nouns, Adjectives, Verbs, etc.) and display in Word Cloud • Do Co-Occurrence Analysis and display in Node Graph (remember earlier patent?) • Identify semantic topic groupings with LDA and display in Node Graph
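Step 4 on slide 41, isolating single words and multi-word N-grams, is simple enough to sketch in a few lines of Python. This is a rough illustration that assumes lowercasing and whitespace tokenization; KNIME's own N-gram node may tokenize differently, and the function name `ngrams` is hypothetical.

```python
def ngrams(text, n):
    """Return all runs of n consecutive words in text, as space-joined strings."""
    tokens = text.lower().split()
    # A text shorter than n words yields an empty list.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sample = "Semantic keyword research tools"
unigrams = ngrams(sample, 1)
bigrams = ngrams(sample, 2)
```

Running the same text through several values of `n` gives the mix of head terms and longer phrases that the later TF-IDF and co-occurrence steps work on.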
    42. 42. Analysis We Can Do Based on a Google Patent Simplified with a smaller corpus, but easily replicable with KNIME: 1. Filter out too common terms using TF-IDF 2. Take the top 20 or so terms that are above a certain threshold based upon TF-IDF and remove the rest. 3. Calculate Co-occurrence of the remaining terms. 4. Optimize your site for these! Bill Slawski Patent Analysis:
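The four steps on slide 42 can be approximated in Python as a sanity check on what the KNIME workflow should produce. This sketch assumes TF-IDF scores have already been computed per term; the scores, the threshold of 0.05, and the helper names `top_terms` and `co_occurrence` are invented for illustration and are not taken from the patent or the slides. Co-occurrence here means "both terms appear in the same document", which is one of several possible definitions.

```python
from collections import Counter
from itertools import combinations


def top_terms(term_scores, threshold, limit=20):
    """Keep terms scoring at or above threshold, then the `limit` highest of those."""
    kept = [(term, score) for term, score in term_scores.items() if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [term for term, _ in kept[:limit]]


def co_occurrence(docs, terms):
    """Count, for each pair of kept terms, how many documents contain both."""
    wanted = set(terms)
    pairs = Counter()
    for doc in docs:
        present = sorted(wanted & set(doc.lower().split()))
        pairs.update(combinations(present, 2))
    return pairs


# Hypothetical precomputed TF-IDF scores and documents.
term_scores = {"semantic": 0.30, "keyword": 0.25, "knime": 0.20, "the": 0.01}
docs = [
    "semantic keyword research with knime",
    "the semantic keyword guide",
    "knime workflows",
]
terms = top_terms(term_scores, threshold=0.05)
pairs = co_occurrence(docs, terms)
```

The pairs with the highest counts are the candidates for step 4: terms that tend to travel together in the pages and Tweets you collected.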
    43. 43. Bringing It All Together – Parts of Speech Output
    44. 44. Bringing It All Together – TF-IDF + Co-Occurrence Output
    45. 45. Bringing It All Together – TF-IDF + LDA Output
    46. 46. Now Start Building More Effective Semantically Optimized Websites!
    47. 47. © 2015 by Catalyst Digital. All rights reserved.