In Search of Natural Language Processing: Rank Brain, Google, SEO, and You.


It is often mistakenly thought that Google performs natural language processing on its search results; as of 2018, it still does not. This presentation looks at how Google started, its historical approach to language, how it is working towards NLP, and how new machine learning methods support the "strings to things" interpretation of text and voice. It also covers how Rank Brain plays into all of this.

Published in: Internet

  1. #Ungagged #Vegas @schachin Kristine Schachinger: In Search of NLP
  2. In the beginning, there was a … Large-Scale Hypertextual Web Search Engine
  3. What?
  5. Link Profiles
  6. The Web, 1998
  7. Google Goes To Work
  8. In 2018 … roughly half of the world's population, or 3.8 billion people, use the internet every day.
  9. Google processes TRILLIONS of queries a year and has indexed BILLIONS of websites.
  10. In 2015, there were 2,834,650,000,000 Google searches, an average of 7,766,000,000 searches a day.
  11. That breaks down to an average of 7.7 billion searches per day, or nearly 90,000 search queries per second.
  13. Dealing With The Data.
  15. Google Search was founded on unstructured data.
  16. Unstructured data (or unstructured information) is information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well.
  18. Keywords.
  19. Search over unstructured data relies on keywords.
  20. TF-IDF: Term Frequency-Inverse Document Frequency, i.e., how often a keyword appears in a document, weighted by how rare that keyword is across all documents.
  21. As queries number in the trillions, unstructured data becomes inefficient. Data needs structure.
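The TF-IDF weighting named on slide 20 can be sketched in a few lines. This is a minimal illustration using the textbook formula, not a description of how Google computes it; the function name `tf_idf` and the three-page corpus are invented for the example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = count of the term in the document / number of terms in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical three-page corpus, invented for illustration.
pages = [
    "small dogs are small",
    "big dogs need space",
    "cats ignore dogs",
]
scores = tf_idf(pages)
```

Because "dogs" appears on every page, its weight is zero everywhere, while "small" scores highest on the first page. The score is driven purely by counts; nothing in it captures what the words mean, which is the limitation the following slides address.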
  22. Keywords to Queries. Welcome Semantic Search!
  23. Knowledge Graphs: So Google moved from relational databases to knowledge graphs.
  24. Knowledge Graphs: NOTE: knowledge graphs DO NOT EQUAL THE Knowledge Graph.
  25. Knowledge Graphs: “Graph-based knowledge representation has been researched for decades and the term knowledge graph does not constitute a new technology. Rather, it is a buzzword reinvented by Google and adopted by other companies and academia to describe different knowledge representation applications.”
  26. Enter Semantic Search and TensorFlow
  27. MACHINE LEARNING
  28. What is Semantic Search?
  29. Semantic Search = Understanding Intent
  30. Welcome G Squared
  31. Google Squared: “Google Squared returns search results in a spreadsheet format. It structures the unstructured data on web pages. So a search for Small Dogs returns results with names, description, size, weight, origin, etc., in columns and rows.” ~TechCrunch
  32. Before the Knowledge Graph
  33. Google Squared: “Call it structured data if you like, I call it a surefire recipe for making a bad dog buying decision.”
  34. Google Kills Google Squared. RIP Google Squared, 2009-2011
  35. Knowledge Graphs: (Knowledge Graphs) “…quite possibly ... one of Google's significant achievements” ~Nathania Johnson, Search Engine Watch
  36. Why?
  37. The Holy Grail of Search? NLP (Natural Language Processing)
  38. “Strings to Things”: But Google doesn’t process Natural Language.
  39. “Strings to Things”: G-Squared was an early stage of Google moving search from strings (unstructured data), the “bag of words” approach, to “things” (structured data).
  40. “Strings to Things”: “Things” are known objects with known (or learned) relationships.
  41. Before THE Knowledge Graph – Wonder Wheel
  42. Before the Knowledge Graph – Wonder Wheel
  43. Welcome THE Knowledge Graph, 2012.
  44. THE Knowledge Graph: Knowledge graphs are based on known relationships. THE Knowledge Graph is Google’s graph database.
  45. THE Knowledge Graph: Google’s Knowledge Graph is seeded with known things. Instead of just text without meaning, the KG is a relational graph of known objects and mapped relationships.
  46. THE Knowledge Graph Seeds.
  47. THE Knowledge Graph Seeds: "Four years ago this July, Google acquired Metaweb, bringing Freebase and linked open data to Google," wrote Google software engineer Barak Michener.
  48. THE Knowledge Graph Seeds: Also includes trusted sources such as the CIA World Factbook, Wikipedia, Wikidata, etc.
  49. THE Knowledge Graph: Why the Knowledge Graph? To better match user intent. To understand what users want.
  50. THE Knowledge Graph: The Knowledge Graph enables you to search for things, people or places that Google knows about (landmarks, celebrities, cities, sports teams, buildings, geographical features, movies, celestial objects, works of art and more) and instantly get information that’s relevant to your query.
  51. THE Knowledge Graph: In other words, NOUNS
  52. THE Knowledge Graph: NOUNS = ENTITIES
  53. THE Knowledge Graph: Google moves to ENTITY SEARCH
  54. THE Knowledge Graph ENTITIES: The Knowledge Graph has millions of entries that describe real-world entities like people, places, and things. These entities form the nodes of the graph. The following are some of the types of entities found in the Knowledge Graph: Book, BookSeries, EducationalOrganization, Event, GovernmentOrganization, LocalBusiness, Movie, MovieSeries, MusicAlbum, MusicGroup, MusicRecording, Organization, Periodical, Person, Place, SportsTeam, TVEpisode, TVSeries, VideoGame, VideoGameSeries, WebSite.
  55. THE Knowledge Graph: Entities + Relationships = THE Knowledge Graph
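The "entities + relationships" idea can be sketched as a tiny graph of subject, relation, object triples. This is a toy illustration only, not Google's data model; the class `TinyKnowledgeGraph`, the relation names `type` and `located_in`, and the sample facts are all assumptions made up for the example.

```python
from collections import defaultdict

class TinyKnowledgeGraph:
    """Entities are nodes; each relationship is a labeled, directed edge."""

    def __init__(self):
        # subject -> relation -> set of objects
        self._edges = defaultdict(lambda: defaultdict(set))

    def add(self, subject, relation, obj):
        self._edges[subject][relation].add(obj)

    def objects(self, subject, relation):
        """Return every entity linked from `subject` by `relation`, sorted."""
        return sorted(self._edges.get(subject, {}).get(relation, set()))

def located_in_chain(graph, entity):
    """Follow located_in edges upward to answer 'where is X?' at every level."""
    chain = []
    seen = {entity}
    current = entity
    while True:
        parents = graph.objects(current, "located_in")
        if not parents or parents[0] in seen:
            return chain
        current = parents[0]
        seen.add(current)
        chain.append(current)

# Hypothetical seed facts, invented for illustration.
kg = TinyKnowledgeGraph()
kg.add("Las Vegas", "type", "Place")
kg.add("Las Vegas", "located_in", "Nevada")
kg.add("Nevada", "type", "Place")
kg.add("Nevada", "located_in", "United States")
```

With these seeds, `located_in_chain(kg, "Las Vegas")` walks two edges and returns Nevada and then United States. The answer comes from mapped relationships between known things rather than from matching the string "Las Vegas" in page text.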
  56. THE Knowledge Graph: Knowledge Graph = the Answer Engine
  58. Google as an Answer Engine
  59. Hummingbird: “Strings to Things”.
  60. “Strings to Things”: Hummingbird. The name was derived from the speed and accuracy of the hummingbird.
  61. “Strings to Things”: Hummingbird arrives, 2013. Google moves from matching keyword terms to trying to process natural language queries.
  62. “Strings to Things”: But Google doesn’t process Natural Language very well.
  63. Moving to vectors.
  64. Hummingbird: KEY FACTOR word2vec: Vector space models (VSMs) represent (embed) words in a continuous vector space where semantically similar words are mapped to nearby points ('are embedded nearby each other').
  65. Hummingbird: Embedded Word Model
  66. Hummingbird: “…words that appear in the same contexts share semantic meaning. The different approaches that leverage this principle can be divided into two categories: count-based methods (e.g. Latent Semantic Analysis), and predictive methods (e.g. neural probabilistic language models).”
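"Mapped to nearby points" can be made concrete with cosine similarity. The sketch below uses hand-made three-dimensional vectors purely for illustration; real word2vec embeddings are learned from text and have far more dimensions, and the names `toy_vectors`, `cosine_similarity` and `nearest` are invented for this example.

```python
import math

# Hand-made 3-dimensional vectors, invented for illustration only.
toy_vectors = {
    "dog":   [0.9, 0.8, 0.1],
    "puppy": [0.8, 0.9, 0.2],
    "car":   [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(word, vectors):
    """Return the other word whose vector is closest to `word`."""
    return max(
        (other for other in vectors if other != word),
        key=lambda other: cosine_similarity(vectors[word], vectors[other]),
    )
```

Here `nearest("dog", toy_vectors)` picks "puppy" rather than "car", because the first two vectors point in almost the same direction. No keyword overlap is involved; similarity is a property of position in the space.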
  67. Hummingbird
  68. “Strings to Things”: Hummingbird added a semantic layer to the search algorithms.
  69. Semantic Interpretations.
  70. Hummingbird adds a semantic layer to the search algorithms that handles synonyms and close variants.
  71. Hummingbird adds a semantic layer to the search algorithms that uses “semantic distance and term relationships”.
  72. Hummingbird adds a semantic layer to the search algorithms that uses “phrase-based indexing and co-occurrence.”
  73. Page Segmentation. This part of the algorithm determines meaning through placement.
  74. Entity Salience. This part of the algorithm determines meaning through known relationships.
  75. Hummingbird: So Hummingbird moves from strict word-count-based modeling (i.e., keyword counts) to probabilistic modeling (i.e., predictive interpretation) via known word vectors.
  77. What does this look like? From Google’s Natural Language Cloud Tool
  78. What does this look like?
  79. What does this look like mathematically?
  80. BUT … Google Search still doesn’t process Natural Language. This means we must add an “interpreter”.
  81. Structured Data and Schema.
  82. What is Structured Data?
  83. What is Structured Data? Structured data for SEO purposes is on-page markup that enables search engines to better understand the information currently on your site’s web pages, and then use this information to improve search result listings by better matching user intent.
  84. What is Structured Data? This structured data is defined by using schema to act as the interpreter. This is the definition we add to the page using schema code. Google allows 3 formats: RDFa, Microdata, and JSON-LD.
  85. Schema: JSON-LD is the recommended schema format. JSON-LD stands for JavaScript Object Notation for Linked Data. It is a way to implement schema outside the HTML markup structure; RDFa and Microdata require the markup to be woven into the HTML itself.
  86. Schema: The benefit is that it is kept separate from the HTML structure, which makes it easier to write, implement, and maintain. For a good breakdown of what JSON is at the code level, Portent’s JSON Implementation Guide is very helpful.
  87. JSON-LD Schema
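A minimal sketch of what a JSON-LD block might look like, built in Python so the output can be checked. The event details are invented and the helper `to_json_ld_script` is hypothetical; `Event`, `Place`, `name`, `startDate`, `location` and `address` are schema.org vocabulary. Which types and properties qualify for any given search feature is set by Google's own guidelines, so treat this as a shape to adapt rather than a template to copy.

```python
import json

# Hypothetical event details, invented for illustration.
event = {
    "@context": "https://schema.org",
    "@type": "Event",
    "name": "Example SEO Conference",
    "startDate": "2018-11-01",
    "location": {
        "@type": "Place",
        "name": "Example Hotel",
        "address": "Las Vegas, NV",
    },
}

def to_json_ld_script(data):
    """Wrap a schema.org dict in the script tag that carries JSON-LD."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"

snippet = to_json_ld_script(event)
```

The resulting `snippet` is a single script element that can sit anywhere in the page, which is the maintenance benefit described above: the markup lives in one place instead of being spread across HTML attributes.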
  88. Schema: IMPORTANT! Test your JSON-LD. Use the Google Structured Mark-Up Helper.
  89. Schema: NOTE: this tool only tells you whether the markup is syntactically correct, NOT whether you are using the proper schema. Make sure to check Google’s guides on schema implementation. Improper use or implementation can result in a manual action.
  90. Schema: IMPORTANT! Your JSON-LD content MUST match what is on the page exactly. If they differ, you will likely get a manual action, as Google sees this as cloaking.
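One rough way to catch mismatches before publishing is to compare the strings in the JSON-LD against the text of the page. The sketch below is a heuristic under stated assumptions, not a reproduction of any Google check: `PageScanner`, `string_values` and `values_missing_from_page` are hypothetical names, matching is a plain substring test, and values that are legitimately formatted differently on the page (ISO dates, for example) will be flagged and need a manual look.

```python
import json
from html.parser import HTMLParser

class PageScanner(HTMLParser):
    """Collect page text and any JSON-LD payloads from an HTML document."""

    def __init__(self):
        super().__init__()
        self.visible_text = []
        self.json_ld_blocks = []
        self._script_kind = None  # None, "json-ld", or "other"
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            is_json_ld = dict(attrs).get("type") == "application/ld+json"
            self._script_kind = "json-ld" if is_json_ld else "other"
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "script":
            if self._script_kind == "json-ld":
                self.json_ld_blocks.append(json.loads("".join(self._buffer)))
            self._script_kind = None

    def handle_data(self, data):
        if self._script_kind == "json-ld":
            self._buffer.append(data)
        elif self._script_kind is None:
            # Simplification: style blocks and the title count as page text.
            self.visible_text.append(data)

def string_values(node):
    """Yield every string value in nested JSON, skipping @-prefixed keys."""
    if isinstance(node, dict):
        for key, value in node.items():
            if not key.startswith("@"):
                yield from string_values(value)
    elif isinstance(node, list):
        for item in node:
            yield from string_values(item)
    elif isinstance(node, str):
        yield node

def values_missing_from_page(html):
    """Return JSON-LD string values that do not appear in the page text."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    text = " ".join(" ".join(scanner.visible_text).split())
    return [
        value
        for block in scanner.json_ld_blocks
        for value in string_values(block)
        if value not in text
    ]
```

An empty list means every marked-up string was found somewhere in the page text; anything returned is a candidate mismatch to review by hand.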
  91. Schema
  92. Why Does Schema Matter?
  93. We can act as the interpreter and help “teach” Google what our site is about.
  94. Adding semantic markup (structured data via schema) allows us to tell Google what WE SAY our site is about and WHAT RELATIONSHIPS we define within it.
  95. We can act as the interpreter and help “teach” Google the context of our content.
  97. We can help give Google a clearer understanding. That helps Google better answer the questions users ask and better surface our content for those users. We give our data meaning Google understands.
  98. Ranking Without Links
  99. Rank Brain
  101. Rank Brain is used for unknown queries, where entity meanings or relationships are unclear or unknown.
  103. Rank Brain: The only algorithm that uses AI on the live results.
  104. The presence of Rank Brain means Google is confused …
  106. Rank Brain: Why? Google does not use NLP (Natural Language Processing) in Search.
  107. Rank Brain: Uses structured data, entities, and known relationships. Person, Place, Thing = Noun = Entity. Nouns (persons, places, things) are what we call entities. Entities are known to Google, and their meaning is defined in the databases Google references.
  108. Rank Brain – Known Relationships.
       • Words go in.
       • Words get assigned a mathematical address in a vector space.
       • Similar and related words sit close to each other in the vector space.
       • Words are retrieved based on your query and the words it locates in the “best fit” vector.
       • These word “interpretations” are used to return results.
       • If the relationships are weak or unknown, enter Rank Brain.
       • Behind the scenes, data is continually fed into the machine learning process to make those results more relevant the next time.
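The flow in the bullets above can be sketched as a best-fit lookup with a fallback. This is a toy under heavy assumptions and makes no claim about how Rank Brain works internally: the two-dimensional vectors, the entity names, the averaging step, the `interpret` function and the 0.9 threshold are all invented for illustration.

```python
import math

# Invented 2-dimensional vectors; real systems learn these from text.
WORD_VECTORS = {
    "dog": (0.9, 0.1),
    "puppy": (0.8, 0.2),
    "leash": (0.7, 0.3),
    "engine": (0.1, 0.9),
    "tire": (0.2, 0.8),
}
# Hypothetical known entities, each with its own vector.
ENTITY_VECTORS = {
    "Dog": (0.9, 0.1),
    "Car": (0.1, 0.9),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def interpret(query, threshold=0.9):
    """Return (entity, similarity), or (None, similarity) for a weak match."""
    known = [WORD_VECTORS[w] for w in query.lower().split() if w in WORD_VECTORS]
    if not known:
        return None, 0.0
    # Average the word vectors to get one vector for the whole query.
    query_vec = tuple(sum(dim) / len(known) for dim in zip(*known))
    best = max(ENTITY_VECTORS, key=lambda e: cosine(query_vec, ENTITY_VECTORS[e]))
    score = cosine(query_vec, ENTITY_VECTORS[best])
    # A weak best fit is the case the slides hand over to Rank Brain.
    return (best if score >= threshold else None), score
```

A query such as "puppy leash" lands close to the Dog entity and resolves directly. A query such as "dog engine" sits halfway between the two entities, so no entity clears the threshold and the function returns `None`, which stands in for the "weak or unknown relationship" case.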
  109. Rank Brain also uses users' queries and clicks to help it understand query intent.
  110. Rank Brain: Should you optimize for it?
  111. Rank Brain: Why would you optimize to rank with AI?
  112. Rank Brain: Google does not even understand what Rank Brain is actually doing.
  113. Rank Brain: (Gary Illyes)
  114. Just write in natural and conversational language. Create holistic content.
  115. Write holistic content. Use terms that are semantically related. For a detailed explanation, Google explains here >
  116. Write holistic content. DOES YOUR CONTENT HAVE DEPTH AND BREADTH? For a detailed explanation, Google explains here >
  117. Takeaways.
       • Think Search Queries, NOT Simple Keywords
       • Write in natural, conversational language
       • Write holistic content
       • Focus on depth and breadth with related terms
       • Add Structured Data
  118. THINK in Query Terms & Context.
  121. In Search of NLP