2019 Tech SEO Boost Dawn Anderson Contextual Recommender Search


Talk from Tech SEO Boost 2019 by Dawn Anderson on the move to the just in time predictive personalised search experience for search engines and users. Exploring recommender systems, collaborative filtering, temporal and location based queries and the rise of predictive, personal dynamic search. Exploring the work of information retrieval researchers and Google Discover.

Published in: Marketing

  1. 1. Dawn Anderson | @dawnieando | #TechSEOBoost The rise of predictive, proactive search TechSEO Boost 2019 The User is The Query
  2. 2. “Today you are you! That is truer than true! There is no one alive who is you-er than you!” (Dr Seuss)
  3. 3. Said Dr Seuss… and Google
  4. 4. When introducing Google Feed (now Discover)
  5. 5. Today’s Topic: The User is the Query
  6. 6. Dawn Anderson | @dawnieando | #TechSEOBoost
  7. 7. Also… Meet Bert and Ted
  8. 8. There’s a problem with queries, content & users too
  9. 9. “In 1998 the web consisted of just 25 million pages…” (Ben Gomes, Google, 2018)
  10. 10. “… That’s roughly the equivalent number of those in a small library” (Ben Gomes, Google, 2018)
  11. 11. In 2019… we know the web is huge… billions of web pages (Netcraft, 2019)
  12. 12. App usage is huge too: by 2018 the App Store had 20 million registered developers (TechCrunch, 2018)
  13. 13. 42% of the global population use social media (Emarsys, 2019)
  14. 14. We are competing with programmatic solutions spraying content & information EVERYWHERE
  15. 15. Over-choice: Too much choice often has negative impacts
  16. 16. Almost 98% of visits are people window shopping Average ecommerce conversion +/- 2% (Monetate)
  17. 17. Despite this… users are still seeking even more information
  18. 18. The number of Google searches increases year on year (Internet Live Stats, 2018, curated from various sources)
  19. 19. 15% of queries every day are new (Google)
  20. 20. Humans forage (like bears) all over the place seeking information… we are informavores
  21. 21. Researching ALL THE THINGS… before making final decisions
  22. 22. We have become very good at filtering out things which are NOT interesting enough (8 second filter)
  23. 23. It’s NOT a short attention span thing
  24. 24. Otherwise we would not binge on ‘Stranger Things’
  25. 25. This is cognitive load management & information filtering
  26. 26. AT THE SAME TIME words are problematic. Ambiguous… polysemous… synonymous
  27. 27. Often words have multiple meanings. Like “like” can be 5 possible parts of speech (POS)
  28. 28. Spoken word can be worse. Like “four candles” and “fork handles”
  29. 29. Which does not bode well for the likes of conversational search
  30. 30. In query understanding sometimes users don’t know what they want either
  31. 31. Sometimes exactly the same users express an information need in a different way
  32. 32. Sometimes different users use lots of different ways to mean exactly the same thing
  33. 33. ‘The Vocabulary Problem’ Furnas, G.W., Landauer, T.K., Gomez, L.M. and Dumais, S.T., 1987. The vocabulary problem in human-system communication. Communications of the ACM, 30(11), pp.964-971.
  34. 34. One of the inventors of ‘Latent Semantic Indexing’, created to solve ‘The Vocabulary Problem’ whilst researching at Bellcore (1990)
  35. 35. BTW… No-one said LSI was used by Google (aside)
  36. 36. Sometimes the searcher query is a ‘cold start’ query
  37. 37. Broad or cold start queries might call for result diversification due to lack of intent detection
  38. 38. Search engines may return a broad blend of results to match these queries Freshness Serendipity Novelty Diversity
  39. 39. AKA Result Diversification
  40. 40. The searcher has to click around to provide feedback on their intent or reformulate the query by entering something else (‘query refinement’)
  41. 41. To then deliver sequential queries with greater intent understanding
  42. 42. Human in the loop
  43. 43. Query refinement says… “Your move next”
  44. 44. A kind of ‘probability-driven fork in the road’ (Sadikov et al, 2010, ‘Clustering Query Refinements by User Intent’)
  45. 45. BUT a word’s meaning and the user’s intent / context combined are still very hard for search engines to understand
  46. 46. Despite assistance from Google’s BERT & progress in NLP
  47. 47. GLUE Benchmark Leaderboard
  48. 48. SuperGLUE Benchmarks
  49. 49. Stanford Question Answering Dataset (SQuAD) 2.0 • Rajpurkar, P., Zhang, J., Lopyrev, K. and Liang, P., 2016. SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250.
  50. 50. MS MARCO Paper • Nguyen, T., Rosenberg, M., Song, X., Gao, J., Tiwary, S., Majumder, R. and Deng, L., 2016. MS MARCO: A Human-Generated MAchine Reading COmprehension Dataset.
  51. 51. The exact same queries have different intent at different times & different locations
  52. 52. What did you really mean when you searched for ‘Easter’? When you searched, and what you mostly meant: a few weeks before Easter, ‘When is Easter?’; a few days before Easter, ‘Things to do at Easter’; during Easter, ‘What is the meaning of Easter?’ • Radinsky, K., Svore, K.M., Dumais, S.T., Shokouhi, M., Teevan, J., Bocharov, A. and Horvitz, E., 2013. Behavioral dynamics on the web: Learning, modeling, and prediction. ACM Transactions on Information Systems (TOIS), 31(3), p.16.
  53. 53. Modeling & Predicting Behavioural Dynamics on The Web (Radinsky et al, 2012)
  54. 54. “When users’ information needs change over time, the ranking of results should also change to accommodate these needs.” (Radinsky, 2013)
  55. 55. This is ‘Query Intent Shift’
  56. 56. The intent of queries changes over time
  57. 57. The passage of time adds new meaning to queries sometimes too
  58. 58. The rise and fall of the Blackberry?
  59. 59. ‘iPhone’ – Query Example (Google Quality Raters Guidelines)
  60. 60. Temporal Dynamic Intent (Burstiness) is a huge factor for intent
  61. 61. At certain times far more intents will be transactional
  62. 62. “dresses”, “shoes”, “bags” really means “buy dresses”, “buy shoes”, “buy bags”, “dress sales”, “shoe sales”
  63. 63. And sometimes only reasons a particular audience would understand spike temporal queries
  64. 64. [Four candles] + [fork handles] interest over time
  65. 65. Sometimes it is other events which trigger unexpected queries
  66. 66. Your ranking flux might well be shifting query intents at scale
  67. 67. What a nightmare queries are
  68. 68. Maybe It’s Time For A Change?
  69. 69. Enter… The Next 20 Years of Search
  70. 70. Hmm… That sounds big Google… This is HUGE
  71. 71. Three FUNDAMENTAL shifts in search
  72. 72. Fundamental: “forming a necessary base or core; of central importance.”
  73. 73. Three Fundamental Shifts • The shift from answers to journeys • The shift from queries to queryless • The shift from text to visual information
  74. 74. The shift from text to more visual information
  75. 75. This feels like a huge UX / accessibility shift… Hoorah
  76. 76. Images are much easier to mentally consume than text & audio
  77. 77. Images & video engage… Images & video entertain Images & video provoke emotion
  78. 78. Photography app usage had a 210% increase between 2016 and 2018 according to App Annie
  79. 79. People spend on average 2.6x more time on pages with video
  80. 80. Image search is curation. Totally different to text-based search
  81. 81. This is cognitive load management & information filtering
  82. 82. Go nuts with quality images & video
  83. 83. The shift from queries to queryless
  84. 84. “Queries Are Difficult To Understand in Isolation” (Susan Dumais, Microsoft Research, 2016)
  85. 85. “Easier if we can model: who is asking, what they have done in the past, where they are, when it is, etc.” (Susan Dumais, CIKM, 2016)
  86. 86. Better still… what about predicting the user’s informational needs to proactively make suggestions?
  87. 87. QueryLess: Next Gen Proactive Search And Recommender Engines (2016)
  88. 88. “Nevertheless, as the world is becoming more mobile-centric, this old-fashioned query-driven search scenario and click-based evaluation mechanism can no longer catch up with the rapid evolution of user demand on mobile devices.” (Song and Guo, 2016, Microsoft Research)
  89. 89. “Therefore, a more user-friendly, mobile-centric and scenario driven search paradigm that requires minimal user inputs is ready to come out” (Song and Guo, 2016, Microsoft Research)
  90. 90. It kind of sounds like Google Discover
  91. 91. At last announcement Google Discover had 800 million users (May, 2018)
  92. 92. It’s now on mobile home page. It knows you… and the things you do… where you’ve been… where you’re going
  93. 93. “In many cases predicting informational needs removes the need for the query & reactive search engine” (Song & Guo, 2016)
  94. 94. Zero-Query Queries – No Query Required
  95. 95. Google’s Recommender Systems
  96. 96. QueryLess: Next Gen Proactive Search And Recommender Engines
  97. 97. Google Scholar is now a Recommender System Too
  98. 98. YouTube is a Recommender System
  99. 99. YouTube Feedback Controls are ‘The Human in The Loop’
  100. 100. Reinforcement learning thrives from rewards (implicit feedback)
  101. 101. Contextual Bandit Algorithms
  102. 102. The User (needs) is ‘The Query’
  103. 103. The shift from answers to journeys
  104. 104. An information need is rarely a task with a single finite item
  105. 105. It’s more like a series of little chunks (sub-tasks)
  106. 106. People are creatures of habit it seems
  107. 107. “Patterns were spotted about repetitive task driven search behaviours – predictable” (Song & Guo, 2016)
  108. 108. Tasks & timelines go hand in hand… it seems
  109. 109. “Predictable task timeline patterns are more prevalent on mobile devices” (Song & Guo, 2016)
  110. 110. Like e.g. ‘checking the stock market’ every morning if you’re interested in stocks and shares
  111. 111. Mobile Device Sensors (14 sensors or more): proximity sensors, GPS sensor, ambient light sensor, accelerometer, compass, gyroscope, back-illuminated sensor
  112. 112. Many tasks & intents can be modelled according to predicted patterns
  113. 113. Personalising Search via Interests & Activities: 2005 paper awarded the 2017 SIGIR Test of Time Award. Cited 1029 times to date. Teevan, J., Dumais, S.T. and Horvitz, E., 2005, August. Personalizing search via automated analysis of interests and activities. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 449-456). ACM.
  114. 114. Google Discover looks to be focusing on hobbies, interests, news and social activities
  115. 115. Very Recent Microsoft Research
  116. 116. The Ideal is Personalisation • Not easy to achieve fully • Sparsity of data • Privacy concerns • Broken sequences
  117. 117. In the absence of personalization… collaborative Filtering
  118. 118. There are other people nearly like you
  119. 119. You (and me) are unique… but may be similar
  120. 120. Matrix Factorisation (Netflix Recommendation System) + Matrix Factorisation (WALS Algorithm, TensorFlow)
  121. 121. TensorFlow Matrix Factorisation
  122. 122. Based on users liking the same things (with hidden common preferences)
  123. 123. Those sharing similar interests likely share other hidden interests too (i.e. the system does not know of them yet)
  124. 124. Google Discover ‘Topics’
  125. 125. Modelling cohorts
  126. 126. Understand the user, understand their cohort… Understand other similar informational needs
  127. 127. Progressive personalisation
  128. 128. The two sides of assistant will both be proactive: provide answers / search (Conversation Search); help with activities / tasks (Conversation Actions)
  129. 129. Extend Actions on Google using Machine Learning
  130. 130. Understand your customers to assist with AI. [Diagram: a perceived information need broken down into tasks, each made up of several micro-tasks] We can identify the user’s probable top tasks & subtasks. Identify their needs & what info they need along the way
  131. 131. Tell us about the tasks, order and steps involved in booking a hotel
  132. 132. Many built-in intents & many ‘coming soon’
  133. 133. Connecting Tasks Across Devices & Applications
  134. 134. Multi-platforming • Switching between search and video • Between search and a recommender system
  135. 135. Connections Between Things
  136. 136. Building a Personal Knowledge Graph
  137. 137. A Recent Microsoft Personal Knowledge Graph Patent
  138. 138. This is ‘Task-driven’ Search & Recommender Systems
  139. 139. Where the user is truly ‘the query’
  140. 140. Toward a Personal Knowledge Graph
  141. 141. Truly PERSONAL AI is not possible without a PERSONAL KNOWLEDGE GRAPH (Krisztian Balog, ECIR 2019)
  142. 142. But where will users be reached?
  143. 143. By 2022 PCs will account for only 19 percent of IP traffic (Comscore, 2019)
  144. 144. Interest over time for Google Home & Amazon Alexa
  145. 145. Assistant + Home + Discover + Search App + Desktop + Location Tracker + Calendar + Gmail + YouTube
  146. 146. In your car
  147. 147. In your console
  148. 148. Carriers for Recommender Systems
  149. 149. Toward An Audience of One
  150. 150. What Can SEOs Do About This?
  151. 151. Realise… your ranking tools are mostly wrong
  152. 152. Think CRM for SEO
  153. 153. Identify interests & affinity groups
  154. 154. Map every single informational need sub-task you can think of to the sections of a model like the RACE model
  155. 155. Build task timeline clusters
  156. 156. Map & cluster ‘Related’ content by task & temporal type. Categories are too broad, and topics may be too
  157. 157. Continually improve and update solid URL seasonal & temporal content
  158. 158. Contextual Order Matters
  159. 159. Continually improve and update solid URL evergreen content
  160. 160. Map content clearly to tasks and task timelines
  161. 161. Identify predictable patterns of user behavior
  162. 162. Understand the shared preferences, learn the hidden preferences
  163. 163. • Go big on evergreen content & keep updated • Optimise images well – think curation / collections • Map user journeys to content plans • Optimise video well – enhance with markup / transcription • Get personal – keep refining segments / personas • Identify & cluster content around task timelines • Use relatedness across content, tasks & temporality
  164. 164. Bias and reproducibility are challenges
  165. 165. Reproducibility problems in research & RecSys (very high)
  166. 166. Bias on the web and recommender systems
  167. 167. Bias Considerations: presentation bias; programming bias; audience-manipulated bias (e.g. fake reviews); machine learning / AI bias (black box algorithms); Matthew’s Law; Zipfian distribution of web content
  168. 168. NoBIAS Project
  169. 169. Spotify add novelty items to home page to avoid biased personalisation
  170. 170. Do yourself a favour and follow Mounia Lalmas @mounialalmas
  171. 171. And this polar bear
  172. 172. The QueryLess change will not come overnight … things move slowly
  173. 173. References
  174. 174. • Broder, A., 2002, September. A taxonomy of web search. In ACM SIGIR Forum (Vol. 36, No. 2, pp. 3-10). ACM. • Chuklin, A., Severyn, A., Trippas, J., Alfonseca, E., Silen, H. and Spina, D., 2018. Prosody Modifications for Question-Answering in Voice-Only Settings. arXiv preprint arXiv:1806.03957. • HigherVisibility. 2018. How Popular is Voice Search? | HigherVisibility. [ONLINE] Available at: • Filippova, K., Alfonseca, E., Colmenares, C.A., Kaiser, L. and Vinyals, O., 2015. Sentence compression by deletion with LSTMs. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp. 360-368). • Filippova, K. and Alfonseca, E., 2015. Fast k-best sentence compression. arXiv preprint arXiv:1510.08418. • Google Developers. 2018. Content-based Actions | Actions on Google | Google Developers. [ONLINE] Available at: actions/. [Accessed 18 June 2018]
  175. 175. References • Radinsky, K., Svore, K.M., Dumais, S.T., Shokouhi, M., Teevan, J., Bocharov, A. and Horvitz, E., 2013. Behavioral dynamics on the web: Learning, modeling, and prediction. ACM Transactions on Information Systems (TOIS), 31(3), p.16. • Sadikov, E., Madhavan, J. and Halevy, A., Google LLC, 2013. Clustering query refinements by inferred user intent. U.S. Patent 8,423,538. • Official Google Webmaster Central Blog. 2019. Official Google Webmaster Central Blog: Rolling out mobile-first indexing. [ONLINE] Available at: mobile-first-indexing.html. [Accessed 25 September 2019]. • Zhou, S., Cheng, K. and Men, L., 2017, April. The survey of large-scale query classification. In AIP Conference Proceedings (Vol. 1834, No. 1, p. 040045). AIP Publishing.
  176. 176. References • Search Engine Land. 2019. Starting July 1, all new sites will be indexed using Google's mobile-first indexing - Search Engine Land. [ONLINE] Available at: indexed-using-googles-mobile-first-indexing-317490. [Accessed 25 September 2019]. • Teevan, J., Dumais, S.T. and Horvitz, E., 2005, August. Personalizing search via automated analysis of interests and activities. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 449-456). ACM. • Nguyen, T., Rosenberg, M., Song, X., Gao, J., Tiwary, S., Majumder, R. and Deng, L., 2016. MS MARCO: A Human-Generated MAchine Reading COmprehension Dataset.
  177. 177. Keep in touch @dawnieando
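
Illustrative sketch (not from the talk): slides 117 to 123 describe collaborative filtering, where users who like the same things are assumed to share other, not yet observed interests. The slides name matrix factorisation (Netflix, WALS); the example below swaps in a simpler neighbourhood-based approach with cosine similarity, purely to show the idea. The user names, topics, scores and the `cosine` and `recommend` helpers are all hypothetical.

```python
from math import sqrt

# Hypothetical data: user -> {topic: interest score}. Invented for illustration only.
interests = {
    "ana": {"seo": 5, "nlp": 4, "cycling": 1},
    "ben": {"seo": 5, "nlp": 5, "recsys": 4},
    "cara": {"cycling": 5, "baking": 4},
}


def cosine(a, b):
    """Cosine similarity between two sparse interest profiles."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, data, top_n=3):
    """Suggest topics the user has not shown interest in yet,
    weighted by how similar each other user is to them."""
    own = data[user]
    scores = {}
    for other, prefs in data.items():
        if other == user:
            continue
        sim = cosine(own, prefs)
        if sim <= 0:
            continue  # nothing in common, so no evidence to borrow
        for topic, score in prefs.items():
            if topic not in own:
                scores[topic] = scores.get(topic, 0.0) + sim * score
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


print(recommend("ana", interests))
```

In this toy data "ana" overlaps most with "ben", so "recsys" is suggested ahead of "baking", which comes from the much weaker overlap with "cara". A production recommender would more likely learn latent factors, as the slides suggest, rather than compare raw profiles.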