
Some Current Challenges With Voice & Conversational Search By Dawn Anderson

From the SMX West Conference in San Jose, CA March 13-15, 2018. SESSION: The Latest In Advanced Technical SEO. PRESENTATION: Some Current Challenges With Voice & Conversational Search - Given by Dawn Anderson, @dawnieando - Manchester Metropolitan University, International SEO Consultant / Director / Lecturer, Move It Marketing. #SMX #32A

Published in: Marketing

  1. #SMX #32A @dawnieando SOME CURRENT CHALLENGES WITH VOICE & CONVERSATIONAL SEARCH …And how you can overcome some of them
  2. Who is Dawn Anderson? • From rainy Manchester, UK • A bit of a ‘pracademic’ (hybrid of academic and practitioner) • International SEO consultant • Move It Marketing • I lecture on search and digital marketing strategy • But I mostly ‘do’ SEO • 11 years in SEO now • Googlebot hunter ;P • Consulting with brands, in-house teams and start-ups • My Pomeranian Bert is often featured in tweets and posts ;P
  3. Interest over time on Alexa and Google Home
  4. Seasonal social media demonstrates mass engagement
  5. Eyes-free device sales are skyrocketing
  6. Search Engines are Getting Better At Voice Recognition & Question Answering
  7. 2017 was the year of “questions”
  8. Google Raters guidelines for voice search published
  9. What does a good result look like? SPOILER • Meets informational needs • In short answers (as applicable) • Or the answer is at the beginning of the paragraph or result • Grammatically correct (syntactically well-formed) • No spelling mistakes • With accurate pronunciation
  10. What does a bad result look like?
  11. Bad Result - Confusion between ‘actions’ & ‘queries’ • [Skip] • [play mumford and sons reminder] - Action Response: Set a Reminder, Time: Please specify a time. Fails to Meet: The user wanted to play a specific song, and the device instead set a reminder. No users would be satisfied with this response.
  12. Who knows how many times Google Home cannot help? • Only Google knows • But they aren’t sharing • Search engine embarrassment?
  13. RECOGNITION IS NOT NATURAL LANGUAGE UNDERSTANDING
  14. Information Retrieval Lectures • ESSIR2017, European Summer School on Information Retrieval
  15. Enrique Alfonseca – Google Research Team
  16. Better ranking needed because the user tends to focus on a single answer
  17. Better Ranking Is Needed As The User Focuses On A Single Result § One shot at the answer § Berrypicking ‘evolving search’ may not apply so easily § Does not benefit from query refinement and user feedback as desktop SERPs do – May be why there are still many unanswered queries
  18. Accurate spelling and grammar matter a lot in voice search • Query diversity ‘clusters’ in keyboard ‘evolving’ user search
  19. Query refinement (via user feedback) is not possible with voice search
  20. #SMXInsights § No query expansion or relaxation – Precision more important than recall – Because there can be only one (or 2)
  21. Precision > Recall in voice search • Accuracy > Diversity
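The precision versus recall point on slides 20 and 21 can be made concrete with a small calculation. This is only an illustrative sketch: the document IDs and the two result lists are hypothetical, not data from the talk.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a list of retrieved items."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical document IDs, for illustration only.
relevant = {"doc1", "doc2", "doc3", "doc4"}
desktop_results = ["doc1", "doc2", "doc7", "doc3", "doc9"]  # a visual list of links
voice_result = ["doc1"]                                      # one spoken answer

desktop = precision_recall(desktop_results, relevant)
voice = precision_recall(voice_result, relevant)
print("desktop:", desktop)
print("voice:  ", voice)
```

With a single spoken answer, recall is bound to be low, so the only thing left to optimise is whether that one answer is right.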
  22. A rambled answer at the end is the worst possible result
  23. “There is no re-ordering in voice search – no paraphrasing – just extraction and compression.” (Alfonseca, 2017, ESSIR2017)
  24. Example of classic IR teaching query interpretation system
  25. #SMXInsights § No paraphrasing with conversational search – Paraphrasing likely needs full understanding of query & intent to reformulate
  26. Knowledge base first, web text second • The knowledge base is checked first • Then the web is checked to ‘fill in gaps’ • Taking from the messy unstructured data of web pages
  27. The different types of data & the problem with unstructured data • Structured data (tables and data stored in databases) • Semi-structured data (XML, JSON, meta headings [h1-h6]) • Semantically-enriched data (marked up schema, entities) • Unstructured data (normal web text copy) • The web is messy and noisy • Unstructured data is difficult to make sense of (no topical strength)
  28. Structured data has never been more important for disambiguation
  29. Structured Data is very, very useful here • Adds meaning • Disambiguates • Adds structure • Helps with context • The web is noisy • Unstructured data is voluminous
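As one possible illustration of "marked up schema", the sketch below builds schema.org JSON-LD for a single question and answer. The choice of the `FAQPage`, `Question` and `Answer` types is an assumption made for this example (the slides do not name a specific schema type), and the helper `faq_jsonld` is hypothetical.

```python
import json


def faq_jsonld(question, answer):
    """Build a schema.org FAQPage structure holding one question/answer pair."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
        ],
    }


# Example content taken from the speaker bio on slide 2.
markup = faq_jsonld(
    "What is a pracademic?",
    "A pracademic is a hybrid of academic and practitioner.",
)

# JSON-LD is usually embedded in the page inside a script element.
script_tag = (
    '<script type="application/ld+json">'
    + json.dumps(markup, indent=2)
    + "</script>"
)
print(script_tag)
```

The same text stays in the visible copy; the markup only adds an unambiguous statement of which part is the question and which part is the answer.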
  30. #SMXInsights § Simply adding topical H1-H6 headings turns unstructured web data into semi-structured data
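A rough way to see why headings count as structure: with H1-H6 in place, a program can recover a topical outline of the page without understanding the body copy. The sketch below uses Python's standard `html.parser`; the sample page and the `HeadingOutline` class are made up for illustration and say nothing about how any search engine actually parses pages.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element in a page."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []   # list of (level, heading text)
        self._level = None  # level of the heading currently open, if any
        self._parts = []    # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            self.outline.append((self._level, "".join(self._parts).strip()))
            self._level = None


# Hypothetical page copy.
page = """
<h1>Voice search</h1><p>Intro copy.</p>
<h2>What is voice search?</h2><p>Short answer first.</p>
<h2>Why are tables a problem?</h2><p>They read badly aloud.</p>
"""

parser = HeadingOutline()
parser.feed(page)
outline = parser.outline
for level, text in outline:
    print("  " * (level - 1) + text)
```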
  31. #SMXInsights § Tables are problematic for voice search – Support tabular data with well-formed paragraphs and sentences
  32. Tables are currently problematic • What may be good for featured snippets (tabular data) may not be good for voice search • You may need additional strategy for voice search & tabular data in featured snippets • Pete Myers from Moz found only 30% voice search results on Google Home came from tables in featured snippets (Image credit: Pete Myers, Moz)
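One way to "support tabular data with well-formed sentences" is to generate a sentence per row alongside the table. The sketch below assumes a simple table with one subject column; the pricing data and the `rows_to_sentences` helper are hypothetical.

```python
def rows_to_sentences(subject_header, headers, rows):
    """Turn each table row into one plain sentence that can be read aloud."""
    sentences = []
    for row in rows:
        record = dict(zip(headers, row))
        subject = record.pop(subject_header)
        facts = [f"a {name.lower()} of {value}" for name, value in record.items()]
        if not facts:
            continue  # nothing to say about a row that only names the subject
        if len(facts) > 1:
            joined = ", ".join(facts[:-1]) + " and " + facts[-1]
        else:
            joined = facts[0]
        sentences.append(f"The {subject} {subject_header.lower()} has {joined}.")
    return sentences


# Hypothetical pricing table.
headers = ["Plan", "Monthly price", "Storage limit"]
rows = [["Basic", "$5", "10 GB"], ["Pro", "$15", "100 GB"]]

sentences = rows_to_sentences("Plan", headers, rows)
for sentence in sentences:
    print(sentence)
```

The table can stay on the page for featured snippets, while the generated sentences give a voice device something it can extract and read out.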
  33. Multi-turn conversations are still challenging CONFIRMED BY: • Google’s Enrique Alfonseca (2017) • Microsoft’s Harry Shum (2018) • Conversational contextual search is difficult
  34. Anaphoric Resolution • (“anaphoric” is referring upward to previously mentioned words) • Resolution means trying to understand what was referred to in those previously mentioned words
  35. Cataphoric Resolution • (“cataphoric” is referring downward to subsequent words) • Resolution means trying to understand what is referred to in those subsequent words
  36. Pronouns still seem problematic • Likely relates to anaphoric (likely) & cataphoric (far less likely) resolution
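To show what anaphoric resolution has to do in a multi-turn conversation, here is a deliberately naive sketch: it replaces a pronoun in the follow-up question with the most recently mentioned known entity from the previous turn. This is a toy heuristic written for illustration, not how Google or Microsoft resolve references; the entity list and the `resolve_follow_up` helper are hypothetical.

```python
import re

PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}


def resolve_follow_up(previous_turn, follow_up, known_entities):
    """Swap pronouns in follow_up for the entity mentioned last in previous_turn."""
    antecedent = None
    last_position = -1
    lowered = previous_turn.lower()
    for entity in known_entities:
        position = lowered.rfind(entity.lower())
        if position > last_position:
            antecedent, last_position = entity, position
    if antecedent is None:
        return follow_up  # nothing to refer back to, leave the query alone

    def swap(match):
        word = match.group(0)
        return antecedent if word.lower() in PRONOUNS else word

    return re.sub(r"[A-Za-z']+", swap, follow_up)


# Hypothetical two-turn conversation.
entities = ["the Eiffel Tower", "Paris"]
resolved = resolve_follow_up(
    "How tall is the Eiffel Tower?", "When was it built?", entities
)
print(resolved)
```

The heuristic breaks as soon as the previous turn mentions two candidate entities or the pronoun points forward, which is part of why pronouns remain difficult.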
  37. Our ‘Previous’ Work
  38. Pygmalion are carrying out Part of Speech (POS) & Named Entity Tagging (NE tags) manually • AKA – Word category disambiguation • Function words – POS (Syntax) • Content words – POS (relevant) • Verbs – POS • Nouns – POS • Pronouns – POS • Plural-pronouns – POS
  39. WORD DISAMBIGUATION
  40. Ambiguous queries need context – ‘House’
  41. Linguistics are complex • Homophora • Endophora • Exophora • Hyponyms • Hypernyms • Homonyms
  42. COREFERENCE RESOLUTION IS A CHALLENGING PROBLEM FOR DISAMBIGUATION
  43. THE IMPORTANCE OF CO-OCCURRENCE
  44. “You shall know a word by the company it keeps” (Firth)
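Firth's idea can be sketched with plain co-occurrence counts: represent each word by the words that appear near it, then compare those context vectors. The three-sentence corpus and the window size below are assumptions chosen to keep the example tiny; real systems use far larger corpora and learned embeddings.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical mini-corpus.
corpus = [
    "deposit money at the bank",
    "deposit cash at the bank",
    "the river bank was muddy",
]

vectors = cooccurrence_vectors(corpus)
sim_money_cash = cosine(vectors["money"], vectors["cash"])
sim_money_river = cosine(vectors["money"], vectors["river"])
print(round(sim_money_cash, 3), round(sim_money_river, 3))
```

Here "money" and "cash" keep identical company, so they score as near-synonyms, while "money" and "river" share only a stray function word.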
  45. Other ‘Previous’ Work – Similarity & Relatedness
  46. WordSimilarity353 Test Collection
  47. WordSimilarity353 Test Collection (word pair, human-judged score): money/cash 9.08; money/currency 9.04; football/soccer 9.03; magician/wizard 9.02; gem/jewel 8.96; car/automobile 8.94; boy/lad 8.83; furnace/stove 8.79; Maradona/football 8.62; king/queen 8.58; money/bank 8.5; Jerusalem/Israel 8.46; vodka/gin 8.46; planet/star 8.45; money/dollar 8.42; vodka/brandy 8.13; bank/money 8.12; physics/proton 8.12; planet/galaxy 8.11; stock/market 8.08; psychology/psychiatry 8.08; planet/moon 8.08; planet/constellation 8.06; planet/sun 8.02; tiger/feline 8; planet/astronomer 7.94; movie/theater 7.92; planet/space 7.92; baby/mother 7.85; wood/forest 7.73; money/deposit 7.73; psychology/mind 7.69; Jerusalem/Palestinian 7.65; Arafat/terror 7.65; computer/keyboard 7.62; computer/internet 7.58; money/property 7.57; tennis/racket 7.56; psychology/cognition 7.48; book/paper 7.46; book/library 7.46; media/radio 7.42; psychology/depression 7.42; jaguar/cat 7.42; movie/star 7.38; bird/crane 7.38; tiger/cat 7.35; physics/chemistry 7.35; money/possession 7.29; jaguar/car 7.27; cup/drink 7.25; psychology/health 7.23; bird/cock 7.1; company/stock 7.08; tiger/carnivore 7.08
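A test collection like this is typically used by checking whether a system ranks word pairs in the same order as the human judges, commonly with a rank correlation. The sketch below computes Spearman's rho for five pairs from the slide. The human scores are the ones listed above; the "model" scores are invented for illustration, and the simple formula used assumes there are no tied scores.

```python
def ranks(values):
    """Rank values from highest (rank 1) to lowest; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


pairs = ["money/cash", "football/soccer", "king/queen", "tiger/cat", "jaguar/car"]
human = [9.08, 9.03, 8.58, 7.35, 7.27]   # scores from the slide
model = [0.91, 0.88, 0.70, 0.75, 0.40]   # hypothetical system scores

rho = spearman(human, model)
print(f"Spearman rho over {len(pairs)} pairs: {rho:.2f}")
```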
  48. #SMXInsights § Secondary or 3-way strategy may be needed – Add a TL;DR – Or an executive summary – Or Q & A based table of contents – Or a ‘Short Answer’ then ‘Longer Answer’
  49. #SMXInsights § Mine forums, customer service, chat & emails – Build word clouds to provide answers to topics which matter to your audience
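The counting step behind a word cloud is simple term frequency over the mined messages. The sketch below assumes the messages have already been exported as plain strings; the sample messages and the short stopword list are hypothetical.

```python
import re
from collections import Counter

# A deliberately short stopword list for this example only.
STOPWORDS = {
    "a", "an", "the", "is", "my", "do", "i", "how", "to", "on",
    "can", "it", "what", "why", "does", "not",
}


def term_counts(messages, top_n=5):
    """Count non-stopword terms across messages and return the most common."""
    counts = Counter()
    for message in messages:
        words = re.findall(r"[a-z']+", message.lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts.most_common(top_n)


# Hypothetical customer service messages.
messages = [
    "How do I reset my password?",
    "Password reset email not arriving",
    "Can I change my delivery address?",
]

top_terms = term_counts(messages, top_n=2)
print(top_terms)
```

The most frequent terms point at the questions worth answering in short, well-formed sentences on the site.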
  50. Accurate spelling and grammar matter a lot in voice search • Soundex, Metaphone or similar ‘misspelling’ algorithms may not apply to voice search
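For readers unfamiliar with Soundex, the sketch below is a compact version of the classic American Soundex coding, which maps similar-sounding spellings to the same four-character code. It is included only to show what kind of typed-misspelling matching the slide is referring to; it is not a claim about what any search engine runs.

```python
CODES = {}
for digit, letters in (
    ("1", "bfpv"), ("2", "cgjkqsxz"), ("3", "dt"),
    ("4", "l"), ("5", "mn"), ("6", "r"),
):
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the four-character American Soundex code for a word."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != previous:
            encoded += code
        # h and w do not separate letters that share a code; vowels do.
        if ch not in "hw":
            previous = code
    return (encoded + "000")[:4]


for name in ("Robert", "Rupert", "Smith", "Smyth"):
    print(name, soundex(name))
```

These algorithms repair spellings of typed queries. A spoken query arrives as audio and is transcribed by the recogniser, so there is no user misspelling for them to correct, which fits the slide's suggestion that they may not apply.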
  51. THANK YOU! SEE YOU AT THE NEXT #SMX • LEARN MORE: UPCOMING @SMX EVENTS
  52. Sources & References • WordSimilarity353 Test Collection - http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/ • Miller, G.A. and Charles, W.G., 1991. Contextual correlates of semantic similarity. Language and cognitive processes, 6(1), pp.1-28. • LinkedIn. Harry Shum. 2018. From Search to Research. [ONLINE] Available at: https://www.linkedin.com/pulse/from-search-research-harry-shum/. [Accessed 22 February 2018]. • Coreference Resolution - The Stanford Natural Language Processing Group. 2018. The Stanford Natural Language Processing Group. [ONLINE] Available at: https://nlp.stanford.edu/projects/coref.shtml. [Accessed 19 February 2018].
  53. APPENDIX
  54. Terms can have many ‘surface forms’ • EXAMPLES • Look at Wikipedia Redirects • Alternative names redirect to the most appropriate article title (for example, Edison Arantes do Nascimento redirects to Pelé) (Wikipedia) • SPARQL and DBpedia identify many variations • (Beethoven example) • https://dbpedia.org/sparql • https://en.wikipedia.org/wiki/Wikipedia:Redirect
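The Beethoven example can be sketched as a SPARQL query against the DBpedia endpoint listed on the slide. The query below is an assumption about how such a lookup might be written: it relies on DBpedia's `dbo:wikiPageRedirects` property, and the `format` request parameter should be checked against the endpoint's own documentation. The code only builds the request URL; it does not send it.

```python
from urllib.parse import urlencode

ENDPOINT = "https://dbpedia.org/sparql"  # endpoint listed on the slide


def redirect_query(resource_name):
    """Build a SPARQL query listing pages that redirect to a DBpedia resource."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "PREFIX dbr: <http://dbpedia.org/resource/>\n"
        "SELECT ?surfaceForm WHERE {\n"
        f"  ?surfaceForm dbo:wikiPageRedirects dbr:{resource_name} .\n"
        "}\n"
        "LIMIT 50"
    )


def request_url(resource_name):
    """Return a GET URL for the query (parameter names are assumptions)."""
    params = {
        "query": redirect_query(resource_name),
        "format": "application/sparql-results+json",
    }
    return ENDPOINT + "?" + urlencode(params)


url = request_url("Ludwig_van_Beethoven")
print(redirect_query("Ludwig_van_Beethoven"))
print(url)
```

Each result row would be an alternative title that Wikipedia redirects to the main article, that is, one more surface form of the same entity.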
  55. “It is concluded…the more often two words can be substituted into the same contexts the more similar in meaning they are judged to be.” (Miller & Charles, 1991)
  56. Difficult to deal with ‘query ambiguity’ • Result ‘diversity’ assists with query ambiguity in desktop or non-voice results
  57. Page Length ‘Normalization’ may not apply as with traditional results?? (Me musing)
  58. Long numbers should be rounded § 60,999,888.999999999 – It reads terribly – Needs to be rounded
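A minimal sketch of rounding a long number into something that reads well aloud, using the figure from the slide. The "about N million" wording and the one-decimal precision are assumptions chosen for this example.

```python
def speakable_number(value):
    """Round a number to a short phrase that is easy to read aloud."""
    scales = (
        (1_000_000_000, "billion"),
        (1_000_000, "million"),
        (1_000, "thousand"),
    )
    for threshold, name in scales:
        if abs(value) >= threshold:
            scaled = round(value / threshold, 1)
            if scaled == int(scaled):
                return f"about {int(scaled)} {name}"
            return f"about {scaled} {name}"
    return f"about {round(value)}"


spoken = speakable_number(60_999_888.999999999)
print(spoken)
```

The precise figure can stay in a table or footnote for readers, while the rounded phrase sits in the sentence a voice device is likely to extract.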
  59. Conversational Context & Microsoft • First checks whether the next ‘turn’ of question relates to the previous question • Using LSTMs (Long Short-Term Memory) • Bi-directional context embedding • Query and its context are both used as input
  60. Katja Filippova – Google Research Team
  61. Query expansion and query relaxation
  62. Example of cataphoric and anaphoric resolution • https://www.ntid.rit.edu/sea/processes/referencewords/practice/phoric
