2019 Tech SEO Boost Dawn Anderson Contextual Recommender Search

Dawn Anderson | @dawnieando | #TechSEOBoost
The rise of predictive,
proactive search
TechSEO Boost 2019
The User is The Query

“Today you are you!
That is truer than
true! There is no one
alive who is you-er
than you!” (Dr Seuss)

Said Dr Seuss…
and Google

When introducing Google Feed (now Discover)

Today’s
Topic:
The User is
the Query

Also… Meet Bert and Ted

There’s a problem
with queries, content
& users too

“In 1998 the web consisted of just 25
million pages…” (Ben Gomes, Google,
2018)

“… That’s roughly
the equivalent
number of those
in a small library”
(Ben Gomes,
Google, 2018)

In 2019… we
know the web
is huge…
billions of
web pages
(Netcraft,
2019)

App usage is huge too. By 2018 the App Store had 20 million registered
developers (TechCrunch, 2018)

42% of
the global
population
use social
media
(Emarsys,
2019)

We are competing with
programmatic solutions
spraying content &
information EVERYWHERE

Over-choice: Too
much choice
often has
negative
impacts

Almost 98% of visits are
people window shopping.
Average ecommerce
conversion is +/- 2% (Monetate)

Despite this…
users are still
seeking even
more
information

The number of
Google searches
increases year on
year
(Internet Live Stats,
2018, curation from
various sources)

15% of queries every day are new (Google)

Humans forage (like bears) all over the place
seeking information… we are informavores

Researching ALL THE THINGS… before making final decisions

We have
become very
good at
filtering out
things which
are NOT
interesting
enough (8
second filter)

It’s NOT a
short attention
span thing

Otherwise we would not binge on ‘Stranger Things’

This is cognitive load
management &
information filtering

AT THE SAME
TIME words are
problematic.
Ambiguous…
polysemous…
synonymous

Often words
have multiple
meanings.
Like “like” can
be 5 possible
parts of speech
(POS)

Spoken word
can be worse.
Like “four
candles” and
“fork handles”

Which does not bode well for the likes of
conversational search

In query
understanding
sometimes users
don’t know what
they want either

Sometimes exactly the same users express an
information need in a different way

Sometimes
different users
use lots of
different ways
to mean
exactly the
same thing

‘The Vocabulary Problem’
Furnas, G.W., Landauer, T.K., Gomez, L.M. and Dumais, S.T., 1987.
The vocabulary problem in human-system communication.
Communications of the ACM, 30(11), pp.964-971.

One of the
inventors of ‘Latent
Semantic Indexing’,
created to solve
‘The Vocabulary
Problem’ whilst
researching at
Bellcore (1990)

BTW… No-one said LSI was
used by Google (aside)

Sometimes the
searcher query is a
‘cold start’ query

Broad or cold
start queries
might call for
result
diversification
due to lack of
intent
detection

Search engines may return a broad
blend of results to match these queries
Freshness
Serendipity
Novelty
Diversity
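
One way to picture result diversification for a broad or cold start query is a greedy re-ranker that trades relevance against redundancy. The sketch below uses maximal marginal relevance (MMR), a well-known diversification technique chosen here for illustration; the slides do not say which method any search engine uses. The query "jaguar", the result names, the relevance scores and the intent labels are all hypothetical.

```python
def mmr_rerank(candidates, relevance, similarity, k=3, trade_off=0.5):
    """Greedy maximal marginal relevance: balance relevance against redundancy."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(doc):
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max((similarity(doc, s) for s in selected), default=0.0)
            return trade_off * relevance[doc] - (1 - trade_off) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical results for the ambiguous query "jaguar" (assumed scores).
relevance = {
    "car_review": 0.9,
    "car_dealer": 0.85,
    "animal_facts": 0.6,
    "football_team": 0.5,
}
# Hypothetical intent label per result, used as a crude similarity signal.
intent = {
    "car_review": "car",
    "car_dealer": "car",
    "animal_facts": "animal",
    "football_team": "sport",
}


def same_intent(a, b):
    """Treat two results as fully similar when they serve the same intent."""
    return 1.0 if intent[a] == intent[b] else 0.0


diverse_top3 = mmr_rerank(list(relevance), relevance, same_intent)
print(diverse_top3)
```

With these assumed numbers the second car result is pushed out of the top three in favour of one result per intent, which is the "broad blend" idea in miniature.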

The searcher has to click around
to provide feedback on their
intent or reformulate the query
by entering something else
(‘query refinement’)

To then deliver
sequential
queries with
greater intent
understanding

Query
refinement
says… “Your
move next”

A kind of ‘probability-driven fork in the road’
http://delivery.acm.org/10.1145/1780000/1772776/p841-sadikov.pdf
(Sadikov et al., 2010) Clustering Query Refinements by User Intent
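
The ‘fork in the road’ can be sketched as simple transition probabilities estimated from query sessions: given a first query, how often does each refinement follow? This is only a counting illustration under assumed data, not the clustering method of Sadikov et al.; the session logs and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def refinement_probabilities(sessions):
    """Estimate P(next query | query) from ordered query sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair each query with the query that immediately followed it.
        for query, next_query in zip(session, session[1:]):
            counts[query][next_query] += 1
    return {
        query: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for query, followers in counts.items()
    }


# Hypothetical session logs: each list is one user's sequence of queries.
sessions = [
    ["jaguar", "jaguar car price"],
    ["jaguar", "jaguar animal habitat"],
    ["jaguar", "jaguar car price"],
    ["jaguar", "jaguar car price", "jaguar dealers near me"],
]

probs = refinement_probabilities(sessions)
print(probs["jaguar"])
```

In this toy log, three of the four refinements of "jaguar" lean towards the car, so the car branch of the fork carries the higher probability.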

BUT a word’s meaning & user
intent / context combined are still
very hard for search engines
to understand

Despite assistance
from Google’s BERT
& progress in NLP

Stanford Question Answering Dataset (SQuAD) 2.0
• Rajpurkar, P., Zhang, J., Lopyrev, K. and Liang, P., 2016. Squad:
100,000+ questions for machine comprehension of text. arXiv
preprint arXiv:1606.05250.

MS
MARCO
Paper
• Nguyen, T., Rosenberg, M., Song, X., Gao, J., Tiwary, S.,
Majumder, R. and Deng, L., 2016. MS MARCO: A Human-Generated
MAchine Reading COmprehension Dataset.

The exact same
queries have
different intent at
different times &
different locations

What did you really mean when you searched for ‘Easter’?
When did you search for ‘Easter’? : What you mostly meant
• A few weeks before Easter: When is Easter?
• A few days before Easter: Things to do at Easter
• During Easter: What is the meaning of Easter?
• Radinsky, K., Svore, K.M., Dumais, S.T., Shokouhi, M., Teevan, J.,
Bocharov, A. and Horvitz, E., 2013. Behavioral dynamics on the web:
Learning, modeling, and prediction. ACM Transactions on
Information Systems (TOIS), 31(3), p.16.
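
The Easter example can be written as a tiny rule: the same query maps to a different likely intent depending on how far away the event is. This is a minimal sketch only; the day thresholds (7 days, 1 day after) and the function name are assumptions made for illustration, not figures from Radinsky et al.

```python
from datetime import date


def likely_easter_intent(search_date, easter_date):
    """Guess the dominant intent behind the query 'Easter' from its timing."""
    days_until = (easter_date - search_date).days
    if days_until >= 7:
        # Assumed cut-off for "a few weeks before".
        return "When is Easter?"
    if days_until >= 1:
        # Assumed cut-off for "a few days before".
        return "Things to do at Easter"
    if days_until >= -1:
        # Easter Sunday and the day after are treated as "during Easter".
        return "What is the meaning of Easter?"
    return None  # The slide gives no dominant intent after Easter.


easter_2019 = date(2019, 4, 21)
for day in (date(2019, 4, 1), date(2019, 4, 18), date(2019, 4, 21)):
    print(day, "->", likely_easter_intent(day, easter_2019))
```

A real system would learn these shifts from behaviour over time rather than hard-code them; the point is only that the date is part of the query.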

Modeling & Predicting Behavioural Dynamics on
The Web (Radinsky et al, 2012)

“When users’
information needs
change over time, the
ranking of results
should also change to
accommodate these
needs.” (Radinsky,
2013)

This is
‘Query
Intent Shift’

The intent of queries changes over time

The passage of time adds new
meaning to queries sometimes too

The rise and fall of
the Blackberry?

‘iPhone’ –
Query
Example
(Google
Quality
Raters
Guidelines)

Temporal Dynamic Intent (Burstiness) is a huge factor for intent

At certain times far more intents will be
transactional

“dresses”, “shoes”, “bags”
Really means
“buy dresses”, “buy shoes”, “buy bags”,
“dress sales”, “shoe sales”

And sometimes temporal queries
spike for reasons only a
particular audience would
understand

[Four candles] + [fork handles]
interest over time

Sometimes it is other events which trigger
unexpected queries

Your ranking flux might well be shifting query intents at
scale

What a nightmare
queries are

Maybe It’s Time For A Change?

Enter… The Next 20 Years of Search

Hmm… That sounds
big Google… This is
HUGE

Three
FUNDAMENTAL
shifts in search

Fundamental:
“forming a
necessary base
or core; of central
importance.”

Three Fundamental Shifts
• The shift from answers to journeys
• The shift from queries to queryless
• The shift from text to visual information

The shift from text to
more visual information

This feels
like a huge
UX /
accessibility
shift…
Hoorah

Images are much easier to mentally consume than text & audio

Images & video
engage… Images
& video entertain
Images & video
provoke emotion

Photography app
usage had a 210%
increase between
2016 and 2018
according to App
Annie

People spend on
average 2.6x
more time on
pages with video

Image search is curation. Totally different to text-based
search

Go nuts with quality images & video

The shift from queries
to queryless

“Queries Are Difficult To
Understand in Isolation”
(Susan Dumais, Microsoft
Research, 2016)

“Easier if we can model: who
is asking, what they have
done in the past, where they
are, when it is, etc.” (Susan
Dumais, CIKM, 2016)

Better still… what about
predicting the user’s
informational needs to
proactively make suggestions?

QueryLess: Next Gen Proactive Search And Recommender
Engines (2016)

“Nevertheless, as the world is
becoming more mobile-centric,
this old-fashioned query-driven
search scenario and click-based
evaluation mechanism can no
longer catch up with the rapid
evolution of user demand on
mobile devices.” (Song and
Guo, 2016 (Microsoft Research))

“Therefore, a more user-friendly,
mobile-centric and
scenario driven search
paradigm that requires
minimal user inputs is ready
to come out” (Song and
Guo, 2016 (Microsoft
Research))

It kind of
sounds like
Google
Discover

At last
announcement
Google Discover
had 800 million
users (May,
2018)

It’s now on the mobile home
page. It knows you… and
the things you do… where
you’ve been… where you’re
going

“In many cases
predicting
informational needs
removes the need for
the query & reactive
search engine” (Song
& Guo, 2016)

Zero-Query
Queries – No
Query Required

Google’s Recommender Systems

QueryLess: Next Gen Proactive Search And Recommender Engines

Google Scholar is now a
Recommender System Too

YouTube is a Recommender System

YouTube Feedback Controls are ‘The Human in The
Loop’

Reinforcement
learning thrives on
rewards (implicit
feedback)
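
To make the reward idea concrete, the sketch below keeps a running average reward per recommended item, updated from implicit feedback (1 for a watch or click, 0 for a skip). This is the incremental value estimate used in simple multi-armed bandits, shown purely as an illustration; it is not a description of YouTube's system, and the class name, item names and feedback events are hypothetical.

```python
class RewardTracker:
    """Running average of implicit-feedback rewards per recommended item."""

    def __init__(self):
        self.counts = {}
        self.values = {}

    def update(self, item, reward):
        """Fold one reward into the item's running average."""
        n = self.counts.get(item, 0) + 1
        old = self.values.get(item, 0.0)
        self.counts[item] = n
        # Incremental mean: new = old + (reward - old) / n
        self.values[item] = old + (reward - old) / n

    def best(self):
        """Return the item with the highest estimated reward so far."""
        return max(self.values, key=self.values.get)


# Hypothetical implicit feedback: (item shown, 1 if watched else 0).
feedback = [
    ("video_a", 1),
    ("video_b", 0),
    ("video_a", 1),
    ("video_b", 1),
    ("video_a", 0),
]

tracker = RewardTracker()
for item, reward in feedback:
    tracker.update(item, reward)

print(tracker.values, tracker.best())
```

A production recommender would also explore items it knows little about rather than always exploiting the current best estimate.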

The User (needs) is
‘The Query’

The shift from answers
to journeys

An information
need is rarely a
task with a
single finite item

It’s more like a series of little chunks (sub-tasks)

People are creatures of habit it seems

“Patterns were spotted about
repetitive task driven search
behaviours – predictable”
(Song & Guo, 2016)

Tasks & timelines
go hand in hand…
it seems

“Predictable task timeline
patterns are more prevalent
on mobile devices” (Song &
Guo, 2016)

E.g. ‘checking
the stock market’
every morning if
you’re interested in
stocks and shares
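
A predictable task timeline can be spotted with very little machinery: count on how many distinct days a task occurs at the same hour. The sketch below is a hedged illustration of that idea, not the method of Song & Guo; the event log, the 0.8 threshold and the function name are assumptions.

```python
from collections import defaultdict


def habitual_tasks(events, min_share=0.8):
    """Return (hour, task) pairs seen on at least min_share of observed days.

    events is a list of (day, hour, task) tuples.
    """
    days = {day for day, _, _ in events}
    if not days:
        return []
    days_seen = defaultdict(set)
    for day, hour, task in events:
        days_seen[(hour, task)].add(day)
    return sorted(
        key for key, seen in days_seen.items()
        if len(seen) / len(days) >= min_share
    )


# Hypothetical five-day activity log for one user.
events = [
    ("mon", 8, "check stock market"),
    ("tue", 8, "check stock market"),
    ("wed", 8, "check stock market"),
    ("thu", 8, "check stock market"),
    ("fri", 8, "check stock market"),
    ("mon", 7, "check weather"),
    ("thu", 7, "check weather"),
    ("wed", 20, "search hotels"),
]

print(habitual_tasks(events))
```

Only the 8am stock check clears the threshold here, so it is the one task a proactive system could reasonably anticipate from this log.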

Mobile Device Sensors (14 sensors or more)
• Proximity sensors
• GPS sensor
• Ambient light sensor
• Accelerometer
• Compass
• Gyroscope
• Back-illuminated sensor

Many tasks & intents
can be modelled
according to
predicted patterns

Personalising Search via Interests & Activities
2005 paper awarded the 2017 SIGIR Test of Time Award. Cited 1029 times to date
Teevan, J., Dumais, S.T. and Horvitz, E., 2005, August. Personalizing search via automated analysis of interests
and activities. In Proceedings of the 28th annual international ACM SIGIR conference on Research and
development in information retrieval (pp. 449-456). ACM.

Google Discover
looks to be
focusing on
hobbies, interests,
news and social
activities

Very Recent Microsoft Research

The Ideal is
Personalisation
• Not easy to achieve fully
• Sparsity of data
• Privacy concerns
• Broken sequences

In the absence of
personalisation…
collaborative filtering

There are
other people
nearly like
you

You (and me)
are unique…
but may be
similar

Matrix Factorisation
(Netflix Recommendation
System) + Matrix
Factorisation (WALS
Algorithm, TensorFlow)

TensorFlow Matrix
Factorisation

Based on users liking
the same things (with
hidden common
preferences)

Those sharing similar interests
likely share other hidden
interests too (i.e. the system
does not know of them yet)
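
The ‘people nearly like you’ idea can be sketched with neighbourhood-based collaborative filtering: find users whose likes overlap with yours, then suggest what they like that you have not seen. This is deliberately simpler than the matrix factorisation named in the slides and is swapped in only for illustration; the users, interests and function names are hypothetical.

```python
from math import sqrt

# Hypothetical users and the interests each has shown.
likes = {
    "ana": {"hiking", "cameras", "travel"},
    "ben": {"hiking", "cameras", "travel", "drones"},
    "cat": {"knitting", "baking"},
}


def cosine(a, b):
    """Cosine similarity between two sets of liked items."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(user, likes, top_n=3):
    """Score unseen items by the similarity of the users who like them."""
    scores = {}
    for other, items in likes.items():
        if other == user:
            continue
        sim = cosine(likes[user], items)
        if sim == 0:
            continue  # Ignore users with no overlap at all.
        for item in items - likes[user]:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


print(recommend("ana", likes))
```

Here "drones" surfaces for ana because ben shares all of her known interests, while cat's interests are ignored as there is no overlap.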

Understand the user,
understand their
cohort… Understand
other similar
informational needs

The two sides of assistant will both be proactive
• Provide answers / search: Conversation Search
• Help with activities / tasks: Conversation Actions
Extend Actions on Google using
Machine Learning

Understand your customers to assist with AI
[Diagram: a perceived information need broken down into tasks, each made up of one or more micro-tasks]
We can identify the user’s probable top tasks & subtasks
Identify their needs & what info they need along the way

Tell us about
the tasks,
order and
steps involved
in booking a
hotel

Many built-in intents & many ‘coming soon’

Connecting Tasks Across Devices & Applications

Multi-platforming
• Switching between search
and video
• Between search and a
recommender system

Building a Personal Knowledge Graph

A Recent Microsoft Personal Knowledge Graph Patent

This is ‘Task-driven’
Search &
Recommender
Systems

Where the
user is
truly ‘the
query’

Toward a Personal
Knowledge Graph

Truly PERSONAL AI is not
possible without a PERSONAL
KNOWLEDGE GRAPH
(Krisztian Balog, ECIR 2019)

But where will users be
reached?

By 2022 PCs will
account for only 19
percent of IP
traffic (Comscore,
2019)

Interest over time for Google Home & Amazon Alexa

Assistant +
Home +
Discover +
Search App +
Desktop +
Location
Tracker +
Calendar +
Gmail +
YouTube

Carriers for
Recommender
Systems

Realise… your ranking tools are mostly wrong

Think CRM for SEO

Identify interests &
affinity groups

Map every single
informational need
sub-task you can think
of to the sections of a
model like the RACE
model

Map & cluster ‘Related’
content by task & temporal
type. Categories are too
broad, and topics may be
too

Continually
improve and
update solid
URL
seasonal &
temporal
content

Continually improve and update solid URL evergreen content

Map content clearly to tasks and task timelines
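
Mapping content to a task timeline can start as a simple coverage check: list the steps of a task in order, tag each URL with the steps it serves, and see which steps have no content. The hotel-booking steps and URLs below are hypothetical placeholders, and the function name is an assumption for illustration.

```python
# Hypothetical ordered steps in a hotel-booking task timeline.
task_timeline = [
    "compare destinations",
    "check availability",
    "compare prices",
    "read reviews",
    "book room",
]

# Hypothetical site URLs tagged with the task steps they support.
content_map = {
    "/guides/destinations": ["compare destinations"],
    "/hotels/availability": ["check availability"],
    "/hotels/reviews": ["read reviews"],
}


def coverage_gaps(timeline, content_map):
    """Return timeline steps that no piece of content is mapped to, in order."""
    covered = {task for tasks in content_map.values() for task in tasks}
    return [task for task in timeline if task not in covered]


print(coverage_gaps(task_timeline, content_map))
```

The uncovered steps are candidates for new or improved content in the plan.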

Identify predictable
patterns of user
behavior

Understand the
shared
preferences,
learn the hidden
preferences

• Go big on evergreen content & keep updated
• Optimise images well – think curation / collections
• Map user journeys to content plans
• Optimise video well – enhance with markup / transcription
• Get personal – keep refining segments / personas
• Identify & cluster content around task timelines
• Use relatedness across content, tasks & temporality

Bias and reproducibility are challenges

Reproducibility
problems in
research &
RecSys (very
high)

Bias on the web and recommender systems

Bias Considerations
• Presentation Bias
• Programming Bias
• Audience Manipulated Bias (e.g. fake reviews)
• Machine Learning / AI Bias (Black box algorithms)
• Matthew’s Law
• Zipfian Distribution of Web Content

Spotify add novelty items to
home page to avoid biased
personalisation

Do yourself a favour and
follow Mounia Lalmas
@mounialalmas

The
QueryLess
change
will not
come
overnight
… things
move
slowly

References

• Broder, A., 2002, September. A taxonomy of web search. In ACM Sigir forum (Vol. 36, No.
2, pp. 3-10). ACM.
• Chuklin, A., Severyn, A., Trippas, J., Alfonseca, E., Silen, H. and Spina, D., 2018. Prosody
Modifications for Question-Answering in Voice-Only Settings. arXiv preprint
arXiv:1806.03957.
• HigherVisibility. 2018. How Popular is Voice Search? | HigherVisibility. [ONLINE] Available
at: https://www.highervisibility.com/blog/how-popular-is-voice-search/
• Filippova, K., Alfonseca, E., Colmenares, C.A., Kaiser, L. and Vinyals, O., 2015. Sentence
compression by deletion with lstms. In Proceedings of the 2015 Conference on Empirical
Methods in Natural Language Processing (pp. 360-368).
• Filippova, K. and Alfonseca, E., 2015. Fast k-best sentence compression. arXiv preprint
arXiv:1510.08418.
• Google Developers. 2018. Content-based Actions | Actions on Google | Google
Developers. [ONLINE] Available at: https://developers.google.com/actions/content-actions/.
[Accessed 18 June 2018]

• Radinsky, K., Svore, K.M., Dumais, S.T., Shokouhi, M., Teevan, J., Bocharov, A. and Horvitz, E.,
2013. Behavioral dynamics on the web: Learning, modeling, and prediction. ACM Transactions on
Information Systems (TOIS), 31(3), p.16.
• Sadikov, E., Madhavan, J. and Halevy, A., Google LLC, 2013. Clustering query refinements by
inferred user intent. U.S. Patent 8,423,538.
• Official Google Webmaster Central Blog. 2019. Official Google Webmaster Central Blog: Rolling out
mobile-first indexing. [ONLINE] Available at:
https://webmasters.googleblog.com/2018/03/rolling-out-mobile-first-indexing.html.
[Accessed 25 September 2019].
• Zhou, S., Cheng, K. and Men, L., 2017, April. The survey of large-scale query classification. In AIP
Conference Proceedings (Vol. 1834, No. 1, p. 040045). AIP Publishing.

• Search Engine Land. 2019. Starting July 1, all new sites will be indexed using Google's mobile-first indexing
- Search Engine Land. [ONLINE] Available at:
https://searchengineland.com/july-1-new-sites-will-be-indexed-using-googles-mobile-first-indexing-317490.
[Accessed 25 September 2019].
• Teevan, J., Dumais, S.T. and Horvitz, E., 2005, August. Personalizing search via automated analysis of
interests and activities. In Proceedings of the 28th annual international ACM SIGIR conference on Research
and development in information retrieval (pp. 449-456). ACM.
• Nguyen, T., Rosenberg, M., Song, X., Gao, J., Tiwary, S., Majumder, R. and Deng, L., 2016. MS MARCO: A
Human-Generated MAchine Reading COmprehension Dataset.

Keep in touch
@dawnieando
