- 1. Recent advances in computational advertising:
design and analysis of ad retrieval systems
Evgeniy Gabrilovich
gabr@yahoo-inc.com
- 2. What is “Computational Advertising”?
• A new scientific sub-discipline that provides the
foundation for building online ad retrieval platforms
– To wit: given a certain user in a certain context,
find the most suitable ad
• At the intersection of
– Large scale text analysis
– Information retrieval
– Statistical modeling and machine learning
– Optimization
– Microeconomics
© Yahoo! Research 2010 Technologies described might or might not be in actual use at Yahoo!
- 5. Textual advertising
1. Ads driven by search keywords –
Sponsored Search (a.k.a. “keyword driven
ads”, “paid search”, etc.)
2. Ads directly driven by the content of a web
page – Content Match (a.k.a. “context
driven ads”, “contextual ads”, etc.)
Textual advertising on the Web is strongly related
to NLP and information retrieval
- 6. Sponsored search
Text-based ads driven by a keyword search
- 7. Content match ads
Text-based ads driven by the page content
[Figure: content match ads]
- 8. Anatomy of an ad
Bid phrases: {SIGIR 2010,
computational advertising,
Evgeniy Gabrilovich, ...}
Bid: $0.10
Title
Creative
Display URL
Landing URL:
http://research.yahoo.com/tutorials/sigir10_compadv
Landing page
- 9. So when do advertising dollars
actually change hands?
– CPM = cost per thousand impressions
• Typically used for graphical/banner ads
(brand advertising)
– CPC = cost per click
• Typically used for textual ads
– CPT/CPA = cost per transaction/action
a.k.a. referral fees or affiliate fees
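As a rough illustration of the three pricing models, the sketch below computes what an advertiser would be charged under each one. The function names, prices, and event counts are hypothetical and chosen only for the example; only the $0.10 per-click bid echoes the sample ad shown earlier.

```python
def cpm_cost(impressions: int, price_per_thousand: float) -> float:
    """CPM: the advertiser pays per thousand impressions."""
    return impressions / 1000 * price_per_thousand


def cpc_cost(clicks: int, price_per_click: float) -> float:
    """CPC: the advertiser pays only when the ad is clicked."""
    return clicks * price_per_click


def cpa_cost(actions: int, price_per_action: float) -> float:
    """CPT/CPA: the advertiser pays only per transaction/action."""
    return actions * price_per_action


# Hypothetical campaign: 50,000 impressions, 400 clicks, 10 purchases.
print(cpm_cost(50_000, 2.00))  # charged for exposure
print(cpc_cost(400, 0.10))     # charged for clicks
print(cpa_cost(10, 5.00))      # charged for completed actions
```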
- 10. Beyond keyword matching
• Matching ads is relatively simple for explicitly bid keywords
What about queries on which there are no bids?
– Advertisers should be able to bid on “broad queries” and/or
“concept queries”
– Advertisers need volume – the total amount of searches on bid
phrases is not enough!
• Suppose your ad is “Good prices on Seattle hotels”
• Naïve approach: bid on any query that contains the word Seattle
• Problems
• “Seattle's Best Coffee Chicago”
• “Alaska cruises start point”
• Ideally: bid on any query related to Seattle as a travel destination
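The naïve approach and its two problems can be sketched as follows. The matcher is a hypothetical stand-in, and the first query is invented for contrast. The second query matches although it is about a coffee brand. The third is missed, and it was presumably listed as a problem because such cruises often depart from Seattle.

```python
def naive_match(query: str, keyword: str = "seattle") -> bool:
    """Naive approach: match any query that contains the keyword as a word."""
    tokens = query.lower().replace("'s", "").split()
    return keyword in tokens


queries = [
    "cheap hotels in Seattle",        # relevant to the hotel ad
    "Seattle's Best Coffee Chicago",  # contains the keyword, but is about coffee
    "Alaska cruises start point",     # no keyword, so the query is missed
]
for q in queries:
    print(f"{q!r}: matched={naive_match(q)}")
```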
- 11. The old school:
heuristic ad matching
• Sponsored search
– Exact match between the query and the bid phrase
of the ad (modulo simple normalization, e.g.,
stemming)
– Advertisers cannot possibly bid on all relevant
queries (especially rare ones)
• Use advanced match (e.g., through query-to-query rewrites)
• Content match
– Extract bid phrases from pages, thus reducing the
problem to exact match
Both essentially perform record lookup
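A minimal sketch of exact match as record lookup, assuming a toy normalization (lowercasing, punctuation removal, and stripping a trailing plural "s" as a crude stand-in for real stemming). The ad identifiers and bid phrases are hypothetical.

```python
import re


def normalize(phrase: str) -> str:
    """Lowercase, drop punctuation, and strip a trailing plural 's' per word."""
    words = re.findall(r"[a-z0-9]+", phrase.lower())
    stemmed = [w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words]
    return " ".join(stemmed)


# Hypothetical ad database keyed by normalized bid phrase.
bid_phrase_to_ads = {}
for ad_id, bid_phrase in [("ad1", "Seattle hotels"), ("ad2", "Alaska cruises")]:
    bid_phrase_to_ads.setdefault(normalize(bid_phrase), []).append(ad_id)


def exact_match(query: str) -> list:
    """Return ads whose normalized bid phrase equals the normalized query."""
    return bid_phrase_to_ads.get(normalize(query), [])


print(exact_match("seattle hotel"))         # matches the "Seattle hotels" bid phrase
print(exact_match("cheap seattle hotels"))  # no exact match, so nothing is returned
```

The second lookup shows why advertisers cannot rely on exact match alone: any query they did not bid on verbatim retrieves nothing.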
- 12. The old school (cont’d)
[Diagram: Query (“Abbey Road lyrics”) → Front end → Query rewriting module → Query rewrites → Exact match → Candidate ads → Revenue reordering → Ad slate]
• Simplistic query expansion
• Ignoring (or underusing) the multitude of information available
- 13. The new approach:
knowledge-based ad retrieval
• Ad indexing and scoring based on all the information
available (bid terms, title, creative, URL, landing page, ...)
– Similar to document indexing in IR
• Use standard IR tools (text preprocessing – tokenization, stemming,
entity extraction; inverted indexes etc.)
– Use multiple features of the query and the ad
• Elaborate query expansion
• 2nd pass relevance reordering (re-ranking)
– Using features not available to the 1st pass model (e.g., set-level
features, click history)
- 14. The new approach (cont’d)
[Diagram: Query (“Miele”) → Front end → Rich query generation → Ad query → Ad search engine → First pass retrieval → Candidate ads → Relevance reordering → Revenue reordering → Ad slate]
• Ad query: <Miele, appliances, kitchen, “appliances repair”, “appliance parts”, Business/Shopping/Home/Appliances>
• The hidden parts of ads (bid phrases + landing pages) allow us to augment the ads (cf. query expansion)
- 15. Research questions
• How to select relevant ads?
• How to index the ad corpus?
• Should we show ads at all?
• Can we generate bid phrases (or even entire ad campaigns) automatically?
• What is the interplay between the organic and sponsored results?
• Should we use the landing page for indexing?
• Can we optimally choose the landing page?
- 16. How to select relevant ads?
Feature generation for
improved ad retrieval
(SIGIR 2007, w. Broder et al.;
ACM TWEB 2009, Gabrilovich et al.)
- 17. Query classification using
Web search results
• Humans often find it hard to readily see what the
query is about …
– But they can easily make sense of it once they look at
the search results…
• Let computers do the same thing
– Infer the query intent from the top algorithmic search
results (“pseudo relevance feedback”)
• Classify search results (either summaries or full pages)
• Let these results “vote” to determine the query class(es) in a
large taxonomy of commercial topics
• Our goal: Construct additional features to retrieve better ads
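The voting idea can be sketched as below. This is an illustration under stated assumptions, not the classifier from the cited papers: the URLs, the taxonomy labels, and the `PAGE_CLASSES` lookup (standing in for pages classified offline) are all hypothetical, and each result casts one unweighted vote per class.

```python
from collections import Counter

# Hypothetical output of a page classifier run offline over the Web corpus:
# URL -> taxonomy classes assigned to that page.
PAGE_CLASSES = {
    "example.com/modem-specs": ["Computing/Hardware/Modems"],
    "example.com/usb-modem-review": ["Computing/Hardware/Modems",
                                     "Shopping/Electronics"],
    "example.com/driver-download": ["Computing/Software/Drivers"],
}


def classify_query(result_urls, top_classes=2):
    """Let each search result vote for its classes; return the most voted ones."""
    votes = Counter()
    for url in result_urls:
        for cls in PAGE_CLASSES.get(url, []):
            votes[cls] += 1
    return [cls for cls, _ in votes.most_common(top_classes)]


# Top results a search engine might return for an opaque query.
results = ["example.com/modem-specs", "example.com/usb-modem-review",
           "example.com/driver-download"]
print(classify_query(results, top_classes=1))
```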
- 18. Example: ex560lku
CATEGORIES
1. Computing/Computer/Hardware/Computer/Peripherals/Computer Modems
- 19. If we know it is about actiontec usb modem
then we have plenty of ads …
- 20. Our approach
Traditional approach:
Query → Classifier (insufficient data)
Our approach:
Query → Search engine (very large scale) → Search results → Classifier
• Pre-classify all pages just once!
• Using the Web as external knowledge
- 21. Research questions
• Number of search results to obtain
• Snippets or full pages?
• Number of classes per search result
• Aggregation: bundling or voting?
- 22. The effect of using Web search results
- 23. Beyond the bag of words: matching
textual ads in the enriched feature space
(SIGIR 2007, Broder et al.;
CIKM 2008, w. Broder et al.)
- 24. What can we do about non-English queries?
(iNEWS @ CIKM 2008, w. Wang et al.;
WSDM 2009, w. Wang et al.)
• Developing a taxonomy and building a query
classifier for every language is prohibitively
expensive
• Solution: apply off-the-shelf MT to the
search results in the source language
[Diagram: Machine Translation applied to sufficiently long text (the search results) rather than to very short text (the query)]
- 25. The effect of query expansion
prior to applying MT.
[Figure: results by query frequency, from head (more frequent) to tail (less frequent)]
• The gap for infrequent queries is wider
• Baseline = translate the query (using MT),
then classify the result as an English query
- 26. How to index the ad corpus?
The Anatomy of an ad:
Structured indexing and retrieval
for sponsored search
(WWW 2010, w. Bendersky et al.)
- 27. Structure of online ad campaigns: the
ad schema
[Diagram: Advertiser → Account 1, Account 2, … → Campaign 1, Campaign 2, … → Ad group 1, Ad group 2, … → Ad (creatives + bid phrases)]
• Examples shown: “New Year deals on lawn & garden tools”, “Buy appliances on Black Friday”, “Kitchen appliances”
• Example creative: “Brand name appliances / Compare prices and save money / www.appliances-r-us.com”
• Example bid phrases: { Miele, KitchenAid, Cuisinart, … }
• Bid phrases: can be just a single bid phrase, or thousands of bid phrases (which are not necessarily topically coherent)
- 28. Implications of the campaign
structure
• What is the appropriate indexing unit?
– Cartesian product of creatives and bid phrases? Ad group?
• Leveraging information from higher levels to address data sparsity
at children nodes
• What is the right approach to document length normalization?
– Large variability of document lengths
– Probability of shorter documents (smaller ad groups) to be retrieved is
higher than their probability of being relevant
• How to index and score templated ads?
• Prior work mostly considered ads as independent atomic units and
ignored hierarchical campaign structure
- 29. Possible approaches
1. Term index (Cartesian product of all creatives and bid terms)
• Huge index, small focused documents
2. Creative index (a creative is coupled with all the bid terms in
the ad group)
• Two-stage retrieval (first choose the creative, then pick the term)
• Bid terms are duplicated across creatives
3. Ad group index
• Indexing units are entire ad groups
• Three-stage retrieval (first choose
the ad group, then the creative,
and finally pick the term)
• Most compact index
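To make the three indexing units concrete, the sketch below builds the "documents" each option would index from one ad group. The ad group contents and function names are hypothetical, and real documents would carry more fields (title, URL, landing page) than the plain strings used here.

```python
from itertools import product

# Hypothetical ad group: two creatives sharing three bid terms.
ad_group = {
    "id": "kitchen-appliances",
    "creatives": ["Brand name appliances", "Compare prices and save money"],
    "bid_terms": ["Miele", "KitchenAid", "Cuisinart"],
}


def term_index_units(group):
    """One document per (creative, bid term) pair: the Cartesian product."""
    return [f"{c} {t}" for c, t in product(group["creatives"], group["bid_terms"])]


def creative_index_units(group):
    """One document per creative, coupled with all bid terms of the ad group."""
    terms = " ".join(group["bid_terms"])
    return [f"{c} {terms}" for c in group["creatives"]]


def ad_group_index_units(group):
    """One document for the entire ad group."""
    return [" ".join(group["creatives"] + group["bid_terms"])]


for build in (term_index_units, creative_index_units, ad_group_index_units):
    print(build.__name__, len(build(ad_group)))
```

Even for this tiny ad group the document count drops from 6 to 2 to 1, which is the size/focus trade-off the three options describe.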
- 30. Retrieval speed vs. relevance
Term index yields most relevant
ads, yet is least efficient (20x slower
than the ad group index)
Are we trading
effectiveness
for efficiency?
Ad group index is most efficient
(2x faster than creative index), yet
least effective
- 31. Using learning to rank techniques:
structured re-ranking
• Step 1: Retrieve an initial set of candidates using the ad group index
• Step 2: Re-rank the candidate set using structural features (instead of
ignoring the structure and scoring creatives and terms independently)
– Ad group score, creative-term pair score
– # bid terms in the ad group
– Unigram entropy (cohesiveness)
of the ad group
– Ratio of query words covered
by the ad group text
– Fraction of the titles / terms /
URLs that contain at least
one query term
– Other features are possible!
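A few of the structural features listed above can be computed as in the following sketch. It assumes whitespace tokenization, entropy measured in bits, and non-empty inputs; the function name and the sample ad group are hypothetical, and the retrieval scores themselves (ad group score, creative-term pair score) are left out.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def structural_features(query, titles, bid_terms):
    """Compute a few ad-group-level features for one query."""
    group_tokens = [tok for text in titles + bid_terms for tok in tokenize(text)]
    counts = Counter(group_tokens)
    total = sum(counts.values())
    # Unigram entropy (bits) of the ad group text; lower means more cohesive.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    query_tokens = set(tokenize(query))
    covered = query_tokens & set(group_tokens)
    titles_hit = sum(1 for t in titles if query_tokens & set(tokenize(t)))
    return {
        "num_bid_terms": len(bid_terms),
        "unigram_entropy": entropy,
        "query_coverage": len(covered) / len(query_tokens),
        "title_fraction": titles_hit / len(titles),
    }


features = structural_features(
    query="miele dishwasher",
    titles=["Miele appliances", "Kitchen appliances sale"],
    bid_terms=["miele", "kitchenaid"],
)
print(features)
```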
- 32. Re-ranking retrieval performance
nDCG@5                 Len 1            Len 2-3          Len 4+
                       (143 queries)    (443 queries)    (187 queries)
Term index             0.841            0.716            0.656
Structured re-ranking  0.849 (+0.95%)   0.731 (+2.1%)    0.686 (+4.6%)
• Structured re-ranking is superior
for all query lengths
• Most notable improvements are
obtained for longer queries
• Still very efficient!
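For reference, nDCG@5 can be computed as sketched below. This uses the linear-gain, log2-discount form of DCG; the slides do not say which variant was used in the evaluation, so treat that choice, and the sample grades, as assumptions.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain of the first k graded relevance labels."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(gains, k=5):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical editorial grades (higher is better) for six ranked ads.
print(ndcg_at_k([3, 2, 3, 0, 1, 2], k=5))
print(ndcg_at_k([3, 3, 2, 2, 1, 0], k=5))  # already ideally ordered
```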
- 33. To swing or not to swing: learning when (not)
to advertise (CIKM 2008, w. Broder et al.)
Should we show ads at all?
• Repeatedly showing non-relevant ads can have
detrimental long-term effects
• Want to be able to predict
when (not) to show individual
ads or a set of ads (“swing”)
• Modeling actual short- and
long-term costs of showing
non-relevant ads is very
difficult
- 34. Thresholding approach
• Decision made on individual ads based on
ad scores
– Set a global score threshold
– Only retrieve ads with scores above it
– If none of the ad scores are above the
threshold, then no ads are shown (“no swing”)
• Scores are not necessarily comparable
across queries!
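The thresholding rule amounts to a simple filter, sketched here with a hypothetical threshold value and made-up scores.

```python
def threshold_ads(scored_ads, threshold):
    """Keep only ads scoring above a single global threshold.

    An empty result means no ads are shown ("no swing").
    """
    return [(ad, score) for ad, score in scored_ads if score > threshold]


GLOBAL_THRESHOLD = 0.5  # hypothetical value

# Hypothetical retrieval scores for two different queries.
print(threshold_ads([("ad1", 0.9), ("ad2", 0.4)], GLOBAL_THRESHOLD))
print(threshold_ads([("ad3", 0.3), ("ad4", 0.2)], GLOBAL_THRESHOLD))  # no swing
```

Because one threshold is applied to every query, the filter is only as good as the comparability of scores across queries, which is the weakness noted above.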
- 35. Machine learning approach
• Decision made on sets of ads based on a
variety of features
– Learn a binary prediction model (“swing” /
“no swing”) for sets of ads
– If we swing, then all ads are retrieved
– If we do not swing, then no ads are retrieved
• Features defined over sets of ads, rather
than individual ads
- 36. Features
• Relevance features
– Word overlap, cosine similarity between ad and query/page
• Vocabulary mismatch features
– Translation models
– PMI between query/page terms and bid terms
• Ad-based features
– Bid price (higher bids may indicate better ads)
• Result set cohesiveness features
– Coefficient of variation of ad scores (std/mean)
– Result set clarity
• If the set of ads is very cohesive and focused on 1-2 topics, the
relevance language model is very different from the collection
model
– Entropy
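Two of the simpler features can be sketched as follows: the coefficient of variation of the ad scores and a bag-of-words cosine similarity. Using the population standard deviation and raw term counts (no TF-IDF weighting) are assumptions, and the scores and texts are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def coefficient_of_variation(scores):
    """Standard deviation of the ad scores divided by their mean."""
    return pstdev(scores) / mean(scores)


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words term-count vectors."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


ad_scores = [0.8, 0.7, 0.75]  # hypothetical scores of the retrieved ad set
print(coefficient_of_variation(ad_scores))
print(cosine_similarity("seattle hotels", "good prices on seattle hotels"))
```

Feature values like these, computed over the whole ad set, would then feed the binary swing / no swing model.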
- 37. What happens after an ad click?
Quantifying the impact of landing
pages in Web advertising
(CIKM 2009, w. Becker et al.)
Can we optimally choose the landing page?
- 38. Conceptually: context transfer
[Diagram: Search engine result page → Click! → Landing page → User’s activity on the advertiser’s Web site → Conversion (e.g., purchase of the product or service being advertised)]
- 39. All landing pages are not created equal
(and neither are the corresponding conversion rates)
• We propose a concise taxonomy of landing page types:
I. Homepage (25%) – top-level page of the advertiser’s site
(e.g., Verizon.com)
II. Category browse (37.5%) – main page of a sub-section of
the advertiser’s site, which describes a category of related
products
III. Search transfer (26%) – search within the advertiser’s site
OR on other Web sites
IV. Other (11.5%) – terminal pages (e.g., promotion pages or
forms)
- 43. Landing page classifier
• Features: bag of words, HTML patterns
– [ST] “search results”, “found”
– [CB] “Home > Verizon > LG phones”
– [HP] HTML overlap between given URL and base URL
– [O] ratio of form elements to text, few outgoing links
• Accuracy on the pilot dataset (10-fold xval): 83%
• Accuracy on additional 100 labeled pages: 80%
• Distribution of landing page types in a set of 20,000
landing pages from Yahoo! Toolbar logs:
Homepage   Search Transfer   Category Browse   Other
34.4%      22.3%             36.0%             7.3%
- 44. Using the landing page taxonomy
Picking the right landing page
type for each ad
Improving the conversion rate
Improving advertisers’ ROI!
- 45. Landing page type usage vs. conversion:
breakdown by query frequency
[Figure: landing page type usage and conversion rate by query frequency; navigational queries are marked]
• Category and search transfer become more popular for rare queries
• Observed conversion rates are in sharp contrast with usage frequency
of the different page types
- 46. Landing page type usage vs. conversion:
breakdown by query price
[Figure: landing page type usage and conversion rate by query price]
• Category and search transfer are dominant for cheaper queries
• As the price goes up, so does the conversion rate
(higher quality pages?)
- 47. What is the interplay between
the organic and sponsored results?
Competing for users’ attention:
On the interplay between organic and
sponsored search results
(WWW 2010, w. Danescu-Niculescu-Mizil et al.)
- 48. The interplay between ads and
organic results
“... in an information-rich world, the wealth of information means a
dearth of something else: a scarcity of whatever it is that
information consumes. What information consumes is rather
obvious: it consumes the attention of its recipients. Hence a
wealth of information creates a poverty of attention and a
need to allocate that attention efficiently among the
overabundance of information sources that might consume it.”
-- Herbert Simon, “Designing Organizations for an Information-Rich
World”, 1971.
• Is there competition for clicks between ads and organic results?
• Do users prefer ads that are similar to the organic results, or do
they prefer diversity?
We found that the nature of this interplay depends
on the type of the query
- 49. Relation between the CTR of ads
and the CTR of organic results
• Negative correlation (competition)
– Users are only willing to spend limited time and effort on
each query
• Positive correlation (depends on the quality of
results)
– Easy query (“online radio”) – decent ads and organic
results – clicks on both
– Hard query (“who is giving this talk?”) – poor results on
both sides – no clicks on either
• Independence (null hypothesis)
– Users consider ads and organic results as two
independent sources of information
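One way to test the three hypotheses is to correlate per-query CTRs, sketched below with the Pearson coefficient. The slides do not state which statistic the study used, so this choice is an assumption, and the CTR values are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-query click-through rates (not data from the talk).
organic_ctr = [0.60, 0.45, 0.30, 0.20]
ad_ctr = [0.02, 0.04, 0.05, 0.08]
r = pearson(organic_ctr, ad_ctr)
# r < 0 would suggest competition, r > 0 positive correlation,
# and r close to 0 is consistent with independence.
print(f"r = {r:.2f}")
```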
- 50. Findings:
competition + positive correlation
- 51. Decoupling the forces
• Users are willing to invest limited effort in
each query → competition
• In order to single out the competition effect, we
tried to explicitly model the amount of effort
the user is willing to invest
• Low effort = navigational queries [Broder, 2002]
(27% of queries)
– “Pandora radio”, “Bank of America”
• High effort = non-navigational queries
– “Meaning of life”, “academia vs. industry”
- 52. Competition clearly exists for
navigational queries
We also examined different
degrees of navigationality:
the less navigational the query
is, the less competition we
observed
- 53. Another viewpoint:
Do users prefer ads that are more similar to
the organic results or more diverse ads?
• Both have been argued for in prior work
• Preference for similarity
– Ads are more likely to be relevant
– This assumption is often made in query
expansion for advertising [Broder et al., 2008]
• Preference for diversity
– Diversity among organic search results has
often been shown to be desirable (e.g., entire
session on diversity @ WWW 2010)
- 54. We found evidence for users’ preferring
both diversity and similarity
So we need to dig deeper again ...
Overlap measured using the Jaccard coefficient
between titles of ads and organic results
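The Jaccard coefficient between two titles is the size of the intersection of their word sets divided by the size of the union. A minimal sketch, assuming lowercased whitespace tokenization (the tokenization used in the study is not stated) and hypothetical titles:

```python
def jaccard(text_a: str, text_b: str) -> float:
    """Jaccard coefficient between the word sets of two strings."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical titles of one organic result and two ads.
organic_title = "Pandora Internet Radio - Listen to Free Music"
print(jaccard("Pandora Internet Radio", organic_title))           # high overlap
print(jaccard("Discount Bose Computer Speakers", organic_title))  # no overlap
```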
- 56. Break down by navigationality
(cont’d)
- 58. Responsive and incidental ads
• Responsive ads directly address the user’s
information need
– More likely to be similar to the organic results
• Incidental ads are only somewhat related to the
user’s information need
– Unreasonable as organic results, but ok for ads
– More likely to be different from the organic results
• Example: query = “free internet radio”
– Responsive: “Pandora Internet Radio”
– Incidental: “Discount Bose Computer Speakers”
- 59. Now it all makes sense ...
Using the features
that quantify this
interplay,
we improved the
accuracy of CTR
prediction by 5%
- 60. Summary
1. The financial scale is huge
2. Advertising is a form of information
3. Finding the “best ad” is an information
retrieval problem
Multiple, possibly contradictory utility functions
Classical IR needs significant adaptation
4. The optimal solution requires extensive
use of external knowledge
- 61. Thank you!
gabr@yahoo-inc.com
http://research.yahoo.com/~gabr
- 62. This talk is Copyright Yahoo! 2010.
Yahoo! and the Author retain all rights, including
copyright and distribution rights. No publication or
further distribution in full or in part is permitted
without explicit written permission.
The opinions expressed herein are the responsibility
of the author and do not necessarily reflect the
opinion of Yahoo! Inc.
This talk benefitted from the contributions of many
colleagues and co-authors at Yahoo! and elsewhere.
Their help is gratefully acknowledged.