The document proposes adding a middle layer of coverage patterns to the existing bipartite model of Adwords to better utilize ad space and reach more potential consumers. Coverage patterns are mined from query logs grouped by concept taxonomy to capture related tail keywords with low competition. Advertisers are matched to coverage patterns based on required impressions to exhaust budgets. Experiments on AOL query data show the approach with coverage patterns increases the number of advertisements per session and sessions per advertisement, indicating better use of ad space and more diverse viewers reached.
3. Introduction
Search engines have become the starting point of most web transactions.
This gives them a significantly large user base, making search engines an ideal avenue for businesses to reach potential consumers.
In the knowledge-sharing meta-model of search engines, advertising has become a major source of revenue for their sustenance.
According to IAB standards, sponsored search is the most dominant form of online advertising, covering almost 43% of the entire market.
4. Introduction: Sponsored Search
The model of search engine advertising is more popularly known as Adwords.
When a user queries a search engine, a list of search results and sponsored results (advertisements) is displayed.
Advertisers bid on search keywords and pay the search engine according to the Pay Per Click (PPC) model to display their advertisements on pages for queries containing the desired keywords.
5. Introduction: Problem Statement
Search keywords follow a long-tail frequency distribution, with:
- a small but fat head of highly frequent keywords
- a long but thin tail of less frequent keywords
During keyword auctions there is high competition for head keywords, while there is little to no competition for tail keywords.
This leads to underutilization of the ad space of a large number of tail keywords.
It also means neglecting a diverse set of potential consumers who could be captured by targeting tail keywords.
7. Background: Model of Adwords
The present model is considered as online bipartite graph matching, with advertisers as one disjoint set and incoming queries as the other.
When a new query comes in, it is matched to a set of advertisers.
The advertisers are then ranked and their ads are displayed in that ranked order.
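The bipartite flow above can be sketched as a tiny online loop. This is an illustration only: the "most remaining budget" tie-break and all names are assumptions, not the deck's actual ranking (real Adwords ranks by bid and ad quality).

```python
# Minimal sketch of the online bipartite step: each incoming query is
# matched to advertisers bidding on it, one winner is picked, and the
# winner's remaining budget decreases.
def serve(queries, advertisers):
    shown = []
    for q in queries:
        eligible = [name for name, info in advertisers.items()
                    if q in info["keywords"] and info["budget"] > 0]
        if eligible:
            # assumption: break ties toward the most remaining budget
            winner = max(eligible, key=lambda name: advertisers[name]["budget"])
            advertisers[winner]["budget"] -= 1
            shown.append((q, winner))
    return shown

ads = {"A": {"keywords": {"shoes"}, "budget": 2},
       "B": {"keywords": {"shoes", "boots"}, "budget": 1}}
print(serve(["shoes", "boots", "shoes", "shoes"], ads))
# [('shoes', 'A'), ('boots', 'B'), ('shoes', 'A')]
```

Note that the fourth query goes unserved once both budgets are exhausted, which is exactly the budget-constrained behavior the deck is concerned with.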
9. Background: Coverage Patterns - Central Idea
The basic idea of coverage patterns is inspired by the set cover problem in set theory.
Given a universe U and a family S of subsets of U, a cover is a subfamily C ⊆ S of sets whose union is U.
Using the same notion, coverage patterns aim to identify items that cover a certain percentage of the entire data.
A key point is that coverage patterns aim at identifying items that usually do "not" occur together, in contrast to frequent patterns, which identify items that occur together.
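For intuition, the classic greedy approximation to set cover looks like this (a minimal sketch; the universe and subsets are invented):

```python
# Greedy set cover: repeatedly pick the subset that covers the most
# still-uncovered elements of the universe U.
def greedy_set_cover(universe, subsets):
    uncovered = set(universe)
    cover = []
    while uncovered:
        best = max(subsets, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            break  # remaining elements are not coverable
        cover.append(best)
        uncovered -= best
    return cover

U = range(1, 11)
S = [{1, 2, 3, 4, 5}, {4, 5, 6, 7}, {6, 7, 8, 9, 10}, {1, 10}]
print(greedy_set_cover(U, S))
# [{1, 2, 3, 4, 5}, {6, 7, 8, 9, 10}]
```

Coverage patterns carry the same "cover the universe with few sets" intuition over to webpages and transactions.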
10. Background: Coverage Patterns - Notations
Let W be the set of webpages of a website, W = {w1, w2, …, wN}.
Let D be the set of transactions from the click-stream data, D = {T1, T2, …}, where each T ⊆ W.
X is defined as a pattern of webpages such that X ⊆ W, X = {wp, wq, …}.
Twi denotes the set of transactions containing the webpage wi, and its cardinality is denoted |Twi|.
13. Background: Coverage Patterns
A pattern is interesting if it has a high CS and a low OR.
A high CS value indicates a larger number of visitors, and a low OR value means less repetition amongst the visitors.
A pattern is said to be interesting if CS(X) > minCS, OR(X) < maxOR and RF(wi) > minRF for each wi in X.
14. Background: Coverage Patterns - Example
Dataset:
Assume minRF = 0.2, minCS = 0.3 and maxOR = 0.5.
|Ta| is 5, |Tb| is 7 and |Tf| is 1. So RF is 0.5 for a, 0.7 for b and 0.1 for f.
Since RF(f) = 0.1 < 0.2 (minRF), f is removed. RF(a) = 0.5 > 0.2 and RF(b) = 0.7 > 0.2, so a and b are retained.
{b, a} is a candidate pattern. (Items in a pattern are ordered by decreasing RF.)
The coverage set for {b, a} is {1,2,3,4,5,6,7,8,9,10} and |CSet{b,a}| is 10.
Hence, CS = 10/10 = 1 > 0.3 (minCS).
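The example numbers can be reproduced in code. The per-transaction contents below are assumptions (the dataset slide is an image); only the counts |Ta| = 5, |Tb| = 7, |Tf| = 1 and the thresholds come from the slide, and the OR definition follows the usual coverage-pattern formulation (overlap of the last item's transactions with the coverage of the preceding items):

```python
# Assumed transaction contents reproducing the slide's counts:
# b appears in 7 transactions, a in 5, f in 1 (10 transactions total).
T = {
    1: {'b'}, 2: {'b'}, 3: {'b'}, 4: {'b'}, 5: {'b'},
    6: {'a', 'b'}, 7: {'a', 'b', 'f'}, 8: {'a'}, 9: {'a'}, 10: {'a'},
}

def tids(item):  # transactions containing `item`
    return {t for t, items in T.items() if item in items}

def rf(item):    # relative frequency of a single item
    return len(tids(item)) / len(T)

def cs(pattern):  # coverage support: fraction of transactions covered
    return len(set().union(*(tids(i) for i in pattern))) / len(T)

def overlap_ratio(pattern):  # pattern ordered by decreasing RF
    covered = set().union(*(tids(i) for i in pattern[:-1]))
    last = tids(pattern[-1])
    return len(covered & last) / len(last)

print(rf('a'), rf('b'), rf('f'))   # 0.5 0.7 0.1
print(cs(['b', 'a']))              # 1.0 > minCS = 0.3
print(overlap_ratio(['b', 'a']))   # 0.4 < maxOR = 0.5
```

So under these assumed transactions {b, a} passes all three thresholds, matching the slide's conclusion.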
16. Proposed Approach: Basic Idea
Because of the nature of the distribution of search keywords, there is very little competition for tail keywords. As a result, there are few or no advertisers for such keywords.
We noticed that if we could combine such keywords into groups such that these keyword groups have a certain number of visitors, we could utilize the ad space of such keywords.
To perform the grouping of search keywords, we employ the notion of a query taxonomy to group semantically similar words.
These groups are then mined from the logs in the form of coverage patterns.
17. Proposed Model
We propose to add a middle layer of coverage patterns to the bipartite model of Adwords.
In the proposed model, incoming queries are first matched to a coverage pattern using the concept taxonomy.
The coverage pattern is then matched to a set of advertisers.
The advertisers are then ranked.
19. Architecture Comparison
Step: Modification with respect to the bipartite architecture
- Analyze Query: This step remains the same, except that the subconcept of the query is also retrieved.
- Retrieve Relevant Ads from the Matching: The advertisers who have been matched to the coverage pattern containing the subconcept of the query are retrieved. (The matching between coverage patterns and advertisers is explained later.)
- Bidding: Stays the same.
- Ranking Advertisers: Stays the same.
21. Coverage Pattern and Advertisers Matching
Coverage pattern and advertiser matching is the most important phase in the architecture.
It is divided into four steps:
a. Converting Query Logs to Concept Transactions
b. Extraction of Coverage Patterns
c. Estimation of the Number of Impressions for Advertisers
d. Matching Coverage Patterns and Advertisers
Each step is explained in the following slides.
22. Step 1: Converting Query Logs to Concept Transactions
One key point to note is that web query logs cannot be directly mined for coverage patterns because of the large vocabulary size (even when we only consider English).
To generalize the coverage pattern mining, we propose to use a three-level concept taxonomy to classify queries into a pair of concept and subconcept.
23. Step 1: Converting Query Logs to Concept Transactions (contd.)
Using the same taxonomy, we convert the web query logs into concept transactions using query classification techniques.
To define a transaction, we consider a session boundary of 30 minutes in the query logs for each user.
Sample Sessions
Converted Concept Transactions
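The 30-minute session boundary can be sketched as follows. The (user, timestamp, concept) record format is an assumption for illustration; only the 30-minute boundary comes from the slide.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Split each user's query stream into sessions using a 30-minute
# inactivity boundary; each session becomes one concept transaction
# (a set of concepts).
def sessionize(log, gap=timedelta(minutes=30)):
    by_user = defaultdict(list)
    for user, ts, concept in sorted(log):
        by_user[user].append((ts, concept))
    transactions = []
    for events in by_user.values():
        current, last_ts = set(), None
        for ts, concept in events:
            if last_ts is not None and ts - last_ts > gap:
                transactions.append(current)  # gap exceeded: close session
                current = set()
            current.add(concept)
            last_ts = ts
        if current:
            transactions.append(current)
    return transactions

log = [
    ("u1", datetime(2006, 3, 1, 10, 0), "Biology"),
    ("u1", datetime(2006, 3, 1, 10, 10), "Chemistry"),
    ("u1", datetime(2006, 3, 1, 11, 0), "Physics"),  # 50 min gap: new session
    ("u2", datetime(2006, 3, 1, 9, 0), "Technology"),
]
print(sessionize(log))
```

Here u1's third query falls outside the 30-minute window, so it opens a new transaction; the result is three concept transactions.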
24. Step 2: Extraction of Coverage Patterns
● Coverage patterns are extracted from the converted concept transactions.
● One key point to note is that coverage patterns mine unique visitors, while the standard models of advertising are based on either impressions or clicks.
● So, we convert a coverage pattern's coverage into a number of impressions as follows:
● For the above example, we consider the concept of Science, with Agriculture, Biology, Chemistry, Environment, Physics and Technology as its subconcepts.
● The transaction size is also assumed to be 1000.
● NOTE: We also rank the coverage patterns in ascending order of their CS - OR parameter.
25. Step 3: Estimating Required Impressions for Advertisers
In Adwords, advertisers create an ad campaign for their website.
In an ad campaign, a daily budget and a bid are specified on the keywords that they choose to bid upon.
Using the CTR, bid and daily budget values, we calculate the number of impressions it will take to exhaust the budget of an advertiser using the following identity:
The table above shows details of nine advertisers who bid upon the concept of Science.
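The identity itself appears as an image on the slide; under standard PPC accounting it can be reconstructed (as an assumption) like this: an impression costs bid × CTR in expectation, so exhausting a daily budget takes budget / (bid × CTR) impressions.

```python
# Reconstructed identity (assumption, not taken verbatim from the deck):
#     impressions = daily_budget / (bid * CTR)
# since each impression costs bid * CTR in expectation under PPC.
def required_impressions(daily_budget, bid, ctr):
    return daily_budget / (bid * ctr)

# Hypothetical advertiser: $100/day budget, $0.50 bid, 2% CTR.
print(round(required_impressions(100.0, 0.50, 0.02)))  # 10000
```

This puts advertisers in the same unit (impressions) as the coverage of a pattern, which is what the matching step below relies on.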
26. Step 4: Matching Advertisers to Coverage Patterns
With coverage patterns and advertisers in the same unit (number of impressions), we can create a matching between the two.
The matching can be termed a MANY-TO-ONE matching between coverage patterns and advertisers because a coverage pattern covers multiple keywords from different nodes in the taxonomy.
The matching algorithm has a relaxation parameter ε to perform faster.
The algorithm loops over coverage patterns and then advertisers, and a coverage pattern is allocated to an advertiser if the following condition is satisfied:
Ad.Impressions - CP.Coverage < ε × Ad.Impressions
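The allocation loop might be sketched as below. The field names, the one-pattern-per-advertiser bookkeeping, and the input ordering are assumptions; the deck specifies only the acceptance condition.

```python
# Sketch of the many-to-one allocation loop: a pattern is given to an
# advertiser when its coverage (in impressions) is within the eps
# relaxation of the advertiser's required impressions:
#     Ad.Impressions - CP.Coverage < eps * Ad.Impressions
def match(patterns, advertisers, eps=0.1):
    allocation = {}
    free = list(advertisers)  # advertisers not yet allocated a pattern
    for cp in patterns:       # patterns assumed pre-sorted by CS - OR
        for ad in free:
            if ad["impressions"] - cp["coverage"] < eps * ad["impressions"]:
                allocation[cp["name"]] = ad["name"]
                free.remove(ad)
                break
    return allocation

patterns = [{"name": "CP1", "coverage": 9500}, {"name": "CP2", "coverage": 4000}]
ads = [{"name": "A1", "impressions": 10000}, {"name": "A2", "impressions": 4200}]
print(match(patterns, ads))  # {'CP1': 'A1', 'CP2': 'A2'}
```

A larger ε accepts looser matches, so the loop terminates after examining fewer advertisers per pattern, which is the "perform faster" trade-off mentioned above.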
28. Experiments: Dataset
We performed a comparative study of the bipartite model of Adwords with and without the coverage patterns layer.
We used the AOL search query dataset to run the experiments.
We took the four most popular categories of queries to run our experiments.
Query Dataset for the four most popular categories
29. Experiments: Performance Metrics
1. Number of Advertisements per Session (AS), the ratio of the Sum of Unique Advertisements of all Sessions (SUAS) to the Number of Sessions with Advertisements (NSA), indicates the utilization of a session. A higher value of AS indicates better use of ad space.
AS = SUAS / NSA
2. An increase in diversity among the viewers of the advertisements was also observed. To indicate this, we compute Sessions per Advertisement (SA), the ratio of the Number of Advertisements of all Sessions (NAS) to the Number of Advertisements (NA). A higher value of this metric implies more unique eyeballs, increasing the chances of the advertisement being clicked by diverse users.
SA = NAS / NA
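The two metrics can be computed from session data as follows. The session sets are invented; the code follows the SUAS/NSA/NA definitions above, with NAS equal to SUAS when each session stores unique ads.

```python
# Computing AS and SA from session data (illustrative sessions).
# Each session is the set of unique advertisements shown in it.
def ad_metrics(sessions):
    with_ads = [s for s in sessions if s]    # sessions that showed ads
    suas = sum(len(s) for s in with_ads)     # unique ads summed per session
    nsa = len(with_ads)                      # sessions with advertisements
    na = len(set().union(*with_ads))         # distinct advertisements (NA)
    return suas / nsa, suas / na             # AS, SA

sessions = [{"ad1", "ad2"}, {"ad1"}, {"ad2", "ad3"}, {"ad1"}]
as_metric, sa_metric = ad_metrics(sessions)
print(as_metric, sa_metric)  # 1.5 2.0
```

With the coverage-patterns layer, more (session, ad) pairs are created from tail-keyword sessions, which pushes both AS and SA up, as the result graphs show.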
30. Experiments: Results - Utilization of Ad Space
Graphs show a comparison of the bipartite Adwords system with and without the Coverage Patterns layer.
31. Experiments: Results - Diversity
Graphs show a comparison of the bipartite Adwords system with and without the Coverage Patterns layer.
33. Related Work
Most works on Adwords target algorithms to optimize different aspects of the system, including revenue, welfare and the display of ads.
Another aspect targeted in Adwords is bidding scenarios. Several studies have touched upon how to increase revenue in a dynamic bidding scenario when only partial information about the system is available.
In this paper, we have proposed an architectural solution to use the ad space of the search keywords. Bidding strategies and budget optimization can be placed on top of it.
35. Conclusions and Future Work
In this paper, an architectural solution is proposed for Adwords to utilize the ad space of tail keywords. The proposed approach also shows considerable improvement with respect to diversity in the reach of advertisements.
We plan to investigate the coverage patterns approach with respect to different taxonomies. We believe a hybrid taxonomy would be best for the Adwords architecture.
We also plan to expand the boundaries of user exploration in searching beyond search sessions. We plan to extract user goals and model the transactions from them.