Just In Time Contextual Advertising




  1. 1. Just-in-Time Contextual Advertising Aris Anagnostopoulos, Andrei Z. Broder, Evgeniy Gabrilovich, Vanja Josifovski, Lance Riedel, CIKM ’07. Advisor: Chia-Hui Chang Presenter: Teng-Kai Fan Date: 2008-08-20
  2. 2. Outline <ul><li>Introduction </li></ul><ul><li>Web Advertising Basics </li></ul><ul><li>Methodology </li></ul><ul><li>Empirical Evaluation </li></ul><ul><li>Conclusion </li></ul>
  3. 3. Introduction <ul><li>Internet advertising spending was estimated at over 17 billion dollars in 2006. </li></ul><ul><li>Two main types of textual Web advertising: </li></ul><ul><ul><li>Sponsored search, which serves ads in response to search queries. </li></ul></ul><ul><ul><li>Content match, which places ads on third-party pages. </li></ul></ul>
  4. 4. Introduction cont. <ul><li>Web advertising for two types of Web page: </li></ul><ul><ul><li>Static page (Offline) : ads can be matched based on prior analysis of the entire page content. </li></ul></ul><ul><ul><li>Dynamic page (Online) : ads must be matched to the page while it is being served to the end-user , which limits the time available for content analysis. </li></ul></ul>
  5. 5. Introduction cont. <ul><li>In this paper, the challenge is to find relevant ads while maintaining low latency and communication costs: </li></ul><ul><ul><li>Using text summarization techniques to extract short excerpts that are representative of the entire page content. </li></ul></ul><ul><ul><li>Using classification techniques to classify the page summaries with respect to a large taxonomy of advertising categories. </li></ul></ul><ul><ul><li>Performing page-ad matching based on both bag-of-words and classification features. </li></ul></ul>
  6. 6. Contextual Advertising Basics <ul><li>Four interactive entities: </li></ul><ul><ul><li>The publisher is the owner of the Web pages on which advertising is displayed. </li></ul></ul><ul><ul><li>The advertiser provides the supply of ads. </li></ul></ul><ul><ul><li>The ad network is a mediator between the advertiser and the publisher, and selects the ads that are placed on the pages. </li></ul></ul><ul><ul><li>End-users visit the Web pages of the publisher and interact with the ads. </li></ul></ul>
  7. 7. Overview of Ad display
  8. 8. Advertising Basics cont. <ul><li>Four pricing models: </li></ul><ul><ul><li>CPM (Cost Per Mille, i.e., cost per thousand impressions) is where advertisers pay for exposure of their message to a specific audience. </li></ul></ul><ul><ul><li>CPV (Cost Per Visitor) is where advertisers pay for the delivery of a targeted visitor to the advertiser's website. </li></ul></ul><ul><ul><li>CPC (Cost Per Click) is also known as Pay Per Click (PPC). Advertisers pay every time a user clicks on their listing and is redirected to their website. They do not pay for the listing itself, only for clicks on it. </li></ul></ul><ul><ul><li>CPA (Cost Per Action) is where advertisers pay each time an order is transacted. </li></ul></ul>
  9. 9. Overview of the Proposed Solution <ul><li>Using text summarization techniques paired with external knowledge to craft short page summaries in real time. </li></ul><ul><li>Balancing two conflicting goals: analyzing as much page content as possible for a better ad match vs. analyzing as little as possible to save transmission and analysis time. </li></ul><ul><li>External knowledge : </li></ul><ul><ul><li>URLs often contain meaningful words. </li></ul></ul><ul><ul><li>The referrer URL might contain relevant words that to some extent capture the user intent. </li></ul></ul><ul><ul><li>Page classification. </li></ul></ul>
  10. 10. Text Summarization <ul><li>Text summarization techniques are divided into extractive and non-extractive approaches. </li></ul><ul><li>The following components are considered in constructing summaries: </li></ul><ul><ul><li>Title ( T ) </li></ul></ul><ul><ul><li>Meta keywords and description ( M ) </li></ul></ul><ul><ul><li>Headings ( H ): the contents of <h1> and <h2> HTML tags. </li></ul></ul><ul><ul><li>Tokenized URL of the page ( U ) </li></ul></ul><ul><ul><li>Tokenized referrer URL ( R ) </li></ul></ul><ul><ul><li>First N bytes of the page text ( P < N >). </li></ul></ul><ul><ul><li>Anchor text of all outgoing links on the page ( A ) </li></ul></ul><ul><ul><li>Full text of the page ( F ). </li></ul></ul>
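
The fragments listed on this slide can be pulled out of a page with a small parser. The following is a minimal sketch using only the Python standard library, not the authors' implementation; the names `FragmentParser`, `tokenize_url`, and `summarize`, the 500-byte default, and the URL tokenization rule (split on non-alphanumeric characters) are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


class FragmentParser(HTMLParser):
    """Collects title (T), meta keywords/description (M), h1/h2 headings (H),
    anchor text (A), and the visible page text (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.fragments = {"T": [], "M": [], "H": [], "A": [], "text": []}
        self._open = []  # currently open tags that we track

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            a = dict(attrs)
            if (a.get("name") or "").lower() in ("keywords", "description"):
                self.fragments["M"].append(a.get("content") or "")
        elif tag in ("title", "h1", "h2", "a", "script", "style"):
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            # Remove the most recently opened matching tag.
            idx = len(self._open) - 1 - self._open[::-1].index(tag)
            del self._open[idx]

    def handle_data(self, data):
        text = data.strip()
        if not text or "script" in self._open or "style" in self._open:
            return
        if "title" in self._open:
            self.fragments["T"].append(text)
            return  # the title is not part of the visible body text
        if "h1" in self._open or "h2" in self._open:
            self.fragments["H"].append(text)
        if "a" in self._open:
            self.fragments["A"].append(text)
        self.fragments["text"].append(text)


def tokenize_url(url):
    """Split host and path into lowercase alphanumeric tokens (assumed rule)."""
    parts = urlparse(url)
    raw = (parts.netloc + parts.path).lower()
    return [t for t in re.split(r"[^a-z0-9]+", raw) if t]


def summarize(html, url, referrer, n_bytes=500):
    """Return the page fragments T, M, H, U, R, P<N>, and A as a dict."""
    parser = FragmentParser()
    parser.feed(html)
    f = parser.fragments
    body = " ".join(f["text"])
    # First N bytes of the page text; a cut multi-byte character is dropped.
    prefix = body.encode("utf-8")[:n_bytes].decode("utf-8", errors="ignore")
    return {
        "T": " ".join(f["T"]),
        "M": " ".join(f["M"]),
        "H": " ".join(f["H"]),
        "U": tokenize_url(url),
        "R": tokenize_url(referrer),
        "P": prefix,
        "A": " ".join(f["A"]),
    }
```

Only the short fragments would need to be sent for ad matching, which is the point of the approach: the full text ( F ) never has to leave the page.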
  11. 11. Text Classification <ul><li>Using a summary of the page in place of its entire content can ostensibly eliminate some information. </li></ul><ul><li>To alleviate the harmful effects of summarization, they study the effects of using text classification. </li></ul><ul><ul><li>They classify both page excerpts and ads with respect to a taxonomy and use classification-based features to augment the original bag of words. </li></ul></ul>
  12. 12. Choice of Taxonomy <ul><li>Taxonomy: they employ a large taxonomy of approximately 6,000 nodes, arranged in a hierarchy with median depth 5 and maximum depth 9. </li></ul><ul><li>Human editors populated the taxonomy with labeled bid phrases of ads (approx. 150 phrases per node). </li></ul>
  13. 13. Classification Method <ul><li>For each taxonomy node, they concatenated all the phrases associated with this node into a single meta-document. </li></ul><ul><li>Then, they computed a centroid for each node by summing up the TFIDF values of individual terms, and normalizing by the number of phrases in the class: </li></ul><ul><li>$\vec{c}_j = \frac{1}{|C_j|} \sum_{\vec{p} \in C_j} \frac{\vec{p}}{\|\vec{p}\|}$ </li></ul><ul><li>where, </li></ul><ul><ul><li>$\vec{c}_j$ is the centroid for class $C_j$ and $\vec{p}$ iterates over the phrases in the class. </li></ul></ul>
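
The centroid computation on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the whitespace tokenizer, the idf formula `log(N / df) + 1`, and the function names `tfidf_vectors` and `centroid` are assumptions, and length-normalizing each phrase vector before averaging is a choice made here.

```python
import math
from collections import Counter, defaultdict


def tfidf_vectors(phrases):
    """Turn each phrase into a sparse tf-idf vector (dict term -> weight).

    Assumption: each phrase counts as one document and
    idf = log(N / df) + 1, so terms shared by all phrases keep weight > 0.
    """
    docs = [Counter(p.lower().split()) for p in phrases]
    df = Counter(term for d in docs for term in d)
    n = len(docs)
    return [
        {t: tf * (math.log(n / df[t]) + 1.0) for t, tf in d.items()}
        for d in docs
    ]


def centroid(vectors):
    """Sum the length-normalized vectors and divide by their count."""
    total = defaultdict(float)
    for v in vectors:
        norm = math.sqrt(sum(w * w for w in v.values()))
        if norm == 0.0:
            continue  # skip empty phrases
        for t, w in v.items():
            total[t] += w / norm
    return {t: w / len(vectors) for t, w in total.items()}


# Hypothetical bid phrases attached to one taxonomy node.
node_centroid = centroid(tfidf_vectors(["red shoes", "red boots"]))
```

Terms that recur across the phrases of a node (here `red`) end up with the largest centroid weight, which is what lets the centroid stand in for the whole class.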
  14. 14. Classification Method cont. <ul><li>The classification is based on the cosine of the angle between the document and the centroid meta-document: </li></ul><ul><li>$\cos(\vec{c}, \vec{d}) = \frac{\sum_{i \in F} c_i d_i}{\sqrt{\sum_{i \in F} c_i^2}\,\sqrt{\sum_{i \in F} d_i^2}}$ </li></ul><ul><li>where, </li></ul><ul><ul><li>$F$ is the set of features </li></ul></ul><ul><ul><li>$c_i$ and $d_i$ represent the weight of the $i$th feature in the class and the document. </li></ul></ul>
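
A nearest-centroid classifier based on cosine similarity can be sketched as follows. The sparse-dict representation, the function names `cosine` and `classify`, the `top_k` parameter, and the toy class names are assumptions for illustration only.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two sparse vectors (dict feature -> weight)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(doc, centroids, top_k=1):
    """Return the names of the top_k classes whose centroids are closest to doc."""
    ranked = sorted(
        centroids,
        key=lambda name: cosine(doc, centroids[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical centroids for two taxonomy nodes.
centroids = {
    "travel": {"flight": 1.0, "hotel": 1.0},
    "autos": {"car": 1.0, "engine": 1.0},
}
print(classify({"flight": 2.0, "car": 0.5}, centroids))
```

Returning several top classes instead of one is a natural extension when the class scores are later used as features rather than as a single label.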
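  15. 15. Using Classification Features & Ad Retrieval Function <ul><li>Each page and each ad was represented as a bag of words (BOW) and as an additional vector of classification features. </li></ul><ul><li>The ad retrieval function was formulated as a linear combination of similarity scores based on both BOW and classification features : </li></ul><ul><li>$\mathrm{score}(p, a) = \alpha \cdot \mathrm{sim}_{\mathrm{BOW}}(p, a) + \beta \cdot \mathrm{sim}_{\mathrm{class}}(p, a)$, where $p$ is the page, $a$ is the ad, and $\alpha$ and $\beta$ weight the two similarity scores. </li></ul>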
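
The linear combination of the two similarity scores can be sketched as below. This is an assumption-laden illustration: cosine similarity is used for both components, the weights `alpha` and `beta` default to arbitrary values, and the `bow` / `cls` dictionary layout and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors (dict feature -> weight)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def score(page, ad, alpha=0.5, beta=0.5):
    """alpha * BOW similarity + beta * classification-feature similarity.

    The default weights are placeholders, not values from the paper.
    """
    return (alpha * cosine(page["bow"], ad["bow"])
            + beta * cosine(page["cls"], ad["cls"]))


def rank_ads(page, ads, alpha=0.5, beta=0.5, k=3):
    """Return the k best (ad_id, score) pairs, highest score first."""
    scored = [(ad_id, score(page, ad, alpha, beta)) for ad_id, ad in ads.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy page summary and ads with hypothetical taxonomy classes.
page = {"bow": {"ski": 1.0, "resort": 1.0}, "cls": {"travel/winter": 1.0}}
ads = {
    "a1": {"bow": {"ski": 1.0, "boots": 1.0},
           "cls": {"travel/winter": 0.8, "apparel": 0.2}},
    "a2": {"bow": {"car": 1.0}, "cls": {"autos": 1.0}},
}
print(rank_ads(page, ads))
```

Setting `beta` to zero reduces the function to plain bag-of-words matching, which makes it easy to compare the two settings on the same data.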
  16. 16. Dataset <ul><li>From 12,000 human judgments (page-ad pairs): </li></ul><ul><ul><li>Dataset 1 consists of 105 Web pages that are accessible through a major search engine. </li></ul></ul><ul><ul><ul><li>2680 ads and 2946 page-ad scores (some ads have been scored for more than one page). </li></ul></ul></ul><ul><ul><ul><li>The classification precision was 70% for the pages and 86% for the ads. </li></ul></ul></ul><ul><ul><li>Dataset 2 consists of 827 pages from publishers that are not found in the search engine index. </li></ul></ul><ul><ul><ul><li>5056 unique ads. </li></ul></ul></ul>
  17. 17. Evaluation Metrics <ul><li>Precision </li></ul><ul><li>MAP (Mean Average Precision) </li></ul><ul><li>bpref-10 (Buckley et al., SIGIR’04) </li></ul><ul><ul><li>The idea is to measure the effectiveness of a system on the basis of judged documents only. </li></ul></ul><ul><ul><li>Since the scores for MAP and P@(N) are completely determined by the ranks of the relevant documents in the result set, these measures make no distinction in pooled collections between documents that are explicitly judged as nonrelevant and documents that are assumed to be nonrelevant because they are unjudged. </li></ul></ul>
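
Precision at N and MAP follow their standard definitions, which can be sketched as below. The function names and the toy rankings are hypothetical; note that both measures treat every document outside the relevant set as nonrelevant, whether it was judged or not.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at the rank of each relevant result.

    Relevant items that are never retrieved contribute zero.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, d in enumerate(ranked, start=1):
        if d in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs):
    """Average of average_precision over (ranked, relevant) pairs, one per page."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy example: ads ranked for two pages, with the relevant ads for each.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
print(mean_average_precision(runs))
```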
  18. 18. bpref-10 <ul><li>The preference measure is a function of the number of times judged nonrelevant documents are retrieved before relevant documents. </li></ul><ul><li>Formulation: </li></ul><ul><ul><li>Naïve: Simple counts of the number of judged nonrelevant documents retrieved before some relevant document are poor because the score depends on the absolute numbers of relevant and judged nonrelevant documents. </li></ul></ul><ul><ul><li>For a topic with R relevant documents, where r is a relevant document and n is a member of the first R judged nonrelevant documents (the first 10 + R for bpref-10): </li></ul></ul><ul><ul><li>bpref: $\mathrm{bpref} = \frac{1}{R} \sum_{r} \left(1 - \frac{|n \text{ ranked higher than } r|}{R}\right)$ </li></ul></ul><ul><ul><li>bpref-10: $\mathrm{bpref\text{-}10} = \frac{1}{R} \sum_{r} \left(1 - \frac{|n \text{ ranked higher than } r|}{10 + R}\right)$ </li></ul></ul>
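
The preference measure can be sketched directly from its definition. This is a minimal sketch, not the reference evaluation tool: the function name `bpref` and its `extra` parameter (0 for bpref, 10 for bpref-10) are assumptions, and unjudged documents are simply skipped.

```python
def bpref(ranked, relevant, nonrelevant, extra=0):
    """bpref (extra=0) or bpref-10 (extra=10) over judged documents only.

    ranked      -- retrieved document ids, best first
    relevant    -- set of ids judged relevant (R = len(relevant))
    nonrelevant -- set of ids judged nonrelevant
    Each retrieved relevant document is penalized by the share of the first
    R + extra judged nonrelevant documents that are ranked above it.
    """
    r_count = len(relevant)
    if r_count == 0:
        return 0.0
    limit = r_count + extra
    nonrel_seen = 0
    total = 0.0
    for doc in ranked:
        if doc in nonrelevant:
            nonrel_seen += 1
        elif doc in relevant:
            total += 1.0 - min(nonrel_seen, limit) / limit
        # documents in neither set are unjudged and ignored
    return total / r_count


# Toy ranking: n* judged nonrelevant, r* judged relevant, u1 unjudged.
ranking = ["n1", "r1", "u1", "r2", "n2", "r3"]
rel = {"r1", "r2", "r3"}
nonrel = {"n1", "n2"}
print(bpref(ranking, rel, nonrel))            # bpref
print(bpref(ranking, rel, nonrel, extra=10))  # bpref-10
```

Moving the unjudged document `u1` anywhere in the ranking leaves both scores unchanged, which is the property that separates this measure from MAP and P@(N).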
  19. 19. The effect of Focused Page Analysis <ul><li>FullText( F ), AnchorText( A ), First 500 bytes( P500 ), MetaData( M ), Headings( H ), Title( T ), PageURL( U ), ReferrerURL( R ) </li></ul>
  20. 20. The contribution of individual fragments <ul><li>FullText( F ), AnchorText( A ), First 500 bytes( P500 ), MetaData( M ), Headings( H ), Title( T ), PageURL( U ), ReferrerURL( R ) </li></ul>
  21. 21. Precision-Recall tradeoff
  22. 22. Incremental Addition of Information
  23. 23. The Effect of Classification
  24. 24. Conclusion <ul><li>They presented a new methodology for contextual Web advertising in real time. </li></ul><ul><ul><li>They focused on the contributions of the different fragments of the pages. </li></ul></ul>