Inform: Targeting the Interest Graph


Personalization of content and ad selection using the Inform Service

Published in: Technology, Education

  • Content / Activity Ingestion: diversity of content sources; data / activity ingestion. Occam: Big Data processing and scale; search, storage, archive; text analysis for categorization and organization; algorithms drive content discovery; intersection of content and activity data yields trends and personalization. Content Distribution: dynamic content assignment and publishing; cross-platform publishing via apps and APIs; emphasis on integration with emerging data and content standards.
  • Inform: Targeting the Interest Graph

    1. Targeting the Interest Graph: Personalization of content and ad selection using the Inform Service. Marc Hadfield, CTO, Inform. Semantic Technology Conference, 2011
    2. Introduction
        • Marc Hadfield is CTO of Inform Technologies. Interests: Natural Language Processing, Semantics, Life Science, Graph Algorithms, Machine Learning, Big Data.
        • Inform Technologies is a semantic technology company. Inform provides semantic technology (NLP and analytics) to publishers and operates a user-generated forum site.
        • We at Inform have been extending our technology to the user-generated content space, adapting it to different kinds of content such as informal text, photos, videos, and questions.
        • We’ve recently addressed Ad Selection, Video Selection, and Personalization.
        • I’ll discuss some of our results with the Interest Graph.
    3. Inform Service: Semantic Software-as-a-Service for Publishers
        • Advantage: ~30% boost in engagement on “traditional” publisher websites.
        • Tracks 4,000+ Subjects and 320,000+ Entities: Inform Topics
        • Inform Service:
            – In-Article links to Topic Pages
            – Related Articles from the Archive
            – Related Articles around the Web
            – Related Photos
            – Related Videos
            – Topic Pages including mix of content sources
            – Tools (Publishing Tools, etc.)
    4. Inform Publisher Customers
    5. Yuku Forums
        • Forum Content
            – “Old School” user generated content
            – ~40,000 forums
            – Top 100 forums account for about 50% of traffic
            – ~1 Billion short form content pieces
            – ~1 Million monthly unique users
            – ~150K new content objects per day
            – ~1 Million Page Views per Day
        • Subscription / Advertising Revenue
        • Inform is adapting / integrating our Semantic Tech
        • Great laboratory for testing algorithms / theories
            – Apply more broadly than Yuku platform
        • Nice A/B testing environment
        • Testing new algorithms on our ForumFind search engine
            – And embedded widgets in Yuku
        • Good reason to improve Ad Selection
    6. Today: Personalization for Enhanced Targeting
        • Capturing the Interest Graph
        • Personalized experience
            – Help People find interesting content
            – Make Ads relevant
        [Diagram label: Occam]
    7. Inform Content & Analytics Platform
        [Diagram] Inputs: Licensed / 3rd Party Content, Crawled Content, Activity Data → Content / Data Ingestion → Occam Core Engine (Text Analysis, Algorithms, Categorization / Personalization) → Content Distribution (Publisher site, Yuku, Widgets)
    8. Inform “Occam” Architecture. Example Workflow:
        • Receive
            – REST Webservice Call
            – Queue Message
        • Extract
            – Get URL
            – Extract Document Features
            – Extract Text
        • NLP
            – NLP Features (Machine Learning)
            – Inference Engine (Prolog / Frame Logic)
            – Discourse / Behavior / Sentiment Models (Prolog / Frame Logic) (New)
        • Analysis
            – Trend Analysis (incremental data)
            – Graph Analysis (incremental data)
            – Store in Semantic Repository (if needed)
        • Reply
            – Send Reply Message (via Queue or Webservice)
    9. Inform API
        • REST Based
        • Queue for high volume content exchange
        • Returns data in RDF, XML, or JSON
        • All Content has a URI
        • All Inform Topics have URIs (can be dereferenced)
        • Insert Content, Update Content, Delete Content
        • Login / Logout
        • Change Status of Content (Published, Unpublished)
        • Content can be retrieved via “GET”
            – Associated Topics (Subjects and Entities) returned
            – Include scores
        • Search Inform Topics
        • Semantic Search
            – Simplified queries (not full SPARQL)
            – Typical Query: Get Content of Type “Article” about “Barack Obama” ranked by score
    10. Inform API (2)
        • Related Content
            – Articles, Messages, Photos, Videos, Questions, Web
        • AdContext™ (new)
            – URL → IAB Topics + Inform Topics
        • VideoContext™ (new)
            – URL → Inform Topics
            – Related Videos
        • InterestGraph (new)
            – Parameters: user-id / session-id → Inform Topics
        • Personalized AdContext™ (new)
            – URL + session-id / user-id (anonymized) → IAB Topics + Inform Topics
    11. AdContext™: IAB Ad Standards
        • IAB (Interactive Advertising Bureau) standard for returning a set of metadata about a website, webpage, or section of a webpage to assist advertising within web content.
        • Defines how a Topic may be associated with web content.
        • Defines a set of standard upper-level Topics (tier-1) such as “Science”, “Sports”, and “Business”, and mid-level Topics (tier-2) such as “Golf” and “Fashion”.
        • Inform has aligned the IAB Topics with Inform’s Topics.
        • Inform can deliver more specific Topics (the full set of Inform Topics) as “tier-3” IAB Topics.
        • The AdContext™ service returns this metadata. Ad Networks may use the service to assist in ad selection.
        • Semantic Ad Selection may improve yield 2X – 5X (as per various external studies).
    12. Aside: rNews RDFa Standard
        • rNews: embedding metadata in online news
        • rNews is a proposed standard for using RDFa to annotate news-specific metadata in HTML documents. The rNews proposal has been developed by the IPTC, a consortium of the world’s major news agencies, news publishers and news industry vendors. rNews is currently in draft form and the IPTC welcomes feedback on how to improve the standard in the rNews Forum.
        • Why? SEO, Rich Snippets, reduced “scraper” error, better metadata.
        • The Inform API returns rNews metadata ready to embed in news articles (in testing).
    13. Publisher Customer Example:
        • Inform automatically tags entities (people, places, companies, and organizations) and provides related topics, articles, and media.
        • The Related News Widget pulls in the most relevant and recent articles from within the New York Daily News Archive.
    14. Customer Example:
        • Inform also generates highly engaging and relevant slideshows.
        • Inform’s tags can be brought together in numerous ways to create a richer experience for consumers.
    15. Demo Inform API w/Facebook: How to connect Inform to the social graph?
    16. Demo Inform API w/Facebook
    17. Demo Inform API w/Facebook
    18. Demo Inform API w/Facebook: Inform Topics mapped to Wikipedia Pages and thus to other Concepts, including the Facebook “Like” Graph
    19. Interest Graph
        • Inform Topics
            – 4,000+ Subjects in Hierarchy (SKOS)
            – 320,000+ Entities
            – Wikipedia Pages
            – Wikipedia Categories
            – Inform “same-as” links to Wikipedia
        • ~1 Billion content pieces total
            – Forum Messages, Replies, Photos, Videos
        • 150K new content pieces per day
        • 1 Million+ PageViews per Day
        • 1 Million+ Monthly Unique Users
        • ~5 Million ads served per Day
        Goal: Link Users to Topics for selection of content and ads
    20. Personalization Signals
        • Content is “about” a Topic (subject or entity)
        • User submits Content (“write”)
            – Message, Reply, Photo, Video, Question, …
        • User reads Content (“view”)
            – Message, Reply, Photo, Video, Question
        Trends / Global Aggregation:
        • Importance Metric
        • Bursty / Velocity
        • Sentiment (“:-)”, “LOL”, …)
            – “Like” the topic? “Dislike” the topic? Context?
            – e.g. a user dislikes a Football Team, so “likes” to hear when they lose (negative sentiment)
        • Other features…
    21. Interest Graph Algorithms
        Criteria:
        • Near Real-Time
        • Highly parallel to allow for scaling
        • Fuzzy Data, Flexible data model
        Implementation:
        • General Graph Representation
            – Node Weights, Edge Weights, Node Types, Edge Types
        • Graph walk to extract a User’s Interest Graph
        • Parallel Message-Passing Algorithms for Graph Analysis
            – Importance, PageRank, Centrality
            – Spreading Activation
            – Pregel-like implementation (Signal/Collect)
        • Add Graph Analytics to Workflow
    22. Neighborhood around JJB User
    23. Niketalk User Interest Graph (local): without global importance metric
    24. Niketalk User Interest Graph (global): with global importance metric. Recommendations can be made reflecting the shifting interests of the global community.
    25. Example Yuku Forum - Gymnastics
    26. ForumFind – “laboratory”
    27. ForumFind – Topic, Ad, Content
    28. ForumFind – MyForumFind (user: jjb2)
    29. Interest Graph – User Insights
        • “Everybody Lies” (“House” TV Show)
            – The only way to know a user’s interests is to have an implicit channel that detects interests without affecting user behavior
        • People have broad / dynamic interests
        • People read “trash”
            – e.g. everyone reads Celebrity Gossip
            – If convenient / no one is looking
        • Global Data can be used to make recommendations
            – No surprise, but nice to have confirmation
        • People move on
            – “Likes” need to expire
        • Recommendations for content and ads can be implemented in a highly dynamic and parallel fashion, running in real time with reasonable resources, using graph analysis
    30. Interest Graph – Conclusion
        • Using a User’s Graph of Interests can dramatically improve the user’s engagement
            – Data still being gathered within Inform as to percentage increase, but so far very encouraging numbers!
        • The Inform Service can be used to implement a more personalized content and ad experience with minimal implementation effort.
        • Talk to me about using our API!
    31. Thank You! Questions? Marc Hadfield, CTO, Inform Technologies
    32. Example CMS Integration
    33. Published Article:
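
The Personalization Signals, Interest Graph Algorithms, and User Insights slides describe “write” / “view” signals, “Likes” that need to expire, and spreading activation with a Signal/Collect style of message passing. The Python sketch below shows one way those pieces might fit together. It is only an illustration, not Inform's implementation: the signal weights, the 30-day half-life, the damping factor, the step count, and every function name are assumptions that the slides do not specify.

```python
from collections import defaultdict

# Assumed weights: the slides name "write" and "view" signals but give no numbers.
SIGNAL_WEIGHT = {"write": 3.0, "view": 1.0}
# Assumed half-life: the slides only say that "Likes" need to expire.
HALF_LIFE_DAYS = 30.0


def decayed_weight(signal, age_days):
    """Weight of one user signal, halved every HALF_LIFE_DAYS."""
    return SIGNAL_WEIGHT[signal] * 0.5 ** (age_days / HALF_LIFE_DAYS)


def seed_activation(events):
    """Sum decayed signal weights per topic for one user.

    events: iterable of (topic, signal, age_days) tuples.
    """
    seeds = defaultdict(float)
    for topic, signal, age_days in events:
        seeds[topic] += decayed_weight(signal, age_days)
    return dict(seeds)


def spread_activation(seeds, edges, damping=0.5, steps=2):
    """Propagate activation from seed topics along weighted topic edges.

    edges: dict mapping topic -> list of (neighbor, edge_weight).
    In each step, every node on the frontier sends
    damping * activation * edge_weight to its neighbors ("signal"),
    and each receiving node adds the sum to its total ("collect").
    """
    total = dict(seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        incoming = defaultdict(float)
        for node, activation in frontier.items():
            for neighbor, weight in edges.get(node, []):
                incoming[neighbor] += damping * activation * weight
        for node, value in incoming.items():
            total[node] = total.get(node, 0.0) + value
        frontier = dict(incoming)
    return total


def top_topics(activation, k=3):
    """Return the k highest-scoring (topic, score) pairs, ties broken by name."""
    return sorted(activation.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical user history and topic links, for illustration only.
events = [
    ("Gymnastics", "write", 0),         # posted today
    ("Gymnastics", "view", 30),         # read a month ago
    ("Celebrity Gossip", "view", 60),   # read two months ago
]
edges = {
    "Gymnastics": [("Olympics", 1.0)],
    "Olympics": [("Sports", 0.5)],
}

seeds = seed_activation(events)
interest_graph = spread_activation(seeds, edges)
print(top_topics(interest_graph))
```

In this sketch an old “view” fades toward zero rather than being deleted outright, and topics the user never touched (“Olympics”, “Sports”) still receive a score through the topic links, which is the kind of result a content or ad selector could rank on. A global importance metric, as contrasted on the Niketalk slides, could be folded in by scaling the edge weights or the final scores.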