Future of Search | Yury Lifshits, Yahoo! Research


Usage Rights

CC Attribution License


Presentation Transcript

  • Future of Search. Yury Lifshits, Yahoo! Research, http://yury.name. St. Petersburg | Helsinki, December 2008
  • Outline
    • Structured Search
    • Yahoo! Work in Search
      • SearchMonkey
      • BOSS
    • Research Agenda
  • Structured Search: work in progress
  • Structured Search = Bring structured data to search users
    M.K. Bergman. The Deep Web: Surfacing Hidden Value. 2001.
  • Value Proposition
    • Coverage
      • Real-time data
      • Semi-private data
    • Structured queries
    • Ordering and filtering results
    • Straight-to-answers
  • User Interface: Query
    • Search assist: Yahoo!
    • Selector: LinkedIn, VKontakte.ru
    • Multiple search buttons: Gmail
    • Search tabs: Yahoo / Google
  • User Interface: Results
    • Federated page
    • Facets
    • Search transfer / search form
    K.P. Yee, K. Swearingen, K. Li, M. Hearst. Faceted metadata for image search and browsing. CHI 2003.
    Fernando Diaz. Aggregation of News Content Into Web Results. WSDM 2009.
    http://glue.yahoo.com
    http://au.alpha.yahoo.com
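The "Facets" idea on the results slide can be illustrated with a minimal sketch: count how many results carry each value of a field, then narrow the result list when the user picks one value. The record fields and sample data below are hypothetical and chosen only for illustration; they do not come from the slides.

```python
from collections import Counter

# Hypothetical structured results; field names and values are illustrative only.
RESULTS = [
    {"title": "HEL to LED, morning", "type": "flight", "carrier": "Finnair"},
    {"title": "HEL to LED, evening", "type": "flight", "carrier": "Rossiya"},
    {"title": "Search meetup", "type": "event", "city": "Helsinki"},
    {"title": "Ranking method", "type": "patent"},
]


def facet_counts(results, field):
    """Count results per value of `field`, skipping results that lack the field."""
    return Counter(r[field] for r in results if field in r)


def filter_by_facet(results, field, value):
    """Keep only the results whose `field` equals `value`."""
    return [r for r in results if r.get(field) == value]


if __name__ == "__main__":
    print(facet_counts(RESULTS, "type"))
    print(filter_by_facet(RESULTS, "type", "flight"))
```

A faceted interface would typically show the counts next to each value and re-run the filter on every click.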
  • Data Supply Chain
    • Atomic fact: Flight, Event, Patent
    • Data aggregator: US Patents, Amadeus/Sabre flights, Upcoming.com
    • Domain search: Expedia, Spock
    • General purpose search: Yahoo!, Google, Yandex, Baidu
  • Getting structured data
    • Entity extraction
    • Markup
    • Feeds
    • Search API (OpenSearch)
    • OR
    • Do a search transfer
  • Give Us Your Data For …
    • Traffic via search transfer: Firefox search box
    • Better presentation in search: SearchMonkey
    • Hosted search: BOSS Custom
    • Showing your ads: Yahoo Local + AT&T
  • Yahoo! Work in Search
  • Slides by: Paul Tarjan, Chief Technical Monkey ([email_address]). Full version: http://www.slideshare.net/ptarjan/searchmonkey-presentation
  • What is SearchMonkey? An open platform for using structured data to build more useful and relevant search results. [Figure: a search result before and after]
  • Enhanced Result: Zagat. [Figure labels: Key/Value Pairs or Abstract, Links, Image]
  • Infobar: Wikipedia. [Figure labels: Preview, Summary, Blob]
  • Creating an Infobar
    • Infobar advantages
      • Annotate someone else’s site
      • Use links and images from other domains
        • Mash up info from multiple sites
        • Affiliate / coupon links? Hmmm…
      • Can act on *, all websites
        • But these apps can be annoying if poorly designed
    • Key design principles
      • Put something useful in the summary
      • Be creative with the HTML
  • How to get data to SearchMonkey?
    • Humans see:
      • name
      • picture of a person
      • current job
      • industry, …
    • Computers see:
      • an undifferentiated blob of HTML
    • Can we make computers smarter?
  • How does it work?
    • Step 1: site owners/publishers share structured data with Yahoo!
    • Step 2: site owners & third-party developers build SearchMonkey apps
    • Step 3: consumers customize their search experience with Enhanced Results or Infobars
    • Diagram labels: Acme.com’s database, Acme.com’s Web Pages, Index, RDF/Microformat Markup, DataRSS feed, Web Services, Page Extraction
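To make the "Humans see / Computers see" contrast concrete, here is a minimal sketch of how microformat-style markup lets a program recover named fields from what would otherwise be a blob of HTML. It uses only Python's standard `html.parser`. The sample page and the choice of the hCard class names `fn`, `title` and `org` are assumptions made for illustration; this is not SearchMonkey's actual extraction code.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect text of elements whose class attribute names a wanted property.

    Simplification: assumes every start tag has a matching end tag, so pages
    with void elements such as <img> would need extra handling.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.stack = []   # one entry per open element: property name or None
        self.fields = {}  # property name -> extracted text

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.stack.append(next((c for c in classes if c in self.wanted), None))

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        # Only text directly inside a wanted element is recorded.
        if text and self.stack and self.stack[-1]:
            prop = self.stack[-1]
            self.fields[prop] = (self.fields.get(prop, "") + " " + text).strip()


def extract(html, wanted):
    """Return a dict of the wanted properties found in `html`."""
    parser = ClassTextExtractor(wanted)
    parser.feed(html)
    parser.close()
    return parser.fields


# Hypothetical profile page fragment using hCard-style class names.
PAGE = """
<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="title">Engineer</span> at
  <span class="org">Acme</span>
</div>
"""

if __name__ == "__main__":
    print(extract(PAGE, ["fn", "title", "org"]))
```

Without the class annotations the same page would yield only the undifferentiated text "Jane Doe, Engineer at Acme"; with them, each value arrives labeled.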
  • SearchMonkey Resources
    • Main:
      • http://developer.yahoo.com/searchmonkey
    • Lists and forums:
      • [email_address]
      • http://suggestions.yahoo.com/searchmonkey
  • Vik Singh (Architect), Graham Mudd (Senior PMM)
  • BOSS = Build your Own Search Service
    • Open Yahoo’s core search features via web services to let 3rd parties revolutionize Search
    • What: Unrestricted
    • Unlimited queries
    • Blend, re-order, discard
    • Full presentation control
    • Non-search apps OK
    • Monetization: Free or CPM or Ads
    • Barriers to entry are massive
    • $300M, top talent, a prayer to get to basic parity
    • No monopoly over great ideas
    • Search anywhere
    • Improve Vertical Quality w/ Web comprehensiveness
    • Fragment the market, foster more players, choice, competition
    • Yahoo extends advertising reach, 3rd parties revenue share
  • Why Traditional Search Distribution + BOSS Distribution
  • Tracks
    • API: A self-service, web services model for developers and start-ups to quickly build and deploy new search experiences.
    • CUSTOM: Working with 3rd parties to build a more relevant, brand/site specific web search experience. This option is jointly built by Yahoo! and select partners.
      • Interested in Custom? Email us: [email_address]
    • ACADEMIC: Working with the following universities to allow for wide-scale research in the search field:
      • UIUC
      • CMU
      • Stanford
      • Purdue
      • IIT Bombay
      • MIT
      • UMass
  • BOSS API v1
    • http://boss.yahooapis.com/ysearch/{vert}/v1/{q}
    • {vert} := {web, news, images, spelling}
    • Required: appid
    • Optional (Y!OS compliant): start, count, lang, region, format, callback, sites
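The URL template on this slide can be turned into a small helper. The sketch below only builds the request URL string and makes no network call. The vertical names and parameter names are taken from the slide; passing the parameters as a query string, percent-encoding the query inside the path, and the `boss_url` function name itself are assumptions for illustration.

```python
from urllib.parse import quote, urlencode

BASE = "http://boss.yahooapis.com/ysearch"
VERTICALS = {"web", "news", "images", "spelling"}
OPTIONAL = {"start", "count", "lang", "region", "format", "callback", "sites"}


def boss_url(vertical, query, appid, **options):
    """Build a BOSS v1 style URL from the template {vert}/v1/{q}.

    Assumption: appid and the optional parameters travel in the query string.
    """
    if vertical not in VERTICALS:
        raise ValueError("unknown vertical: %r" % vertical)
    unknown = set(options) - OPTIONAL
    if unknown:
        raise ValueError("unknown parameters: %s" % ", ".join(sorted(unknown)))
    # appid first, then the optional parameters in a stable alphabetical order.
    params = [("appid", appid)] + sorted(options.items())
    return "%s/%s/v1/%s?%s" % (BASE, vertical, quote(query, safe=""), urlencode(params))


if __name__ == "__main__":
    print(boss_url("web", "future of search", "MY_APPID", count=10, format="json"))
```

Validating the vertical and parameter names up front keeps typos from silently producing a URL the service would not understand.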
  • BOSS Mashup Framework
    • Python (v2.5+) library
    • BOSS Search SDK plus …
      • SQL for remixing arbitrary XML/JSON sources
      • Loosely functional programming paradigm
  • BMF + Google App Engine
    • Ported enhanced version of BMF to GAE platform
    • http://zooie.wordpress.com/2008/08/04/yahoo-boss-google-app-engine-integrated/
    • Easiest way to deploy a BOSS application online
  • Examples
    • http://www.4hoursearch.com
    • http://123people.com
    • Mashable! Contest for BOSS search engines: http://mashable.com/boss/
  • BOSS Custom for TechCrunch
  • TechCrunch Network Search
    • CrunchBase + Posts + Web
    • Sort by time / relevance
    • Enhanced results
    • Domain-specific facets
    • Yahoo! sponsored search
    • Real-time indexing
    • Special results
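The combination of several sources with a choice of sort order, as in "CrunchBase + Posts + Web" and "Sort by time / relevance", echoes the "Blend, re-order, discard" freedom BOSS grants. A minimal sketch of that idea follows. The record fields (`url`, `relevance`, `time`), the sample data, and the `blend` function are hypothetical; this is not the BOSS Mashup Framework API.

```python
from datetime import date

# Hypothetical per-source result lists; all values are illustrative only.
SOURCES = {
    "crunchbase": [
        {"url": "cb/acme", "relevance": 0.9, "time": date(2008, 1, 10)},
    ],
    "posts": [
        {"url": "tc/post-1", "relevance": 0.7, "time": date(2008, 11, 30)},
        {"url": "tc/post-2", "relevance": 0.4, "time": date(2008, 12, 5)},
    ],
    "web": [
        {"url": "example.com/spam", "relevance": 0.8, "time": date(2008, 6, 1)},
    ],
}


def blend(sources, sort_by="relevance", blocked=()):
    """Merge results from all sources, drop blocked URLs, sort descending."""
    merged = []
    for name, results in sources.items():
        for r in results:
            if r["url"] in blocked:
                continue  # discard
            merged.append(dict(r, source=name))  # blend, remembering the origin
    # re-order: highest relevance or newest first, depending on sort_by
    return sorted(merged, key=lambda r: r[sort_by], reverse=True)


if __name__ == "__main__":
    for r in blend(SOURCES, "time"):
        print(r["source"], r["url"], r["time"])
```

Tagging each record with its source lets the presentation layer style CrunchBase entries, posts, and web results differently.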
  • Research Agenda
  • Structured Search
    • Analysis of search demand
      • Intent classification
      • General search vs. vertical
    • Incentives in data supply
    • Push & real-time indexing
    • Search user interface
      • One box vs. multi-box
      • General vs. vertical
    • Deciding search transfer
      • When?
      • To whom?
  • Key Scientific Challenges (draft: http://research.yahoo.com/ksc)
    • Search intent
    • Quality metrics
    • Web mining
    • Multilingual IR
    • Nextgen search
      • Synthesized result pages
    • World knowledge
    A.Z. Broder. Taxonomy of web search. SIGIR 2002.
  • More Problems
    • Discovery search
    • Web search vs. asking people
    • Event search
  • Thanks for your attention!
    • Yury Lifshits
    • http://yury.name
    • [email_address]