BOSS Open Hack Day, Bangalore

An introduction to the BOSS API

Published in: Education

  1. Build your Own Search Service • Chris Heilmann, Saurabh Sahni • Open Hack Day 2009 - Bangalore • http://www.slideshare.net/saurabhsahni/
  2. Outline • Search engines using BOSS • About BOSS API – What? – Why? – Features • How to use it – BOSS API – Code example – BOSS Mashup framework
  3. Search engines using BOSS
  4. hakia: http://hakia.com/
  5. hakia: http://hakia.com/
  6. hakia: http://hakia.com/
  7. Cluuz: http://cluuz.com
  8. Cluuz: http://cluuz.com
  9. Cluuz: http://cluuz.com
  10. Keyword finder: http://keywordfinder.org/
  11. askBOSS: http://ask-boss.appspot.com/
  12. askBOSS: http://ask-boss.appspot.com/
  13. askBOSS: http://ask-boss.appspot.com/
  14. askBOSS: http://ask-boss.appspot.com/
  15. askBOSS: http://ask-boss.appspot.com/
  16. About BOSS API
  17. What? • Open Yahoo’s core search features via web services to let 3rd parties revolutionize Search • http://developer.yahoo.com/search/boss
  18. Opening the search technology stack [diagram labels: Rank, Assist, EXTRACT, Retrieve, SPAM <-> Gold, Usage, CRAWL, Web Map, Analyze, Index] • 50B pages * 20ms page download = 31 years
  19. Opening the search technology stack [diagram labels: Your App here, WEB API, Rank, Assist, EXTRACT, Retrieve, SPAM <-> Gold, Usage, CRAWL, Web Map, Analyze, Index] • 50B pages * 20ms page download = 31 years
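
The crawl estimate on the stack slides can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the pages are downloaded strictly one after another and uses a 365-day year, neither of which the slides state.

```python
# Back-of-the-envelope check of "50B pages * 20ms page download = 31 years".
# Assumptions (not from the slides): sequential downloads, 365-day year.
PAGES = 50_000_000_000                  # 50 billion pages
SECONDS_PER_PAGE = 0.020                # 20 ms per page download
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000 seconds

total_seconds = PAGES * SECONDS_PER_PAGE
total_years = total_seconds / SECONDS_PER_YEAR

print(f"{total_seconds:.0f} seconds is about {total_years:.1f} years")
```

The total comes to one billion seconds, a little under 32 years, which agrees with the slide's rounded figure.
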
  20. Why? • Removes entry barriers • Asset to Innovate – Develop new relevance models – Change presentation style • Search anywhere – Improve Vertical Quality with Web comprehensiveness
  21. BOSS API features • No branding or attribution • Ability to change presentation style • Ability to re-order results and blend in additional content • Access to multiple verticals (web search, image, news) • Keyword suggestions, spell checks • Semantic data, in-links, abstracts • Ability to monetize
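
To make "re-order results and blend in additional content" concrete, here is a minimal sketch in plain Python. It does not call BOSS at all: the result dictionaries, their `title` and `url` keys, and the helper names `rerank` and `blend` are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlparse

def rerank(results, preferred_domains):
    """Move results from preferred domains to the top, keeping the
    original order within each group (sorted() is stable)."""
    def is_preferred(result):
        host = urlparse(result["url"]).hostname or ""
        return any(host == d or host.endswith("." + d) for d in preferred_domains)
    # False sorts before True, so preferred results come first.
    return sorted(results, key=lambda r: not is_preferred(r))

def blend(results, extra, position):
    """Insert additional content (for example, your own listings) at a position."""
    return results[:position] + extra + results[position:]

# Hypothetical data standing in for parsed search results.
results = [
    {"title": "A", "url": "http://example.org/a"},
    {"title": "B", "url": "http://www.imdb.com/b"},
    {"title": "C", "url": "http://example.net/c"},
]
reordered = rerank(results, ["imdb.com"])
blended = blend(reordered, [{"title": "Own content", "url": "http://example.com/x"}], 1)
print([r["title"] for r in blended])
```
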
  22. How to use it?
  23. Get Started • Register for an application id: http://developer.yahoo.com/wsregapp/ • Documentation: http://developer.yahoo.com/search/boss/boss_guide/ • Code samples (Javascript, PHP and Python): http://www.saurabhsahni.com/boss-examples.zip
  24. BOSS API: Searching Slumdog Millionaire (Source: http://en.wikipedia.org/wiki/File:Slumdog_Millionaire_poster.jpg)
  25. BOSS API • Search for slumdog millionaire: – http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&format=xml
  26. BOSS API: XML response • http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&format=xml
  27. Site Restrict Search • Search for slumdog millionaire on selected movie sites – Add param sites=indiatimes.com,movies.yahoo.com,imdb.com – http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&sites=indiatimes.com%2Cmovies.yahoo.com&format=xml
  28. http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&sites=indiatimes.com%2Cmovies.yahoo.com&format=xml
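
The site-restricted URL on slide 28 can be assembled with Python's standard library rather than by hand. The sketch below only builds the string and sends no request; the helper name `site_restricted_url` is hypothetical, while the endpoint and the `appid`, `sites` and `format` parameters are taken from the slides.

```python
from urllib.parse import quote_plus, urlencode

# Endpoint as shown on the slides.
BOSS_WEB_ENDPOINT = "http://boss.yahooapis.com/ysearch/web/v1/"

def site_restricted_url(query, appid, sites, fmt="xml"):
    """Build a web-search URL limited to the given sites (hypothetical helper)."""
    params = {
        "appid": appid,
        "sites": ",".join(sites),  # urlencode turns each comma into %2C
        "format": fmt,
    }
    # quote_plus encodes the space in the query as "+", as on the slides.
    return BOSS_WEB_ENDPOINT + quote_plus(query) + "?" + urlencode(params)

url = site_restricted_url(
    "slumdog millionaire", "xyz", ["indiatimes.com", "movies.yahoo.com"]
)
print(url)
```
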
  29. Search images • http://boss.yahooapis.com/ysearch/images/v1/slumdog+millionaire?dimensions=large
  30. http://boss.yahooapis.com/ysearch/images/v1/slumdog+millionaire
  31. Search News • http://boss.yahooapis.com/ysearch/news/v1/slumdog+millionaire?age=15d
  32. http://boss.yahooapis.com/ysearch/news/v1/slumdog+millionaire?age=15d
  33. Movie Search Code Example
  34. (no text on slide)
  35. Movie Search Code Example
  36. http://www.saurabhsahni.com/boss-examples.zip
  37. More with BOSS API
  38. Related keywords • Add parameter view=keyterms – http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&view=keyterms&format=xml
  39. http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&view=keyterms&format=xml
  40. Semantic Data • Access structured data acquired through SearchMonkey
  41. Semantic Data • view=searchmonkey_rdf • view=searchmonkey_feed • http://developer.yahoo.com/search/boss/stuctureddata.html
  42. http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&view=searchmonkey_feed&format=xml
  43. Long abstracts • Add parameter abstract=long – get up to 300 characters instead of 130
  44. Spell Check • http://boss.yahooapis.com/ysearch/spelling/v1/milionare?format=xml • [response shown on slide]
  45. BOSS Search API REST Interface • http://boss.yahooapis.com/ysearch/{vert}/v1/{query} • {query}: term to look for (url-encoded) • {vert} := {web, news, images, spelling} • Required parameter: appid • Optional parameters: start, count, lang, region, format, callback, sites, view
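
The REST pattern on slide 45 maps naturally onto a small URL builder. The following is a sketch under stated assumptions, not an official client: the function name `boss_url` and its `extra` argument are hypothetical, the allowed verticals and optional parameter names are copied from the slide, and vertical-specific parameters seen on other slides (`age`, `dimensions`, `abstract`) are passed through `extra` because they are not in the slide's optional list. No request is sent.

```python
from urllib.parse import quote_plus, urlencode

BASE = "http://boss.yahooapis.com/ysearch"          # from the slides
VERTICALS = {"web", "news", "images", "spelling"}   # {vert} on slide 45
OPTIONAL = {"start", "count", "lang", "region",
            "format", "callback", "sites", "view"}  # optional parameters on slide 45

def boss_url(vertical, query, appid, extra=None, **options):
    """Build a URL of the form {BASE}/{vert}/v1/{query}?appid=... (hypothetical helper)."""
    if vertical not in VERTICALS:
        raise ValueError(f"unknown vertical: {vertical!r}")
    unknown = set(options) - OPTIONAL
    if unknown:
        raise ValueError(f"unknown optional parameters: {sorted(unknown)}")
    # appid is required, so it always comes first in the query string.
    params = {"appid": appid, **options, **(extra or {})}
    return f"{BASE}/{vertical}/v1/{quote_plus(query)}?{urlencode(params)}"

keyterms_url = boss_url("web", "slumdog millionaire", "xyz",
                        view="keyterms", format="xml")
news_url = boss_url("news", "slumdog millionaire", "xyz", extra={"age": "15d"})
print(keyterms_url)
print(news_url)
```

Validating the vertical and parameter names up front is a design choice for catching typos locally; it is not something the slides prescribe.
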
  46. Site Explorer • Get page inlinks – http://boss.yahooapis.com/ysearch/se_inlink/v1/{URL}?appid={APPID} • Page data: collection of subpages in a domain – http://boss.yahooapis.com/ysearch/se_pagedata/v1/{URL}?appid={APPID}
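
Because the Site Explorer calls place a URL inside the request path, that URL has to be escaped. The sketch below assumes full percent-encoding of the `{URL}` segment (the slides do not say how it should be encoded) and uses a hypothetical helper name, `site_explorer_url`. It only constructs the string.

```python
from urllib.parse import quote, urlencode

SITE_EXPLORER_SERVICES = ("se_inlink", "se_pagedata")  # names from the slide

def site_explorer_url(service, target_url, appid):
    """Build a Site Explorer URL for the given page or domain (hypothetical helper)."""
    if service not in SITE_EXPLORER_SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    # Assumption: the target URL is fully percent-encoded, including ":" and "/".
    encoded_target = quote(target_url, safe="")
    query_string = urlencode({"appid": appid})
    return f"http://boss.yahooapis.com/ysearch/{service}/v1/{encoded_target}?{query_string}"

inlinks_url = site_explorer_url("se_inlink", "http://www.yahoo.com", "xyz")
print(inlinks_url)
```
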
  47. BOSS Mashup Framework • Python (v2.5+) library • BOSS Search SDK plus … • SQL for remixing arbitrary XML/JSON sources • http://developer.yahoo.com/search/boss/mashup.html
  48. BMF + Google App Engine • Enhanced version of BMF for the GAE platform • http://zooie.wordpress.com/2008/08/04/yahoo-boss-google-app-engine-integrated/ • Enables quick deployment of BOSS applications online
  49. More BOSS Implementations • http://mashable.com/boss/ • http://delicious.com/tag/bossmashup • Add yours by tagging it with “bossmashup” on Del.icio.us!
  50. One more thing…
  51. BOSS Custom [diagram labels: Your App here, WEB API, Rank, Assist, EXTRACT, Retrieve, SPAM <-> Gold, Usage, CRAWL, Web Map, Analyze, Index] • 50B pages * 20ms page download = 31 years
  52. Thank You • Questions? • More: http://developer.yahoo.com/search/boss/ • Slides: http://www.slideshare.net/saurabhsahni/
  53. Appendix
  54. Search UI Templates are Included in the BOSS Mashup Framework • http://www.yahoo.com • BOSS Mashup Framework simplifies aggregating and presenting multiple data sources
  55. BMF Features • select, group, sort, union, joins, UDFs, where • Text normalization and duplicate removal • Auto-transformation of resource-oriented API results into tables without parsing • All-in-memory storage and retrieval operations • Ability to join lists of tables via an arbitrary predicate function (map-like) • Search UI template framework • Single search function provides total access to BOSS REST API
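
Two of the ideas on the BMF features slide, duplicate removal after text normalization and joining tables through an arbitrary predicate, can be illustrated in plain Python. This sketch does not use the BOSS Mashup Framework or reproduce its API; the function names, the row dictionaries and their keys are hypothetical.

```python
def normalize(text):
    """Lower-case and collapse whitespace so near-identical strings compare equal."""
    return " ".join(text.lower().split())

def dedupe(rows, key):
    """Keep the first row for each normalized value of row[key]."""
    seen = set()
    unique = []
    for row in rows:
        marker = normalize(row[key])
        if marker not in seen:
            seen.add(marker)
            unique.append(row)
    return unique

def join(left, right, predicate):
    """Pair every left row with every right row for which predicate(l, r) is true."""
    return [(l, r) for l in left for r in right if predicate(l, r)]

# Hypothetical in-memory "tables" standing in for two data sources.
web = [
    {"title": "Slumdog Millionaire"},
    {"title": "slumdog   millionaire "},
    {"title": "Other Film"},
]
news = [{"headline": "Slumdog Millionaire wins"}]

unique_web = dedupe(web, "title")
pairs = join(
    unique_web,
    news,
    lambda w, n: normalize(w["title"]) in normalize(n["headline"]),
)
print(len(unique_web), len(pairs))
```
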
  56. BOSS in Academic Research • The biggest dataset available on the web • Very useful for Web-mining research experiments – Natural language processing – Semantic extraction – Related keywords – Similarity detection – Clustering algorithms – Spelling corrections