Hack the BOSS at Open Hack Day - London


Yahoo! BOSS presentation for Open Hack Day 2009 in London. Learn about the basic usage and how to start hacking with Yahoo! BOSS.

Published in: Technology, Design

  • Hack the BOSS at Open Hack Day - London

    1. Hack the BOSS - Ted DRAKE, Yahoo! France
    2. BOSS = Data: “BOSS is a data API. It’s not a search API” - Vik Singh, BOSS Architect, WWW2009 Conference, Madrid
    3. The British Museum
    4. BOSS = Freedom
       • Change ranking
       • Create your own look and feel
       • Use your favorite ads
       • Mash with external APIs
    5. Coming Soon…
       • SLA
       • Customer Support
       • Fees:
          • Free for most uses
          • Costs based on usage
    6. BOSS AIN’T GOING AWAY. Build for this weekend. Build for the future!
    7. BOSS Details
       • REST-based API
       • XML or JSON output
       • Web, News, Image, SiteSearch, and Spelling Suggestion services
       • Time span filtering for News Search
       • Delicious Tags and Popularity
       • Keyterm extraction
       • Microformat and RDF data
       • Extended abstracts
       • Recognizes most search filters from Yahoo! and Google (backdoor hacks)
    8. What is the most important part of your application?
       • The results display?
       • The text ads?
       • The rounded borders?
       • The smooth animations?
       • The perfect URL?
       THE QUERY STRING!!!
    9. The Query
       • Tells you what the user is looking for
       • Generates related topics
       • Powers secondary APIs
       • Can be generated by a search box, URL, tags, or keyword extraction from the page
       • The Query is your BFF!
    10. Let’s Start Hacking!
        • Get an API key: http://developer.yahoo.com
        • You don’t need a URL for now.
        • Update it later for better tracking and promotion.
    11. Site Specific Results
        • Search only one site: /ysearch/web/v1/golf+site:vw.com?
        • Search from a select group of sites: /ysearch/web/v1/golf?sites=vw.com,vwtrendsweb.com,performancevwmag.com,caranddriver.com
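A minimal sketch of how the two site filters on this slide could be assembled into request URLs. The host `boss.yahooapis.com` and the `appid` and `format` parameters are taken from the sample request on the Semantic Search slide; the helper name `boss_web_url` is hypothetical, and the code only builds the URL without sending it.

```python
from urllib.parse import quote, urlencode

# Host as it appears in the deck's sample request.
BOSS_HOST = "http://boss.yahooapis.com"


def boss_web_url(query, appid, site=None, sites=None, fmt="xml"):
    """Build a BOSS web-search URL (hypothetical helper; no request is sent)."""
    terms = [query]
    if site:
        # Single-site restriction goes into the query itself: golf+site:vw.com
        terms.append("site:" + site)
    # Terms are joined with '+'; the colon is left unescaped, as on the slides.
    path = "/ysearch/web/v1/" + "+".join(quote(t, safe=":") for t in terms)
    params = [("appid", appid), ("format", fmt)]
    if sites:
        # A group of sites goes into the comma-separated 'sites' parameter.
        params.append(("sites", ",".join(sites)))
    return BOSS_HOST + path + "?" + urlencode(params, safe=",")


print(boss_web_url("golf", "YourAppId", site="vw.com"))
print(boss_web_url("golf", "YourAppId", sites=["vw.com", "caranddriver.com"]))
```

Keeping the single-site filter in the path and the multi-site filter in the query string mirrors the two forms shown on the slide.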
    12. Tag or Title Filters
        • Use the inurl: filter to simulate tag search: /ysearch/web/v1/inurl:golf?
        • Use intitle: to filter results with the query in the title: /ysearch/web/v1/intitle:golf?
    13. Get Related Sites
        • Use related:foo.html to find related sites: /ysearch/web/v1/related:http://www.caranddriver.com/car/2006-models/2006-golf.html?
    14. BOSS Keyterms
        • Keyterms are words used to find a site while searching on Yahoo!
        • Listed in order of relevance.
        • /web/v1/{query}?view=keyterms
    15. Delicious Tags and Popularity
        • How many times has a page been saved in Delicious?
        • What tags have been associated with the page? How many times?
        • view=delicious_saves,delicious_toptags
    16. KeyTerms + Delicious Tags: What are they good for?
        • Relevancy
        • Related Searches
        • Search Suggest
        • Tag Clouds
        • Trigger secondary APIs
        • Highlight Popular Results
    17. What it looks like
        <keyterms>
          <terms>
            <term>Bucharest</term>
            <term>city</term>
            <term>Romanian</term>
            <term>population</term>
            <term>Romania</term>
            <term>architecture</term>
            <term>city centre</term>
            <term>clubs</term>
          </terms>
        </keyterms>
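As a sketch of how the keyterms fragment above might be read into a list, the following uses Python's standard `xml.etree.ElementTree`. It assumes the fragment exactly as shown on the slide; a full response would presumably wrap it in more elements and possibly an XML namespace, so the hypothetical helper `extract_keyterms` matches on the local tag name only.

```python
import xml.etree.ElementTree as ET

# The fragment from the slide, used here as sample input.
SAMPLE = """
<keyterms>
  <terms>
    <term>Bucharest</term>
    <term>city</term>
    <term>Romanian</term>
    <term>population</term>
    <term>Romania</term>
    <term>architecture</term>
    <term>city centre</term>
    <term>clubs</term>
  </terms>
</keyterms>
"""


def extract_keyterms(xml_text):
    """Return the text of every <term> element, in document order."""
    root = ET.fromstring(xml_text)
    terms = []
    for element in root.iter():
        # Strip a '{namespace}' prefix, if any, before comparing tag names.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "term" and element.text:
            terms.append(element.text.strip())
    return terms


print(extract_keyterms(SAMPLE))
```

Because the slides say keyterms are listed in order of relevance, the list preserves document order so it can feed a tag cloud or a related-search box directly.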
    18. BOSS Mashup Framework
        • Python-based framework to mash the BOSS API with secondary web services and proprietary data
        • Easy integration with Google App Engine
        • Powers the infamous YUIL (4 hour search) project
        • Fast prototyping with minimal code
    19. BOSSY Code on BOSS Mashup Platform
        __author__ = "Vik Singh (viksi@yahoo-inc.com)"

        from yos.util import text, typechecks
        from yos.yql import db
        from yos.boss import ysearch

        def month_lookup(s):
            for m in ["jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sept", "oct", "nov", "dec"]:
                if s.startswith(m): return m

        def parse_month(s):
            months = filter(lambda m: m is not None, map(month_lookup, text.uniques(s)))
            if len(months) > 0:
                return text.norm(months[0]).capitalize()

        def parse_year(s):
            years = filter(lambda t: len(t) == 4 and typechecks.is_int(t), text.uniques(s))
            if len(years) > 0: return text.norm(years[0])
    20. Relevancy Hacking
    21. Location Based Relevancy
        • Where am I?
        • Where am I going?
        • What can I find?
        (Map generated by FirePin application on iPhone)
    22. Location Based Relevancy
        • Fire Eagle: Standardized location and sharing platform
        • Live location tracking
        • Find upcoming traffic cameras, landmarks, restaurants, headlines, photos, Twitter buzz, etc.
        • Shared locations with friends
        • “Mining Interesting Locations and Travel Sequences from GPS Trajectories for Mobile Users” by Yu Zheng, Lizhu Zhang, Xing Xie and Wei-Ying Ma
    23. Secondary Sources: Wikipedia, Craigslist, Government Data…
        • Multiple sources to increase relevance
        • DuckDuckGo.com = BOSS + Wikipedia (and other services)
        • “Understanding User's Query Intent with Wikipedia” by Jian Hu, Gang Wang, Fred Lochovsky and Zheng Chen - WWW2009 Conference
        • OpenData: DataMob.org, TheInfo.org, InfoChimps.org
    24. Real Time Events
        • Tweet News: Twitter + News Search
        • Twitter users share most timely articles
        • Relevancy highlights tweeted stories
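One way to picture "relevancy highlights tweeted stories" is a re-ranking pass over news results. The sketch below is illustrative only and is not how Tweet News works: the result dictionaries, the `tweet_counts` mapping, and the function `boost_tweeted` are all hypothetical, standing in for parsed news results and a count of tweets per article URL gathered elsewhere.

```python
def boost_tweeted(results, tweet_counts):
    """Re-rank results so that stories with more tweets come first.

    results      -- list of dicts with at least a 'url' key (assumed shape)
    tweet_counts -- dict mapping article URL to number of tweets (assumed shape)

    sorted() is stable, so stories with equal counts keep the order the
    search engine gave them.
    """
    return sorted(results, key=lambda r: -tweet_counts.get(r["url"], 0))


# Hypothetical sample data.
news = [
    {"title": "Story A", "url": "http://example.com/a"},
    {"title": "Story B", "url": "http://example.com/b"},
    {"title": "Story C", "url": "http://example.com/c"},
]
counts = {"http://example.com/c": 5, "http://example.com/b": 1}

for story in boost_tweeted(news, counts):
    print(story["title"])
```

Using a stable sort keeps the engine's own relevance order as the tie-breaker, so untweeted stories are not shuffled.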
    25. Internal + External Data Sources
        • TechCrunch Search: BOSS + access to proprietary data
        • Create custom tables in YQL
        • BOSS “Vertical Lens” defines what internal data BOSS should index as well as your preferred external sources.
    26. Offline Analysis
        • Coloralo:
           • requests extra images
           • caches them
           • analyzes them for relevancy
        • Coloralo finds coloring book images.
    27. Quick and Easy Semantic Search
        • Limit your results to sites with microformats or RDF data: searchmonkeyid:com.yahoo.page.uf.hreview
        • Request structured data, keyterms, and Delicious data from BOSS: view=keyterms,searchmonkey_feed,searchmonkey_rdf,delicious_toptags,delicious_saves
        • Sample request: http://boss.yahooapis.com/ysearch/web/v1/cocorosie+searchmonkeyid:com.yahoo.page.uf.hreview?appid=YourAppId&format=xml&start=0&count=15&view=keyterms%2Csearchmonkey_feed%2Csearchmonkey_rdf%2Cdelicious_toptags
    28. Inurl and Intitle Hacks
        • Use your favorite search engine hacks with BOSS.
        • Most of the SERP advanced search tricks will work with your BOSS requests.
        • This does not include patterns specific to Google, Yahoo!, or other engines, such as !sports
    29. Website Description
        • Get a more complete picture of a target web site by combining multiple requests
        • Find the number of external sites linking to the site: /ysearch/se_inlink/v1/{site}?omit_inlinks=domain
        • Find the pages within the site: /ysearch/se_pagedata/v1/{site}?
        • Find related web pages: /ysearch/web/v1/related:{site}?view=delicious_saves,delicious_toptags
    30. Filter News by Time
        • Older, less timely articles may have more natural relevancy. Control this by selecting the age range for news articles.
        • Use orderby=date to show the latest instead of the most relevant.
        • What happened while you were asleep: /ysearch/news/v1/{query}?age=9h&orderby=date
        • Limit news articles to 1-7 days old: /ysearch/news/v1/{query}?age=1d-7d
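The age filter above can be sketched as a small URL builder that validates the `age` value before use. This is an assumption-laden sketch: the accepted pattern covers only the two forms shown on the slide (`9h` and `1d-7d`), the host and `appid` parameter come from the deck's sample request, and `boss_news_url` is a hypothetical helper that builds the URL without sending it.

```python
import re
from urllib.parse import quote, urlencode

BOSS_HOST = "http://boss.yahooapis.com"  # host from the deck's sample request

# Only the forms shown on the slide: a single age such as '9h', or a range
# such as '1d-7d'. Other units are not assumed here.
AGE_PATTERN = re.compile(r"^\d+[hd](-\d+[hd])?$")


def boss_news_url(query, appid, age=None, orderby=None):
    """Build a BOSS news-search URL with optional age and ordering."""
    params = [("appid", appid)]
    if age is not None:
        if not AGE_PATTERN.match(age):
            raise ValueError("unsupported age value: %r" % age)
        params.append(("age", age))
    if orderby is not None:
        params.append(("orderby", orderby))
    return BOSS_HOST + "/ysearch/news/v1/" + quote(query) + "?" + urlencode(params)


# "What happened while you were asleep"
print(boss_news_url("election", "YourAppId", age="9h", orderby="date"))
# Articles between one and seven days old
print(boss_news_url("election", "YourAppId", age="1d-7d"))
```

Rejecting malformed `age` values up front keeps a typo from silently producing an unfiltered result set.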
    31. Vertical Focus
        • Vertical search engines already have a niche audience.
        • Limit searches to appropriate sites: InsiderFood
        • Truevert creates a model of word relations in the context of its niche: the environment.
    32. Go Beyond the Web Site
        • Desktop: Xobni for Outlook
        • Tools: Zemanta finds related information for blogs and emails
        • Modular: Create an application for Facebook, Yahoo, MySpace and more with the Open Social standard.
    33. Go from Search to Action
        • Keyword Finder uses BOSS keyterms to return the top 10 keywords used by successful sites for a query
        • Bossy returns a single answer to questions. Where is Big Ben? London.
    34. Resources
        • Yahoo! BOSS: http://developer.yahoo.com/boss
        • BOSS Mashup Framework: http://developer.yahoo.com/search/boss/mashup.html
        • YQL: http://developer.yahoo.com/yql
        • Fire Eagle: http://developer.yahoo.com/fireeagle/
        • Google App Engine: http://appengine.google.com
        • Amazon Web Services: http://aws.amazon.com
        • OAuth: http://oauth.net/
        • Open Social: http://www.opensocial.org/
        • Open Data: http://theinfo.org
        • Alt Search Engines: http://www.altsearchengines.com/
        • BOSS Hacks: http://bosshacks.com
           • Add your hack to http://www.bosshacks.com/hacks/open-hack-day-london-2009