Making sense out of things on the web

Published in: Technology, Design

  1. 1. MAKING SENSE OUT OF THINGS ON THE WEB @pradeepbv
  2. 2. We have been accumulating a lot of information
  3. 3. http://en.wikipedia.org/wiki/File:Jingangjing.jpg
  4. 4. http://en.wikipedia.org/wiki/File:Printer_in_1568-ce.png http://en.wikipedia.org/wiki/File:BuxheimStChristopher.jpg
  5. 5. http://en.wikipedia.org/wiki/Odhecaton
  6. 6. What hath God wrought http://upload.wikimedia.org/wikipedia/commons/f/f1/The_First_Telegraph.jpg
  7. 7. 1891 Telegraph Lines http://en.wikipedia.org/wiki/File:1891_Telegraph_Lines.jpg
  8. 8. Mr. Watson, come here, I want to see you http://www.boerner.net/jboerner/?p=9396
  9. 9. Radio
  10. 10. http://www.elon.edu/e-web/predictions/150/1930.xhtml
  11. 11. [no text on slide]
  12. 12. [no text on slide]
  13. 13. [no text on slide]
  14. 14. www
  15. 15. http://en.wikipedia.org/wiki/File:NCSA_Mosaic.PNG
  16. 16. The Internet had an estimated 16 million users by 1995
  17. 17. http://en.wikipedia.org/wiki/Venture_capital
  18. 18. People from all over the world started sharing their interests, hopes and dreams online
  19. 19. [no text on slide]
  20. 20. http://electrokami.com/wp-content/uploads/2010/09/the-internet-in-real-life.jpg
  21. 21. The number of devices connected to IP networks will be nearly three times as high as the global population in 2016
  22. 22. The Zettabyte Era: kilo, mega, giga, tera, peta, exa, zetta (9,444,732,965,739,290,427,392 bits, i.e. 1024 exbibytes), yotta http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/VNI_Hyperconnectivity_WP.html
  23. 23. “Reports that say that something hasn't happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don't know we don't know.” Donald Rumsfeld, US Defense Secretary, at a press conference at NATO Headquarters, Brussels, Belgium, June 6, 2002. Image: planetization.org
  24. 24. Nicholas Carr worries that the flood of digital information is changing not only our habits, but even our mental capacities: forced to scan and skim to keep up, we are losing our abilities to pay sustained attention, reflect deeply, or remember what we've learned.
  25. 25. Information overload? http://blogs.tusc.k12.al.us/bhslibrary/files/2012/01/Information_overload.jpg
  26. 26. DO YOU KNOW WHAT YOU ARE LOOKING FOR? http://www.teachersdiary.com/.a/6a0115703931fc970c0128765537ba970c-800wi
  27. 27. DO YOU KNOW WHERE TO FIND WHAT YOU WANT? http://www.flickr.com/photos/special/1597251/
  28. 28. REGULAR SEARCH #FAIL? http://www.flickr.com/photos/sumrow/1267682594/sizes/l/
  29. 29. IS THERE A SUPERHERO WHO CAN HELP? http://www.flickr.com/photos/sumrow/1267682594/sizes/l/
  30. 30. BUILD YOUR OWN SEARCH SERVICE Yes, you are the superhero
  31. 31. BOSS IS BUILD YOUR OWN SEARCH SERVICE http://developer.yahoo.com/search/boss/
  32. 32. BOSS PROVIDES APIS TO OUR SEARCH DATA STORES
  33. 33. TO BUILD YOUR OWN POWERFUL SEARCH APPLICATIONS
  34. 34. BOSS allows you to search over the Web, images, news and blogs
  35. 35. You can even monetize your applications using Search Ads from BOSS and get support.
  36. 36. What can be done on top of BOSS? • Blend and re-rank search results • Your own look and feel • Mix it with other APIs
  37. 37. BOSS Pricing
  38. 38. Free for building your hacks!!
  39. 39. Where do I start?
  40. 40. What's in it? RESTful XML and JSON API • Web • Image • Spelling • News • Search Ads http://www.flickr.com/photos/joeshlabotnik/419914250/sizes/o/in/photostream/.jpg
  41. 41. OAuth-based authentication http://www.flickr.com/photos/friarsbalsam/5736126308/sizes/o/in/photostream/.jpg
  42. 42. What else do I get? • Web and Limited Web results • Image attributes like height, width, etc. • Time span filtering for News Search • Document type filtering • Extended abstracts http://www.flickr.com/photos/acidpix/6021203584/sizes/o/in/photostream/.jpg
  43. 43. BOSS + YQL • Table name: boss.search • Parameters: Consumer Key = ck (no example), Consumer Secret = secret (no example), Query Term = q (example: 'iitd') • e.g. select * from boss.search where ck=… and secret=… and q='openhackindia'
  44. 44. Searching “The Dark Knight”
  45. 45. Finding images of “The Dark Knight Rises”: select * from boss.search where q="The Dark Knight Rises" and service="images" and ck="..." and secret="..."
  46. 46. Finding “The Dark Knight Rises” in IMDB, movies.yahoo.com: select * from boss.search where q="The Dark Knight Rises" and sites="imdb.com,movies.yahoo.com" and ck="..." and secret="..."
  47. 47. Spell Check and Correction: select * from boss.search where q="The Dark Knight Rises" and service="spelling" and ck="..." and secret="..."
  48. 48. Finding news on “The Dark Knight Rises”: select * from boss.search where q="The Dark Knight Rises" and service="news" and ck="..." and secret="..."
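
The boss.search statements on slides 43 to 48 differ only in their where clauses, so they can be generated from one helper. The following is a minimal Python sketch, not the deck's own code: the function names are hypothetical, the YQL endpoint URL is an assumption (the slides do not give one), and values are assumed to contain no double quotes. It only builds strings and sends nothing over the network.

```python
from urllib.parse import urlencode

# Assumption: the public YQL REST endpoint; the slides do not state it.
YQL_ENDPOINT = "https://query.yahooapis.com/v1/public/yql"


def boss_yql(q, ck, secret, service=None, sites=None):
    """Build a boss.search statement in the shape used on the slides."""
    clauses = [("q", q)]
    if service:
        clauses.append(("service", service))  # e.g. images, news, spelling
    if sites:
        clauses.append(("sites", ",".join(sites)))
    clauses += [("ck", ck), ("secret", secret)]
    # No escaping is attempted: values must not contain double quotes.
    where = " and ".join('{}="{}"'.format(key, value) for key, value in clauses)
    return "select * from boss.search where " + where


def yql_url(statement, fmt="json"):
    """Wrap a statement in a request URL (endpoint is an assumption)."""
    return YQL_ENDPOINT + "?" + urlencode({"q": statement, "format": fmt})


images = boss_yql("The Dark Knight Rises", "...", "...", service="images")
movies = boss_yql("The Dark Knight Rises", "...", "...",
                  sites=["imdb.com", "movies.yahoo.com"])
print(images)
print(movies)
print(yql_url(images))
```

The "..." placeholders stand in for the consumer key and secret, exactly as on the slides.
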
  49. 49. And through the BOSS API • Getting multiple data sets: /ysearch/web,images,news?q=anna or /ysearch/web,images,news?web.q=anna&images.q=anna&news.q=lokpal • Searching through sites (a simple movie search): /ysearch/web?q="Dark Knight"&sites=movies.yahoo.com,netflix.com,imdb.com • AND/OR operators: /ysearch/web?q="steve jobs"AND((ipad)OR(iphone))&sites=bestbuy.com,newegg.com • Important: use parentheses or quotes
  50. 50. Unary Operators • Search for Batman but not “Dark Knight”: q=(batman -"Dark Knight") • Find pages with “Heath Ledger” but not “Dark Knight”: q=+"heath ledger"-"Dark Knight"&sites=movies.yahoo.com • Force auto-spelling off: q=+"drk knight"
  51. 51. Searching in body and in title • Searching for Dark Knight in the title on Yahoo Movies: q=reviews intitle:"dark knight"&sites=movies.yahoo.com • Searching for Dark Knight in the title on Yahoo Movies, on pages mentioning Christian Bale: q=reviews intitle:"dark knight" inbody:"christian bale"&sites=movies.yahoo.com
  52. 52. Market and document specific filters • Search for “Dark Knight” in India-specific sites: q="Dark Knight"&market=en-in • Search for PDFs containing “Dark Knight”: q="Dark Knight"&type=pdf • Search for MS Office documents (except PPTs) containing “Dark Knight”: q="Dark Knight"&type=msoffice,-ppt
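
The query strings on slides 49 to 52 contain spaces, quotes and parentheses, which need percent-encoding before they go into a URL. The sketch below shows one way to assemble such paths in Python. The helper names are hypothetical, and only the /ysearch path and the q, sites and type parameters come from the slides; the host name and the OAuth signing step are left out.

```python
from urllib.parse import urlencode, quote


def phrase(text):
    """Wrap a multi-word term in double quotes so it is matched as a phrase."""
    return '"{}"'.format(text)


def boss_path(services, **params):
    """Build a /ysearch/... path; parameter names are taken from the slides."""
    query = urlencode(params, quote_via=quote)  # percent-encode every value
    return "/ysearch/{}?{}".format(",".join(services), query)


# Slide 49: one query across several data sets.
multi = boss_path(["web", "images", "news"], q="anna")

# Slide 50: Batman but not "Dark Knight".
not_dk = boss_path(["web"], q="(batman -{})".format(phrase("Dark Knight")))

# Slide 51: phrase in the title, restricted to one site.
title = boss_path(["web"], q="reviews intitle:" + phrase("dark knight"),
                  sites="movies.yahoo.com")

# Slide 52: MS Office documents except PowerPoint.
docs = boss_path(["web"], q=phrase("Dark Knight"), type="msoffice,-ppt")

for path in (multi, not_dk, title, docs):
    print(path)
```

Encoding every value keeps the operators intact on the wire; the server is expected to decode them back to the forms shown on the slides.
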
  53. 53. Output
  54. 54. Image search parameters • Search for images that are not offensive: /ysearch/images?q="san francisco"&filter=yes • Search for images that are wallpaper size: /ysearch/images?q="san francisco"&dimensions=wallpaper • Search for an image at a certain referer URL: /ysearch/images?q=yahoo&refererurl=http://www.flickr.com • Interesting output fields: format, file size, height, width, title, total result count
  55. 55. News search parameters • Search news that is less than 7 days old: /ysearch/news?q=lokpal&age=7d • Search news that is between 20 hours and 2 days old: /ysearch/news?q=lokpal&age=20h2d • Re-rank news results by date: /ysearch/news?q=lokpal&ranking=true • Interesting output fields: Source, Date, Source URL
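
The age values on slide 55 pack one or two bounds into a single token (7d, 20h2d). A small Python sketch of how the image and news paths above could be assembled follows. Both helper names are hypothetical, and the reading of 20h2d as "newer bound first, older bound second" is inferred from the slide's wording rather than from BOSS documentation.

```python
from urllib.parse import urlencode


def age_window(oldest, newest=""):
    """Combine age tokens as on slide 55: the newer bound, then the older one."""
    return newest + oldest


def search_path(service, **params):
    """Build a /ysearch/<service> path from keyword parameters."""
    return "/ysearch/{}?{}".format(service, urlencode(params))


last_week = search_path("news", q="lokpal", age=age_window("7d"))
window = search_path("news", q="lokpal", age=age_window("2d", newest="20h"))
wallpapers = search_path("images", q='"san francisco"', dimensions="wallpaper")

for path in (last_week, window, wallpapers):
    print(path)
```
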
  56. 56. EXAMPLE HACKS
  57. 57. Duckduckgo.com
  58. 58. Interceder
  59. 59. Ask-boss (v1) • Hack: http://ask-boss.appspot.com • Code: https://github.com/saurabhsahni/Hacks/tree/master/askBOSS
  60. 60. webmeme.in
  61. 61. http://hackyourworld.org/~iitb_pacman/search/
  62. 62. I used BOSS and got data. Now how do I extract information out of it?
  63. 63. How do I make sense out of it?
  64. 64. Content Analysis: select * from contentanalysis.analyze where text="Yahoo! kicks off hackday"
  65. 65. Content Analysis from a URL: select * from contentanalysis.analyze where url="http://www.cnn.com/"
  66. 66. Term Extraction: select * from search.termextract where context in (select description from rss where url='')
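
Slides 64 and 65 use the same contentanalysis.analyze table with either a text or a url filter. A minimal Python sketch of a builder for those two statements is shown below. The function name is hypothetical, and it rejects embedded double quotes instead of guessing at YQL's escaping rules.

```python
def analyze_statement(text=None, url=None):
    """Build a contentanalysis.analyze statement for either text or a URL."""
    if (text is None) == (url is None):
        raise ValueError("pass exactly one of text or url")
    field, value = ("text", text) if text is not None else ("url", url)
    if '"' in value:
        # Assumption: no escaping is attempted, so embedded quotes are rejected.
        raise ValueError("value must not contain double quotes")
    return 'select * from contentanalysis.analyze where {}="{}"'.format(field, value)


print(analyze_statement(text="Yahoo! kicks off hackday"))
print(analyze_statement(url="http://www.cnn.com/"))
```
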
  67. 67. More resources • Yahoo! BOSS: http://developer.yahoo.com/boss • BOSS Technical Documentation: http://developer.yahoo.com/search/boss/boss_api_guide/ • YQL: http://developer.yahoo.com/yql • Amazon Web Services: http://aws.amazon.com • OAuth: http://oauth.net/ • Open Data: http://theinfo.org • Alt Search Engines: http://www.altsearchengines.com/
  68. 68. Happy hacking!