YQLSELECT * FROM InternetDeveloper Talk Token Unicorn
Derek Gathright
Derek Gathright• Engineer @ Yahoo
Derek Gathright• Engineer @ Yahoo• Tweet from @derek
Derek Gathright• Engineer @ Yahoo• Tweet from @derek• Blog @ http://derekville.net
Derek Gathright•   Engineer @ Yahoo•   Tweet from @derek•   Blog @ http://derekville.net•   Everything else @ http://derek...
On December 4th, 1995
On December 4th, 1995“Netscape and Sun announce JavaScript, across-platform object scripting language”   http://web.archiv...
How big is the Web?
GINORMOUS!
The Internet: 1995    http://www.jevans.com/pubnetmap.html
The Internet: 2003      http://www.opte.org/maps/
The Internet: 2007      http://xkcd.com/256/
The Internet: 2010      http://xkcd.com/802/
It’s impossible tomeasure the size of theweb because it isconstantly changing,growing, and morphing.
Every second:Twitter gets 600 new tweets
Every minute:YouTube +35 hours of video
Every month: Facebook gets   2.5 billion new photos
Yeah, lots of it is garbage
But theres still a ton ofinteresting stuff out there.
So how do you access it,   programmatically?
It is easy enough to‘scrape’ the web (using cURL, wget, etc...), but  how do you parse it?
It is easy enough to‘scrape’ the web (using cURL, wget, etc...), but  how do you parse it?XPath + DOM Traversal = Yay!
It is easy enough to  ‘scrape’ the web (using   cURL, wget, etc...), but    how do you parse it? XPath + DOM Traversal = Y...
It is easy enough to‘scrape’ the web (using cURL, wget, etc...), but  how do you parse it?
“If a regular expressionis longer than 2 inches, find another method”       - Douglas Crockford
4.4 lbs &1,368 pages.     No thanks!
The Point?The Web has a ton ofdata, but no easy,hackday-able way toaccess it.
THE WEB NEEDS AN API!
APIs are awesome.You get (mostly) whatever datayou want in (mostly) whateverformat you want and in a (mostly)easy to parse...
Example...http://api.twitter.com/1/users/show.json?screen_name=derek
Companies discovered that if you build APIs, developers will come in droves
And it saves you from having todance around on stage like a monkey           (if you are confused, Google “monkey dance”)
Yahoo, Google,Facebook, Twitter,Microsoft, NY Times, ...Most web companiesoffer APIs now days.
As neat as they are,they are imperfect.Why?
As neat as they are,they are imperfect.Why?You have to readdocumentation to usethem.
As neat as they are,they are imperfect.Why?You have to readdocumentation to usethem.BOOOOOOO!!!!
Were developers,were lazy,and we want to build stuff,       NOW!
Yahoo invented a solution to this   problem...
YQL!
"Yahoo! Query Language is anexpressive SQL-like language thatlets you query, filter, and join dataacross Web services.With ...
YQL is...• RESTful• Scaleable• Customizable• ... and lots of other “ables”
How do you use it?
How do you use it?$format = “json”; // or xml;
How do you use it?$format = “json”; // or xml;$base = “http://query.yahooapis.com/v1/public/yql”;
How do you use it?$format = “json”; // or xml;$base = “http://query.yahooapis.com/v1/public/yql”;$url = “{$base}?q={$yql_q...
How do you use it?$format = “json”; // or xml;$base = “http://query.yahooapis.com/v1/public/yql”;$url = “{$base}?q={$yql_q...
How do you use it?$format = “json”; // or xml;$base = “http://query.yahooapis.com/v1/public/yql”;$url = “{$base}?q={$yql_q...
How do you use it?$format = “json”; // or xml;$base = “http://query.yahooapis.com/v1/public/yql”;$url = “{$base}?q={$yql_q...
YQL Queries SELECT {fields}  FROM {table}WHERE {conditions}
SELECT * FROMweather.forecast     WHERE location=90210
SELECT * FROM  data.html.cssselect       WHEREurl=“http://yahoo.com” AND css=“body a”;
SELECT height,width,url FROM search.images       WHERE query=“kitteh” ANDmimetype LIKE “%jpeg%”
SELECT * FROM google.search    WHERE   q=“pizza”
SELECT status.text FROM   twitter.user.timeline         WHERE  screen_name=“derek”
SELECT * FROM  foursquare.history       WHEREusername=“foo” AND   password=“bar”
SELECT * FROM rss WHERE url IN (SELECT title FROM   atom WHERE url="http://   spreadsheets.google.com/feeds/list/ pg_T0Mv3...
Where’s the magic?
Data Tables13 categories & counting, including...• Geo                  • Social• Flickr               • Upcoming• Local  ...
Open Data Tables900+ community contributed tables inhundreds of categories, including... Amazon               • Netflix• Cr...
google.search
google.searchhttp://www.datatables.org/google/google.search.xml
google.searchhttp://www.datatables.org/google/google.search.xml               twitter.user.timeline
google.search   http://www.datatables.org/google/google.search.xml                   twitter.user.timelinehttp://www.datat...
google.search   http://www.datatables.org/google/google.search.xml                   twitter.user.timelinehttp://www.datat...
google.search   http://www.datatables.org/google/google.search.xml                   twitter.user.timelinehttp://www.datat...
http://www.datatables.org/craigslist/craigslist.search.xml
http://www.datatables.org/craigslist/craigslist.search.xml
http://www.datatables.org/craigslist/craigslist.search.xml
http://www.datatables.org/craigslist/craigslist.search.xml
&query={query}http://www.datatables.org/craigslist/craigslist.search.xml
YQL != Voodoo MagicIt is just rewriting a YQLquery into one (or many)    HTTP calls for you.
USE "http://www.datatables.org/nyt/nyt.bestsellers.xml"AS nyt.bestsellers;USE "https://github.com/gcb/yql.opentable/raw/ma...
USE "https://github.com/gcb/yql.opentable/raw/master/text.concat.xml"AS text.concat;
https://github.com/gcb/yql.opentable/raw/master/text.concat.xml
https://github.com/gcb/yql.opentable/raw/master/text.concat.xml
<execute>• Execute arbitrary JavaScript in Rhino (a JS engine)• E4X Support (XML literals in JS)• Speak protocols and hand...
Summary•   YQL is very useful for...    •   Scraping    •   Creating an API where one doesn’t exist    •   Converting XML ...
NY Times Data Tables•   nyt.article.search        •   nyt.newswire•   nyt.bestsellers           •   nyt.people.activities•...
Get started @   http://developer.yahoo.com/yql            Thanks!Questions? Find, Tweet, or Email me. @derek or drg@yahoo-...
YQL:: Select * from Internet
YQL:: Select * from Internet
YQL:: Select * from Internet
Upcoming SlideShare
Loading in...5
×

YQL:: Select * from Internet

5,609

Published on

Published in: Technology
0 Comments
5 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
5,609
On Slideshare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
29
Comments
0
Likes
5
Embeds 0
No embeds

No notes for slide
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • You now have a common way of accessing any data on the web, with or without APIs, and all in one common syntax that you are likely familiar with, SQL.\n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • YQL:: Select * from Internet

    1. 1. YQLSELECT * FROM InternetDeveloper Talk Token Unicorn
    2. 2. Derek Gathright
    3. 3. Derek Gathright• Engineer @ Yahoo
    4. 4. Derek Gathright• Engineer @ Yahoo• Tweet from @derek
    5. 5. Derek Gathright• Engineer @ Yahoo• Tweet from @derek• Blog @ http://derekville.net
    6. 6. Derek Gathright• Engineer @ Yahoo• Tweet from @derek• Blog @ http://derekville.net• Everything else @ http://derek.io
    7. 7. On December 4th, 1995
    8. 8. On December 4th, 1995“Netscape and Sun announce JavaScript, across-platform object scripting language” http://web.archive.org/web/20070916144913/http://wp.netscape.com/newsref/pr/newsrelease67.html
    9. 9. How big is the Web?
    10. 10. GINORMOUS!
    11. 11. The Internet: 1995 http://www.jevans.com/pubnetmap.html
    12. 12. The Internet: 2003 http://www.opte.org/maps/
    13. 13. The Internet: 2007 http://xkcd.com/256/
    14. 14. The Internet: 2010 http://xkcd.com/802/
    15. 15. It’s impossible tomeasure the size of theweb because it isconstantly changing,growing, and morphing.
    16. 16. Every second:Twitter gets 600 new tweets
    17. 17. Every minute:YouTube +35 hours of video
    18. 18. Every month: Facebook gets 2.5 billion new photos
    19. 19. Yeah, lots of it is garbage
    20. 20. But theres still a ton ofinteresting stuff out there.
    21. 21. So how do you access it, programmatically?
    22. 22. It is easy enough to‘scrape’ the web (using cURL, wget, etc...), but how do you parse it?
    23. 23. It is easy enough to‘scrape’ the web (using cURL, wget, etc...), but how do you parse it?XPath + DOM Traversal = Yay!
    24. 24. It is easy enough to ‘scrape’ the web (using cURL, wget, etc...), but how do you parse it? XPath + DOM Traversal = Yay!Regular Expressions = Double Yay!
    25. 25. It is easy enough to‘scrape’ the web (using cURL, wget, etc...), but how do you parse it?
    26. 26. “If a regular expressionis longer than 2 inches, find another method” - Douglas Crockford
    27. 27. 4.4 lbs &1,368 pages. No thanks!
    28. 28. The Point?The Web has a ton ofdata, but no easy,hackday-able way toaccess it.
    29. 29. THE WEB NEEDS AN API!
    30. 30. APIs are awesome.You get (mostly) whatever datayou want in (mostly) whateverformat you want and in a (mostly)easy to parse structure.
    31. 31. Example...http://api.twitter.com/1/users/show.json?screen_name=derek
    32. 32. Companies discovered that if you build APIs, developers will come in droves
    33. 33. And it saves you from having todance around on stage like a monkey (if you are confused, Google “monkey dance”)
    34. 34. Yahoo, Google,Facebook, Twitter,Microsoft, NY Times, ...Most web companiesoffer APIs now days.
    35. 35. As neat as they are,they are imperfect.Why?
    36. 36. As neat as they are,they are imperfect.Why?You have to readdocumentation to usethem.
    37. 37. As neat as they are,they are imperfect.Why?You have to readdocumentation to usethem.BOOOOOOO!!!!
    38. 38. Were developers,were lazy,and we want to build stuff, NOW!
    39. 39. Yahoo invented a solution to this problem...
    40. 40. YQL!
    41. 41. "Yahoo! Query Language is anexpressive SQL-like language thatlets you query, filter, and join dataacross Web services.With YQL, apps run faster withfewer lines of code and a smallernetwork footprint." http://developer.yahoo.com/yql/
    42. 42. YQL is...• RESTful• Scaleable• Customizable• ... and lots of other “ables”
    43. 43. How do you use it?
    44. 44. How do you use it?$format = “json”; // or xml;
    45. 45. How do you use it?$format = “json”; // or xml;$base = “http://query.yahooapis.com/v1/public/yql”;
    46. 46. How do you use it?$format = “json”; // or xml;$base = “http://query.yahooapis.com/v1/public/yql”;$url = “{$base}?q={$yql_query}&format={$format}”;
    47. 47. How do you use it?$format = “json”; // or xml;$base = “http://query.yahooapis.com/v1/public/yql”;$url = “{$base}?q={$yql_query}&format={$format}”;$json_string = goGetIt($url); // likely a curl() call
    48. 48. How do you use it?$format = “json”; // or xml;$base = “http://query.yahooapis.com/v1/public/yql”;$url = “{$base}?q={$yql_query}&format={$format}”;$json_string = goGetIt($url); // likely a curl() call$data = json_decode($json);
    49. 49. How do you use it?$format = “json”; // or xml;$base = “http://query.yahooapis.com/v1/public/yql”;$url = “{$base}?q={$yql_query}&format={$format}”;$json_string = goGetIt($url); // likely a curl() call$data = json_decode($json); Or use any of the libraries written for your favorite language/framework
    50. 50. YQL Queries SELECT {fields} FROM {table}WHERE {conditions}
    51. 51. SELECT * FROMweather.forecast WHERE location=90210
    52. 52. SELECT * FROM data.html.cssselect WHEREurl=“http://yahoo.com” AND css=“body a”;
    53. 53. SELECT height,width,url FROM search.images WHERE query=“kitteh” ANDmimetype LIKE “%jpeg%”
    54. 54. SELECT * FROM google.search WHERE q=“pizza”
    55. 55. SELECT status.text FROM twitter.user.timeline WHERE screen_name=“derek”
    56. 56. SELECT * FROM foursquare.history WHEREusername=“foo” AND password=“bar”
    57. 57. SELECT * FROM rss WHERE url IN (SELECT title FROM atom WHERE url="http:// spreadsheets.google.com/feeds/list/ pg_T0Mv3iBwIJoc82J1G8aQ/od6/public/ basic") LIMIT 10 | unique(field="title")
    58. 58. Where’s the magic?
    59. 59. Data Tables13 categories & counting, including...• Geo • Social• Flickr • Upcoming• Local • Weather• Maps • Yahoo (Search)• Meme • YMail• Music • YQL (Storage)
    60. 60. Open Data Tables900+ community contributed tables inhundreds of categories, including... Amazon • Netflix• Craigslist • NY Times• Facebook • SimpleGeo• Foursquare • SPARQL• Google • Twitter• HackerNews • Wordpress• LastFM • YouTube
    61. 61. google.search
    62. 62. google.searchhttp://www.datatables.org/google/google.search.xml
    63. 63. google.searchhttp://www.datatables.org/google/google.search.xml twitter.user.timeline
    64. 64. google.search http://www.datatables.org/google/google.search.xml twitter.user.timelinehttp://www.datatables.org/twitter/twitter.user.timeline.xml
    65. 65. google.search http://www.datatables.org/google/google.search.xml twitter.user.timelinehttp://www.datatables.org/twitter/twitter.user.timeline.xml foursquare.history
    66. 66. google.search http://www.datatables.org/google/google.search.xml twitter.user.timelinehttp://www.datatables.org/twitter/twitter.user.timeline.xml foursquare.history http://www.datatables.org/foursquare/history.xml
    67. 67. http://www.datatables.org/craigslist/craigslist.search.xml
    68. 68. http://www.datatables.org/craigslist/craigslist.search.xml
    69. 69. http://www.datatables.org/craigslist/craigslist.search.xml
    70. 70. http://www.datatables.org/craigslist/craigslist.search.xml
    71. 71. &query={query}http://www.datatables.org/craigslist/craigslist.search.xml
    72. 72. YQL != Voodoo MagicIt is just rewriting a YQLquery into one (or many) HTTP calls for you.
    73. 73. USE "http://www.datatables.org/nyt/nyt.bestsellers.xml"AS nyt.bestsellers;USE "https://github.com/gcb/yql.opentable/raw/master/text.concat.xml"AS text.concat;SELECT text FROM text.concat WHERE text.key1 = "http://www.amazon.com/dp/" AND (text.key2) IN ( SELECT isbns.isbn.isbn10 FROM nyt.bestsellers WHERE apikey=yourAPIKey );// Generates strings like “http://www.amazon.com/dp/031603617X”
    74. 74. USE "https://github.com/gcb/yql.opentable/raw/master/text.concat.xml"AS text.concat;
    75. 75. https://github.com/gcb/yql.opentable/raw/master/text.concat.xml
    76. 76. https://github.com/gcb/yql.opentable/raw/master/text.concat.xml
    77. 77. <execute>• Execute arbitrary JavaScript in Rhino (a JS engine)• E4X Support (XML literals in JS)• Speak protocols and handle authentication; Basic auth, OAuth, XAuth, XMLRPC, ...• Best feature? View-source!
    78. 78. Summary• YQL is very useful for... • Scraping • Creating an API where one doesn’t exist • Converting XML -> JSON, & vice-versa • JSONP for JS-only apps • Many HTTP requests -> single HTTP request • Server-side JS processing
    79. 79. NY Times Data Tables• nyt.article.search • nyt.newswire• nyt.bestsellers • nyt.people.activities• nyt.bestsellers.history • nyt.people.followers• nyt.bestsellers.search • nyt.people.following• nyt.movies.critics • nyt.people.newsfeed• nyt.movies.picks • nyt.people.profiles• nyt.movies.reviews • nyt.people.users
    80. 80. Get started @ http://developer.yahoo.com/yql Thanks!Questions? Find, Tweet, or Email me. @derek or drg@yahoo-inc.com
    1. A particular slide catching your eye?

      Clipping is a handy way to collect important slides you want to go back to later.

    ×