HACKING ON
STEROIDS WITH YQL
Saurabh Sahni
YDN Product Guy & Hacker
Twitter: @saurabhsahni
Hacking together
systems in 24 hours is
a lot of fun.
Hacks = Data manipulation + Data visualization
The web has a lot of data around
ProgrammableWeb.Com – 6831 APIs
Yahoo! has opened
   up its data
http://developer.yahoo.com/everything.html
THE TROUBLE WITH DATA
 •  You need to find a data API
 •  Get access: sign up for a key
 •  Find the data endpoint
 •  Read the docs to learn which parameters you
    have
 •  Get the data in an obscure format
 •  Use the data only after converting and filtering it
 •  The more APIs you use, the greater your
    annoyance
To make data
access easy on the
web, Yahoo!
created YQL
YQL turns web
services and data on
the web into
databases.
select {what} from {where}
    where {conditions}
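The template above can be sketched as a small string builder. This is only an illustration: `build_yql` is a hypothetical helper, not part of any Yahoo! SDK, and it does no escaping of the values you pass in.

```python
def build_yql(what, where, conditions=None, limit=None):
    """Assemble a YQL statement from the three parts of the template."""
    statement = f"select {what} from {where}"
    if conditions:
        # The caller supplies the condition text, quotes included.
        statement += f" where {conditions}"
    if limit is not None:
        statement += f" limit {int(limit)}"
    return statement


print(build_yql("*", "youtube.search", "query='bangalore'"))
print(build_yql("*", "youtube.search", "query='bangalore'", limit=5))
```

Keeping the three parts separate makes it easy to swap the table or the filter while leaving the rest of the statement alone.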
You can select, filter,
sort and limit
You can even insert,
update and delete
from it.
FINDING VIDEOS ABOUT BANGALORE




SELECT * FROM youtube.search where
query='bangalore'
SELECTING PHOTOS OF HACKDAY




SELECT * FROM flickr.photos.search where
text="hackday" and api_key="b5a60b2a…"
INSERTING DATA




INSERT INTO bitly.shorten (login, apiKey, longUrl)
VALUES ('ME', 'API_KEY', 'http://yahoo.com')
UPDATING DATA



 UPDATE social.profile.status
 SET status="Using YQL UPDATE"
 WHERE guid="NJFIDHVPVVISDX7UKED2WHU"
RETRIEVING MY CONTACTS




SELECT * FROM social.contacts WHERE
guid=me
ACCESSING PRIVATE DATA
        http://query.yahooapis.com/v1/yql

Uses OAuth 1.0 for authorization

OAuth is complicated – use one of our SDKs at
https://github.com/yahoo
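To give a feel for why the slide calls OAuth complicated, here is a rough sketch of the generic OAuth 1.0 HMAC-SHA1 signing step as described in RFC 5849. It is not taken from Yahoo!'s SDKs. The key, token, secrets, nonce and timestamp are all hypothetical placeholders, and a real client also has to obtain the token and attach the signature to the request.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode everything except RFC 3986 unreserved characters."""
    return quote(str(value), safe="~")


def signature_base_string(method, url, params):
    """Build METHOD&url&sorted-params, each part percent-encoded."""
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), pct(url), pct(normalized)])


def sign_hmac_sha1(base_string, consumer_secret, token_secret=""):
    """Sign the base string with the two secrets joined by '&'."""
    key = f"{pct(consumer_secret)}&{pct(token_secret)}"
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


# Every credential below is a made-up placeholder.
params = {
    "q": "select * from social.contacts where guid=me",
    "format": "json",
    "oauth_consumer_key": "HYPOTHETICAL_KEY",
    "oauth_token": "HYPOTHETICAL_TOKEN",
    "oauth_nonce": "abc123",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1340000000",
    "oauth_version": "1.0",
}
base = signature_base_string("GET", "http://query.yahooapis.com/v1/yql", params)
signature = sign_hmac_sha1(base, "HYPOTHETICAL_SECRET", "HYPOTHETICAL_TOKEN_SECRET")
print(signature)
```

Even this fragment needs careful encoding and ordering, which is a good argument for using an SDK instead.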
You can also mix and
match several web
services using the in()
command.
select * from search.termextract
where context in (select
description from rss where
url='http://rss.news.yahoo.com/rss/topstories')
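The nesting in the query above can be sketched as plain string composition: the inner select is written first, then wrapped in in() and used as the condition of the outer select. The helper names `select` and `in_subquery` are hypothetical and exist only for this illustration.

```python
def select(what, table, conditions=""):
    """Return a YQL select statement as a string."""
    statement = f"select {what} from {table}"
    if conditions:
        statement += f" where {conditions}"
    return statement


def in_subquery(field, subquery):
    """Feed the rows of one query into a field of another."""
    return f"{field} in ({subquery})"


# Inner query: descriptions of the top stories feed.
inner = select("description", "rss",
               "url='http://rss.news.yahoo.com/rss/topstories'")
# Outer query: run term extraction over each description.
outer = select("*", "search.termextract", in_subquery("context", inner))
print(outer)
```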
Almost all the top
APIs on the web are
accessible from YQL
Some of them
amazon                   foursquare   peerindex
apple                    geo          salesforce
bbc                      github       slideshare
bible                    google       themoviedb
boss                     hackernews   tumblr
campfire                 ign          twitter
contentanalysis          intuit       vimeo
craigslist               kiva         weather
delicious                klout        yahoo
dopplr                   lastfm       youtube
etsy                     netflix      zillow
facebook                 paypal
You want even
more?
Alright, how about this?


   atom           json
   csv            microformats
   feed           rss
   html           xml
The easiest way to
start with YQL is to
use the console
http://developer.yahoo.com/yql/console
YQL: http://developer.yahoo.com/yql/console
How to get this data
in your app?
YQL is a REST API
in itself and has two
endpoints
The public endpoint does not need
any authentication.

http://query.yahooapis.com/v1/public/yql?q={query}&format={format}
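Filling in {query} and {format} mostly comes down to URL-encoding. A minimal sketch, using only the Python standard library: `public_yql_url` is a hypothetical helper name, and the code only builds the URL; it does not send a request.

```python
from urllib.parse import urlencode

PUBLIC_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def public_yql_url(query, fmt="json"):
    """Return the public endpoint URL for a YQL statement."""
    # urlencode escapes spaces, quotes, '=' and '*' inside the statement.
    return PUBLIC_ENDPOINT + "?" + urlencode({"q": query, "format": fmt})


url = public_yql_url("select * from youtube.search where query='bangalore'")
print(url)
```

The resulting URL can be pasted into a browser or handed to any HTTP client.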
The private endpoint needs OAuth
authentication.

http://query.yahooapis.com/v1/yql?q={query}&format={format}
Output formats are XML
or JSON
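Once a JSON response comes back, pulling out the rows might look like the sketch below. The envelope shape (a top-level "query" object holding "count" and "results") is an assumption about the response format, and the sample body is hand-written for illustration rather than captured from the service.

```python
import json

# Made-up sample body; the "item" and "title" names are illustrative only.
SAMPLE = """
{"query": {"count": 2,
           "results": {"item": [{"title": "First"}, {"title": "Second"}]}}}
"""


def extract_results(body):
    """Return (count, results) from a JSON response body."""
    query = json.loads(body).get("query", {})
    # An empty result set is assumed to arrive as null, so fall back to {}.
    return query.get("count", 0), query.get("results") or {}


count, results = extract_results(SAMPLE)
titles = [item["title"] for item in results.get("item", [])]
print(count, titles)
```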
LET’S SEE IT
QUERY EXAMPLES



select * from yahoo.finance.quotes
where symbol in ("^IXIC","^DJI","YHOO","AAPL")
QUERY EXAMPLES


select * from weather.bylocation
where location in ("bangalore, in", "new york, us")
QUERY EXAMPLES
Find hackday tweets:
SELECT * FROM twitter.search where q='hackday'

Search Yahoo! Answers for resolved questions about cars:
select * from answers.search where query="cars" and type="resolved"


Find distance between Bangalore and Mumbai:
select * from geo.distance where place1="bangalore" and
place2="mumbai"


Extract important terms from top stories on Yahoo! News:
select * from search.termextract where context in (select description
from rss where url='http://rss.news.yahoo.com/rss/topstories')
QUERY EXAMPLES

Get the Olympic medal list:
select * from html where url='http://sports.yahoo.com/olympics/medals.html'
and xpath='//*[@id="mediasportsoverallmedalcount"]/div[2]/table/tbody/tr/td/a'

Shorten a URL:
insert into yahoo.y.ahoo.it (url, keysize) values ('http://www.javarants.com', 5)

Search apartments on Craigslist:
select * from craigslist.search where location="bangalore" and
type="apa" and query="indiranagar"
QUERY EXAMPLES

Scrape news from Yahoo! Finance:
select * from html where url="http://finance.yahoo.com/q?s=yhoo"
and xpath='//div[@id="yfi_headlines"]/div[2]/ul/li/a'


Select and filter data from Google Spreadsheets:
select * from csv where
url="https://spreadsheets.google.com/pub?key=0ArYndzim-lbrdF8wc3A5QWl1ZGRpdkxRZk80SU9zUXc&output=csv"
and col5 like 'Bangalore%';
Let’s find hackday
photos on Flickr
How about limiting
to those taken in
Bangalore?
MAKING REQUESTS: FLICKR URLS

 <photo farm="3"
        id="5708163920"
        isfamily="0"
        isfriend="0"
        ispublic="1"
        owner="31832337@N04"
        secret="0075137487"
        server="2496"
        title="San Francisco"/>
MAKING REQUESTS: FLICKR URLS

 Photo URL
 http://farm{$farm}.static.flickr.com/{$server}/{$id}_{$secret}.jpg

 Photo Page URL
 http://www.flickr.com/photos/{$owner}/{$id}

 Photo Owner Profile URL
 http://www.flickr.com/photos/{$owner}
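The three URL patterns can be filled in directly from the attributes of a <photo> element. A small sketch using the sample element from the previous slide; `flickr_urls` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The <photo> element shown on the earlier slide.
PHOTO_XML = """<photo farm="3" id="5708163920" isfamily="0" isfriend="0"
    ispublic="1" owner="31832337@N04" secret="0075137487" server="2496"
    title="San Francisco"/>"""


def flickr_urls(photo):
    """Build the image, page and owner URLs from a <photo> element."""
    a = photo.attrib
    return {
        "photo": f"http://farm{a['farm']}.static.flickr.com/"
                 f"{a['server']}/{a['id']}_{a['secret']}.jpg",
        "page": f"http://www.flickr.com/photos/{a['owner']}/{a['id']}",
        "owner": f"http://www.flickr.com/photos/{a['owner']}",
    }


urls = flickr_urls(ET.fromstring(PHOTO_XML))
for name, value in urls.items():
    print(name, value)
```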
https://github.com/saurabhsahni/Hacks/
Finding Recent Photos from Flickr
Some YQL Hacks
ChromYQLip is a
Chrome extension for
page scraping via YQL
Open Hack Bangalore 2010 Winner
http://bit.ly/chromeYQL
VIDEO CLIP
http://www.webmeme.in
WEBMEME.IN

Fetch multiple feeds in different formats, such as Atom and RSS, and
transform them into a consistent RSS format:
select * from rss where url in ('http://feeds.feedburner.com/pluggd',
'http://quatrainman.blogspot.com/atom.xml', '…')


Filter news containing “india” from multiple feeds:
select * from rss where url in ('http://feeds.feedburner.com/TechCrunch',
'http://www.readwriteweb.com/rss.xml', 'http://gigaom.com/feed/')
and description like '%india%'
YQL is open – you
can get your data
tables in our system
All you need to do is
write an XML
schema and put it
on Github.
http://github.com/yql/yql-tables
Here is the craigslist
search table
https://github.com/yql/yql-tables/tree/master/craigslist/craigslist.search.xml
USE INSTANTLY BY UPLOADING TO YOUR
SITE



  USE 'http://www.mysite.com/my_table.xml'
  AS mytable;
  SELECT * FROM mytable
  WHERE user='saurabh'
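Because the USE statement and the SELECT travel together as one query, the whole text, semicolon included, has to be URL-encoded before it goes into the q parameter. A minimal sketch; `with_custom_table` is a hypothetical helper name.

```python
from urllib.parse import quote, unquote


def with_custom_table(table_url, alias, statement):
    """Prefix a statement with a USE clause for a self-hosted table."""
    return f"USE '{table_url}' AS {alias}; {statement}"


query = with_custom_table(
    "http://www.mysite.com/my_table.xml",
    "mytable",
    "SELECT * FROM mytable WHERE user='saurabh'",
)
# Encode every reserved character so the text fits in a single q value.
encoded = quote(query, safe="")
print(query)
print(encoded)
```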
You can even write
server-side
JavaScript to build a
web service or
augment one.
http://developer.yahoo.com/yql/guide/yql-execute-chapter.html
There are a lot of
things you can do
with YQL.
Play with it yourself
http://developer.yahoo.com/yql/
One more thing
RESOURCES
All Yahoo! APIs and Services
http://developer.yahoo.com/everything.html

YQL Documentation
http://developer.yahoo.com/yql

YQL Console
http://developer.yahoo.com/yql/console

YQL Github Account (Contribute Tables)
http://github.com/yql/yql-tables
THANKS!

http://www.slideshare.net/
saurabhsahni




 Saurabh Sahni

 Twitter: @saurabhsahni
 Github: http://github.com/saurabhsahni
 Web: http://www.saurabhsahni.com
YQL: Hacking on steroids - Yahoo! Open Hack Day 2012