Creating Batch Code to Extract Ages on Multiple Yahoo Accounts
We would like to find the creation dates of multiple Yahoo email accounts at the same time, rather than
manually entering a separate query into the YQL console for each email address.
First, we want to be able to combine two different YQL queries into a single call.
Eventually, we would like to put a batch or list of Yahoo email addresses into a YQL query/request and
then analyze the output we get. In the YQL query output, we'd like the email address of the user to
appear next to, or close to, the creation date of the email account and related user information.
Example 1: We combine two different YQL requests into a single call by joining data across two
different YQL tables (via sub-selects) by entering the following query into the YQL console
(https://developer.yahoo.com/yql/console/):
select created from social.profile where guid in (select guid from yahoo.identity
where yid = 'mark.yashar@yahoo.com')
and we obtain the following output:
{
"query": {
"count": 1,
"created": "2016-04-07T23:08:32Z",
"lang": "en-US",
"diagnostics": {
"publiclyCallable": "true",
"url": [
{
"execution-start-time": "1",
"execution-stop-time": "62",
"execution-time": "61",
"content":
"http://profiles.yahoo.com/v2/identities.handle(mark.yashar%40yahoo.com~yid)"
},
{
"execution-start-time": "63",
"execution-stop-time": "71",
"execution-time": "8",
"content":
"http://social.yahooapis.com/progrss/v1/users.guid(QCYF27AAV7MC7RL44KZ4EMX7FY
)/profile?format=json&.imgssl=1"
}
],
"user-time": "71",
"service-time": "69",
"build-version": "0.2.998"
},
"results": {
"profile": {
"created": "2016-03-14T23:21:29Z"
}
}
}
}
The relevant output is the "created" value under "results" (highlighted in yellow), not the "created"
timestamp near the top of the response: the Yahoo email account was created on March 14, 2016.
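The response in Example 1 contains two "created" values: one directly under "query" and one under "results" > "profile", which is the account creation date. A minimal Python sketch of reading the second one, assuming the console's JSON output has been saved as a string; the function name `account_created` and the trimmed sample are illustrative only and are not part of any Yahoo API:

```python
import json

# Trimmed copy of the Example 1 response (diagnostics omitted).
response_text = """
{
  "query": {
    "count": 1,
    "created": "2016-04-07T23:08:32Z",
    "lang": "en-US",
    "results": {"profile": {"created": "2016-03-14T23:21:29Z"}}
  }
}
"""


def account_created(text):
    """Return results.profile.created, or None if no profile came back."""
    query = json.loads(text)["query"]
    results = query.get("results") or {}
    profile = results.get("profile") or {}
    return profile.get("created")


print(account_created(response_text))
```

Reading the value through "results" rather than searching for the first "created" key avoids picking up the timestamp under "query" by mistake.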
Example 2: For this example, we’re interested in finding the creation dates for two emails
simultaneously. We can use the ‘yql.query.multi’ YQL table to combine different YQL queries by, for
example, entering the following into the YQL console:
select * from yql.query.multi where queries="select created from social.profile
where guid in (select guid from yahoo.identity where yid =
'mark.yashar@yahoo.com'); select created from social.profile where guid in
(select guid from yahoo.identity where yid = 'mrsglobs@yahoo.com')"
and we get the following output:
{
"query": {
"count": 2,
"created": "2016-04-07T22:29:25Z",
"lang": "en-US",
"diagnostics": {
"publiclyCallable": "false",
"url": [
{
"execution-start-time": "5",
"execution-stop-time": "70",
"execution-time": "65",
"content":
"http://profiles.yahoo.com/v2/identities.handle(mark.yashar%40yahoo.com~yid)"
},
{
"execution-start-time": "5",
"execution-stop-time": "77",
"execution-time": "72",
"content":
"http://profiles.yahoo.com/v2/identities.handle(mrsglobs%40yahoo.com~yid)"
},
{
"execution-start-time": "71",
"execution-stop-time": "80",
"execution-time": "9",
"content":
"http://social.yahooapis.com/progrss/v1/users.guid(QCYF27AAV7MC7RL44KZ4EMX7FY
)/profile?format=json&.imgssl=1"
},
{
"execution-start-time": "78",
"execution-stop-time": "85",
"execution-time": "7",
"content":
"http://social.yahooapis.com/progrss/v1/users.guid(Q7EJRC36XZ7Z2EYEK3OVFITR6E
)/profile?format=json&.imgssl=1"
}
],
"query": [
{
"execution-start-time": "2",
"execution-stop-time": "81",
"execution-time": "79",
"content": "select created from social.profile where guid in (select
guid from yahoo.identity where yid = 'mark.yashar@yahoo.com')"
},
{
"execution-start-time": "2",
"execution-stop-time": "85",
"execution-time": "83",
"content": " select created from social.profile where guid in (select
guid from yahoo.identity where yid = 'mrsglobs@yahoo.com')"
}
],
"user-time": "87",
"service-time": "153",
"build-version": "0.2.998"
},
"meta": {
"meta": [
null,
null
]
},
"results": {
"results": [
{
"profile": {
"created": "2016-03-14T23:21:29Z"
}
},
{
"profile": {
"created": "2009-09-10T19:17:37Z"
}
}
]
}
}
}
The results that we’re interested in, the creation dates of the two email accounts, are the two
"created" values under "results" (highlighted in yellow): the ‘mark.yashar@yahoo.com’ account was
created on March 14, 2016, and the ‘mrsglobs@yahoo.com’ account was created on September 10, 2009.
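Typing one sub-select per address by hand becomes error-prone as the list grows. As a rough sketch, the ‘yql.query.multi’ query string from Example 2 could be generated from a list of addresses in Python. The helper names `profile_query` and `multi_query` are hypothetical, and the sketch assumes the addresses contain no quote characters:

```python
def profile_query(email, fields="created"):
    """Build the sub-select query for one Yahoo ID, as in Example 1."""
    if "'" in email or '"' in email:
        # Refuse input that would break the quoting of the YQL string.
        raise ValueError("address contains a quote character: %r" % email)
    return ("select {0} from social.profile where guid in "
            "(select guid from yahoo.identity where yid = '{1}')"
            ).format(fields, email)


def multi_query(emails):
    """Wrap one sub-select per address in a single yql.query.multi call."""
    inner = "; ".join(profile_query(email) for email in emails)
    return 'select * from yql.query.multi where queries="{0}"'.format(inner)


print(multi_query(["mark.yashar@yahoo.com", "mrsglobs@yahoo.com"]))
```

The printed string can then be pasted into the YQL console.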
Example #3: Extending Example #2 to more email addresses and removing the diagnostic and debugging
output in the YQL console, we have:
select * from yql.query.multi where queries="select created from social.profile where guid in (select
guid from yahoo.identity where yid = 'mark.yashar@yahoo.com'); select created from social.profile
where guid in (select guid from yahoo.identity where yid = 'mrsglobs@yahoo.com'); select created from
social.profile where guid in (select guid from yahoo.identity where yid = 'cclaudio1357@yahoo.com');
select created from social.profile where guid in (select guid from yahoo.identity where yid =
'minnie_z_1999@yahoo.com'); select created from social.profile where guid in (select guid from
yahoo.identity where yid = 'jperez_10040@yahoo.com'); select created from social.profile where guid in
(select guid from yahoo.identity where yid = 'chiennguyen_1981@yahoo.com.vn'); select created from
social.profile where guid in (select guid from yahoo.identity where yid = 'elabiner@yahoo.com'); select
created from social.profile where guid in (select guid from yahoo.identity where yid =
'jigpatel@yahoo.com')"
and we get the following output:
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="8" yahoo:created="2016-04-10T09:40:35Z" yahoo:lang="en-US">
<meta>
<meta/>
<meta/>
<meta/>
<meta/>
<meta/>
<meta/>
<meta/>
<meta/>
</meta>
<results>
<results>
<profile>
<created>2016-03-14T23:21:29Z</created>
</profile>
</results>
<results>
<profile>
<created>2009-09-10T19:17:37Z</created>
</profile>
</results>
<results>
<profile>
<created>2012-09-23T23:01:41Z</created>
</profile>
</results>
<results>
<profile>
<created>2008-10-09T00:01:48Z</created>
</profile>
</results>
<results/>
<results>
<profile>
<created>2008-12-21T01:32:28Z</created>
</profile>
</results>
<results>
<profile>
<created>2009-12-10T01:38:19Z</created>
</profile>
</results>
<results/>
</results>
</query>
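In the ‘yql.query.multi’ output, each sub-query gets its own inner `<results>` element, and a sub-query that returns nothing shows up as an empty `<results/>`. The sketch below pairs each inner element with the address at the same position in the query. It assumes the elements come back in query order, which is how Example 2 was read above but is not otherwise confirmed here. The sample is a shortened, illustrative response in the shape of Example 3, and `created_by_email` is a hypothetical helper name:

```python
import xml.etree.ElementTree as ET

# Shortened response in the shape of Example 3: three sub-queries, the
# second of which returned no profile.
xml_text = """<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
 yahoo:count="3" yahoo:lang="en-US">
<results>
<results><profile><created>2016-03-14T23:21:29Z</created></profile></results>
<results/>
<results><profile><created>2008-12-21T01:32:28Z</created></profile></results>
</results>
</query>"""

emails = [
    "mark.yashar@yahoo.com",
    "jperez_10040@yahoo.com",
    "chiennguyen_1981@yahoo.com.vn",
]


def created_by_email(text, emails):
    """Map each address to its <created> value (None for an empty result)."""
    root = ET.fromstring(text)
    blocks = root.find("results").findall("results")
    if len(blocks) != len(emails):
        raise ValueError("number of result blocks differs from number of queries")
    # Assumption: the i-th result block belongs to the i-th sub-query.
    return {email: block.findtext("profile/created")
            for email, block in zip(emails, blocks)}


for email, created in created_by_email(xml_text, emails).items():
    print(email, created)
```

The length check guards against silently shifting every date by one row if a result block is missing.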
Example #4: Running essentially the same query with the same results as in Example #3, but writing the
query in a more compact form, we have:
select * from yql.query.multi where queries="select emails, created,
memberSince, familyName, gender, givenName, nickname from social.profile
where guid in (select guid from yahoo.identity where yid in
('mark.yashar@yahoo.com','jigpatel@yahoo.com','mrsglobs@yahoo.com',
'cclaudio1357@yahoo.com','minnie_z_1999@yahoo.com',
'chiennguyen_1981@yahoo.com.vn', 'elabiner@yahoo.com'))"
with the following output:
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="1" yahoo:created="2016-04-15T22:18:07Z" yahoo:lang="en-US">
<meta>
dio1357@yahoo.com','minnie_z_1999@yahoo.com',
'chiennguyen_1981@yahoo.com.vn', 'elabiner@yahoo.com'))
with the following output:
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="7" yahoo:created="2016-04-15T22:35:26Z" yahoo:lang="en-US">
<results>
<profile>
<created>2010-01-16T17:45:10Z</created>
<memberSince>1997-08-19T22:30:15Z</memberSince>
<nickname>Jigar</nickname>
</profile>
<profile>
<created>2009-12-10T01:38:19Z</created>
<memberSince>2005-08-07T16:04:04Z</memberSince>
<nickname>Ellen</nickname>
</profile>
<profile>
<created>2012-09-23T23:01:41Z</created>
<memberSince>2012-09-23T23:01:20Z</memberSince>
<nickname>Carlos</nickname>
</profile>
<profile>
<created>2016-03-14T23:21:29Z</created>
<memberSince>2016-03-14T23:21:28Z</memberSince>
<nickname>Mark</nickname>
</profile>
<profile>
<created>2008-12-21T01:32:28Z</created>
<familyName>nguyenmangchien</familyName>
<givenName>chien</givenName>
<memberSince>2006-10-09T09:13:10Z</memberSince>
<nickname>chiennguyen</nickname>
</profile>
<profile>
<created>2009-09-10T19:17:37Z</created>
<memberSince>2006-10-23T03:52:22Z</memberSince>
<nickname>Tina</nickname>
</profile>
<profile>
<created>2008-10-09T00:01:48Z</created>
<memberSince>2000-04-22T21:48:38Z</memberSince>
<nickname>Minnie</nickname>
</profile>
</results>
</query>
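The compact form puts all addresses into a single `yid in (...)` list, where one missing quote mark is enough to break the query. A small Python sketch of generating that list, assuming addresses without quote characters; `batch_profile_query` is a hypothetical helper name and the default field list is only an example:

```python
def batch_profile_query(emails, fields=("created", "memberSince", "nickname")):
    """Build one social.profile query covering every address in `emails`."""
    for email in emails:
        if "'" in email or '"' in email:
            # Refuse input that would break the quoting of the yid list.
            raise ValueError("address contains a quote character: %r" % email)
    yid_list = ",".join("'{0}'".format(email) for email in emails)
    return ("select {0} from social.profile where guid in "
            "(select guid from yahoo.identity where yid in ({1}))"
            ).format(", ".join(fields), yid_list)


print(batch_profile_query(["mark.yashar@yahoo.com", "jigpatel@yahoo.com"]))
```

Because every address is quoted by the same line of code, the quoting is at least consistent across the whole list.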
Example #6: The YQL query output can be expanded by checking the “Diagnostics” box in the console. Doing that
will generate additional material (including email addresses) at the beginning of the query output, such as in the
following example:
<diagnostics>
<publiclyCallable>true</publiclyCallable>
<url execution-start-time="2" execution-stop-time="64" execution-time="62"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(mark.yashar%40yahoo.com~yid)]]></url>
<url execution-start-time="3" execution-stop-time="65" execution-
time="62"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(minnie_z_19
99%40yahoo.com~yid)]]></url>
<url execution-start-time="3" execution-stop-time="66" execution-
time="63"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(jigpatel%40
yahoo.com~yid)]]></url>
<url execution-start-time="3" execution-stop-time="68" execution-
time="65"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(mrsglobs%40
yahoo.com~yid)]]></url>
<url execution-start-time="3" execution-stop-time="68" execution-
time="65"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(cclaudio135
7%40yahoo.com~yid)]]></url>
<url execution-start-time="64" execution-stop-time="127" execution-
time="63"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(chiennguyen
_1981%40yahoo.com.vn~yid)]]></url>
<url execution-start-time="66" execution-stop-time="129" execution-
time="63"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(elabiner%40
yahoo.com~yid)]]></url>
<url execution-start-time="130" execution-stop-time="145" execution-
time="15"><![CDATA[http://social.yahooapis.com/progrss/v1/users.guid(QCYF27AA
V7MC7RL44KZ4EMX7FY,3VSVJ6XZX4NJWUUFFHJDXBIQBI,Q7EJRC36XZ7Z2EYEK3OVFITR6E,VRPA
RSKJIKEFNIHKXY5QK6L3LU,3VT2ZEHSCWNLWMRNMWC25IF4HY,ND2YVRT23FJQWZW5FE33OJVN6I,
BHHEKTXSUVTJXZSVLBYXU7WISA)/profile?format=json&.imgssl=1]]></url>
<user-time>148</user-time>
<service-time>458</service-time>
<build-version>0.2.998</build-version>
</diagnostics>
However, the email addresses that are displayed here match neither the order in which they were listed
in the YQL query nor the order of the displayed query output.
Example #7: For the most detailed and comprehensive information for a given Yahoo user (in addition to
information regarding ‘created’, ‘memberSince’, ‘nickname’, etc.) you can use a YQL query that selects
on the wildcard ‘*’ and make sure that the ‘Diagnostics’ box is checked just below the console, e.g.,
select * from social.profile where guid in (select guid from yahoo.identity where yid in
('mark.yashar@yahoo.com','jigpatel@yahoo.com','mrsglobs@yahoo.com',
'cclaudio1357@yahoo.com','minnie_z_1999@yahoo.com', 'chiennguyen_1981@yahoo.com.vn',
'elabiner@yahoo.com'))
which creates the following more comprehensive and detailed output:
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="7" yahoo:created="2016-05-05T19:05:43Z" yahoo:lang="en-US">
<diagnostics>
<publiclyCallable>true</publiclyCallable>
<url execution-start-time="1" execution-stop-time="64" execution-
time="63"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(mark.yashar
%40yahoo.com~yid)]]></url>
<url execution-start-time="1" execution-stop-time="65" execution-time="64"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(jigpatel%40yahoo.com~yid)]]></url>
<url execution-start-time="2" execution-stop-time="66" execution-
time="64"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(mrsglobs%40
yahoo.com~yid)]]></url>
<url execution-start-time="2" execution-stop-time="66" execution-
time="64"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(cclaudio135
7%40yahoo.com~yid)]]></url>
<url execution-start-time="2" execution-stop-time="68" execution-
time="66"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(minnie_z_19
99%40yahoo.com~yid)]]></url>
<url execution-start-time="65" execution-stop-time="126" execution-
time="61"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(elabiner%40
yahoo.com~yid)]]></url>
<url execution-start-time="65" execution-stop-time="127" execution-
time="62"><![CDATA[http://profiles.yahoo.com/v2/identities.handle(chiennguyen
_1981%40yahoo.com.vn~yid)]]></url>
<url execution-start-time="127" execution-stop-time="144" execution-
time="17"><![CDATA[http://social.yahooapis.com/progrss/v1/users.guid(QCYF27AA
V7MC7RL44KZ4EMX7FY,3VSVJ6XZX4NJWUUFFHJDXBIQBI,Q7EJRC36XZ7Z2EYEK3OVFITR6E,VRPA
RSKJIKEFNIHKXY5QK6L3LU,3VT2ZEHSCWNLWMRNMWC25IF4HY,ND2YVRT23FJQWZW5FE33OJVN6I,
BHHEKTXSUVTJXZSVLBYXU7WISA)/profile?format=json&.imgssl=1]]></url>
<user-time>151</user-time>
<service-time>461</service-time>
<build-version>0.2.998</build-version>
</diagnostics>
<results>
<profile>
<guid>3VSVJ6XZX4NJWUUFFHJDXBIQBI</guid>
<ageCategory>A</ageCategory>
<created>2010-01-16T17:45:10Z</created>
<image>
<height>192</height>
<imageUrl>https://s.yimg.com/dh/ap/social/profile/profile_b192.png</imageUrl>
<size>192x192</size>
<width>192</width>
</image>
<lang>en-US</lang>
<location>New York, New York</location>
<memberSince>1997-08-19T22:30:15Z</memberSince>
<nickname>Jigar</nickname>
<profileUrl>http://profile.yahoo.com/3VSVJ6XZX4NJWUUFFHJDXBIQBI</profileUrl>
<isConnected>false</isConnected>
<bdRestricted>true</bdRestricted>
</profile>
<profile>
<guid>BHHEKTXSUVTJXZSVLBYXU7WISA</guid>
<ageCategory>A</ageCategory>
<created>2009-12-10T01:38:19Z</created>
<image>
<height>192</height>
<imageUrl>https://s.yimg.com/dh/ap/social/profile/profile_b192.png</imageUrl>
<size>192x192</size>
<width>192</width>
</image>
<lang>en-US</lang>
<location>New York, New York</location>
<memberSince>2005-08-07T16:04:04Z</memberSince>
<nickname>Ellen</nickname>
<profileUrl>http://profile.yahoo.com/BHHEKTXSUVTJXZSVLBYXU7WISA</profileUrl>
<isConnected>false</isConnected>
</profile>
<profile>
<guid>VRPARSKJIKEFNIHKXY5QK6L3LU</guid>
<ageCategory>A</ageCategory>
<created>2012-09-23T23:01:41Z</created>
<image>
<height>192</height>
<imageUrl>https://s.yimg.com/dh/ap/social/profile/profile_b192.png</imageUrl>
<size>192x192</size>
<width>192</width>
</image>
<lang>en-US</lang>
<memberSince>2012-09-23T23:01:20Z</memberSince>
<nickname>Carlos</nickname>
<profileUrl>http://profile.yahoo.com/VRPARSKJIKEFNIHKXY5QK6L3LU</profileUrl>
<isConnected>false</isConnected>
</profile>
<profile>
<guid>QCYF27AAV7MC7RL44KZ4EMX7FY</guid>
<ageCategory>A</ageCategory>
<created>2016-03-14T23:21:29Z</created>
<image>
<height>192</height>
<imageUrl>https://s.yimg.com/dh/ap/social/profile/profile_b192.png</imageUrl>
<size>192x192</size>
<width>192</width>
</image>
<lang>en-US</lang>
<memberSince>2016-03-14T23:21:28Z</memberSince>
<nickname>Mark</nickname>
<profileUrl>http://profile.yahoo.com/QCYF27AAV7MC7RL44KZ4EMX7FY</profileUrl>
<isConnected>false</isConnected>
<bdRestricted>true</bdRestricted>
</profile>
<profile>
<guid>ND2YVRT23FJQWZW5FE33OJVN6I</guid>
<ageCategory>A</ageCategory>
<created>2008-12-21T01:32:28Z</created>
<familyName>nguyenmangchien</familyName>
<status/>
<givenName>chien</givenName>
<image>
<height>192</height>
Unresolved Issues
We’d like to output the user’s email address along with the creation date of the email account and other
info by issuing a YQL query within the YQL Console, but I haven’t figured out how to do this. One might
be able to do this if there were a YQL print or echo command like in SQL, e.g.,
mysql> SELECT 'some text' as ''
but no such command seems to exist within YQL, and a command like this results in a syntax error.
Another approach to explore would be to write an external script (application) that would enable one to
run the YQL query outside of the console, using Python, PHP, or JavaScript scripts that utilize YQL queries
within them. However, this may present authentication issues, problems, and challenges for Python
code (for example), which apparently presents Yahoo data privacy and security problems and
Additional work, July 2016
I’ve developed an approach using some Python tools (e.g., pandas) in combination with EXCEL to take
the JSON output generated from YQL email queries in the YQL console and convert it to normalized,
tabular data output to an EXCEL spreadsheet. This approach may reduce, but not eliminate, some of the
manual/brute force work that might otherwise be needed to create an (EXCEL spreadsheet) table
containing user Yahoo email addresses along with important info about them such as creation date,
name, location, ‘memberSince’, nickname, etc.
I will demonstrate with an example: Suppose I enter the following YQL email query in the YQL console
to retrieve data about 20 user Yahoo email addresses at the same time:
select created, lang, content, guid, lang, location, memberSince, nickname, familyName, givenName
from social.profile where guid in (select guid from yahoo.identity where yid
in ('mark.yashar@yahoo.com','jigpatel@yahoo.com','mrsglobs@yahoo.com','cclaudio1357@yahoo.com',
'minnie_z_1999@yahoo.com','chiennguyen_1981@yahoo.com.vn','elabiner@yahoo.com',
'holidaysevents@yahoo.com','indyaclayton@yahoo.com','s.wells8@yahoo.com',
'darrettchithra@yahoo.com','terahkuhnen@yahoo.com','wenjing_mo@yahoo.com',
'billlclem123@yahoo.com','egyedveronika@yahoo.com','angelhong320@yahoo.com',
'dungmy88@yahoo.com','manmeetsaini@yahoo.com','chandarmk@yahoo.com',
'jessiejzhu@yahoo.com'))
This produces the following JSON output (including diagnostics/metadata):
{
"query": {
"count": 19,
"created": "2016-07-28T05:46:04Z",
"lang": "en-US",
"diagnostics": {
"publiclyCallable": "true",
"url": [
{
"execution-start-time": "1",
"execution-stop-time": "65",
"execution-time": "64",
"content":
"http://profiles.yahoo.com/v2/identities.handle(mark.yashar%40yahoo.com~yid)"
},
{
"execution-start-time": "3",
"execution-stop-time": "66",
"execution-time": "63",
"content":
"http://profiles.yahoo.com/v2/identities.handle(minnie_z_1999%40yahoo.com~yid
)"
},
{
"execution-start-time": "2",
"execution-stop-time": "67",
"execution-time": "65",
"content":
"http://profiles.yahoo.com/v2/identities.handle(jigpatel%40yahoo.com~yid)"
},
{
"execution-start-time": "3",
"execution-stop-time": "68",
"execution-time": "65",
"content":
"http://profiles.yahoo.com/v2/identities.handle(cclaudio1357%40yahoo.com~yid)
"
},
{
"execution-start-time": "2",
"execution-stop-time": "68",
"execution-time": "66",
"content":
"http://profiles.yahoo.com/v2/identities.handle(mrsglobs%40yahoo.com~yid)"
},
{
"execution-start-time": "73",
"execution-stop-time": "132",
"execution-time": "59",
"content":
"http://profiles.yahoo.com/v2/identities.handle(chiennguyen_1981%40yahoo.com.
vn~yid)"
},
{
"execution-start-time": "74",
"execution-stop-time": "136",
"execution-time": "62",
"content":
"http://profiles.yahoo.com/v2/identities.handle(s.wells8%40yahoo.com~yid)"
},
{
"execution-start-time": "74",
"execution-stop-time": "136",
"execution-time": "62",
"content":
"http://profiles.yahoo.com/v2/identities.handle(holidaysevents%40yahoo.com~yi
d)"
},
{
"execution-start-time": "73",
"execution-stop-time": "136",
"execution-time": "63",
"content":
"http://profiles.yahoo.com/v2/identities.handle(elabiner%40yahoo.com~yid)"
},
{
"execution-start-time": "74",
"execution-stop-time": "138",
"execution-time": "64",
"content":
"http://profiles.yahoo.com/v2/identities.handle(indyaclayton%40yahoo.com~yid)
"
},
{
"execution-start-time": "137",
"execution-stop-time": "192",
"execution-time": "55",
"http-status-code": "404",
"http-status-message": "Not Found",
"content":
"http://profiles.yahoo.com/v2/identities.handle(billlclem123%40yahoo.com~yid)
"
},
{
"execution-start-time": "132",
"execution-stop-time": "195",
"execution-time": "63",
"content":
"http://profiles.yahoo.com/v2/identities.handle(darrettchithra%40yahoo.com~yi
d)"
},
{
"execution-start-time": "136",
"execution-stop-time": "197",
"execution-time": "61",
"content":
"http://profiles.yahoo.com/v2/identities.handle(terahkuhnen%40yahoo.com~yid)"
},
{
"execution-start-time": "136",
"execution-stop-time": "199",
"execution-time": "63",
"content":
"http://profiles.yahoo.com/v2/identities.handle(wenjing_mo%40yahoo.com~yid)"
},
{
"execution-start-time": "139",
"execution-stop-time": "200",
"execution-time": "61",
"content":
"http://profiles.yahoo.com/v2/identities.handle(egyedveronika%40yahoo.com~yid
)"
},
{
"execution-start-time": "193",
"execution-stop-time": "255",
"execution-time": "62",
"content":
"http://profiles.yahoo.com/v2/identities.handle(angelhong320%40yahoo.com~yid)
"
},
{
"execution-start-time": "195",
"execution-stop-time": "258",
"execution-time": "63",
"content":
"http://profiles.yahoo.com/v2/identities.handle(dungmy88%40yahoo.com~yid)"
},
{
"execution-start-time": "197",
"execution-stop-time": "260",
"execution-time": "63",
"content":
"http://profiles.yahoo.com/v2/identities.handle(manmeetsaini%40yahoo.com~yid)
"
},
{
"execution-start-time": "199",
"execution-stop-time": "260",
"execution-time": "61",
"content":
"http://profiles.yahoo.com/v2/identities.handle(chandarmk%40yahoo.com~yid)"
},
{
"execution-start-time": "200",
"execution-stop-time": "262",
"execution-time": "62",
"content":
"http://profiles.yahoo.com/v2/identities.handle(jessiejzhu%40yahoo.com~yid)"
},
{
"execution-start-time": "263",
"execution-stop-time": "276",
"execution-time": "13",
"content":
"http://social.yahooapis.com/progrss/v1/users.guid(OBJHXUVTSWYSEOLJGHPPYTIQOM
,EZ6ZU7NV7YADJAQAL46XEKC4MU,4KQA3LRHJLHQUJRFBBQKOPCGP4,EXQ6RIF7576Z7X2LQNU2ZP
MFOY,6F7OVWYHNLC3NCG4VRVMRJS3X4,A4C5Z3PKCOIWO5DMPBS6A5K3GI,4YN2AAB3CPIQ6BF5U2
PVNJIHQE,FRJNSOTCZGSUUMVFXVJ2OKTLAA,2NJYHULB6E7WNOEW5EHT2YDVSA)/profile?forma
t=json&.imgssl=1"
},
{
"execution-start-time": "263",
"execution-stop-time": "279",
"execution-time": "16",
"content":
"http://social.yahooapis.com/progrss/v1/users.guid(QCYF27AAV7MC7RL44KZ4EMX7FY
,3VSVJ6XZX4NJWUUFFHJDXBIQBI,Q7EJRC36XZ7Z2EYEK3OVFITR6E,VRPARSKJIKEFNIHKXY5QK6
L3LU,3VT2ZEHSCWNLWMRNMWC25IF4HY,ND2YVRT23FJQWZW5FE33OJVN6I,BHHEKTXSUVTJXZSVLB
YXU7WISA,33VH3ODH6VAUFWZUQGR6GLXQEA,ADDPYQSNFWSLK5VDAJPESLP2DU,IQACKBQMRKG7TR
GWV66P4EA3GI)/profile?format=json&.imgssl=1"
}
],
"user-time": "284",
"service-time": "1276",
"build-version": "0.2.998"
},
"results": {
"profile": [
{
"created": "2010-01-16T17:45:10Z",
"lang": "en-US",
"location": "New York, New York",
"memberSince": "1997-08-19T22:30:15Z",
"nickname": "Jigar",
"guid": "3VSVJ6XZX4NJWUUFFHJDXBIQBI"
},
{
"lang": "en-US",
"memberSince": "2004-11-16T22:04:53Z",
"nickname": "Holidays",
"guid": "33VH3ODH6VAUFWZUQGR6GLXQEA"
},
{
"created": "2009-12-10T01:38:19Z",
"lang": "en-US",
"location": "New York, New York",
"memberSince": "2005-08-07T16:04:04Z",
"nickname": "Ellen",
"guid": "BHHEKTXSUVTJXZSVLBYXU7WISA"
},
{
"created": "2012-09-23T23:01:41Z",
"lang": "en-US",
"memberSince": "2012-09-23T23:01:20Z",
"nickname": "Carlos",
"guid": "VRPARSKJIKEFNIHKXY5QK6L3LU"
},
{
"created": "2016-03-14T23:21:29Z",
"lang": "en-US",
"memberSince": "2016-03-14T23:21:28Z",
"nickname": "Mark",
"guid": "QCYF27AAV7MC7RL44KZ4EMX7FY"
},
{
"created": "2008-12-21T01:32:28Z",
"familyName": "nguyenmangchien",
"givenName": "chien",
"lang": "en-US",
"location": "hanoi",
"memberSince": "2006-10-09T09:13:10Z",
"nickname": "chiennguyen",
"guid": "ND2YVRT23FJQWZW5FE33OJVN6I"
},
{
"created": "2012-08-14T15:11:06Z",
"lang": "en-US",
"memberSince": "2012-08-14T15:10:05Z",
"nickname": "Indya",
"guid": "ADDPYQSNFWSLK5VDAJPESLP2DU"
},
{
"created": "2010-01-31T19:48:55Z",
"lang": "en-US",
"memberSince": "2009-07-17T22:17:36Z",
"nickname": "Sara",
"guid": "IQACKBQMRKG7TRGWV66P4EA3GI"
},
{
"created": "2009-09-10T19:17:37Z",
"lang": "en-US",
"location": "",
"memberSince": "2006-10-23T03:52:22Z",
"nickname": "Tina",
"guid": "Q7EJRC36XZ7Z2EYEK3OVFITR6E"
},
{
"created": "2008-10-09T00:01:48Z",
"lang": "en-US",
"location": "",
"memberSince": "2000-04-22T21:48:38Z",
"nickname": "Minnie",
"guid": "3VT2ZEHSCWNLWMRNMWC25IF4HY"
},
{
"created": "2009-08-01T01:01:16Z",
"lang": "en-US",
"location": "Beijing",
"memberSince": "2009-01-04T11:24:34Z",
"nickname": "Wenjing",
"guid": "4KQA3LRHJLHQUJRFBBQKOPCGP4"
},
{
"created": "2009-11-12T18:31:52Z",
"lang": "en-US",
"location": "Budapest, Budapest",
"memberSince": "2008-11-16T11:10:55Z",
"nickname": "Veronika",
"guid": "EXQ6RIF7576Z7X2LQNU2ZPMFOY"
},
{
"created": "2010-02-17T21:20:22Z",
"lang": "en-US",
"location": "Daly City, California",
"memberSince": "2008-08-26T20:53:01Z",
"nickname": "Sandy",
"guid": "A4C5Z3PKCOIWO5DMPBS6A5K3GI"
},
{
"created": "2008-10-08T10:40:07Z",
"familyName": "singh",
"givenName": "Manmeet",
"lang": "en-US",
"location": "NA, NA 141004 India",
"memberSince": "2000-03-06T15:34:20Z",
"nickname": "senz",
"guid": "4YN2AAB3CPIQ6BF5U2PVNJIHQE"
},
{
"created": "2012-05-31T18:09:15Z",
"lang": "en-US",
"memberSince": "2012-05-31T17:53:34Z",
"nickname": "Chithra",
"guid": "OBJHXUVTSWYSEOLJGHPPYTIQOM"
},
{
"created": "2010-01-11T00:07:20Z",
"familyName": "Hong",
"givenName": "Angel",
"lang": "en-US",
"location": "San Francisco, California",
"memberSince": "2000-09-06T06:43:34Z",
"nickname": "Angel",
"guid": "6F7OVWYHNLC3NCG4VRVMRJS3X4"
},
{
"created": "2010-01-15T23:37:08Z",
"lang": "en-US",
"location": "",
"memberSince": "1999-12-31T07:05:31Z",
"nickname": "Thiru",
"guid": "FRJNSOTCZGSUUMVFXVJ2OKTLAA"
},
{
"created": "2010-01-02T23:03:25Z",
"lang": "en-US",
"location": "Houston, Texas",
"memberSince": "2008-07-17T16:49:35Z",
"nickname": "Terah",
"guid": "EZ6ZU7NV7YADJAQAL46XEKC4MU"
},
{
"created": "2010-01-30T23:47:12Z",
"lang": "en-US",
"memberSince": "2005-02-06T07:33:35Z",
"nickname": "Jessie",
"guid": "2NJYHULB6E7WNOEW5EHT2YDVSA"
}
]
}
}
}
------------------------------------------
Now, the Python commands I used to convert this JSON data to normalized, tabulated data, which
is then exported to an EXCEL spreadsheet, are as follows:
import pandas as pd

# Print wide frames in full rather than truncating them
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_columns', 300)
pd.set_option('display.max_rows', 300)

data = [ the JSON output data above is inserted here ]
query = pd.DataFrame(data)
# Flatten the first 19 diagnostics URL entries and the 19 returned profiles,
# then place them side by side (rows are paired by position only)
frame1 = [pd.json_normalize(query['query'][0]['diagnostics']['url'][0:19]),
          pd.json_normalize(query['query'][0]['results']['profile'])]
result = pd.concat(frame1, axis=1)
print(result)
# result.to_csv('C:/Users/myashar/YQL_Data_Extract_Example.csv', index=False)
# result.to_excel('C:/Users/myashar/YQL_Data_Extract_Example.xlsx', index=False)
# Append the rows to the CSV file, without the index or a header row
with open('C:/Users/myashar/YQL_Data_Extract_Example.csv', 'a') as f:
    result.to_csv(f, index=False, header=False)
This example script produces the following output:
content
0 http://profiles.yahoo.com/v2/identities.handle(mark.yashar%40yahoo.com~yi
d)
1 http://profiles.yahoo.com/v2/identities.handle(minnie_z_1999%40yahoo.com~
yid)
2 http://profiles.yahoo.com/v2/identities.handle(jigpatel%40yahoo.com~yid)
3 http://profiles.yahoo.com/v2/identities.handle(cclaudio1357%40yahoo.com~y
id)
4 http://profiles.yahoo.com/v2/identities.handle(mrsglobs%40yahoo.com~yid)
5 http://profiles.yahoo.com/v2/identities.handle(chiennguyen_1981%40yahoo.c
om.vn~yid)
6 http://profiles.yahoo.com/v2/identities.handle(s.wells8%40yahoo.com~yid)
7 http://profiles.yahoo.com/v2/identities.handle(holidaysevents%40yahoo.com
~yid)
8 http://profiles.yahoo.com/v2/identities.handle(elabiner%40yahoo.com~yid)
9 http://profiles.yahoo.com/v2/identities.handle(indyaclayton%40yahoo.com~y
id)
10 http://profiles.yahoo.com/v2/identities.handle(billlclem123%40yahoo.com~y
id)
11 http://profiles.yahoo.com/v2/identities.handle(darrettchithra%40yahoo.com
~yid)
12 http://profiles.yahoo.com/v2/identities.handle(terahkuhnen%40yahoo.com~yi
d)
13 http://profiles.yahoo.com/v2/identities.handle(wenjing_mo%40yahoo.com~yid
)
14 http://profiles.yahoo.com/v2/identities.handle(egyedveronika%40yahoo.com~
yid)
15 http://profiles.yahoo.com/v2/identities.handle(angelhong320%40yahoo.com~y
id)
16 http://profiles.yahoo.com/v2/identities.handle(dungmy88%40yahoo.com~yid)
17 http://profiles.yahoo.com/v2/identities.handle(manmeetsaini%40yahoo.com~y
id)
18 http://profiles.yahoo.com/v2/identities.handle(chandarmk%40yahoo.com~yid)
execution-start-time execution-stop-time execution-time http-status-
code
0 1 65 64 NaN
1 3 66 63 NaN
2 2 67 65 NaN
3 3 68 65 NaN
4 2 68 66 NaN
5 73 132 59 NaN
6 74 136 62 NaN
7 74 136 62 NaN
8 73 136 63 NaN
9 74 138 64 NaN
10 137 192 55 404
11 132 195 63 NaN
12 136 197 61 NaN
13 136 199 63 NaN
14 139 200 61 NaN
15 193 255 62 NaN
16 195 258 63 NaN
17 197 260 63 NaN
18 199 260 61 NaN
http-status-message created familyName givenName
0 NaN 2010-01-16T17:45:10Z NaN NaN
1 NaN NaN NaN NaN
2 NaN 2009-12-10T01:38:19Z NaN NaN
3 NaN 2012-09-23T23:01:41Z NaN NaN
4 NaN 2016-03-14T23:21:29Z NaN NaN
5 NaN 2008-12-21T01:32:28Z nguyenmangchien chien
6 NaN 2012-08-14T15:11:06Z NaN NaN
7 NaN 2010-01-31T19:48:55Z NaN NaN
8 NaN 2009-09-10T19:17:37Z NaN NaN
9 NaN 2008-10-09T00:01:48Z NaN NaN
10 Not Found 2009-08-01T01:01:16Z NaN NaN
11 NaN 2009-11-12T18:31:52Z NaN NaN
12 NaN 2010-02-17T21:20:22Z NaN NaN
13 NaN 2008-10-08T10:40:07Z singh Manmeet
14 NaN 2012-05-31T18:09:15Z NaN NaN
15 NaN 2010-01-11T00:07:20Z Hong Angel
16 NaN 2010-01-15T23:37:08Z NaN NaN
17 NaN 2010-01-02T23:03:25Z NaN NaN
18 NaN 2010-01-30T23:47:12Z NaN NaN
guid lang location
0 3VSVJ6XZX4NJWUUFFHJDXBIQBI en-US New York, New York
1 33VH3ODH6VAUFWZUQGR6GLXQEA en-US NaN
2 BHHEKTXSUVTJXZSVLBYXU7WISA en-US New York, New York
3 VRPARSKJIKEFNIHKXY5QK6L3LU en-US NaN
4 QCYF27AAV7MC7RL44KZ4EMX7FY en-US NaN
5 ND2YVRT23FJQWZW5FE33OJVN6I en-US hanoi
6 ADDPYQSNFWSLK5VDAJPESLP2DU en-US NaN
7 IQACKBQMRKG7TRGWV66P4EA3GI en-US NaN
8 Q7EJRC36XZ7Z2EYEK3OVFITR6E en-US
9 3VT2ZEHSCWNLWMRNMWC25IF4HY en-US
10 4KQA3LRHJLHQUJRFBBQKOPCGP4 en-US Beijing
11 EXQ6RIF7576Z7X2LQNU2ZPMFOY en-US Budapest, Budapest
12 A4C5Z3PKCOIWO5DMPBS6A5K3GI en-US Daly City, California
13 4YN2AAB3CPIQ6BF5U2PVNJIHQE en-US NA, NA 141004 India
14 OBJHXUVTSWYSEOLJGHPPYTIQOM en-US NaN
15 6F7OVWYHNLC3NCG4VRVMRJS3X4 en-US San Francisco, California
16 FRJNSOTCZGSUUMVFXVJ2OKTLAA en-US
17 EZ6ZU7NV7YADJAQAL46XEKC4MU en-US Houston, Texas
18 2NJYHULB6E7WNOEW5EHT2YDVSA en-US NaN
memberSince nickname
0 1997-08-19T22:30:15Z Jigar
1 2004-11-16T22:04:53Z Holidays
2 2005-08-07T16:04:04Z Ellen
3 2012-09-23T23:01:20Z Carlos
4 2016-03-14T23:21:28Z Mark
5 2006-10-09T09:13:10Z chiennguyen
6 2012-08-14T15:10:05Z Indya
7 2009-07-17T22:17:36Z Sara
8 2006-10-23T03:52:22Z Tina
9 2000-04-22T21:48:38Z Minnie
10 2009-01-04T11:24:34Z Wenjing
11 2008-11-16T11:10:55Z Veronika
12 2008-08-26T20:53:01Z Sandy
13 2000-03-06T15:34:20Z senz
14 2012-05-31T17:53:34Z Chithra
15 2000-09-06T06:43:34Z Angel
16 1999-12-31T07:05:31Z Thiru
17 2008-07-17T16:49:35Z Terah
18 2005-02-06T07:33:35Z Jessie
This data is then written to / appended to an EXCEL spreadsheet in CSV format (see the attached
EXCEL spreadsheet “YQL_Data_Extract_Example.csv”).
Note that the “content” field is populated by the email addresses in the form of, e.g.,
http://profiles.yahoo.com/v2/identities.handle(elabiner%40yahoo.com~yid),
which is converted to elabiner@yahoo.com (which we can do within EXCEL), i.e., “%40” is replaced by
“@”, “~yid” is eliminated, etc.
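The same conversion can also be done in Python before the data reaches the spreadsheet. The following is a minimal sketch using only the standard library. The function name `email_from_content` and the regular expression are my own illustration, and they assume every address URL has the `identities.handle(...~yid)` shape shown above:

```python
import re
from urllib.parse import unquote

# Captures whatever sits between "identities.handle(" and "~yid)".
HANDLE_RE = re.compile(r"identities\.handle\((.+)~yid\)")


def email_from_content(url):
    """Return the email address embedded in a 'content' URL, else None."""
    match = HANDLE_RE.search(url)
    if match is None:
        return None  # e.g. a users.guid(...) URL, which carries no address
    return unquote(match.group(1))  # turns %40 back into @


print(email_from_content(
    "http://profiles.yahoo.com/v2/identities.handle(elabiner%40yahoo.com~yid)"))
```

URLs that do not contain an address, such as the `users.guid(...)` entries, yield `None` and can be filtered out.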
Now, as I mentioned previously, the emails in the ‘content’ field are not in the correct order (due to the
way the data is generated and displayed in the YQL console due to Yahoo data privacy issues) and will
have to be put in the correct order manually. Some extraneous, unnecessary fields in the table
should be eliminated, and we can eliminate duplicates, entries with missing data, etc., and generally
clean up the table manually as necessary (e.g., only keep rows/entries with emails with recent creation
dates). Still, I think this approach/process might overall reduce some of the manual / brute force work
that would otherwise be needed without the Python tools/script. For example, it would be easier now to
correctly populate (in the correct order) the email field in the table by matching the emails with the
correct nicknames, family names, given names or GUIDs (which could be looked up via the YQL console
if necessary) in the table.
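One possible way to automate the GUID matching is sketched below. In the Example 7 diagnostics, the GUIDs listed inside the `users.guid(...)` URL appear to follow the order of the addresses in the query, and the nicknames are consistent with that (e.g., Jigar with jigpatel, Mark with mark.yashar). This is only an observation about that sample, not documented YQL behavior, so the pairing should be spot-checked before it is relied on. The helper names are hypothetical, and only two of the seven profiles are reproduced:

```python
import re

# Addresses in the order they were listed in the Example 7 query.
emails = [
    "mark.yashar@yahoo.com", "jigpatel@yahoo.com", "mrsglobs@yahoo.com",
    "cclaudio1357@yahoo.com", "minnie_z_1999@yahoo.com",
    "chiennguyen_1981@yahoo.com.vn", "elabiner@yahoo.com",
]

# The users.guid(...) diagnostics URL from the Example 7 output.
guid_url = (
    "http://social.yahooapis.com/progrss/v1/users.guid("
    "QCYF27AAV7MC7RL44KZ4EMX7FY,3VSVJ6XZX4NJWUUFFHJDXBIQBI,"
    "Q7EJRC36XZ7Z2EYEK3OVFITR6E,VRPARSKJIKEFNIHKXY5QK6L3LU,"
    "3VT2ZEHSCWNLWMRNMWC25IF4HY,ND2YVRT23FJQWZW5FE33OJVN6I,"
    "BHHEKTXSUVTJXZSVLBYXU7WISA)/profile?format=json&.imgssl=1"
)

# Two of the returned profiles, in the order the console displayed them.
profiles = [
    {"guid": "3VSVJ6XZX4NJWUUFFHJDXBIQBI", "nickname": "Jigar",
     "created": "2010-01-16T17:45:10Z"},
    {"guid": "QCYF27AAV7MC7RL44KZ4EMX7FY", "nickname": "Mark",
     "created": "2016-03-14T23:21:29Z"},
]


def guids_in_url(url):
    """Return the comma-separated GUIDs inside users.guid(...)."""
    match = re.search(r"users\.guid\(([^)]*)\)", url)
    return match.group(1).split(",") if match else []


def attach_emails(emails, guid_url, profiles):
    """Add an 'email' key to each profile by looking up its GUID."""
    guids = guids_in_url(guid_url)
    if len(guids) != len(emails):
        # Happens when an address did not resolve (e.g. a 404 entry).
        raise ValueError("GUID count differs from address count")
    # Assumption: the i-th GUID in the URL belongs to the i-th address.
    email_by_guid = dict(zip(guids, emails))
    return [dict(p, email=email_by_guid.get(p["guid"])) for p in profiles]


for row in attach_emails(emails, guid_url, profiles):
    print(row["email"], row["nickname"], row["created"])
```

When some addresses fail to resolve, or the GUIDs are split across several `users.guid(...)` URLs as in the 20-address query, the counts no longer line up and the sketch raises an error rather than guessing.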
I can continue to work on this as necessary, and those on the team (or elsewhere at VISA) with greater
data science and/or EXCEL expertise might be able to find ways to further improve or build upon this
process (and/or find alternative approaches) to reduce some of the manual/brute force work that may
(otherwise) be involved here.