Place not Space; Geo without Maps


"Place not Space; Geo without Maps", presented on October 1st, 2009 at FOWA (Future of Web Applications) London.

Published in: Technology, Sports

  • The placeDetails container defines a place – there’s one per place found – it holds:
      • WOEID – the unique identifier for the place
      • type – the place type name for the place
      • name – the fully qualified name of the place
      • centroid – the centroid coordinates
      • matchType – the type of match (0 = text/text & coordinates, 1 = coordinates only)
      • weight – relative weight of the place within the document
      • confidence – confidence that the document mentions the place
  • Each reference holds:
      • WOEID – list of WOEIDs referencing the place
      • start & end – index of first and last character in the place reference, or -1 if type is XPath
      • isPlaintextMarker – flag indicating if the reference is plain text
      • text – the actual place reference
      • type – type of reference – plaintext, xpath, xpathwithcounts
      • xpath – XPath of the reference
  • So now that we have the places, their references and their WOEIDs, we can easily hook into services which understand WOEIDs, such as Flickr. Not only does Flickr love you, it also knows about WOEIDs, as this YQL fragment shows.
  • But what about those services that don’t speak WOEID fluently? Looking back at the place definitions, we have WOEIDs. Each WOEID has metadata attributes associated with it, such as the centroid of a place, with its longitude and latitude. And because geo should be technologically agnostic, so must we; with these coordinates we can use other services, such as Google Earth.
  • Place not Space; Geo without Maps
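
The speaker notes above suggest handing a place's centroid to services that do not understand WOEIDs, such as Google Earth. A minimal sketch of that step, assuming the place has already been reduced to an array with name, lat and lon keys; the helper name placeToKmlPlacemark is hypothetical and not part of the talk, and KML is only one possible target format:

```php
<?php
// Hypothetical helper (not from the talk): turn one place record into a
// KML Placemark that Google Earth can display.
function placeToKmlPlacemark(array $place): string
{
    // Escape the name so that characters such as & stay valid XML.
    $name = htmlspecialchars($place['name'], ENT_XML1 | ENT_QUOTES, 'UTF-8');
    // KML writes coordinates as longitude,latitude (in that order).
    // %F is locale-independent, so the decimal separator is always a dot.
    $coords = sprintf('%F,%F', (float) $place['lon'], (float) $place['lat']);
    return "<Placemark><name>{$name}</name>"
         . "<Point><coordinates>{$coords}</coordinates></Point></Placemark>";
}

// Values taken from the placeDetails example for WOEID 44418.
$london = array('name' => 'London, England, GB', 'lat' => '51.5063', 'lon' => '-0.12714');

$kml = '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
     . placeToKmlPlacemark($london)
     . '</Document></kml>';
echo $kml, "\n";
```

Saving the output to a .kml file should be enough for a KML-aware viewer to plot the centroid; the WOEID itself is no longer needed at that point.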

    1. Place not Space; Geo without Maps
       FOWA London, October 2009
       Gary Gale, Yahoo! Geo Technologies
    2. PLACES, PEOPLE and THINGS
       atibens on Flickr
    3. Knowing where our users are, and the places that are important to them
       Knowing the geographic context of everything we index, manage and publish
       Knowing geographic locations, and the names of places
       We Connect Places, People and Things
    4. SOME NUMBERS
       KoenVereeken on Flickr
    5. 85% of all data stored is unstructured
       This doubles every 3 months
       80% of all data contains a geo reference
       Source: Gartner Group
       Mr Faber on Flickr
    6. MINE THAT CONTENT
       tjblackwell on Flickr
    7. Content / URL + = Places & References
    8.
    9. UNLOCK PLACEMAKER
       bohman on Flickr
    10.
    11. Placemaker Parameters
        appid             100% mandatory
        inputLanguage     en-US, fr-CA, …
        outputType        XML or RSS
        documentContent   text to geoparse
        documentTitle     optional title
        documentURL       URL to geoparse
        documentType      MIME type of doc
        autoDisambiguate  remove duplicates
        focusWoeid        filter around a WOEID
    12. WOEIDs
        stevefaeembra on Flickr
    13. Unique
        Permanent
        Global
        Language Neutral
        London = Londra = Londres = ロンドン
        United States = États-Unis = Stati Uniti = 미국
        Ensures that geography can be employed consistently and globally
        straup on Flickr
    14. GeoPlanet
        A Global Location Repository
        Names + Geometry + Topology
        WOEIDs for
          • cities and towns
          • postal codes, airports
          • admin regions, time zones
          • telephone code areas
          • marketing areas
          • points of interest
          • colloquial areas
          • neighbourhoods
        woodleywonderworks on Flickr
    22. // POST to Placemaker
        // $key is the appid, $content is the text to geoparse, and $lang is an
        // optional extra parameter string (assumed to carry inputLanguage)
        define('POSTURL', ''); // endpoint URL is missing from the slide text
        define('POSTVARS', 'appid=' . $key . '&documentContent=' . urlencode($content) .
            '&documentType=text/plain&outputType=xml' . $lang);
        $ch = curl_init(POSTURL);
        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_POSTFIELDS, POSTVARS);
        // return the response as a string instead of printing it
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        $placemaker = curl_exec($ch);
        curl_close($ch);
    23. PLACES
        that_james on Flickr
    24. <placeDetails>
          <place>
            <woeId>44418</woeId>
            <type>Town</type>
            <name><![CDATA[London, England, GB]]></name>
            <centroid>
              <latitude>51.5063</latitude>
              <longitude>-0.12714</longitude>
            </centroid>
          </place>
          <matchType>0</matchType>
          <weight>1</weight>
          <confidence>10</confidence>
        </placeDetails>
        One place for WOEID 44418
    25. REFERENCES
        misterbisson on Flickr
    26. <reference>
          <woeIds>44418</woeIds>
          <start>1079</start>
          <end>1089</end>
          <isPlaintextMarker>1</isPlaintextMarker>
          <text><![CDATA[London, UK]]></text>
          <type>plaintext</type>
          <xpath><![CDATA[]]></xpath>
        </reference>
        <reference>
          <woeIds>44418</woeIds>
          <start>1116</start>
          <end>1126</end>
          <isPlaintextMarker>1</isPlaintextMarker>
          <text><![CDATA[London, UK]]></text>
          <type>plaintext</type>
          <xpath><![CDATA[]]></xpath>
        </reference>
        Two references for WOEID 44418
    27. // turn into a PHP object and loop over the results
        $places = simplexml_load_string($placemaker, 'SimpleXMLElement',
            LIBXML_NOCDATA);
        // initialised up front so later code can rely on it being an array
        $foundplaces = array();
        if ($places->document->placeDetails) {
            // create a hashmap of the places found to mix with
            // the references found
            foreach ($places->document->placeDetails as $p) {
                $wkey = 'woeid' . $p->place->woeId;
                // appending '' casts each SimpleXML node to a plain string
                $foundplaces[$wkey] = array(
                    'name'  => str_replace(', ZZ', '', $p->place->name) . '',
                    'type'  => $p->place->type . '',
                    'woeId' => $p->place->woeId . '',
                    'lat'   => $p->place->centroid->latitude . '',
                    'lon'   => $p->place->centroid->longitude . ''
                );
            }
        }
    28. // loop over references and filter out duplicates
        $refs = $places->document->referenceList->reference;
        $usedwoeids = array();
        $placelist = '';
        foreach ($refs as $r) {
            foreach ($r->woeIds as $wi) {
                $wi = (string) $wi; // compare plain strings, not XML nodes
                if (in_array($wi, $usedwoeids)) {
                    continue;
                } else {
                    $usedwoeids[] = $wi;
                }
                $currentloc = $foundplaces['woeid' . $wi];
                if ($r->text != '' && $currentloc['name'] != '' &&
                    $currentloc['lat'] != '' && $currentloc['lon'] != '') {
                    // collapse runs of whitespace in the matched text
                    $text = preg_replace('/\s+/', ' ', $r->text);
                    $name = addslashes(str_replace(', ZZ', '',
                        $currentloc['name']));
                    $desc = addslashes($text);
                    $lat = $currentloc['lat'];
                    $lon = $currentloc['lon'];
                    $class = stripslashes($desc) . "|$name|$lat|$lon";
                    $placelist .= "<li>"; // the slide text breaks off after "<li>".
                }
            }
        }
    29. select * from … where photo_id in
        (select id from … where woe_id=44418)
        and license=4;
    30. <placeDetails>
          <place>
            <woeId>44418</woeId>
            <type>Town</type>
            <name><![CDATA[London, England, GB]]></name>
            <centroid>
              <latitude>51.5063</latitude>
              <longitude>-0.12714</longitude>
            </centroid>
          </place>
          <matchType>0</matchType>
          <weight>1</weight>
          <confidence>10</confidence>
        </placeDetails>
        ragewear on Flickr
    31.
    32.
    33. THE INTERNET IS BROKEN
        Nesster on Flickr
    34. // load the URL, using YQL to filter the HTML
        // and fix UTF-8 nasties
        $url = '';     // page URL is missing from the slide text
        $realurl = ''. // YQL endpoint URL is missing from the slide text
            '?q=select%20*%20'.
            'from%20html%20where%20url%20%3D%20%22'.
            urlencode($url).'%22&format=xml';
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $realurl);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        $c = curl_exec($ch);
        curl_close($ch);
        if (strstr($c, '<')) {
            // keep only what sits between <results> and </results>
            $c = preg_replace('/.*<results>|<\/results>.*/', '', $c);
            // drop the XML declaration
            $c = preg_replace('/<\?xml version="1\.0"'.
                ' encoding="UTF-8"\?>/', '', $c);
            $c = strip_tags($c);
            // collapse runs of whitespace into single spaces
            $c = preg_replace('/\s+/', ' ', $c);
        }
    35. MINOR ANNOYANCES
        swooshthesnail on Flickr
    36. 50,000 BYTES
        ASurroca on Flickr
    37. X
        NO JSON
    38. POST NOT GET
        sludgegulper on Flickr
    39. WANT TO KNOW MORE?
        selva on Flickr
    40. Earth
    41.
    42.
    43.
    44. THANK YOU FOR LISTENING
        quinn.anya on Flickr
    45.
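
The "Placemaker Parameters" slide and the "50,000 BYTES" and "POST NOT GET" annoyances can be combined into one request-building step. The sketch below is an illustration, not code from the talk: it assumes that 50,000 bytes is a cap on the submitted content, and the names buildPlacemakerBody and PLACEMAKER_MAX_BYTES are hypothetical. It only builds the POST body; sending it would work as in the cURL slide.

```php
<?php
// Assumption: the 50,000 bytes slide refers to the maximum content size.
const PLACEMAKER_MAX_BYTES = 50000;

// Hypothetical helper: build a POST body from the documented parameters.
function buildPlacemakerBody(string $appid, string $content, string $lang = 'en-US'): string
{
    // strlen() counts bytes rather than characters, which suits a byte limit.
    if (strlen($content) > PLACEMAKER_MAX_BYTES) {
        // Cut at the limit, then back off to the last space so that no word
        // (or multi-byte character) is left split at the end.
        $content = substr($content, 0, PLACEMAKER_MAX_BYTES);
        $lastSpace = strrpos($content, ' ');
        if ($lastSpace !== false) {
            $content = substr($content, 0, $lastSpace);
        }
    }
    // http_build_query() URL-encodes every value.
    return http_build_query(array(
        'appid'           => $appid,
        'documentContent' => $content,
        'documentType'    => 'text/plain',
        'outputType'      => 'xml',
        'inputLanguage'   => $lang,
    ), '', '&');
}

$body = buildPlacemakerBody('YOUR_APP_ID', 'Gary flew from London to San Francisco.');
echo $body, "\n";
```

Truncating silently is only one option; splitting a long document into several requests and merging the places found would keep more of the text, at the cost of extra calls.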