(Almost) Everything You Ever Wanted To Know About Geo (with WOEIDs)

"(Almost) Everything You Ever Wanted To Know About Geo (with WOEIDs)", presented on March 10th, 2010 at London Twitter DevNest 7, at the Sun Customer Briefing Centre in London.

  1. London Twitter #devnest 7, March 2010
     (Almost) Everything You Ever Wanted To Know About Geo (with WOEIDs)…
     Gary Gale, Yahoo! Geo Technologies
  2. the agenda
     louisvolant on Flickr: http://www.flickr.com/photos/27048731@N03/4003756731/
  3. the agenda
     - the hello
     - the WOEIDs
     - the WTF?
     - the background
     - the geocoding and the geoparsing
     - the frustration
     - the WOEIDs redux
     - the APIs
     - the demo
     - the goodbye
  13. KELLYLEEBARRETT on Flickr: http://www.flickr.com/photos/kellylee/4177529745/
  14. Gary Gale on Flickr: http://www.flickr.com/photos/vicchi/4414198544/
  15. WOEIDs
      stevefaeembra on Flickr: http://www.flickr.com/photos/stevefaeembra/3567750853/
  16. 44418
      12589342
  17. David Armano on Flickr: http://www.flickr.com/photos/7855449@N02/3158864420/
  18. some background
      blakophoto on Flickr: http://www.flickr.com/photos/cleveralias/3158810304/
  19. let’s talk about geocoding
      inF! on Flickr: http://www.flickr.com/photos/nathanbarrow/3339245753/
  20. geocoding is the process of finding associated geographic coordinates (often expressed as latitude and longitude) from other geographic data, such as street addresses or zip codes (postal codes).
  21. reverse geocoding is the process of back (reverse) coding of a point location (latitude, longitude) to a readable address or place name.
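The two definitions above can be sketched in a few lines of Python. This is a toy illustration only, not how a production geocoder works: the two-entry gazetteer, the function names and the nearest-centroid rule are all assumptions made for the example. London's centroid is the one quoted later in the deck; the Paris coordinates are an approximate value added here.

```python
import math

# Tiny illustrative gazetteer: place name -> (latitude, longitude).
# London's centroid is the value quoted later in the deck; the Paris
# coordinates are approximate and added only for this sketch.
GAZETTEER = {
    "London": (51.5063, -0.12714),
    "Paris": (48.8566, 2.3522),
}


def geocode(name):
    """Geocoding: place name -> coordinates (None if the name is unknown)."""
    return GAZETTEER.get(name)


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def reverse_geocode(point):
    """Reverse geocoding: coordinates -> name of the nearest known place."""
    return min(GAZETTEER, key=lambda name: haversine_km(point, GAZETTEER[name]))


print(geocode("London"))
print(reverse_geocode((51.5139, -0.1286)))
```

A real service would resolve street addresses and postal codes and return the containing place rather than the nearest centroid; the sketch only shows the direction of each lookup.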
  22. noway on Flickr: http://www.flickr.com/photos/noway/78606643/
  23. what?
      where?
  24.
  25. what? (maybe) where? (maybe)
  26. this is not geocoding, this is geoparsing
      szim90 on Flickr: http://www.flickr.com/photos/szim90/272670479/
  27. geoparsing is the process of assigning geographic identifiers (e.g., codes or geographic coordinates expressed as latitude-longitude) to textual words and phrases that occur in unstructured content.
  28. cheap flights from london to paris in october
  29. “I’m sorry dave; I can’t find that place”
  30. web servers
      Jamison Judd on Flickr: http://www.flickr.com/photos/jamisonjudd/2433102356/
  31. 51° 30' 50.0868", 0° 7' 42.8514" (125 Shaftesbury Avenue, London, UK)
      163.1.117.210 (Oxford, UK)
      20442/6015 (Brest, France)
      #C5243B212 (Wilmington, Delaware, USA)
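The first machine-friendly location on this slide is written in degrees, minutes and seconds. A minimal Python sketch of converting it to decimal degrees follows. The function name is hypothetical, and the hemispheres are an assumption: the slide does not state them, but Shaftesbury Avenue lies north of the equator and west of the Greenwich meridian.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere="N"):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# Hemispheres are assumed; the slide gives only the unsigned values.
lat = dms_to_decimal(51, 30, 50.0868, "N")
lon = dms_to_decimal(0, 7, 42.8514, "W")
print(round(lat, 6), round(lon, 6))
```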
  32. web surfers
      National Library NZ on The Commons on Flickr: http://www.flickr.com/photos/nationallibrarynz_commons/3326203787/
  33. The West End
      Downtown
      The Shops
      The High Street
  34. The Online World: formal, normalised, structured, regular
      The Real World: “We Are Here”
      The Offline World: informal, eccentric, bizarre, irregular
  35. cheap flights from london to paris in october
      1) Tokenize
      2) Remove common words
      3) Remove words not in gazetteer
      London, Paris
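The three steps on this slide can be sketched as a deliberately naive Python geoparser. The word lists are assumptions chosen for the example: the gazetteer includes "in" and "to" because the deck goes on to point out that both are also real place names or country codes, which is exactly why step 2 matters.

```python
# Assumed word lists for illustration; a real gazetteer holds millions of names.
COMMON_WORDS = {"a", "and", "from", "in", "is", "it", "that", "the", "to", "you"}
GAZETTEER = {"london", "paris", "in", "to"}


def naive_geoparse(text, remove_common_words=True):
    """Return the tokens of `text` that survive the three steps."""
    tokens = text.lower().split()                       # 1) tokenize
    if remove_common_words:
        tokens = [t for t in tokens if t not in COMMON_WORDS]  # 2) drop common words
    return [t for t in tokens if t in GAZETTEER]        # 3) keep gazetteer entries


query = "cheap flights from london to paris in october"
print(naive_geoparse(query))
print(naive_geoparse(query, remove_common_words=False))
```

With step 2 in place only "london" and "paris" remain; with it skipped, "to" and "in" are reported as places too.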
  36. “in”… India?
      bodhitjal on Flickr: http://www.flickr.com/photos/bodhithaj/361857780/
  37. “in”… Indiana?
      OZinOH on Flickr: http://www.flickr.com/photos/75905404@N00/505688957/
  38. “to”… Tonga?
      j_buswell on Flickr: http://www.flickr.com/photos/j_buswell/3683814556/
  39. language
      Jovike on Flickr: http://www.flickr.com/photos/jvk/19894053/
  40. Thé? a town in Burgundy, France
      IN? ISO 3166-1 Alpha-2 for India
      To? a town in Ibaraki prefecture, Japan
      Is? another town in Burgundy, France
      IT? ISO 3166-1 Alpha-2 for Italy
      AND? ISO 3166-1 Alpha-3 for Andorra
      You? a town in Yatenga, Burkina Faso
      Å? a town in Nordland Fylke, Norway
      That? a town in Rajasthan, India
  41. may cause frustration
      paloaltosoftware on Flickr: http://www.flickr.com/photos/paloalto/3038701605/
  42. disambiguation
      KoenVereeken on Flickr: http://www.flickr.com/photos/koenvereeken/2088902012/
  43. this is peru …
  44. and so is this (in argentina)
  45. and so is this (in bolivia)
  46. semantics required
      dullhunk on Flickr: http://www.flickr.com/photos/dullhunk/3525013547/
  47. Hilton, Paris
      Paris Hilton
  48. London
      Jack London
  49. Panama
      Panama Hats
  50. who uses official names anyway?
      takomabibelot on Flickr: http://www.flickr.com/photos/takomabibelot/234301712/
  51. MOMA NYC
      Museum of Modern Art, New York
      paulamoya on Flickr: http://www.flickr.com/photos/40351463@N00/745012335/
  52. Millennium Wheel
      London Eye
      hismith83 on Flickr: http://www.flickr.com/photos/hismith83/200701961/
  53. San Francisco
      City and County of San Francisco
      SF Brit on Flickr: http://www.flickr.com/photos/cnbattson/192162591/
  54. WOEIDs (redux)
      stevefaeembra on Flickr: http://www.flickr.com/photos/stevefaeembra/3567750853/
  55. 44418
      12589342
  56. 51° 30' 50.0868", 0° 7' 42.8514"
  57. Unique
      Permanent
      Global
      Language Neutral
      London = Londra = Londres = ロンドン
      United States = États-Unis = Stati Uniti = 미국
      Ensures that geography can be employed consistently and globally
      straup on Flickr: http://www.flickr.com/photos/straup/3504862388/
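The "language neutral" property can be illustrated with a small Python lookup in which every spelling of a place resolves to the same identifier. This is a sketch under assumptions: the table and function name are hypothetical, and the United Kingdom is used instead of the United States because its WOEID (23424975) and translations appear elsewhere in the deck, as does London's (44418).

```python
# Names in several languages, grouped by the WOEID they share.
# WOEIDs and spellings are taken from the deck; the table itself is illustrative.
NAMES_BY_WOEID = {
    44418: ["London", "Londra", "Londres", "ロンドン"],
    23424975: ["United Kingdom", "Vereinigtes Königreich", "Royaume Uni", "イギリス"],
}

# Invert into a case-insensitive name -> WOEID index.
NAME_TO_WOEID = {
    name.casefold(): woeid
    for woeid, names in NAMES_BY_WOEID.items()
    for name in names
}


def woeid_for(name):
    """Return the WOEID for a place name in any listed language, or None."""
    return NAME_TO_WOEID.get(name.casefold())


print(woeid_for("Londres"), woeid_for("ロンドン"), woeid_for("Royaume Uni"))
```

Storing the identifier rather than the name means two records that mention "Londra" and "Londres" can be joined without any string matching.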
  58. GeoPlanet
      A Global Location Repository
      Names + Geometry + Topology
      WOEIDs for
      - cities and towns
      - postal codes, airports
      - admin regions, time zones
      - telephone code areas
      - marketing areas
      - points of interest
      - colloquial areas
      - neighbourhoods
      woodleywonderworks on Flickr: http://www.flickr.com/photos/wwworks/2222523978/
  66. Continents, Countries, Counties, Regions, Colloquials, Targeting Zones, Postal Codes, Area Codes, Boroughs, Neighbourhoods, POIs
  67. Earth: WOEID 1 (Supername)
      Europe: WOEID 24865675 (Continent)
      United Kingdom: WOEID 23424975 (Country); also Vereinigtes Königreich, Royaume Uni, イギリス
      Great Britain: WOEID 28298150 (Colloquial)
      England: WOEID 24554868 (Country)
      Warwickshire: WOEID 12602190 (County)
      Worcestershire: WOEID 12602192 (County)
      Stratford-on-Avon: WOEID 12696101 (District)
      Stratford-upon-Avon: WOEID 36424 (Town)
      Warwick: WOEID 39228 (Town)
      CV37: WOEID 26787646 (ZIP)
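The places on this slide form a containment hierarchy, which can be sketched in Python as a table of parent links. The names, WOEIDs and place types come from the slide; the parent links are an assumption inferred from the place types, since the transcript does not preserve the arrows of the original diagram, and the function name is hypothetical.

```python
# woeid -> (name, place type, parent woeid).
# Parent links are assumed from the place types, not read from the slide.
PLACES = {
    1: ("Earth", "Supername", None),
    24865675: ("Europe", "Continent", 1),
    23424975: ("United Kingdom", "Country", 24865675),
    24554868: ("England", "Country", 23424975),
    12602190: ("Warwickshire", "County", 24554868),
    12696101: ("Stratford-on-Avon", "District", 12602190),
    36424: ("Stratford-upon-Avon", "Town", 12696101),
}


def ancestors(woeid):
    """Walk the parent links upwards and return the ancestor names in order."""
    chain = []
    parent = PLACES[woeid][2]
    while parent is not None:
        chain.append(PLACES[parent][0])
        parent = PLACES[parent][2]
    return chain


print(ancestors(36424))
```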
  68. http://engineering.twitter.com/2010/02/woeids-in-twitters-trends.html
  69. http://isithackday.com/hacks/placemaker/tweet-locations.php
  70. http://wherein.yahooapis.com/v1/document
  71. unlock your api
      https://developer.apps.yahoo.com/wsregapp/
      sam.d on Flickr: http://www.flickr.com/photos/samd/65693717/
  72. Placemaker Parameters
      appid: 100% mandatory
      inputLanguage: en-US, fr-CA, …
      outputType: XML or RSS
      documentContent: text to geoparse
      documentTitle: optional title
      documentURL: URL to geoparse
      documentType: MIME type of doc
      autoDisambiguate: remove duplicates
      focusWoeid: filter around a WOEID
  73. // POST to Placemaker
      // $key (the appid), $content (the text to geoparse) and $lang
      // (an extra query-string suffix) are expected to be set beforehand
      define('POSTURL', 'http://wherein.yahooapis.com/v1/document');
      define('POSTVARS', 'appid=' . $key . '&documentContent=' . urlencode($content) .
          '&documentType=text/plain&outputType=xml' . $lang);
      $ch = curl_init(POSTURL);
      curl_setopt($ch, CURLOPT_POST, 1);
      curl_setopt($ch, CURLOPT_POSTFIELDS, POSTVARS);
      // return the response as a string instead of printing it
      curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
      $placemaker = curl_exec($ch);
      curl_close($ch);
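For comparison, the same POST can be assembled with Python's standard library. This is a sketch only: the helper name and the placeholder app id are assumptions, the parameter names are the ones listed on the Placemaker Parameters slide, and the request is built but not sent, since the endpoint may no longer be available.

```python
from urllib.parse import urlencode
from urllib.request import Request

PLACEMAKER_URL = "http://wherein.yahooapis.com/v1/document"


def build_placemaker_request(appid, content, input_language=None):
    """Build (but do not send) a form-encoded POST for Placemaker."""
    params = {
        "appid": appid,
        "documentContent": content,
        "documentType": "text/plain",
        "outputType": "xml",
    }
    if input_language:
        params["inputLanguage"] = input_language
    body = urlencode(params).encode("utf-8")
    return Request(PLACEMAKER_URL, data=body, method="POST")


# "YOUR_APP_ID" is a placeholder, not a working key.
req = build_placemaker_request("YOUR_APP_ID", "cheap flights from london to paris")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it; `urlencode` takes care of escaping the document text, which the PHP version does with `urlencode($content)`.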
  74. places
      that_james on Flickr: http://www.flickr.com/photos/that_james/496797309/
  75. <placeDetails>
        <place>
          <woeId>44418</woeId>
          <type>Town</type>
          <name><![CDATA[London, England, GB]]></name>
          <centroid>
            <latitude>51.5063</latitude>
            <longitude>-0.12714</longitude>
          </centroid>
        </place>
        <matchType>0</matchType>
        <weight>1</weight>
        <confidence>10</confidence>
      </placeDetails>
      One place for WOEID 44418
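A `placeDetails` element like the one on this slide can be read with Python's `xml.etree.ElementTree`. The sample below is the slide's XML; the function name and the shape of the returned dictionary are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<placeDetails>
  <place>
    <woeId>44418</woeId>
    <type>Town</type>
    <name><![CDATA[London, England, GB]]></name>
    <centroid>
      <latitude>51.5063</latitude>
      <longitude>-0.12714</longitude>
    </centroid>
  </place>
  <matchType>0</matchType>
  <weight>1</weight>
  <confidence>10</confidence>
</placeDetails>"""


def parse_place_details(xml_text):
    """Extract the place and its confidence from one <placeDetails> element."""
    root = ET.fromstring(xml_text)
    place = root.find("place")
    return {
        "woeid": int(place.findtext("woeId")),
        "type": place.findtext("type"),
        "name": place.findtext("name").strip(),  # CDATA is returned as plain text
        "lat": float(place.findtext("centroid/latitude")),
        "lon": float(place.findtext("centroid/longitude")),
        "confidence": int(root.findtext("confidence")),
    }


details = parse_place_details(SAMPLE)
print(details)
```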
  76. references
      misterbisson on Flickr: http://www.flickr.com/photos/maisonbisson/117720946/
  77. <reference>
        <woeIds>44418</woeIds>
        <start>1079</start>
        <end>1089</end>
        <isPlaintextMarker>1</isPlaintextMarker>
        <text><![CDATA[London, UK]]></text>
        <type>plaintext</type>
        <xpath><![CDATA[]]></xpath>
      </reference>
      <reference>
        <woeIds>44418</woeIds>
        <start>1116</start>
        <end>1126</end>
        <isPlaintextMarker>1</isPlaintextMarker>
        <text><![CDATA[London, UK]]></text>
        <type>plaintext</type>
        <xpath><![CDATA[]]></xpath>
      </reference>
      Two references for WOEID 44418
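Each `reference` records where in the input text a place was mentioned. A Python sketch of grouping those character offsets by WOEID follows. The sample is a trimmed copy of the slide's XML; the `referenceList` wrapper is taken from the element path used in the deck's PHP, while the function name and the whitespace split on `woeIds` (in case one reference lists several WOEIDs) are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

SAMPLE = """<referenceList>
  <reference>
    <woeIds>44418</woeIds>
    <start>1079</start>
    <end>1089</end>
    <text><![CDATA[London, UK]]></text>
  </reference>
  <reference>
    <woeIds>44418</woeIds>
    <start>1116</start>
    <end>1126</end>
    <text><![CDATA[London, UK]]></text>
  </reference>
</referenceList>"""


def references_by_woeid(xml_text):
    """Map each WOEID to the (start, end) offsets where it was mentioned."""
    grouped = defaultdict(list)
    for ref in ET.fromstring(xml_text).iter("reference"):
        span = (int(ref.findtext("start")), int(ref.findtext("end")))
        # Assumption: woeIds may hold several whitespace-separated ids.
        for woeid in ref.findtext("woeIds").split():
            grouped[int(woeid)].append(span)
    return dict(grouped)


spans = references_by_woeid(SAMPLE)
print(spans)
```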
  78. // turn into a PHP object and loop over the results
      $places = simplexml_load_string($placemaker, 'SimpleXMLElement',
          LIBXML_NOCDATA);
      if ($places->document->placeDetails) {
          $foundplaces = array();
          // create a hashmap of the places found to mix with
          // the references found
          foreach ($places->document->placeDetails as $p) {
              $wkey = 'woeid' . $p->place->woeId;
              // appending '' casts each SimpleXMLElement to a string
              $foundplaces[$wkey] = array(
                  'name'  => str_replace(', ZZ', '', $p->place->name) . '',
                  'type'  => $p->place->type . '',
                  'woeId' => $p->place->woeId . '',
                  'lat'   => $p->place->centroid->latitude . '',
                  'lon'   => $p->place->centroid->longitude . ''
              );
          }
      }
  79. // loop over references and filter out duplicates
      $refs = $places->document->referenceList->reference;
      $usedwoeids = array();
      foreach ($refs as $r) {
          foreach ($r->woeIds as $wi) {
              // skip WOEIDs that have already been handled
              if (in_array($wi, $usedwoeids)) {
                  continue;
              } else {
                  $usedwoeids[] = $wi . '';
              }
              $currentloc = $foundplaces["woeid" . $wi];
              if ($r->text != '' && $currentloc['name'] != '' &&
                  $currentloc['lat'] != '' && $currentloc['lon'] != '') {
                  // collapse runs of whitespace in the matched text
                  $text = preg_replace('/\s+/', ' ', $r->text);
                  $name = addslashes(str_replace(', ZZ', '',
                      $currentloc['name']));
                  $desc = addslashes($text);
                  $lat = $currentloc['lat'];
                  $lon = $currentloc['lon'];
                  $class = stripslashes($desc) . "|$name|$lat|$lon";
                  // $placelist.= "<li>".   (statement is cut off in the transcript)
              }
          }
      }
  80. http://www.vicchi.org/speaking
  81.
  82. the internet is broken
      Nesster on Flickr: http://www.flickr.com/photos/nesster/3168425434/
  83. // load the URL, using YQL to filter the HTML
      // and fix UTF-8 nasties
      $url = 'http://www.vicchi.org/speaking';
      $realurl = 'http://query.yahooapis.com/v1/public/yql' .
          '?q=select%20*%20' .
          'from%20html%20where%20url%20%3D%20%22' .
          urlencode($url) . '%22&format=xml';
      $ch = curl_init();
      curl_setopt($ch, CURLOPT_URL, $realurl);
      curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
      $c = curl_exec($ch);
      curl_close($ch);
      if (strstr($c, '<')) {
          // drop everything outside the <results> element
          $c = preg_replace("/.*<results>|<\/results>.*/", '', $c);
          // drop the XML declaration
          $c = preg_replace("/<\?xml version=\"1\.0\"" .
              " encoding=\"UTF-8\"\?>/", '', $c);
          $c = strip_tags($c);
          // collapse line breaks into single spaces
          $c = preg_replace("/[\r?\n]+/", " ", $c);
      }
  84. minor annoyances
      swooshthesnail on Flickr: http://www.flickr.com/photos/swooshthesnail/3281681399/
  85. 50,000 bytes
      ASurroca on Flickr: http://www.flickr.com/photos/asurroca/147049402/
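Assuming the "50,000 bytes" annoyance refers to a cap on the size of the document sent for geoparsing (the slide does not say so explicitly), one way to cope is to split long text into chunks that each fit under the cap. The Python sketch below is one possible approach, not something shown in the talk; it measures UTF-8 bytes rather than characters and splits only at whitespace, so multi-byte characters are never cut in half.

```python
def chunk_by_bytes(text, limit=50_000):
    """Split text at whitespace into chunks of at most `limit` UTF-8 bytes."""
    chunks, current, size = [], [], 0
    for word in text.split():
        word_size = len(word.encode("utf-8"))
        if word_size > limit:
            raise ValueError("a single word exceeds the byte limit")
        # A joining space is needed unless the chunk is still empty.
        extra = word_size + (1 if current else 0)
        if size + extra > limit:
            chunks.append(" ".join(current))
            current, size = [word], word_size
        else:
            current.append(word)
            size += extra
    if current:
        chunks.append(" ".join(current))
    return chunks


# Small limit so the effect is visible: "ロンドン" is 12 bytes in UTF-8.
sample = "ロンドン london " * 10
pieces = chunk_by_bytes(sample, limit=20)
print(len(pieces), pieces[0])
```

Splitting at whitespace can separate a multi-word place name such as "New York" across two chunks, so a real implementation would prefer sentence or paragraph boundaries.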
  86. X
      no json
  87. post not get
      sludgegulper on Flickr: http://www.flickr.com/photos/sludgeulper/2645478209/
  88. http://where.yahooapis.com/v1/
  89. collections
      bradman334 on Flickr: http://www.flickr.com/photos/bradman334/3402569690/
  90. collections
      - lists of related resources, such as places
      - e.g. find all places called “london”
        http://where.yahooapis.com/v1/places.q('london');count=0?appid=[your id]
      - e.g. find the most likely place called “london”
        http://where.yahooapis.com/v1/places.q('london')?appid=[your id]
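The two collection URLs above follow one pattern, which a small Python helper can assemble. The function name and placeholder app id are hypothetical, and percent-encoding the query text is an assumption about how names containing spaces should be sent; the URL layout itself is copied from the slide.

```python
from urllib.parse import quote, urlencode

GEOPLANET_BASE = "http://where.yahooapis.com/v1"


def places_query_url(name, appid, count=None):
    """Build a GeoPlanet places.q() URL; count=0 asks for all matches."""
    url = f"{GEOPLANET_BASE}/places.q('{quote(name, safe='')}')"
    if count is not None:
        url += f";count={count}"
    return url + "?" + urlencode({"appid": appid})


# "APPID" is a placeholder, not a working key.
print(places_query_url("london", "APPID"))
print(places_query_url("london", "APPID", count=0))
```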
  92. <places xmlns="http://where.yahooapis.com/v1/schema.rng" xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
          yahoo:start="0" yahoo:count="1" yahoo:total="22">
        <place yahoo:uri="http://where.yahooapis.com/v1/place/44418" xml:lang="en-us">
          <woeid>44418</woeid>
          <placeTypeName code="7">Town</placeTypeName>
          <name>London</name>
          <country type="Country" code="GB">United Kingdom</country>
          <admin1 type="Country" code="GB-ENG">England</admin1>
          <admin2 type="County" code="">Greater London</admin2>
          <admin3></admin3>
          <locality1 type="Town">London</locality1>
          <locality2></locality2>
          <postal></postal>
          <centroid>
            <latitude>51.506321</latitude><longitude>-0.127140</longitude>
          </centroid>
          <boundingBox>
            <southWest><latitude>51.261318</latitude><longitude>-0.563000</longitude></southWest>
            <northEast><latitude>51.686031</latitude><longitude>0.280360</longitude></northEast>
          </boundingBox>
        </place>
      </places>
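Unlike the Placemaker output, this GeoPlanet response declares a default XML namespace, so element lookups need a namespace map. The Python sketch below parses a trimmed copy of the slide's XML and uses the bounding box for a simple point-in-box test; the `gp` prefix, the function names and the containment check are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The prefix "gp" is arbitrary; only the namespace URI has to match.
NS = {"gp": "http://where.yahooapis.com/v1/schema.rng"}

SAMPLE = """<places xmlns="http://where.yahooapis.com/v1/schema.rng" xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:start="0" yahoo:count="1" yahoo:total="22">
  <place yahoo:uri="http://where.yahooapis.com/v1/place/44418" xml:lang="en-us">
    <woeid>44418</woeid>
    <name>London</name>
    <centroid><latitude>51.506321</latitude><longitude>-0.127140</longitude></centroid>
    <boundingBox>
      <southWest><latitude>51.261318</latitude><longitude>-0.563000</longitude></southWest>
      <northEast><latitude>51.686031</latitude><longitude>0.280360</longitude></northEast>
    </boundingBox>
  </place>
</places>"""


def point(element):
    """Read a (latitude, longitude) pair from an element's children."""
    return (float(element.findtext("gp:latitude", namespaces=NS)),
            float(element.findtext("gp:longitude", namespaces=NS)))


def parse_place(xml_text):
    """Parse the first <place> of a GeoPlanet <places> response."""
    place = ET.fromstring(xml_text).find("gp:place", NS)
    return {
        "woeid": int(place.findtext("gp:woeid", namespaces=NS)),
        "name": place.findtext("gp:name", namespaces=NS),
        "centroid": point(place.find("gp:centroid", NS)),
        "south_west": point(place.find("gp:boundingBox/gp:southWest", NS)),
        "north_east": point(place.find("gp:boundingBox/gp:northEast", NS)),
    }


def contains(place, lat, lon):
    """True if (lat, lon) falls inside the place's bounding box."""
    (south, west), (north, east) = place["south_west"], place["north_east"]
    return south <= lat <= north and west <= lon <= east


london = parse_place(SAMPLE)
print(london["name"], london["woeid"], contains(london, *london["centroid"]))
```

The simple comparison in `contains` would not work for a bounding box that crosses the 180° meridian; that case is left out of the sketch.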
  93. resources
      joshuarichards on Flickr: http://www.flickr.com/photos/joshywoshywoo/124671979/
  94. resources
      - unique objects that contain multiple attributes, such as a place
      - e.g. get attributes for WOEID 44418
        http://where.yahooapis.com/v1/place/44418?appid=[your id]
      - e.g. find the most likely place called “london”
        http://where.yahooapis.com/v1/places.q('london')?appid=[your id]
  96. resources
      - unique objects that contain multiple attributes, such as a place
      - e.g. get places related to WOEID 44418
        http://where.yahooapis.com/v1/place/44418/relation?appid=[your id]
      - parent, ancestors, belongsto, neighbours, siblings, children
  98. <?xml version="1.0" encoding="UTF-8"?>
      <places xmlns="http://where.yahooapis.com/v1/schema.rng" xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
          yahoo:start="0" yahoo:count="10" yahoo:total="34">
        <place yahoo:uri="http://where.yahooapis.com/v1/place/12695806" xml:lang="en-us">
          <woeid>12695806</woeid>
          <placeTypeName code="10">Local Administrative Area</placeTypeName>
          <name>City of London</name>
        </place>
        <place yahoo:uri="http://where.yahooapis.com/v1/place/12695807" xml:lang="en-us">
          <woeid>12695807</woeid>
          <placeTypeName code="10">Local Administrative Area</placeTypeName>
          <name>London Borough of Camden</name>
        </place>
        <place yahoo:uri="http://where.yahooapis.com/v1/place/12695808" xml:lang="en-us">
          <woeid>12695808</woeid>
          <placeTypeName code="10">Local Administrative Area</placeTypeName>
          <name>London Borough of Hackney</name>
        </place>
        …
      </places>
  99. Far more than you could ever want
      http://delicious.com/codepo8/geotoys
  100. never work with children, animals or live demos
       elephipelephi on Flickr: http://www.flickr.com/photos/elephipelephi/1493013250/
  101. not taking notes?
       selva on Flickr: http://www.flickr.com/photos/selva/24604141/
  102. London Twitter #devnest 7, March 2010
       (Almost) Everything You Ever Wanted To Know About Geo (with WOEIDs)…
       Gary Gale, Yahoo! Geo Technologies
       http://slideshare.net/vicchi
  103. thanks for listening
       Paul Keleher on Flickr: http://www.flickr.com/photos/pkeleher/1658311814/
  104. www.ygeoblog.com
       twitter.com/vicchi
       twitter.com/yahoogeo
