(Almost) Everything You Ever Wanted To Know About Geo (with WOEIDs)
"(Almost) Everything You Ever Wanted To Know About Geo (with WOEIDs)"; presented on March 10th. 2010 at the London Twitter DevNest 7, at the Sun Customer Briefing Centre in London.

"(Almost) Everything You Ever Wanted To Know About Geo (with WOEIDs)"; presented on March 10th. 2010 at the London Twitter DevNest 7, at the Sun Customer Briefing Centre in London.

(Almost) Everything You Ever Wanted To Know About Geo (with WOEIDs) Presentation Transcript

  • 1. London Twitter #devnest 7, March 2010
    (Almost) Everything You Ever Wanted To Know About Geo (with WOEIDs)…
    Gary Gale, Yahoo! Geo Technologies
  • 2. the agenda
    louisvolant on Flickr : http://www.flickr.com/photos/27048731@N03/4003756731/
  • 3. the agenda
    • the hello
    • the WOEIDs
    • the WTF?
    • the background
    • the geocoding and the geoparsing
    • the frustration
    • the WOEIDs redux
    • the APIs
    • the demo
    • the goodbye
  • 13. KELLYLEEBARRETT on Flickr : http://www.flickr.com/photos/kellylee/4177529745/
  • 14. Gary Gale on Flickr : http://www.flickr.com/photos/vicchi/4414198544/
  • 15. WOEIDs
    stevefaeembra on Flickr : http://www.flickr.com/photos/stevefaeembra/3567750853/
  • 16. 44418
    12589342
  • 17. David Armano on Flickr : http://www.flickr.com/photos/7855449@N02/3158864420/
  • 18. some background
    blakophoto on Flickr : http://www.flickr.com/photos/cleveralias/3158810304/
  • 19. let’s talk about geocoding
    inF! on Flickr : http://www.flickr.com/photos/nathanbarrow/3339245753/
  • 20. geocoding is the process of finding associated geographic coordinates (often expressed as latitude and longitude) from other geographic data, such as street addresses, or zip codes (postal codes).
  • 21. reverse geocoding is the process of back (reverse) coding of a point location (latitude, longitude) to a readable address or place name.
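To make the two definitions concrete, here is a minimal Python sketch that treats geocoding as a lookup in a tiny in-memory gazetteer and reverse geocoding as a nearest-place search. The names `GAZETTEER`, `geocode`, `haversine_km` and `reverse_geocode` are hypothetical, the London centroid is the one quoted in the Placemaker response later in the deck, and the Paris coordinates are approximate. A real geocoder is a service with far richer data, so this only illustrates the direction of each operation.

```python
import math

# Hypothetical two-entry gazetteer: place name -> (latitude, longitude).
GAZETTEER = {
    "London, England, GB": (51.5063, -0.12714),  # centroid quoted later in the deck
    "Paris, France": (48.8566, 2.3522),          # approximate, for illustration only
}


def geocode(name):
    """Forward geocoding: place name -> coordinates, or None if unknown."""
    return GAZETTEER.get(name)


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def reverse_geocode(point):
    """Reverse geocoding: coordinates -> name of the nearest known place."""
    return min(GAZETTEER, key=lambda name: haversine_km(point, GAZETTEER[name]))
```

Calling `reverse_geocode((51.5139, -0.1286))` picks the London entry because it is the closest of the two known places; an unknown name passed to `geocode` simply returns `None`.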
  • 22. noway on Flickr : http://www.flickr.com/photos/noway/78606643/
  • 23. what?
    where?
  • 24.
  • 25. what? (maybe) where? (maybe)
  • 26. this is not geocoding, this is geoparsing
    szim90 on Flickr : http://www.flickr.com/photos/szim90/272670479/
  • 27. geoparsing is the process of assigning geographic identifiers (e.g., codes or geographic coordinates expressed as latitude-longitude) to textual words and phrases that occur in unstructured content.
  • 28. cheap flights from london to paris in october
  • 29. “I’m sorry dave; I can’t find that place”
  • 30. web servers
    Jamison Judd on Flickr : http://www.flickr.com/photos/jamisonjudd/2433102356/
  • 31. 51° 30' 50.0868", 0° 7' 42.8514" (125 Shaftesbury Avenue, London, UK)
    163.1.117.210 (Oxford, UK)
    20442/6015 (Brest, France)
    #C5243B212 (Wilmington, Delaware, USA)
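The coordinate pair on this slide is written in degrees, minutes and seconds. A short Python sketch of the conversion to decimal degrees follows; `dms_to_decimal` is a hypothetical helper, and because the slide gives no hemisphere letters, treating the longitude as west of Greenwich (which is where Shaftesbury Avenue lies) is an assumption.

```python
def dms_to_decimal(degrees, minutes, seconds, negative=False):
    """Convert degrees/minutes/seconds to decimal degrees.

    Set negative=True for southern latitudes or western longitudes.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if negative else value


# 51° 30' 50.0868" north
lat = dms_to_decimal(51, 30, 50.0868)
# 0° 7' 42.8514", assumed to be west of Greenwich
lon = dms_to_decimal(0, 7, 42.8514, negative=True)
```

The result is roughly 51.5139 latitude and -0.1286 longitude, the same neighbourhood as the London centroid that appears in the API responses later in the deck.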
  • 32. web surfers
    National Library NZ on The Commons on Flickr : http://www.flickr.com/photos/nationallibrarynz_commons/3326203787/
  • 33. The West End
    Downtown
    The Shops
    The High Street
  • 34. The Online World
    Formal, normalised, structured, regular
    The Real World
    “We Are Here”
    The Offline World
    Informal, eccentric, bizarre, irregular
  • 35. cheap flights from london to paris in october
    1) Tokenize
    2) Remove common words
    3) Remove words not in gazetteer
    Result: London, Paris
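The three steps on this slide can be sketched in a few lines of Python. This is a naive illustration under stated assumptions, not how Placemaker works internally: `STOP_WORDS`, `GAZETTEER` and `geoparse` are hypothetical names, and the gazetteer deliberately contains "in" and "to" because the deck goes on to point out that such short words are also place names or country codes.

```python
import re

# Hypothetical list of common words to discard in step 2.
STOP_WORDS = {"from", "to", "in", "the", "and", "is", "it", "you", "that"}

# Hypothetical gazetteer; "in" and "to" are included on purpose,
# since short common words can collide with real place names.
GAZETTEER = {"london", "paris", "in", "to"}


def geoparse(text, remove_stop_words=True):
    """Return the tokens of text that look like place names."""
    tokens = re.findall(r"\w+", text.lower())                # 1) tokenize
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]  # 2) remove common words
    return [t for t in tokens if t in GAZETTEER]             # 3) keep gazetteer hits
```

With the slide's query, `geoparse(...)` keeps only `london` and `paris`. Passing `remove_stop_words=False` also lets `to` and `in` through, which is the kind of false positive that makes step 2 necessary and, as the following slides show, still not sufficient.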
  • 36. “in”… India?
    bodhitjal on Flickr : http://www.flickr.com/photos/bodhithaj/361857780/
  • 37. “in”… Indiana?
    OZinOH on Flickr : http://www.flickr.com/photos/75905404@N00/505688957/
  • 38. “to”… Tonga?
    j_buswell on Flickr : http://www.flickr.com/photos/j_buswell/3683814556/
  • 39. language
    Jovike on Flickr : http://www.flickr.com/photos/jvk/19894053/
  • 40. Thé? a town in Burgundy, France
    IN? ISO 3166-1 Alpha-2 for India
    To? a town in Ibaraki prefecture, Japan
    Is? another town in Burgundy, France
    IT? ISO 3166-1 Alpha-2 for Italy
    AND? ISO 3166-1 Alpha-3 for Andorra
    You? a town in Yatenga, Burkina Faso
    Å? a town in Nordland Fylke, Norway
    That? a town in Rajasthan, India
  • 41. may cause frustration
    paloaltosoftware on Flickr : http://www.flickr.com/photos/paloalto/3038701605/
  • 42. disambiguation
    KoenVereeken on Flickr : http://www.flickr.com/photos/koenvereeken/2088902012/
  • 43. this is peru …
  • 44. and so is this (in argentina)
  • 45. and so is this (in bolivia)
  • 46. semantics required
    dullhunk on Flickr : http://www.flickr.com/photos/dullhunk/3525013547/
  • 47. Hilton, Paris
    Paris Hilton
  • 48. London
    Jack London
  • 49. Panama
    Panama Hats
  • 50. who uses official names anyway?
    takomabibelot on Flickr : http://www.flickr.com/photos/takomabibelot/234301712/
  • 51. MOMA NYC
    Museum of Modern Art, New York
    paulamoya on Flickr : http://www.flickr.com/photos/40351463@N00/745012335/
  • 52. Millennium Wheel
    London Eye
    hismith83 on Flickr : http://www.flickr.com/photos/hismith83/200701961/
  • 53. San Francisco
    City and County of San Francisco
    SF Brit on Flickr : http://www.flickr.com/photos/cnbattson/192162591/
  • 54. WOEIDs (redux)
    stevefaeembra on Flickr : http://www.flickr.com/photos/stevefaeembra/3567750853/
  • 55. 44418
    12589342
  • 56. 51° 30' 50.0868", 0° 7' 42.8514"
  • 57. Unique
    Permanent
    Global
    Language Neutral
    London = Londra = Londres = ロンドン
    United States = États-Unis = Stati Uniti = 미국
    Ensures that geography can be employed consistently and globally
    straup on Flickr : http://www.flickr.com/photos/straup/3504862388/
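The "language neutral" property can be pictured as many names resolving to one identifier. The Python sketch below is only an illustration: `WOEID_BY_NAME` and `woeid_for` are hypothetical, and the table holds just the London names from this slide plus the United Kingdom names and WOEID that appear on the hierarchy slide.

```python
# Hypothetical name table: every spelling of a place maps to the same WOEID.
WOEID_BY_NAME = {
    "london": 44418,
    "londra": 44418,
    "londres": 44418,
    "ロンドン": 44418,
    "united kingdom": 23424975,
    "vereinigtes königreich": 23424975,
    "royaume uni": 23424975,
    "イギリス": 23424975,
}


def woeid_for(name):
    """Look up a place name case-insensitively; None if it is not in the table."""
    return WOEID_BY_NAME.get(name.casefold())
```

Code that stores `44418` rather than the string "London" can compare, join and aggregate places without caring which language the name arrived in.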
  • 58. GeoPlanet
    A Global Location Repository
    Names + Geometry + Topology
    WOEIDs for
    • cities and towns
    • postal codes, airports
    • admin regions, time zones
    • telephone code areas
    • marketing areas
    • points of interest
    • colloquial areas
    • neighbourhoods
    woodleywonderworks on Flickr : http://www.flickr.com/photos/wwworks/2222523978/
  • 66. Continents
    Countries
    Counties
    Regions
    Colloquials
    Targeting Zones
    Postal Codes
    Area Codes
    Boroughs
    Neighbourhoods
    POIs
  • 67. Earth: WOEID 1 (Supername)
    Europe: WOEID 24865675 (Continent)
    United Kingdom, also Vereinigtes Königreich, Royaume Uni, イギリス: WOEID 23424975 (Country)
    Great Britain: WOEID 28298150 (Colloquial)
    England: WOEID 24554868 (Country)
    Warwickshire: WOEID 12602190 (County)
    Worcestershire: WOEID 12602192 (County)
    Stratford-on-Avon: WOEID 12696101 (District)
    Stratford-upon-Avon: WOEID 36424 (Town)
    Warwick: WOEID 39228 (Town)
    CV37: WOEID 26787646 (ZIP)
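The places on this slide form a hierarchy, and a small Python sketch shows how a chain of ancestors could be walked. The WOEIDs, names and place types come from the slide; the parent links in `PLACES` are an assumption based on ordinary UK geography, since the transcript does not preserve the arrows of the original diagram, and `ancestors` is a hypothetical helper.

```python
# woeid -> (name, place type, parent woeid). Parent links are assumed,
# not taken from the slide.
PLACES = {
    1: ("Earth", "Supername", None),
    24865675: ("Europe", "Continent", 1),
    23424975: ("United Kingdom", "Country", 24865675),
    24554868: ("England", "Country", 23424975),
    12602190: ("Warwickshire", "County", 24554868),
    12696101: ("Stratford-on-Avon", "District", 12602190),
    36424: ("Stratford-upon-Avon", "Town", 12696101),
}


def ancestors(woeid):
    """Return the names of all ancestors, nearest first."""
    chain = []
    parent = PLACES[woeid][2]
    while parent is not None:
        chain.append(PLACES[parent][0])
        parent = PLACES[parent][2]
    return chain
```

For the town of Stratford-upon-Avon (36424) this walks up through the district, county and countries to Europe and finally Earth.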
  • 68. http://engineering.twitter.com/2010/02/woeids-in-twitters-trends.html
  • 69. http://isithackday.com/hacks/placemaker/tweet-locations.php
  • 70. http://wherein.yahooapis.com/v1/document
  • 71. unlock your api
    https://developer.apps.yahoo.com/wsregapp/
    sam.d on Flickr : http://www.flickr.com/photos/samd/65693717/
  • 72. Placemaker Parameters
    appid: 100% mandatory
    inputLanguage: en-US, fr-CA, …
    outputType: XML or RSS
    documentContent: text to geoparse
    documentTitle: optional title
    documentURL: URL to geoparse
    documentType: MIME type of doc
    autoDisambiguate: remove duplicates
    focusWoeid: filter around a WOEID
  • 73. // POST to Placemaker
    define('POSTURL', 'http://wherein.yahooapis.com/v1/document');
    define('POSTVARS', 'appid='.$key.'&documentContent='.urlencode($content).
    '&documentType=text/plain&outputType=xml'.$lang);
    // open a single cURL handle for the endpoint
    $ch = curl_init(POSTURL);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, POSTVARS);
    // return the response as a string instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $placemaker = curl_exec($ch);
    curl_close($ch);
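For readers who do not use PHP, the same POST can be sketched in Python with the standard library. The endpoint and parameter names are the ones shown on the slides; `build_placemaker_request` is a hypothetical helper, and the sketch only builds the request object without sending it, since whether the endpoint answers is outside what the deck tells us.

```python
from urllib.parse import urlencode
from urllib.request import Request

PLACEMAKER_URL = "http://wherein.yahooapis.com/v1/document"


def build_placemaker_request(appid, content, input_language=None):
    """Build (but do not send) a form-encoded POST for Placemaker."""
    params = {
        "appid": appid,                 # mandatory application id
        "documentContent": content,     # text to geoparse
        "documentType": "text/plain",
        "outputType": "xml",
    }
    if input_language:
        params["inputLanguage"] = input_language  # e.g. "en-US"
    body = urlencode(params).encode("ascii")
    return Request(PLACEMAKER_URL, data=body, method="POST")
```

Letting `urlencode` build the whole body avoids the hand-concatenated query string, so every value is escaped rather than only `documentContent`. Sending the request would be a matter of passing it to `urllib.request.urlopen`.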
  • 74. places
    that_james on Flickr : http://www.flickr.com/photos/that_james/496797309/
  • 75. <placeDetails>
    <place>
    <woeId>44418</woeId>
    <type>Town</type>
    <name>
    <![CDATA[London, England, GB]]>
    </name>
    <centroid>
    <latitude>51.5063</latitude>
    <longitude>-0.12714</longitude>
    </centroid>
    </place>
    <matchType>0</matchType>
    <weight>1</weight>
    <confidence>10</confidence>
    </placeDetails>
    One place for WOEID 44418
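A short Python sketch of reading this fragment with the standard library follows. It assumes the fragment exactly as shown on the slide, with no XML namespace; a full response may well declare one, in which case the element paths would need a namespace prefix. `parse_place_details` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<placeDetails>
  <place>
    <woeId>44418</woeId>
    <type>Town</type>
    <name><![CDATA[London, England, GB]]></name>
    <centroid>
      <latitude>51.5063</latitude>
      <longitude>-0.12714</longitude>
    </centroid>
  </place>
  <matchType>0</matchType>
  <weight>1</weight>
  <confidence>10</confidence>
</placeDetails>"""


def parse_place_details(xml_text):
    """Extract the fields of one <placeDetails> element into a dict."""
    root = ET.fromstring(xml_text)
    place = root.find("place")
    return {
        "woeid": int(place.findtext("woeId")),
        "type": place.findtext("type"),
        "name": place.findtext("name").strip(),  # CDATA is returned as plain text
        "lat": float(place.findtext("centroid/latitude")),
        "lon": float(place.findtext("centroid/longitude")),
        "confidence": int(root.findtext("confidence")),
    }
```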
  • 76. references
    misterbisson on Flickr : http://www.flickr.com/photos/maisonbisson/117720946/
  • 77. <reference>
    <woeIds>44418</woeIds>
    <start>1079</start>
    <end>1089</end>
    <isPlaintextMarker>1</isPlaintextMarker>
    <text><![CDATA[London, UK]]></text>
    <type>plaintext</type>
    <xpath><![CDATA[]]></xpath>
    </reference>
    <reference>
    <woeIds>44418</woeIds>
    <start>1116</start>
    <end>1126</end>
    <isPlaintextMarker>1</isPlaintextMarker>
    <text><![CDATA[London, UK]]></text>
    <type>plaintext</type>
    <xpath><![CDATA[]]></xpath>
    </reference>
    Two references for WOEID 44418
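Because one place can be referenced many times in a document, a consumer usually wants the first reference per WOEID. The Python sketch below assumes the `<reference>` elements sit inside a `<referenceList>` wrapper, as the PHP on a later slide suggests, and that `<woeIds>` may hold several space-separated ids, which is an assumption. `first_reference_per_woeid` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<referenceList>
  <reference>
    <woeIds>44418</woeIds><start>1079</start><end>1089</end>
    <text><![CDATA[London, UK]]></text>
  </reference>
  <reference>
    <woeIds>44418</woeIds><start>1116</start><end>1126</end>
    <text><![CDATA[London, UK]]></text>
  </reference>
</referenceList>"""


def first_reference_per_woeid(xml_text):
    """Map each WOEID to the (text, start offset) of its first reference."""
    seen = {}
    for ref in ET.fromstring(xml_text).iter("reference"):
        for woeid in ref.findtext("woeIds").split():
            # setdefault keeps the first reference and ignores later duplicates
            seen.setdefault(woeid, (ref.findtext("text"), int(ref.findtext("start"))))
    return seen
```

For the two references on this slide the result holds a single entry for 44418, pointing at the match that starts at offset 1079.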
  • 78. // turn into a PHP object and loop over the results
    $places = simplexml_load_string($placemaker, 'SimpleXMLElement',
    LIBXML_NOCDATA);
    if($places->document->placeDetails){
    $foundplaces = array();
    // create a hashmap of the places found to mix with
    // the references found
    foreach($places->document->placeDetails as $p){
    $wkey = 'woeid'.$p->place->woeId;
    $foundplaces[$wkey]=array(
    'name'=>str_replace(', ZZ','',$p->place->name).'',
    'type'=>$p->place->type.'',
    'woeId'=>$p->place->woeId.'',
    'lat'=>$p->place->centroid->latitude.'',
    'lon'=>$p->place->centroid->longitude.''
    );
    }
    }
  • 79. // loop over references and filter out duplicates
    $refs = $places->document->referenceList->reference;
    $usedwoeids = array();
    foreach($refs as $r){
    foreach($r->woeIds as $wi){
    if(in_array($wi,$usedwoeids)){
    continue;
    } else {
    $usedwoeids[] = $wi.'';
    }
    $currentloc = $foundplaces["woeid".$wi];
    if($r->text!='' && $currentloc['name']!='' &&
    $currentloc['lat']!='' && $currentloc['lon']!=''){
    // collapse runs of whitespace in the matched text
    $text = preg_replace('/\s+/',' ',$r->text);
    $name = addslashes(str_replace(', ZZ','',
    $currentloc['name']));
    $desc = addslashes($text);
    $lat = $currentloc['lat'];
    $lon = $currentloc['lon'];
    $class = stripslashes($desc)."|$name|$lat|$lon";
    $placelist.= "<li>".
    }
    }
  • 80. http://www.vicchi.org/speaking
  • 81.
  • 82. the internet is broken
    Nesster on Flickr : http://www.flickr.com/photos/nesster/3168425434/
  • 83. // load the URL, using YQL to filter the HTML
    // and fix UTF-8 nasties
    $url = 'http://www.vicchi.org/speaking';
    $realurl = 'http://query.yahooapis.com/v1/public/yql'.
    '?q=select%20*%20'.
    'from%20html%20where%20url%20%3D%20%22'.
    urlencode($url).'%22&format=xml';
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $realurl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $c = curl_exec($ch);
    curl_close($ch);
    if(strstr($c,'<')){
    $c = preg_replace("/.*<results>|<\/results>.*/",'',$c);
    $c = preg_replace("/<\?xml version=\"1\.0\"".
    " encoding=\"UTF-8\"\?>/",'',$c);
    $c = strip_tags($c);
    $c = preg_replace("/\s+/"," ",$c);
    }
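The YQL step above can be sketched in Python as two small functions: one builds the query URL, the other reduces fetched markup to plain text. The endpoint and the `select * from html` query come from the slide; `yql_html_url` and `clean_text` are hypothetical names, the regex-based tag stripping is a crude stand-in for PHP's `strip_tags`, and nothing here performs a network request.

```python
import re
from urllib.parse import quote

YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def yql_html_url(page_url):
    """Build the YQL URL that asks for the HTML of page_url as XML."""
    query = 'select * from html where url = "%s"' % page_url
    # safe="" percent-encodes every reserved character, including "/" and ":"
    return YQL_ENDPOINT + "?q=" + quote(query, safe="") + "&format=xml"


def clean_text(markup):
    """Drop tags and collapse whitespace so only readable text remains."""
    text = re.sub(r"<[^>]+>", "", markup)
    return re.sub(r"\s+", " ", text).strip()
```

Encoding the whole query in one call keeps the quoting consistent, whereas the hand-assembled string on the slide has to get every `%20` and `%22` right by eye.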
  • 84. minor annoyances
    swooshthesnail on Flickr : http://www.flickr.com/photos/swooshthesnail/3281681399/
  • 85. 50,000 bytes
    ASurroca on Flickr : http://www.flickr.com/photos/asurroca/147049402/
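Assuming the "50,000 bytes" on this slide is a cap on the size of a document submitted for geoparsing, longer text has to be split before it is posted. The Python sketch below splits on whitespace so that each chunk stays within the limit when encoded as UTF-8; `chunk_utf8` is a hypothetical helper, and a single word longer than the limit would still produce an oversized chunk.

```python
def chunk_utf8(text, limit=50_000):
    """Split text on whitespace into chunks of at most `limit` UTF-8 bytes."""
    chunks, current, size = [], [], 0
    for word in text.split():
        word_size = len(word.encode("utf-8")) + 1  # +1 for the joining space
        if current and size + word_size > limit:
            chunks.append(" ".join(current))
            current, size = [], 0
        current.append(word)
        size += word_size
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Measuring encoded bytes rather than characters matters for non-ASCII place names such as ロンドン, which take three bytes per character in UTF-8. Note that the start and end offsets in each response are relative to the chunk that was sent, not to the original document.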
  • 86. X
    no json
  • 87. post not get
    sludgegulper on Flickr : http://www.flickr.com/photos/sludgeulper/2645478209/
  • 88. http://where.yahooapis.com/v1/
  • 89. collections
    bradman334 on Flickr : http://www.flickr.com/photos/bradman334/3402569690/
  • 90. collections
    • lists of related resources, such as places
    • e.g. find all places called “london”
    http://where.yahooapis.com/v1/places.q('london');count=0?appid=[your id]
    • e.g. find the most likely place called “london”
    http://where.yahooapis.com/v1/places.q('london')?appid=[your id]
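The two collection URLs differ only in the `;count=0` matrix parameter, which a small Python helper can make explicit. The base URL and URL shapes are taken from the slide; `places_query_url` is a hypothetical name and `YOUR_APP_ID` stands in for a real application id.

```python
from urllib.parse import quote, urlencode

GEOPLANET_BASE = "http://where.yahooapis.com/v1"


def places_query_url(name, appid, count=None):
    """Build a places.q(...) URL; count=0 asks for all matches, per the slide."""
    path = "/places.q('%s')" % quote(name, safe="")  # escape spaces, quotes, etc.
    if count is not None:
        path += ";count=%d" % count
    return GEOPLANET_BASE + path + "?" + urlencode({"appid": appid})
```

Calling it with `count=0` reproduces the first URL on the slide, and omitting `count` reproduces the second.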
  • 92. <places xmlns="http://where.yahooapis.com/v1/schema.rng" xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
    yahoo:start="0" yahoo:count="1" yahoo:total="22">
    <place yahoo:uri="http://where.yahooapis.com/v1/place/44418" xml:lang="en-us">
    <woeid>44418</woeid>
    <placeTypeName code="7">Town</placeTypeName>
    <name>London</name>
    <country type="Country" code="GB">United Kingdom</country>
    <admin1 type="Country" code="GB-ENG">England</admin1>
    <admin2 type="County" code="">Greater London</admin2>
    <admin3></admin3>
    <locality1 type="Town">London</locality1>
    <locality2></locality2>
    <postal></postal>
    <centroid>
    <latitude>51.506321</latitude><longitude>-0.127140</longitude>
    </centroid>
    <boundingBox>
    <southWest><latitude>51.261318</latitude><longitude>-0.563000</longitude></southWest>
    <northEast><latitude>51.686031</latitude><longitude>0.280360</longitude></northEast>
    </boundingBox>
    </place>
    </places>
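Unlike the Placemaker fragments, this response declares a default XML namespace, so element lookups need a prefix mapping. The Python sketch below parses a trimmed copy of the response shown above; `NS`, `point` and `parse_places` are hypothetical names, and only a few of the fields are extracted.

```python
import xml.etree.ElementTree as ET

NS = {"gp": "http://where.yahooapis.com/v1/schema.rng"}
YAHOO_NS = "http://www.yahooapis.com/v1/base.rng"

SAMPLE = """<places xmlns="http://where.yahooapis.com/v1/schema.rng"
    xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
    yahoo:start="0" yahoo:count="1" yahoo:total="22">
  <place yahoo:uri="http://where.yahooapis.com/v1/place/44418" xml:lang="en-us">
    <woeid>44418</woeid>
    <name>London</name>
    <centroid><latitude>51.506321</latitude><longitude>-0.127140</longitude></centroid>
    <boundingBox>
      <southWest><latitude>51.261318</latitude><longitude>-0.563000</longitude></southWest>
      <northEast><latitude>51.686031</latitude><longitude>0.280360</longitude></northEast>
    </boundingBox>
  </place>
</places>"""


def point(element, path):
    """Read a (latitude, longitude) pair from the child found at path."""
    node = element.find(path, NS)
    return (float(node.findtext("gp:latitude", namespaces=NS)),
            float(node.findtext("gp:longitude", namespaces=NS)))


def parse_places(xml_text):
    """Return (total matches, list of place dicts) from a <places> response."""
    root = ET.fromstring(xml_text)
    total = int(root.get("{%s}total" % YAHOO_NS))
    places = []
    for place in root.findall("gp:place", NS):
        places.append({
            "woeid": int(place.findtext("gp:woeid", namespaces=NS)),
            "name": place.findtext("gp:name", namespaces=NS),
            "centroid": point(place, "gp:centroid"),
            "bbox": (point(place, "gp:boundingBox/gp:southWest"),
                     point(place, "gp:boundingBox/gp:northEast")),
        })
    return total, places
```

The `yahoo:total` attribute reports 22 matches even though only one place is returned, which is the difference between the "most likely" query and the `;count=0` query.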
  • 93. resources
    joshuarichards on Flickr : http://www.flickr.com/photos/joshywoshywoo/124671979/
  • 94. resources
    • unique objects that contain multiple attributes, such as a place
    • e.g. get attributes for WOEID 44418
    http://where.yahooapis.com/v1/place/44418?appid=[your id]
    • e.g. find the most likely place called “london”
    http://where.yahooapis.com/v1/places.q('london')?appid=[your id]
  • 96. resources
    • unique objects that contain multiple attributes, such as a place
    • e.g. get places related to WOEID 44418
    http://where.yahooapis.com/v1/place/44418/relation?appid=[your id]
    • parent, ancestors, belongsto, neighbours, siblings, children
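Resource URLs follow one pattern: the WOEID, then an optional relation segment. The Python sketch below builds both forms. `place_url` is a hypothetical helper, and because the slide shows only a generic `/relation` segment, the exact spelling the service expects for each relation is left to the caller rather than hard-coded.

```python
from urllib.parse import urlencode

GEOPLANET_BASE = "http://where.yahooapis.com/v1"


def place_url(woeid, appid, relation=None):
    """Build the URL for a place resource, or for one of its relations."""
    path = "/place/%d" % woeid
    if relation:
        path += "/" + relation  # e.g. "parent" or "children"; spelling is the caller's
    return GEOPLANET_BASE + path + "?" + urlencode({"appid": appid})
```

`place_url(44418, "YOUR_APP_ID")` gives the attributes URL for London, and adding `relation="children"` gives the URL for its child places.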
  • 98. <?xml version="1.0" encoding="UTF-8"?><places xmlns="http://where.yahooapis.com/v1/schema.rng" xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:start="0" yahoo:count="10" yahoo:total="34">
    <place yahoo:uri="http://where.yahooapis.com/v1/place/12695806" xml:lang="en-us">
    <woeid>12695806</woeid>
    <placeTypeName code="10">Local Administrative Area</placeTypeName>
    <name>City of London</name>
    </place>
    <place yahoo:uri="http://where.yahooapis.com/v1/place/12695807" xml:lang="en-us">
    <woeid>12695807</woeid>
    <placeTypeName code="10">Local Administrative Area</placeTypeName>
    <name>London Borough of Camden</name>
    </place>
    <place yahoo:uri="http://where.yahooapis.com/v1/place/12695808" xml:lang="en-us">
    <woeid>12695808</woeid>
    <placeTypeName code="10">Local Administrative Area</placeTypeName>
    <name>London Borough of Hackney</name>
    </place>

    </places>
  • 99. Far more than you could ever want
    http://delicious.com/codepo8/geotoys
  • 100. never work with children, animals or live demos
    elephipelephi on Flickr : http://www.flickr.com/photos/elephipelephi/1493013250/
  • 101. not taking notes?
    selva on Flickr : http://www.flickr.com/photos/selva/24604141/
  • 102. London Twitter #devnest 7, March 2010
    (Almost) Everything You Ever Wanted To Know About Geo (with WOEIDs)…
    Gary Gale, Yahoo! Geo Technologies
    http://slideshare.net/vicchi
  • 103. thanks for listening
    Paul Keleher on Flickr : http://www.flickr.com/photos/pkeleher/1658311814/
  • 104. www.ygeoblog.com
    twitter.com/vicchi
    twitter.com/yahoogeo