Giving a real-time geo talk at @where20. How do you build stuff? #rtgeo. 19 Apr via Twitter for iPhone from Santa Clara Convention Center, 5001 Great America Parkway, Santa Clara, CA 95054
Background
Wherehoo (2000)
⇢ “The Stuff Around You”
⇢ “Wherehoo Server: An interactive location service for software agents and intelligent systems” - J. Youll, R. Krikorian
⇢ In your /etc/services file!
raffi@~/: cat /etc/services | grep wherehoo
wherehoo 5859/tcp # WHEREHOO
wherehoo 5859/udp # WHEREHOO
BusRadio (2004)
⇢ Designed mobile computers to play media while also transmitting telemetry
⇢ Looked and sounded like a radio - but really a Linux computer
OneHop (2007)
⇢ Bluetooth proximity-based social networking
Background
Twitter
⇢ Originally tech lead of API / Platform team
⇢ Built the first geo-based infrastructure before acquisition of Mixer Labs in December of 2009
⇢ Now lead of the Application Services group
⇢ Runs five teams focused on scalable infrastructure around “core” data objects
  ⇢ Tweets, users, timelines, places, etc.
  ⇢ Delivery, authentication, APIs, etc.
Table of contents
Background
⇢ Why are we interested in this?
Twitter’s geo APIs
⇢ How do we allow people to talk about place?
⇢ Context around “place”
Problem statement
⇢ What do we want our system to do?
Infrastructure
⇢ How is Twitter solving this problem?
Original attempts
Adding it to the tweet
⇢ Use myloc.me, et al. to add text to the tweet
⇢ Puts location “in band”
⇢ Takes from the 140 characters
Setting profile level locations
⇢ Set the user/location of a Twitter user
⇢ There’s an API for that!
⇢ Not a per-tweet basis
⇢ Not intended for high frequency alterations
Geotagging API
Adding it to the tweet
⇢ Per-tweet basis
⇢ Out of band and pure metadata
⇢ Does not take from the 140 characters
Native Twitter support
⇢ Simple way to update status with location data
⇢ Ability to remove geotags from your tweets en masse
⇢ Using GeoRSS and GeoJSON as the encoding format
⇢ Across all Twitter APIs (REST, Search, and Streaming)
Search
geocode parameter takes “latitude,longitude,radius” where radius has units of mi or km
raffi@~/: curl "http://search.twitter.com/search.atom?geocode=40.757929%2C-73.985506%2C25km&source=foursquare"
...
<title>On the way to ace now, so whenever you can make it Ill be there. (@ Port Imperial Ferry in Weehawken) http://4sq.com/2rq0vO</title>
...
<twitter:geo>
  <georss:point>40.7759 -74.0129</georss:point>
</twitter:geo>
...
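The geocode parameter above can be assembled programmatically. The following is a minimal sketch, assuming the endpoint and parameter names shown on the slide; the helper name `search_url` is hypothetical and not part of any Twitter library.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "http://search.twitter.com/search.atom"  # endpoint as shown on the slide


def search_url(lat, lon, radius, unit="km", **extra):
    """Build a search URL with a "latitude,longitude,radius" geocode.

    `search_url` is a hypothetical helper for illustration only.
    """
    if unit not in ("mi", "km"):
        raise ValueError("radius unit must be 'mi' or 'km'")
    geocode = f"{lat},{lon},{radius}{unit}"
    # urlencode percent-encodes the commas as %2C, matching the slide's URL.
    return SEARCH_ENDPOINT + "?" + urlencode({"geocode": geocode, **extra})


url = search_url(40.757929, -73.985506, 25, source="foursquare")
print(url)
```

Extra keyword arguments such as `source` are passed through unchanged as additional query parameters.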
location filtering
raffi@~/: curl "http://stream.twitter.com/1/statuses/filter.xml?locations=-74.5129,40.2759,-73.5019,41.2759"
locations is a bounding box specified by “long1,lat1,long2,lat2” and can track up to 10 locations that are at most 1 degree square (~60 miles square and enough to cover most metropolitan areas)
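A bounding box in “long1,lat1,long2,lat2” order is easy to get backwards, so a small helper may be useful. This is a sketch only: the function names are hypothetical, the limit of 10 boxes comes from the slide, and joining several boxes with commas is an assumption about how the parameter is concatenated.

```python
MAX_BOXES = 10  # limit stated on the slide


def locations_param(boxes):
    """Serialize (west, south, east, north) boxes as long1,lat1,long2,lat2,..."""
    if len(boxes) > MAX_BOXES:
        raise ValueError("at most 10 bounding boxes can be tracked")
    values = []
    for west, south, east, north in boxes:
        if west > east or south > north:
            raise ValueError("box must be given as south-west corner, then north-east corner")
        values.extend([west, south, east, north])
    return ",".join(str(v) for v in values)


def box_contains(box, lon, lat):
    """Return True if the point lies inside the box (edges included)."""
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


nyc_box = (-74.5129, 40.2759, -73.5019, 41.2759)  # box from the slide
print(locations_param([nyc_box]))
print(box_contains(nyc_box, -74.0129, 40.7759))  # the ferry tweet from the search slide
```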
Trends API
Global Trends
⇢ Analysis of “hot conversations”
⇢ Does not take from the 140 characters
Location specific trends
⇢ Tweets being localized through a variety of means internally
⇢ Locations exposed over the API as WOEIDs and Twitter IDs
⇢ Can ask for available trends sorted by distance
available locations
raffi@~/: curl "http://api.twitter.com/1/trends/available.xml"
<locations type="array">
  <location>
    <woeid>2487956</woeid>
    <name>San Francisco</name>
    <placeTypeName code="7">Town</placeTypeName>
    <country type="Country" code="US">United States</country>
    <url>http://where.yahooapis.com/v1/place/2487956</url>
  </location>
  ...
</locations>
Can optionally take a lat and long parameter to have the trend locations returned sorted by distance from you.
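Sorting locations by distance from a caller can be sketched with the haversine formula. This is an illustration, not a description of how the API does it: the response shown above carries no coordinates, so the latitude and longitude values below are approximate assumptions, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sort_by_distance(locations, lat, lon):
    """Return locations ordered nearest-first relative to (lat, lon)."""
    return sorted(locations, key=lambda loc: haversine_km(lat, lon, loc["lat"], loc["lon"]))


# Approximate city-centre coordinates, assumed for illustration.
locations = [
    {"name": "London", "lat": 51.5074, "lon": -0.1278},
    {"name": "New York", "lat": 40.7128, "lon": -74.0060},
    {"name": "San Francisco", "lat": 37.7749, "lon": -122.4194},
]
nearest_first = sort_by_distance(locations, 37.35, -121.95)  # roughly Santa Clara
print([loc["name"] for loc in nearest_first])
```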
Local trend
Look up a trend at a given WOEID
raffi@~/: curl "http://api.twitter.com/1/trends/2487956.xml"
<matching_trends type="array">
  <trends as_of="2009-12-15T20:19:09Z">
    ...
    <trend url="http://search.twitter.com/search?q=Golden+Globe+nominations" query="Golden+Globe+nominations">Golden Globe nominations</trend>
    <trend url="http://search.twitter.com/search?q=%23somethingaintright" query="%23somethingaintright">#somethingaintright</trend>
    ...
  </trends>
</matching_trends>
Sharing coordinates
More aptly named “geotagging”
Good for sharing photos
Possibly good for talking about a specific place (e.g. store, restaurant)
People don’t understand numbers and without a map, there is a lack of context
Huge privacy implications
Sharing polygons
Privacy implications are potentially better
If you thought sharing one pair of numbers was bad...
Questions around polygon definition
Still unable to visualize unless on a map
Sharing names
Has the potential to make a connection with users
Distinguishes a “named place” from simply a “place”
Inverse relationship between granularity and connection
Rather large internationalization / context implications
Geo-place API
Support for “names”
⇢ Not just coordinates
⇢ More contextually relevant
⇢ Positive privacy benefits
Increased complexity
⇢ Need to be able to look up a list of places
⇢ Requires a “reverse geocoder”
⇢ Human driven tagging and not possible to be fully automatic
as background... MySQL + GIS
Ability to index points and do a spatial query
⇢ For example, get points within a bounding rectangle
⇢ SELECT MBRContains(GeomFromText('POLYGON((0 0, 0 3, 3 3, 3 0, 0 0))'), coord) FROM geometry
Hard to cache the spatial query
Possibly requires a DB hit on every query
options
Grid / quad-tree
⇢ Create a grid (possibly nested) of the entire Earth
Geohash
⇢ Arbitrarily precise and hierarchical spatial data reference
Space filling curves
⇢ Mapping 2D space into 1D while preserving locality
R-Tree
⇢ Spatial access data structure
geohash
37°18′N 121°54′W = 9q9k4
Hierarchical spatial data structure
Precision encoded
Distance captured
⇢ Nearby places (usually) share the same prefix
⇢ The longer the string match, the closer the places are
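The encoding can be sketched in a few lines using the standard public geohash scheme: alternately bisect the longitude and latitude ranges, then emit one base-32 character per five bits. The function names here are hypothetical, and nothing in the talk says Twitter used this exact code.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, length=5):
    """Encode a latitude/longitude pair as a geohash of `length` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    nbits = 0
    refine_lon = True  # bits alternate, starting with longitude
    while len(chars) < length:
        if refine_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        refine_lon = not refine_lon
        nbits += 1
        if nbits == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            nbits = 0
    return "".join(chars)


def common_prefix_len(a, b):
    """Length of the shared prefix; longer usually means closer."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


# 37 deg 18 min N, 121 deg 54 min W from the slide
print(geohash_encode(37.3, -121.9))  # 9q9k4
```

Truncating a geohash gives the enclosing, coarser cell, which is why a prefix match can stand in for a proximity test. The "usually" on the slide matters: two points on either side of a cell boundary can be close together yet share only a short prefix.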
Space filling curve
Generalization of geohash
⇢ 2D to 1D mapping
⇢ Nearness is captured
Recursively can fill up space depending on resolution required
Fractal-like pattern can be used to take up as much room as possible
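One concrete space filling curve is the Z-order (Morton) curve, which interleaves the bits of the two coordinates and is the ordering that geohash itself follows. The slide does not say which curve is meant, so treat this as one illustrative choice; the function names and the 16-bit grid resolution are assumptions.

```python
def interleave_bits(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Z-order key.

    x occupies the odd bit positions and y the even ones.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i + 1)
        code |= ((y >> i) & 1) << (2 * i)
    return code


def morton_key(lat, lon, bits=16):
    """Map a latitude/longitude pair onto a 1D key via a 2^bits x 2^bits grid."""
    scale = (1 << bits) - 1
    x = int((lon + 180.0) / 360.0 * scale)  # column index from longitude
    y = int((lat + 90.0) / 180.0 * scale)   # row index from latitude
    return interleave_bits(x, y, bits)


# Sorting the cells of a 2x2 grid by key traces the "Z" shape.
cells = [(x, y) for x in range(2) for y in range(2)]
print(sorted(cells, key=lambda c: interleave_bits(c[0], c[1], bits=1)))
```

Because the key is a single integer, an ordinary sorted index (for example a B-tree) can answer range scans over it, at the cost of occasional jumps where the curve crosses a cell boundary.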
How do you store precision?
“Precision” is a hard thing to encode
Accuracy can be encoded with an error radius
Twitter opts for tracking the number of decimals passed
⇢ 140.0 != 140.00
⇢ DecimalTrackingFloat
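The slide names a DecimalTrackingFloat but does not show it. The following is a guess at the idea, not Twitter's implementation: keep the numeric value together with the number of decimal places in the text the client sent, and treat both as part of equality.

```python
class DecimalTrackingFloat:
    """A float that remembers how many decimal places were passed in.

    Hypothetical sketch; only the class name comes from the slide.
    """

    def __init__(self, text):
        text = text.strip()
        if "e" in text.lower():
            raise ValueError("exponent notation is not supported in this sketch")
        self.value = float(text)
        # Everything after the decimal point counts as stated precision.
        _, _, fraction = text.partition(".")
        self.decimals = len(fraction)

    def __eq__(self, other):
        if not isinstance(other, DecimalTrackingFloat):
            return NotImplemented
        return (self.value, self.decimals) == (other.value, other.decimals)

    def __hash__(self):
        return hash((self.value, self.decimals))

    def __repr__(self):
        return f"DecimalTrackingFloat('{self.value:.{self.decimals}f}')"


a = DecimalTrackingFloat("140.0")
b = DecimalTrackingFloat("140.00")
print(a != b, a.value == b.value)  # different precision, same numeric value
```

Formatting with the remembered decimal count lets a service echo a coordinate back exactly as precise as it was received, rather than implying accuracy the client never claimed.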
Twitter infrastructure
Ruby on Rails-ish frontend
Scala-based services backend
MySQL and soon to be Cassandra as the store
RPC to back-end or put items into queues
Simplified architecture
R-Tree for spatial lookup
⇢ Data provider for front-end lookups
⇢ Store place object with envelope of place in R-Tree
Mapping from ID to place object
Java Topology Suite (JTS)
http://www.vividsolutions.com/jts/jtshome.htm
Open source
Good for representing and manipulating “geometries”
Has support for fundamental geometric operations
⇢ contains
⇢ envelope
Has an R-Tree implementation
pointInside in polygon? true
pointOutside in polygon? false
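The containment test illustrated above can be sketched without JTS using the classic ray-casting rule: count how many polygon edges a horizontal ray from the point crosses, and an odd count means inside. This is a plain-Python stand-in, not the JTS `contains` implementation, and it makes no promise about points lying exactly on an edge.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test for a point against a simple polygon.

    `ring` is a list of (x, y) vertices; it may or may not repeat the
    first vertex at the end.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# The 3x3 square used in the MySQL example earlier in the deck.
square = [(0, 0), (0, 3), (3, 3), (3, 0), (0, 0)]
print(point_in_polygon(1.5, 1.5, square))  # True
print(point_in_polygon(4.0, 1.5, square))  # False
```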
at (0.0, 0.0) -- region 1
at (1.0, 1.0) -- region 1 -- region 2
at (2.0, 2.0) -- region 1 -- region 2
at (3.0, 3.0) -- region 2
at (4.0, 4.0) -- empty
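The lookup pattern from the simplified architecture (spatial index over envelopes, plus a map from ID to place object) can be sketched as follows. A linear scan stands in for the R-Tree here, which a real service would not do. The region envelopes are assumptions chosen so that two overlapping squares give overlapping answers; they are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    id: str
    name: str
    envelope: tuple  # (min_x, min_y, max_x, max_y)


class EnvelopeIndex:
    """Linear-scan stand-in for an R-Tree keyed by bounding envelopes."""

    def __init__(self):
        self._entries = []

    def insert(self, envelope, place_id):
        self._entries.append((envelope, place_id))

    def query(self, x, y):
        """Return IDs whose envelope covers the point (edges included)."""
        return [
            place_id
            for (min_x, min_y, max_x, max_y), place_id in self._entries
            if min_x <= x <= max_x and min_y <= y <= max_y
        ]


# Hypothetical regions: two overlapping squares.
places = {
    "region 1": Place("region 1", "Region One", (0.0, 0.0, 2.0, 2.0)),
    "region 2": Place("region 2", "Region Two", (1.0, 1.0, 3.0, 3.0)),
}
index = EnvelopeIndex()
for place in places.values():
    index.insert(place.envelope, place.id)

for point in [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]:
    hits = index.query(*point)
    print("at", point, "--", hits or "empty")
```

An envelope hit is only a candidate: an exact geometry test against the place's real outline would normally follow before the place object is returned from the ID map.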
Java Topology Suite (JTS)
Serializers and deserializers
⇢ Well-known text (WKT)
⇢ Well-known binary (WKB)
⇢ No GeoRSS or GeoJSON support
Interface / RPC
RockDove is a backend service
⇢ Data provider for front-end lookups
⇢ Uses some form of RPC (Thrift, Avro, etc.) to communicate
⇢ Data could be cached on frontend to prevent lookups
Simple RPC interface
⇢ get(id)
⇢ containedWithin(lat, long)
Interface / RPC
Watch those RPC queues!
Fail fast and potentially throw “over capacity” messages
⇢ get(id) throws OverCapacity
⇢ containedWithin(lat, long) throws OverCapacity
Distinguish between write path and read path
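The fail-fast behaviour can be sketched as an admission check in front of each call: if no capacity is free, raise immediately instead of queueing. Only the method names and the OverCapacity error come from the slides; the class name `PlaceService`, the in-flight limit, and the in-memory store are assumptions, and a real service would sit behind Thrift or a similar RPC layer.

```python
import threading


class OverCapacity(Exception):
    """Raised immediately when the service has no free capacity."""


class PlaceService:
    """Hypothetical in-process stand-in for the backend's read path."""

    def __init__(self, places, max_in_flight=8):
        self._places = places  # id -> {"name": ..., "bbox": (west, south, east, north)}
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def _admit(self):
        # Non-blocking acquire: never wait in a queue, fail fast instead.
        if not self._slots.acquire(blocking=False):
            raise OverCapacity("too many requests in flight")

    def get(self, place_id):
        self._admit()
        try:
            return self._places.get(place_id)
        finally:
            self._slots.release()

    def contained_within(self, lat, lon):
        self._admit()
        try:
            return [
                place_id
                for place_id, place in self._places.items()
                if place["bbox"][0] <= lon <= place["bbox"][2]
                and place["bbox"][1] <= lat <= place["bbox"][3]
            ]
        finally:
            self._slots.release()


store = {"sf": {"name": "San Francisco", "bbox": (-123.0, 37.0, -122.0, 38.0)}}
service = PlaceService(store)
print(service.contained_within(37.77, -122.42))

saturated = PlaceService(store, max_in_flight=0)  # simulate a full service
try:
    saturated.get("sf")
except OverCapacity as exc:
    print("over capacity:", exc)
```

Rejecting early keeps latency bounded for the requests that are admitted, and gives the front end a clear signal to serve cached data or show an over-capacity message.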
georuby
http://georuby.rubyforge.org/
Open source
OpenGIS Simple Features Interface Standard
Only good for representing geometric entities
GeoRuby::SimpleFeatures::Geometry::from_ewkb
No GeoJSON serializers
Triangulation: Cellular
200m to 1km accuracy
Measuring signal strength to cell towers with known locations
If only one cellular tower is visible, fall back to cellular tower identification - better than nothing, but really inaccurate
Requires cellular modem, software, and lookups
Triangulation: Wifi
Sub 20m accuracy
Works indoors and in urban areas
Doesn’t need dedicated hardware, just an 802.11 radio
Relatively quick time to get a position
Triangulation: GPS
Sub 1m accuracy
Need dedicated GPS hardware
Prone to multi-path confusion especially in cities
Needs line of sight to the sky
Doesn’t work well indoors
Potentially takes a few minutes to get a lock
Association
IP address to geographical mapping
All done on the server side
Maybe “good” for city level
⇢ Maxmind has 83% at 40km
⇢ Very error prone
⇢ Gets wonky when dealing with cellular connections or rather large ISPs
Database needs to be refreshed fairly frequently
Extraction
Read the text and understand intent
Hard to understand whether talking from a place, or about a place
Running text through a geocoder (Google, Yahoo, Geocoder.us)
Parsing structured URLs and then crawling “place pages”