Twitter has launched a Geotagging API – we really wanted to enable users to talk not only about "What's happening?" but also "What's happening right here?" For a while now, we've been watching users try to geotag their tweets through a variety of methods, all of which involve embedding a link to a map service in the Tweet. This talk delves into how Twitter handles its geo content, including tool suggestions.
As a platform, we've tried to make things easier for our users by making location omnipresent throughout the platform and an inherent (but optional) part of a tweet. We're making the platform about not just time, but also place.
The InfluxDB 2.0 Storage Engine | Jacob Marble | InfluxData
The InfluxDB storage engine was completely overhauled for 2.0. Jacob will walk through why we made these changes and discuss architectural considerations in using the new TSM engine.
Data warehouse or conventional database: Which is right for you? | Data Con LA
Data Con LA 2020
Description
Developers have a plethora of choices for application data stores. In this talk we'll explore the differences between transaction processing systems like MySQL and analytic databases like ClickHouse to help you make the best choice for your application. Confused about when to use a data warehouse vs. a traditional relational database? Open source has so many choices! Using MySQL and ClickHouse as examples, we'll work through use cases to see where each shines. Along the way we'll explore key technical differences like:
* row vs. column storage
* indexing and compression
* query parallelization
* concurrency support
* transaction models.
Finally we'll discuss how to handle use cases that require capabilities of both. Listeners will leave with clear criteria and a deeper understanding of database internals that enable them to make the right choice(s) for their own use cases.
Speaker
Robert Hodges, CEO, Altinity, Inc.
What's the great thing about a database? Why, it stores data of course! However, one feature that makes a database useful is the different data types that can be stored in it, and the breadth and sophistication of the data types in PostgreSQL is second-to-none, including some novel data types that do not exist in any other database software!
This talk will take an in-depth look at the special data types built right into PostgreSQL version 9.4, including:
* INET types
* UUIDs
* Geometries
* Arrays
* Ranges
* Document-based Data Types:
* Key-value store (hstore)
* JSON (text [JSON] & binary [JSONB])
We will also have some cleverly concocted examples to show how all of these data types can work together harmoniously.
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ... | Altinity Ltd
Slides for the webinar presented on June 16, 2020
By James Hartig, Co-Founder of Admiral, and Robert Hodges, Altinity CEO
Advertising is dying in the wake of privacy and adblockers. Join us for a conversation with James Hartig, a Co-Founder at Admiral (getadmiral.com), who helps publishers diversify their revenue and build more meaningful relationships with users. We'll start with an overview of Admiral's platform and how they use large scale session data to power their engagement engine. We'll then discuss the ClickHouse features that Admiral uses to power these real-time decisions. Finally, we'll walk through how Admiral migrated from MongoDB to ClickHouse and some of their plans for future projects. Join us to learn how ClickHouse drives cutting edge real-time applications today!
Speaker Bios:
James Hartig is one of the Co-Founders of Admiral working on distributed systems in Golang. Before this, he worked at the online music streaming platform, Grooveshark.
Robert Hodges is CEO of Altinity, which offers enterprise support for ClickHouse. He has over three decades of experience in data management spanning 20 different DBMS types. ClickHouse is his current favorite. ;)
ClickHouse Materialized Views: The Magic Continues | Altinity Ltd
Slides for the webinar, presented on February 26, 2020
By Robert Hodges, Altinity CEO
Materialized views are the killer feature of ClickHouse, and the Altinity 2019 webinar on how they work was very popular. Join this updated webinar to learn how to use materialized views to speed up queries hundreds of times. We'll cover basic design, last point queries, using TTLs to drop source data, counting unique values, and other useful tricks. Finally, we'll cover recent improvements that make materialized views more useful than ever.
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge... | Altinity Ltd
Webinar May 27, 2020
ClickHouse is famously fast, but a small amount of extra work makes it much faster. Join us for the latest version of our popular talk on single-node ClickHouse performance. We start by examining the system log to see what ClickHouse queries are doing. Then we introduce standard tricks to increase speed: adding CPUs, reducing I/O with filters, restructuring joins, adding indexes, and using materialized views, plus many more. In each case we show how to measure the results of your work. As usual, there will be time for questions at the end. Sign up now to polish your ClickHouse performance skills!
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges | Altinity Ltd
From webinars September 11 and September 17, 2019
ClickHouse is famous for speed. That said, you can almost always make it faster! This webinar uses examples to teach you how to deduce what queries are actually doing by reading the system log and system tables. We'll then explore standard ways to increase query speed: data types and encodings, filtering, join reordering, skip indexes, materialized views, session parameters, to name just a few. In each case we'll circle back to query plans and system metrics to demonstrate changes in ClickHouse behavior that explain the boost in performance. We hope you'll enjoy the first step to becoming a ClickHouse performance guru!
Speaker Bio:
Robert Hodges is CEO of Altinity, which offers enterprise support for ClickHouse. He has over three decades of experience in data management spanning 20 different DBMS types. ClickHouse is his current favorite. ;)
Slides from my Introduction to PostGIS workshop at the FOSS4G conference in 2009. The material is available at http://revenant.ca/www/postgis/workshop/
Location Analytics - Real-Time Geofencing using Kafka | Guido Schmutz
An important underlying concept behind location-based applications is geofencing. Geofencing is a process that allows acting on users and/or devices who enter or exit a specific geographical area, known as a geo-fence. A geo-fence can be dynamically generated, as in a radius around a point location, or it can be a predefined set of boundaries (such as secured areas, buildings, or the borders of counties, states, or countries). Geofencing lays the foundation for realising use cases around fleet monitoring, asset tracking, phone tracking across cell sites, connected manufacturing, ride-sharing solutions, and many others. Many of these use cases require low-latency actions to take place when a device enters, leaves, or approaches a geo-fence. That's where streaming data ingestion and streaming analytics, and therefore the Kafka ecosystem, come into play. This session will present how location analytics applications can be implemented using Kafka, KSQL, and Kafka Streams. It highlights the existing features available out of the box and then shows how easy it is to extend them with user-defined functions (UDFs).
IT Days - Parse huge JSON files in a streaming way.pptx | Andrei Negruti
Everyone uses JSON files. Thankfully, most of the time the JSON files we use are small, and we can just read and process everything in memory because it is convenient and easy to do. But most of the time is not all the time. Sometimes you must process big JSON files, and the moment you try to do this the old-fashioned way you will soon see the dreadful "java.lang.OutOfMemoryError". One search on the internet and you will find solutions to this problem. Concisely, you will see a variation of these answers:
Split your file into smaller ones.
Increase max memory used (yes, this is one of the answers).
Save the JSON in a temporary file and use the streaming capabilities of GSON or Jackson.
GSON and Jackson work well, but they require you to write a lot of boilerplate code and get your hands dirty with tokens, if-checks, path checks, etc. We developed a fourth option: we were able to abstract away what Jackson can do and create an interface that is easy to understand and interact with. With its help we delivered increased performance and reduced the memory our service needs by more than 50%, while also being able to translate an unlimited number of paragraphs, because we no longer hold the entire file in memory.
Location Analytics Real-Time Geofencing using Kafka | Guido Schmutz
An important underlying concept behind location-based applications is geofencing. Geofencing is a process that allows acting on users and/or devices who enter or exit a specific geographical area, known as a geo-fence. A geo-fence can be dynamically generated, as in a radius around a point location, or it can be a predefined set of boundaries (such as secured areas, buildings, or the borders of counties, states, or countries).
Geofencing lays the foundation for realizing use cases around fleet monitoring, asset tracking, phone tracking across cell sites, connected manufacturing, ride-sharing solutions and many others.
GPS tracking constantly reports in real time where a device is located, forming a stream of events that needs to be analyzed against the much more static set of geo-fences. Many of the use cases mentioned above require low-latency actions to take place when a device enters, leaves, or approaches a geo-fence. That's where streaming data ingestion and streaming analytics, and therefore the Kafka ecosystem, come into play.
This session will present how location analytics applications can be implemented using Kafka, KSQL, and Kafka Streams. It highlights the existing features available out of the box and then shows how easy it is to extend them with user-defined functions (UDFs). The design of such a solution, so that it can scale with both an increasing number of position events and a growing set of geo-fences, will be discussed as well.
Ingesting streaming data into Graph Database | Guido Schmutz
This talk presents the experience of a customer project where we built stream-based ingestion into a graph database. It is one thing to load the graph first and then query it. It is another story if the data to be added to the graph is constantly streaming in while you are querying it. Data is easy to add if each single message ends up as a new vertex in the graph. But if a message consists of hierarchical information, it most often means creating multiple new vertices as well as adding edges to connect this information. What if a node already exists in the graph? Do we create it again, or do we rather add edges that link to the existing node? Creating multiple nodes for the same real-life entity is not the best choice, so we have to check for existence first. We end up requiring multiple operations against the graph, which turned out to be a bottleneck. This talk presents the implementation of an ingestion pipeline and the design choices we made to improve performance.
HashiCorp’s infrastructure management tool, Terraform, is no doubt very flexible and powerful. The question is, how do we write Terraform code and construct our infrastructure in a reproducible fashion that makes sense? How can we keep code DRY, segment state, and reduce the risk of making changes to our service/stack/infrastructure?
This talk describes a design pattern to help answer the previous questions. The talk is divided into two sections, with the first section describing and defining the design pattern with a Deployment Example. The second part uses a multi-repository GitHub organization to create a Real World Example of the design pattern.
To scale or not to scale: Key/Value, Document, SQL, JPA – What's right for my... | Uri Cohen
This presentation focuses on the various data and querying models available in today's distributed data store landscape. It reviews which models and APIs are available and discusses the capabilities each provides, the applicable use cases, and what each means for your application's performance and scalability.
3. Background
Wherehoo (2000)
‣ “The Stuff Around You”
‣ “Wherehoo Server: An interactive location service for software agents and intelligent systems” - J. Youll, R. Krikorian
‣ In your /etc/services file
BusRadio (2004)
‣ Designed mobile computers to play media while also transmitting telemetry
‣ Looked and sounded like a radio - but really a Linux computer
OneHop (2007)
‣ Bluetooth proximity-based social networking
4. Table of Contents
Background
‣ Why are we interested in this?
Twitter’s Geo APIs
‣ How do we allow people to talk about place?
Problem statement
‣ What are we trying to have our system do?
Infrastructure
‣ How is Twitter solving this problem?
11. Original attempts
Adding it to the tweet
‣ Use myloc.me et al. to add text to the tweet
‣ Localizes the mobile phone and puts location “in band”
‣ Takes away from the 140 characters
Setting profile level locations
‣ Set the user/location of a Twitter user
‣ There is an API for that!
‣ Not on a per-tweet basis and not designed for high frequency updates
16. Geotagging API
Adding it to the tweet
‣ Per-tweet basis
‣ Out of band / pure meta-data
‣ Does not take away from the 140 characters
Native Twitter support
‣ Simple way to update status with location data
‣ Ability to remove geotags from your tweets en masse
‣ Using GeoRSS and GeoJSON as the encoding format
‣ Across all Twitter APIs (REST, Search, and Streaming)
19. Search
search (with geocode)
curl "http://search.twitter.com/search.atom?
geocode=40.757929%2C-73.985506%2C25km&source=foursquare"
The geocode parameter takes “latitude,longitude,radius”, where radius has units of mi or km
...
<title>On the way to ace now, so whenever you can make it I'll be there. (@ Port Imperial Ferry in Weehawken) http://4sq.com/2rq0vO</title>
...
<twitter:geo>
<georss:point>40.7759 -74.0129</georss:point>
</twitter:geo>
...
28. Trends API
Global trends
‣ Currently on front page of Twitter.com and on search.twitter.com
‣ Analysis of “hot conversations”
‣ Does not take from the 140 characters
Location specific trends
‣ Tweets being localized through a variety of means into trends
‣ Locations exposed over the API as WOEIDs
‣ Can ask for available trends sorted by distance from your location
‣ Querying for a parent of a location will return all locations under it
29. Available locations
trends/available
curl "http://api.twitter.com/1/trends/available.xml"
Can optionally take lat and long parameters to have trend locations returned sorted by distance from you.
<locations type="array">
<location>
<woeid>2487956</woeid>
<name>San Francisco</name>
<placeTypeName code="7">Town</placeTypeName>
<country type="Country" code="US">United States</country>
<url>http://where.yahooapis.com/v1/place/2487956</url>
</location>
...
</locations>
30. Available locations
trends/woeid.xml (trends/twid.xml coming soon)
curl "http://api.twitter.com/1/trends/2487956.xml"
Look up the trends at the given WOEID
<matching_trends type="array">
<trends as_of="2009-12-15T20:19:09Z">
...
<trend url="http://search.twitter.com/search?q=Golden+Globe+nominations" query="Golden+Globe+nominations">Golden Globe nominations</trend>
<trend url="http://search.twitter.com/search?q=%23somethingaintright" query="%23somethingaintright">#somethingaintright</trend>
...
</trends>
</matching_trends>
32. Geo-place API
Support for “names”
‣ Not just coordinates
‣ More contextually relevant
‣ Positive privacy benefits
Increased complexity
‣ Need to be able to look up a list of places
‣ Requires a “reverse geocoder”
‣ Human-driven tagging; not possible to make fully automatic
38. What do we need to build?
‣ Database of places
‣ Given a real-world location, find the programmatic places that that location maps to
‣ Spatial search
‣ Method to store places with content
‣ Per user basis
‣ Per tweet basis
40. As background... MySQL + GIS
‣ Ability to index points and do a spatial query
‣ For example, get points within a bounding rectangle
‣ SELECT MBRContains(GeomFromText('POLYGON((0 0,0 3,3 3,3 0,0 0))'), coord)
FROM geometry
‣ Hard to cache the spatial query
‣ Possibly requires a DB hit on every query
41. Options
Grid / Quad-tree
‣ Create a grid (possibly nested) of the entire Earth
Geohash
‣ Arbitrarily precise and hierarchical spatial data reference
Space filling curves
‣ Mapping 2D space into 1D while preserving locality
R-Tree
‣ Spatial access data structure
46. Geohash
‣ 37°18’N 121°54’W = 9q9k4
‣ Hierarchical spatial data structure
‣ Precision encoded
‣ Distance captured
‣ Nearby places (usually) share the same prefix
‣ The longer the string match, the closer the places are
48. Geohash
‣ Possible to do range query in database
‣ Matching based on prefix will return all the points that fit in that “grid”
‣ Able to store 2D data in a 1D space
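To make the prefix behavior concrete, here is a minimal geohash encoder in Java (an illustrative sketch, not Twitter code; the class and method names are invented). It reproduces the example from slide 46: encode(37.3, -121.9, 5) yields "9q9k4".

// Illustrative geohash encoder: alternately bisect longitude and latitude,
// recording one bit per step, and emit every 5 bits as a base-32 digit.
public class Geohash {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    public static String encode(double lat, double lon, int precision) {
        double[] latRange = {-90.0, 90.0};
        double[] lonRange = {-180.0, 180.0};
        boolean useLon = true; // even bits refine longitude, odd bits latitude
        int bits = 0, value = 0;
        StringBuilder hash = new StringBuilder();
        while (hash.length() < precision) {
            double[] range = useLon ? lonRange : latRange;
            double mid = (range[0] + range[1]) / 2.0;
            value <<= 1;
            if ((useLon ? lon : lat) >= mid) {
                value |= 1;
                range[0] = mid; // keep the upper half
            } else {
                range[1] = mid; // keep the lower half
            }
            useLon = !useLon;
            if (++bits == 5) { // five bits make one base-32 character
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(37.3, -121.9, 5)); // prints 9q9k4
    }
}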
51. Space filling curve
‣ Generalization of geohash
‣ 2D to 1D mapping
‣ Nearness is captured
‣ Recursively can fill up space depending on resolution desired
‣ Fractal-like pattern can be used to take up as much room as possible
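As a sketch of the 2D-to-1D mapping (again illustrative Java, not anything from Twitter's stack), a Z-order curve interleaves the bits of two grid coordinates so that nearby cells usually get nearby keys:

// Interleave the bits of two 16-bit grid coordinates into one 32-bit
// Z-order (Morton) key; bit i of x lands at position 2i, bit i of y at 2i+1.
public class ZOrder {
    public static int interleave(int x, int y) {
        int key = 0;
        for (int i = 0; i < 16; i++) {
            key |= ((x >> i) & 1) << (2 * i);
            key |= ((y >> i) & 1) << (2 * i + 1);
        }
        return key;
    }

    public static void main(String[] args) {
        // Neighboring cells map to nearby keys: (3,1) -> 7, (3,2) -> 13
        System.out.println(interleave(3, 1));
        System.out.println(interleave(3, 2));
    }
}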
54. R-Tree
‣ Height-balanced tree data structure for spatial data
‣ Uses hierarchically nested bounding boxes
‣ Nearby elements are placed in the same node
57. How do you store precision?
‣ “Precision” is a hard thing to encode
‣ Accuracy can be encoded with an error radius
‣ Twitter opts for tracking the number of decimals passed
‣ 140.0 != 140.00
‣ DecimalTrackingFloat
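The deck names DecimalTrackingFloat but not its internals; the following is a hypothetical sketch of the idea: carry the parsed value together with the number of decimal digits the client sent, so 140.0 and 140.00 stay distinguishable.

// Hypothetical sketch of a DecimalTrackingFloat: the value plus how many
// decimal digits were originally supplied, so 140.0 != 140.00.
public final class DecimalTrackingFloat {
    private final double value;
    private final int decimals; // digits after the '.' as received

    private DecimalTrackingFloat(double value, int decimals) {
        this.value = value;
        this.decimals = decimals;
    }

    public static DecimalTrackingFloat parse(String s) {
        int dot = s.indexOf('.');
        int decimals = (dot < 0) ? 0 : s.length() - dot - 1;
        return new DecimalTrackingFloat(Double.parseDouble(s), decimals);
    }

    @Override
    public String toString() {
        // Render with exactly the precision the client claimed
        return String.format("%." + decimals + "f", value);
    }

    public static void main(String[] args) {
        System.out.println(parse("140.0"));  // 140.0
        System.out.println(parse("140.00")); // 140.00
    }
}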
60. Twitter Infrastructure
‣ Ruby on Rails-ish frontend
‣ Scala-based services backend
‣ MySQL and soon to be Cassandra as the store
‣ RPC to back-end or put items into queues
63. Simplified architecture
‣ R-Tree for spatial lookup
‣ Data provider for front-end lookups
‣ Store place object with envelope of place in R-Tree
‣ Mapping from ID to place object
64. Java Topology Suite (JTS)
‣ http://www.vividsolutions.com/jts/jtshome.htm
‣ Open source
‣ Good for representing and manipulating “geometries”
‣ Has support for fundamental geometric operations
‣ contains
‣ envelope
‣ Has an R-Tree implementation
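As a rough sketch of how the simplified architecture from slide 63 maps onto JTS (illustrative only; RockDove's code is not public, and the PlaceIndex class and its method names are invented here), place envelopes go into an STRtree and an exact contains() test filters the candidates:

import com.vividsolutions.jts.geom.Coordinate;
import com.vividsolutions.jts.geom.Geometry;
import com.vividsolutions.jts.geom.GeometryFactory;
import com.vividsolutions.jts.geom.Point;
import com.vividsolutions.jts.index.strtree.STRtree;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PlaceIndex {
    private final GeometryFactory factory = new GeometryFactory();
    private final STRtree tree = new STRtree();                   // R-Tree of envelopes
    private final Map<String, Geometry> places = new HashMap<>(); // id -> boundary

    // Store the place boundary's envelope in the R-Tree and keep the exact
    // geometry around, since an envelope overstates the region it covers.
    public void add(String placeId, Geometry boundary) {
        places.put(placeId, boundary);
        tree.insert(boundary.getEnvelopeInternal(), placeId);
    }

    // containedWithin(lat, long): cheap envelope query first, exact test second.
    public List<String> containedWithin(double lat, double lon) {
        Point p = factory.createPoint(new Coordinate(lon, lat)); // JTS x = lon, y = lat
        List<String> hits = new ArrayList<>();
        for (Object candidate : tree.query(p.getEnvelopeInternal())) {
            String placeId = (String) candidate;
            if (places.get(placeId).contains(p)) {
                hits.add(placeId);
            }
        }
        return hits;
    }
}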
65. [Diagram: point inside the polygon: in polygon? true. Point outside: in polygon? false.]
66. Point-in-region lookups against two overlapping regions:
at (0.0, 0.0) -- region 1
at (1.0, 1.0) -- region 1 -- region 2
at (2.0, 2.0) -- region 1 -- region 2
at (3.0, 3.0) -- region 2
at (4.0, 4.0) -- empty
67. Java Topology Suite (JTS)
‣ Serializers and deserializers
‣ Well-known text (WKT)
‣ Well-known binary (WKB)
‣ No GeoRSS or GeoJSON support
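For instance, a small JTS round trip through WKT (a generic usage sketch, not code from the deck; the polygon reuses the MySQL example from slide 40):

import com.vividsolutions.jts.geom.Geometry;
import com.vividsolutions.jts.io.ParseException;
import com.vividsolutions.jts.io.WKTReader;
import com.vividsolutions.jts.io.WKTWriter;

public class WktRoundTrip {
    public static void main(String[] args) throws ParseException {
        WKTReader reader = new WKTReader();

        // Parse geometries from well-known text
        Geometry polygon = reader.read("POLYGON((0 0, 0 3, 3 3, 3 0, 0 0))");
        Geometry point = reader.read("POINT(1 1)");

        System.out.println(polygon.contains(point));       // true
        System.out.println(polygon.getEnvelopeInternal()); // Env[0.0 : 3.0, 0.0 : 3.0]
        System.out.println(new WKTWriter().write(point));  // POINT (1 1)
    }
}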
68. Interface / RPC
‣ RockDove is a backend service
‣ Data provider for front-end lookups
‣ Uses some form of RPC (Thrift, Avro, etc.) to communicate with the frontend
‣ Data could be cached on frontend to prevent lookups
‣ Simple RPC interface
‣ get(id)
‣ containedWithin(lat, long)
70. Interface / RPC
‣ Watch those RPC queues!
‣ Fail fast and potentially throw “over capacity” messages
‣ get(id) throws OverCapacity
‣ containedWithin(lat, long) throws OverCapacity
‣ Distinguish between write path and read path
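Rendered as a Java interface, the contract described above might look like the following hypothetical sketch (the real service speaks Thrift or similar, and the Place type here is invented):

import java.util.List;

// Minimal stand-in for the place object stored behind the service.
class Place {
    final long id;
    final String name;
    Place(long id, String name) { this.id = id; this.name = name; }
}

// Hypothetical RockDove contract: both reads fail fast when RPC queues back up.
interface RockDove {
    class OverCapacityException extends Exception {}

    // Mapping from ID to place object (read path)
    Place get(long placeId) throws OverCapacityException;

    // Spatial lookup: which places contain this coordinate? (read path)
    List<Place> containedWithin(double lat, double lon) throws OverCapacityException;
}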
71. GeoRuby
‣ http://georuby.rubyforge.org/
‣ Open source
‣ OpenGIS Simple Features Interface Standard
‣ Only good for representing geometric entities
‣ GeoRuby::SimpleFeatures::Geometry::from_ewkb
‣ No GeoJSON serializers
74. Location in Browser
‣ Geolocation API Specification for JavaScript
navigator.geolocation.getCurrentPosition
‣ Does a callback with a position object
‣ position.coords has
‣ latitude and longitude
‣ accuracy
‣ other stuff
‣ Support in Firefox 3.5, Chromium, Opera, and others with Google Gears