Mack Hardy (@mackaffinity) of Affinity Bridge (@affinitybridge) discusses server-side mapping tools for Drupal: using PostGIS as a spatial backend, generating tiles, managing large sets of geodata, and displaying them in the Drupal CMS.
7. Storage - GeoField
Where do we store data in Drupal?
A common field format for geodata:
• WKT
• Lat/Lon
• Bounding box
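The three representations listed on this slide can be illustrated with a short Python sketch. It is not Geofield's code: the function names and the dictionary keys are hypothetical and chosen only to show how one geometry can be expressed as WKT, as a lat/lon pair, and as a bounding box.

```python
from typing import Dict, List, Tuple

# (lon, lat) pairs: WKT writes coordinates in x y order, so longitude first.
Point = Tuple[float, float]


def point_to_wkt(lat: float, lon: float) -> str:
    """Serialize a lat/lon pair as a WKT POINT (note the lon-lat order)."""
    return f"POINT ({lon} {lat})"


def polygon_to_wkt(ring: List[Point]) -> str:
    """Serialize one ring as a WKT POLYGON, closing the ring if needed."""
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]  # WKT rings repeat the first vertex at the end
    coords = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"POLYGON (({coords}))"


def bounding_box(ring: List[Point]) -> Dict[str, float]:
    """Compute the smallest axis-aligned box that contains every vertex."""
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return {"left": min(lons), "bottom": min(lats),
            "right": max(lons), "top": max(lats)}


if __name__ == "__main__":
    triangle = [(-123.2, 49.2), (-123.0, 49.2), (-123.0, 49.3)]
    print(point_to_wkt(49.28, -123.12))
    print(polygon_to_wkt(triangle))
    print(bounding_box(triangle))
```

Storing the bounding box next to the full geometry allows a cheap first-pass filter before any exact geometry test is run.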
8.
9. Spatial Import
How do we import data from external sources?
Shapefiles
KML files
Uses GeoPHP and ogr2ogr
Spatial module -> Saves as WKT -> Geofield
http://drupal.org/project/spatial
10. ogr2ogr
Wrapper module for the GDAL ogr2ogr command-line utility
- Spatial module calls ogr2ogr
- Converts data from source formats (Shapefile, KML) to WKT
http://drupal.org/project/ogr2ogr
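For readers who have not used ogr2ogr directly, here is a rough Python sketch of one way to ask it for WKT output. The slides do not show how the Drupal module invokes the tool, so this is an assumption: it uses GDAL's CSV driver with the `GEOMETRY=AS_WKT` layer creation option, and the file names are hypothetical.

```python
import shlex
from typing import List


def build_ogr2ogr_wkt_command(source_path: str, output_csv: str) -> List[str]:
    """Build an ogr2ogr argument list that writes geometries as WKT in a CSV.

    The command is only constructed here, not executed; pass the list to
    subprocess.run() on a machine where GDAL is installed.
    """
    return [
        "ogr2ogr",
        "-f", "CSV",                # output driver
        output_csv,                 # destination comes before the source
        source_path,                # e.g. a Shapefile or KML file
        "-lco", "GEOMETRY=AS_WKT",  # CSV driver option: emit a WKT column
        "-t_srs", "EPSG:4326",      # reproject to WGS 84 lat/lon
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_ogr2ogr_wkt_command("traplines.shp", "traplines.csv")
    print(shlex.join(cmd))
```

Building the argument list separately from running it keeps the conversion step easy to log and to test without GDAL present.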
12. Sync_PostGIS Module
• Allows Drupal to query PostGIS as a spatial query service, much in the way Solr is used for search
• Syncs data from Drupal entities with geofields to PostGIS
• Provides query methods for testing intersection, within, and buffer conditions
http://drupal.org/project/sync_postgis
14. What does sync_postgis tell us?
Intersections with other data points
Within a buffer of 5km
15. Testing for Intersection
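function geoquery_intersects($item1, $item2) {
  $params = array($item1, $item2);
  // A bare ID is treated as a node ID and expanded to an entity descriptor.
  foreach ($params as &$param) {
    if (is_scalar($param)) {
      $param = array('entity_type' => 'node', 'eid' => $param);
    }
  }
  unset($param); // Break the reference left over from the by-reference loop.
  // Run the intersection test in PostGIS if a connection is available.
  if ($connection = sync_postgis_get_postgis_connection()) {
    $geo_query = new syncPgQuery($connection);
    return $geo_query->booleanRelQuery('intersects', $params[0], $params[1])->execute();
  }
}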
16. Testing for Buffer Distance
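function geoquery_dwithin($item1, $item2, $distance = 0, $srid = 4326) {
  $params = array($item1, $item2);
  // A bare ID is treated as a node ID and expanded to an entity descriptor.
  foreach ($params as &$param) {
    if (is_scalar($param)) {
      $param = array('entity_type' => 'node', 'eid' => $param);
    }
  }
  unset($param); // Break the reference left over from the by-reference loop.
  // Ask PostGIS whether the two items lie within $distance of each other.
  if ($connection = sync_postgis_get_postgis_connection()) {
    $geo_query = new syncPgQuery($connection, $srid);
    return $geo_query->booleanRelQuery('dwithin', $params[0], $params[1], $distance)->execute();
  }
}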
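To make the "within a buffer" test concrete, the following Python sketch checks whether two lat/lon points lie within a given distance, using the haversine great-circle formula. It is a stand-in for illustration, not what sync_postgis or PostGIS runs internally, and the function names are hypothetical. Note that PostGIS's ST_DWithin measures geometry types in the units of their spatial reference system, so the slides do not tell us how the module turns a distance in kilometres into a query in SRID 4326.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres

LatLon = Tuple[float, float]  # (lat, lon) in decimal degrees


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dwithin(a: LatLon, b: LatLon, distance_m: float) -> bool:
    """True when point b lies within distance_m metres of point a."""
    return haversine_m(a[0], a[1], b[0], b[1]) <= distance_m


if __name__ == "__main__":
    target = (49.28, -123.12)
    nearby = (49.29, -123.12)   # 0.01 degrees of latitude to the north
    far = (49.38, -123.12)      # 0.1 degrees of latitude to the north
    print(dwithin(target, nearby, 5000))
    print(dwithin(target, far, 5000))
```

This only handles points; buffering lines and polygons is where a spatial backend such as PostGIS earns its keep.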
17. Displaying the results
With the results from the PostGIS backend, we can show the user useful information.
In this case they know that:
- the target is in the protected area
- the target intersects 2 traplines
- the target is within a 5 km buffer of 4 other nodes of interest
19. Beyond Vector Based Maps
We want to show huge datasets. The vector model requires "painting" the data onto the map, which is computationally expensive.
Pre-rendering the dataset onto tiles means the client can load the data quickly, and tiles are easy to cache.
The obvious downside of caching is that it doesn't work well with frequently changing data.
TileMill has been great for creating base tiles, but regenerating the entire tileset when the data changes is hard and time consuming.
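The cost of regenerating a tileset is easier to see with the standard XYZ ("slippy map") tile arithmetic used by web maps in spherical Mercator. The Python sketch below is illustrative only: the function names and the example bounding box are hypothetical, and it does not clamp latitudes outside the Mercator range.

```python
import math
from typing import Tuple


def lat_lon_to_tile(lat: float, lon: float, zoom: int) -> Tuple[int, int]:
    """Return the (x, y) index of the XYZ tile containing a lat/lon point."""
    n = 2 ** zoom  # tiles per axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y


def tiles_for_bbox(left: float, bottom: float, right: float, top: float,
                   zoom: int) -> int:
    """Count the tiles that intersect a bounding box at one zoom level."""
    x_min, y_min = lat_lon_to_tile(top, left, zoom)      # tile y grows southward
    x_max, y_max = lat_lon_to_tile(bottom, right, zoom)
    return (x_max - x_min + 1) * (y_max - y_min + 1)


if __name__ == "__main__":
    # Hypothetical bounding box, for illustration only.
    bbox = (-124.0, 49.0, -122.0, 50.0)
    for zoom in range(8, 15):
        print(zoom, tiles_for_bbox(*bbox, zoom))
```

Each extra zoom level roughly quadruples the tile count for the same area, which is why regenerating a whole tileset after every data change gets expensive and why rendering tiles on demand behind a cache is attractive.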
20. Comparing Vector vs MB Tiles
Vector - 2.30 MB of transfer - Client side render
Tiles - 529 kB of transfer - Server side render
21. Tilestache
Python application for serving tiles
TileStache takes inputs of:
--- MBTiles, which are pre-generated
--- Mapnik configuration (to generate tiles on the fly)
--- vector (GeoJSON, ArcJSON)
--- combinations of these inputs as a composite
We are generating tiles using Mapnik with PostGIS as a datasource
- TileStache provides a caching layer for serving tiles
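TileStache reads a JSON configuration with a `cache` section and a `layers` section. The Python sketch below generates a minimal configuration for a Mapnik-rendered layer with a disk cache. The layer name `parcels`, the mapfile path, and the cache path are hypothetical; the presenters' actual configuration is not shown in the slides, and the PostGIS datasource itself would be declared inside the Mapnik XML file rather than here.

```python
import json
from typing import Any, Dict


def build_tilestache_config(mapfile: str, cache_path: str) -> Dict[str, Any]:
    """Return a minimal TileStache configuration as a Python dict."""
    return {
        # Rendered tiles are written to disk and reused on later requests.
        "cache": {"name": "Disk", "path": cache_path},
        "layers": {
            # Hypothetical layer name; it becomes part of the tile URL.
            "parcels": {
                "provider": {"name": "mapnik", "mapfile": mapfile},
                "projection": "spherical mercator",
            }
        },
    }


if __name__ == "__main__":
    config = build_tilestache_config("parcels.xml", "/tmp/stache")
    print(json.dumps(config, indent=2))
```

Saving the printed JSON as a configuration file and pointing a TileStache server at it should expose the layer as tiles addressed by layer name, zoom, x, and y.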
22. Composite maps from PostGIS Data
• base map: satellite images
• TileStache provides the data layers from PostGIS via Mapnik
• Leaflet map definition points to layers in the layer switcher
Image credit http://mike.teczno.com/notes/tilestache.html
23. WAX interactivity
• parcel data with tiles in JSON
• on mouseover and on click behaviours
• need to pre-cache WAX styling
30. Search API
Next we want to be able to return data items from Solr to a map
Search on non-geographic facets - just like a view
Search on geographic facets - facet controls pull data from PostGIS, or use Solr's spatial extensions
http://drupal.org/project/search_api_location
Location module - geocoding and showing on a map
Location module - proximity searches for points, small sets
* some examples of old-time proximity tools
* performance issues above 500 markers
Make a simpler version? Remove this slide?
Now that the geodata is in PostGIS, we can pull it into other applications, for example TileStache and TileMill.