This document discusses implementing real-time geofencing using Apache Kafka. It presents four approaches to joining vehicle position data with geofence polygons: 1) using a cross join, 2) aggregating geofences by group, 3) using a custom UDF to aggregate geofences by geohash, and 4) aggregating geofences into a table keyed by geohash to improve performance. The document focuses on implementing these approaches in KSQL, performing stream processing on the vehicle position and geofence data streams with SQL-like queries; a sketch of the fourth approach follows below.
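As an illustration of the fourth approach, here is a minimal KSQL sketch; the stream, table and column names, and the GEO_HASH UDF, are assumptions for illustration rather than code taken from the document.

-- Approach 4 (sketch): key the geofences by geohash, so that each vehicle
-- position only has to be joined against the geofences in its own geohash cell.
CREATE TABLE geo_fence_by_geohash AS
  SELECT geo_hash, COLLECT_SET(geometry_wkt) AS geometries
  FROM geo_fence_s
  GROUP BY geo_hash;

-- Enrich each position event with the geohash of its coordinates (precision 5)
-- and repartition by it, so the stream-table join below can run on the key.
CREATE STREAM vehicle_position_by_geohash AS
  SELECT id, latitude, longitude,
         GEO_HASH(latitude, longitude, 5) AS geo_hash
  FROM vehicle_position_s
  PARTITION BY geo_hash;

-- Each position is now checked only against the polygons sharing its cell.
SELECT v.id, v.latitude, v.longitude, g.geometries
FROM vehicle_position_by_geohash v
LEFT JOIN geo_fence_by_geohash g ON v.geo_hash = g.geo_hash;

Only the polygons returned by the join then need the exact point-in-polygon test, which is what makes this variant scale better than a plain cross join.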
Location Analytics - Real-Time Geofencing using Kafka (Guido Schmutz)
An important underlying concept behind location-based applications is called geofencing. Geofencing is a process that allows acting on users and/or devices who enter/exit a specific geographical area, known as a geo-fence. A geo-fence can be dynamically generated, as in a radius around a point location, or it can be a predefined set of boundaries (such as secured areas, buildings, or borders of counties, states or countries). Geofencing lays the foundation for realising use cases around fleet monitoring, asset tracking, phone tracking across cell sites, connected manufacturing, ride-sharing solutions and many others. Many of the use cases mentioned above require low-latency actions to take place when a device enters or leaves a geo-fence, or when it approaches one. That's where streaming data ingestion and streaming analytics, and therefore the Kafka ecosystem, come into play. This session will present how location analytics applications can be implemented using Kafka and KSQL & Kafka Streams. It highlights the existing features available out-of-the-box and then shows how easy it is to extend them with user-defined functions (UDFs).
All about Zookeeper and ClickHouse Keeper (Altinity Ltd)
ClickHouse clusters depend on ZooKeeper to handle replication and distributed DDL commands. In this Altinity webinar, we'll explain why ZooKeeper is necessary, how it works, and introduce the new built-in replacement named ClickHouse Keeper. You'll learn practical tips to care for ZooKeeper in sickness and in health. You'll also learn how and when to use ClickHouse Keeper, and we will share our recommendations for keeping it happy as well.
Getting Started with Geospatial Data in MongoDB (MongoDB)
MongoDB supports geospatial data and specialized indexes that make building location-aware applications easy and scalable.
In this session, you will learn the fundamentals of working with geospatial data in MongoDB. We will explore how to store and index geospatial data and best practices for using geospatial query operators and methods. By the end of this session, you should be able to implement basic geolocation functionality in an application.
In this webinar, you will learn:
- How to get geospatial data into MongoDB and build geospatial indexes.
- The fundamentals of MongoDB's geospatial query operators and how to design queries that meet the needs of your application.
- Advanced geospatial capabilities with Java geospatial libraries and MongoDB.
Leo Hsu and Regina Obe
We'll demonstrate integrating PostGIS in both PHP and ASP.NET applications.
We'll demonstrate using the new PostGIS 1.5 geography offering to extend existing web applications with proximity analysis.
We'll also cover more advanced uses: displaying maps and stats using OpenLayers and WMS/WFS services, and rolling your own WFS-like service using the PostGIS KML/GML/GeoJSON output functions.
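For instance, a proximity query against the geography type might look like the sketch below; the table and column names are illustrative, not taken from the talk.

-- Illustrative proximity query with the PostGIS 1.5 geography type: find all
-- landmarks within 1 km of a point (distances on geography are in metres).
SELECT name
FROM landmarks
WHERE ST_DWithin(
        geog,                                  -- geography(Point,4326) column
        ST_GeographyFromText('SRID=4326;POINT(-71.06 42.36)'),
        1000);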
The QGIS Training Manual has been translated into Korean and printed. Deep thanks go to Kwon Yong-chan, Jang Byeong-jin, Kim Seo-in and others for their hard work translating and reviewing the QGIS Training Manual.
A print-quality PDF with the same layout as the book can be downloaded from the following link:
http://open.gaia3d.com/doc/QGIS/manual/QgisTrainingManual_KR_4print.pdf
Links have been removed from the print version above for optimal printing, so if you want to read it as an e-book, download it here instead:
http://docs.qgis.org/2.2/pdf/ko/QGIS-2.2-QGISTrainingManual-ko.pdf
If you prefer to read it on the web, use the following link:
http://docs.qgis.org/2.2/ko/docs/training_manual/
The translation and publication of this QGIS Training Manual were supported by the Korean Open Source GIS Forum, for which we are deeply grateful.
Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams... (confluent)
(Guido Schmutz, Trivadis) Kafka Summit SF 2018
Internet of Things use cases are a perfect match for processing with a streaming platform such as Kafka and the Confluent Platform. Some of the questions to be answered are: How do we feed the data from our devices into Kafka? Do we directly send data to Kafka? Is Kafka accessible from outside the organization over the internet? What if we want to use a more specific IoT protocol such as MQTT or CoAP in between? How would we integrate it with Kafka? How can we enrich IoT streaming data with static data sitting in a traditional system?
This session will provide answers to these and other questions using a fictitious use case of a trucking company. Trucks are constantly sending data about position and driving habits, which can be used to derive real-time information and actions. A large part of the presentation will be a live demo. The demo will show the implementation of the pipeline incrementally: starting with sending the truck movement events directly to Kafka, then adding MQTT to the sensor data ingestion, followed by using Kafka Streams and KSQL to apply stream processing on the information received. The final pipeline will demonstrate the application of Kafka Connect with MQTT and JDBC source connectors for data ingestion and event stream enrichment, and Kafka Streams and KSQL for stream processing. The key takeaway is the live demonstration of a working end-to-end IoT streaming data ingestion pipeline using Kafka technologies.
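To give a flavour of the KSQL step in such a pipeline, here is a hedged sketch; the topic and column names are assumptions based on the demo description, not taken from it verbatim.

-- Register the ingested truck positions as a KSQL stream and split abnormal
-- driving behaviour into its own topic (all names assumed for illustration).
CREATE STREAM truck_position_s (truckId INT, latitude DOUBLE, longitude DOUBLE,
                                eventType VARCHAR)
  WITH (KAFKA_TOPIC='truck_position', VALUE_FORMAT='JSON');

CREATE STREAM dangerous_driving_s
  WITH (KAFKA_TOPIC='dangerous_driving') AS
  SELECT *
  FROM truck_position_s
  WHERE eventType != 'Normal';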
Distributed Tracing for Kafka with OpenTelemetry with Daniel Kim | Kafka Summ... (HostedbyConfluent)
With the increasing complexity that comes with adopting distributed architectures like Kafka, it is difficult to identify and triage performance anomalies. Implementing Distributed Tracing technologies like OpenTelemetry can add context to the messages flowing through your system, allowing us to visualize the data flow to pinpoint bottlenecks, bugs, and other performance issues. However, implementing end-to-end distributed tracing can be challenging for multi-request and asynchronous processes built on Kafka.
This talk will walk through how to use OpenTelemetry to tell the full story of a request as it travels through your Kafka producer, queue, and consumer. First, we will learn how context propagation works in OpenTelemetry with W3C and B3 protocols. Then, we will go through the problems faced when instrumenting asynchronous requests. Finally, we will go through the context propagation implementation of the OpenTelemetry Kafka exporter, tracing multi-request and asynchronous processes.
Official "push" and real-time capabilities for Symfony and API Platform (Merc... (Les-Tilleuls.coop)
Mercure.rocks is a brand new protocol for pushing data updates to web browsers and other HTTP clients in a convenient, fast, reliable and battery-efficient way. It is especially useful for publishing real-time updates of resources served through web APIs to reactive web and mobile apps.
Both Symfony and API Platform now have official support for this protocol!
From the ground up, Mercure has been designed to work with technologies that cannot maintain persistent connections. It's especially relevant in serverless environments, but is also convenient when using PHP or FastCGI scripts.
Mercure is basically a higher-level replacement for WebSocket. Unlike WebSocket, it is compatible with HTTP/2 and HTTP/3.
It has been designed with hypermedia APIs in mind, is auto-discoverable through the Web Linking RFC and is also compatible with GraphQL.
It natively supports authorization, reconnection in case of network issue (with refetching of missed events), subscribing to several topics, topics patterns (using templated URIs)...
Because it is built on top of Server-sent Events and plain old HTTP requests, it is already compatible with all modern browsers, and requires 0 client-side dependencies.
The protocol is open (available as an Internet Draft), and a reference open source implementation of the server, written in Go, is available.
ClickHouse Monitoring 101: What to monitor and how (Altinity Ltd)
Webinar. Presented by Robert Hodges and Ned McClain, April 1, 2020
You are about to deploy ClickHouse into production. Congratulations! But what about monitoring? In this webinar we will introduce how to track the health of individual ClickHouse nodes as well as clusters. We'll describe the available monitoring data, how to collect and store measurements, and how to display them graphically using Grafana. We'll demo techniques and share sample Grafana dashboards that you can use for your own clusters.
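Much of that monitoring data is exposed through ClickHouse's built-in system tables, so a few checks can be run with plain SQL; the queries below are illustrative and the 30-second lag threshold is an assumption.

-- Current point-in-time metrics, e.g. open connections:
SELECT metric, value FROM system.metrics WHERE metric LIKE '%Connection%';

-- Cumulative event counters since server start:
SELECT event, value FROM system.events WHERE event IN ('Query', 'InsertedRows');

-- Replication health: read-only replicas, or replicas lagging more than 30 s:
SELECT database, table, is_readonly, absolute_delay
FROM system.replicas
WHERE is_readonly OR absolute_delay > 30;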
As businesses grow, so does the complexity of their software. New features, new models, and new background processes all continue to be added... and developers struggle to make sense of it all. Yet the end user demands a swift and functional experience when interacting with your application. It is paramount to be open to alternative patterns that help tame complex, high-demand services. Two such patterns are command-query responsibility segregation (CQRS) and event sourcing (ES).
Command-query responsibility segregation is an architectural pattern for user-facing applications that extends from the now standard Model-View-Controller (MVC) pattern and is an alternative to the CRUD pattern. At its core, CQRS is about changing how we think of and work with our data by introducing two types of models: all user actions become commands, and a read-only query model powers our views. Commands and queries are logically separated, providing additional decoupling of our application. CQRS also calls for changes in how we store and structure our data.
Enter event sourcing. Instead of persisting the current state of our domain objects or entities, we record historical events about our data. The key advantage is that we can examine our application data at any point in time, rather than just the current state. This pattern changes how we persist and process our data but is surprisingly efficient.
While each of the two patterns can be used exclusively, they complement each other beautifully and facilitate the construction of decoupled, scalable applications or individual services. Stephen Pember explores the fundamentals of each pattern and offers several examples and demonstration code to show how one might actually go about implementing CQRS and ES. Steve discusses task-based UIs and domain-driven design as he outlines some of the advantages—and challenges—that ThirdChannel has seen when developing systems using CQRS and ES over the past year.
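As a minimal sketch of the event-sourcing side, here is a toy account domain in plain SQL; the schema and event names are invented for illustration and are not taken from Pember's examples.

-- Toy event store: rows are only ever appended, never updated or deleted.
CREATE TABLE account_events (
  account_id  INT       NOT NULL,
  seq         BIGINT    NOT NULL,   -- per-aggregate sequence number
  event_type  TEXT      NOT NULL,   -- e.g. 'Deposited', 'Withdrawn'
  amount      NUMERIC   NOT NULL,
  occurred_at TIMESTAMP NOT NULL,
  PRIMARY KEY (account_id, seq)
);

-- The query side (the "Q" in CQRS) folds the event history into current state:
SELECT account_id,
       SUM(CASE event_type WHEN 'Deposited' THEN amount ELSE -amount END) AS balance
FROM account_events
GROUP BY account_id;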
The Zen of High Performance Messaging with NATS (NATS)
The Zen of High Performance Messaging with NATS
Waldemar Quevedo Salinas, Senior Software Engineer
NATS is an open source, high-performance messaging system designed to be as simple and reliable as possible without trading off scalability. Originally written in Ruby, and then rewritten in Go, a NATS server can nowadays push over 11M messages per second.
In this talk, we will cover how taking simplicity as the main design constraint, as well as focusing on a limited built-in feature set, resulted in a system that is easy to operate and reason about, making it an attractive choice when building many types of distributed systems where low latency and high availability are very important.
You can learn more about NATS at http://www.nats.io
There is a renaissance underway in the messaging space. Due to the demands of IoT networks, cloud native apps, and microservices, developers are looking for simple, fast messaging systems. This is a sharp contrast to how traditional messaging was done.
This webinar will cover:
- The basics of messaging patterns
- What makes NATS unique
- Using a demo inspired by Pokemon Go as an example
Location Analytics Real-Time Geofencing using Kafka (Guido Schmutz)
An important underlying concept behind location-based applications is called geofencing. Geofencing is a process that allows acting on users and/or devices who enter/exit a specific geographical area, known as a geo-fence. A geo-fence can be dynamically generated, as in a radius around a point location, or it can be a predefined set of boundaries (such as secured areas, buildings, or borders of counties, states or countries).
Geofencing lays the foundation for realizing use cases around fleet monitoring, asset tracking, phone tracking across cell sites, connected manufacturing, ride-sharing solutions and many others.
GPS tracking constantly reports, in real time, where a device is located, forming a stream of events that needs to be analyzed against the much more static set of geo-fences. Many of the use cases mentioned above require low-latency actions to take place when a device enters or leaves a geo-fence, or when it approaches one. That's where streaming data ingestion and streaming analytics, and therefore the Kafka ecosystem, come into play.
This session will present how location analytics applications can be implemented using Kafka and KSQL & Kafka Streams. It highlights the existing features available out-of-the-box and then shows how easy it is to extend them with user-defined functions (UDFs). The design of such a solution, so that it scales with an increasing number of both position events and geo-fences, will be discussed as well.
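To give a flavour of that UDF extension point, a hedged sketch: GEO_FENCE below is a hypothetical custom KSQL UDF performing the point-in-polygon test, and the stream and column names are likewise assumed.

-- Apply a hypothetical GEO_FENCE UDF to each position event that has already
-- been enriched with a candidate geofence polygon (WKT), keeping only matches.
SELECT id, latitude, longitude,
       GEO_FENCE(latitude, longitude, geometry_wkt) AS fence_status
FROM vehicle_position_enriched_s
WHERE GEO_FENCE(latitude, longitude, geometry_wkt) != 'OUTSIDE';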
Developing Spatial Applications with Google Maps and CARTO (CARTO)
Learn how CARTO integrates with Google Maps to unlock the advanced visualization capabilities of deck.gl and enables developers to build geospatial apps. You can watch the recorded webinar here: https://go.carto.com/webinars/google-maps-and-carto
LocationTech is an Eclipse Foundation industry working group for location aware technologies. This presentation introduces LocationTech, looks at what it means for our industry and the participating projects.
Libraries: JTS Topology Suite is the rocket science of GIS, providing an implementation of Geometry. Mobile Map Tools provides a C++ foundation that is translated into Java and JavaScript for maps on iOS, Android and WebGL. GeoMesa is a distributed key/value store based on Accumulo. Spatial4j integrates with JTS to provide Geometry on a curved surface.
Process: GeoTrellis provides real-time distributed processing using Scala, Akka and Spark. GeoJinni mixes spatial data/indexing with Hadoop.
Applications: GEOFF offers OpenLayers 3 as a SWT component. GeoGit offers distributed revision control for feature data. GeoScript brings spatial data to Groovy, JavaScript, Python and Scala. uDig offers an Eclipse-based desktop GIS solution.
Attend this presentation if you want to know what LocationTech is about, are interested in these projects, or are curious about which projects will be next.
A variety of algorithms may be applied depending on the nature of the earth science exploration, and some algorithms may perform significantly better than others for particular objectives. For example, convolutional neural networks (CNN) are good at interpreting images, while artificial neural networks (ANN) perform well in soil classification[4] but are more computationally expensive to train than support-vector machine (SVM) learning. The application of machine learning has become popular in recent decades, as the development of other technologies such as unmanned aerial vehicles (UAVs),[5] ultra-high resolution remote sensing technology and high-performance computing units[6] has led to the availability of large high-quality datasets and more advanced algorithms.
Twitter has launched a Geotagging API – we really wanted to enable users to not only talk about "What's happening?" but also "What's happening right here?" For a while now, we've been watching as users have been trying to geo-tag their tweets through a variety of methods, all of which involve a link to a map service embedded in their Tweet. This talk will delve into how Twitter handles its geocontent, including tool suggestions.
As a platform, we’ve tried to make it easier for our users by making location be omnipresent through our platform, and an inherent (but optional) part of a tweet. We’re making the platform be not just about time, but also about place.
By Kristoffer Benjaminsson, CTO, Easy.
This talk presents the telemetry system used in Battlefield Heroes and how it helps the team make technical decisions in order to provide the best service possible. We will show real life examples of how telemetry helped improve matchmaking, reduce latency for players and help find false alarms from the cheat detection system. We will also discuss how telemetry can be used in development for catching bugs and support game designers in their work.
GeoMesa on Apache Spark SQL with Anthony Fox (Databricks)
GeoMesa is an open-source toolkit for processing and analyzing spatio-temporal data, such as IoT and sensor-produced observations, at scale. It provides a consistent API for querying and analyzing data on top of distributed databases (e.g. HBase, Accumulo, Bigtable, Cassandra) and messaging networks (e.g. Kafka) to handle batch analysis of historical archives of data and low-latency processing of data in-stream.
GeoMesa has deep integration with Spark SQL. It has added spatial types (e.g. Point, LineString, Polygons), spatial predicates (st_contains, st_intersects, etc.), and geometry processing functions (e.g. st_buffer, st_convexHull, etc.) to Spark SQL. It also optimizes the processing of these extensions by integrating with the Catalyst SQL optimizer to intercept SQL statements with spatial predicates and provision RDDs based on the underlying spatial index.
This session will describe the implementation of the GeoMesa Spark SQL integration, illustrate its application in production systems and demonstrate spatial aggregations and analytics using map-based visualizations.
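A query using those extensions might look like the sketch below; st_contains and st_geomFromWKT are among the spatial functions GeoMesa registers with Spark SQL, while the table and column names are illustrative.

-- Count observations inside a polygon; with the Catalyst integration the
-- spatial predicate can be pushed down to the underlying spatial index.
SELECT count(*)
FROM observations
WHERE st_contains(
        st_geomFromWKT('POLYGON((-78 37, -78 39, -76 39, -76 37, -78 37))'),
        geom)
  AND dtg BETWEEN '2017-01-01' AND '2017-02-01';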
This is a presentation given on October 24 by Michael Uzquiano of Cloud CMS (http://www.cloudcms.com) at the MongoDB Boston conference.
In this presentation, we cover Hazelcast - an in-memory data grid that provides distributed object persistence across multiple nodes in a cluster. When backed by MongoDB, objects are naturally written to Mongo by Hazelcast. The integration points are clean and easy to implement.
We cover a few simple cases along with code samples to provide the MongoDB community with some ideas of how to integrate Hazelcast into their own MongoDB Java applications.
Large scale data capture and experimentation platform at Grab (Roman)
In this video I'm presenting how we built a system to experiment and rollout features across hundreds of microservices at Grab.
The talk also describes a high-performance event tracking system which captures billions of events per day from mobile apps and backend services and makes them easily queryable through SQL with 1 minute end-to-end latency.
We'll go through feature toggles, experimentation platform and a custom, special-purpose database we built on top of Presto to be able to SQL-query everything.
Related blog posts:
- https://engineering.grab.com/building-grab-s-experimentation-platform
- https://engineering.grab.com/feature-toggles-ab-testing
- https://engineering.grab.com/big-data-real-time-presto-talariadb
- https://engineering.grab.com/experimentation-platform-data-pipeline
GDG Cloud Taipei meetup #50 - Build go kit microservices at kubernetes with ... (KAI CHU CHUNG)
Go kit is a microservice toolkit that uses a Service/Endpoint/Transport layering for strict separation of concerns. This talk shows how to develop a microservice application with go-kit, integrate it with services such as Istio, Jaeger and Prometheus, and deploy it on Kubernetes.
We show a geo map search request with the Nominatim search service and a GPS example. It provides a MapQuest search with a simple and efficient interface and powerful capabilities, and relies solely on data contributed to OpenStreetMap.
Via the menu View/GEO Map View you can use the function without scripting.
30 Minutes to the Analytics Platform with Infrastructure as Code (Guido Schmutz)
Analytical platforms for PoCs and evaluation can be built in the cloud in an hour - with ready-made setup scripts. But if you put the services together freely, it gets more difficult. The open-source platform-in-a-box "Platys" (https://github.com/TrivadisPF/platys) shows that it is easier for test and PoC environments. In addition to possible uses and examples, we explain services and "just briefly" set up a data lake with a database, event broker, stream processing, blob store, SQL access and data science notebook.
Event Broker (Kafka) in a Modern Data Architecture (Guido Schmutz)
Today's modern data architectures and their implementations contain an Event Broker. What are the benefits of placing an Event Broker in a Modern Data (Analytics) Architecture? What exactly is an Event Broker and what capabilities should it provide? Why is Apache Kafka the most popular realisation of an Event Broker?
These and many other questions will be answered in this session. The talk will start with a vendor-neutral definition of the capabilities of an Event Broker.
Then the session will highlight the different architecture styles which can be supported using an Event Broker (Kafka), such as Streaming Data Integration, Stream Analytics and Decoupled Event-Driven Applications, and how these can be combined into a unified architecture, making the Event Broker the central nervous system of an enterprise architecture. We will end with an overview of the Kafka ecosystem and a placement of the various components onto the Modern Data (Analytics) Architecture.
Big Data, Data Lake, Fast Data - Data Serialization Formats (Guido Schmutz)
The concept of a "Data Lake" is in everyone's mind today. The idea of storing all the data that accumulates in a company in a central location and making it available sounds very interesting at first. But a Data Lake can quickly turn from a clear, beautiful mountain lake into a huge pond, especially if it is inexpertly entrusted with all the source data formats that are common in today's enterprises, such as XML, JSON, CSV or unstructured text data. Who, after some time, still has an overview of which data exist, in which formats, and how they have evolved over different versions? Anyone who wants to help themselves from the Data Lake must ask the same questions over and over again: what information is provided, what data types does it have, and how has the content changed over time?
Data serialization frameworks such as Apache Avro and Google Protocol Buffers (Protobuf), which enable platform-independent data modeling and data storage, can help. This talk will discuss the capabilities of Avro and Protobuf and show how they can be used in the context of a data lake and what advantages can be achieved. Support for Avro and Protobuf in Big Data and Fast Data platforms is also covered.
ksqlDB is a stream processing SQL engine, which allows stream processing on top of Apache Kafka. ksqlDB is based on Kafka Streams and provides capabilities for consuming messages from Kafka, analysing these messages in near real time with a SQL-like language, and producing results back to a Kafka topic. Not a single line of Java code has to be written, and you can reuse your SQL know-how. This lowers the bar for starting with stream processing significantly.
ksqlDB offers powerful stream processing capabilities, such as joins, aggregations, time windows and support for event time. In this talk I will present how ksqlDB integrates with the Kafka ecosystem and demonstrate how easy it is to implement a solution using ksqlDB for the most part. This will be done in a live demo on a fictitious IoT sample.
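A small sketch of the kind of statement such a demo revolves around; the topic and column names are assumed, not taken from the talk.

-- Register the raw topic as a stream, then maintain a per-device one-minute
-- tumbling-window average as a continuously updated table.
CREATE STREAM sensor_readings_s (device_id VARCHAR KEY, temperature DOUBLE)
  WITH (KAFKA_TOPIC='sensor-readings', VALUE_FORMAT='JSON');

CREATE TABLE avg_temperature_per_minute AS
  SELECT device_id, AVG(temperature) AS avg_temp
  FROM sensor_readings_s
  WINDOW TUMBLING (SIZE 1 MINUTE)
  GROUP BY device_id
  EMIT CHANGES;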
Kafka as your Data Lake - is it Feasible? (Guido Schmutz)
For a long time we have discussed how much data we can keep in Kafka: can we store data forever, or do we remove data after a while, keeping the history in a data lake on Object Storage or HDFS? With the advent of Tiered Storage in the Confluent Enterprise Platform, storing data in Kafka for much longer is very feasible. So can we replace a traditional data lake with just Kafka? Maybe at least for the raw data? But what about accessing the data, for example using SQL?
KSQL allows for processing data in a streaming fashion using an SQL-like dialect. But what about reading all data of a topic? You can reset the offset and still use KSQL. But there is another family of products, so-called query engines for Big Data. They originate from the idea of reading Big Data sources such as HDFS, object storage or HBase, using the SQL language. Presto, Apache Drill and Dremio are the most popular solutions in that space. Lately these query engines have also added support for Kafka topics as a source of data. With that you can read a topic as a table and join it with information available in other data sources. The idea of course is not real-time streaming analytics, but batch analytics directly on the Kafka topic, without having to store it in a big data storage.
This talk answers how well these tools support Kafka as a data source. What serialization formats do they support? Is there some form of predicate push-down supported, or do we always have to read the complete topic? How performant is a query against a topic, compared to a query against the same data sitting in HDFS or an object store? And finally, will this allow us to replace our data lake, or at least part of it, with Apache Kafka?
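To illustrate, a Presto query against a Kafka topic mapped as a table could look like the following; the catalog, schema and topic names are assumed.

-- With Presto's Kafka connector the topic appears as a table; internal columns
-- such as _partition_id are available on every row.
SELECT _partition_id, count(*) AS events
FROM kafka.default."vehicle_position"
GROUP BY _partition_id
ORDER BY _partition_id;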
Event Hub (i.e. Kafka) in Modern Data Architecture (Guido Schmutz)
Today's modern data architectures and their implementations contain an Event Hub. What are the benefits of placing an Event Hub in a Modern Data (Analytics) Architecture? What exactly is an Event Hub and what capabilities should it provide? Why is Apache Kafka the most popular realization of an Event Hub?
These and many other questions will be answered in this session. The talk will start with a vendor-neutral definition of the capabilities of an Event Hub.
Then the session will highlight the different architecture styles which can be supported using an Event Hub (Kafka), such as Streaming Data Integration, Stream Analytics and Decoupled Event-Driven Applications, and how these can be combined into a unified architecture, making the Event Hub the central nervous system of an enterprise architecture. We will end with an overview of the Kafka ecosystem and a placement of the various components onto the Modern Data (Analytics) Architecture.
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka (Guido Schmutz)
Apache Kafka is a popular distributed streaming data platform and is increasingly the architectural backbone for integrating streaming data with a Data Lake, Microservices and Stream Processing. A lot of the data needed in stream processing is stored in traditional systems backed by relational databases. This session will present different approaches for integrating relational databases with Kafka, such as Kafka Connect, Oracle GoldenGate, ORDS APIs and bridging Kafka with Oracle AQ.
Building Event Driven (Micro)services with Apache Kafka (Guido Schmutz)
What is a Microservices architecture and how does it differ from a Service-Oriented Architecture? Should you use traditional REST APIs to bind services together? Or is it better to use a richer, more loosely-coupled protocol? This talk will start with a quick recap of how we created systems over the past 20 years and how different architectures evolved from it. The talk will show how we piece services together in event driven systems, how we use a distributed log (event hub) to create a central, persistent history of events, and what benefits we achieve from doing so.
Apache Kafka is a perfect match for building such an asynchronous, loosely-coupled event-driven backbone. Events trigger processing logic, which can be implemented in a more traditional as well as in a stream processing fashion. The talk will show the difference between request-driven and event-driven communication and when to use which. It highlights how modern stream processing systems can be used to hold state both internally as well as in a database, and how this state can be used to further increase the independence of other services, the primary goal of a Microservices architecture.
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka (Guido Schmutz)
Apache Kafka is a popular distributed streaming data platform. A Kafka cluster stores streams of records (messages) in categories called topics. It is the architectural backbone for integrating streaming data with a Data Lake, Microservices and Stream Processing. Data sources flowing into Kafka are often native data streams such as social media streams, telemetry data, financial transactions and many others. But these data streams contain only part of the information. A lot of the data needed in stream processing is stored in traditional systems backed by relational databases. To implement new, modern, real-time solutions, an up-to-date view of that information is needed. So how do we make sure that information can flow between the RDBMS and Kafka, so that changes are available in Kafka as soon as possible, in near real time? This session will present different approaches for integrating relational databases with Kafka, such as Kafka Connect, Oracle GoldenGate and bridging Kafka with Oracle Advanced Queuing (AQ).
Most data visualisation solutions today still work on data sources which are stored persistently in a data store, using the so-called "data at rest" paradigm. More and more data sources today provide a constant stream of data, from IoT devices to Social Media streams. These data streams arrive with high velocity, and messages often have to be processed as quickly as possible. For processing and analytics on the data, so-called stream processing solutions are available, but these provide minimal or no visualisation capabilities. One option is to first persist the data into a data store and then use a traditional data visualisation solution to present the data. If latency is not an issue, such a solution might be good enough. Another question is which data store solution is necessary to keep up with the high load on write and read: if it is not an RDBMS but a NoSQL database, then not all traditional visualisation tools may integrate with the specific data store. Another option is to use a streaming visualisation solution; these are specially built for streaming data and often do not support batch data. A much better solution would be to have one tool capable of handling both batch and streaming data. This talk presents different architecture blueprints for integrating data visualisation into a fast data solution, and then shows how the different blueprints can be implemented by mapping products onto them.
Kafka as an event store - is it good enough? (Guido Schmutz)
Event Sourcing and CQRS are two popular patterns for implementing a Microservices architecture. With Event Sourcing we do not store the state of an object, but instead store all the events impacting its state. To retrieve an object's state, we then have to read the different events related to that object and apply them one by one. CQRS (Command Query Responsibility Segregation), on the other hand, is a way to dissociate writes (Command) and reads (Query). Event Sourcing and CQRS are frequently grouped and used together to form something bigger. While it is possible to implement CQRS without Event Sourcing, the opposite is not necessarily true. In order to implement Event Sourcing, an efficient Event Store is needed. But is that also true when combining Event Sourcing and CQRS? And what is an event store in the first place, and what features should it implement?
This presentation will first discuss what functionalities an event store should offer and then present how Apache Kafka can be used to implement an event store. But is Kafka good enough or do specific event store solutions such as AxonDB or Event Store provide a better solution?
Solutions for bi-directional Integration between Oracle RDBMS & Apache Kafka (Guido Schmutz)
A Kafka cluster stores streams of records (messages) in categories called topics. It is the architectural backbone for integrating streaming data with a Data Lake, Microservices and Stream Processing. Today's enterprises often have their core systems implemented on top of relational databases, such as the Oracle RDBMS. Implementing a new solution supporting the digital strategy using Kafka and its ecosystem cannot always be done completely separately from the traditional legacy solutions. Often streaming data has to be enriched with state data held in the RDBMS of a legacy application. It's important to cache this data in the stream processing solution, so that it can be efficiently joined to the data stream. But how do we make sure that the cache is kept up-to-date if the source data changes? We can either poll for changes from Kafka using Kafka Connect or let the RDBMS push the data changes to Kafka. But what about writing data back to the legacy application, i.e. when an anomaly detected inside the stream processing solution should trigger an action inside the legacy application? Using Kafka Connect we can write to a database table or view, which could trigger the action, but this is not always the best option. If you have an Oracle RDBMS, there are many other ways to integrate the database with Kafka, such as Advanced Queuing (a message broker in the database), CDC through GoldenGate or Debezium, Oracle REST Data Services (ORDS) and more. In this session, we present various blueprints for integrating an Oracle RDBMS with Apache Kafka in both directions and discuss how these blueprints can be implemented using the products mentioned before.
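One of the simpler blueprints in the database-to-Kafka direction can be sketched as an outbox table polled by a Kafka Connect JDBC source connector; the table, column and event names below are hypothetical.

-- Hypothetical outbox table in the Oracle schema: the legacy application appends
-- change events here, and a Kafka Connect JDBC source connector (incrementing
-- mode on ID) turns each new row into a Kafka record.
CREATE TABLE order_event_outbox (
  id         NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  event_type VARCHAR2(30) NOT NULL,
  payload    CLOB         NOT NULL,
  created_at TIMESTAMP    DEFAULT SYSTIMESTAMP NOT NULL
);

-- The legacy code only needs a plain INSERT inside its existing transaction:
INSERT INTO order_event_outbox (event_type, payload)
VALUES ('ORDER_CREATED', '{"orderId": 42, "status": "NEW"}');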
Fundamentals Big Data and AI Architecture (Guido Schmutz)
The right architecture is key for any IT project. This is especially the case for big data projects, where there are no standard architectures which have proven their suitability over years. This session discusses the different Big Data Architectures which have evolved over time, including traditional Big Data Architecture, Streaming Analytics architecture as well as Lambda and Kappa architecture and presents the mapping of components from both Open Source as well as the Oracle stack onto these architectures.
The right architecture is key for any IT project. This is valid for big data projects as well, but there are not yet many standard architectures which have proven their suitability over the years.
This session discusses different Big Data Architectures which have evolved over time, including traditional Big Data Architecture, Event Driven architecture as well as Lambda and Kappa architecture.
Each architecture is presented in a vendor- and technology-independent way using a standard architecture blueprint. In a second step, these architecture blueprints are used to show how a given architecture can support certain use cases and which popular open source technologies can help to implement a solution based on a given architecture.
Most data visualization solutions today still work on data sources which are stored persistently in a data store, using the so called “data at rest” paradigms. More and more data sources today provide a constant stream of data, from IoT devices to Social Media streams. These data stream publish with high velocity and messages often have to be processed as quick as possible. For the processing and analytics on the data, so called stream processing solutions are available. But these only provide minimal or no visualization capabilities. One option is to first persist the data into a data store and then use a traditional data visualization solution to present the data. If latency is not an issue, such a solution might be good enough. An other question is which data store solution is necessary to keep up with the high load on write and read. If it is not an RDBMS but an NoSQL database, then not all traditional visualization tools might already integrate with the specific data store. An other option is to use a Streaming Visualization solution. This talk presents different architecture blueprints for integrating data visualization into a fast data solutions.
Most data visualisation solutions today still work on data sources which are stored persistently in a data store, using the so-called "data at rest" paradigm. More and more data sources today provide a constant stream of data, from IoT devices to Social Media streams. These data streams publish with high velocity, and messages often have to be processed as quickly as possible. For processing and analytics on the data, so-called stream processing solutions are available, but these provide minimal or no visualisation capabilities. One option is to first persist the data into a data store and then use a traditional data visualisation solution to present the data. If latency is not an issue, such a solution might be good enough. Another question is which data store solution is necessary to keep up with the high load on write and read. If it is not an RDBMS but a NoSQL database, then not all traditional visualisation tools may integrate with that specific data store. Another option is to use a streaming visualisation solution. These are built specifically for streaming data and often do not support batch data. A much better solution would be one tool capable of handling both batch and streaming data. This talk presents different architecture blueprints for integrating data visualisation into a fast data solution and then shows how the different blueprints can be implemented by mapping products onto them.
Building Event-Driven (Micro) Services with Apache Kafka Guido Schmutz
This talk begins with a short recap of how we created systems over the past 20 years, up to the current idea of building systems using a Microservices architecture. What is a Microservices architecture and how does it differ from a Service-Oriented Architecture? Should you use traditional REST APIs to integrate services with each other in a Microservices architecture, or is it better to use a more loosely-coupled protocol? Answers to these and many other questions are provided. The talk shows how a distributed log (event hub) can help to create a central, persistent history of events and what benefits we achieve from doing so. Apache Kafka is a perfect match for building such an asynchronous, loosely-coupled event-driven backbone. Events trigger processing logic, which can be implemented in a more traditional as well as in a stream processing fashion. The talk shows the difference between request-driven and event-driven communication and answers when to use which. It highlights how a modern stream processing system can hold state both internally and in a database, and how this state can be used to further increase the independence of other services, the primary goal of a Microservices architecture.
More and more data sources today provide a constant data stream, from Internet of Things devices to Social Media streams. It is one thing to collect these events at the velocity they arrive, without losing a single message; an Event Hub and a data flow engine can help here. It is another thing to do some (complex) analytics on the data. There is always the option to first store the events in a data sink of choice, such as a data lake implemented with HDFS/object store, or in a database such as a NoSQL store or even an RDBMS, if the volume of events is not too high. Storing a high-volume event stream is feasible and no longer such a challenge. But doing so adds to the end-to-end latency, and it is a matter of minutes or hours until you can present the results of your analytics. If you need to react fast, you simply cannot afford to first store the data and do the analysis later. You have to be able to run part of your analytics directly on the data stream. This is called Stream Processing or Stream Analytics. In this talk I will present the important concepts a Stream Processing solution should support and then dive into some of the most popular frameworks available on the market and how they compare.
Opendatabay - Open Data Marketplace.pptx Opendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
The first open hub for data enthusiasts to collaborate and innovate: a platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, Opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. It leverages cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex: Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools, letting you effortlessly explore, discover, and access the data you need so you can focus on extracting valuable insights. Opendatabay also breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits, Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay: the marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Techniques to optimize the PageRank algorithm usually fall into two categories: one tries to reduce the work per iteration, the other tries to reduce the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, i.e. vertices with the same in-links, helps reduce duplicate computations and thus could also reduce iteration time. Road networks often have chains which can be short-circuited before the PageRank computation to improve performance, since the final ranks of chain nodes can be calculated directly. This could reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order. This could help reduce the iteration time and the number of iterations, and also enable multi-iteration concurrency in the PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
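As a sketch of why chain ranks are cheap to fill in (assuming damping factor α and N vertices): if vertex v's only in-neighbour u links solely to v, the converged PageRank update gives

r(v) = (1 - α)/N + α · r(u)

so each node in a chain follows directly from its predecessor, without iterating.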
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Adjusting primitives for graph: SHORT REPORT / NOTES Subhajit Sahu
Covers graph algorithms such as PageRank. Compressed Sparse Row (CSR) is an adjacency-list-based graph representation.
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Location Analytics - Real Time Geofencing using Apache Kafka
1. gschmutz
Location Analytics – Real-Time
Geofencing using Kafka
Berlin Buzzwords 2019
Guido Schmutz
(guido.schmutz@trivadis.com)
gschmutz http://guidoschmutz.wordpress.com
2. gschmutz
Agenda
Location Analytics – Real-Time Geofencing using Kafka
1. Introduction & Motivation
2. Implementing using KSQL
3. Implementing using Tile38
4. Visualization using ArcadiaData
5. Summary
3. gschmutz
Guido Schmutz
Location Analytics – Real-Time Geofencing using Kafka
Working at Trivadis for more than 22 years
Oracle Groundbreaker Ambassador & Oracle ACE Director
Consultant, Trainer, Software Architect for Java, AWS, Azure,
Oracle Cloud, SOA and Big Data / Fast Data
Platform Architect & Head of Trivadis Architecture Board
More than 30 years of software development experience
Contact: guido.schmutz@trivadis.com
Blog: http://guidoschmutz.wordpress.com
Slideshare: http://www.slideshare.net/gschmutz
Twitter: gschmutz
5. gschmutz
Geofencing – What is it?
Location Analytics – Real-Time Geofencing using Kafka
the use of GPS or RFID technology to
create a virtual geographic boundary,
enabling software to trigger a
response when an object/device enters
or leaves a particular area
Possible Events
• OUTSIDE
• INSIDE
• ENTER
• EXIT
Source: https://tile38.com
6. gschmutz
Geofencing – What can we do with it?
Location Analytics – Real-Time Geofencing using Kafka
• On-Demand and Delivery Services -
assign orders to an area's designated
service provider
• On-Demand Transportation - track
Electronic Transportation Devices and
their distance from charging stations
• Transportation Management - track
flow of people using public transport
systems
• Commercial Real Estate - Identify
how many people drive or walk by a
specific location
• Retail Shopper Guidance - Guide
customer to a specific product once
they are in your store
• Property Security - Open or lock
doors as individuals with designated
devices approach or leave a building
or vehicle.
• Property Control - restrict vehicles to
be operational only inside a geofenced
area – like drones or construction
equipment
7. gschmutz
Geo-Processing
Location Analytics – Real-Time Geofencing using Kafka
Well-known text (WKT) is a text markup language for
representing vector geometry objects on a map
GeoTools is a free software GIS toolkit for developing
standards-compliant solutions
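For illustration, typical WKT encodings of the geometry types used in this talk (coordinates in longitude/latitude order; values chosen to match the Berlin examples later in the deck):

POINT (13.3096 52.4497)
POLYGON ((13.2979 52.5619, 13.2440 52.5302, 13.2673 52.4599, 13.2979 52.5619))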
11. gschmutz
KSQL - Overview
• Stream
• unbounded sequence of structured data
("facts")
• Facts in a stream are immutable
• Table
• collected state of a stream
• Latest value for each key in a stream
• Facts in a table are mutable
• Stream Processing with zero coding
using SQL-like language
• Stream and Table as first-class
citizens
trucking_driver
Kafka Broker
KSQL Engine
Kafka Streams
KSQL CLI Commands
Location Analytics – Real-Time Geofencing using Kafka
12. gschmutz
KSQL – Streams and Tables
Location Analytics – Real-Time Geofencing using Kafka
geofence
Table
vehicle
position
Stream
CREATE STREAM vehicle_position_s
(id VARCHAR,
latitude DOUBLE,
longitude DOUBLE)
WITH (KAFKA_TOPIC='vehicle_position',
VALUE_FORMAT='DELIMITED');
CREATE TABLE geo_fence_t
(id BIGINT,
name VARCHAR,
geometry_wkt VARCHAR)
WITH (KAFKA_TOPIC='geo_fence',
VALUE_FORMAT='JSON',
KEY = 'id');
KSQL Geofencing
13. gschmutz
How to determine "inside" or "outside" geofence?
Location Analytics – Real-Time Geofencing using Kafka
Only one standard UDF for geo processing in KSQL: GEO_DISTANCE
Implement custom UDF using functionality from GeoTools Java library
public String geo_fence(final double latitude, final double longitude,
final String geometryWKT){ .. }
public List<String> geo_fence_bulk(final double latitude
, final double longitude, List<String> idGeometryListWKT) { .. }
ksql> SELECT geo_fence(latitude, longitude, ' POLYGON ((13.297920227050781
52.56195151687443, 13.2440185546875 52.530216577830124, ...))')
FROM test_geo_udf_s;
52.4497 | 13.3096 | OUTSIDE
52.4556 | 13.3178 | INSIDE
14. gschmutz
Custom UDF to determine if Point is inside a geometry
Location Analytics – Real-Time Geofencing using Kafka
@Udf(description = "determines if a lat/long is inside or outside the
geometry passed as the 3rd parameter as WKT encoded ...")
public String geo_fence(final double latitude, final double longitude,
                        final String geometryWKT) {
    String status = "";
    GeometryFactory geometryFactory = JTSFactoryFinder.getGeometryFactory();
    WKTReader reader = new WKTReader(geometryFactory);
    try {
        // WKTReader.read() throws a checked ParseException on invalid WKT
        Polygon polygon = (Polygon) reader.read(geometryWKT);
        Coordinate coord = new Coordinate(longitude, latitude);
        Point point = geometryFactory.createPoint(coord);
        status = point.within(polygon) ? "INSIDE" : "OUTSIDE";
    } catch (ParseException e) {
        // leave status empty if the WKT cannot be parsed
    }
    return status;
}
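geo_fence_bulk, used by the aggregated variants on the following slides, is not shown here; a minimal sketch, assuming each list entry is encoded as "<id>:<WKT>" (the format the COLLECT_SET expressions below produce) and reusing the UDF above:

import java.util.ArrayList;
import java.util.List;

@Udf(description = "evaluates a lat/long against a list of '<id>:<WKT>' entries")
public List<String> geo_fence_bulk(final double latitude, final double longitude,
                                   final List<String> idGeometryListWKT) {
    List<String> result = new ArrayList<>();
    for (String entry : idGeometryListWKT) {
        // split on the first ':' only, since the WKT part follows the id
        int sep = entry.indexOf(':');
        String id = entry.substring(0, sep);
        String wkt = entry.substring(sep + 1);
        // delegate to the single-geometry UDF shown above
        result.add(id + ":" + geo_fence(latitude, longitude, wkt));
    }
    return result;
}

This would yield results such as [1:OUTSIDE, 3:INSIDE], matching the output shown later on slide 19.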
15. gschmutz
1) Using Cross Join
Location Analytics – Real-Time Geofencing using Kafka
geofence Table
Join Position
& Geofences
vehicle
position
Stream
Stream
pos &
geofences
CREATE STREAM vp_join_gf_s
AS
SELECT vp.id, vp.latitude, vp.longitude,
gf.geometry_wkt
FROM vehicle_position_s AS vp
CROSS JOIN geo_fence_t AS gf
There is no Cross Join
in KSQL!
16. gschmutz
2) INNER Join
Location Analytics – Real-Time Geofencing using Kafka
geofence Stream
Join Position
& Geofences
{ "group":"1", "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
vehicle
position
Stream
Stream
pos &
geofences
{ "group":1", "name":"St. Louis",
"geometry_wkt":"POLYGON ((13.297920227050781
52.56195151687443, …))",
"last_update":1560607149015}
{ "group":1", "name":"Berlin", "geometry_wkt":"POLYGON
((-90.23345947265625 38.484769753492536,…))",
"last_update":1560607149015}
Enrich Group
Table
geofences
by group 1
Enrich Group
Stream
position by
group 1
Cannot insert into Table
from Stream
>INSERT INTO geo_fence_t
>SELECT '1' AS group_id, geof.id, …
>FROM geo_fence_s geof;
INSERT INTO can only be used to insert into
a stream. A02_GEO_FENCE_T is a table.
17. gschmutz
3) Geofences aggregated in one group
Location Analytics – Real-Time Geofencing using Kafka
Join Position
& Geofences
Stream
geofence
status
Geofences
aggby group
Table
{ "group":1", "name":"St. Louis", "geometry_wkt":"POLYGON
((13.297920227050781 52.56195151687443, …))",
"last_update":1560607149015}
{"vehicle_id":10", "name":"Berlin",
"geometry_wkt":"POLYGON ((-90.23345947265625
38.484769753492536,…))", "last_update":1560607149015}
geo_fence_bulk
geofence Stream
vehicle
position
Stream
{ "group":1", "name":"St. Louis",
"geometry_wkt":"POLYGON ((13.297920227050781
52.56195151687443, …))",
"last_update":1560607149015}
{ "group":1", "name":"Berlin", "geometry_wkt":"POLYGON
((-90.23345947265625 38.484769753492536,…))",
"last_update":1560607149015}
Enrich With
Group-1
Stream
geofences
by group 1
Enrich With
Group-1
Stream
position by
group 1
geofences
by id
{ "group":"1", "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
Scalable: medium (scale: high ↔ low)
Latency: medium (scale: low ↔ high)
"Code Smell": medium (scale: low ↔ high)
18. gschmutz
3) Geofences aggregated in one group
Location Analytics – Real-Time Geofencing using Kafka
CREATE TABLE a03_geo_fence_aggby_group_t
AS
SELECT group_id
, collect_set(id + ':' + geometry_wkt) AS id_geometry_wkt_list
FROM a03_geo_fence_by_group_s geof
GROUP BY group_id;
CREATE STREAM a03_vehicle_position_by_group_s
AS
SELECT '1' group_id, vehp.id, vehp.latitude, vehp.longitude
FROM vehicle_position_s vehp
PARTITION BY group_id;
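Note that the slides do not show how a03_geo_fence_by_group_s itself is created; a plausible sketch, assuming the raw geofences arrive on a geo_fence_s stream and mirroring the INSERT INTO snippet on slide 16, is:

CREATE STREAM a03_geo_fence_by_group_s
AS
SELECT '1' AS group_id, geof.id, geof.name, geof.geometry_wkt
FROM geo_fence_s geof
PARTITION BY group_id;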
19. gschmutz
3) Geofences aggregated in one group
Location Analytics – Real-Time Geofencing using Kafka
ksql> SELECT * FROM a03_geo_fence_status_s;
46 | 52.47546 | 13.34851 | [1:OUTSIDE, 3:INSIDE]
46 | 52.47521 | 13.34881 | [1:OUTSIDE, 3:INSIDE]
...
CREATE STREAM a03_geo_fence_status_s
AS
SELECT vehp.id, vehp.latitude, vehp.longitude,
geo_fence_bulk(vehp.latitude, vehp.longitude,
geofagg.id_geometry_wkt_list) AS geofence_status
FROM a03_vehicle_position_by_group_s vehp
LEFT JOIN a03_geo_fence_aggby_group_t geofagg
ON vehp.group_id = geofagg.group_id;
As many as there are geo-fences
20. gschmutz
Geo Hash for a better distribution
Geohash is a geocoding system which encodes
a geographic location into a short string
of letters and digits
Length Area (width x height)
1 5,009.4km x 4,992.6km
2 1,252.3km x 624.1km
3 156.5km x 156km
4 39.1km x 19.5km
12 3.7cm x 1.9cm
http://geohash.gofreerange.com/
Location Analytics – Real-Time Geofencing using Kafka
21. gschmutz
Geo Hash Custom UDF
Location Analytics – Real-Time Geofencing using Kafka
ksql> SELECT latitude, longitude, geo_hash(latitude, longitude, 3)
>FROM test_geo_udf_s;
38.484769753492536 | -90.23345947265625 | 9yz
public String geohash(final double latitude,
final double longitude, int length)
public List<String> neighbours(String geohash)
public String adjacentHash(String geohash, String directionString)
public List<String> coverBoundingBox(String geometryWKT, int length)
ksql> SELECT geometry_wkt, geo_hash(geometry_wkt, 5)
>FROM test_geo_udf_s;
POLYGON ((-90.23345947265625 38.484769753492536, -90.25886535644531
38.47455675836861, ...)) | [9yzf6, 9yzf7, 9yzfd, 9yzfe, 9yzff, 9yzfg, 9yzfk,
9yzfs, 9yzfu]
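The signatures above correspond to the davidmoten "geo" Java library (com.github.davidmoten.geo), so the UDF is presumably a thin wrapper; a minimal sketch of the point variant under that assumption (the WKT overload would cover the geometry's bounding box with hashes via coverBoundingBox):

import com.github.davidmoten.geo.GeoHash;

@Udf(description = "encodes a lat/long into a geohash of the given length")
public String geo_hash(final double latitude, final double longitude, final int length) {
    // GeoHash.encodeHash() returns the base-32 geohash string for the point
    return GeoHash.encodeHash(latitude, longitude, length);
}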
22. gschmutz
4) Geofences aggregated by GeoHash
Location Analytics – Real-Time Geofencing using Kafka
Join Position
& Geofences
Stream
geofence
status
Geofences
gpby geohash
Table
{ "geohash":9yz", "name":"St. Louis",
"geometry_wkt":"POLYGON
((13.297920227050781 52.56195151687443, …))",
"last_update":1560607149015}
{"geohash":"u33", "name":"Berlin",
"geometry_wkt":"POLYGON ((-90.23345947265625
38.484769753492536,…))",
"last_update":1560607149015}
geo_fence_bulk()
geofence Table
vehicle
position
Stream
{ "geohash":9yz", "name":"St. Louis",
"geometry_wkt":"POLYGON ((13.297920227050781
52.56195151687443, …))",
"last_update":1560607149015}
{ "group":1", "name":"Berlin", "geometry_wkt":"POLYGON
((-90.23345947265625 38.484769753492536,…))",
"last_update":1560607149015}
Enrich with
GeoHash
Stream
geofences
& geohash
Enrich with
GeoHash
Stream
position &
geohash
geofences
by id
geo_hash()
geo_hash()
{ "geohash":"u33", "id" : "10", "latitude" : 38.35821, "longitude" : -
90.15311}
Scalable: medium (scale: high ↔ low)
Latency: medium (scale: low ↔ high)
"Code Smell": medium (scale: low ↔ high)
23. gschmutz
4) Geofences aggregated by GeoHash
Location Analytics – Real-Time Geofencing using Kafka
CREATE STREAM a04_geo_fence_by_geohash_s
AS
SELECT geo_hash(geometry_wkt, 3)[0] geo_hash, id, name, geometry_wkt
FROM a04_geo_fence_s
PARTITION by geo_hash;
INSERT INTO a04_geo_fence_by_geohash_s
SELECT geo_hash(geometry_wkt, 3)[1] geo_hash, id, name, geometry_wkt
FROM a04_geo_fence_s
WHERE geo_hash(geometry_wkt, 3)[1] IS NOT NULL
PARTITION BY geo_hash;
INSERT INTO a04_geo_fence_by_geohash_s
SELECT ...
There is no explode()
functionality in KSQL! https://github.com/confluentinc/ksql/issues/527
24. gschmutz
4) Geofences aggregated by GeoHash
Location Analytics – Real-Time Geofencing using Kafka
CREATE TABLE a04_geo_fence_by_geohash_t
AS
SELECT geo_hash,
COLLECT_SET(id + ':' + geometry_wkt) AS id_geometry_wkt_list,
COLLECT_SET(id) id_list
FROM a04_geo_fence_by_geohash_s
GROUP BY geo_hash;
CREATE STREAM a04_vehicle_position_by_geohash_s
AS
SELECT vp.id, vp.latitude, vp.longitude,
geo_hash(vp.latitude, vp.longitude, 3) geo_hash
FROM vehicle_position_s vp
PARTITION BY geo_hash;
25. gschmutz
4) Geofences aggregated by GeoHash
Location Analytics – Real-Time Geofencing using Kafka
CREATE STREAM a04_geo_fence_status_s
AS
SELECT vp.geo_hash, vp.id, vp.latitude, vp.longitude,
geo_fence_bulk (vp.latitude, vp.longitude, gf.id_geometry_wkt_list)
AS fence_status
FROM a04_vehicle_position_by_geohash_s vp
LEFT JOIN a04_geo_fence_by_geohash_t gf
ON (vp.geo_hash = gf.geo_hash);
ksql> SELECT * FROM a04_geo_fence_status_s;
u33 | 46 | 52.3906 | 13.1599 | [3:OUTSIDE]
u33 | 46 | 52.3906 | 13.1599 | [3:OUTSIDE]
9yz | 12 | 38.34409 | -90.15034 | [2:OUTSIDE, 1:OUTSIDE]
...
As many as there are geo-fences in
geohash
26. gschmutz
4a) Geofences aggregated by GeoHash
Location Analytics – Real-Time Geofencing using Kafka
Join Position
& Geofences
Geofences
gpby geohash
Table
{ "group":1", "name":"St. Louis", "geometry_wkt":"POLYGON
((13.297920227050781 52.56195151687443, …))",
"last_update":1560607149015}
{"vehicle_id":10", "name":"Berlin",
"geometry_wkt":"POLYGON ((-90.23345947265625
38.484769753492536,…))", "last_update":1560607149015}
geo_fence_bulk()
geofence Table
vehicle
position
Stream
{ "geohash":1", "name":"St. Louis",
"geometry_wkt":"POLYGON ((13.297920227050781
52.56195151687443, …))",
"last_update":1560607149015}
{ ”geohash":1", "name":"Berlin",
"geometry_wkt":"POLYGON ((-90.23345947265625
38.484769753492536,…))", "last_update":1560607149015}
Enrich with
GeoHash
Stream
geofences
& geohash
Enrich with
GeoHash
Stream
position &
geohash
geofences
by id
geo_hash()
geo_hash()
Stream
udf
status
geofence
status
Scalable: medium (scale: high ↔ low)
Latency: medium (scale: low ↔ high)
"Code Smell": medium (scale: low ↔ high)
{ "geohash":"u33", "id" : "10", "latitude" : 38.35821, "longitude" : -
90.15311}
27. gschmutz
4b) Geofences aggregated by GeoHash
Location Analytics – Real-Time Geofencing using Kafka
Join Position
& Geofences
Geofences
gpby geohash
Table
{ "group":1", "name":"St. Louis", "geometry_wkt":"POLYGON
((13.297920227050781 52.56195151687443, …))",
"last_update":1560607149015}
{"vehicle_id":10", "name":"Berlin",
"geometry_wkt":"POLYGON ((-90.23345947265625
38.484769753492536,…))", "last_update":1560607149015}
geo_fence()
geofence Table
vehicle
position
Stream
{ "geohash":1", "name":"St. Louis",
"geometry_wkt":"POLYGON ((13.297920227050781
52.56195151687443, …))",
"last_update":1560607149015}
{ "group":1", "name":"Berlin", "geometry_wkt":"POLYGON
((-90.23345947265625 38.484769753492536,…))",
"last_update":1560607149015}
Enrich with
GeoHash
Stream
geofences
& geohash
Enrich with
GeoHash
Stream
position &
geohash
geofences
gpby geohash
geo_hash()
geo_hash()
Stream
position &
geofence
Explode
Geofences
Stream
geofence
status
Scalable: medium (scale: high ↔ low)
Latency: medium (scale: low ↔ high)
"Code Smell": medium (scale: low ↔ high)
{ "geohash":"u33", "id" : "10", "latitude" : 38.35821, "longitude" : -
90.15311}
28. gschmutz
4b) Geofences aggregated by GeoHash
Location Analytics – Real-Time Geofencing using Kafka
CREATE STREAM a04b_geofence_udf_status_s
AS
SELECT id, latitude, longitude, id_list[0] AS geofence_id,
geo_fence(latitude, longitude, geometry_wkt_list[0]) AS geofence_status
FROM a04_vehicle_position_by_geohash_s vp
LEFT JOIN a04_geo_fence_by_geohash_t gf
ON (vp.geo_hash = gf.geo_hash);
INSERT INTO a04b_geofence_udf_status_s
SELECT id, latitude, longitude, id_list[1] geofence_id,
geo_fence(latitude, longitude, geometry_wkt_list[1]) AS geofence_status
FROM a04_vehicle_position_by_geohash_s vp
LEFT JOIN a04_geo_fence_by_geohash_t gf
ON (vp.geo_hash = gf.geo_hash)
WHERE id_list[1] IS NOT NULL;
30. gschmutz
Tile38
Location Analytics – Real-Time Geofencing using Kafka
https://tile38.com
Open Source Geospatial Database & Geofencing Server
Real Time Geofencing
Roaming Geofencing
Fast Spatial Indices
Pluggable Event Notifications
31. gschmutz
Tile38 – How does it work?
Location Analytics – Real-Time Geofencing using Kafka
> SETCHAN berlin WITHIN vehicle FENCE OBJECT
{"type":"Polygon","coordinates":[[[13.297920227050781,52.56195151687443],[1
3.2440185546875,52.530216577830124],[13.267364501953125,52.45998421679598],
[13.35113525390625,52.44826791583386],[13.405036926269531,52.44952338289473
],[13.501167297363281,52.47148826410652], ...]]}
> SUBSCRIBE berlin
{"ok":true,"command":"subscribe","channel":"berlin","num":1,"elapsed":"5.85
µs"}
.
.
.
{"command":"set","group":"5d07581689807d000193ac33","detect":"outside","hoo
k":"berlin","key":"vehicle","time":"2019-06-
17T09:06:30.624923584Z","id":"10","object":{"type":"Point","coordinates":[1
3.3096,52.4497]}}
SET vehicle 10 POINT 52.4497 13.3096
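The objects just set can also be queried ad hoc with standard Tile38 commands; for example (the 5000 m radius is only for illustration):

> GET vehicle 10 POINT
> NEARBY vehicle POINT 52.4497 13.3096 5000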
32. gschmutz
Tile38 – How does it work?
Location Analytics – Real-Time Geofencing using Kafka
> SETHOOK berlin_hook kafka://broker-1:9092/tile38_geofence_status WITHIN
vehicle FENCE OBJECT
{"type":"Polygon","coordinates":[[[13.297920227050781,52.56195151687443],[1
3.2440185546875,52.530216577830124],[13.267364501953125,52.45998421679598],
[13.35113525390625,52.44826791583386],[13.405036926269531,52.44952338289473
],[13.501167297363281,52.47148826410652], ...]]}
bigdata@bigdata:~$ kafkacat -b localhost -t tile38_geofence_status
% Auto-selecting Consumer mode (use -P or -C to override)
{"command":"set","group":"5d07581689807d000193ac34","detect":"outside","hoo
k":"berlin_hook","key":"vehicle","time":"2019-06-
17T09:12:00.488599119Z","id":"10","object":{"type":"Point","coordinates":[1
3.3096,52.4497]}}
SET vehicle 10 POINT 52.4497 13.3096
33. gschmutz
1) Enrich with GeoFences – aggregated by geohash
Location Analytics – Real-Time Geofencing using Kafka
geofence
Stream
vehicle
position
Stream
Invoke UDF
{"vehicle_id":10", "name":"St. Louis", "geometry_wkt":"POLYGON
((13.297920227050781 52.56195151687443, …))",
"last_update":1560607149015}
{"vehicle_id":10", "name":"Berlin", "geometry_wkt":"POLYGON ((-
90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
{ "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
Invoke UDF
Geofence
Service
geofence
status
set_pos()
set_fence()
Stream
udf
status
Scalable: medium (scale: high ↔ low)
Latency: medium (scale: low ↔ high)
"Code Smell": medium (scale: low ↔ high)
34. gschmutz
2) Using Custom Kafka Connector for Tile38
Location Analytics – Real-Time Geofencing using Kafka
geofence
vehicle
position
{"vehicle_id":10", "name":"St. Louis", "geometry_wkt":"POLYGON
((13.297920227050781 52.56195151687443, …))",
"last_update":1560607149015}
{"vehicle_id":10", "name":"Berlin", "geometry_wkt":"POLYGON ((-
90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
{ "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
Geofence
Service
kafka-to-
tile38
kafka-to-
tile38
geofence
status
Scalable: medium (scale: high ↔ low)
Latency: medium (scale: low ↔ high)
"Code Smell": medium (scale: low ↔ high)
35. gschmutz
2) Using Custom Kafka Connector for Tile38
Location Analytics – Real-Time Geofencing using Kafka
curl -X PUT
/api/kafka-connect-1/connectors/Tile38SinkConnector/config
-H 'Content-Type: application/json'
-H 'Accept: application/json'
-d '{
"connector.class":
"com.trivadis.geofence.kafka.connect.Tile38SinkConnector",
"topics": "vehicle_position",
"tasks.max": "1",
"tile38.key": "vehicle",
"tile38.operation": "SET",
"tile38.hosts": "tile38:9851"
}'
Currently only supports SET command
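Once deployed, the connector state can be verified through the standard Kafka Connect REST interface (same path prefix as in the PUT call above):

curl /api/kafka-connect-1/connectors/Tile38SinkConnector/status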
39. gschmutz
Outlook
Location Analytics – Real-Time Geofencing using Kafka
• Geo Fencing is doable using Kafka and KSQL
• KSQL is similar to SQL, but don't think relational
• UDFs and UDAFs are a powerful way to extend KSQL
• Use Geo Hashes to partition the work
• Outlook
• Performance Tests
• Cleanup code of UDFs and UDAFs
• Implement Kafka Source Connector for Tile 38
40. gschmutz
Location Analytics – Real-Time Geofencing using Kafka
Technology on its own won't help you.
You need to know how to use it properly.