An introduction to Apache Kafka and the Kafka Connect APIs (part of Apache Kafka), and in particular to how Kafka can be used together with Elasticsearch.
Thanks to Seacom for inviting us to the event in Rome.
Feed Your SIEM Smart with Kafka Connect (Vitalii Rudenskyi, McKesson Corp) | Hosted by Confluent
SIEM platforms are essential to the new cybersecurity paradigm, and the data collection layer is a very important piece of them.
When you deliver a new platform, you can easily get lost in the variety of vendors and solutions, and you face many challenges. What if I change vendors, will I keep my data? How do I feed multiple tools with the same data? How do I collect data from custom apps and services? How do I pay less for an expensive platform? How do I keep data without a huge cost?
Join us if you are looking for the answers. In this session, you will learn how we replaced the vendor-provided data collection layer with Kafka Connect and the lessons we learnt. After the talk you will know:
- architecture and real-life examples of the flexible and highly available data collection platform
- custom connectors that do most of the work for us and how to extend them to consume new data; we have open sourced them
- an easy way to receive data from thousands of servers and many cloud services
- how to archive data at low cost
You will leave armed with a set of free tools and recipes to build a truly vendor-agnostic data collection platform. It will allow you to bring your SIEM costs under control. You will feed your analytics tools with what they need and archive the rest at low cost. You will feed your SIEM smart!
Flattening the Curve with Kafka (Rishi Tarar, Northrop Grumman Corp.) | Confluent
Responding to a global pandemic presents a unique set of technical and public health challenges. The real challenge is the ability to gather data arriving via many streams in a variety of formats, because that data influences real-world outcomes and impacts everyone. The Centers for Disease Control and Prevention's CELR (COVID Electronic Lab Reporting) program was established to rapidly aggregate, validate, transform, and distribute laboratory testing data submitted by public health departments and other partners. Confluent Kafka with Kafka Streams and Connect plays a critical role in the program's objectives to:
- Track the threat of the COVID-19 virus
- Provide comprehensive data for local, state, and federal response
- Better understand locations with an increase in incidence
GCP for Apache Kafka® Users: Stream Ingestion and Processing | Confluent
Watch this talk here: https://www.confluent.io/online-talks/gcp-for-apache-kafka-users-stream-ingestion-processing
In private and public clouds, stream analytics commonly means stateless processing systems organized around Apache Kafka® or a similar distributed log service. GCP took a somewhat different tack, with Cloud Pub/Sub, Dataflow, and BigQuery, distributing the responsibility for processing among ingestion, processing and database technologies.
We compare the two approaches to data integration and show how Dataflow allows you to join, transform, and deliver data streams among on-prem and cloud Apache Kafka clusters, Cloud Pub/Sub topics, and a variety of databases. The session will have a mix of architectural discussion and practical code reviews of Dataflow-based pipelines.
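As a rough illustration of the bridging described above, here is a minimal sketch of a Beam pipeline (runnable on Dataflow) that reads from a Kafka topic and forwards the record values to a Pub/Sub topic. It assumes Apache Beam's Python SDK with the cross-language Kafka transform (which needs a Java expansion service); the broker address, topic names, and project are hypothetical placeholders.

# A minimal sketch, assuming Beam's Python SDK and its cross-language
# Kafka connector; all names below are placeholders.
import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.io.gcp.pubsub import WriteToPubSub
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)  # add runner/project flags for Dataflow

with beam.Pipeline(options=options) as p:
    (p
     | "ReadKafka" >> ReadFromKafka(
           consumer_config={"bootstrap.servers": "broker:9092"},
           topics=["clickstream"])
     | "Values" >> beam.Map(lambda kv: kv[1])  # keep the record value (bytes)
     | "WritePubSub" >> WriteToPubSub(topic="projects/my-project/topics/clickstream"))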
Neha Narkhede talks about the experience at LinkedIn of moving from batch-oriented ETL to real-time streams using Apache Kafka, and how the design and implementation of Kafka was driven by this goal of acting as a real-time platform for event data. She covers some of the challenges of scaling Kafka to hundreds of billions of events per day at LinkedIn, supporting thousands of engineers, and more.
Streaming Data in the Cloud with Confluent and MongoDB Atlas | Robert Walters | Hosted by Confluent
Are you looking for a cloud-based architecture that includes best-of-breed streaming and database technologies? In this session you will learn how to set up and configure Confluent Cloud with MongoDB Atlas. We'll start the journey learning about the basic connectivity between the two cloud services and end with a brief look at what you can do with data once it is in MongoDB Atlas. By the end of this session you will know how to securely set up and configure the MongoDB Atlas connectors in Confluent Cloud in both source and sink configurations.
Operational Analytics on Event Streams in Kafka | Confluent
Speaker: Anirudh Ramanthan, Product Manager, Rockset
Tracking key events and analyzing these event streams are critical to many enterprises. We highlight how organizations are using Apache Kafka® as a fast, reliable event streaming platform alongside Rockset, a serverless search and analytics engine, to create stateful microservices to analyze their event streams.
In this talk, we will discuss a stateful microservices architecture, where events from multiple channels are collected and streamed into Kafka and continuously ingested into Rockset with no explicit schema or metadata specification required. Developers then use serverless compute frameworks, like AWS Lambda, in conjunction with serverless data management from Rockset to build microservices to derive insights on the data from Kafka. Organizations can leverage this pattern to support low-latency queries on event streams, providing immediate insight on their business.
Streaming Transformations - Putting the T in Streaming ETL | Confluent
Speaker: Nick Dearden, Director of Engineering, Confluent
We’ll discuss how to leverage some of the more advanced transformation capabilities available in both KSQL and Kafka Connect, including how to chain them together into powerful combinations for handling tasks such as data-masking, restructuring and aggregations. Using KSQL, you can deliver the streaming transformation capability easily and quickly.
This is part 3 of 3 in Streaming ETL - The New Data Integration series.
Watch the recording: https://videos.confluent.io/watch/en56Qt3KAdrpQ4JE5EZNHj
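As a concrete taste of the transformations the talk above describes, here is a hedged sketch that submits a data-masking query to a ksqlDB/KSQL server over its REST endpoint; the server address, stream name, and column names are hypothetical placeholders, and MASK() is one of the engine's built-in scalar functions.

# A minimal sketch, assuming a ksqlDB/KSQL server at localhost:8088;
# the stream and columns are hypothetical placeholders.
import requests

statement = """
    CREATE STREAM users_masked AS
        SELECT id, MASK(email) AS email, region
        FROM users;
"""

resp = requests.post(
    "http://localhost:8088/ksql",
    headers={"Content-Type": "application/vnd.ksql.v1+json"},
    json={"ksql": statement, "streamsProperties": {}},
)
resp.raise_for_status()
print(resp.json())

The resulting users_masked stream can then feed a Kafka Connect sink, chaining the transformation into a larger pipeline as the talk suggests.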
How a Data Mesh is Driving our Platform | Trey Hicks, Gloo | Hosted by Confluent
At Gloo.us, we face a challenge in providing platform data to heterogeneous applications in a way that eliminates access contention, avoids high-latency ETLs, and ensures consistency for many teams. We're solving this problem by adopting Data Mesh principles and leveraging Kafka, Kafka Connect, and Kafka Streams to build an event-driven architecture that connects applications to the data they need. A domain-driven design keeps the boundaries between specialized process domains and singularly focused data domains clear, distinct, and disciplined. Applying the principles of a Data Mesh, process domains assume the responsibility of transforming, enriching, or aggregating data rather than relying on these changes at the source of truth -- the data domains. Architecturally, we've broken centralized big data lakes into smaller data stores that can be consumed into storage managed by process domains.
This session covers how we’re applying Kafka tools to enable our data mesh architecture. This includes how we interpret and apply the data mesh paradigm, the role of Kafka as the backbone for a mesh of connectivity, the role of Kafka Connect to generate and consume data events, and the use of KSQL to perform minor transformations for consumers.
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features) | Kai Wähner
A high-level introduction to Confluent REST Proxy and Schema Registry (leveraging Apache Avro under the hood), two components of the Apache Kafka open source ecosystem. See the concepts, architecture and features.
Event Sourcing, Stream Processing and Serverless (Ben Stopford, Confluent) | Confluent
In this talk we'll look at the relationship between three of the most disruptive software engineering paradigms: event sourcing, stream processing and serverless. We'll debunk some of the myths around event sourcing. We'll look at the inevitability of event-driven programming in the serverless space and we'll see how stream processing links these two concepts together with a single 'database for events'. As the story unfolds we'll dive into some use cases, examine the practicalities of each approach, particularly the stateful elements, and finally extrapolate how their future relationship is likely to unfold. Key takeaways include: the different flavors of event sourcing and where their value lies; the difference between stream processing at the application and infrastructure levels; the relationship between stream processors and serverless functions; and the practical limits of storing data in Kafka and stream processors like KSQL.
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google) | Hosted by Confluent
Apache Kafka users who want to leverage Google Cloud Platform's (GCP's) data analytics platform and open source hosting capabilities can bridge their existing Kafka infrastructure, on-premises or in other clouds, to GCP using Confluent's Replicator tool and managed Kafka service on GCP. Using actual customer examples and a reference architecture, we'll showcase how existing Kafka users can stream data to GCP and use it in popular tools like Apache Beam on Dataflow, BigQuery, Google Cloud Storage (GCS), Spark on Dataproc, and TensorFlow for data warehousing, data processing, data storage, and advanced analytics using AI and ML.
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A... | Confluent
Microservices, events, containers, and orchestrators are dominating our vernacular today. As operations teams adapt to support these technologies in production, cloud-native platforms like Pivotal Cloud Foundry and Kubernetes have quickly risen to serve as force multipliers of automation, productivity and value.
Apache Kafka® is providing developers with a critically important component as they build and modernize applications for cloud-native architectures.
This talk will explore:
• Why cloud-native platforms and why run Apache Kafka on Kubernetes?
• What kind of workloads are best suited for this combination?
• Tips to determine the path forward for legacy monoliths in your application portfolio
• Demo: Running Apache Kafka as a Streaming Platform on Kubernetes
Streaming Data Analytics with ksqlDB and Superset | Robert Stolz, Preset | Hosted by Confluent
Streaming data systems have been growing rapidly in importance in the modern data stack. Kafka's ksqlDB provides an interface for analytic tools that speak SQL. Apache Superset, the most popular modern open-source visualization and analytics solution, plugs into nearly any data source that speaks SQL, including Kafka. Here, we review and compare methods for connecting Kafka to Superset to enable streaming analytics use cases including anomaly detection, operational monitoring, and online data integration.
Building a Streaming Pipeline on Kubernetes Using Kafka Connect, KSQLDB & Apa... | Hosted by Confluent
Managing Apache Kafka can sometimes be cumbersome, and that's something we would like to avoid, especially for the developers and data engineers who need to build and develop data pipelines.
Luckily, the combination of Kubernetes and Kafka helps reduce everyday tasks tremendously by adding myriad capabilities that lessen the complexity of managing clusters.
Kafka Connect and ksqlDB are a fantastic combo to add to your streaming stack. These two soldiers can facilitate data acquisition and processing and also provide outstanding real-time ETL capabilities. But what if you need an OLAP datastore to answer complex queries with low-latency responses? That's where Apache Pinot comes into play.
In this session, you're going to learn:
- Effective Kafka deployment on Kubernetes
- How to properly configure Kafka Connect and ksqlDB
- How to integrate Apache Pinot to answer OLAP queries
Data Integration with Apache Kafka: What, Why, How | Pat Patterson
Presented at the Orange County Advanced Analytics and Big Data Meetup, June 21, 2019.
Apache Kafka has fast become the dominant messaging technology for the enterprise; if you're a data scientist or data engineer and you have not yet worked with Kafka, that situation will likely change soon! In this session, Pat Patterson, director of evangelism at StreamSets, explains what Kafka is, why it has disrupted the previous generation of messaging products, and how you can use open source products to build dataflow pipelines with Kafka, without writing code.
Stream Processing Live Traffic Data with Kafka Streams | Tom Van den Bulck
In this workshop we will set up a streaming framework that processes real-time data from traffic sensors installed within the Belgian road system.
Starting with the intake of the data, you will learn best practices and the recommended approach to split the information into events in a way that won't come back to haunt you.
With some basic stream operations (count, filter, ... ) you will get to know the data and experience how easy it is to get things done with Spring Boot & Spring Cloud Stream.
But since simple data processing is not enough to fulfill all your streaming needs, we will also let you experience the power of windows.
After this workshop, tumbling, sliding and session windows hold no more mysteries and you will be a true streaming wizard.
Leveraging Mainframe Data for Modern Analytics | Confluent
“The mainframe is going away” is as true now as it was 10, 20 and 30 years ago. Mainframes are still crucial in handling critical business transactions; however, they were built for an era when batch data movement was the norm, and they can be difficult to integrate into today’s data-driven, real-time, analytics-focused business processes and the environments that support them. Until now.
Join experts from Confluent, Attunity, and Capgemini for a one-hour online talk where you’ll learn how to:
- Unlock your mainframe data with unique change data capture (CDC) functionality, without incurring the complexity and expense that come with sending ongoing queries into the mainframe database
- Understand how CDC benefits advanced analytics approaches such as deep machine learning and predictive analytics
- Deliver ongoing streams of data in real time to the most demanding analytics environments
- Ensure that your analytics environment includes the broadest possible range of data sources and destinations while ensuring true enterprise-grade functionality
- Identify use cases that can help you get started delivering value to the business, moving from POC to pilot to production
Kafka Lag Monitoring For Human Beings (Elad Leev, AppsFlyer), Kafka Summit 2020 | Hosted by Confluent
One of the key metrics to monitor when working with Apache Kafka, as a data pipeline or a streaming platform, is consumer group lag.
Lag is the delta between the last produced message and the last committed message of a partition. In other words, lag indicates how far behind your application is in processing up-to-date information.
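Since lag is plain offset arithmetic, it can be computed directly from a consumer client. Below is a minimal sketch, assuming the kafka-python library, a broker at localhost:9092, and a hypothetical topic and consumer group:

# A minimal sketch of per-partition consumer group lag, assuming the
# kafka-python client; topic and group names are placeholders.
from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(group_id="my-group", bootstrap_servers="localhost:9092")
partitions = [TopicPartition("events", p)
              for p in consumer.partitions_for_topic("events")]

end_offsets = consumer.end_offsets(partitions)  # next offset to be produced
for tp in partitions:
    committed = consumer.committed(tp) or 0     # last offset the group committed
    print(f"partition {tp.partition}: lag = {end_offsets[tp] - committed}")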
For a long time, we used our own service to keep track of these metrics, collect them and visualize them. But this didn’t scale well.
You had to perform many manual operations, redeploy it, and do other tedious manual tasks. Most importantly, the biggest gap for us was that its output was represented in absolute numbers (e.g. your lag is 30K), which basically tells you nothing as a human being.
We understood that we had to find a more suitable solution that would give us better visibility and allow us to measure lag in a time-based format that we all understand.
In this talk, I’m going to go over the core concepts of Kafka offsets and lag, and explain why lag even matters and is an important KPI to measure. I’ll also talk about the kind of research we did to find the right tool, what the options in the market were at the time, and eventually why we chose LinkedIn’s Burrow as the right tool for us. And finally, I’ll take a closer look at Burrow, its building blocks, how we build and deploy it, how we monitor better with it, and eventually the most important improvement - how we transformed its output from numbers to time-based metrics.
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL | Kai Wähner
Stream processing is a concept used to act on real-time streaming data. This session shows and demos how teams in different industries leverage the innovative Streams API from Apache Kafka to build and deploy mission-critical real-time streaming applications and microservices.
The session discusses important streaming concepts like local and distributed state management, exactly-once semantics, embedding streaming into any application, and deployment to any infrastructure. Afterwards, the session explains key advantages of Kafka's Streams API like distributed processing and fault tolerance with fast failover, no-downtime rolling deployments and the ability to reprocess events so you can recalculate output when your code changes.
The session also introduces KSQL, the streaming SQL engine for Apache Kafka: write streaming SQL queries with the scalability, throughput and failover of Kafka Streams under the hood.
The end of the session demos how to combine custom code with your streams application (either Kafka Streams or KSQL), using the example of an analytic model built with a machine learning framework like Apache Spark ML or TensorFlow.
Can and should Apache Kafka replace a database? How long can and should I store data in Kafka? How can I query and process data in Kafka? These are common questions that come up more and more. This session explains the idea behind databases and different features like storage, queries, transactions, and processing to evaluate when Kafka is a good fit and when it is not.
The discussion includes different Kafka-native add-ons like Tiered Storage for long-term, cost-efficient storage and ksqlDB as an event streaming database. The relationship and trade-offs between Kafka and other databases are explored, so that they complement each other instead of one replacing the other. This includes different options for pull- and push-based bi-directional integration.
Key takeaways:
- Kafka can store data forever in a durable and highly available manner
- Kafka has different options to query historical data (see the sketch after this list)
- Kafka-native add-ons like ksqlDB or Tiered Storage make Kafka more powerful than ever before for storing and processing data
- Kafka does not provide transactions, but exactly-once semantics
- Kafka is not a replacement for existing databases like MySQL, MongoDB or Elasticsearch
- Kafka and other databases complement each other; the right solution has to be selected for each problem
- Different options are available for bi-directional pull- and push-based integration between Kafka and databases to complement each other
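As a hedged illustration of the second takeaway, the sketch below rewinds a consumer to the offsets corresponding to a timestamp one hour in the past, which is one way to query historical data straight from the log; it assumes the kafka-python client, and the topic name and time window are hypothetical.

# A minimal sketch, assuming the kafka-python client and a broker at
# localhost:9092; the topic name and one-hour window are placeholders.
from datetime import datetime, timedelta
from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(bootstrap_servers="localhost:9092")
partitions = [TopicPartition("events", p)
              for p in consumer.partitions_for_topic("events")]
consumer.assign(partitions)

# Find the first offset at or after "one hour ago" for every partition.
ts_ms = int((datetime.now() - timedelta(hours=1)).timestamp() * 1000)
offsets = consumer.offsets_for_times({tp: ts_ms for tp in partitions})

for tp, ot in offsets.items():
    if ot is not None:
        consumer.seek(tp, ot.offset)  # rewind, then re-read history

for record in consumer:
    print(record.offset, record.value)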
Video Recording:
https://youtu.be/7KEkWbwefqQ
Blog post:
https://www.kai-waehner.de/blog/2020/03/12/can-apache-kafka-replace-database-acid-storage-transactions-sql-nosql-data-lake/
Streaming Data Ingest and Processing with Apache Kafka | Attunity
Apache™ Kafka is a fast, scalable, durable, and fault-tolerant publish-subscribe messaging system. It offers high throughput, reliability and replication. To manage growing data volumes, many companies are leveraging Kafka for streaming data ingest and processing.
Join experts from Confluent, the creators of Apache™ Kafka, and the experts at Attunity, a leader in data integration software, for a live webinar where you will learn how to:
- Realize the value of streaming data ingest with Kafka
- Turn databases into live feeds for streaming ingest and processing
- Accelerate data delivery to enable real-time analytics
- Reduce skill and training requirements for data ingest
The recorded webinar on slide 32 includes a demo using automation software (Attunity Replicate) to stream live changes from a database into Kafka and also includes a Q&A with our experts.
For more information, please go to www.attunity.com/kafka.
Data Streaming with Apache Kafka & MongoDB - EMEA | Andrew Morgan
A new generation of technologies is needed to consume and exploit today's real-time, fast-moving data sources. Apache Kafka, originally developed at LinkedIn, has emerged as one of these key new technologies.
This webinar explores the use-cases and architecture for Kafka, and how it integrates with MongoDB to build sophisticated data-driven applications that exploit new sources of data.
Webinar: Data Streaming with Apache Kafka & MongoDB | MongoDB
A new generation of technologies is needed to consume and exploit today's real-time, fast-moving data sources. Apache Kafka, originally developed at LinkedIn, has emerged as one of these key new technologies.
Keeping Analytics Data Fresh in a Streaming Architecture | John Neal, Qlik | Hosted by Confluent
Qlik is an industry leader across its solution stack, both on the Data Integration side of things with Qlik Replicate (real-time CDC) and Qlik Compose (data warehouse and data lake automation), and on the Analytics side with Qlik Sense. These two “sides” of Qlik are coming together more frequently these days as the need for “always fresh” data increases across organizations.
When real-time streaming applications are the topic du jour, those companies look to Apache Kafka to provide the architectural backbone those applications require. Those same companies turn to Qlik Replicate to put the data from their enterprise database systems into motion at scale, whether that data resides in “legacy” mainframe databases; traditional relational databases such as Oracle, MySQL, or SQL Server; or applications such as SAP and Salesforce.
In this session we will look in depth at how Qlik Replicate can be used to continuously stream changes from a source database into Apache Kafka. From there, we will explore how a purpose-built consumer can be used to provide the bridge between Apache Kafka and an analytics application such as Qlik Sense.
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ... | Confluent
MQ, ETL and ESB middleware are often used as an integration backbone between legacy applications, modern microservices and cloud services. This introduces several challenges and complexities like point-to-point integration and non-scalable architectures. This session discusses how to build a completely event-driven streaming platform leveraging Apache Kafka’s open source messaging, integration and streaming components, to take advantage of distributed processing, fault tolerance, rolling upgrades and the ability to reprocess events. Learn the differences between an event-driven streaming platform leveraging Apache Kafka and middleware like MQ, ETL and ESBs, including best practices and anti-patterns, but also how these concepts and tools complement each other in an enterprise architecture.
Apache Kafka Use Cases: When To Use It? When Not To Use? (PDF) | Noman Shaikh
In today's data-driven world, the need for real-time data streaming and processing has become paramount. Apache Kafka, an open-source distributed event streaming platform, has emerged as a fundamental technology in meeting this demand.
Introduction to Apache Kafka and Confluent... and why they matter! | Paolo Castagna
This is a short introduction to Apache Kafka and Confluent (the company founded by the creator of Kafka). The slides cover Apache Kafka APIs including Kafka Connect and Kafka Streams (part of Apache Kafka). Other open source, ASL licensed, projects are mentioned: #KSQL, Schema Registry, REST Proxy, etc.
Many thanks to Codemotion and Seacom for hosting the event.
Introduction to Apache Kafka and Confluent... and why they matter | Confluent
Milano Apache Kafka Meetup by Confluent (First Italian Kafka Meetup) on Wednesday, November 29th 2017.
The talk introduces Apache Kafka (including the Kafka Connect and Kafka Streams APIs) and Confluent (the company created by the creators of Kafka), and explains why Kafka is an excellent and simple solution for managing data streams in the context of two of the main driving forces and industry trends: the Internet of Things (IoT) and microservices.
MongoDB World 2019: Streaming ETL on the Shoulders of Giants | MongoDB
Life doesn't happen in batch mode which is why application engineers and data architects need to closely cooperate to get the best out of streaming platforms like Apache Kafka and NoSQL data stores such as MongoDB. This session explores ways and means to integrate both worlds in a streaming fashion.
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) | Kai Wähner
Learn the differences between an event-driven streaming platform and middleware like MQ, ETL and ESBs – including best practices and anti-patterns, but also how these concepts and tools complement each other in an enterprise architecture.
Extract-Transform-Load (ETL) is still a widely-used pattern to move data between different systems via batch processing. Due to its challenges in today’s world where real time is the new standard, an Enterprise Service Bus (ESB) is used in many enterprises as integration backbone between any kind of microservice, legacy application or cloud service to move data via SOAP / REST Web Services or other technologies. Stream Processing is often added as its own component in the enterprise architecture for correlation of different events to implement contextual rules and stateful analytics. Using all these components introduces challenges and complexities in development and operations.
This session discusses how teams in different industries solve these challenges by building a native streaming platform from the ground up instead of using ETL and ESB tools in their architecture. This makes it possible to build and deploy independent, mission-critical real-time streaming applications and microservices. The architecture leverages distributed processing and fault tolerance with fast failover, no-downtime rolling deployments and the ability to reprocess events, so you can recalculate output when your code changes. Integration and stream processing are still key functionality, but can be realized natively in real time instead of using additional ETL, ESB or stream processing tools.
In this talk, together with Andrea Gioia, Partner at Quantyca, we explained the role of Apache Kafka and how it can support event-driven architectures and solutions, and why Kafka is an excellent choice for event sourcing.
Confluent and Elastic: a Lovely Couple - Elastic Stack in a Day 2018 | Paolo Castagna
The talk introduces Confluent, the Confluent Platform, and in particular the role of the Kafka Connect APIs and Confluent's Elasticsearch Connector. It explains why Kafka and Confluent's connector for Elastic are an excellent and simple solution for aggregating data from various sources and managing the ingestion and indexing of documents or data in Elasticsearch.
Introduction to Apache Kafka, Confluent and why they matter | Paolo Castagna
This is a short, introductory presentation on Apache Kafka (including the Kafka Connect APIs and Kafka Streams APIs, both part of Apache Kafka) and other open source components that are part of the Confluent Platform (such as KSQL).
This was the first Kafka Meetup in South Africa.
Kafka Streams - From pub/sub to a complete stream processing platform | Paolo Castagna
A presentation on the Kafka Streams APIs (part of Apache Kafka) and the innovative capabilities they bring to the world of open source stream processing engines. Simplicity (combined with power) and a focus on developers are the biggest innovations.
Slide 2: Vision of a streaming enterprise
[Diagram: a central streaming platform (powered by Apache Kafka) connecting Elasticsearch, NoSQL stores, RDBMS, monitoring, mainframes, real-time analytics, data warehouses, apps, microservices and Hadoop.]
Slide 3: Confluent and Elastic share similar values
• Distributed and fault tolerant
• Horizontally scalable
• Low latency
• Open source
• Enterprise grade solutions
Slide 4: Common Kafka use cases
Data transport and integration:
• Log data
• Sensors and device data
• Monitoring streams
• Call data records
• Stock ticker data
• Customer 360
Real-time stream processing:
• Monitoring
• Asynchronous applications
• Fraud and security
Slide 5: From big data to fast data
Big data was: the more the better (the value of data grows with its volume). Stream data is: the faster the better (the value of data decays with its age). Stream data can be big or fast (the Lambda architecture, with separate speed and batch tables), and stream data will be big AND fast (the Kappa architecture, where a single set of streams feeds every table and job). Apache Kafka is the enabling technology of this transition.
[Diagram: value-of-data vs. volume-of-data and value-of-data vs. age-of-data charts, plus the Lambda architecture (streams and Hadoop jobs feeding a speed table and a batch table in a DB) and the Kappa architecture (streams feeding Table 1 and Table 2 in a DB).]
Slide 6: Confluent Platform
[Diagram: the Confluent Platform. At its open source core is Apache Kafka (Kafka Core, Kafka Connect, Kafka Streams) together with clients, supported connectors, Schema Registry and REST Proxy. Enterprise additions include Control Center, auto data balancing, multi-datacenter replication and 24/7 support. External inputs such as database changes, log events, IoT data and web events flow in; monitoring, analytics, custom apps, transformations, real-time applications, CRM, data warehouses, databases, Hadoop and data integration tools sit downstream.]
Slide 7: Apache Kafka APIs - an ETL analogy
Source → Connect API (Extract) → Streams API (Transform) → Connect API (Load) → Sink
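To make the analogy concrete, here is a minimal, hedged sketch of the same extract-transform-load shape written against the plain clients that Connect and Streams build upon; it assumes the kafka-python library, and the topic names and fields are hypothetical placeholders.

# A minimal consume -> transform -> produce loop: the pattern that the
# Connect API (extract/load) and Streams API (transform) automate.
# Assumes the kafka-python client; topics and fields are placeholders.
import json
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer("orders-raw",
                         bootstrap_servers="localhost:9092",
                         value_deserializer=json.loads)
producer = KafkaProducer(bootstrap_servers="localhost:9092",
                         value_serializer=lambda v: json.dumps(v).encode())

for record in consumer:            # Extract: read from the source topic
    order = record.value
    enriched = {"id": order["id"],               # Transform: reshape the event
                "total": order["qty"] * order["price"]}
    producer.send("orders-enriched", enriched)   # Load: write to the sink topic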
Slide 8: Apache Kafka Connect APIs - streaming data capture
[Diagram: sources such as JDBC, Oracle and MySQL feed source connectors into the Kafka pipeline, and sink connectors deliver the data onward to sinks such as Elastic, Couchbase and HDFS.]
• Fault tolerant
• Manages hundreds of data sources and sinks
• Preserves the data schema
• Part of the Apache Kafka project
• Integrated within Confluent Platform’s Control Center
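As a rough sketch of how such a pipeline is wired up, the snippet below registers a JDBC source connector through the Kafka Connect REST API. The connector class and config keys follow Confluent's JDBC source connector, but the worker address, database URL and table are hypothetical placeholders.

# A minimal sketch, assuming a Kafka Connect worker at localhost:8083
# with Confluent's JDBC source connector installed; connection details
# are hypothetical placeholders.
import requests

connector = {
    "name": "orders-jdbc-source",
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:mysql://db-host:3306/shop?user=kafka&password=secret",
        "mode": "incrementing",            # capture new rows via a growing id column
        "incrementing.column.name": "id",
        "table.whitelist": "orders",
        "topic.prefix": "mysql-",          # rows land in the topic "mysql-orders"
    },
}

resp = requests.post("http://localhost:8083/connectors", json=connector)
resp.raise_for_status()
print(resp.json())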
Slide 9: Confluent Elasticsearch Connector
• Easily move data from Kafka to Elasticsearch
• Open source, ASL licensed
• Key features: exactly-once delivery, mapping inference, schema evolution
[Diagram: sources such as JDBC, Oracle and MySQL flow through the Kafka Connect API's Elasticsearch Connector into Elastic.]
Documentation: http://docs.confluent.io/current/connect/connect-elasticsearch/docs/elasticsearch_connector.html
Source code: https://github.com/confluentinc/kafka-connect-elasticsearch
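For illustration, here is a hedged sketch that registers the Elasticsearch sink connector in the same way; the worker address, Elasticsearch URL and topic name are hypothetical placeholders.

# A minimal sketch, assuming a Kafka Connect worker at localhost:8083
# and an Elasticsearch node at localhost:9200; the topic name is a
# placeholder.
import requests

connector = {
    "name": "orders-elasticsearch-sink",
    "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics": "mysql-orders",          # index one document per Kafka record
        "connection.url": "http://localhost:9200",
        "type.name": "_doc",
        "key.ignore": "true",              # derive doc ids from topic+partition+offset
        "schema.ignore": "false",          # use the record schema to infer the mapping
    },
}

resp = requests.post("http://localhost:8083/connectors", json=connector)
resp.raise_for_status()
print(resp.json())

Deterministic document ids (from the record key, or from topic+partition+offset as here) are what let the connector achieve the exactly-once delivery mentioned above: retried writes simply overwrite the same document.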
Slide 10: Benefits of the Kafka Connect APIs - simple, but powerful
[Diagram: sources such as JDBC, Oracle and mainframes flow through the Kafka Connect API, where the Elasticsearch, HP Vertica and HDFS connectors fan the same data out to Elastic, HP Vertica and HDFS.]
Slide 11: Benefits of the Kafka Connect APIs - cross data center replication
[Diagram: Confluent Replicator, built on the Kafka Connect API, replicates a Kafka cluster from Data Center A to Data Center B with low-latency, real-time data replication.]
Slide 14: True Partnership Focused on Customer Success
“A simple, scalable and flexible solution that delivers data in real time, enabling real-time data integration with actionable insights.”
• Enterprise grade distribution of Kafka
• Stream processing at scale
• Simple, reliable, secure and auditable
• Fast and scalable
• Easy to operate
• Enterprise grade security