Dbvisit is a New Zealand-based company with offices worldwide that provides software to replicate data from Oracle databases to Apache Kafka in real time. Their Dbvisit Replicate Connector is a plugin for Kafka Connect that allows minimal-impact replication of database table changes to Kafka topics. The connector also generates metadata topics. Dbvisit focuses only on Oracle databases and replication, has proprietary log mining technology, and supports Oracle back to version 9.2. They have over 1,300 customers globally and offer perpetual or term licensing models for their replication software, along with support plans. Dbvisit is a good fit for organizations using Oracle that want to offload reporting, enable real-time analytics, and integrate data into Kafka in a cost-effective manner.
How to build a streaming Lakehouse with Flink, Kafka, and Hudi - Flink Forward
Flink Forward San Francisco 2022.
With a real-time processing engine like Flink and a transactional storage layer like Hudi, it has never been easier to build end-to-end low-latency data platforms connecting sources like Kafka to data lake storage. Come learn how to blend Lakehouse architectural patterns with real-time processing pipelines with Flink and Hudi. We will dive deep on how Flink can leverage the newest features of Hudi like multi-modal indexing that dramatically improves query and write performance, data skipping that reduces the query latency by 10x for large datasets, and many more innovations unique to Flink and Hudi.
by Ethan Guo & Kyle Weller
Getting Started with Confluent Schema Registry - confluent
Getting started with Confluent Schema Registry, Patrick Druley, Senior Solutions Engineer, Confluent
Meetup link: https://www.meetup.com/Cleveland-Kafka/events/272787313/
Stream Processing with Apache Kafka and .NET - confluent
Presentation from South Bay.NET meetup on 3/30.
Speaker: Matt Howlett, Software Engineer at Confluent
Apache Kafka is a scalable streaming platform that forms a key part of the infrastructure at many companies including Uber, Netflix, Walmart, Airbnb, Goldman Sachs and LinkedIn. In this talk Matt will give a technical overview of Kafka, discuss some typical use cases (from surge pricing to fraud detection to web analytics) and show you how to use Kafka from within your C#/.NET applications.
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ... - Kai Wähner
Talk from Kafka Summit San Francisco 2019 (https://kafka-summit.org/sessions/event-driven-model-serving-stream-processing-vs-rpc-kafka-tensorflow/). Video recording will be available for free on the Summit website.
Event-based stream processing is a modern paradigm to continuously process incoming data feeds, e.g. for IoT sensor analytics, payment and fraud detection, or logistics. Machine Learning / Deep Learning models can be leveraged in different ways to do predictions and improve the business processes. Either analytic models are deployed natively in the application or they are hosted in a remote model server. In the latter case, you combine stream processing with the RPC / Request-Response paradigm instead of doing direct inference within the application. This talk discusses the pros and cons of both approaches and shows examples of stream processing vs. RPC model serving using Kubernetes, Apache Kafka, Kafka Streams, gRPC and TensorFlow Serving. The trade-offs of using a public cloud service like AWS or GCP for model deployment are also discussed and compared to local hosting for offline predictions directly “at the edge”.
Key takeaways
• Machine Learning / Deep Learning models can be used in different ways to do predictions. Scalability and loose coupling are important success factors
• Stream processing vs. RPC / Request-Response for model serving has many trade-offs – learn about alternatives and best practices for your different scenarios
• Understand the alternatives and trade-offs of model deployment in modern infrastructures like Kubernetes or Cloud Services like AWS or GCP
• See live demos with Java, gRPC, Apache Kafka, KSQL and TensorFlow Serving to understand the trade-offs
Kafka Streams is a new stream processing library natively integrated with Kafka. It has a very low barrier to entry, easy operationalization, and a natural DSL for writing stream processing applications. As such it is the most convenient yet scalable option to analyze, transform, or otherwise process data that is backed by Kafka. We will provide the audience with an overview of Kafka Streams including its design and API, typical use cases, code examples, and an outlook of its upcoming roadmap. We will also compare Kafka Streams' light-weight library approach with heavier, framework-based tools such as Spark Streaming or Storm, which require you to understand and operate a whole different infrastructure for processing real-time data in Kafka.
Hello, kafka! (an introduction to apache kafka) - Timothy Spann
Hello ApacheKafka
An Introduction to Apache Kafka with Timothy Spann and Carolyn Duby, Cloudera principal engineers.
We also demo Flink SQL, SMM, SSB, Schema Registry, Apache Kafka, Apache NiFi and Public Cloud - AWS.
Running Apache Kafka in production is only the first step in the Kafka operations journey. Professional Kafka users are ready to handle all possible disasters - because for most businesses having a disaster recovery plan is not optional.
In this session, we’ll discuss disaster scenarios that can take down entire Kafka clusters and share advice on how to plan, prepare and handle these events. This is a technical session full of best practices - we want to make sure you are ready to handle the worst mayhem that nature and auditors can cause.
Visit www.confluent.io for more information.
Kafka Tutorial - Introduction to Apache Kafka (Part 1) - Jean-Paul Azar
Why is Kafka so fast? Why is Kafka so popular? Why Kafka? This slide deck is a tutorial for the Kafka streaming platform. It covers Kafka architecture with some small examples from the command line. We then expand on this with a multi-server example to demonstrate failover of brokers as well as consumers, and go through some simple Java client examples for a Kafka Producer and a Kafka Consumer. We have also expanded on the Kafka design section and added references. The tutorial covers Avro and the Schema Registry as well as advanced Kafka Producers.
Watch this talk here: https://www.confluent.io/online-talks/how-apache-kafka-works-on-demand
Pick up best practices for developing applications that use Apache Kafka, beginning with a high level code overview for a basic producer and consumer. From there we’ll cover strategies for building powerful stream processing applications, including high availability through replication, data retention policies, producer design and producer guarantees.
We’ll delve into the details of delivery guarantees, including exactly-once semantics, partition strategies and consumer group rebalances. The talk will finish with a discussion of compacted topics, troubleshooting strategies and a security overview.
This session is part 3 of 4 in our Fundamentals for Apache Kafka series.
Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022 - HostedbyConfluent
If you were to ask any developer, "What's a schema and where is it used?", you'd most likely get an answer involving a relational database. The truth is the domain objects used in applications represent a contract, an implied schema, whether developers choose to acknowledge them or not. But even if you recognize the need for a formal schema, what's the best way to manage them?
This presentation will contain some theory and primarily practical application for schemas with Schema Registry. I'll briefly explain what a schema is and how it's very relevant to any application working with Kafka today. It will go into the practical, introducing Schema Registry, describing how it works and how developers can leverage it to provide schemas across an organization. The discussion will cover working with Schema Registry from the command line, how to leverage it with Kafka clients, and the supported serialization formats. Some established build tools that make life easier for the Kafka developer will also be covered.
Attendees will walk away with knowledge of Schema Registry and a solid understanding of how it works and how to integrate it into Kafka clients. They'll also learn enough about the supported serialization frameworks to start implementing schemas right away in their Kafka development efforts.
Building Reliable Lakehouses with Apache Flink and Delta Lake - Flink Forward
Flink Forward San Francisco 2022.
Apache Flink and Delta Lake together allow you to build the foundation for your data lakehouses by ensuring the reliability of your concurrent streams from processing to the underlying cloud object-store. Together, the Flink/Delta Connector enables you to store data in Delta tables such that you harness Delta’s reliability by providing ACID transactions and scalability while maintaining Flink’s end-to-end exactly-once processing. This ensures that the data from Flink is written to Delta Tables in an idempotent manner such that even if the Flink pipeline is restarted from its checkpoint information, the pipeline will guarantee no data is lost or duplicated thus preserving the exactly-once semantics of Flink.
by Scott Sandre & Denny Lee
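The idempotent-write idea in the abstract above can be illustrated with a small sketch. This is not the Flink/Delta Connector and does not use any Delta Lake API; `ToyTable`, `commit`, `run_pipeline`, and the `job-1` application id are hypothetical names made up for this example. It assumes the sink remembers the last checkpoint id it committed for each writer, so a batch replayed after a restart is skipped rather than appended twice.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """In-memory stand-in for a transactional table (hypothetical, not the Delta API)."""
    rows: list = field(default_factory=list)
    # Highest checkpoint id committed so far, keyed by writer application id.
    committed: dict = field(default_factory=dict)

    def commit(self, app_id: str, checkpoint_id: int, batch: list) -> bool:
        """Append the batch unless this checkpoint was already committed."""
        if checkpoint_id <= self.committed.get(app_id, -1):
            return False  # replay after a restart: skip so nothing is duplicated
        self.rows.extend(batch)
        self.committed[app_id] = checkpoint_id
        return True


def run_pipeline(table, app_id, batches, start_checkpoint=0):
    """Write one batch per checkpoint, starting from start_checkpoint."""
    for checkpoint_id in range(start_checkpoint, len(batches)):
        table.commit(app_id, checkpoint_id, batches[checkpoint_id])


batches = [["a", "b"], ["c"], ["d", "e"]]
table = ToyTable()
# First run commits checkpoints 0 and 1, then the job "crashes".
run_pipeline(table, "job-1", batches[:2])
# The restart resumes from checkpoint 1, so that batch is replayed.
run_pipeline(table, "job-1", batches, start_checkpoint=1)
print(table.rows)  # ['a', 'b', 'c', 'd', 'e']
```

The replayed checkpoint 1 is recognised and ignored, so the table ends up with each record exactly once even though the pipeline sent one batch twice.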
Automate Your Kafka Cluster with Kubernetes Custom Resources - confluent
(Sam Obeid, Shopify) Kafka Summit SF 2018
At Shopify we manage multiple Apache Kafka clusters in multiple locations in Google’s cloud platform. We deploy our Kafka clusters as Kubernetes StatefulSets, and we use other K8s workloads to implement different tasks. Automating critical and repetitive operational tasks is one of our top priorities.
In this talk we’ll discuss how we leveraged Kubernetes Custom Resources and Controllers to automate some of the key cluster operational tasks, to detect cluster configuration changes and to react to these changes with the required actions. We will go through actual examples we implemented at Shopify, how we solved the problem of cluster discovery and how we automated topic creation across different clusters with zero human intervention and safety controls.
This talk gives a brief introduction to Apache Kafka and describes its usage as a platform for streaming data. It will introduce some of the newer components of Kafka that will help make this possible, including Kafka Connect, a framework for capturing continuous data streams, and Kafka Streams, a lightweight stream processing library.
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka - Guido Schmutz
Apache Kafka is a popular distributed streaming data platform and is increasingly the architectural backbone for integrating streaming data with a Data Lake, Microservices and Stream Processing. A lot of the data needed in stream processing is stored in traditional systems backed by relational databases. This session will present different approaches for integrating relational databases with Kafka, such as Kafka Connect, Oracle GoldenGate, ORDS APIs and bridging Kafka with Oracle AQ.
Apache Kafka 0.8 basic training - Verisign - Michael Noll
Apache Kafka 0.8 basic training (120 slides) covering:
1. Introducing Kafka: history, Kafka at LinkedIn, Kafka adoption in the industry, why Kafka
2. Kafka core concepts: topics, partitions, replicas, producers, consumers, brokers
3. Operating Kafka: architecture, hardware specs, deploying, monitoring, P&S tuning
4. Developing Kafka apps: writing to Kafka, reading from Kafka, testing, serialization, compression, example apps
5. Playing with Kafka using Wirbelsturm
Audience: developers, operations, architects
Created by Michael G. Noll, Data Architect, Verisign, https://www.verisigninc.com/
Verisign is a global leader in domain names and internet security.
Tools mentioned:
- Wirbelsturm (https://github.com/miguno/wirbelsturm)
- kafka-storm-starter (https://github.com/miguno/kafka-storm-starter)
Blog post at:
http://www.michael-noll.com/blog/2014/08/18/apache-kafka-training-deck-and-tutorial/
Many thanks to the LinkedIn Engineering team (the creators of Kafka) and the Apache Kafka open source community!
Disaster Recovery Options Running Apache Kafka in Kubernetes with Rema Subra... - HostedbyConfluent
Active-Active, Active-Passive, and stretch clusters are hallmark patterns that have been the gold standard in Apache Kafka® disaster recovery architectures for years. Moving to Kubernetes requires unpacking these patterns and choosing a configuration that allows you to meet the same RTO and RPO requirements.
In this talk, we will cover how Active-Active/Active-Passive modes for disaster recovery have worked in the past and how the architecture evolves with deploying Apache Kafka on Kubernetes. We'll also look at how stretch clusters sitting on this architecture give a disaster recovery solution that's built-in!
Armed with this information, you will be able to architect your new Apache Kafka Kubernetes deployment (or retool your existing one) to achieve the resilience you require.
With Apache Kafka 0.9, the community has introduced a number of features to make data streams secure. In this talk, we’ll explain the motivation for making these changes, discuss the design of Kafka security, and explain how to secure a Kafka cluster. We will cover common pitfalls in securing Kafka, and talk about ongoing security work.
Kafka error handling patterns and best practices | Hemant Desale and Aruna Ka... - HostedbyConfluent
Transaction Banking from Goldman Sachs is a high-volume, latency-sensitive digital banking platform. We have chosen an event-driven architecture to build highly decoupled and independent microservices in a cloud-native manner, designed to meet the objectives of security, availability, latency and scalability. Kafka was a natural choice – to decouple producers and consumers and to scale easily for high-volume processing. However, certain aspects require careful consideration – handling errors and partial failures, managing downtime of consumers, and securing communication between brokers and producers / consumers. In this session, we will present the patterns and best practices that helped us build robust event-driven applications. We will also present our solution approach, which has been reused across multiple application domains. We hope that by sharing our experience, we can establish a reference implementation that application developers can benefit from.
Billions of Messages in Real Time: Why Paypal & LinkedIn Trust an Engagement ... - confluent
(Bruno Simic, Solutions Engineer, Couchbase)
Breakout during Confluent’s streaming event in Munich. This three-day hands-on course focused on how to build, manage, and monitor clusters using industry best-practices developed by the world’s foremost Apache Kafka™ experts. The sessions focused on how Kafka and the Confluent Platform work, how their main subsystems interact, and how to set up, manage, monitor, and tune your cluster.
Webinar - Delivering Enhanced Message Processing at Scale With an Always-on D... - DataStax
Managing 3.8 million e-prescriptions daily for more than 1 million healthcare professionals is no small feat. And, with rapid growth in the number of digital transactions and expansion of its network, Surescripts needed to replace its legacy relational database system to address a new set of data management challenges while meeting their customers’ demanding SLAs. Join us for this on-demand webinar to hear from Keith Willard, Chief Architect at Surescripts, to learn how and why Surescripts leverages DataStax Enterprise to deliver enhanced message processing at scale.
View recording: https://youtu.be/1T6V1XAoaJQ
Explore all DataStax webinars: https://www.datastax.com/resources/webinars
Streaming Data Ingest and Processing with Apache Kafka - Attunity
Apache™ Kafka is a fast, scalable, durable, and fault-tolerant publish-subscribe messaging system. It offers higher throughput, reliability and replication. To manage growing data volumes, many companies are leveraging Kafka for streaming data ingest and processing.
Join experts from Confluent, the creators of Apache™ Kafka, and the experts at Attunity, a leader in data integration software, for a live webinar where you will learn how to:
- Realize the value of streaming data ingest with Kafka
- Turn databases into live feeds for streaming ingest and processing
- Accelerate data delivery to enable real-time analytics
- Reduce skill and training requirements for data ingest
The recorded webinar on slide 32 includes a demo using automation software (Attunity Replicate) to stream live changes from a database into Kafka and also includes a Q&A with our experts.
For more information, please go to www.attunity.com/kafka.
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In... - DataWorks Summit
Progressive Insurance is well known for its innovative use of data to better serve its customers, and the important role that Hortonworks Data Platform has played in that transformation. However, as with most things worth doing, the path to the Data Lake was not without its challenges. In this session, I’ll share our top use cases for Hadoop – including telematics and display ads, how a skills shortage turned supporting these applications into a nightmare, and how – and why – we now use Syncsort DMX-h to accelerate enterprise adoption by making it quick and easy (or faster and easier) to populate the data lake – and keep it up to date – with data from across the enterprise. I’ll discuss the different approaches we tried, the benefits of using a tool vs. open source, and how we created our Hadoop Ingestor app using Syncsort DMX-h.
Best Practices for Building Hybrid-Cloud Architectures | Hans Jespersenconfluent
Best Practices for building Hybrid-Cloud Architectures - Hans Jespersen
Afternoon opening presentation during Confluent’s streaming event in Paris, presented by Hans Jespersen, VP WW Systems Engineering at Confluent.
The Most Trusted In-Memory database in the world- AltibaseAltibase
Life is a database. How you manage data defines business. ALTIBASE HDB, with its Hybrid architecture, combines the extreme speed of an In-Memory Database with the storage capacity of an On-Disk Database in a single unified engine.
ALTIBASE® HDB™ is the only Hybrid DBMS in the industry that combines an in-memory DBMS with an on-disk DBMS, with a single uniform interface, enabling real-time access to large volumes of data, while simplifying and revolutionizing data processing. ALTIBASE XDB is the world’s fastest in-memory DBMS, featuring unprecedented high performance, and supports SQL-99 standard for wide applicability.
Altibase is a provider of In-Memory data solutions for real-time access, analysis and distribution of high volumes of data in mission-critical environments.
Please visit our website (www.altibase.com) to learn more about our products and read more about our case studies. Or contact us at info@altibase.com. We look forward to helping you!
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the CloudDataWorks Summit
The world’s largest enterprises run their infrastructure on Oracle, DB2 and SQL and their critical business operations on SAP applications. Organisations need this data to be available in real-time to conduct necessary analytics. However, delivering this heterogeneous data at the speed it’s required can be a huge challenge because of the complex underlying data models and structures and legacy manual processes which are prone to errors and delays.
Unlock these silos of data and enable the new advanced analytics platforms by attending this session.
Find out how:
• To overcome common challenges faced by enterprises trying to access their SAP data
• You can integrate SAP data in real-time with change data capture (CDC) technology
• Organisations are using Attunity Replicate for SAP to stream SAP data into Kafka
Speakers:
John Hol, Regional Director, Attunity
Mike Hollobon, Director Business Development, IBT
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyAlluxio, Inc.
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Sandipan Chakraborty, Director of Engineering (Rakuten)
About Alluxio: alluxio.io
Engage with the open source community on slack: alluxio.io/slack
Simplifying Big Data Integration with Syncsort DMX and DMX-hPrecisely
Today’s modern data strategies have to manage more than growing data volumes. They must also address the added complexity of integrating diverse data sources and types, adhere to security and governance mandates, and ensure the right tools and skills are in place to deliver business value from the data.
Learn how the latest enhancements to Syncsort DMX and DMX-h can help you achieve your modern data strategy goals with a single interface for accessing and integrating all your enterprise data sources – batch and streaming – across Hadoop, Spark, Linux, Windows or Unix – on premise or in the cloud.
Watch this on-demand customer education webcast to learn the latest product features introduced this year, including:
• Best in class data ingestion capabilities with enhanced support for mainframes, RDBMSs, MPP, Avro/Parquet, Kafka, NoSQL and more.
• Single interface for streaming and batch processes – now with support for Kafka and MapR Streams
• Secure data access, data governance and lineage with seamless integration with Kerberos, Apache Ranger, Apache Ambari, Cloudera Manager, Cloudera Navigator and Sentry.
• Evolution of our design once, deploy anywhere architecture – now with support for Spark!
Seamless, Real-Time Data Integration with ConnectPrecisely
As many of our customers have come to learn - integrating legacy data into modern data architecture is easier said than done! View this on-demand webinar to learn all about Precisely's seamless data integration solutions and how they have helped thousands of customers like you trust their data.
Learn about the two flavors of Precisely's Connect:
• Collect, prepare, transform and load your data to various targets using Connect ETL, with the flexibility of using clusters and running in many different environments. With our 'design once, deploy anywhere' feature, what is built on-prem today can run on a cloud platform tomorrow with no development or mainframe expertise required.
• Capture data changes in real time with no coding, tuning or performance impact using Connect CDC. Replicate exactly WHAT you need and HOW you need it with over 80 built-in data transformation methods.
An Introduction to Confluent Cloud: Apache Kafka as a Serviceconfluent
Business breakout during Confluent’s streaming event in Munich, presented by Hans Jespersen, VP WW Systems Engineering at Confluent. This three-day hands-on course focused on how to build, manage, and monitor clusters using industry best-practices developed by the world’s foremost Apache Kafka™ experts. The sessions focused on how Kafka and the Confluent Platform work, how their main subsystems interact, and how to set up, manage, monitor, and tune your cluster.
Achieve Sub-Second Analytics on Apache Kafka with Confluent and Implyconfluent
Presenters: Rachel Pedreschi, Senior Director, Solutions Engineering, Imply.io + Josh Treichel, Partner Solutions Architect, Confluent
Analytic pipelines running purely on batch processing systems can suffer from hours of data lag, resulting in accuracy issues with analysis and overall decision-making. Join us for a demo to learn how easy it is to integrate your Apache Kafka® streams in Apache Druid (incubating) to provide real-time insights into the data.
In this online talk, you’ll hear about ingesting your Kafka streams into Imply’s scalable analytic engine and gaining real-time insights via a modern user interface.
Register now to learn about:
-The benefits of combining a real-time streaming platform with a comprehensive analytics stack
-Building an analytics pipeline by integrating Confluent Platform and Imply
-How KSQL, streaming SQL for Kafka, can easily transform and filter streams of data in real time
-Querying and visualizing streaming data in Imply
-Practical ways to implement Confluent Platform and Imply to address common use cases such as analyzing network flows, collecting and monitoring IoT data and visualizing clickstream data
Confluent Platform, developed by the creators of Kafka, enables the ingest and processing of massive amounts of real-time event data. Imply, the complete analytics stack built on Druid, can ingest, store, query and visualize streaming data from Confluent Platform, enabling end-to-end real-time analytics. Together, Confluent and Imply can provide low latency data delivery, data transform, and data querying capabilities to power a range of use cases.
Similar to Real-time Data Streaming from Oracle to Apache Kafka (20)
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
In our exclusive webinar, you'll learn why event-driven architecture is the key to unlocking cost efficiency, operational effectiveness, and profitability. Gain insights on how this approach differs from API-driven methods and why it's essential for your organization's success.
Unlocking the Power of IoT: A comprehensive approach to real-time insightsconfluent
In today's data-driven world, the Internet of Things (IoT) is revolutionizing industries and unlocking new possibilities. Join Data Reply, Confluent, and Imply as we unveil a comprehensive solution for IoT that harnesses the power of real-time insights.
Workshop híbrido: Stream Processing con Flinkconfluent
Stream processing is a prerequisite of the data streaming stack, powering real-time applications and pipelines.
It enables greater data portability, optimized resource utilization, and a better customer experience by processing data streams in real time.
In our hybrid hands-on workshop, you will learn how to easily filter, join, and enrich real-time data within Confluent Cloud using our serverless Flink service.
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...confluent
Our talk will explore the transformative impact of integrating Confluent, HiveMQ, and SparkPlug in Industry 4.0, emphasizing the creation of a Unified Namespace.
In addition to the creation of a Unified Namespace, our webinar will also delve into Stream Governance and Scaling, highlighting how these aspects are crucial for managing complex data flows and ensuring robust, scalable IIoT-Platforms.
You will learn how to ensure data accuracy and reliability, expand your data processing capabilities, and optimize your data management processes.
Don't miss out on this opportunity to learn from industry experts and take your business to the next level.
Event-driven architecture (EDA) will be the heart of MAPFRE's ecosystem. To remain competitive, today's companies increasingly depend on real-time data analytics, which gives them faster insights and response times. Doing business with real-time data is about gaining situational awareness, detecting and responding to what is happening in the world now.
Eventos y Microservicios - Santander TechTalkconfluent
During this session we will examine how the worlds of events and microservices complement and improve each other, exploring how event-based patterns allow us to decompose monoliths in a scalable, resilient, and decoupled way.
The purpose of the session is to dive into Apache Kafka, data streaming, and Kafka in the cloud
- Dive into Apache Kafka
- Data Streaming
- Kafka in the cloud
Build real-time streaming data pipelines to AWS with Confluentconfluent
Traditional data pipelines often face scalability issues and challenges related to cost, their monolithic design, and reliance on batch data processing. They also typically operate under the premise that all data needs to be stored in a single centralized data source before it's put to practical use. Confluent Cloud on Amazon Web Services (AWS) provides a fully managed cloud-native platform that helps you simplify the way you build real-time data flows using streaming data pipelines and Apache Kafka.
Q&A with Confluent Professional Services: Confluent Service Meshconfluent
Whether you are migrating your Kafka cluster to Confluent Cloud, running a cloud-hybrid environment, or are in a different situation where data protection and encryption of sensitive information is required, Confluent Service Mesh allows you to transparently encrypt your data without the need to make code changes to your existing applications.
Citi Tech Talk: Event Driven Kafka Microservicesconfluent
Microservices have become a dominant architectural paradigm for building systems in the enterprise, but they are not without their tradeoffs. Learn how to build event-driven microservices with Apache Kafka
Confluent & GSI Webinars series - Session 3confluent
An in-depth look at how Confluent is being used in the financial services industry. Gain an understanding of how organisations are utilising data in motion to solve common problems and gain benefits from their real-time data capabilities.
It will look more deeply into some specific use cases and show how Confluent technology is used to manage costs and mitigate risks.
This session is aimed at Solutions Architects, Sales Engineers and Pre Sales, and also the more technically minded business aligned people. Whilst this is not a deeply technical session, a level of knowledge around Kafka would be helpful.
Transforming applications built with traditional messaging solutions such as TIBCO, MQ and Solace to be scalable, reliable and ready for the move to cloud
How can applications built with traditional messaging technologies like TIBCO, Solace and IBM MQ be modernised and made cloud ready? What are the advantages of Event Streaming approaches to pub/sub versus traditional message queues? What are the strengths and weaknesses of both approaches, and what use cases and requirements are actually a better fit for messaging than Kafka?
This session will show why the old paradigm does not work and that a new approach to the data strategy needs to be taken. It aims to show how a Data Streaming Platform is integral to the evolution of a company’s data strategy and how Confluent is not just an integration layer but the central nervous system for an organisation.
You will also learn how to:
• Build products and features faster using a complete suite of connectors and stream management tools, and connect your environments to data pipelines
• Protect your most critical data and workloads with built-in security, governance, and resilience guarantees
• Deploy Kafka at scale in minutes while reducing the associated costs and operational burden
Confluent Partner Tech Talk with Synthesisconfluent
A discussion on the arduous planning process, and deep dive into the design/architectural decisions.
Learn more about the networking, RBAC strategies, the automation, and the deployment plan.
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Real-time Data Streaming from Oracle to Apache Kafka
1. Real-time Data Streaming From Oracle to Apache Kafka
Mike Donovan – CTO
Kathy Howes – Marketing Manager
Mark Ripma – GM USA
2. Who is Dbvisit?
• Making your Oracle data competitive
• New Zealand-based, US office, Asia sales office, EU office (Prague)
• Low cost solutions; flexible licensing
• Software that’s easy to use
• People who are easy to work with
• Great support
4. Dbvisit Solution Space
IN THE CLOUD, ON-PREMISE OR HYBRID
Reduce business disruption
• Disaster Recovery for Oracle SE
• High Availability
• Database upgrade and migrations
• Migrate to the Cloud
Enable real-time operational intelligence
• Offload reporting
• Improve customer experience
• Better business decisions
• Continuous data integration
• Load data warehouse real-time
Deliver continuous data streaming
• Feed data streaming platforms
• Predictive analytics
• Event stream transactional data
• Feed Big Data/Hadoop
5. The World We Live In
The Situation:
• The enterprise is increasingly powered by data
• The use of real-time data for competitive advantage is disrupting most industries
• OLTP transactional data essential
• Traditional databases are not going away, new database technologies are being added
• Continuous replication data streams becoming a “first class citizen”
6. Reality of RDBMS
RDBMS
• Millions of Oracle databases out there
• OLTP databases are ingrained in the business
• Pervasive
• ERPs, CRMs
• Oracle #1 in most sales
• Oracle is reported to have over 50% of all RDBMS sales
• Oracle is here to stay
8. Confluent Platform: It’s Kafka ++
Editions compared: Apache Kafka, Confluent Platform 3.0, Confluent Enterprise 3.0
Feature | Benefit
Apache Kafka | High throughput, low latency, high availability, secure distributed message system
Kafka Connect | Advanced framework for connecting external sources and destinations into Kafka
Java Client | Provides easy integration into Java applications
Kafka Streams | Simple library that enables streaming application development within the Kafka framework
Additional Clients | Supports non-Java clients; C, C++, Python, etc.
Rest Proxy | Provides universal access to Kafka from any network connected device via HTTP
Schema Registry | Central registry for the format of Kafka data – guarantees all data is always consumable
Pre-Built Connectors | HDFS, JDBC and other connectors fully certified and fully supported by Confluent
Confluent Control Center (includes Connector Management and Stream Monitoring) | Connection and Monitoring command center provides advanced functionality and control
Support | Apache Kafka: Community; Confluent Platform 3.0: Community; Confluent Enterprise 3.0: 24x7x365
Free | Free | Subscription
9. Confluent Platform with Dbvisit Connectivity
[Diagram: Confluent Platform architecture. Sources: Database Changes (RDBMS), ERP, CRM, Data Warehouse, Hadoop, Logs, Website Events, IoT, Mobile Devices. Platform: Apache Kafka Core, Connectors (Source, Sink), Data Integration, Kafka Streams, Control Center, Clients & Developer Tools, Support, Services and Consulting. Uses: Monitoring, Alerting, Real-time Analytics, Custom Application, Transformations, Real Time Applications. Legend: Confluent Platform, Confluent Platform Enterprise, External Product.]
10. Dbvisit Replicate Connector
Plugin for the Kafka Connect framework (Confluent Platform)
• Minimal impact on source Oracle OLTP systems
• Easy to install, configure and administer
• Release Date
• Evaluation version
11. Dbvisit Replicate Connector
Plugin for the Kafka Connect framework (Confluent Platform)
• Database TABLE to Kafka TOPIC mapping
• Separate automatic Kafka metadata TOPIC (transaction information)
• Delivers row changes & existing information, plus metadata (transaction id, type (I, D, U))
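The table-to-topic mapping and the per-change metadata described on this slide can be pictured with a small sketch. Everything below is an illustration under stated assumptions, not the connector's actual format: the topic naming rule, the ChangeRecord class, and its field names are hypothetical, and only the transaction id and the change type codes I, D and U come from the slide.

```python
from dataclasses import dataclass

# Change type codes named on the slide: I, D, U.
CHANGE_TYPES = {"I": "insert", "D": "delete", "U": "update"}


def topic_for_table(schema: str, table: str) -> str:
    """Hypothetical naming rule: one topic per table, named SCHEMA.TABLE."""
    return f"{schema}.{table}"


@dataclass(frozen=True)
class ChangeRecord:
    """Hypothetical shape of one replicated row change."""
    topic: str
    transaction_id: str
    change_type: str  # one of "I", "D", "U"
    row: dict


def apply_change(replica: dict, record: ChangeRecord, key_column: str) -> dict:
    """Apply a single row change to an in-memory copy of the table."""
    if record.change_type not in CHANGE_TYPES:
        raise ValueError(f"unknown change type: {record.change_type!r}")
    key = record.row[key_column]
    if record.change_type == "D":
        replica.pop(key, None)          # delete removes the row if present
    else:
        replica[key] = dict(record.row)  # insert and update both store the latest row
    return replica


# Usage: replay three changes for a sample table into an empty replica.
topic = topic_for_table("SCOTT", "EMP")
changes = [
    ChangeRecord(topic, "tx-1", "I", {"EMPNO": 1, "ENAME": "SMITH"}),
    ChangeRecord(topic, "tx-2", "U", {"EMPNO": 1, "ENAME": "JONES"}),
    ChangeRecord(topic, "tx-3", "I", {"EMPNO": 2, "ENAME": "ALLEN"}),
]
replica = {}
for change in changes:
    apply_change(replica, change, key_column="EMPNO")
```

A consumer written this way treats each topic as an ordered log of changes for one table, which is why a delete only needs the key and an update can simply overwrite the stored row.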
12. Dbvisit Replicate and Replicate for Kafka Connect Pricing
License Type | Price Per Database (USD)
Perpetual License | $21,000
1 Year Term License | $12,600
Monthly Rental | $2,250
• Per database pricing – minimum 1 database
• Perpetual license never expires, support is 25% per year additional
• Term and Monthly Rental includes Support & Maintenance.
• Supported Dbvisit Replicate for Kafka Connect - $5250 per year
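To compare the three license types over time, the listed prices can be turned into a simple cumulative cost calculation. This is a rough sketch for one database, assuming that perpetual support is 25% of the license price and is paid from the first year, that prices stay constant, and that usage is counted in whole years; none of these assumptions is confirmed by the slide.

```python
# Listed prices per database, in USD.
PERPETUAL_LICENSE = 21_000
TERM_LICENSE_PER_YEAR = 12_600
MONTHLY_RENTAL = 2_250

# Assumption: annual support for a perpetual license is 25% of the license price.
PERPETUAL_SUPPORT_PER_YEAR = PERPETUAL_LICENSE * 25 // 100


def perpetual_cost(years: int) -> int:
    """One-off license plus yearly support (assumed to start in year one)."""
    return PERPETUAL_LICENSE + PERPETUAL_SUPPORT_PER_YEAR * years


def term_cost(years: int) -> int:
    """Yearly term license; support and maintenance are included."""
    return TERM_LICENSE_PER_YEAR * years


def rental_cost(years: int) -> int:
    """Monthly rental for twelve months per year; support is included."""
    return MONTHLY_RENTAL * 12 * years


def first_year_perpetual_is_cheaper(max_years: int = 20):
    """Return the first whole year in which perpetual undercuts the term license."""
    for years in range(1, max_years + 1):
        if perpetual_cost(years) < term_cost(years):
            return years
    return None


for years in (1, 2, 3):
    print(years, perpetual_cost(years), term_cost(years), rental_cost(years))
```

Under these assumptions the perpetual license costs more than the term license in the first two years and becomes the cheaper option from the third year on, while the monthly rental is the most expensive choice for any full year of use.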
13. Why Dbvisit?
• 100% focused:
- Oracle
- Database replication
• Proprietary log mining technology
• Support Oracle databases back to 9.2
• Total cost of ownership
• Installed base of 1300 in 110 countries on 6 continents
• Growing, flexible, friendly
14. Dbvisit vs Others
• Oracle
- High Cost of Ownership
- Not always welcome for anything beyond their database
- Big company
• Attunity
- Not focused on Oracle
- Not focused on replication
• Striim
- Not focused on Oracle
- Not focused on replication
- Only supports more recent Oracle database versions
15. When Dbvisit?
• Oracle source databases
• Oracle expertise is important to the prospect
• Other Confluent partners are providing the intelligence and analytics tools
• Cost is important
• Keeping Oracle footprint to a minimum is important
• Vendor support reputation is important
• Flexibility is a consideration
• Medium to large enterprises
• Any area of the world.