This presentation covers real-world Pulsar architectural patterns involving distributed caching and distributed tracing. We also cover the use of Apache Ignite, Jaeger, Apache Flink, and many other technologies, as well as industry best practices.
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies? (HostedbyConfluent)
Microservices became the new black in enterprise architectures. APIs provide functions to other applications or end users. Even if your architecture uses a pattern other than microservices, like SOA (Service-Oriented Architecture) or client-server communication, APIs are used between the different applications and end users.
Apache Kafka plays a key role in modern microservice architectures to build open, scalable, flexible and decoupled real-time applications. API Management complements Kafka by providing a way to implement and govern the full life cycle of the APIs.
This session explores how event streaming with Apache Kafka and API Management (including API Gateway and Service Mesh technologies) complement and compete with each other depending on the use case and point of view of the project team. The session concludes by exploring the vision of event streaming APIs instead of RPC calls.
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A... (confluent)
Microservices, events, containers, and orchestrators are dominating our vernacular today. As operations teams adapt to support these technologies in production, cloud-native platforms like Pivotal Cloud Foundry and Kubernetes have quickly risen to serve as force multipliers of automation, productivity and value.
Apache Kafka® is providing developers with a critically important component as they build and modernize applications for cloud-native architecture.
This talk will explore:
• Why cloud-native platforms and why run Apache Kafka on Kubernetes?
• What kind of workloads are best suited for this combination?
• Tips to determine the path forward for legacy monoliths in your application portfolio
• Demo: Running Apache Kafka as a Streaming Platform on Kubernetes
Integrating Apache Kafka Into Your Environment (confluent)
Watch this talk here: https://www.confluent.io/online-talks/integrating-apache-kafka-into-your-environment-on-demand
Integrating Apache Kafka with other systems in a reliable and scalable way is a key part of an event streaming platform. This session will show you how to get streams of data into and out of Kafka with Kafka Connect and REST Proxy, maintain data formats and ensure compatibility with Schema Registry and Avro, and build real-time stream processing applications with Confluent KSQL and Kafka Streams.
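As a rough illustration of the Kafka Connect part of that workflow, the sketch below registers a file source connector through the Connect REST interface. The worker address, file path, and topic name are assumptions for the example; the session itself may use different connectors.

```python
import json
import urllib.request

# Hypothetical Connect worker address; adjust for your environment.
CONNECT_URL = "http://localhost:8083/connectors"

# FileStreamSource ships with Apache Kafka and is handy for demos:
# it tails a file and produces each line to a topic.
connector = {
    "name": "demo-file-source",
    "config": {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",
        "file": "/tmp/demo-input.txt",
        "topic": "demo-lines",
    },
}

req = urllib.request.Request(
    CONNECT_URL,
    data=json.dumps(connector).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())
```

The same REST endpoint lists and deletes connectors, which is why Connect deployments are easy to automate.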
This session is part 4 of 4 in our Fundamentals for Apache Kafka series.
Neo4j Graph Streaming Services with Apache Kafka (jexp)
In this presentation we give a high-level overview of the Neo4j-Kafka integration and the Confluent partnership.
Providing change-data-capture and ingestion capabilities as a Neo4j extension and as the Kafka Connect Neo4j Sink on Confluent Hub allows you to integrate real-time streaming with graph querying and analytics.
Making Kafka Cloud Native | Jay Kreps, Co-Founder & CEO, Confluent (HostedbyConfluent)
A talk discussing the rise of Apache Kafka and data in motion, plus the impact of cloud-native data systems. This talk will cover how Kafka needs to evolve to keep up with the future of cloud, what this means for distributed systems engineers, and what work is being done to truly make Kafka cloud native.
MongoDB .local London 2019: Streaming Data on the Shoulders of Giants (Lisa Roth, PMP)
Life doesn't happen in batch mode, which is why application engineers and data architects need to cooperate closely to get the best out of streaming platforms like Apache Kafka and NoSQL data stores such as MongoDB. This session explores ways and means to integrate both worlds in a streaming fashion.
Can and should Apache Kafka replace a database? How long can and should I store data in Kafka? How can I query and process data in Kafka? These are common questions that come up more and more. This session explains the idea behind databases and different features like storage, queries, transactions, and processing to evaluate when Kafka is a good fit and when it is not.
The discussion includes different Kafka-native add-ons like Tiered Storage for long-term, cost-efficient storage and ksqlDB as an event streaming database. The relation and trade-offs between Kafka and other databases are explored so that they can complement each other instead of one replacing the other. This includes different options for pull- and push-based bi-directional integration.
Key takeaways:
- Kafka can store data forever in a durable and highly available manner
- Kafka has different options to query historical data
- Kafka-native add-ons like ksqlDB or Tiered Storage make Kafka more powerful than ever before to store and process data
- Kafka provides exactly-once semantics and lightweight producer transactions, but not database-style ACID transactions (see the sketch after this list)
- Kafka is not a replacement for existing databases like MySQL, MongoDB or Elasticsearch
- Kafka and other databases complement each other; the right solution has to be selected for each problem
- Different options are available for bi-directional pull and push-based integration between Kafka and databases to complement each other
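To make the exactly-once takeaway concrete, here is a minimal sketch of a transactional producer using the confluent-kafka Python client; the broker address, transactional id, and topic are assumptions. Messages in a committed transaction become visible to read_committed consumers atomically.

```python
from confluent_kafka import Producer

# Transactional producer config; broker address and topic are assumptions.
producer = Producer({
    "bootstrap.servers": "localhost:9092",
    "transactional.id": "demo-eos-producer",  # enables idempotence + transactions
})

producer.init_transactions()
producer.begin_transaction()
try:
    for i in range(10):
        producer.produce("demo-topic", key=str(i), value=f"event-{i}")
    # Either all ten messages become visible to read_committed consumers, or none do.
    producer.commit_transaction()
except Exception:
    producer.abort_transaction()
    raise
```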
Video Recording:
https://youtu.be/7KEkWbwefqQ
Blog post:
https://www.kai-waehner.de/blog/2020/03/12/can-apache-kafka-replace-database-acid-storage-transactions-sql-nosql-data-lake/
From data stream management to distributed dataflows and beyond (Vasia Kalavri)
Recent efforts by academia and open-source communities have established stream processing as a principal data analysis technology across industry. All major cloud vendors offer streaming dataflow pipelines and online analytics as managed services. Notable use cases include real-time fault detection in space networks, city traffic management, dynamic pricing for car-sharing, and anomaly detection in financial transactions. At the same time, streaming dataflow systems are increasingly being used for event-driven applications beyond analytics, such as orchestrating microservices and model serving. In the past decades, streaming technology has evolved significantly; however, emerging applications are once more challenging the design decisions of modern streaming systems. In this talk, I will discuss the evolution of stream processing and bring current trends and open problems to the attention of our community.
apidays LIVE India - REST the Events - REST APIs for Event-Driven Architecture (apidays)
apidays LIVE India 2021 - Connecting 1.3 billion digital innovators
May 20, 2021
REST the Events - REST APIs for Event-Driven Architecture
Mark Teehan, Principal Solution Engineer at Confluent APAC
Self-service Events & Decentralised Governance with AsyncAPI: A Real World Ex... (HostedbyConfluent)
Despite great advances in Kafka's SaaS offerings, it can still be challenging to create a sustainable event-driven ecosystem. Often platform engineers become de facto ‘gatekeepers’ of events & topics, yet their day job is not about data modelling or domain expertise. We've all seen the bottlenecks these unsustainable processes create.
Realising the potential of event streams requires much more than infrastructure. Beyond an event-driven mindset, it requires domain experts to lead creation of well-defined discoverable events through fit-for-purpose governance. AsyncAPI is the OpenAPI for events that can form the basis of the required self-governing, self-service eventing framework.
This session will introduce a self-governing framework using AsyncAPI and share how the Bank of New Zealand applied this framework to leverage a passionate Kafka community and embed event-driven thinking. You’ll leave with a tangible set of ideas to give your own events a bit more swagger using AsyncAPI.
IoT Sensor Analytics with Kafka, ksqlDB and TensorFlow (Kai Wähner)
Use cases and architectures for IoT projects leveraging Apache Kafka, ksqlDB, machine learning / deep learning frameworks like TensorFlow, and cloud infrastructure.
Large numbers of IoT devices lead to big data and the need for further processing and analysis. Apache Kafka is a highly scalable and distributed open source streaming platform, which can connect to MQTT and other IoT standards. Kafka ingests, stores, processes and forwards high volumes of data from thousands of IoT devices.
The rapidly expanding world of stream processing can be daunting, with new concepts such as various types of time semantics, windowed aggregates, changelogs, and programming frameworks to master. KSQL is the streaming SQL engine on top of Apache Kafka which simplifies all this and makes stream processing available to everyone without the need to write source code.
This talk shows how to leverage Kafka and KSQL in an IoT sensor analytics scenario for predictive maintenance and integration with real time monitoring systems. A live demo shows how to embed and deploy Machine Learning models - built with frameworks like TensorFlow, DeepLearning4J or H2O - into mission-critical and scalable real time applications.
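As a flavour of the KSQL described above, the following sketch submits a stream definition and a windowed aggregation to a KSQL/ksqlDB server over its REST API. The server address and stream names are invented, and the exact SQL dialect varies between KSQL and ksqlDB versions.

```python
import json
import urllib.request

# Hypothetical ksqlDB/KSQL server address; stream and topic names are made up.
KSQL_URL = "http://localhost:8088/ksql"

statements = """
CREATE STREAM sensor_readings (sensor_id VARCHAR, temperature DOUBLE)
  WITH (KAFKA_TOPIC='sensor-readings', VALUE_FORMAT='JSON');
CREATE TABLE avg_temp AS
  SELECT sensor_id, AVG(temperature) AS avg_temperature
  FROM sensor_readings
  WINDOW TUMBLING (SIZE 1 MINUTE)
  GROUP BY sensor_id;
"""

req = urllib.request.Request(
    KSQL_URL,
    data=json.dumps({"ksql": statements, "streamsProperties": {}}).encode("utf-8"),
    headers={"Content-Type": "application/vnd.ksql.v1+json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(json.dumps(json.loads(resp.read()), indent=2))
```

The windowed average per sensor is exactly the kind of aggregate a predictive-maintenance monitor would alert on.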
Cloud Native London 2019: FaaS composition using Kafka and CloudEvents (Neil Avery)
Serverless functions, or FaaS, are all the rage.
By leveraging well-established event-driven microservice design principles and applying them to serverless functions, you can build a homogeneous ecosystem to run FaaS applications. Kafka’s natural ability to store and replay events means serverless functions can not only be replayed, but they can also be used to choreograph call chains or be driven using orchestration. Kafka also means you can democratize and organize FaaS environments in a way that scales across the enterprise. Underpinning this mantra is the use of CloudEvents by the CNCF serverless working group (of which Confluent is an active member).
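For a feel of what a CloudEvents-formatted event on Kafka looks like, here is a small sketch producing a structured-mode CloudEvents 1.0 envelope; the event type, topic, and broker address are made up for the example.

```python
import json
import uuid
from datetime import datetime, timezone

from confluent_kafka import Producer

# A CloudEvents 1.0 envelope in structured JSON mode; topic and broker are assumptions.
event = {
    "specversion": "1.0",
    "id": str(uuid.uuid4()),
    "source": "/demo/payments",             # identifies the producing context
    "type": "com.example.payment.created",  # hypothetical event type
    "time": datetime.now(timezone.utc).isoformat(),
    "datacontenttype": "application/json",
    "data": {"paymentId": "p-123", "amount": 42.0},
}

producer = Producer({"bootstrap.servers": "localhost:9092"})
producer.produce("payments", value=json.dumps(event))
producer.flush()
```

Because every function in the chain sees the same envelope shape, choreography and replay become format-agnostic.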
How did we move the mountain? - Migrating 1 trillion+ messages per day across... (HostedbyConfluent)
Have you ever migrated Kafka clusters from one data center to another while staying completely transparent to client applications?
At PayPal, as part of a massive datacenter migration initiative, the Kafka team successfully moved all PayPal Kafka traffic across data centers. This initiative involved migrating 20+ Kafka clusters (1000+ broker and ZooKeeper nodes), as well as 60+ MirrorMaker groups which seamlessly handle Kafka traffic volumes as high as 1 trillion messages per day. Throughout the course of this migration, applications required no modification and encountered zero service outages, zero message loss, and zero duplicated messages. The whole migration process was fully transparent to Kafka applications.
In this session, you will learn the strategies, techniques and tools the PayPal Kafka team has utilized for managing the migration process. You will also learn the lessons and pitfalls they experienced during this exercise, as well as the secret sauce of making the migration successful.
New Features in Confluent Platform 6.0 / Apache Kafka 2.6 (Kai Wähner)
New features in Confluent Platform 6.0 / Apache Kafka 2.6, including REST Proxy and API, Tiered Storage for AWS S3 and GCP GCS, Cluster Linking (on-premise, edge, hybrid, multi-cloud), Self-Balancing Clusters, and ksqlDB.
What is Apache Kafka and What is an Event Streaming Platform? (confluent)
Speaker: Gabriel Schenker, Lead Curriculum Developer, Confluent
Streaming platforms have emerged as a popular, new trend, but what exactly is a streaming platform? Part messaging system, part Hadoop made fast, part fast ETL and scalable data integration. With Apache Kafka® at the core, event streaming platforms offer an entirely new perspective on managing the flow of data. This talk will explain what an event streaming platform such as Apache Kafka is and some of the use cases and design patterns around its use—including several examples of where it is solving real business problems. New developments in this area such as KSQL will also be discussed.
Kafka Summit London 2019 - The Art of the Event-Streaming App (Neil Avery)
Have you ever imagined what it would be like to build a massively scalable streaming application on Kafka, the challenges, the patterns and the thought process involved? How much of the application can be reused? What patterns will you discover? How does it all fit together? Depending upon your use case and business, this can mean many things. Starting out with a data pipeline is one thing, but evolving into a company-wide real-time application that is business critical and entirely dependent upon a streaming platform is a giant leap. Large-scale streaming applications are also called event streaming applications. They are classically different from other data systems; event streaming applications are viewed as a series of interconnected streams that are topologically defined using stream processors; they hold state that models your use case as events. Almost like a deconstructed real-time database.
In this talk, I step through the origins of event streaming systems, understanding how they are developed from raw events to evolve into something that can be adopted at an organizational scale. I start with event-first thinking and Domain-Driven Design to build data models that work with the fundamentals of streams, Kafka Streams, KSQL and serverless (FaaS).
Building upon this, I explain how to build common business functionality by stepping through the patterns for: scalable payment processing; running it on rails (instrumentation and monitoring); and control-flow patterns. Finally, all of these concepts are combined in a solution architecture that can be used at an enterprise scale. I will introduce enterprise patterns such as events-as-a-backbone, events as APIs, and methods for governance and self-service. You will leave the talk with an understanding of how to model events with event-first thinking, how to work towards reusable streaming patterns and, most importantly, how it all fits together at scale.
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features) (Kai Wähner)
High-level introduction to Confluent REST Proxy and Schema Registry (leveraging Apache Avro under the hood), two components of the Apache Kafka open source ecosystem. See the concepts, architecture and features.
Best Practices for Streaming IoT Data with MQTT and Apache Kafka (Kai Wähner)
Organizations today are looking to stream IoT data to Apache Kafka. However, connecting tens of thousands or even millions of devices over unreliable networks can create some architecture challenges. In this session, we will identify and demo some best practices for implementing a large scale IoT system that can stream MQTT messages to Apache Kafka.
We use HiveMQ as an open-source MQTT broker to collect data from IoT devices, ingest the data in real time into an Apache Kafka cluster for preprocessing (using Kafka Streams / KSQL), and perform model training + inference (using TensorFlow 2.0 and its TensorFlow I/O Kafka plugin).
We leverage additional enterprise components from HiveMQ and Confluent to allow easy operations, scalability and monitoring.
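A minimal sketch of the MQTT-to-Kafka bridge pattern the session demos, using paho-mqtt and confluent-kafka; broker addresses and topic names are assumptions, and in production a dedicated broker extension or Kafka Connect would typically do this job.

```python
import paho.mqtt.client as mqtt
from confluent_kafka import Producer

# Broker addresses and topic names are assumptions for the sketch.
producer = Producer({"bootstrap.servers": "localhost:9092"})

def on_message(client, userdata, msg):
    # Use the MQTT topic as the Kafka key so readings from one device
    # stay in one partition (and therefore stay ordered).
    producer.produce("iot-sensor-data", key=msg.topic, value=msg.payload)
    producer.poll(0)  # serve delivery callbacks without blocking

client = mqtt.Client()  # paho-mqtt 1.x style; 2.x requires a callback API version argument
client.on_message = on_message
client.connect("localhost", 1883)          # e.g. a HiveMQ broker
client.subscribe("sensors/+/temperature")  # '+' matches one topic level
client.loop_forever()
```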
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies? (Kai Wähner)
Microservices became the new black in enterprise architectures. APIs provide functions to other applications or end users. Even if your architecture uses a pattern other than microservices, like SOA (Service-Oriented Architecture) or client-server communication, APIs are used between the different applications and end users.
Apache Kafka plays a key role in modern microservice architectures to build open, scalable, flexible and decoupled real-time applications. API Management complements Kafka by providing a way to implement and govern the full life cycle of the APIs.
This session explores how event streaming with Apache Kafka and API Management (including API Gateway and Service Mesh technologies) complement and compete with each other depending on the use case and point of view of the project team. The session concludes by exploring the vision of event streaming APIs instead of RPC calls.
Understand how event streaming with Kafka and Confluent complements tools and frameworks such as Kong, Mulesoft, Apigee, Envoy, Istio, Linkerd, Software AG, TIBCO Mashery, IBM, Axway, etc.
A Streaming API Data Exchange provides streaming replication between business units and companies. API Management with REST/HTTP is not appropriate for streaming data.
GCP for Apache Kafka® Users: Stream Ingestion and Processing (confluent)
Watch this talk here: https://www.confluent.io/online-talks/gcp-for-apache-kafka-users-stream-ingestion-processing
In private and public clouds, stream analytics commonly means stateless processing systems organized around Apache Kafka® or a similar distributed log service. GCP took a somewhat different tack, with Cloud Pub/Sub, Dataflow, and BigQuery, distributing the responsibility for processing among ingestion, processing and database technologies.
We compare the two approaches to data integration and show how Dataflow allows you to join, transform, and deliver data streams among on-prem and cloud Apache Kafka clusters, Cloud Pub/Sub topics and a variety of databases. The session will have a mix of architectural discussions and practical code reviews of Dataflow-based pipelines.
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J... (HostedbyConfluent)
As cyber threats continuously grow in sophistication and frequency, companies need to quickly acclimate to effectively detect, respond, and protect their environments. At Intel, we’ve addressed this need by implementing a modern, scalable Cyber Intelligence Platform (CIP) based on Splunk and Apache Kafka. We believe that CIP positions us for the best defense against cyber threats well into the future.
Our CIP ingests tens of terabytes of data each day and transforms it into actionable insights through stream processing, context-smart applications, and advanced analytics techniques. Kafka serves as a massive data pipeline within the platform. It gives us the ability to operate on data in-stream, enabling us to reduce Mean Time to Detect (MTTD) and Mean Time to Respond (MTTR). Faster detection and response ultimately leads to better prevention.
In our session, we’ll discuss the details described in the IT@Intel white paper that was published in November 2020 with the same title.
Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL... (confluent)
Kafka Streams and the addition of KSQL have provided opportunities to do stateful processing of data. Sometimes, the biggest challenge is determining how you can join that data. Keying and windowing are core concepts that need to be understood in order to properly and efficiently stream data. In this presentation, Neil will utilize geospatial data to showcase non-trivial joining; particularly, but not limited to, distance comparisons. The stream processing will be written in the Kafka Streams DSL and in KSQL, with the topologies being compared. KSQL 2.0 concepts of User Defined Functions (UDFs), nested Avro structures, and the ‘insert into’ functionality of KSQL will be showcased.
The presentation will show a custom OpenSky connector for obtaining real-time aircraft data, a Streams application for processing that data, a D3 topojson application to visualize the data, and an additional KSQL implementation of the Streams application for comparison. Expect a deep dive into the Streams DSL and KSQL implementations that will provide the basis for a discussion around Apache Kafka and stream processing.
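The distance comparisons mentioned above typically reduce to a great-circle computation; here is a Python version of the haversine predicate such a join might apply (the talk's actual UDF may differ).

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km: mean Earth radius

# An aircraft position checked against a fixed point, as such a stream join might do:
print(haversine_km(51.4700, -0.4543, 51.5074, -0.1278))  # Heathrow to central London, ~23 km
```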
Benefits of Stream Processing and Apache Kafka Use Cases (confluent)
Watch this talk here: https://www.confluent.io/online-talks/benefits-of-stream-processing-and-apache-kafka-use-cases-on-demand
This talk explains how companies are using event-driven architecture to transform their business and how Apache Kafka serves as the foundation for streaming data applications.
Learn how major players in the market are using Kafka in a wide range of use cases such as microservices, IoT and edge computing, core banking and fraud detection, cyber data collection and dissemination, ESB replacement, data pipelining, ecommerce, mainframe offloading and more.
Also discussed in this talk are the differences between Apache Kafka and Confluent Platform.
This session is part 1 of 4 in our Fundamentals for Apache Kafka series.
Apache Kafka - Scalable Message Processing and More! (Guido Schmutz)
Independent of the source of data, the integration of event streams into an enterprise architecture gets more and more important in the world of sensors, social media streams and the Internet of Things. Events have to be accepted quickly and reliably, and they have to be distributed and analysed, often with many consumers or systems interested in all or part of the events. How can we make sure that all these events are accepted and forwarded in an efficient and reliable way? This is where Apache Kafka comes into play: a distributed, highly scalable message broker, built for exchanging huge amounts of messages between a source and a target.
This session will start with an introduction to Apache Kafka and present the role of Apache Kafka in a modern data / information architecture and the advantages it brings to the table. Additionally, the Kafka ecosystem will be covered, as well as the integration of Kafka in the Oracle stack, with products such as GoldenGate, Service Bus and Oracle Stream Analytics all being able to act as a Kafka consumer or producer.
Shattering The Monolith(s) (Martin Kess, Namely), Kafka Summit SF 2019 (confluent)
Namely is a late-stage startup that builds HR, Payroll and Benefits software for mid-sized businesses. Over the years, we've ended up with a number of monolithic and legacy applications covering overlapping domain concepts, which has limited our ability to deliver new and innovative features to our customers. We need a way to get our data out of the monoliths to decouple our systems and increase our velocity. We've chosen Kafka as our way to liberate our data in a reliable, scalable and maintainable way. This talk covers specific examples of successes and missteps in our move to Kafka as the backbone of our architecture. It then looks to the future: where we are trying to go, and how we plan on getting there, from both the short-term and long-term perspectives.
Key Takeaways:
- Successful and unsuccessful approaches to gradually introducing Kafka to a large organization in a way that meets the short- and long-term needs of the business.
- Successful and unsuccessful patterns for using Kafka.
- Pragmatism versus purism: building Kafka-first systems, and migrating legacy systems to Kafka with Debezium.
- Combining event-driven systems with RPC-based systems. Observability, alerting and testing.
- Actionable steps that you can take to your organization to help drive adoption.
MLOps with a Feature Store: Filling the Gap in ML Infrastructure (Data Science Milan)
A Feature Store enables machine learning (ML) features to be registered, discovered, and used as part of ML pipelines, thus making it easier to transform and validate the training data that is fed into machine learning systems. Feature stores can also enable consistent engineering of features between training and inference, but to do so, they need a common data processing platform. The first Feature Stores, developed at hyperscale AI companies such as Uber, Airbnb, and Facebook, enabled feature engineering using domain specific languages, providing abstractions tailored to the companies’ feature engineering domains. However, a general purpose Feature Store needs a general purpose feature engineering, feature selection, and feature transformation platform.
In this talk, we describe how we built a general purpose, open-source Feature Store for ML around dataframes and Apache Spark. We will demonstrate how data engineers can transform and engineer features from backend databases and data lakes, while data scientists can use PySpark to select and transform features into train/test data in a file format of choice (.tfrecords, .npy, .petastorm, etc.) on a file system of choice (S3, HDFS). Finally, we will show how the Feature Store enables end-to-end ML pipelines to be factored into feature engineering and data science stages that can each run at different cadences.
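A hedged sketch of the data-scientist side of that workflow in PySpark; the input table, column names, and output paths are invented, and Parquet stands in for formats like .tfrecords that need extra connectors.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("feature-demo").getOrCreate()

# Hypothetical raw table; in a feature store this would come from a backend
# database or data lake rather than a local file.
raw = spark.read.parquet("/data/orders.parquet")

# Engineer a couple of features, then split into train/test sets.
features = raw.select(
    "customer_id",
    F.datediff(F.current_date(), F.col("last_order_date")).alias("days_since_order"),
    (F.col("total_spend") / F.col("order_count")).alias("avg_order_value"),
)
train, test = features.randomSplit([0.8, 0.2], seed=42)
train.write.mode("overwrite").parquet("/data/train")
test.write.mode("overwrite").parquet("/data/test")
```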
Bio:
Fabio Buso is the head of engineering at Logical Clocks AB, where he leads the Feature Store development. Fabio holds a master's degree in cloud computing and services with a focus on data intensive applications, awarded by a joint program between KTH Stockholm and TU Berlin.
Topics: feature store, MLOps.
The aim of this report is to introduce developers to the world of Magento optimization, giving suggestions and practical examples of the best practices to apply.
Windows Server AppFabric Caching - What it is & when you should use it? (Robert MacLean)
This is from my Tech-Ed Africa 2010 talk. For more information see: http://www.sadev.co.za/content/teched-africa-2010-slides-scripts-and-demos-my-talks
This session looks at what AppFabric Caching is from start to deep dive.
Ch-ch-ch-ch-changes... Stitch Triggers - Andrew Morgan (MongoDB)
Intelligent apps are emerging as the next frontier in analytics and application development. Learn how to build intelligent apps on MongoDB powered by Google Cloud with TensorFlow for machine learning and DialogFlow for artificial intelligence. Get your developers and data scientists to finally work together to build applications that understand your customer, automate their tasks, and provide knowledge and decision support.
Building Continuous Application with Structured Streaming and Real-Time Data ... (Databricks)
One of the biggest challenges in data science is to build a continuous data application which delivers results rapidly and reliably. Spark Streaming offers a powerful solution for real-time data processing. However, the challenge remains how to connect it with various continuous and real-time data sources while guaranteeing the responsiveness and reliability of data applications.
In this talk, Nan and Arijit will summarize the lessons learned from serving real-time Spark-based data analytic solutions on Azure HDInsight. Their solution seamlessly integrates Spark and Azure Event Hubs, a hyper-scale telemetry ingestion service that enables users to ingress massive amounts of telemetry into the cloud and read the data from multiple applications using publish-subscribe semantics.
They'll cover three topics: bridging the gap between the data communication models of Spark and the data source, accommodating Spark to the rate control and message addressing of the data source, and the co-design of fault-tolerance mechanisms. This talk will share insights on how to build continuous data applications with Spark and expand the availability of connectors between Spark and different real-time data sources.
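As a rough sketch of such a continuous application, the snippet below reads a stream with Spark Structured Streaming via the Kafka-compatible endpoint that Event Hubs exposes; the address and topic are placeholders, and the SASL auth options Event Hubs requires in practice are omitted.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("continuous-app").getOrCreate()

# Needs the spark-sql-kafka package on the classpath; SASL options omitted.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "mynamespace.servicebus.windows.net:9093")
    .option("subscribe", "telemetry")
    .load()
)

# Count events per one-minute window; the checkpoint gives fault tolerance.
counts = (
    events.withColumn("body", F.col("value").cast("string"))
    .groupBy(F.window(F.col("timestamp"), "1 minute"))
    .count()
)
query = (
    counts.writeStream.outputMode("complete")
    .format("console")
    .option("checkpointLocation", "/tmp/checkpoints/telemetry")
    .start()
)
query.awaitTermination()
```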
10 Principles for Effective Event-Driven Microservices (Ben Stopford)
This talk includes an introduction to the Kafka ecosystem as well as event-driven microservices, culminating with 10 rules that help with the design of such systems:
1. Don’t use Kafka for shopping carts!
2. Pick Topics with Business Significance
3. Decouple publishers from subscribers
4. Use the log to regenerate state (see the sketch after this list)
5. Apply the Single Writer Principle
6. Leverage keeping datasets inside the broker
7. Prefer stream processing over maintaining historic views
8. Sometimes you need historic views. => Replicate Read Only
9. Use Schemas
10. Consider “Stream Management” Services
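Principle 4, regenerating state from the log, can be sketched with a consumer that replays a (typically compacted) topic from offset zero; the broker, topic, and group id below are assumptions.

```python
from confluent_kafka import Consumer, TopicPartition

# Rebuild an in-memory view by replaying a topic from the beginning.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "state-rebuilder",
    "enable.auto.commit": False,
    "auto.offset.reset": "earliest",
})
consumer.assign([TopicPartition("customer-state", 0, 0)])  # partition 0, offset 0

state = {}
while True:
    msg = consumer.poll(1.0)
    if msg is None:
        break  # caught up (a demo shortcut; production code watches the high-water mark)
    if msg.error():
        raise RuntimeError(msg.error())
    state[msg.key()] = msg.value()  # last write wins, like a table
print(f"rebuilt {len(state)} keys")
```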
Continuous Deployment: The Dirty Details (Mike Brittain)
Presented at ALM Summit 3 in Redmond, WA. January 2013.
Like what you've read? We're frequently hiring for a variety of engineering roles at Etsy. If you're interested, drop me a line or send me your resume: mike@etsy.com.
http://www.etsy.com/careers
Presentation on Integration Services in SQL Server 2008.
Ing. Eduardo Castro Martinez, PhD
Microsoft SQL Server MVP
http://ecastrom.blogspot.com
http://comunidadwindows.org
How to build unified Batch & Streaming Pipelines with Apache Beam and Dataflow (Daniel Zivkovic)
Apache Beam is a beautiful framework that blurs the line between Batch and Streaming, so check out this interactive tutorial by Patrick Lecuyer - Head of Specialist Customer Engineering at Google Canada. His examples run on GCP Dataflow, but what you'll learn will be portable across clouds, and distributed processing engines like Apache Flink, Apache Samza, Apache Spark, IBM Streams... regardless of where you do your Big Data processing!
The meetup recording with TOC for easy navigation is at https://youtu.be/7pUYKX40RfA.
P.S. For more interactive lectures like this, go to http://youtube.serverlesstoronto.org/ or sign up for our upcoming live events at https://www.meetup.com/Serverless-Toronto/events/
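A minimal Beam pipeline in Python showing the unified model the tutorial covers: the same transforms run in batch here, and would run over a windowed unbounded source in streaming mode.

```python
import apache_beam as beam

# A small in-memory batch source stands in for a Pub/Sub or Kafka stream;
# in streaming mode you would add e.g.
#   beam.WindowInto(beam.window.FixedWindows(60))
# before the aggregation.
with beam.Pipeline() as p:
    (
        p
        | "Create" >> beam.Create([("clicks", 1), ("views", 1), ("clicks", 1)])
        | "SumPerKey" >> beam.CombinePerKey(sum)
        | "Print" >> beam.Map(print)
    )
```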
This is an adaptation of the presentation given at the SpringOne 2008 conference in Hollywood, FL. It contains some updates on project status, and also information about the recently published book "Spring Python 1.1"
This slideshow is licensed under a Creative Commons Attribution 3.0 United States License.
10 Principles for Effective Event-Driven Microservices with Apache Kafka (Ben Stopford)
This talk includes an introduction to the Kafka ecosystem as well as event-driven microservices, culminating with 10 rules that help with the design of such systems:
1. Don’t use Kafka for shopping carts!
2. Pick Topics with Business Significance
3. Decouple publishers from subscribers
4. Use the log to regenerate state
5. Apply the Single Writer Principle
6. Leverage keeping datasets inside the broker
7. Prefer stream processing over maintaining historic views
8. Sometimes you need historic views. => Replicate Read Only
9. Use Schemas
10. Consider “Stream Management” Services
Vector Search / Generative AI introduction at Pulsar Meetup (Devin Bost)
Presentation at the Pulsar Meetup in Bangalore on the use of vector search and generative AI when integrated with streaming. It explores the mechanics of how vector embeddings work, the backdrop of retrieval-augmented generation, and how vector databases like AstraDB enable a rich generative experience for AI applications.
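Under the hood, vector search usually ranks embeddings by cosine similarity; here is a toy sketch with invented 4-dimensional vectors (real embedding models produce hundreds of dimensions, and databases like AstraDB index them rather than scanning linearly).

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity in [-1, 1]; nearer 1 means more semantically similar embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings"; a real model would produce these from text.
query = np.array([0.1, 0.8, 0.3, 0.0])
docs = {
    "doc-a": np.array([0.1, 0.7, 0.4, 0.1]),
    "doc-b": np.array([0.9, 0.1, 0.0, 0.2]),
}
best = max(docs, key=lambda name: cosine_similarity(query, docs[name]))
print(best)  # doc-a: closest in direction, hence most similar
```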
Streaming Patterns and Best Practices with Apache Pulsar for Enabling Machine... (Devin Bost)
Featuring a TON of patterns, wisdom, and analogies to help you understand how to maximize the value of your data and get increased ROI on your machine learning and data architecture.
Features Apache Pulsar, Flink, Druid, Imply, and many other technologies.
How to introduce Apache Pulsar into your organization successfully - Devin Bost
This presentation covers principles and best practices based on significant research regarding how to maximize the likelihood of success with any technological adoption or revolution in your company, especially involving Apache Pulsar.
Pulsar Architectural Patterns for CI/CD Automation and Self-Service (Devin Bost)
We examine real-world architectural patterns involving Apache Pulsar to automate the creation of function and pub/sub flows for improved operational scalability and ease of management. We’ll cover CI/CD automation patterns and reveal our innovative approach of leveraging streaming data to create a self-service platform that automates the provisioning of new users. We will also demonstrate how to create function flows through patterns and configuration, enabling non-developer users to create entire function flows simply by changing configurations. These patterns enable us to drive the automation of managing Pulsar to a whole new level. We also cover CI/CD for on-prem, GCP, and AWS users.
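To make the function-flow idea concrete, here is a hedged sketch of a Pulsar Function in Python that routes events onward based on content; the topic names are assumptions, and in the configuration-driven approach described above they would come from config rather than being hard-coded.

```python
from pulsar import Function

class RouteBySeverity(Function):
    """One stage of a function flow: route each event onward based on its
    content, so downstream wiring can change without touching producers."""

    def process(self, input, context):
        # Output topics would come from the flow configuration in practice;
        # these names are invented for the sketch.
        topic = (
            "persistent://public/default/alerts"
            if "ERROR" in input
            else "persistent://public/default/events"
        )
        context.publish(topic, input)
        return None
```

Deployed behind an input topic, a chain of such functions forms the kind of flow that configuration alone can rewire.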
Apache Pulsar - Real-time data flows drive core business processes (Devin Bost)
The most profitable companies are leveraging real-time data flows to drive core business processes. Studies show that these companies are focusing on data streams, rather than on data sets. Why is streaming so much more powerful as a business capability? We will examine this paradigm shift and introduce Apache Pulsar, the next generation streaming platform.
Devin Bost - a Senior Data Engineer at Overstock - will talk about why you should choose streaming architecture for your organization to solve data problems, how you can solve problems with this new paradigm, and how you can leverage Pulsar to drive business decisions at scale.
Sponsored by: Overstock, Recursion Pharmaceuticals, Google Cloud, Snowflake, Ternary Data, Pluralsight.
This is a presentation on natural language processing (NLP), machine learning (ML), and Big Data. I introduce neural networks, unsupervised machine learning, and a variety of natural language processing techniques, and I cover architectural best practices, ways to implement data science for practical applications, and how to deal with organizational roadblocks.
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
Designing for Privacy in Amazon Web Services (KrzysztofKkol1)
Data privacy is one of the most critical issues that businesses face. This presentation shares insights on the principles and best practices for ensuring the resilience and security of your workload.
Drawing on a real-life project from the HR industry, the various challenges will be demonstrated: data protection, self-healing, business continuity, security, and transparency of data processing. This systematized approach allowed us to create a secure AWS cloud infrastructure that not only met strict compliance rules but also exceeded the client's expectations.
Field Employee Tracking System | MiTrack App | Best Employee Tracking Solution... (informapgpstrackings)
Keep tabs on your field staff effortlessly with Informap Technology Centre LLC. Real-time tracking, task assignment, and smart features for efficient management. Request a live demo today!
For more details, visit us: https://informapuae.com/field-staff-tracking/
Globus Compute with IRI Workflows - GlobusWorld 2024 (Globus)
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work, the team is investigating ways to speed up the time to solution for many different parts of the DIII-D workflow, including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks, and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
In software engineering, the right architecture is essential for robust, scalable platforms. Wix has undergone a pivotal shift from event sourcing to a CRUD-based model for its microservices. This talk will chart the course of this pivotal journey.
Event sourcing, which records state changes as immutable events, provided robust auditing and "time travel" debugging for Wix Stores' microservices. Despite its benefits, the complexity it introduced in state management slowed development. Wix responded by adopting a simpler, unified CRUD model. This talk will explore the challenges of event sourcing and the advantages of Wix's new "CRUD on steroids" approach, which streamlines API integration and domain event management while preserving data integrity and system resilience.
Participants will gain valuable insights into Wix's strategies for ensuring atomicity in database updates and event production, as well as caching, materialization, and performance optimization techniques within a distributed system.
Join us to discover how Wix has mastered the art of balancing simplicity and extensibility, and learn how the re-adoption of the modest CRUD has turbocharged their development velocity, resilience, and scalability in a high-growth environment.
Globus Connect Server Deep Dive - GlobusWorld 2024 – Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
Developing Distributed High-performance Computing Capabilities of an Open Sci... – Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv... – Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus... – Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Paketo Buildpacks: The Best Way to Build OCI Images? DevopsDa... – Anthony Dahanne
Buildpacks have been around for more than 10 years! At first, they were used to detect and build an application before deploying it to certain PaaS platforms. Later, their latest generation, the Cloud Native Buildpacks (a CNCF incubating project), let us create Docker (OCI) images. Are they a good alternative to the Dockerfile? What are the Paketo buildpacks? Which communities support them, and how?
Come find out in this ignite session.
Advanced Flow Concepts Every Developer Should Know – Peter Caitens
Tim Combridge from Sensible Giraffe and Salesforce Ben presents some important tips that all developers should know when dealing with Flows in Salesforce.
Accelerate Enterprise Software Engineering with Platformless – WSO2
Key takeaways:
• Challenges of building platforms and the benefits of platformless.
• Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
• How Choreo enables the platformless experience.
• How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
• A demo of an end-to-end app built and deployed on Choreo.
Enhancing Research Orchestration Capabilities at ORNL – Globus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR – Tier1 app
Even though at the surface level 'java.lang.OutOfMemoryError' appears to be one single error, there are actually 9 types of OutOfMemoryError underneath. Each type has different causes, diagnosis approaches, and solutions. This session equips you with the knowledge, tools, and techniques needed to troubleshoot and conquer OutOfMemoryError in all its forms, ensuring smoother, more efficient Java applications.
1. Real-World Pulsar Architectural Patterns: Distributed Caching + Distributed Tracing
By Devin Bost, Senior Data Engineer at Overstock
Every pattern shown here has been developed and implemented with my team at Overstock.
Email: dbost@overstock.com | Twitter: DevinBost | LinkedIn: https://www.linkedin.com/in/devinbost/
14.–18. [Diagram, built up across five slides: Producers publish to an /ingest topic, a passthrough function forwards messages to a /feeds topic, and Consumers subscribe to /feeds.] It's much safer to use a distributed cache technology like Ignite: it offers smart persistence, is faster than Redis, supports tables with a backing cache, and supports transactions.
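To make the passthrough concrete, here is a minimal sketch of what such a function can look like with the Pulsar Functions API. This is an illustration, not the deck's actual code: the input (/ingest) and output (/feeds) topics are bound in the function's deployment config rather than in the class itself.

```java
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

// Minimal passthrough Pulsar Function: whatever arrives on the input topic
// is returned unchanged, which publishes it to the configured output topic.
public class PassthroughFunction implements Function<String, String> {
    @Override
    public String process(String input, Context context) {
        return input;
    }
}
```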
19. What if you have a business-critical service that can't lose messages?
25. [Diagram: Producers publish to /ingest; a passthrough function forwards to /feeds; messages are replicated to persistent storage; a Backfill Topic feeds Pulsar Functions that ask "Message delivered yet?" (triggered by a batch job or other mechanism) and send alerts to end users (e.g. Email, SMS, Twilio call, etc.).] You could add another passthrough function and topic if you want more isolation.
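On the producer side of a can't-lose-messages flow, a common approach is to block on the broker's persistence acknowledgment so a failed send can be retried or redirected to the backfill topic. A hedged sketch with the Pulsar client; the service URL and topic name are illustrative:

```java
import org.apache.pulsar.client.api.MessageId;
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;

public class SafeSend {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // illustrative URL
                .build();
        Producer<byte[]> producer = client.newProducer()
                .topic("persistent://public/default/ingest")
                .create();
        // send() blocks until the broker acknowledges persistence, so a
        // failure surfaces here and the message can be retried or rerouted
        // to a backfill topic.
        MessageId id = producer.send("payload".getBytes());
        producer.close();
        client.close();
    }
}
```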
26. How about just for ingesting data into a cache with a backfill?
28. Option 2: achieves separation of concerns and prevents QoS problems with live traffic when running a backfill. [Diagram: a Web Service publishes to /ingest; a Passthrough Function forwards to /feeds, which a Cache Sink consumes; the data is also replicated to persistent append-only storage; a Batch Engine (e.g. Spark, NiFi, etc.) reads all the data and either loads it into a backfill topic (consumed by its own Cache Sink) or starts a job.]
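A hedged sketch of what the Cache Sink can look like using Pulsar's Sink API with Ignite as the cache; the cache name, key handling, and Ignite startup are assumptions for illustration:

```java
import java.util.Map;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.pulsar.functions.api.Record;
import org.apache.pulsar.io.core.Sink;
import org.apache.pulsar.io.core.SinkContext;

// Each record from /feeds is written into an Ignite cache keyed by the
// message key. "feeds" is an illustrative cache name.
public class IgniteCacheSink implements Sink<String> {
    private Ignite ignite;
    private IgniteCache<String, String> cache;

    @Override
    public void open(Map<String, Object> config, SinkContext ctx) {
        ignite = Ignition.start(); // assumes Ignite config is on the classpath
        cache = ignite.getOrCreateCache("feeds");
    }

    @Override
    public void write(Record<String> record) {
        cache.put(record.getKey().orElse("unknown"), record.getValue());
        record.ack(); // acknowledge only after the cache write succeeds
    }

    @Override
    public void close() {
        ignite.close();
    }
}
```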
29. Option 3: [Diagram: a Passthrough Function feeds the /feeds topic with retention; a Function and Cache Sink consume live traffic; a second Function (stopped until needing to backfill) reads the same topic with an Exclusive-mode subscription (the subscription stores its position in BookKeeper automatically) and, when started, drives a Backfill Cache Sink; tiered storage offloads older data to S3 or Google Cloud.] Note: you need to ensure the BookKeeper cluster is fast enough to keep up with the brokers, or your brokers' memory will fill up. Also, this approach will only give you a single backfill run unless you have additional replication.
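One hedged way to implement the stopped-until-needed backfill consumer is a Pulsar Reader that replays the retained topic from the earliest message; the topic name is illustrative, and the cache write is elided:

```java
import org.apache.pulsar.client.api.Message;
import org.apache.pulsar.client.api.MessageId;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Reader;

public class BackfillReplay {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();
        // Start from the earliest retained message (including data offloaded
        // to tiered storage) and re-apply everything to the backfill cache sink.
        Reader<byte[]> reader = client.newReader()
                .topic("persistent://public/default/feeds")
                .startMessageId(MessageId.earliest)
                .create();
        while (reader.hasMessageAvailable()) {
            Message<byte[]> msg = reader.readNext();
            // ... write msg into the cache here ...
        }
        reader.close();
        client.close();
    }
}
```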
31. [Diagram: a Website backed by a legacy SQL DB emits raw clicks toward a Web Application emitting specific events; the flow filters to the desired clicks, extracts the relevant data, enriches it from cached data, and stores the result in the cache. (Omitting passthrough details for simplicity.)]
32. [Same diagram.] You can also emit directly to Pulsar as a producer. It's simpler if you have the ability to touch the website code.
33. [Same diagram.] However, the raw clicks flow still gets messy.
34. [Diagram: the Web Application instead emits purposeful events (Event A, Event B, Event C) to dedicated topics, which are enriched from cached data and stored in the cache.] Cleaner and better separation of concerns to have purposeful topics... easier to debug and maintain.
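Slide 32's suggestion of emitting directly from the website might look like this hedged sketch; the topic name and payload are made up for illustration, with one producer per purposeful event topic:

```java
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Schema;

public class EventEmitter {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();
        // A dedicated producer for one purposeful event type (hypothetical
        // topic name), instead of funneling everything through raw clicks.
        Producer<String> addToCart = client.newProducer(Schema.STRING)
                .topic("persistent://web/events/add-to-cart")
                .create();
        addToCart.send("{\"productId\":\"20603199\",\"qty\":1}");
        addToCart.close();
        client.close();
    }
}
```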
35. What if you're using a graph engine for more complex query logic but need that data in real-time?
36. If you don't make it synchronous, you will get race conditions when updating and querying the graph! [Diagram: a Web Application emitting specific deltas (e.g. state change, increment, etc.) feeds a Synchronous Update Function that (1) writes the change and waits for success, then (2) returns on completion; downstream, a complex graph query gets the full record.]
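A hedged sketch of the Synchronous Update Function idea: the graph write must complete before the function returns, so a downstream query can never observe the graph mid-update. GraphClient is a hypothetical stand-in for your graph engine's async driver (e.g. a Gremlin or Cypher client):

```java
import java.util.concurrent.CompletableFuture;
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

public class SynchronousUpdateFunction implements Function<String, String> {

    /** Hypothetical graph driver; applyDelta completes when the write is durable. */
    interface GraphClient {
        CompletableFuture<Void> applyDelta(String delta);
    }

    private GraphClient graph; // assume initialized from the function's config

    @Override
    public String process(String delta, Context context) throws Exception {
        graph.applyDelta(delta).get(); // (1) write change, wait for success
        return delta;                  // (2) return on completion
    }
}
```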
37. What if you need a more robust verification that deltas are in order and aren't duplicates? (e.g. financially impacting increment/decrement values or state variables)
38. Usually, it's best to separate concerns into separate functions, like this. [Diagram: a Gate Keeper Filter sits in front of the Synchronous Update Function; it checks whether the message has been seen already and asks "Duplicate or late?" Yes: drop the message. No: pass it on to the synchronous update flow ((1) write change, wait for success; (2) return on completion; get full record; complex graph query).] But, is that the right approach here?
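A hedged sketch of the Gate Keeper Filter pattern (which the next slides argue against for this case), using Pulsar Functions' built-in BookKeeper-backed state store; extracting the message ID from the payload is an assumption left as a placeholder:

```java
import java.nio.ByteBuffer;
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

public class GateKeeperFilter implements Function<String, String> {
    @Override
    public String process(String input, Context context) throws Exception {
        String messageId = extractId(input); // hypothetical: parse from payload
        if (context.getState(messageId) != null) {
            return null; // duplicate or late: returning null emits nothing
        }
        context.putState(messageId, ByteBuffer.allocate(0)); // mark as seen
        return input;
    }

    private String extractId(String payload) {
        return payload; // placeholder for real ID extraction
    }
}
```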
40. [Diagram: the same flow scaled out, with many parallel instances of the Gate Keeper Filter and the Synchronous Update Function between the Web Application (emitting specific deltas, e.g. state change, increment, etc.) and the graph ((1) write change, wait for success; (2) return on completion; get full record; complex graph query).]
41. In this case, the right approach is to consolidate your logic to leverage the transactional guarantees of your database. [Diagram: the Synchronous Update Function now (1) checks whether the delta is duplicate or outdated and, if not, writes the change and waits for success, then (2) returns on completion; a complex graph query gets the full record.]
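A hedged sketch of that consolidation in JDBC terms: the duplicate/out-of-order check and the write happen in a single statement inside one transaction, so parallel function instances cannot interleave between the check and the write. Table and column names are made up for illustration:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import javax.sql.DataSource;

public class TransactionalDeltaWriter {
    private final DataSource dataSource;

    public TransactionalDeltaWriter(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Applies the delta only if it is newer than what is already stored. */
    public boolean applyDelta(String productId, String newValue, Timestamp eventTime)
            throws SQLException {
        try (Connection conn = dataSource.getConnection()) {
            conn.setAutoCommit(false);
            try (PreparedStatement stmt = conn.prepareStatement(
                    "UPDATE product_state SET value = ?, updated_at = ? "
                  + "WHERE product_id = ? AND updated_at < ?")) {
                stmt.setString(1, newValue);
                stmt.setTimestamp(2, eventTime);
                stmt.setString(3, productId);
                stmt.setTimestamp(4, eventTime);
                int rows = stmt.executeUpdate();
                conn.commit();
                return rows > 0; // 0 rows => duplicate or outdated delta
            } catch (SQLException e) {
                conn.rollback();
                throw e; // the calling function should retry on failure
            }
        }
    }
}
```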
42. Leverage the transactional guarantees of your database! (Your function will need to retry if a transaction fails.) [Diagram: many parallel Synchronous Update Function instances between the Web Application and the graph (get full record; complex graph query).]
43. [Same diagram.] Always be mindful of how the behavior might change when function parallelism is turned up!
44. [Same diagram.] If making state changes, be sure that you get timestamps on your upstream data contract so you can verify that the messages are in order!
45. Now, what happens when you need to debug a large or complex function flow?
47.–50. [Diagram repeated across four slides: a tangled mesh of Functions, Web Applications, and Web Services.] The questions build up slide by slide: What happens if some messages seem to not be reaching their destination? What happens if a message isn't getting transformed correctly at some point, or null values are appearing? What if we can't modify the function code (since it's a multi-tenant application)? What if we can't modify the data contracts either, for the same reason?
60. You can tap ANY topic! [Diagram: TapFunctions tap each topic in the flow as messages pass through the functions (Message1, Message2, Message3 become Message1', Message2', Message3', then Message1'', Message2'', Message3'', etc.).] The TapFunction defines the correlationKey in its Pulsar config, and the CorrelationId is derived and put into the envelope produced by the TapFunction. Taps wrap each message with a header containing: CorrelationId, Tenant, Namespace, Name, timestamp, etc. For example:

{ "correlationKey": "productId",
  "correlationValue": "20603199",
  "correlationId": "productId-20603199",
  . . .
}
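A hedged sketch of what such a TapFunction could look like; the envelope field names mirror the example above, but the JSON handling via Jackson and the default correlationKey are assumptions:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

// Wraps each message in an envelope carrying the derived correlationId plus
// topic metadata from the function context, then emits it downstream.
public class TapFunction implements Function<String, String> {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    @Override
    public String process(String input, Context context) throws Exception {
        String key = (String) context.getUserConfigValue("correlationKey")
                                     .orElse("productId"); // assumed default
        JsonNode payload = MAPPER.readTree(input);
        String value = payload.path(key).asText();

        ObjectNode envelope = MAPPER.createObjectNode();
        envelope.put("correlationKey", key);
        envelope.put("correlationValue", value);
        envelope.put("correlationId", key + "-" + value);
        envelope.put("tenant", context.getTenant());
        envelope.put("namespace", context.getNamespace());
        envelope.put("name", context.getFunctionName());
        envelope.put("timestamp", System.currentTimeMillis());
        envelope.set("payload", payload);
        return MAPPER.writeValueAsString(envelope);
    }
}
```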
61. JoinerFunction: uses Flink's stateful join capability. It all happens in a keyed stream! [Diagram: messages grouped by CorrelationId, e.g. CorrelationId=productId-784 → Message1, Message1', Message1''; CorrelationId=productId-142 → Message2, Message2', Message2''; CorrelationId=productId-923 → Message3, Message3', Message3''.]
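A minimal sketch of the keyed-stream idea, assuming a hypothetical TapEnvelope POJO mirroring the tap envelope; how the stream is sourced from Pulsar, and when the joined result is flushed, are omitted:

```java
import java.util.List;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class Joiner {
    public static class TapEnvelope {
        public String correlationId;
    }

    // Keying by correlationId puts every tapped version of a message into
    // the same keyed state, where it can be joined.
    public static void join(DataStream<TapEnvelope> taps) {
        taps.keyBy(e -> e.correlationId)
            .process(new KeyedProcessFunction<String, TapEnvelope, List<TapEnvelope>>() {
                private transient ListState<TapEnvelope> seen;

                @Override
                public void open(Configuration parameters) {
                    seen = getRuntimeContext().getListState(
                            new ListStateDescriptor<>("seen", TapEnvelope.class));
                }

                @Override
                public void processElement(TapEnvelope e, Context ctx,
                        Collector<List<TapEnvelope>> out) throws Exception {
                    seen.add(e);
                    // A real join would also register a timer to emit the
                    // joined hops and clear state once the flow completes.
                }
            });
    }
}
```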
62. SamplerFunction: just a simple Pulsar filter function. It allows us to specify a rate to limit how many messages are sampled; for dev, this is set to 100% to allow all. [Diagram: CorrelationId=productId-784 → Message1, Message1', Message1''.]
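A hedged sketch of such a filter; the "sampleRate" config key is an assumption, and returning null from a Pulsar Function emits nothing:

```java
import java.util.concurrent.ThreadLocalRandom;
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

public class SamplerFunction implements Function<String, String> {
    @Override
    public String process(String input, Context context) {
        // sampleRate in [0.0, 1.0]; 1.0 (100%) in dev to allow everything.
        double rate = Double.parseDouble((String) context
                .getUserConfigValueOrDefault("sampleRate", "1.0"));
        return ThreadLocalRandom.current().nextDouble() < rate ? input : null;
    }
}
```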
63. Jaeger Sink: these Spans emit to Jaeger and can be stored in a Cassandra OR Elasticsearch backend for production. [Diagram: for CorrelationId=productId-784, each hop (Message1 → Message1' → Message1'' → Message1''') becomes a Span with a StartTimestamp and an EndTimestamp.]
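A hedged sketch of the Jaeger Sink's core step: turning one tap envelope's timestamps into a Span via the OpenTracing API. The envelope fields passed in here (name, correlationId, start/end micros) are assumptions mirroring the envelope header:

```java
import io.jaegertracing.Configuration;
import io.opentracing.Span;
import io.opentracing.Tracer;

public class JaegerEmit {
    public static void emit(String name, String correlationId,
                            long startMicros, long endMicros) {
        // Service name is illustrative; backend (Cassandra/Elasticsearch)
        // is configured on the Jaeger side, not here.
        Tracer tracer = Configuration.fromEnv("pulsar-traces").getTracer();
        Span span = tracer.buildSpan(name)
                .withTag("correlationId", correlationId)
                .withStartTimestamp(startMicros)
                .start();
        span.finish(endMicros);
    }
}
```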
65. If fields are omitted from the tap's config, we capture all that we can.
66. Another trick is to provide alternative representations of a value to make search/analytics easier downstream.
68. Real-World Pulsar Architectural Patterns: Distributed Caching + Distributed Tracing
By Devin Bost, Senior Data Engineer at Overstock
Every pattern shown here has been developed and implemented with my team at Overstock.
Email: dbost@overstock.com | Twitter: DevinBost | LinkedIn: https://www.linkedin.com/in/devinbost/