This document summarizes a presentation about Kafka multi-tenancy at LINE Corporation. The presentation discusses how LINE runs a single shared Kafka cluster that handles 160 billion messages per day from many independent services. It describes the broker hardware, the requirements for multi-tenancy such as protecting the cluster against abusive workloads and providing isolation among clients, and specific issues identified, such as slow response times caused by blocking disk reads in the network threads. It then covers the solutions implemented: request quotas, per-client metrics, a slowlog, and pre-loading data into the page cache to avoid blocking. The presentation concludes that, after addressing these issues, the shared Kafka hosting model works efficiently while maintaining a single data hub.
Kafka Multi-Tenancy - 160 Billion Daily Messages on One Shared Cluster at LINE
1. Kafka Multi-Tenancy - 160 Billion Daily Messages on One Shared Cluster at LINE
Yuto Kawamura - LINE Corporation
2. Speaker introduction
Yuto Kawamura
Senior Software Engineer at LINE
Leading a team that provides a company-wide Kafka platform
Apache Kafka Contributor
Speaker at Kafka Summit SF 2017 [1]
[1] https://kafka-summit.org/sessions/single-data-hub-services-feed-100-billion-messages-per-day/
3. LINE
Messaging service
164 million active users in countries with top market share like Japan, Taiwan, Thailand and Indonesia [2]
And many other services:
- News
- Bitbox/Bitmax - Cryptocurrency trading
- LINE Pay - Digital payment
[2] As of June 2018.
4. Kafka platform at LINE
Two main usages:
— "Data Hub" for distributing data to other services
— e.g., user relationship update events from the messaging service
— As a task queue for buffering and processing business logic asynchronously
5. Kafka platform at LINE
A single cluster is shared by many independent services for:
- the concept of a Data Hub
- efficiency of management/operation
Messaging, AD, News, Blockchain and more: all of their data is stored and distributed on a single Kafka cluster.
6. From department-wide to company-wide platform
It started just for the messaging service. Now everyone uses it.
7. Broker installation
CPU: Intel(R) Xeon(R) 2.20GHz x 20 cores (HT) * 2
Memory: 256GiB
- more memory, more caching (page cache)
- newly written data can survive in page cache only ~20 minutes ...
Network: 10Gbps
Disk: HDD x 12, RAID 1+0
- saves maintenance costs
Kafka version: 0.10.2.1 ~ 0.11.1.2
8. Requirements for doing multi-tenancy
The cluster can protect itself against abusive workloads
- An accidental workload doesn't propagate to other users.
We can track which client is sending requests
- Find the source of strange requests.
Certain level of isolation among client workloads
- A slow response for one client doesn't appear to another client.
9. Protect cluster against abusive workload - Request Quota
It is more important to manage the number of requests than the incoming/outgoing byte rate.
Kafka is amazingly durable under large data volumes as long as they are well-batched.
=> Producers configured with linger.ms=0 on a large number of servers likely produce a large number of requests
Starting from 0.11.0.0, KIP-124 lets us configure a request rate quota [3] (a configuration sketch follows below)
[3] https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+Request+rate+quotas
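
As a sketch of how such a quota could be applied on a broker of this era (ZooKeeper-managed configs; the host name and the 200% value are illustrative assumptions, not values from the talk):

# Hypothetical example: apply a default request-time quota of 200%
# (i.e. at most two full network/IO threads' worth of time) to every
# client-id, as a least protection against a single abusive client.
bin/kafka-configs.sh --zookeeper zk1:2181 --alter \
  --add-config 'request_percentage=200' \
  --entity-type clients --entity-default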
10. Protect cluster against abusive workload - Request Quota
The basic idea is to apply a default quota as a least protection, preventing a single abusive client from destabilizing the cluster.
*Not for controlling the resource quantity of each client.
11. Track requests from clients - Metrics
— kafka.server:type=Request,user=([-.\w]+),client-id=([-.\w]+):request-time
— Percentage of time spent in broker network and I/O threads to process requests from each client group.
— Useful to see how much of the broker's resources is being consumed by each client (a polling sketch follows below).
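
A minimal sketch of polling these per-client metrics over JMX (the broker host, JMX port and the standalone object are assumptions for illustration, not from the talk):

import javax.management.ObjectName
import javax.management.remote.{JMXConnectorFactory, JMXServiceURL}

object RequestTimeDump {
  def main(args: Array[String]): Unit = {
    // Hypothetical broker JMX endpoint; adjust host/port to your deployment.
    val url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://broker1:9999/jmxrmi")
    val connector = JMXConnectorFactory.connect(url)
    try {
      val mbsc = connector.getMBeanServerConnection
      // Match the per-user/per-client-id Request metrics named above.
      val pattern = new ObjectName("kafka.server:type=Request,user=*,client-id=*")
      val names = mbsc.queryNames(pattern, null).iterator()
      while (names.hasNext) {
        val name = names.next()
        // "request-time" is the percentage of network + I/O thread time.
        println(s"$name request-time=" + mbsc.getAttribute(name, "request-time"))
      }
    } finally connector.close()
  }
}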
12. Track requests from clients - Slowlog
Log requests which took longer than a certain threshold to process.
- Kafka has "request logging" but it produces too many lines
- Inspired by HBase's slow query log
Thresholds can be changed dynamically through the JMX console for each request type (a sketch follows below).
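
A minimal sketch of the idea, with simplified stand-ins for request handling (the names and println logging are hypothetical, not LINE's actual patch):

// Hypothetical slowlog wrapper. The threshold is volatile so that a JMX
// MBean exposing it can change the value at runtime, per request type.
object Slowlog {
  @volatile var thresholdMs: Long = 500L

  def timed[A](requestName: String)(handle: => A): A = {
    val start = System.nanoTime()
    try handle
    finally {
      val tookMs = (System.nanoTime() - start) / 1000000
      if (tookMs >= thresholdMs)
        println(s"[slowlog] $requestName took ${tookMs}ms (threshold=${thresholdMs}ms)")
    }
  }
}

// Usage: Slowlog.timed("Fetch") { /* process the request */ }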
21. Network thread runs an event loop
— Multiplexes and processes assigned client sockets sequentially.
— It never blocks awaiting IO completion.
=> So it makes sense to set num.network.threads <= CPU_CORES (see the configuration sketch below)
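
As a configuration sketch (the value is illustrative, assuming a machine like the 40-hardware-thread brokers described earlier):

# server.properties: never more network threads than cores, since each
# thread runs a non-blocking event loop and should stay CPU-bound.
num.network.threads=16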
22. When network threads get busy...
It means one of two things:
1. Really busy doing lots of work: many requests/responses to read/write
2. Blocked by some operation (which, in general, should never happen in an event loop)
23. Response handling of normal requests
When a response is in the queue, all data to be transferred is in memory.
24. Exceptional handling for Fetch response
When the response is in the queue, topic data segments are not in userspace memory.
=> Copied to the client socket directly inside the kernel using the sendfile(2) system call.
25. What if target data doesn't exist in page cache?
Target data in page cache:
=> Just a memory copy. Very fast: ~ 100us
Target data is NOT in page cache:
=> Needs to load data from disk into page cache first. Can be slow: ~ 50ms (or even slower)
26. Suspecting blocking in sendfile(2)
Inspected the duration of sendfile system calls issued by the broker process using SystemTap (a dynamic tracing tool that probes events in the kernel; see my previous talk [4]). A sketch of such a script follows below.
$ stap -e '(script counting sendfile(2) duration histogram)'
# value (us)
value |---------------------------------------- count
    0 |                                             0
    1 |                                            71
    2 |@@@                                       6171
   16 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  29472
   32 |@@@                                       3418
 2048 |                                             0
  ...
 8192 |                                             3
[4] https://www.confluent.io/kafka-summit-sf17/One-Day-One-Data-Hub-100-Billion-Messages-Kafka-at-LINE
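
The actual script is elided on the slide; a minimal SystemTap sketch of this kind of measurement might look as follows (the probe names and target-PID handling are generic assumptions, not the script used in the talk):

# sendfile_hist.stp -- run as: stap -x <broker_pid> sendfile_hist.stp
# Records the latency of each sendfile(2) call made by the target
# process and prints a log2 histogram on exit.
global start, durations

probe syscall.sendfile {
  if (pid() == target())
    start[tid()] = gettimeofday_us()
}

probe syscall.sendfile.return {
  if (pid() == target() && start[tid()]) {
    durations <<< gettimeofday_us() - start[tid()]
    delete start[tid()]
  }
}

probe end {
  print(@hist_log(durations))
}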
27. Problem hypothesis
A Fetch request reading old data causes a blocking sendfile(2) in the event loop, adding latency to all responses that must be processed by the same network thread.
28. Problem hypothesis
Super harmful because:
It can be triggered either by:
- Consumers attempting to fetch old data
- Replica fetch by follower brokers restoring replicas of old logs
=> Both are very common scenarios
It breaks performance isolation among independent clients.
29. Solution candidates
A: Separate network threads among clients
=> Possible, but a lot of changes required
=> Not essential, because network threads should be completely computation intensive
B: Balance connections among network threads
=> Possible, but again a lot of changes
=> Still, at the first moment other connections would get affected
30. Solution candidates
C: Make sure the data is ready in memory before the response is passed to the network thread
=> The event loop never blocks
31. Choice: Warm up page cache before the network thread
Move the blocking part to the request handler threads (= a single queue and a pool of threads)
=> A free thread can take an arbitrary task (request) while some threads are blocked.
32. Choice: Warm up page cache before the network thread
When the network thread calls sendfile(2) to transfer log data, the data is always in page cache.
33. Warming up page cache with minimal overhead
Easiest way: do a synchronous read(2) on the target data
=> Large overhead from copying memory from kernel to userland
Why does Kafka use sendfile(2) for transferring topic data?
=> To avoid an expensive large memory copy
How can we achieve warmup while keeping this property?
34. Trick #1: Zero-copy synchronous page load
Call sendfile(2) for the target data with /dev/null as the destination.
The /dev/null driver does not actually copy data anywhere (a sketch follows below).
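
The patch on slide 38 calls FileRecords.prepareForRead(); a minimal sketch of what such a warmup might do (the signature and body are assumptions based on the trick described, not the exact LINE patch):

// Hypothetical sketch: warm the page cache by sendfile(2)-ing the
// segment's bytes into /dev/null, which loads the pages from disk
// without copying anything into userspace.
import java.io.RandomAccessFile
import java.nio.channels.FileChannel

def prepareForRead(channel: FileChannel, start: Long, end: Long): Unit = {
  val size = math.min(channel.size, end) - start
  val devNull = new RandomAccessFile("/dev/null", "rw")
  try {
    // FileChannel.transferTo is implemented with sendfile(2) on Linux,
    // so this triggers the zero-copy page-cache load described above.
    channel.transferTo(start, size, devNull.getChannel)
  } finally {
    devNull.close()
  }
}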
35. Why does it have almost no overhead?
The Linux kernel internally uses splice to implement sendfile(2).
The splice implementation of /dev/null returns without iterating over the target data.
# ./drivers/char/mem.c
static const struct file_operations null_fops = {
    ...
    .splice_write = splice_write_null,
};

static int pipe_to_null(...)
{
    return sd->len;
}

static ssize_t splice_write_null(...)
{
    return splice_from_pipe(pipe, out, ppos, len, flags, pipe_to_null);
}
37. Trick #2: Skip the "hot" last log segment
Another concern: additional syscalls x Fetch request count?
- Warming up is necessary only for older data.
- Exclude the last log segment from the warmup target.
38. Trick #2: Skip the "hot" last log segment
# Log.scala#read
@@ -585,6 +586,17 @@ class Log(@volatile var dir: File,
       if(fetchInfo == null) {
         entry = segments.higherEntry(entry.getKey)
       } else {
+        // For last entries we assume that it is hot enough to still have all data in page cache.
+        // Most of fetch requests are fetching from the tail of the log, so this optimization
+        // should save call of sendfile significantly.
+        if (!isLastEntry && fetchInfo.records.isInstanceOf[FileRecords]) {
+          try {
+            info("Prepare Read for " + fetchInfo.records.asInstanceOf[FileRecords].file().getPath)
+            fetchInfo.records.asInstanceOf[FileRecords].prepareForRead()
+          } catch {
+            case e: Throwable => warn("failed to prepare cache for read", e)
+          }
+        }
         return fetchInfo
       }
39. It works
No response time degradation in unrelated requests, even when coinciding Fetch requests trigger disk reads.
40. Patch upstream?
Concern: the patch heavily assumes the underlying kernel implementation.
Still:
- The effect is tremendous.
- It fixes a very common performance degradation scenario.
Discussion at KAFKA-7504
41. Conclusion
— Discussed requirements for multi-tenancy clusters and the solutions
— Quota, Metrics, Slowlog ... and a hacky patch.
— After fixing some issues, our hosting policy is working well and efficiently, keeping:
— the concept of a single "Data Hub" and
— operational cost not proportional to the number of users/usages.
— Kafka is well designed and implemented to host many independent, different workloads.