This document summarizes a presentation on garbage collection tuning in the Java HotSpot Virtual Machine. It introduces the presenters and their backgrounds in GC and Java performance, stresses that GC tuning is an art that requires experience, and offers tuning advice for the young generation, the Parallel GC, and the Concurrent Mark-Sweep (CMS) GC. Monitoring GC performance and avoiding heap fragmentation are also covered.
GC Tuning in the HotSpot Java VM - a FISL 10 Presentation
1. Garbage Collection Tuning in the Java HotSpot™ Virtual Machine
Tony Printezis, Charlie Hunt, Ludovic Poitou
Sun Microsystems, Inc.
2. Who We Are
• Tony Printezis
> GC Group / HotSpot JVM development team
> Been working on the HotSpot JVM since 2006
> 10+ years of GC experience
• Charlie Hunt
> Java Platform Performance Engineering Group
> Works with many Sun product teams and customers
> 10+ years of Java technology performance work
• Ludovic Poitou (just the narrator)
> Directory Services Engineering, OpenDS Community guy
> 10+ years of scaling LDAP directories, now with Java
3. If you remember only one thing...
GC Tuning is an Art!
4. GC Tuning is an Art
• Unfortunately, we can't give you a flawless recipe or a flowchart that will apply to all your GC tuning scenarios
• GC tuning involves a lot of common pattern recognition
• This pattern recognition requires experience
> We have a lot of it. :-)
5. Agenda
• Introductions
• Brief GC Overview
• GC Tuning
> Tuning the young generation
> Tuning Parallel GC
> Tuning CMS
• Monitoring the GC
• Conclusions
6. GCs in the HotSpot JVM
• Three available GCs:
> Serial GC
> Parallel GC / Parallel Old GC
> Concurrent Mark-Sweep GC (CMS)
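For reference, a minimal sketch of how each collector is selected on the command line (MyApp is a hypothetical application class):
  java -XX:+UseSerialGC MyApp                              (Serial GC)
  java -XX:+UseParallelGC -XX:+UseParallelOldGC MyApp      (Parallel GC with Parallel Old GC)
  java -XX:+UseConcMarkSweepGC MyApp                       (CMS)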
7. Heap Layout (same for all GCs)
[Diagram: the heap is divided into a Young Generation, an Old Generation, and a Permanent Generation.]
8. Young Generation
[Diagram: allocation (new Object()) goes into Eden; two Survivor Spaces sit alongside it.]
9. Old Generation
[Diagram: receives promotion, i.e., survivors from the Young Generation.]
10. Permanent Generation
[Diagram: allocation happens only directly from the JVM.]
11. Agenda
• Introductions
• Brief GC Overview
• GC Tuning
> Tuning the young generation
> Tuning Parallel GC
> Tuning CMS
• Monitoring the GC
• Conclusions
12. Your Dream GC
• You would really like a GC that has
> Low GC overhead,
> Low GC pause times, and
> Good space efficiency
• Unfortunately, you'll have to pick two (any two!)
13. Heap Sizing Tuning Advice
Supersize it!
14. Heap Sizing Trade-Offs
• Generally, the larger the heap space, the better
> For both young and old generation
> Larger space: less frequent GCs, lower GC overhead, objects more likely to become garbage
> Smaller space: faster GCs (not always! see later)
• Sometimes max heap size is dictated by available memory and/or max space the JVM can address
> You have to find a good balance between young and old generation size
15. Generation Size Roles
• Young Generation Size
> Dictates frequency of minor GCs
> Dictates how many objects will be reclaimed in the young generation
– Along with tenuring threshold + survivor space size tuning
• Old Generation
> Should comfortably hold the application's steady-state live size
> Decrease the major GC frequency as much as possible
16. Two Very Important Points
• You should try to maximize the number of objects reclaimed in the young generation
> This is probably the most important piece of advice when sizing a heap and/or tuning the young generation
• Your application's memory footprint should not exceed the available physical memory
> This is probably the second most important piece of advice when sizing a heap
• The above apply to all our GCs
17. Sizing Heap Spaces
• -Xmx<size> : max heap size
> young generation + old generation
• -Xms<size> : initial heap size
> young generation + old generation
• -Xmn<size> : young generation size
• Applications with emphasis on performance tend to set -Xms and -Xmx to the same value
• When -Xms != -Xmx, heap growth or shrinking requires a Full GC
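As an illustration (sizes are hypothetical and workload-dependent), a performance-oriented invocation typically pins the heap and sizes the young generation explicitly:
  java -Xms2g -Xmx2g -Xmn512m MyApp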
18. Should -Xms == -Xmx?
• Set -Xms to what you think would be your desired heap size
> It's expensive to grow the heap
• If memory allows, set -Xmx to something larger than -Xms “just in case”
> Maybe the application is hit with more load
> Maybe the DB gets larger over time
• On most occasions, it's better to do a Full GC and grow the heap than to get an OOM and crash
19. Sizing Heap Spaces (ii)
• -XX:PermSize=<size> : permanent generation initial size
• -XX:MaxPermSize=<size> : permanent generation max size
• Applications with emphasis on performance almost always set -XX:PermSize and -XX:MaxPermSize to the same value
> Growing or shrinking the permanent generation requires a Full GC too
• Unfortunately, the permanent generation occupancy is hard to predict
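A sketch extending the previous example with a pinned permanent generation (256m is a hypothetical value):
  java -Xms2g -Xmx2g -XX:PermSize=256m -XX:MaxPermSize=256m MyApp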
20. Stop-The-World Parallel GC Threads
• The number of parallel GC threads is controlled by -XX:ParallelGCThreads=<num>
• Default value assumes only one JVM per system
• Set the parallel GC thread number according to:
> Number of JVMs deployed on the system / processor set / zone
> CPU chip architecture
– Multiple hardware threads per chip core, i.e., UltraSPARC T1 / T2
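For example, with a hypothetical four JVMs sharing a 16-way machine, each might be capped at a quarter of the CPUs:
  java -XX:+UseParallelGC -XX:ParallelGCThreads=4 MyApp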
21. Agenda
• Introductions
• Brief GC Overview
• GC Tuning
> Tuning the young generation
> Tuning Parallel GC
> Tuning CMS
• Monitoring the GC
• Conclusions
22. Young Generation Sizing
• Eden size determines
> The frequency of minor GCs
> Which objects will be reclaimed at age 0
– Newly-allocated objects in Eden start from age 0
– Their age is incremented at every minor GC
• Increasing the size of Eden will not always affect minor GC times
> Remember: minor GC times are proportional to the number of objects they copy (i.e., the live objects), not to the young generation size
23.–25. Young Object Survivor Ratio (i–iii)
[Diagram series: objects plotted on an age axis from 0 (youngest, newly allocated) to oldest; the survivor ratio determines how much young-generation space is reserved for objects aging through the survivor spaces.]
26. Sizing Heap Spaces (iii)
• -XX:NewSize=<size> : initial young generation size
• -XX:MaxNewSize=<size> : max young generation size
• -XX:NewRatio=<ratio> : young generation to old generation ratio
• Applications with emphasis on performance tend to use -Xmn to size the young generation, since it combines the use of -XX:NewSize and -XX:MaxNewSize
27. Tenuring
• -XX:TargetSurvivorRatio=<percent>, e.g., 50
> How much of the survivor space should be filled
– Typically leave extra space to deal with “spikes”
• -XX:InitialTenuringThreshold=<threshold>
• -XX:MaxTenuringThreshold=<threshold>
• -XX:+AlwaysTenure
> Never keep any objects in the survivor spaces
• -XX:SurvivorRatio=<Integer>, e.g., 6
> Eden to Survivor Size Ratio
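A combined sketch (all values hypothetical; validate them against the observed tenuring distribution, shown on the next slides):
  java -Xmn512m -XX:SurvivorRatio=6 -XX:TargetSurvivorRatio=50 -XX:MaxTenuringThreshold=8 MyApp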
28. Tenuring Threshold Trade-Offs
• Try to retain as many objects as possible in the survivor spaces, so that they can be reclaimed in the young generation
> Less promotion into the old generation
> Less frequent old GCs
• But also, try not to unnecessarily copy very long-lived objects between the survivors
> Unnecessary overhead on minor GCs
• Not always easy to find the perfect balance
> Generally: better to copy more than to promote more
29. Tenuring Distribution
• Monitor the tenuring distribution with -XX:+PrintTenuringDistribution
Desired survivor size 6684672 bytes, new threshold 8 (max 8)
- age 1: 2315488 bytes, 2315488 total
- age 2: 19528 bytes, 2335016 total
- age 3: 96 bytes, 2335112 total
- age 4: 32 bytes, 2335144 total
• Young generation seems well tuned here
> We can even decrease the survivor space size
30. Tenuring Distribution (ii)
Desired survivor size 3342336 bytes, new threshold 1 (max 6)
- age 1: 3956928 bytes, 3956928 total
• Survivor space too small!
> Increase survivor space and/or eden size
31. Tenuring Distribution (iii)
Desired survivor size 3342336 bytes, new threshold 6 (max 6)
- age 1: 2483440 bytes, 2483440 total
- age 2: 501240 bytes, 2984680 total
- age 3: 50016 bytes, 3034696 total
- age 4: 49088 bytes, 3083784 total
- age 5: 48616 bytes, 3132400 total
- age 6: 50128 bytes, 3182528 total
• Might be able to do better
> Either increase max tenuring threshold
> Or even set max tenuring threshold to 2
– If ages > 6 still have around 50K of surviving bytes
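For the distribution above, the second option would be expressed as follows (a sketch, assuming the roughly 50K of long-lived bytes is cheaper to promote once than to keep copying between survivors):
  java -XX:MaxTenuringThreshold=2 MyApp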
32. Agenda
• Introductions
• Brief GC Overview
• GC Tuning
> Tuning the young generation
> Tuning Parallel GC
> Tuning CMS
• Monitoring the GC
• Conclusions
33. Parallel GC Ergonomics
• The Parallel GC has ergonomics
> i.e., auto-tuning
• Ergonomics help in improving out-of-the-box GC performance
• To get maximum performance, most customers we know do manual tuning
34. Parallel GC Tuning Advice
• Tune the young generation as described so far
• Try to avoid / decrease the frequency of major GCs
• We know of customers who use the Parallel GC in low-pause environments
> Avoid Full GCs by avoiding / minimizing promotion
> Maximize heap size
35. NUMA
• Non-Uniform Memory Access
> Applicable to most SPARC, Opteron, and more recently Intel platforms
• -XX:+UseNUMA
• Splits the young generation into partitions
> Each partition “belongs” to a CPU
• Allocates new objects into the partition that belongs to the allocating CPU
• Big win for some applications
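A sketch of enabling it (the NUMA-aware allocator applies to the throughput collector):
  java -XX:+UseParallelGC -XX:+UseNUMA MyApp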
36. Agenda
• Introductions
• Brief GC Overview
• GC Tuning
> Tuning the young generation
> Tuning Parallel GC
> Tuning CMS
• Monitoring the GC
• Conclusions
37. CMS Tuning Advice
• Tune the young generation as described so far
• Need to be even more careful about avoiding premature promotion
> Originally we were using an +AlwaysTenure policy
> We have since changed our mind :-)
• Promotion in CMS is expensive (free lists)
• The more often promotion / reclamation happens, the more likely fragmentation will settle in the heap
38. CMS Tuning Advice (ii)
• We know customers who tune their applications to do mostly minor GCs, even with CMS
> CMS is used as a “safety net” when application load exceeds what they have provisioned for
> Schedule Full GCs at non-critical times (say, late at night) to “tidy up” the heap and minimize fragmentation
39. Fragmentation
• Two types
> External fragmentation
– No free chunk is large enough to satisfy an allocation
> Internal fragmentation
– Allocator rounds up allocation requests
– Free space wasted due to this rounding up
40. Fragmentation (ii)
• The bad news: you can never eliminate it!
> It has been proven
• The good news: you can decrease its likelihood
> Decrease promotion into the CMS old generation
> Be careful when coding
– Large objects of various sizes are the main cause
41. Concurrent CMS GC Threads
• The number of parallel CMS threads is controlled by -XX:ParallelCMSThreads=<num>
> Available in post-6 JVMs
• Trade-off
> CMS cycle duration vs.
> Concurrent overhead during a CMS cycle
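A sketch with a hypothetical thread count:
  java -XX:+UseConcMarkSweepGC -XX:ParallelCMSThreads=2 MyApp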
42. Permanent Generation and CMS
• To date, classes will not be unloaded by default from the permanent generation when using CMS
> Both -XX:+CMSClassUnloadingEnabled and -XX:+CMSPermGenSweepingEnabled need to be set to enable class unloading in CMS
> The 2nd switch is not needed in post-6u4 JVMs
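A sketch for a pre-6u4 JVM (on later JVMs only the first of the two unloading flags should be needed):
  java -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled MyApp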
43. Setting CMS Initiating Threshold
• Again, a tricky trade-off!
• Starting a CMS cycle too early
> Frequent CMS cycles
> High concurrent overhead
• Starting a CMS cycle too late
> Chance of an evacuation failure / Full GC
• Initiating heap occupancy should be (much) higher than the application's steady-state live size
• Otherwise, CMS will constantly do CMS cycles
44. Common CMS Scenarios
• Applications that promote non-trivial amounts of objects to the old generation
> Old generation grows at a non-trivial rate
> Very frequent CMS cycles
> CMS cycles need to start relatively early
• Applications that promote very few or even no objects to the old generation
> Old generation grows very slowly, if at all
> Very infrequent CMS cycles
> CMS cycles can start quite late
45. Initiating CMS Cycles
• CMS will try to automatically find the best initiating occupancy
> It first does a CMS cycle early to collect stats
> Then, it tries to start cycles as late as possible, but early enough not to run out of heap before the cycle completes
> It keeps collecting stats and adjusting when to start cycles
> Sometimes, the second cycle starts too late
46. Initiating CMS Cycles (ii)
• -XX:CMSInitiatingOccupancyFraction=<percent>
> Occupancy percentage of the CMS old generation that triggers a CMS cycle
• -XX:+UseCMSInitiatingOccupancyOnly
> Don't use the ergonomic initiating occupancy
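A sketch that fixes the trigger at a hypothetical 70% old generation occupancy and disables the ergonomic trigger:
  java -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly MyApp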
47. Initiating CMS Cycles (iii)
• -XX:CMSInitiatingPermOccupancyFraction=<percent>
> Occupancy percentage of the permanent generation that triggers a CMS cycle
> Class unloading must be enabled
50. CMS Cycle Initiation Example (iii)
• Cycle started too late:
[ParNew 742993K->648506K(773376K), 0.1688876 secs]
[ParNew 753466K->659042K(773376K), 0.1695921 secs]
[CMS-initial-mark 661142K(773376K), 0.0861029 secs]
[Full GC 645986K->234335K(655360K), 8.9112629 secs]
[ParNew 339295K->247490K(773376K), 0.0230993 secs]
[ParNew 352450K->259959K(773376K), 0.1933945 secs]
51. Start CMS Cycles Explicitly
• If relying on explicit GCs and you want them to be concurrent, use:
> -XX:+ExplicitGCInvokesConcurrent
– Requires a post-6 JVM
> -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses
– Requires a post-6u4 JVM
• Useful when wanting to cause references / finalizers to be processed
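A sketch: with the flag below, an application's own System.gc() call triggers a concurrent CMS cycle instead of a stop-the-world Full GC:
  java -XX:+UseConcMarkSweepGC -XX:+ExplicitGCInvokesConcurrent MyApp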
52. Agenda
• Introductions
• Brief GC Overview
• GC Tuning
> Tuning the young generation
> Tuning Parallel GC
> Tuning CMS
• Monitoring the GC
• Conclusions
53. Monitoring the GC
• Online
> VisualVM: http://visualvm.dev.java.net/
> VisualGC:
– http://java.sun.com/performance/jvmstat/
– VisualGC is also available as a VisualVM plug-in
– Can monitor multiple JVMs within the same tool
• Offline
> GC Logging
> PrintGCStats
> GChisto
54. GC Logging in Production
• Don't be afraid to enable GC logging in production
> Very helpful when diagnosing production issues
• Extremely low / non-existent overhead
> Maybe some large files in your file system. :-)
> We are surprised that customers are still afraid to enable it
• Real customer quote:
> “If someone doesn't enable GC logging in production, I shoot them!”
55. Important GC Logging Parameters
• You need at least:
> -XX:+PrintGCTimeStamps
– Add -XX:+PrintGCDateStamps if you must
> -XX:+PrintGCDetails
– Preferred over -verbose:gc as it's more detailed
• Also useful:
> -Xloggc:<file>
> Separates GC logging output from application output
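Putting the recommended flags together (the log file name is hypothetical):
  java -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log MyApp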
56. PrintGCStats
• Summarizes GC logs
• Downloadable script from
> http://java.sun.com/developer/technicalArticles/Programming/turbo/PrintGCStats.zip
• Usage
> PrintGCStats -v cpus=<num> <gc log file>
– Where <num> is the number of CPUs on the machine where the GC log was obtained
• It might not work with some of the printing flags
57. PrintGCStats Parallel GC
what count total mean max stddev
gen0t(s) 193 11.470 0.05943 0.687 0.0633
gen1t(s) 1 7.350 7.34973 7.350 0.0000
GC(s) 194 18.819 0.09701 7.350 0.5272
alloc(MB) 193 11244.609 58.26222 100.875 18.8519
promo(MB) 193 807.236 4.18257 96.426 9.9291
used0(MB) 193 16018.930 82.99964 114.375 17.4899
used1(MB) 1 635.896 635.89648 635.896 0.0000
used(MB) 194 91802.213 473.20728 736.490 87.8376
commit0(MB) 193 17854.188 92.50874 114.500 9.8209
commit1(MB) 193 123520.000 640.00000 640.000 0.0000
commit(MB) 193 141374.188 732.50874 754.500 9.8209
alloc/elapsed_time = 11244.609 MB / 77.237 s = 145.586 MB/s
alloc/tot_cpu_time = 11244.609 MB / 1235.792 s = 9.099 MB/s
alloc/mut_cpu_time = 11244.609 MB / 934.682 s = 12.030 MB/s
promo/elapsed_time = 807.236 MB / 77.237 s = 10.451 MB/s
promo/gc0_time = 807.236 MB / 11.470 s = 70.380 MB/s
gc_seq_load = 301.110 s / 1235.792 s = 24.366%
gc_conc_load = 0.000 s / 1235.792 s = 0.000%
gc_tot_load = 301.110 s / 1235.792 s = 24.366%
59. GChisto
• Graphical GC log visualizer
• Under development
> Currently, can only show pause times
• Open source at
> http://gchisto.dev.java.net/
• It might not work with some of the printing flags
60. GCHisto (ii)
[Screenshot: GCHisto pause-time view.]
62. Agenda
• Introductions
• Brief GC Overview
• GC Tuning
> Tuning the young generation
> Tuning Parallel GC
> Tuning CMS
• Monitoring the GC
• Conclusions
63. Conclusions
• Remember: GC tuning is an Art
• The talk contained
> Basic GC tuning concepts
> How to monitor GCs
> What to look out for
> Examples of good tuning practices
• ...and practice makes perfect!
64. Garbage Collection Tuning in the Java HotSpot™ Virtual Machine
Tony Printezis, Charlie Hunt
Antonios.Printezis@sun.com
Charlie.Hunt@sun.com