Slide 1
Java and LINUX Containers
Java and Docker, Ralf Ernst, ITSYS, April 2016
Slide 2
JVM vs. LINUX Container
• The JVM is initialized with the resources of the host (number of cores, RAM)
• See: http://code.metager.de/source/xref/openjdk/jdk8/hotspot/src/os/linux/vm/os_linux.cpp
• CPU: sysconf(_SC_NPROCESSORS_ONLN);
• RAM: sysconf(_SC_PHYS_PAGES) * sysconf(_SC_PAGESIZE)
• Available resources
• CPUs: all cores of the host, e.g. Runtime.getRuntime().availableProcessors() == 32
• RAM: maximum heap = approx. ¼ of the physical RAM of the host
• Initialization of the JVM is naive; RAM in particular has to be set explicitly by the user
• The JVM has no idea of cgroups limits when it runs in a LINUX container
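To see what the JVM believes it has been given, a minimal sketch like the following can be run once on the host and once inside a container. The class name JvmResourceView and the helper toMiB are made up for this illustration. On a JDK 8 build as described above, both runs would be expected to report the host's values.

```java
final class JvmResourceView {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Number of cores the JVM sized its internal thread pools for
        System.out.println("availableProcessors = " + rt.availableProcessors());
        // Maximum heap the JVM will try to use (default: derived from physical RAM)
        System.out.println("maxMemory (MiB)     = " + toMiB(rt.maxMemory()));
    }
}
```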
Slide 3
Java Heap: mandatory options: -Xms and -Xmx
• -Xmx (max heap) is absolutely mandatory
• A generous heap lets the JVM's defaults size Eden and Old Space adequately
• Solves most GC problems too
• RAM is cheaper than GC tuning
• -Xms (initial heap) should be the same as -Xmx
• Faster startup
• Container memory is reserved by schedulers like Kubernetes, Mesos, Swarm anyway
• Out of scope for today:
• -Xmn: size of the young generation
• -XX:SurvivorRatio: ratio between Eden and one survivor space
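A startup check along these lines could verify that both flags were actually passed and match. This is only a sketch: the class HeapFlagCheck and its methods are hypothetical names, and the comparison is purely textual, so "1g" and "1024m" would count as different.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class HeapFlagCheck {
    /** Returns the value of the last JVM argument starting with prefix, or null if absent. */
    static String flagValue(List<String> jvmArgs, String prefix) {
        String value = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                value = arg.substring(prefix.length());
            }
        }
        return value;
    }

    /** True if -Xms and -Xmx are both given with the same textual value. */
    static boolean hasFixedHeap(List<String> jvmArgs) {
        String xms = flagValue(jvmArgs, "-Xms");
        String xmx = flagValue(jvmArgs, "-Xmx");
        return xms != null && xms.equalsIgnoreCase(xmx);
    }

    public static void main(String[] args) {
        // Arguments the running JVM was started with
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("-Xms = " + flagValue(jvmArgs, "-Xms"));
        System.out.println("-Xmx = " + flagValue(jvmArgs, "-Xmx"));
        System.out.println("fixed heap: " + hasFixedHeap(jvmArgs));
    }
}
```

Started with, for example, `java -Xms1g -Xmx1g HeapFlagCheck`, the last line should report a fixed heap.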
Slide 4
Okay - Xmx is set and here we go with Java in a
Container…
Slide 5
CPU / available Cores
• During initialization the JVM sets the following parameters according to the available cores of the host system:
• Number of threads for GC
• -XX:ParallelGCThreads
• -XX:ConcGCThreads
• Initialization of ForkJoinPool: 32 cores = 32 threads
• -Djava.util.concurrent.ForkJoinPool.common.parallelism
• Number of threads for the JIT compiler
• HotSpot optimization
• Time To Safepoint (TTSP) grows if the number of parallel threads is higher than the number of cores
• Usually there is no problem if you stick to cpu-shares (soft limit in cgroups), because modern CPUs rarely run at the limit
• CPU problems of Java processes are mostly caused by frequent GCs, which in turn are caused by a heap that is too small or by Xms < Xmx
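As a rough illustration of the ForkJoinPool point, the sketch below prints what the JVM derived from the core count and then builds a pool sized from an explicit CPU budget instead. The class PoolSizing, the method poolFor and the budget of 2 are assumptions for the example, not values from the slides.

```java
import java.util.concurrent.ForkJoinPool;

final class PoolSizing {
    /** Builds a pool for an explicit CPU budget instead of the host's core count. */
    static ForkJoinPool poolFor(int cpuBudget) {
        return new ForkJoinPool(Math.max(1, cpuBudget));
    }

    public static void main(String[] args) {
        // Both values are derived from the cores the JVM sees at startup
        System.out.println("availableProcessors    = " + Runtime.getRuntime().availableProcessors());
        System.out.println("common pool parallelism = " + ForkJoinPool.getCommonPoolParallelism());

        // Hypothetical budget: the container is meant to use about 2 cores
        ForkJoinPool pool = poolFor(2);
        System.out.println("explicit pool parallelism = " + pool.getParallelism());
        pool.shutdown();
    }
}
```

The common pool itself can be capped without code changes by starting the JVM with -Djava.util.concurrent.ForkJoinPool.common.parallelism set to the desired value.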
Slide 6
Thanks for the memory – Memory of a container with
JDK8
• Container
• OS Tools
• OS FSCache / Page Cache
• JRE itself
• Heap (Eden / Old space)
• Off-heap
• Metaspace
• JIT-compiled code (Code Cache)
• AND
• JNI
• NIO
• Threads
• AND lots of Virtual Memory
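Part of this breakdown can be observed from inside the JVM. The following sketch lists the memory pools the JVM exposes through the standard java.lang.management API, grouped into heap and non-heap. The class name MemoryPoolReport is made up, and the pool names printed depend on the JVM and the garbage collector in use. JNI, NIO and thread stacks do not show up here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

final class MemoryPoolReport {
    /** Counts the memory pools of the given type (HEAP or NON_HEAP). */
    static long countPools(MemoryType type) {
        long count = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null for an invalid pool
            long usedKb = usage == null ? 0 : usage.getUsed() / 1024;
            System.out.println(pool.getType() + "  " + pool.getName() + "  used=" + usedKb + "KB");
        }
    }
}
```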
Slide 7
Memory-Management JDK7 vs. JDK8 – Meta Space
JDK7: Eden | Old space | Perm space
-Xms1g -Xmx1g -XX:MaxPermSize=512m
JDK8: Eden | Old space | Metaspace (replaces Perm space)
-Xms1g -Xmx1g
Metaspace = native memory outside of the heap for class metadata (interned strings and class statics live on the Java heap)
Slide 8
JDK8: No more PermGen Problems – Welcome
MetaSpace ;-)
• Metaspace replaces PermGen
• Design known from the IBM JVM or BEA/Oracle JRockit
• Memory is NOT inside the JVM heap but native memory of the machine
• No default limit! Metaspace is unlimited…
• Limit it with -XX:MaxMetaspaceSize=512m
• You might get an OOM kill or container swapping
• If the RAM limit of the container is too low
• Or in case of a metadata leak
• OOM of the container means: no exception, no application log, only a restart if your orchestration platform supports it
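Whether a limit is in effect can be checked at runtime. The sketch below looks for a memory pool named "Metaspace", which is the name HotSpot 8 uses (an assumption for other JVMs), and reports its maximum. The management API returns -1 when no maximum is defined. MetaspaceLimit and describeMax are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

final class MetaspaceLimit {
    /** Formats a pool maximum; the management API uses -1 for "undefined". */
    static String describeMax(long maxBytes) {
        return maxBytes < 0 ? "unlimited" : (maxBytes / (1024L * 1024L)) + " MiB";
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Pool name as used by HotSpot 8; other JVMs may name it differently
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                System.out.println("Metaspace used = " + (usage.getUsed() / 1024) + "KB");
                System.out.println("Metaspace max  = " + describeMax(usage.getMax()));
            }
        }
    }
}
```

Run without -XX:MaxMetaspaceSize, the second line would be expected to read "unlimited".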
Slide 9
NIO with native memory Buffer (direct java.nio.ByteBuffer)
Examples: Apache Cassandra, Lucene, Elasticsearch…
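A direct buffer is allocated outside the Java heap, so it counts against the container limit but not against -Xmx. The sketch below allocates one and prints the JVM's buffer pool statistics. The class DirectBufferDemo and the 16 MiB size are chosen only for illustration.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

final class DirectBufferDemo {
    /** Allocates native (off-heap) memory that is not covered by -Xmx. */
    static ByteBuffer allocateOffHeap(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = allocateOffHeap(16 * 1024 * 1024);
        System.out.println("direct = " + buffer.isDirect() + ", capacity = " + buffer.capacity());

        // The JVM tracks direct and mapped buffers in separate pools
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.println(pool.getName() + ": count=" + pool.getCount()
                    + ", used=" + (pool.getMemoryUsed() / 1024) + "KB");
        }
    }
}
```

The total amount of direct memory can be capped with -XX:MaxDirectMemorySize.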
Slide 10
Tradeoffs with Lucene (Elasticsearch - index.store.type: mmapfs / niofs) as an example
• NIOFSDirectory with heap buffer (new versions of Lucene): FSCache → copy → heap buffer (inside the JVM heap) → read → code
• slow copy, fast reads, but huge heap
• NIOFSDirectory with direct buffer (older versions of Lucene): FSCache → copy → direct buffer (outside the JVM heap) → read → code
• slow copy, slow reads, small heap
• MMapDirectory with MappedByteBuffer: FSCache mapped into virtual memory → read → code
• no copy, slow reads, small heap, default on 64-bit
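The difference between copying into a heap buffer and mapping the file can be sketched with plain java.nio, independent of Lucene. The class FileAccessModes and its methods are hypothetical and only read the first byte of a small, non-empty temporary file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class FileAccessModes {
    /** Copies data from the page cache into a buffer on the Java heap (file must be non-empty). */
    static byte firstByteViaHeapBuffer(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer heap = ByteBuffer.allocate(1);
            while (heap.hasRemaining() && channel.read(heap) != -1) {
                // keep reading until the buffer is full or the file ends
            }
            heap.flip();
            return heap.get();
        }
    }

    /** Maps the file into virtual memory; no copy into the heap is made. */
    static byte firstByteViaMmap(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return mapped.get(0);
        }
    }

    /** Writes a small temp file and reads it back both ways. */
    static boolean selfCheck() {
        try {
            Path tmp = Files.createTempFile("access-demo", ".bin");
            try {
                Files.write(tmp, new byte[] {42, 7});
                return firstByteViaHeapBuffer(tmp) == 42 && firstByteViaMmap(tmp) == 42;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("both access modes agree: " + selfCheck());
    }
}
```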
Slide 11
Container-Memory: JRE, Code Cache, Thread-Stack,
Page Cache, etc.
• JRE8: approx. 256 MB
• Code Cache for JIT in JRE8: 96 MB
• Thread stack: 1 MB per thread on 64-bit systems
• OS tools (approx. 60 MB with RHEL)
• FS cache (page cache) of the OS for file I/O
• Recommendation:
• Container RAM = 2 * heap for small stateless containers like Spring Boot and for stateful services like Cassandra, Kafka, Elasticsearch...
• I/O-intensive applications (DBs, etc.): more container RAM means more FS cache of the OS
• IBM: Thanks for the memory
• http://www.ibm.com/developerworks/library/j-nativememory-linux/index.html
• Old article on the IBM JVM, whose memory management was similar to that of JDK8
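The rule of thumb can be compared with an itemized estimate built from the approximate figures above. In this sketch the heap size of 1024 MB and the count of 176 threads are example inputs, and ContainerSizing with its methods is a made-up name. The result is a rough plausibility check, not a sizing formula.

```java
final class ContainerSizing {
    // Approximate figures taken from the list above, in MB
    static final long JRE_MB = 256;
    static final long CODE_CACHE_MB = 96;
    static final long OS_TOOLS_MB = 60;
    static final long STACK_MB_PER_THREAD = 1;

    /** Rule of thumb: container RAM = 2 * heap. */
    static long recommendedContainerMb(long heapMb) {
        return 2 * heapMb;
    }

    /** Itemized estimate without page cache, Metaspace, NIO or JNI memory. */
    static long itemizedMb(long heapMb, int threads) {
        return heapMb + JRE_MB + CODE_CACHE_MB + OS_TOOLS_MB + threads * STACK_MB_PER_THREAD;
    }

    public static void main(String[] args) {
        long heapMb = 1024;  // example: -Xms1g -Xmx1g
        int threads = 176;   // example thread count
        long rule = recommendedContainerMb(heapMb);
        long items = itemizedMb(heapMb, threads);
        System.out.println("rule of thumb: " + rule + " MB");
        System.out.println("itemized:      " + items + " MB");
        System.out.println("left for page cache and other native memory: " + (rule - items) + " MB");
    }
}
```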
Slide 12
Native Memory Tracking of a Web Application using
Tomcat8 on JDK8
sudo -u tomcat /opt/jdk1.8.0_91/bin/jcmd 22239 VM.native_memory summary
22239:
Native Memory Tracking:
Total: reserved=2894609KB, committed=1723517KB
- Java Heap (reserved=1048576KB, committed=1048576KB)
(mmap: reserved=1048576KB, committed=1048576KB)
- Class (reserved=1258404KB, committed=239444KB)
(classes #41586)
(malloc=5028KB #81169)
(mmap: reserved=1253376KB, committed=234416KB)
- Thread (reserved=184932KB, committed=184932KB)
(thread #176)
(stack: reserved=179900KB, committed=179900KB)
(malloc=580KB #885)
(arena=4453KB #352)
- Code (reserved=270879KB, committed=120795KB)
(malloc=21279KB #27527)
(mmap: reserved=249600KB, committed=99516KB)
- GC (reserved=81676KB, committed=81676KB)
(malloc=9996KB #52352)
(mmap: reserved=71680KB, committed=71680KB)
- Compiler (reserved=4901KB, committed=4901KB)
(malloc=312KB #745)
(arena=4589KB #9)
- Internal (reserved=12462KB, committed=12462KB)
(malloc=12430KB #55846)
(mmap: reserved=32KB, committed=32KB)
- Symbol (reserved=19876KB, committed=19876KB)
(malloc=16192KB #176321)
(arena=3684KB #1)
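Output in this format is only available when the JVM was started with -XX:NativeMemoryTracking=summary (or detail). To track the committed memory per category over time, the summary text can be parsed. The following is a minimal sketch; the class NmtSummary and its regular expression are assumptions based on the output format shown above and may need adjusting for other JDK versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NmtSummary {
    // Matches category lines such as "- Java Heap (reserved=1048576KB, committed=1048576KB)"
    private static final Pattern CATEGORY = Pattern.compile(
            "^-\\s+(.+?)\\s+\\(reserved=(\\d+)KB, committed=(\\d+)KB\\)$");

    /** Sums the committed KB of all category lines; nested detail lines and "Total:" are skipped. */
    static long sumCommittedKb(String summary) {
        long total = 0;
        for (String line : summary.split("\\r?\\n")) {
            Matcher m = CATEGORY.matcher(line.trim());
            if (m.matches()) {
                total += Long.parseLong(m.group(3));
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String sample =
                "Total: reserved=2894609KB, committed=1723517KB\n"
              + "- Java Heap (reserved=1048576KB, committed=1048576KB)\n"
              + "(mmap: reserved=1048576KB, committed=1048576KB)\n"
              + "- Class (reserved=1258404KB, committed=239444KB)\n";
        System.out.println("committed in listed categories: " + sumCommittedKb(sample) + "KB");
    }
}
```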
Slide 13
JDK8 and LINUX Virtual Memory
• Virtual memory keeps growing and is never released
• Issue #15020: growing reported docker container virtual memory size with java processes
• Issue #31594 docker 1.13.1: Memory Restrictions not works to Virtual Memory
• Java 8 threading uses malloc arenas (glibc > 2.10) on Metaspace → reservation: 64 MB of virtual memory per thread
• Garbage collection needs threads too → virtual memory keeps growing
• No problem, it is just virtual memory (16 exabytes…) that never really gets used?
• Here comes mlockall()
• mlockall() locks every mapped page into physical RAM, even unused address space like the one mentioned above
• Recommendation of the Elasticsearch guide on heap sizing: bootstrap.mlockall: true (means no swapping too…) → Ouch
• MALLOC_ARENA_MAX=3 limits the per-thread malloc arenas
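The gap between reserved virtual memory and resident memory can be read from /proc/self/status on LINUX. This sketch compares the VmSize and VmRSS fields for the running JVM. The class VirtualVsResident is a made-up name, and the program only prints a notice on systems without /proc.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

final class VirtualVsResident {
    /** Returns the kB value of a field such as "VmSize" or "VmRSS", or -1 if it is absent. */
    static long fieldKb(List<String> statusLines, String field) {
        for (String line : statusLines) {
            if (line.startsWith(field + ":")) {
                // Line format: "VmRSS:   123456 kB"
                String[] parts = line.substring(field.length() + 1).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            System.out.println("/proc/self/status not available on this system");
            return;
        }
        List<String> lines = Files.readAllLines(status, StandardCharsets.UTF_8);
        System.out.println("virtual  (VmSize) = " + fieldKb(lines, "VmSize") + " kB");
        System.out.println("resident (VmRSS)  = " + fieldKb(lines, "VmRSS") + " kB");
    }
}
```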
Slide 14
Takeaways – Running Java Applications in LINUX
Containers
• Careful sizing of the JVM and its container is crucial
• CPU limits under cgroups are soft limits (cpu-shares) and are reached only on hosts with unusually high load
• Usually no problem
• If these limits are reached, you usually get long response times because no new threads can be started
• Careful sizing of the JVM's memory areas (heap and off-heap) and setting appropriate limits is crucial. Make sure the container has enough memory
• JDK8 uses a significant amount of native memory outside the heap
• Container swapping can kill your host! Don't enable swapping on the host system
• I/O-intensive applications benefit from the LINUX page cache
• Monitoring virtual memory (e.g. with top) makes no sense
• Memory is a hard limit in cgroups!
• Configuration and tuning recommendations for popular services are not always appropriate for running the service in a container
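Because the memory limit is hard, it can help to log it next to the configured heap at startup. The sketch below reads the limit from the usual cgroup v1 and cgroup v2 files. The class CgroupMemoryLimit is a hypothetical name, and the paths assume the default cgroup mount point inside the container.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class CgroupMemoryLimit {
    /** Parses a limit file's content; returns -1 for "max" (cgroup v2: no limit) or unparsable input. */
    static long parseLimit(String raw) {
        String value = raw.trim();
        if (value.isEmpty() || value.equals("max")) {
            return -1;
        }
        try {
            // cgroup v1 reports "no limit" as a very large number, which is returned as-is
            return Long.parseLong(value);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) throws IOException {
        String[] candidates = {
            "/sys/fs/cgroup/memory/memory.limit_in_bytes", // cgroup v1
            "/sys/fs/cgroup/memory.max"                    // cgroup v2
        };
        long maxHeap = Runtime.getRuntime().maxMemory();
        for (String candidate : candidates) {
            Path path = Paths.get(candidate);
            if (Files.isReadable(path)) {
                String raw = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
                System.out.println("cgroup limit (bytes) = " + parseLimit(raw));
                System.out.println("JVM max heap (bytes) = " + maxHeap);
                return;
            }
        }
        System.out.println("no cgroup memory limit file found");
    }
}
```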
Slide 15
The JVM, as of today, is neither container-aware nor
container-friendly…
Enhancement Requests for JDK9 are on the way…
Graspan: A Big Data System for Big Code Analysis
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
 

Java and cgroups eng

  • 1. Java and LINUX Containers (Java and Docker, Ralf Ernst, ITSYS, April 2016)
  • 2. JVM vs. LINUX Container
      • The JVM is initialized with the resources of the host (number of cores, RAM)
          • See: http://code.metager.de/source/xref/openjdk/jdk8/hotspot/src/os/linux/vm/os_linux.cpp
          • CPU: sysconf(_SC_NPROCESSORS_ONLN);
          • RAM: sysconf(_SC_PHYS_PAGES) * sysconf(_SC_PAGESIZE)
      • Available resources
          • CPUs: all cores of the host, e.g. Runtime.getRuntime().availableProcessors() == 32
          • RAM: maximum heap = approx. ¼ of the physical RAM of the host
      • The JVM's initialization is naive; RAM in particular has to be set explicitly by the user
      • The JVM is unaware of cgroups limits when it runs in a LINUX container
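As a minimal sketch of the point above, the following program prints what the JVM itself reports. On a JVM that ignores cgroups, as described on this slide, these values reflect the host and not the container. The class name, the helper `approxDefaultMaxHeap` and the 128 GiB host size are illustrative assumptions; the ¼ factor is the approximation quoted on the slide.

```java
// Sketch: values the JVM derives at startup, independent of any cgroup limit.
class JvmResourceView {

    /** Rough default maximum heap per the slide: about 1/4 of physical RAM. */
    static long approxDefaultMaxHeap(long physicalRamBytes) {
        return physicalRamBytes / 4;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Number of cores the JVM believes it can use.
        System.out.println("availableProcessors = " + rt.availableProcessors());
        // Maximum heap the JVM will attempt to use (set by -Xmx or by its default).
        System.out.println("maxMemory (bytes)   = " + rt.maxMemory());

        // Hypothetical host with 128 GiB of RAM (assumption for illustration).
        long hostRam = 128L * 1024 * 1024 * 1024;
        System.out.println("approx. default max heap on that host = "
                + approxDefaultMaxHeap(hostRam) + " bytes");
    }
}
```

Running this once on the host and once inside a memory-limited container is a quick way to see whether a given JVM picks up the container limits.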
  • 3. Java Heap: mandatory options -Xms and -Xmx
      • -Xmx (max heap) is absolutely mandatory
          • A generous heap lets the JVM's defaults size Eden and Old Space adequately
          • This solves most GC problems too
          • RAM is cheaper than GC tuning
      • -Xms (initial heap) should be the same as -Xmx
          • Faster startup
          • Container memory is reserved by schedulers like Kubernetes, Mesos, Swarm anyway
      • Out of scope for today:
          • -Xmn: size of the young generation
          • -XX:SurvivorRatio: ratio of Eden to a single survivor space
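The rule "-Xms should equal -Xmx" can be checked from inside a running JVM. The sketch below reads the JVM's own start arguments through the standard `RuntimeMXBean` and compares the two flags. The class `HeapFlags` and its helpers are hypothetical names, and the size parser only handles the `k`, `m` and `g` suffixes, which is an assumption made to keep the example short.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Sketch: verify that the JVM was started with a fixed heap (-Xms == -Xmx).
class HeapFlags {

    /** Parses a size such as "512m", "1g" or "1048576" into bytes (k/m/g only). */
    static long parseSize(String s) {
        String v = s.trim().toLowerCase();
        char last = v.charAt(v.length() - 1);
        long mult = 1;
        if (last == 'k') mult = 1L << 10;
        else if (last == 'm') mult = 1L << 20;
        else if (last == 'g') mult = 1L << 30;
        String digits = (mult == 1) ? v : v.substring(0, v.length() - 1);
        return Long.parseLong(digits) * mult;
    }

    /** True if both -Xms and -Xmx are present and denote the same size. */
    static boolean hasFixedHeap(List<String> jvmArgs) {
        Long xms = null;
        Long xmx = null;
        for (String a : jvmArgs) {
            if (a.startsWith("-Xms")) xms = parseSize(a.substring(4));
            else if (a.startsWith("-Xmx")) xmx = parseSize(a.substring(4));
        }
        return xms != null && xmx != null && xms.equals(xmx);
    }

    public static void main(String[] args) {
        // Arguments this JVM was actually started with, e.g. [-Xms1g, -Xmx1g].
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("fixed heap (-Xms == -Xmx): " + hasFixedHeap(jvmArgs));
    }
}
```

Started as `java -Xms1g -Xmx1g HeapFlags.java`, the check should report a fixed heap; started without heap flags, it should not.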
  • 4. Okay, -Xmx is set and here we go with Java in a container…
  • 5. CPU / available cores
      • During initialization the JVM sets the following parameters according to the available cores of the host system:
          • Number of threads for GC
              • -XX:ParallelGCThreads
              • -XX:ConcGCThreads
          • Initialization of the ForkJoinPool: 32 cores = 32 threads
              • -Djava.util.concurrent.ForkJoinPool.common.parallelism
          • Number of threads for the JIT compiler
      • HotSpot optimization
          • Time To Safe Point (TTSP)
          • Relevant if the number of parallel threads is higher than the number of cores
      • Usually there is no problem if you stick to cpu-shares (a soft limit in cgroups), because modern CPUs don't run at the limit
      • CPU problems of Java processes are mostly caused by frequent GCs, which in turn are caused by a heap that is too small or by Xms < Xmx
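One way to keep thread pools in line with a container's CPU budget is to size them explicitly instead of relying on the host's core count. The sketch below is one possible approach, not something prescribed by the slides: `cappedParallelism` and the CPU limit of 2.0 are hypothetical, and in practice the limit would come from the container configuration.

```java
import java.util.concurrent.ForkJoinPool;

// Sketch: derive pool parallelism from an assumed container CPU limit
// rather than from the core count of the host.
class PoolSizing {

    /** Caps parallelism at the CPU limit, never below 1 and never above the visible cores. */
    static int cappedParallelism(int visibleCores, double cpuLimit) {
        int limit = (int) Math.ceil(cpuLimit);
        return Math.max(1, Math.min(visibleCores, limit));
    }

    public static void main(String[] args) {
        int visible = Runtime.getRuntime().availableProcessors();
        System.out.println("cores visible to the JVM    = " + visible);
        System.out.println("common pool parallelism     = "
                + ForkJoinPool.commonPool().getParallelism());

        // Hypothetical CPU limit of the container (assumption for illustration).
        double cpuLimit = 2.0;
        ForkJoinPool pool = new ForkJoinPool(cappedParallelism(visible, cpuLimit));
        try {
            System.out.println("explicit pool parallelism   = " + pool.getParallelism());
        } finally {
            pool.shutdown();
        }
    }
}
```

For the common pool, the same cap can be applied at startup with the system property named on the slide, `-Djava.util.concurrent.ForkJoinPool.common.parallelism`.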
  • 6. Thanks for the memory: memory of a container with JDK8
      • Container
          • OS tools
          • OS FS cache / page cache
          • JRE itself
          • Heap (Eden / Old space)
          • Out of heap
              • Metaspace
              • JIT-compiled code (code cache)
              • plus JNI, NIO, threads
          • And lots of virtual memory
  • 7. Memory management JDK7 vs. JDK8: Metaspace
      • JDK7: Eden | Old space | Perm space, configured with -Xms1g -Xmx1g -XX:MaxPermSize=512m
      • JDK8: Eden | Old space, configured with -Xms1g -Xmx1g; Perm space is replaced by Metaspace
      • Metaspace = native memory outside of the heap for class metadata; interned strings and class statics, formerly kept in PermGen, live in the Java heap, …
  • 8. JDK8: no more PermGen problems, welcome Metaspace ;-)
      • Metaspace replaces PermGen
      • Known design, similar to the IBM JVM or BEA/Oracle JRockit
      • Memory is NOT inside the Java heap but native memory of the machine
      • No default limit! Metaspace is unlimited…
          • Limit it with -XX:MaxMetaspaceSize=512m
      • You might get an OOM or container swapping
          • if the RAM limit of the container is too low
          • or if there is a metadata leak
      • OOM of the container means: no exception, no application log, only a restart if your orchestration platform supports it
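Whether a Metaspace limit is in effect can be read from the JVM's memory pool MXBeans. The sketch below assumes a HotSpot JVM, where the relevant pool is reported under the name "Metaspace"; other JVMs may name it differently. The class `MetaspaceCheck` and the helper `describeLimit` are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Sketch: report Metaspace usage and whether -XX:MaxMetaspaceSize is set.
class MetaspaceCheck {

    /** MemoryUsage.getMax() returns -1 when no maximum is defined. */
    static String describeLimit(long maxBytes) {
        return maxBytes < 0 ? "unlimited" : (maxBytes / (1024 * 1024)) + " MiB";
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Pool name as used by HotSpot (assumption for other JVMs).
            if (pool.getName().contains("Metaspace")) {
                MemoryUsage usage = pool.getUsage();
                System.out.println(pool.getName()
                        + ": used=" + usage.getUsed() + " bytes"
                        + ", limit=" + describeLimit(usage.getMax()));
            }
        }
    }
}
```

Without `-XX:MaxMetaspaceSize` the limit is expected to show as unlimited, which is the situation the slide warns about.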
  • 9. NIO with native memory buffers (direct java.nio.ByteBuffer). Examples: Apache Cassandra, Lucene, Elasticsearch…
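Direct buffers are allocated outside the Java heap, so they count against the container's memory but not against -Xmx. A minimal sketch of how to observe this, using the standard `BufferPoolMXBean` with the pool name "direct" as reported by HotSpot; the class name and the 1 MiB size are illustrative assumptions.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Sketch: allocate a direct buffer and read the JVM's accounting of direct memory.
class DirectBufferDemo {

    /** Bytes currently held by direct buffers, or -1 if the pool is not reported. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("direct memory before: " + directBytesUsed() + " bytes");

        // 1 MiB of native memory, not part of the Java heap.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1024 * 1024);
        buffer.putInt(0, 42); // touch the buffer so it is actually used

        System.out.println("buffer is direct:     " + buffer.isDirect());
        System.out.println("direct memory after:  " + directBytesUsed() + " bytes");
    }
}
```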
  • 10. Virtual memory tradeoffs, with Lucene (Elasticsearch: index.store.type: mmapfs / niofs) as an example
      • NIOFSDirectory with heap buffer (new versions of Lucene): data is copied from the FS cache into a heap buffer inside the JVM heap and read by the code from there. Slow copy, fast reads, but huge heap
      • NIOFSDirectory with direct buffer (older versions of Lucene): data is copied from the FS cache into a direct buffer outside the JVM heap and read by the code from there. Slow copy, slow reads, small heap
      • MMapDirectory with MappedByteBuffer: the code reads the FS cache through a mapped ByteBuffer, without a copy. No copy, slow reads, small heap, default on 64-bit
  • 11. Container memory: JRE, code cache, thread stacks, page cache, etc.
      • JRE8: approx. 256 MB
      • Code cache for JIT in JRE8: 96 MB
      • Thread stack: 1 MB per thread on 64-bit systems
      • OS tools (approx. 60 MB with RHEL)
      • FS cache (page cache) of the OS for file I/O
      • Recommendation:
          • Container RAM = 2 * heap for small stateless containers like Spring Boot and for stateful services like Cassandra, Kafka, Elasticsearch...
          • I/O-intensive applications (DBs, etc.): more container RAM means more FS cache of the OS
      • IBM: Thanks for the memory
          • http://www.ibm.com/developerworks/library/j-nativememory-linux/index.html
          • Old article for the IBM JVM, whose memory management was similar to that of JDK8
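The rule of thumb and the per-thread stack figure from this slide can be turned into a small back-of-the-envelope calculation. This is only a sketch of the arithmetic: the class name, the 1 GiB heap and the 200 threads are hypothetical inputs, while the factor 2 and the 1 MB per thread come from the slide.

```java
// Sketch: back-of-the-envelope container sizing based on the slide's figures.
class ContainerSizing {

    static final long MIB = 1024L * 1024;

    /** Slide recommendation: container RAM = 2 * heap. */
    static long recommendedContainerBytes(long heapBytes) {
        return 2 * heapBytes;
    }

    /** Slide figure: about 1 MB of stack per thread on 64-bit systems. */
    static long threadStackBytes(int threads) {
        return threads * MIB;
    }

    public static void main(String[] args) {
        long heap = 1024 * MIB; // hypothetical -Xmx1g
        int threads = 200;      // hypothetical thread count

        System.out.println("heap                  = " + heap / MIB + " MiB");
        System.out.println("thread stacks         = " + threadStackBytes(threads) / MIB + " MiB");
        System.out.println("recommended container = "
                + recommendedContainerBytes(heap) / MIB + " MiB");
    }
}
```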
  • 12. Native Memory Tracking of a web application using Tomcat8 on JDK8
      sudo -u tomcat /opt/jdk1.8.0_91/bin/jcmd 22239 VM.native_memory summary
      22239:
      Native Memory Tracking:
      Total: reserved=2894609KB, committed=1723517KB
      - Java Heap (reserved=1048576KB, committed=1048576KB)
                  (mmap: reserved=1048576KB, committed=1048576KB)
      - Class (reserved=1258404KB, committed=239444KB)
                  (classes #41586)
                  (malloc=5028KB #81169)
                  (mmap: reserved=1253376KB, committed=234416KB)
      - Thread (reserved=184932KB, committed=184932KB)
                  (thread #176)
                  (stack: reserved=179900KB, committed=179900KB)
                  (malloc=580KB #885)
                  (arena=4453KB #352)
      - Code (reserved=270879KB, committed=120795KB)
                  (malloc=21279KB #27527)
                  (mmap: reserved=249600KB, committed=99516KB)
      - GC (reserved=81676KB, committed=81676KB)
                  (malloc=9996KB #52352)
                  (mmap: reserved=71680KB, committed=71680KB)
      - Compiler (reserved=4901KB, committed=4901KB)
                  (malloc=312KB #745)
                  (arena=4589KB #9)
      - Internal (reserved=12462KB, committed=12462KB)
                  (malloc=12430KB #55846)
                  (mmap: reserved=32KB, committed=32KB)
      - Symbol (reserved=19876KB, committed=19876KB)
                  (malloc=16192KB #176321)
                  (arena=3684KB #1)
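The `Total:` line of such a summary is easy to extract programmatically, for example to compare committed native memory against a container limit. The sketch below parses the line shown on this slide; the class `NmtTotal` and the regular expression are assumptions based on that one sample, not a complete parser for the output format. Note that the JVM must be started with `-XX:NativeMemoryTracking=summary` for `jcmd ... VM.native_memory` to return data.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: pull reserved/committed totals out of an NMT summary.
class NmtTotal {

    private static final Pattern TOTAL =
            Pattern.compile("Total:\\s*reserved=(\\d+)KB,\\s*committed=(\\d+)KB");

    /** Returns {reservedKB, committedKB} from the first "Total:" line found. */
    static long[] parseTotalKb(String nmtOutput) {
        Matcher m = TOTAL.matcher(nmtOutput);
        if (!m.find()) {
            throw new IllegalArgumentException("no Total line found");
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    public static void main(String[] args) {
        // Line taken from the slide's sample output.
        String sample = "Total: reserved=2894609KB, committed=1723517KB";
        long[] total = parseTotalKb(sample);
        System.out.println("reserved  = " + total[0] + " KB");
        System.out.println("committed = " + total[1] + " KB");
        // With a 1 GiB heap (1048576 KB), the rest is native memory outside the heap.
        System.out.println("committed outside heap = " + (total[1] - 1048576) + " KB");
    }
}
```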
  • 13. JDK8 and LINUX virtual memory
      • Virtual memory keeps growing and is never released
          • Issue #15020: growing reported docker container virtual memory size with java processes
          • Issue #31594 docker 1.13.1: Memory Restrictions not works to Virtual Memory
      • Java 8 threading uses malloc arenas (glibc > 2.10) on Metaspace → reservation: 64 MB of virtual memory per thread
      • Garbage collection needs threads too → virtual memory keeps growing
      • No problem, it is just virtual memory (16 exabytes…) that never really gets used?
      • Here comes mlockall()
          • mlockall() locks every mapped page into physical RAM, even unused address space like the one mentioned above
          • Recommendation of the Elasticsearch Guide on heap sizing: bootstrap.mlockall: true (which also means no swapping…) → Ouch
      • MALLOC_ARENA_MAX=3 limits the number of per-thread malloc arenas
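The effect of `MALLOC_ARENA_MAX` can be estimated with simple arithmetic. The sketch below uses the 64 MB per arena quoted on the slide; the default cap of 8 arenas per core on 64-bit glibc is an assumption about glibc's behaviour that the slides do not state, and the class and method names are hypothetical.

```java
// Sketch: worst-case virtual memory reserved by glibc malloc arenas.
class ArenaReservation {

    /** Slide figure: each arena reserves 64 MB of virtual address space. */
    static final long ARENA_BYTES = 64L * 1024 * 1024;

    /** Virtual memory reserved if the given number of arenas is created. */
    static long reservedBytes(int arenas) {
        return arenas * ARENA_BYTES;
    }

    /** Assumed glibc default on 64-bit systems: up to 8 arenas per core. */
    static int defaultArenaMax64Bit(int cores) {
        return 8 * cores;
    }

    public static void main(String[] args) {
        int cores = 32; // host size used as the example in the slides
        long mib = 1024L * 1024;

        int defaultMax = defaultArenaMax64Bit(cores);
        System.out.println("default arena cap      = " + defaultMax);
        System.out.println("worst-case reservation = "
                + reservedBytes(defaultMax) / mib + " MiB");
        System.out.println("with MALLOC_ARENA_MAX=3 = "
                + reservedBytes(3) / mib + " MiB");
    }
}
```

This is reserved address space, not resident memory; it only becomes a real cost in combination with something like mlockall(), as described above.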
  • 14. Takeaways: running Java applications in LINUX containers
      • Careful sizing of the JVM and its container is crucial
      • CPU limits under cgroups are soft limits (cpu-shares) and are reached only on hosts with unusually high load
          • Usually no problem
          • If these limits are reached, you usually get long response times because no new threads can be started
      • Careful sizing of the JVM's memory (heap and outside the heap) and setting appropriate limits is crucial. You need to provide enough container memory
      • JDK8 uses a significant amount of native memory outside the heap
      • Container swapping can kill your host! Don't enable swapping on the host system
      • I/O-intensive applications benefit from the LINUX page cache
      • Monitoring virtual memory (e.g. with top) makes no sense
      • Memory is a hard limit in cgroups!
      • Configuration and tuning recommendations of popular services are not always appropriate for running the service in a container
  • 15. As of today, the JVM is neither container-aware nor container-friendly… Enhancement requests for JDK9 are on the way…