SlideShare a Scribd company logo
1 of 29
Download to read offline
Dealing with JVM limitations
in Apache Cassandra
Jonathan Ellis / @spyced
Pain points for Java databases

✤   GC
✤   GC
✤   GC
Pain points for Java databases

✤   GC
✤   Platform specific code
GC

✤   Concurrent and compacting: choose one
    ✤   G1
    ✤   Azul C4 / Zing?
Fragmentation

✤   Bloom filter arrays
✤   Compression offsets
Automatic mitigation?

✤   http://www.research.ibm.com/people/d/dfb/papers/Bacon03Controlling.pdf
✤   http://researcher.ibm.com/files/us-hirzel/pldi10-arraylets.pdf
Fragmentation, 2

✤   Arena allocation for memtables
(Memtables?)

    write( k1 , c1:v1 )

                                       Memory




                          Memtable




   Commit log



                                     Hard drive
write( k1 , c1:v )

                                             Memory
                      k1 c1:v




                                Memtable



     k1 c1:v




Commit log



                                           Hard drive
write( k1 , c2:v )

                                      Memory
                     k1 c1:v c2:v




    k1 c1:v
    k1 c2:v




                                    Hard drive
write( k2 ,     c1:v c2:v   )
                                                   Memory
                                k1 c1:v c2:v

                                k2   c1:v c2:v




   k1 c1:v
   k1 c2:v
 k2 c1:v c2:v




                                                 Hard drive
write( k1 ,     c1:v c3:v   )
                                                      Memory
                                k1 c1:v c2:v c3:v

                                k2   c1:v c2:v




   k1 c1:v
   k1 c2:v
 k2 c1:v c2:v
 k1 c1:v c3:v




                                                    Hard drive
Memory




          flush




                 index
cleanup    k1 c1:v c2:v c3:v

           k2   c1:v c2:v


                               SSTable




                                         Hard drive
“Java is a memory hog”

✤   Large overhead for typical objects and collections
✤   How large?
✤   java.lang.instrument.Instrumentation

    ✤   JAMM: Java Agent for Memory Measurements
    ✤   https://github.com/jbellis/jamm
org.apache.cassandra.cache.SerializingCache


✤   Live objects are about 85% JVM bookeeping
✤   org.apache.cassandra.cache.FreeableMemory   using reference
    counting
✤   Considering doing reference-counted, off-heap memtables
    as well
Don’t forget about young gen

✤   Always stop-the-world for ~100ms
Platform-specific code

✤   OS
✤   JVM
m[un]map

✤   Log-structured storage wants to remove old files post-
    compaction; some platforms disallow deleting open files
✤   Old workaround (pre-1.0):
    ✤   use PhantomReference to tell when mmap’d file is GC (hence
        unmapped)
    ✤   Poor user experience and messy corner cases
✤   New workaround:
    ✤   Class.forName("sun.nio.ch.DirectBuffer").getMethod("cleaner")
mmap part 2

✤   2GB limit via ByteBuffer:
    public abstract byte get(int index)

✤   Workaround: MmappedSegmentedFile
    public Iterator<DataInput> iterator(long position)
link

✤   Used for snapshots
✤   Old workaround: JNA
✤   New workaround: supported directly by Java7
mlockall

✤   swappiness: pissing off database developers since 2001 (?)
✤   mlockall(MCL_CURRENT)
Low-level i/o

✤   posix_fadvise
✤   mincore/fincore
✤   fctl


✤   ... JNA
A plug for JNA

✤   https://github.com/twall/jna

     static {
         try {
              Native.register("c");
       ...

     private static native int mlockall(int flags)
     throws LastErrorException;
The fallacy of choosing portability over power

✤   Applets have been dead for years
✤   Python gets it right
    ✤   import readline
The fallacy of choosing safety over power

✤   Allowing munmap would expose developers to segfaults
✤   But, relying on the GC to clean up external resources is a
    well-known antipattern
    ✤   File.close
✤   We need munmap badly enough that we resort to
    unnatural and unportable code to get it
    ✤   You haven’t kept us from risking segfaults, you’ve just made us
        miserable
Compatibility through obscurity?

✤   sun.misc.Unsafe
✤   Used by high-profile libraries like high-scale-lib
... even public options




     http://blogs.oracle.com/dave/entry/false_sharing_induced_by_card
Too negative?
Still true

✤   "Many concurrent algorithms are very easy to write with a
    GC and totally hard (to down right impossible) using
    explicit free." -- Cliff Click

More Related Content

What's hot

Disk Performance Comparison Xen v.s. KVM
Disk Performance Comparison Xen v.s. KVMDisk Performance Comparison Xen v.s. KVM
Disk Performance Comparison Xen v.s. KVMnknytk
 
XPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, Citrix
XPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, CitrixXPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, Citrix
XPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, CitrixThe Linux Foundation
 
Ceph on All Flash Storage -- Breaking Performance Barriers
Ceph on All Flash Storage -- Breaking Performance BarriersCeph on All Flash Storage -- Breaking Performance Barriers
Ceph on All Flash Storage -- Breaking Performance BarriersCeph Community
 
Redis persistence in practice
Redis persistence in practiceRedis persistence in practice
Redis persistence in practiceEugene Fidelin
 
Intel QLC: Cost-effective Ceph on NVMe
Intel QLC: Cost-effective Ceph on NVMeIntel QLC: Cost-effective Ceph on NVMe
Intel QLC: Cost-effective Ceph on NVMeCeph Community
 
Improvements in GlusterFS for Virtualization usecase
Improvements in GlusterFS for Virtualization usecaseImprovements in GlusterFS for Virtualization usecase
Improvements in GlusterFS for Virtualization usecaseDeepak Shetty
 
Cinder Live Migration and Replication - OpenStack Summit Austin
Cinder Live Migration and Replication - OpenStack Summit AustinCinder Live Migration and Replication - OpenStack Summit Austin
Cinder Live Migration and Replication - OpenStack Summit AustinEd Balduf
 
Nexenta at VMworld Hands-on Lab
Nexenta at VMworld Hands-on LabNexenta at VMworld Hands-on Lab
Nexenta at VMworld Hands-on LabNexenta Systems
 
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...Danielle Womboldt
 
Redis cluster
Redis clusterRedis cluster
Redis clusteriammutex
 
Automatic Operation Bot for Ceph - You Ji
Automatic Operation Bot for Ceph - You JiAutomatic Operation Bot for Ceph - You Ji
Automatic Operation Bot for Ceph - You JiCeph Community
 
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex LauDoing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex LauCeph Community
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax Academy
 
Cinder - status of replication
Cinder - status of replicationCinder - status of replication
Cinder - status of replicationEd Balduf
 
RHEVM - Live Storage Migration
RHEVM - Live Storage MigrationRHEVM - Live Storage Migration
RHEVM - Live Storage MigrationRaz Tamir
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific DashboardCeph Community
 
Ceph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph Community
 
2021.06. Ceph Project Update
2021.06. Ceph Project Update2021.06. Ceph Project Update
2021.06. Ceph Project UpdateCeph Community
 
OSv presentation from Linux Foundation Collaboration Summit
OSv presentation from Linux Foundation Collaboration SummitOSv presentation from Linux Foundation Collaboration Summit
OSv presentation from Linux Foundation Collaboration SummitDon Marti
 

What's hot (20)

Disk Performance Comparison Xen v.s. KVM
Disk Performance Comparison Xen v.s. KVMDisk Performance Comparison Xen v.s. KVM
Disk Performance Comparison Xen v.s. KVM
 
XPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, Citrix
XPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, CitrixXPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, Citrix
XPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, Citrix
 
Ceph on All Flash Storage -- Breaking Performance Barriers
Ceph on All Flash Storage -- Breaking Performance BarriersCeph on All Flash Storage -- Breaking Performance Barriers
Ceph on All Flash Storage -- Breaking Performance Barriers
 
ceph-barcelona-v-1.2
ceph-barcelona-v-1.2ceph-barcelona-v-1.2
ceph-barcelona-v-1.2
 
Redis persistence in practice
Redis persistence in practiceRedis persistence in practice
Redis persistence in practice
 
Intel QLC: Cost-effective Ceph on NVMe
Intel QLC: Cost-effective Ceph on NVMeIntel QLC: Cost-effective Ceph on NVMe
Intel QLC: Cost-effective Ceph on NVMe
 
Improvements in GlusterFS for Virtualization usecase
Improvements in GlusterFS for Virtualization usecaseImprovements in GlusterFS for Virtualization usecase
Improvements in GlusterFS for Virtualization usecase
 
Cinder Live Migration and Replication - OpenStack Summit Austin
Cinder Live Migration and Replication - OpenStack Summit AustinCinder Live Migration and Replication - OpenStack Summit Austin
Cinder Live Migration and Replication - OpenStack Summit Austin
 
Nexenta at VMworld Hands-on Lab
Nexenta at VMworld Hands-on LabNexenta at VMworld Hands-on Lab
Nexenta at VMworld Hands-on Lab
 
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
 
Redis cluster
Redis clusterRedis cluster
Redis cluster
 
Automatic Operation Bot for Ceph - You Ji
Automatic Operation Bot for Ceph - You JiAutomatic Operation Bot for Ceph - You Ji
Automatic Operation Bot for Ceph - You Ji
 
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex LauDoing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The Sequel
 
Cinder - status of replication
Cinder - status of replicationCinder - status of replication
Cinder - status of replication
 
RHEVM - Live Storage Migration
RHEVM - Live Storage MigrationRHEVM - Live Storage Migration
RHEVM - Live Storage Migration
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard
 
Ceph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-Gene
 
2021.06. Ceph Project Update
2021.06. Ceph Project Update2021.06. Ceph Project Update
2021.06. Ceph Project Update
 
OSv presentation from Linux Foundation Collaboration Summit
OSv presentation from Linux Foundation Collaboration SummitOSv presentation from Linux Foundation Collaboration Summit
OSv presentation from Linux Foundation Collaboration Summit
 

Similar to Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)

Fosdem 2012
Fosdem 2012Fosdem 2012
Fosdem 2012pcmanus
 
33rd degree conference
33rd degree conference33rd degree conference
33rd degree conferencepcmanus
 
Cassandra and Solid State Drives
Cassandra and Solid State DrivesCassandra and Solid State Drives
Cassandra and Solid State DrivesDataStax Academy
 
Scala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala TaiwanScala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala TaiwanJimin Hsieh
 
Dynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent HashingDynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent HashingHostedbyConfluent
 
Dynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent HashingDynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent HashingYaroslav Tkachenko
 
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...Jean-Philippe BEMPEL
 
XT Best Practices
XT Best PracticesXT Best Practices
XT Best PracticesJeff Larkin
 
Next Level Mobile Graphics | Munseong Kang, Oleksii Vasylenko
Next Level Mobile Graphics | Munseong Kang, Oleksii VasylenkoNext Level Mobile Graphics | Munseong Kang, Oleksii Vasylenko
Next Level Mobile Graphics | Munseong Kang, Oleksii VasylenkoJessica Tams
 
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -Yoshiyasu SAEKI
 
Introduction to Khronos SYCL
Introduction to Khronos SYCLIntroduction to Khronos SYCL
Introduction to Khronos SYCLMin-Yih Hsu
 
Nytro-XV_NWD_VM_Performance_Acceleration
Nytro-XV_NWD_VM_Performance_AccelerationNytro-XV_NWD_VM_Performance_Acceleration
Nytro-XV_NWD_VM_Performance_AccelerationKhai Le
 
Top five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solutionTop five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solutionjbellis
 
Scale Out Your Graph Across Servers and Clouds with OrientDB
Scale Out Your Graph Across Servers and Clouds  with OrientDBScale Out Your Graph Across Servers and Clouds  with OrientDB
Scale Out Your Graph Across Servers and Clouds with OrientDBLuca Garulli
 
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...confluent
 
Colin-Ian-King-Mentorship-Stress-ng.pdf
Colin-Ian-King-Mentorship-Stress-ng.pdfColin-Ian-King-Mentorship-Stress-ng.pdf
Colin-Ian-King-Mentorship-Stress-ng.pdfxiso
 
Haskell Symposium 2010: An LLVM backend for GHC
Haskell Symposium 2010: An LLVM backend for GHCHaskell Symposium 2010: An LLVM backend for GHC
Haskell Symposium 2010: An LLVM backend for GHCdterei
 

Similar to Dealing with JVM limitations in Apache Cassandra (Fosdem 2012) (20)

Fosdem 2012
Fosdem 2012Fosdem 2012
Fosdem 2012
 
33rd degree conference
33rd degree conference33rd degree conference
33rd degree conference
 
Cassandra and Solid State Drives
Cassandra and Solid State DrivesCassandra and Solid State Drives
Cassandra and Solid State Drives
 
Scala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala TaiwanScala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala Taiwan
 
Dynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent HashingDynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent Hashing
 
Dynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent HashingDynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent Hashing
 
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...
 
XT Best Practices
XT Best PracticesXT Best Practices
XT Best Practices
 
Next Level Mobile Graphics | Munseong Kang, Oleksii Vasylenko
Next Level Mobile Graphics | Munseong Kang, Oleksii VasylenkoNext Level Mobile Graphics | Munseong Kang, Oleksii Vasylenko
Next Level Mobile Graphics | Munseong Kang, Oleksii Vasylenko
 
Lev
LevLev
Lev
 
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
 
Introduction to Khronos SYCL
Introduction to Khronos SYCLIntroduction to Khronos SYCL
Introduction to Khronos SYCL
 
Nytro-XV_NWD_VM_Performance_Acceleration
Nytro-XV_NWD_VM_Performance_AccelerationNytro-XV_NWD_VM_Performance_Acceleration
Nytro-XV_NWD_VM_Performance_Acceleration
 
Jvm Language Summit Rose 20081016
Jvm Language Summit Rose 20081016Jvm Language Summit Rose 20081016
Jvm Language Summit Rose 20081016
 
Top five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solutionTop five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solution
 
Scale Out Your Graph Across Servers and Clouds with OrientDB
Scale Out Your Graph Across Servers and Clouds  with OrientDBScale Out Your Graph Across Servers and Clouds  with OrientDB
Scale Out Your Graph Across Servers and Clouds with OrientDB
 
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
 
Colin-Ian-King-Mentorship-Stress-ng.pdf
Colin-Ian-King-Mentorship-Stress-ng.pdfColin-Ian-King-Mentorship-Stress-ng.pdf
Colin-Ian-King-Mentorship-Stress-ng.pdf
 
Haskell Symposium 2010: An LLVM backend for GHC
Haskell Symposium 2010: An LLVM backend for GHCHaskell Symposium 2010: An LLVM backend for GHC
Haskell Symposium 2010: An LLVM backend for GHC
 
Starburn
StarburnStarburn
Starburn
 

More from jbellis

Five Lessons in Distributed Databases
Five Lessons  in Distributed DatabasesFive Lessons  in Distributed Databases
Five Lessons in Distributed Databasesjbellis
 
Data day texas: Cassandra and the Cloud
Data day texas: Cassandra and the CloudData day texas: Cassandra and the Cloud
Data day texas: Cassandra and the Cloudjbellis
 
Cassandra Summit 2015
Cassandra Summit 2015Cassandra Summit 2015
Cassandra Summit 2015jbellis
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014jbellis
 
Cassandra 2.1
Cassandra 2.1Cassandra 2.1
Cassandra 2.1jbellis
 
Tokyo cassandra conference 2014
Tokyo cassandra conference 2014Tokyo cassandra conference 2014
Tokyo cassandra conference 2014jbellis
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013jbellis
 
London + Dublin Cassandra 2.0
London + Dublin Cassandra 2.0London + Dublin Cassandra 2.0
London + Dublin Cassandra 2.0jbellis
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynotejbellis
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012jbellis
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012jbellis
 
Massively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache CassandraMassively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache Cassandrajbellis
 
Cassandra 1.1
Cassandra 1.1Cassandra 1.1
Cassandra 1.1jbellis
 
Pycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from JavaPycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from Javajbellis
 
Apache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterpriseApache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterprisejbellis
 
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011jbellis
 
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)jbellis
 
What python can learn from java
What python can learn from javaWhat python can learn from java
What python can learn from javajbellis
 
State of Cassandra, 2011
State of Cassandra, 2011State of Cassandra, 2011
State of Cassandra, 2011jbellis
 
Brisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by CassandraBrisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by Cassandrajbellis
 

More from jbellis (20)

Five Lessons in Distributed Databases
Five Lessons  in Distributed DatabasesFive Lessons  in Distributed Databases
Five Lessons in Distributed Databases
 
Data day texas: Cassandra and the Cloud
Data day texas: Cassandra and the CloudData day texas: Cassandra and the Cloud
Data day texas: Cassandra and the Cloud
 
Cassandra Summit 2015
Cassandra Summit 2015Cassandra Summit 2015
Cassandra Summit 2015
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014
 
Cassandra 2.1
Cassandra 2.1Cassandra 2.1
Cassandra 2.1
 
Tokyo cassandra conference 2014
Tokyo cassandra conference 2014Tokyo cassandra conference 2014
Tokyo cassandra conference 2014
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013
 
London + Dublin Cassandra 2.0
London + Dublin Cassandra 2.0London + Dublin Cassandra 2.0
London + Dublin Cassandra 2.0
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynote
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012
 
Massively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache CassandraMassively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache Cassandra
 
Cassandra 1.1
Cassandra 1.1Cassandra 1.1
Cassandra 1.1
 
Pycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from JavaPycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from Java
 
Apache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterpriseApache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterprise
 
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
 
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
 
What python can learn from java
What python can learn from javaWhat python can learn from java
What python can learn from java
 
State of Cassandra, 2011
State of Cassandra, 2011State of Cassandra, 2011
State of Cassandra, 2011
 
Brisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by CassandraBrisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by Cassandra
 

Recently uploaded

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 

Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)

  • 1. Dealing with JVM limitations in Apache Cassandra Jonathan Ellis / @spyced
  • 2. Pain points for Java databases ✤ GC ✤ GC ✤ GC
  • 3. Pain points for Java databases ✤ GC ✤ Platform specific code
  • 4. GC ✤ Concurrent and compacting: choose one ✤ G1 ✤ Azul C4 / Zing?
  • 5. Fragmentation ✤ Bloom filter arrays ✤ Compression offsets
  • 6. Automatic mitigation? ✤ http://www.research.ibm.com/people/d/dfb/papers/Bacon03Controlling.pdf ✤ http://researcher.ibm.com/files/us-hirzel/pldi10-arraylets.pdf
  • 7. Fragmentation, 2 ✤ Arena allocation for memtables
  • 8. (Memtables?) write( k1 , c1:v1 ) Memory Memtable Commit log Hard drive
  • 9. write( k1 , c1:v ) Memory k1 c1:v Memtable k1 c1:v Commit log Hard drive
  • 10. write( k1 , c2:v ) Memory k1 c1:v c2:v k1 c1:v k1 c2:v Hard drive
  • 11. write( k2 , c1:v c2:v ) Memory k1 c1:v c2:v k2 c1:v c2:v k1 c1:v k1 c2:v k2 c1:v c2:v Hard drive
  • 12. write( k1 , c1:v c3:v ) Memory k1 c1:v c2:v c3:v k2 c1:v c2:v k1 c1:v k1 c2:v k2 c1:v c2:v k1 c1:v c3:v Hard drive
  • 13. Memory flush index cleanup k1 c1:v c2:v c3:v k2 c1:v c2:v SSTable Hard drive
  • 14. “Java is a memory hog” ✤ Large overhead for typical objects and collections ✤ How large? ✤ java.lang.instrument.Instrumentation ✤ JAMM: Java Agent for Memory Measurements ✤ https://github.com/jbellis/jamm
  • 15. org.apache.cassandra.cache.SerializingCache ✤ Live objects are about 85% JVM bookeeping ✤ org.apache.cassandra.cache.FreeableMemory using reference counting ✤ Considering doing reference-counted, off-heap memtables as well
  • 16. Don’t forget about young gen ✤ Always stop-the-world for ~100ms
  • 18. m[un]map ✤ Log-structured storage wants to remove old files post- compaction; some platforms disallow deleting open files ✤ Old workaround (pre-1.0): ✤ use PhantomReference to tell when mmap’d file is GC (hence unmapped) ✤ Poor user experience and messy corner cases ✤ New workaround: ✤ Class.forName("sun.nio.ch.DirectBuffer").getMethod("cleaner")
  • 19. mmap part 2 ✤ 2GB limit via ByteBuffer: public abstract byte get(int index) ✤ Workaround: MmappedSegmentedFile public Iterator<DataInput> iterator(long position)
  • 20. link ✤ Used for snapshots ✤ Old workaround: JNA ✤ New workaround: supported directly by Java7
  • 21. mlockall ✤ swappiness: pissing off database developers since 2001 (?) ✤ mlockall(MCL_CURRENT)
  • 22. Low-level i/o ✤ posix_fadvise ✤ mincore/fincore ✤ fctl ✤ ... JNA
  • 23. A plug for JNA ✤ https://github.com/twall/jna static { try { Native.register("c"); ... private static native int mlockall(int flags) throws LastErrorException;
  • 24. The fallacy of choosing portability over power ✤ Applets have been dead for years ✤ Python gets it right ✤ import readline
  • 25. The fallacy of choosing safety over power ✤ Allowing munmap would expose developers to segfaults ✤ But, relying on the GC to clean up external resources is a well-known antipattern ✤ File.close ✤ We need munmap badly enough that we resort to unnatural and unportable code to get it ✤ You haven’t kept us from risking segfaults, you’ve just made us miserable
  • 26. Compatibility through obscurity? ✤ sun.misc.Unsafe ✤ Used by high-profile libraries like high-scale-lib
  • 27. ... even public options http://blogs.oracle.com/dave/entry/false_sharing_induced_by_card
  • 29. Still true ✤ "Many concurrent algorithms are very easy to write with a GC and totally hard (to down right impossible) using explicit free." -- Cliff Click