@tall_chris #Devoxx #TCoffheap
Terracotta’s OffHeap Explained
Chris Dennis
Terracotta (aka Software AG)
Who Am I?
• Trained as a Physicist, clearly not trained as a Computer
Scientist.
• 4 Years Doing Unnatural Things With Bytecode In Academia
• 3 Years Doing Unnatural Things With Bytecode For Money
• 4 Years Doing Unnatural Things With ByteBuffers
• 11 Years Doing Java Development
• Software Engineer working at Terracotta (Software AG)
I Get To Play With Big Toys
[dungeon@Main1 ~]$ cat /proc/meminfo
MemTotal:    6354030896 kB
MemFree:      112170556 kB
[dungeon@Main1 ~]$ cat /proc/cpuinfo
processor   : 119
vendor_id   : GenuineIntel
cpu family  : 6
model       : 62
model name  : Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz
stepping    : 7
cpu MHz     : 1200.000
cache size  : 38400 KB
physical id : 3
A Bit of History
2010 Started development as a caching ‘tier’ within Ehcache.
2011 Integrated as a caching tier in front of Oracle BDB in the
Terracotta Server.
2013 Legal complications push it into service as the primary
storage for the Terracotta Server.
2015 Open sourced (https://github.com/Terracotta-OSS/offheap-store).
Problem Statement
• Map: collection of key-value pairs
• Cache ≈ a Map with bells on
• Caching is good:
https://xkcd.com/908/
Problem Statement
• “A lot of caching” leads to
• a lot of heap, which leads to
• a lot of work for the garbage collector, which leads to
• a lot of GC pausing/overhead.
• The situation is markedly better now than when the bulk of this
library was written. (Please don’t tell my employer I said that)
Map/Cache Best Practices
• Immutable Keys
• please do this!
• Immutable Values
• please do this!
• So with immutability everywhere, who cares about object
identity?
• If I don’t need object identity, do I need a heap?
• If I don’t need a heap, do I need a garbage collector?
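A minimal sketch of why identity stops mattering once keys and values are immutable: a copy rebuilt from bytes behaves exactly like the original under equals() and hashCode(), even though it is a different object. The names IdentityDemo and roundTrip are illustrative only and do not come from the library.

```java
import java.nio.charset.StandardCharsets;

final class IdentityDemo {
    // Rebuilds a String from its UTF-8 bytes, much as a store that keeps
    // only bytes would have to do on every read.
    static String roundTrip(String original) {
        byte[] bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String key = "user:42";
        String copy = roundTrip(key);
        System.out.println("equal value:   " + copy.equals(key));
        System.out.println("equal hash:    " + (copy.hashCode() == key.hashCode()));
        System.out.println("same identity: " + (copy == key));
    }
}
```

A map that only ever compares keys with equals() and hashCode() cannot tell the copy from the original, which is what makes storing bytes instead of objects viable.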
Solution
• Replace heavy (large) map/cache usage with an ‘outside the
heap’ but ‘inside the process’ implementation.
• Benefits at two scales:
• At moderate scale, the GC offload reduces overheads.
• At large scale, we can still function: -Xmx6T
• Caveats
• Marshalling/unmarshalling costs time (and CPU)
• Trading away average latency to control the tail.
Replace What?
• java.util (Hash)Map
• java.util.concurrent Concurrent(Hash)Map
• Java Heap
• Garbage Collector
• Class Layout Logic
Maps
JDK HashMap
[Animation: put(k1, v), put(k2, v) and put(k3, v) into a table with buckets 0 to 7. k1 and k2 each land in their own bucket; k3 is drawn beneath an existing entry, chained within an already occupied bucket.]
OffHeap Map
[Animation: put(k1, v), put(k2, v) and put(k3, v) into a table with slots 0 to 7. k1 and k2 each land in their own slot; k3 finds its slot occupied and is stored in the neighbouring slot instead.]
• Hash Map
• Open Addressing
• Linear Reprobe (1 slot)
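The three properties above can be sketched as a tiny open-addressing table that reprobes one slot at a time. This is an illustration under simplifying assumptions, not the library's code: keys and values are plain ints, the key itself serves as the hash, the table never grows, and removal (which would need tombstones) is left out. LinearProbeTable and its methods are hypothetical names.

```java
final class LinearProbeTable {
    private static final int FREE = 0;
    private static final int USED = 1;

    private final int[] status;
    private final int[] keys;
    private final int[] values;
    private final int mask;

    LinearProbeTable(int capacity) {
        if (Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        status = new int[capacity];
        keys = new int[capacity];
        values = new int[capacity];
        mask = capacity - 1;
    }

    /** Stores the mapping and returns the slot used, or -1 if the table is full. */
    int put(int key, int value) {
        int start = key & mask;                 // the key doubles as its own hash here
        for (int i = 0; i < status.length; i++) {
            int slot = (start + i) & mask;      // linear reprobe, one slot per step
            if (status[slot] == FREE || keys[slot] == key) {
                status[slot] = USED;
                keys[slot] = key;
                values[slot] = value;
                return slot;
            }
        }
        return -1;
    }

    /** Returns the stored value, or null if the key is absent. */
    Integer get(int key) {
        int start = key & mask;
        for (int i = 0; i < status.length; i++) {
            int slot = (start + i) & mask;
            if (status[slot] == FREE) {
                return null;                    // a free slot ends the probe sequence
            }
            if (keys[slot] == key) {
                return values[slot];
            }
        }
        return null;
    }
}
```

With a capacity of 8, keys 4 and 12 both start at slot 4, so the second of them is stored in slot 5. There are no per-entry node objects and no next pointers, only fixed-size slots.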
JDK HashMap
class Node<K, V> {
  final int hash;    // primitive - easy to store
  final K key;       // heap reference
  V value;           // heap reference
  Node<K, V> next;   // closed addressing specific
}
OffHeap Map
‘struct’ slot {
  int status;      // primitive - easy to store
  int hash;        // primitive - easy to store
  long encoding;   // encoded key/value pair
}
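One way to picture the slot ‘struct’ is as a fixed 16-byte record inside a direct ByteBuffer. The sketch below assumes that layout (status at byte 0, hash at byte 4, encoding at byte 8) and made-up status values; the real library may pack its slots differently. SlotTable and its members are hypothetical names.

```java
import java.nio.ByteBuffer;

final class SlotTable {
    static final int SLOT_SIZE = 16;   // int status + int hash + long encoding
    static final int STATUS = 0;       // byte offsets within a slot (assumed layout)
    static final int HASH = 4;
    static final int ENCODING = 8;
    static final int STATUS_FREE = 0;  // status values are assumptions
    static final int STATUS_USED = 1;

    private final ByteBuffer table;

    SlotTable(int slots) {
        // Direct buffers start zero-filled, so every slot begins as STATUS_FREE.
        table = ByteBuffer.allocateDirect(slots * SLOT_SIZE);
    }

    void write(int slot, int hash, long encoding) {
        int base = slot * SLOT_SIZE;
        table.putInt(base + STATUS, STATUS_USED);
        table.putInt(base + HASH, hash);
        table.putLong(base + ENCODING, encoding);
    }

    boolean isUsed(int slot) {
        return table.getInt(slot * SLOT_SIZE + STATUS) == STATUS_USED;
    }

    int hash(int slot) {
        return table.getInt(slot * SLOT_SIZE + HASH);
    }

    long encoding(int slot) {
        return table.getLong(slot * SLOT_SIZE + ENCODING);
    }
}
```

Keeping the hash in the slot lets a lookup reject most non-matching slots without touching the encoded key at all.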
Storing Key & Values
interface StorageEngine<K, V> {
  Long writeMapping(K key, V value, int hash, int metadata);
  void freeMapping(long encoding, int hash, boolean removal);

  V readValue(long encoding);
  boolean equalsValue(Object value, long encoding);

  K readKey(long encoding, int hashCode);

  boolean equalsKey(Object key, long encoding);
}
Options with 64 bits available
• 64 bit combined pointer
• 32 bit key pointer & 32 bit value pointer
• int key directly + 32 bit pointer
• long key directly + 32 bit pointer
• …anything else you like
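As a sketch of the second option above, two 32-bit pointers can be packed into the 64-bit encoding with a shift and a mask. The same arithmetic covers the "int key directly + 32 bit pointer" case, with the key occupying the upper half. Encodings and its method names are illustrative, not the library's.

```java
final class Encodings {
    /** Packs a key pointer into the upper 32 bits and a value pointer into the lower 32. */
    static long pack(int keyPointer, int valuePointer) {
        // The mask stops a negative valuePointer from sign-extending into the upper half.
        return ((long) keyPointer << 32) | (valuePointer & 0xFFFFFFFFL);
    }

    static int keyPointer(long encoding) {
        return (int) (encoding >>> 32);
    }

    static int valuePointer(long encoding) {
        return (int) encoding;
    }
}
```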
Pointer to What?
A Native ‘Heap’
[Diagram: a byte-addressable logical address space running from 0 to max, divided into pages. Each page is a ByteBuffer obtained via .slice() from a buffer created with ByteBuffer.allocateDirect() (the physical address space).]
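The mapping from a logical address to a page and an offset within it can be sketched as follows. This assumes equally sized pages sliced from a single direct buffer and values that never straddle a page boundary; PagedMemory and its methods are hypothetical names, and the library's own paging is more involved.

```java
import java.nio.ByteBuffer;

final class PagedMemory {
    private final ByteBuffer[] pages;
    private final int pageSize;

    PagedMemory(int pageCount, int pageSize) {
        this.pageSize = pageSize;
        this.pages = new ByteBuffer[pageCount];
        // Assumes pageCount * pageSize fits in an int.
        ByteBuffer backing = ByteBuffer.allocateDirect(pageCount * pageSize);
        for (int i = 0; i < pageCount; i++) {
            backing.clear();                       // reset position and limit
            backing.position(i * pageSize);
            backing.limit((i + 1) * pageSize);
            pages[i] = backing.slice();            // a view over one page of the backing memory
        }
    }

    void putInt(long address, int value) {
        // Throws IndexOutOfBoundsException if the int would run past the end of its page.
        page(address).putInt(offset(address), value);
    }

    int getInt(long address) {
        return page(address).getInt(offset(address));
    }

    private ByteBuffer page(long address) {
        return pages[(int) (address / pageSize)];
    }

    private int offset(long address) {
        return (int) (address % pageSize);
    }
}
```

Callers only ever see long addresses; which ByteBuffer actually holds the bytes is an internal detail.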
Managing The ‘Heap’
A Native Heap Allocator
• malloc/free performed using a Java port of dlmalloc
• http://g.oswego.edu/dl/html/malloc.html
• Works well for our use cases as we do not generally control
or even know the malloc size distribution.
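To make the malloc/free contract concrete, here is a deliberately naive allocator over a logical address space: a bump pointer plus exact-size free lists. This is not dlmalloc, which uses binned free chunks, coalescing and in-band chunk headers; it also makes the caller pass the size to free(). SimpleAllocator is a hypothetical name used only for this sketch.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

final class SimpleAllocator {
    private final long capacity;
    private long top;   // lowest address that has never been handed out
    private final Map<Integer, ArrayDeque<Long>> freeBySize = new HashMap<>();

    SimpleAllocator(long capacity) {
        this.capacity = capacity;
    }

    /** Returns the address of a block of the given size, or -1 if none is available. */
    long malloc(int size) {
        ArrayDeque<Long> free = freeBySize.get(size);
        if (free != null && !free.isEmpty()) {
            return free.pop();          // reuse a freed block of exactly this size
        }
        if (top + size > capacity) {
            return -1;                  // out of space; freed blocks of other sizes are not split
        }
        long address = top;
        top += size;
        return address;
    }

    /** Returns a block to the allocator; the caller must supply the original size. */
    void free(long address, int size) {
        freeBySize.computeIfAbsent(size, s -> new ArrayDeque<>()).push(address);
    }
}
```

A scheme like this wastes space badly when sizes vary, which is the situation the slide describes, and is why a general-purpose allocator is the safer choice when the size distribution is unknown.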
Marshaling
“Java Serialization Sucks”
• Serialization is self describing.
• It supports
• object identity
• cycles
• complex versioning
• Pretty heavyweight, especially for short streams…
• …but it’s the default serialization mechanism available in
Ehcache 2.x
“Java Serialization Sucks”
• serialize(new Integer(42))
• results in these 81 bytes:
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 AC ED 00 05 73 72 00 11 6A 61 76 61 2E 6C 61 6E
1 67 2E 49 6E 74 65 67 65 72 12 E2 A0 A4 F7 81 87
2 38 02 00 01 49 00 05 76 61 6C 75 65 78 72 00 10
3 6A 61 76 61 2E 6C 61 6E 67 2E 4E 75 6D 62 65 72
4 86 AC 95 1D 0B 94 E0 8B 02 00 00 78 70 00 00 00
5 2A
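The size quoted above is easy to reproduce with nothing but the JDK. The sketch below serializes a boxed 42 with a plain ObjectOutputStream; SerializedSize and serialize are illustrative names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class SerializedSize {
    /** Serializes a value with standard Java serialization and returns the raw stream. */
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] stream = serialize(Integer.valueOf(42));
        // The slide reports 81 bytes for this value.
        System.out.println(stream.length + " bytes");
    }
}
```

Only the last four bytes of the stream are the payload; everything before them describes java.lang.Integer and java.lang.Number.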
OffHeap’s Serialization Sucks Less?
• serialize(new Integer(42))
• results in 22 bytes
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 AC ED 00 05 73 72 00 00 00 00 78 72 00 00 00 01
1 78 70 00 00 00 2A
With some structure
STREAM_MAGIC STREAM_VERSION
TC_OBJECT
  TC_CLASSDESC utf(17, java.lang.Integer)
    serialVersionUID[12E2A0A4F7818738] SC_SERIALIZABLE
    fields=[I:utf(5, value)]
  TC_ENDBLOCKDATA
  TC_CLASSDESC utf(16, java.lang.Number)
    serialVersionUID[86AC951D0B94E08B] SC_SERIALIZABLE
    fields=[]
  TC_ENDBLOCKDATA
  TC_NULL
0000002A
With some structure
STREAM_MAGIC STREAM_VERSION
TC_OBJECT
  TC_CLASSDESC descriptor(0)
  TC_ENDBLOCKDATA
  TC_CLASSDESC descriptor(1)
  TC_ENDBLOCKDATA
  TC_NULL
0000002A
Where did the 59 bytes go?
• How many types are in my map?
• All keys the same type: really common
• All values the same type: fairly common
• Stick those common ObjectStreamClass instances in a lookaside
structure
• Map<Integer, ObjectStreamClass> for reading streams
• Map<SerializableDataKey, Integer> for writing streams
Serialization is pretty malleable
class ObjectOutputStream {
  protected void writeClassDescriptor(ObjectStreamClass desc);
}
class ObjectInputStream {
  protected ObjectStreamClass readClassDescriptor();
}
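Overriding those two hooks is enough to swap each full class descriptor for a small integer. The sketch below keeps the lookaside tables in memory and keys the write side by class name, which is a simplification of the Map<SerializableDataKey, Integer> mentioned earlier (it ignores differing versions of the same class). CompactSerialization and its members are hypothetical names, not the library's API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CompactSerialization {
    // Lookaside tables shared by the writer and the reader.
    private final Map<String, Integer> indexByClassName = new HashMap<>();
    private final List<ObjectStreamClass> descriptors = new ArrayList<>();

    private final class CompactOut extends ObjectOutputStream {
        CompactOut(OutputStream out) throws IOException {
            super(out);
        }

        @Override
        protected void writeClassDescriptor(ObjectStreamClass desc) throws IOException {
            Integer index = indexByClassName.get(desc.getName());
            if (index == null) {
                index = descriptors.size();
                descriptors.add(desc);
                indexByClassName.put(desc.getName(), index);
            }
            writeInt(index);   // four bytes instead of name, serialVersionUID and fields
        }
    }

    private final class CompactIn extends ObjectInputStream {
        CompactIn(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected ObjectStreamClass readClassDescriptor() throws IOException {
            return descriptors.get(readInt());
        }
    }

    byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new CompactOut(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    Object deserialize(byte[] data) {
        try (ObjectInputStream in = new CompactIn(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A stream written this way can only be read by something holding the same tables, so a real store has to keep them alongside the data. That is the price of dropping the self-describing part of the format.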
Portability
• But if serialization still sucks…
interface Portability<T> {
  ByteBuffer encode(T object);
  T decode(ByteBuffer buffer);
  boolean equals(Object object, ByteBuffer buffer);
}
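As an illustration, a Portability for String keys might simply use UTF-8. The interface is repeated here only so the sketch compiles on its own; StringPortability is a hypothetical implementation written for this example, not a class taken from the library.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Local copy of the interface shown on the slide.
interface Portability<T> {
    ByteBuffer encode(T object);
    T decode(ByteBuffer buffer);
    boolean equals(Object object, ByteBuffer buffer);
}

final class StringPortability implements Portability<String> {
    @Override
    public ByteBuffer encode(String object) {
        return ByteBuffer.wrap(object.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public String decode(ByteBuffer buffer) {
        // duplicate() leaves the caller's position and limit untouched.
        return StandardCharsets.UTF_8.decode(buffer.duplicate()).toString();
    }

    @Override
    public boolean equals(Object object, ByteBuffer buffer) {
        // Compares encoded bytes, so the stored form never has to be decoded.
        return object instanceof String && encode((String) object).equals(buffer);
    }
}
```

The equals method is what lets a lookup compare a candidate key against stored bytes without materialising the stored key as an object.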
Concurrency
j.u.c.ConcurrentMap
• What does a concurrent map provide?
• happens-before relationship: “actions in a thread prior to placing an
object into a ConcurrentMap as a key or value happen-before actions
subsequent to the access or removal of that object from the
ConcurrentMap in another thread”
• atomic operations: “…except that the action is performed atomically.”
• What do we want?
• concurrent access (readers and writers)
Happens Before Relationships
• volatile write/read
• but not on offheap memory locations
• synchronized
• needs a heap object
• other j.u.c classes (Lock,Atomic…)
• needs a heap object
• There is no way within the JDK to enforce a happens before
relationship between writes/reads of an offheap location…
No Unsafe please, we’re a library
• Our testing has never shown our offheap implementation to
be a bottleneck in our usages.
• Unnecessary complexity costs $$$
• support
• maintenance
• bugs
Simple solution:
[Diagram: an OffHeapMap sits on top of an offheap memory area, dlmalloc and a serializer. Wrapping that OffHeapMap in a ReadWriteLock gives a ConcurrentOffHeapMap.]
A ‘Concurrent’ Map
✅ happens-before relationship: “actions in a thread prior to
placing an object into a ConcurrentMap as a key or value
happen-before actions subsequent to the access or removal of
that object from the ConcurrentMap in another thread”
✅ atomic operations: “…except that the action is performed
atomically.”
⚠ concurrent access (readers and writers)
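The checklist above follows directly from guarding a single-threaded map with a ReadWriteLock. In this sketch a plain HashMap stands in for the OffHeapMap, and LockedMap is a hypothetical name. The lock is an ordinary heap object, so it supplies the happens-before edges that the offheap memory cannot.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class LockedMap<K, V> {
    private final Map<K, V> delegate = new HashMap<>();   // stand-in for the single-threaded OffHeapMap
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    V get(K key) {
        lock.readLock().lock();          // many readers may hold this at once
        try {
            return delegate.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    V put(K key, V value) {
        lock.writeLock().lock();         // writers are exclusive
        try {
            return delegate.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    V putIfAbsent(K key, V value) {
        lock.writeLock().lock();         // check and write happen under one lock, hence atomically
        try {
            V existing = delegate.get(key);
            if (existing == null) {
                delegate.put(key, value);
            }
            return existing;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Readers run in parallel, but every write excludes all other access, which is why concurrent access only earns a warning sign in the list.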
Moar Write Concurrency!
[Animation: a StripingLogic layer sits in front of two ConcurrentOffHeapMap segments, each a ReadWriteLock around its own OffHeapMap (offheap memory area, dlmalloc, serializer). put(k1, v), put(k2, v) and put(k3, v) are each routed to a single segment; k1 and k3 are drawn together and k2 apart.]
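Striping can be sketched as an array of independently locked segments, with the key's hash choosing the segment. Plain HashMaps stand in for the offheap maps, and StripedMap, Segment, stripeFor and the hash-spreading step are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class StripedMap<K, V> {
    private static final class Segment<K, V> {
        final Map<K, V> map = new HashMap<>();   // stand-in for one OffHeapMap
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    }

    private final List<Segment<K, V>> segments = new ArrayList<>();

    StripedMap(int stripes) {
        for (int i = 0; i < stripes; i++) {
            segments.add(new Segment<>());
        }
    }

    /** Picks the segment for a key from its hash code. */
    int stripeFor(Object key) {
        int h = key.hashCode();
        h ^= (h >>> 16);                          // fold high bits into the low bits
        return Math.floorMod(h, segments.size());
    }

    V put(K key, V value) {
        Segment<K, V> segment = segments.get(stripeFor(key));
        segment.lock.writeLock().lock();          // blocks only this segment
        try {
            return segment.map.put(key, value);
        } finally {
            segment.lock.writeLock().unlock();
        }
    }

    V get(Object key) {
        Segment<K, V> segment = segments.get(stripeFor(key));
        segment.lock.readLock().lock();
        try {
            return segment.map.get(key);
        } finally {
            segment.lock.readLock().unlock();
        }
    }
}
```

Writers to different segments no longer contend with each other, while each segment keeps the simple single-lock guarantees.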
Conclusions
1. Simple engineering is simpler to support and maintain.
2. Going off-heap doesn’t require Unsafe
• (unless ultimate performance is your primary concern)
Additional Topics
• Caching
• Weakly-consistent Iterators
• Cross Segment Eviction
• Page Stealing Algorithms
• Native Heap Compaction
• Map Rehashing (Growing & Shrinking)
• Off-Memory (SSDs)
• Persistence/Durability
• Entry Level Pinning
• Probably Other Stuff I Forgot About…
Questions?
(BTW We’re Hiring)
https://github.com/Terracotta-OSS/offheap-store/
