Know your platform. 7 things every scala developer should know about jvm

"Know your platform."
7 things every Scala developer
should know about the JVM

Origins of this talk
We were having a beer...

We were having beers...

Financial & Big Data driven ;)

There is a vast number of newcomers to the Scala
world. Some percentage of those developers
have never programmed in Java before.
Do they have a notion of the platform they are
running their software on?

There is a significant number of Java
developers obsessed with all kinds of APIs who
do not know a single thing about the platform
that they are using.
Those Java developers are beginning to move towards
other JVM languages (like Scala).

Should we even care?
“You always want to understand one level
below the level you write your code.”
-- Ted Neward’s mentor

Knowing one level below your level leads to:
● Being a better engineer in general
● Ability to reason more accurately about the code
● Improved performance
● Moving from folklore beliefs towards scientific methods
● Separation of dogmas and facts
● Efficiency in handling non-trivial errors and bugs

We believe that as Scala developers
we should at least understand the basics of
the JVM platform in order to achieve
efficiency, understanding, robustness,
determinism and sanity.

How is our code executed?
The answer: it is not directly executed by
the OS/CPU.

JVM Bytecode
JVM
● Interpreter
● Just-In-Time Compiler

Source: https://en.wikipedia.org/wiki/Strongtalk

“Work began in 1994 and they completed an
implementation in 1996. The company was bought by
Sun Microsystems in 1997, and the team got focused
on Java, releasing the HotSpot virtual machine, and
work on Strongtalk stalled.”

So what exactly is bytecode?
“JVM bytecode is a low level language for non
existing CPU"
-- Jaroslaw.Palka + beer

Bytecode is an instruction set.

Each instruction is identified by a 1-byte opcode (operands, if any, follow it).

Thus there are only 256 opcodes possible.
202 are currently in use, 51 are unassigned and kept for
future use, and 3 are ‘reserved opcodes’
(breakpoint, impdep1, impdep2).

The JVM is a Stack Machine
● No registers, no accumulators, no stack pointers
● Why stack based? Two theories:
1. Different platforms, no worries about
the number and sizes of registers
2. Compactness of bytecode
● Learning is easy
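
To make the stack-machine idea concrete, here is a minimal sketch of an interpreter for a tiny, hypothetical instruction subset. The names `IConst`, `IAdd` and `IReturn` are made up for illustration; they only mirror the real opcodes and are not part of any JVM API.

```scala
object StackMachineDemo {
  // Hypothetical mini instruction set, named after real JVM opcodes.
  sealed trait Instr
  final case class IConst(value: Int) extends Instr // like iconst_<n> or bipush
  case object IAdd extends Instr                    // like iadd
  case object IReturn extends Instr                 // like ireturn

  // Runs the program on an operand stack (head of the list = top of the stack).
  def run(program: List[Instr]): Int = {
    @scala.annotation.tailrec
    def loop(rest: List[Instr], stack: List[Int]): Int = (rest, stack) match {
      case (IConst(v) :: tail, _)          => loop(tail, v :: stack)       // push
      case (IAdd :: tail, b :: a :: below) => loop(tail, (a + b) :: below) // pop two, push sum
      case (IReturn :: _, top :: _)        => top                          // return top of stack
      case _ => throw new IllegalStateException("malformed program or stack underflow")
    }
    loop(program, Nil)
  }

  def main(args: Array[String]): Unit =
    println(run(List(IConst(17), IConst(3), IAdd, IReturn)))
}
```

Every instruction only touches the top of the stack, which is why no instruction needs to name a register.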

Examples!
Warning: contains actual bytecode!

def f1 = 2
0: iconst_2    // stack: 2
1: ireturn     // returns 2

def f1 = { 1 + 2 }
0: iconst_1    // stack: 1
1: iconst_2    // stack: 2, 1 (top first)
2: iadd        // stack: 3
3: ireturn     // returns 3

def f1 = { 17 + 3 }
0: bipush 17   // stack: 17
2: iconst_3    // stack: 3, 17 (top first)
3: iadd        // stack: 20
4: ireturn     // returns 20

I’ve lied :)
And now, the best part! The compiler folds constant expressions, so the bytecode actually produced is:

def f1 = { 1 + 2 }
0: iconst_3
1: ireturn

def f1 = { 17 + 3 }
0: bipush 20
2: ireturn

def f1(i: Int) = { 1 + i }
Local variables: 0 = this, 1 = i
0: iconst_1    // stack: 1
1: iload_1     // stack: i, 1 (top first)
2: iadd        // stack: 1 + i
3: ireturn     // returns 1 + i

def f1(i: Long) = { 1 + i }
Local variables: 0 = this, 1 = i
0: lconst_1    // stack: 1
1: lload_1     // stack: i, 1 (top first)
2: ladd        // stack: 1 + i
3: lreturn     // returns 1 + i
Stack slots are 32 bits wide.
Thus a Long (64-bit) takes two slots on the
stack.
Some consider the 32-bit stack “the
biggest mistake Sun ever made”.

0: iconst_0
1: ireturn
def f1 = { false }

def f1 = "Hello world!"
0: ldc #12 // String Hello world!    stack: #12
2: areturn
Constant pool:
#1 ….
... ...
#12 Hello world!
This is known as the ‘constant pool’ and is
designed to hold constant values
(most of the time UTF strings) that
can be referenced by #number.
Our tools help us, so we tend not to
look at the number, but at the
comment provided.

def f1 = f2
0: aload_0             // stack: this
1: invokevirtual #13   // calls f2 on this; stack: #32 (the returned String)
4: areturn
def f2 = "Hi all"
0: ldc #32
2: areturn
Constant pool:
#1 ….
#13 f2:()Ljava/lang/String;
#32 Hi all

InvokeVirtual: invoke the most derived implementation of this method available on the given
object.

class A1 {}
0: aload_0             // stack: this
1: invokespecial #12
4: return
#12 java/lang/Object."<init>":()V

InvokeSpecial: screw what the virtual table tells you to do. Invoke the method on exactly the class
provided.
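
The difference between the two invocation kinds can be observed from Scala source. The sketch below uses hypothetical classes `Base` and `Derived`; the assumption (worth checking with javap on your compiler version) is that the ordinary call goes through invokevirtual while the `super.who` call is emitted as invokespecial.

```scala
object DispatchDemo {
  class Base {
    def who: String = "base"
  }

  class Derived extends Base {
    override def who: String = "derived"
    // A super call must bypass virtual dispatch, otherwise it would call itself.
    def parentWho: String = super.who
  }

  def main(args: Array[String]): Unit = {
    val b: Base = new Derived
    println(b.who)                     // virtual dispatch picks the most derived method
    println(new Derived().parentWho)   // super call targets exactly Base.who
  }
}
```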

public java.lang.String t1();
0: aload_0
1: getfield #13
4: areturn
public void t1_$eq(java.lang.String);
0: aload_0
1: aload_1
2: putfield #13
5: return
class A3(var t1: String)
#13 t1:Ljava/lang/String;

#CAFEBABE
“We used to go to lunch at a place called St Michael's Alley. According to local legend, in
the deep dark past, the Grateful Dead used to perform there before they made it big. It
was a pretty funky place that was definitely a Grateful Dead Kinda Place. When Jerry
died, they even put up a little Buddhist-esque shrine. When we used to go there, we
referred to the place as Cafe Dead. Somewhere along the line it was noticed that this was
a HEX number. I was re-vamping some file format code and needed a couple of magic
numbers: one for the persistent object file, and one for classes. I used CAFEDEAD for
the object file format, and in grepping for 4 character hex words that fit after "CAFE" (it
seemed to be a good theme) I hit on BABE and decided to use it. At that time, it didn't
seem terribly important or destined to go anywhere but the trash-can of history. So
CAFEBABE became the class file format, and CAFEDEAD was the persistent object
format. But the persistent object facility went away, and along with it went the use of
CAFEDEAD - it was eventually replaced by RMI."
-- James Gosling

Class file major version numbers:
Java SE 8 = 52 (0x34 hex)
Java SE 7 = 51 (0x33 hex)
Java SE 6 = 50 (0x32 hex)
J2SE 5.0 = 49 (0x31 hex)
JDK 1.4 = 48 (0x30 hex)
JDK 1.3 = 47 (0x2F hex)
JDK 1.2 = 46 (0x2E hex)
JDK 1.1 = 45 (0x2D hex)
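
Both the magic number and the version sit in the first eight bytes of every class file, so they are easy to read by hand. This is a minimal sketch; `ClassHeaderDemo` and `header` are illustrative names, and it assumes the class file of the given class is reachable through the system class loader.

```scala
import java.io.DataInputStream

object ClassHeaderDemo {
  // Returns (magic, minor version, major version) of the class file behind `cls`.
  def header(cls: Class[_]): (Int, Int, Int) = {
    // Assumption: the .class resource can be found by the system class loader.
    val path = cls.getName.replace('.', '/') + ".class"
    val in = new DataInputStream(ClassLoader.getSystemResourceAsStream(path))
    try {
      val magic = in.readInt()            // 4 bytes: CAFEBABE
      val minor = in.readUnsignedShort()  // 2 bytes: minor version
      val major = in.readUnsignedShort()  // 2 bytes: major version, e.g. 52 for Java SE 8
      (magic, minor, major)
    } finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val (magic, minor, major) = header(classOf[String])
    println(f"magic=0x$magic%08X minor=$minor major=$major")
  }
}
```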

Demystifying
folklore, reasoning
about facts
with Bytecode (for fun and profit)!

Belief: Use ‘Short’ for better performance

def f1(i: Short) = { 1 + i }
0: iconst_1
1: iload_1
2: iadd
3: ireturn
FALSE
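
The same point is visible in the type system: the bytecode uses the int instructions, and the result of the addition is an Int. A small sketch (the object name is illustrative):

```scala
object ShortPromotionDemo {
  // No result type given: the compiler infers Int, because 1 + Short is Int arithmetic.
  def f1(i: Short) = { 1 + i }

  def main(args: Array[String]): Unit = {
    val result: Int = f1(41.toShort) // compiles only because f1 returns Int
    println(result)
  }
}
```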

Belief: Use StringBuilder instead of String
concatenation

def f1(thing: String) = "it's a " + thing + "!!!"
0: ldc #12 // String jeronimo
2: astore_1
3: new #14 // class scala/collection/mutable/StringBuilder
6: dup
7: invokespecial #17 // Method scala/collection/mutable/StringBuilder."<init>":()V
10: ldc #19 // String it's
12: invokevirtual #23 // Method sca(..) /StringBuilder.append:(Ljava/lang/Object;)
15: aload_1
16: invokevirtual #23 // Method sca(...)/StringBuilder.append:(Ljava/lang/Object;)
19: ldc #25 // String !!!
21: invokevirtual #23 // Method scal(...)/StringBuilder.append:(Ljava/lang/Object;
24: invokevirtual #29 // Method scala(..)/StringBuilder.toString:()Ljava/lang/String;
27: astore_2
28: return
FALSE
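
Put differently, for a simple expression the compiler already does what the hand-written version would do. The sketch below compares the two forms; it uses java.lang.StringBuilder for the manual variant, which is an assumption made for illustration (the listing above shows the compiler of that era picking scala.collection.mutable.StringBuilder).

```scala
object ConcatDemo {
  // Plain concatenation: the compiler generates the builder calls for us.
  def concat(thing: String): String = "it's a " + thing + "!!!"

  // Hand-written equivalent with an explicit builder.
  def builder(thing: String): String =
    new java.lang.StringBuilder()
      .append("it's a ")
      .append(thing)
      .append("!!!")
      .toString

  def main(args: Array[String]): Unit =
    println(concat("trap") == builder("trap"))
}
```

Concatenation inside a loop is a different story, since each iteration creates a fresh builder.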

Question: How is Scala’s String interpolation
implemented?

def f1(thing: String) = s"it's a ${thing}!!!"
0: new #12 // class scala/StringContext
3: dup
4: getstatic #18 // Field scala/Predef$.MODULE$:Lscala/Predef$;
7: iconst_2
8: anewarray #20 // class java/lang/String
11: dup
12: iconst_0
13: ldc #22 // String it's a
15: aastore
16: dup
17: iconst_1
18: ldc #24 // String !!!
20: aastore
21: checkcast #26 // class "[Ljava/lang/Object;"
24: invokevirtual #30 // Method (..)/Predef$.wrapRefArray:([Ljava/lang/Object;)
27: invokespecial #34 // Method (..)/StringContext."<init>":(Ls(..)/collection/Seq;)V
33: iconst_1
34: anewarray #4 // class java/lang/Object
37: dup
38: iconst_0
39: aload_1
40: aastore
41: invokevirtual #38 // Method scala/Predef$.genericWrapArray:(Ljava/lang/Object;)Lscala/collection/mutable/WrappedArray;
44: invokevirtual #42 // Method scala/StringContext.s:(Lscala/collection/Seq;)
47: areturn

String interpolation produces rather more
complex bytecode than String concatenation
does.
Whether this causes any side
effects or performance issues, however, is a separate
question.
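
At the source level, the bytecode above corresponds to building a StringContext from the literal parts and calling `s` with the arguments. A minimal sketch of that equivalence (object and method names are illustrative):

```scala
object InterpolationDemo {
  // What we write.
  def interpolated(thing: String): String = s"it's a ${thing}!!!"

  // Roughly what the compiler turns it into: literal parts plus arguments.
  def desugared(thing: String): String =
    StringContext("it's a ", "!!!").s(thing)

  def main(args: Array[String]): Unit =
    println(interpolated("trap") == desugared("trap"))
}
```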

Question: How are lambdas implemented?

class A17() { def f1 = () => "yeah" }
0: new #12 // class A17$$anonfun$f1$1
3: dup
4: aload_0
5: invokespecial #16 // Method A17$$anonfun$f1$1."<init>":(LA17;)V
8: areturn

public final class A17$$anonfun$f1$1 extends scala.runtime.AbstractFunction0<java.lang.String>
0: ldc #18 // String yeah
2: areturn

Lambdas are implemented using inner
classes.

Question: Does it mean it produces an anonymous inner
class for every tiny lambda?

def d2 = {
d1(() => "a")
d1(() => "a")
d1(() => "a")
d1(() => "a")
d1(() => "a")
d1(() => "a")
d1(() => "a")
}

Scala 2.11.7 (default):
0: aload_0
1: new #26 // class A16$$anonfun$d2$1
4: dup
5: aload_0
6: invokespecial #30 // Method A16$$anonfun$d2$1."<init>":(LA16;)V
9: invokevirtual #32 // Method d1:(Lscala/Function0;)Ljava/lang/String;
12: pop
13: aload_0
17: dup
18: aload_0
25: pop
26: aload_0 ….

Scala 2.12-M2 & Scala 2.11.7 + flag:
0: aload_0
1: invokedynamic #52, 0 // InvokeDynamic #0:apply:()Lscala/runtime/java8/JFunction0;
6: checkcast #19 // class scala/Function0
12: pop
13: aload_0
...

Inner classes are created for every lambda in
the system. However, Scala 2.11.7 + flag and
Scala 2.12.x (by default) use
InvokeDynamic to implement lambdas
(similar to how it is implemented in Java 8).

Belief: Scala has tail recursion optimization

final def f1(a: Int): Int = a match {
case 0 => 0
case n => f1(n-1) + 1
}

0: iload_1
1: istore_2
2: iload_2
3: tableswitch { 0: 32
default: 20 }
20: aload_0
21: iload_2
22: iconst_1
23: isub
24: invokevirtual #12 // Method f1:(I)I
27: iconst_1
28: iadd
29: goto 33
32: iconst_0
33: ireturn

@scala.annotation.tailrec
final def f1(a: Int): Int = a match {
case 0 => 0
case n => f1(n-1)
}

0: iload_1
1: istore_3
2: iload_3
3: tableswitch { // 0 to 0
0: 27
default: 20
}
20: iload_3
21: iconst_1
22: isub
23: istore_1
24: goto 0
27: iconst_0
28: ireturn

FALSE? TRUE? … it depends:
only a call in tail position is rewritten into a loop (the goto 0 above).
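
The two variants can be put side by side. This is a minimal sketch with illustrative names; the accumulator version is one common way to move the recursive call into tail position, not something shown on the slides.

```scala
import scala.annotation.tailrec

object TailRecDemo {
  // Not a tail call: the "+ 1" still has to run after the recursive call returns,
  // so every level needs its own stack frame (invokevirtual in the bytecode).
  def depth(a: Int): Int = a match {
    case 0 => 0
    case n => depth(n - 1) + 1
  }

  // Tail call: nothing is left to do after the recursive call, so the compiler
  // can turn it into a jump. @tailrec makes compilation fail if it cannot.
  @tailrec
  def depthAcc(a: Int, acc: Int = 0): Int = a match {
    case 0 => acc
    case n => depthAcc(n - 1, acc + 1)
  }

  def main(args: Array[String]): Unit = {
    println(depth(100))
    println(depthAcc(1000000)) // deep input is fine: constant stack usage
  }
}
```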

Homework:
1. How can an inner class access a private field of the
outer class? How is that even possible?
2. Lambdas: are they given copies of
data by value or by reference?
3. How is Nothing transformed to bytecode?
4. How are traits implemented? (2.11.7 vs 2.12.x)

HotSpot memory organization
http://www.pointsoftware.ch/wp-content/uploads/2012/10/JUtH_20121024_RuntimeDataAreas_1_MemoryModel.png

http://cdn.infoq.com/statics_s2_20150819-0313/resource/articles/G1-One-Garbage-Collector-To-Rule-Them-All/en/resources/fig2largeB.jpg

So what really matters?
Generations

Main assumptions
● Objects die young
● Not many references from old
objects to young objects

And… ?
● Young Generation
○ Eden
○ Survivor Spaces
● Old Generation
○ Tenured
[Diagram: Eden Space | Survivor 1 | Survivor 2 | Tenured Space]

And the rest of process space
● PermGen/Metaspace
○ classes
○ compiled code
● Native (Non-heap)
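
The areas listed above can be inspected from inside a running JVM through the standard java.lang.management API. A minimal sketch follows; the pool names it prints depend on the JVM version and on the selected collector (for example "PS Eden Space" or "G1 Eden Space"), so treat them as examples rather than fixed values.

```scala
import java.lang.management.ManagementFactory

object MemoryPoolsDemo {
  // Returns (pool name, pool type) for every memory pool the JVM reports.
  def pools: List[(String, String)] = {
    val beans = ManagementFactory.getMemoryPoolMXBeans
    (0 until beans.size).map { i =>
      val pool = beans.get(i)
      (pool.getName, pool.getType.toString) // type is heap or non-heap memory
    }.toList
  }

  def main(args: Array[String]): Unit =
    pools.foreach { case (name, kind) => println(s"$name ($kind)") }
}
```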

Young Generation GC Algorithms
● SerialGC
● ParallelGC
● ParNewGC
● G1

Old Generation GC Algorithms
● MarkSweepCompact
● ParallelOldGC
● ConcurrentMarkSweep (CMS)
● G1
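
To see which collectors a given JVM is actually running with, the standard management API can be queried. A minimal sketch (the object name is illustrative; the collector names returned depend on the JVM and its flags):

```scala
import java.lang.management.ManagementFactory

object GcBeansDemo {
  // Returns (collector name, number of collections so far) for each active collector.
  def collectors: List[(String, Long)] = {
    val beans = ManagementFactory.getGarbageCollectorMXBeans
    (0 until beans.size).map { i =>
      val gc = beans.get(i)
      (gc.getName, gc.getCollectionCount)
    }.toList
  }

  def main(args: Array[String]): Unit =
    collectors.foreach { case (name, count) => println(s"$name: $count collections") }
}
```

Typically one bean covers the young generation and another the old generation.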

ParallelOldGC
http://www.oracle.com/technetwork/java/javase/memorymanagement-whitepaper-150215.pdf

ParallelOldGC - compaction

CMS

CMS - fragmentation

G1
http://cdn.infoq.com/statics_s2_20150819-0313/resource/articles/G1-One-Garbage-Collector-To-Rule-Them-All/en/resources/fig2largeB.jpg

object MyBenchmarkLatency {
  @State(Scope.Benchmark)
  class Memory {
    val heap = new Array[Array[Byte]](100)
  }
}

@State(Scope.Thread)
class MyBenchmarkLatency {
  val rand = new Random()
  val base = 3000
  val randBase = 100
  def baseline() = ...
  def testMethod(memory: Memory) = ...
  def fib (max: Int): BigInt = …
}

@Benchmark
@BenchmarkMode(Array(Mode.AverageTime))
@OutputTimeUnit(TimeUnit.MILLISECONDS)
def testMethod(memory: Memory): Unit = {
  // allocate 100 x 1 MB to put pressure on the heap
  for (i <- 0 until 100) memory.heap(i) =
    new Array[Byte](1024 * 1024)
  val result: BigInt =
    fib(base + rand.nextInt(randBase))
  // release the arrays so they become garbage
  for (i <- 0 until 100) memory.heap(i) = null
  result + rand.nextInt()
}

@Benchmark
@BenchmarkMode(Array(Mode.AverageTime))
@OutputTimeUnit(TimeUnit.MILLISECONDS)
def baseline(): Unit = {
  // same computation as testMethod, without the allocations
  val result = fib(base + rand.nextInt(randBase))
}

def fib (max: Int): BigInt = {
  def fibInner (n: Int, val1: BigInt, val2: BigInt): BigInt = {
    Blackhole.consumeCPU(1000L)
    if (n == 0) return val1
    fibInner(n - 1, val2, val1 + val2)
  }
  fibInner(max, 0, 1)
}

val opts:Options = new OptionsBuilder()
  .include("MyBenchmarkLatency")
  .warmupIterations(10)
  .warmupTime(TimeValue.seconds(1))
  .measurementIterations(15)
  .measurementTime(TimeValue.seconds(1))
  .threads(Runtime.getRuntime.availableProcessors())
  .forks(2)
  .detectJvmArgs()
  .jvmArgsAppend("-Xmx1024m", "-XX:+UseConcMarkSweepGC")
  .build()

Results - Latency
CMS          1               2
baseline     9.848 ms/op     9.779 ms/op
testMethod   183.138 ms/op   172.518 ms/op
Parallel     1               2
baseline     9.302 ms/op     9.151 ms/op
testMethod   236.05 ms/op    215.804 ms/op

object MyBenchmarkBatch {
  @State(Scope.Benchmark)
  class Memory {
    val heap = new Array[Array[Byte]](100)
  }
}

@State(Scope.Thread)
class MyBenchmarkBatch {
  @Param(Array("10"))
  var offset: Int = _
  var startPtr: Int = 0
  var endPtr: Int = 0
  def testMethod(memory: Memory) = ...
}

@Benchmark
@BenchmarkMode(Array(Mode.Throughput))
@OutputTimeUnit(TimeUnit.SECONDS)
def testMethod(memory: Memory): Unit = {
  // allocate `offset` small arrays, then clear `offset` slots
  for (i <- startPtr until startPtr + offset) memory.heap(i % 100) = new Array[Byte](1024)
  startPtr += offset
  for (i <- endPtr until endPtr + offset) memory.heap(i % 100) = null
}

val opts:Options = new OptionsBuilder()
  .include("MyBenchmarkBatch")
  .warmupIterations(10)
  .warmupTime(TimeValue.seconds(1))
  .measurementIterations(20)
  .measurementTime(TimeValue.seconds(5))
  .threads(1)
  .forks(1)
  .detectJvmArgs()
  .jvmArgsAppend("-Xmx120m", "-XX:+UseParallelGC")
  .build()

Results - Throughput
CMS          1                  2
testMethod   134813.615 ops/s   135632.189 ops/s
Parallel     1                  2
testMethod   149831.605 ops/s   155873.381 ops/s

JDK/bin
● jstack
● jmap
● jstat
● jconsole
● jvisualvm
● jmc

jstat
● jstat -gc <pid>
S0C     S1C     S0U    S1U    EC       EU      OC       OU      MC      MU      CCSC    CCSU    YGC    YGCT    FGC  FGCT   GCT
2112.0  2112.0  0.0    115.4  16896.0  0.0     42368.0  2827.9  7808.0  7366.1  1152.0  1015.1  17667  25.297  0    0.000  25.297
2112.0  2112.0  114.3  0.0    16896.0  2367.5  42368.0  2839.5  7808.0  7374.2  1152.0  1015.1  30246  43.833  0    0.000  43.833
● jstat -compiler <pid>
Compiled  Failed  Invalid  Time  FailedType  FailedMethod
513       0       0        0.58  0
● jstat -printcompilation <pid>
Compiled  Size  Type  Method
496       24    1     java/io/ObjectOutputStream$ReplaceTable lookup

But what about GC logs?
● ParallelGC
○ Young
○ Old
● CMS
○ Young
○ Old
● G1

ParallelGC
[PSYoungGen: 117951K->17408K(128512K)] 163328K->83265K(217600K),
0,0356419 secs] [Times: user=0,03 sys=0,11, real=0,04 secs]
0,703: [Full GC (Ergonomics) [PSYoungGen: 17408K->0K(128512K)]
[ParOldGen: 65857K->77121K(135168K)] 83265K->77121K(263680K),
[Metaspace: 7223K->7223K(1056768K)], 0,0401522 secs] [Times: user=0,12
sys=0,02, real=0,04 secs]
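
Log lines like these follow a before->after(capacity) pattern, which makes them easy to pick apart. Below is a minimal sketch that extracts the young-generation figures from a ParallelGC line; `GcLogDemo`, `YoungGc` and the regular expression are illustrative and only cover the PSYoungGen fragment shown above, not the full log format.

```scala
object GcLogDemo {
  // Occupancy before and after the collection, and the generation capacity, in KB.
  final case class YoungGc(beforeK: Long, afterK: Long, capacityK: Long)

  // Matches e.g. "[PSYoungGen: 117951K->17408K(128512K)]" anywhere in the line.
  private val PsYoung = """\[PSYoungGen: (\d+)K->(\d+)K\((\d+)K\)\]""".r.unanchored

  def parse(line: String): Option[YoungGc] = line match {
    case PsYoung(before, after, capacity) =>
      Some(YoungGc(before.toLong, after.toLong, capacity.toLong))
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    val line = "[PSYoungGen: 117951K->17408K(128512K)] 163328K->83265K(217600K)"
    parse(line).foreach(gc => println(s"freed ${gc.beforeK - gc.afterK}K in the young generation"))
  }
}
```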

CMS
1,881: [GC (Allocation Failure) 1,881: [ParNew
Desired survivor size 1081344 bytes, new threshold 1 (max 6)
- age 1: 2097184 bytes, 2097184 total
: 18760K->2048K(19008K), 0,0152627 secs] 705206K->704878K(725612K),
1,902: [GC (Allocation Failure) 1,902: [ParNew: 18927K->18927K(19008K),
0,0000199 secs]1,902: [CMS1,902: [CMS-concurrent-abortable-preclean:
0,072/0,678 secs] [Times: user=2,26 sys=0,20, real=0,68 secs]
(concurrent mode failure): 702830K->90458K(706604K), 0,0536025 secs]
721757K->90458K(725612K), [Metaspace: 7235K->7235K(1056768K)],

def fib (max: Int): BigInt = {
  def fibInner (n: Int, val1: BigInt, val2: BigInt): BigInt = {
    if (n == 0) val1
    else fibInner(n - 1, val2, val1 + val2)
  }
  fibInner(max, 0, 1)
}

How to take a thread dump
● jstack [-Flm] <pid>
● kill -3 <pid>
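
When attaching a tool is not an option, a rough equivalent can be produced from inside the process with the standard Thread API. A minimal sketch (the object name and output layout are illustrative; jstack prints considerably more detail, such as locks and native thread ids):

```scala
object ThreadDumpDemo {
  // Builds a simple textual dump: thread name, state, then one line per stack frame.
  def dump(): String = {
    val sb = new StringBuilder
    val it = Thread.getAllStackTraces.entrySet.iterator
    while (it.hasNext) {
      val entry = it.next()
      val thread = entry.getKey
      sb.append("\"" + thread.getName + "\" " + thread.getState + "\n")
      entry.getValue.foreach(frame => sb.append("    at " + frame + "\n"))
    }
    sb.toString
  }

  def main(args: Array[String]): Unit = println(dump())
}
```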

"org.openjdk.jmh.samples.MyBenchmarkLatency.baseline-jmh-worker-3" #13 daemon prio=5 os_prio=0 tid=0x00007fb5c0243000 nid=0x22fb runnable [0x00007fb5a6b50000]
   java.lang.Thread.State: RUNNABLE
        at org.openjdk.jmh.samples.MyBenchmarkLatency.fibInner$1(MyBenchmarkLatency.scala:56)
        at org.openjdk.jmh.samples.MyBenchmarkLatency.fib(MyBenchmarkLatency.scala:59)
        at org.openjdk.jmh.samples.MyBenchmarkLatency.baseline(MyBenchmarkLatency.scala:35)
        at org.openjdk.jmh.samples.generated.MyBenchmarkLatency_baseline.baseline_AverageTime(MyBenchmarkLatency_baseline.java:124)

"org.openjdk.jmh.samples.MyBenchmarkLatency.baseline-jmh-worker-3" #13 daemon prio=5 os_prio=0 tid=0x00007fb5c0243000 nid=0x22fb runnable [0x00007fb5a6b50000]
   java.lang.Thread.State: RUNNABLE
        at java.math.BigInteger.add(BigInteger.java:1315)
        at java.math.BigInteger.add(BigInteger.java:1221)
        at scala.math.BigInt.$plus(BigInt.scala:203)
        at org.openjdk.jmh.samples.MyBenchmarkLatency.fibInner$1(MyBenchmarkLatency.scala:56)
        at org.openjdk.jmh.samples.MyBenchmarkLatency.fib(MyBenchmarkLatency.scala:59)

How to take a heap dump
● jmap -histo <pid>
● jmap -dump:format=b,file=dump.hprof <pid>

Histogram with jmap
num #instances #bytes class name
----------------------------------------------
1: 26031 4361408 [I
2: 24598 983920 java.math.BigInteger
3: 24518 392288 scala.math.BigInt
4: 3846 333024 [C
5: 124 305176 [B
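
The cryptic names in the histogram are JVM descriptors for primitive arrays: [I is an array of int, [C an array of char and [B an array of byte. This can be checked directly from Scala (the object name is illustrative):

```scala
object ArrayNamesDemo {
  // JVM-internal class names of primitive arrays, as they appear in jmap output.
  val intArray: String  = classOf[Array[Int]].getName   // "[I"
  val charArray: String = classOf[Array[Char]].getName  // "[C"
  val byteArray: String = classOf[Array[Byte]].getName  // "[B"

  def main(args: Array[String]): Unit =
    println(List(intArray, charArray, byteArray).mkString(", "))
}
```

In the histogram above, the int arrays most likely back the BigInteger instances created by fib.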

Three modes
● C1
● C2
● tiered compilation

Compilation Threshold
● client
● server
● OSR (On-Stack Replacement)

Some optimizations
● -XX:+DoEscapeAnalysis
● -XX:+Inline
● -XX:FreqInlineSize=N (325 bytes)
● -XX:MaxInlineSize=N (35 bytes)

JMH

Paweł Szulc
@rabbit
Bartek Kaflowski
@bartkaf
