This document discusses the challenges of scaling applications and analyzing large volumes of data. It describes how the core problems, such as parsing, filtering, and analyzing large amounts of data, have remained the same for 30 years despite hardware advances. The document advocates using binary representations and serialization instead of standard Java objects to improve performance for tasks like data processing, distributed computing, and analytics. It provides examples showing how this approach can significantly reduce latency and improve throughput.
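A minimal Python sketch of the binary-representation idea (the talk itself is Java-focused and is not quoted here; the record layout and field names are invented for illustration): rows are packed into one contiguous binary buffer and scanned in place, instead of allocating one object per record.

```python
import struct

# Hypothetical record layout (id: uint32, price in cents: uint64, qty: uint32);
# the fields are invented for illustration.
RECORD = struct.Struct("<IQI")

def write_records(rows):
    """Pack rows into one contiguous binary buffer, no per-row objects."""
    buf = bytearray(RECORD.size * len(rows))
    for i, row in enumerate(rows):
        RECORD.pack_into(buf, i * RECORD.size, *row)
    return bytes(buf)

def total_value(buf):
    """Scan the flat buffer in place, decoding fields only as needed."""
    total = 0
    for off in range(0, len(buf), RECORD.size):
        _id, price, qty = RECORD.unpack_from(buf, off)
        total += price * qty
    return total

buf = write_records([(1, 100, 2), (2, 250, 4)])
print(total_value(buf))  # 1200
```

The flat buffer avoids per-object allocation and garbage-collection pressure, which is the latency and throughput benefit the document describes.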
The Effect of Router Buffer Size on Subjective Gaming Quality Estimators based on Delay and Jitter (Jose Saldana)
Jose Saldana, Julian Fernandez-Navajas, Jose Ruiz-Mas, Eduardo Viruete Navarro, Luis Casadesus, "The Effect of Router Buffer Size on Subjective Gaming Quality Estimators based on Delay and Jitter," in Proc. CCNC 2012 - 4th IEEE International Workshop on Digital Entertainment, Networked Virtual Environments, and Creative Technology (DENVECT), pp. 502-506, Las Vegas, Jan. 2012. ISBN 9781457720697.
All change! How the new Economics of Cloud will make you think differently ab... (JAXLondon_Conference)
The document discusses how the economics of cloud computing will change how Java applications are developed and deployed. Cloud providers charge for computing resources on an hourly basis, incentivizing lighter, more efficient applications. Java applications will need to reduce their memory footprints and startup times to lower costs. Developers will also need to design applications to be resilient to failures and easier to debug remotely without access to instances. The rise of APIs and metering of resources will require Java and the JVM to become leaner and more flexible to run optimally in cloud environments.
Java Generics Past, Present and Future - Richard Warburton, Raoul-Gabriel Urma (JAXLondon_Conference)
This document summarizes the past, present, and future of generics in Java and other languages. In the past, generics were added to Java to provide compile-time type safety. Presently, Java generics are commonly used with collections but wildcards are used less. Future areas of exploration include intersection types, declaration-site variance, value types, and unbounded wildcards. The document discusses challenges with generics and how their usage patterns change as new language features are added.
The document discusses shifting perspectives and provides tips for doing so. It suggests investing more time talking about problems and ideas, exploring alternative paths, and being open to shifting your own perspective. The biggest tip is to put yourself in others' shoes to better understand their worries or viewpoints.
Smoothing the continuous delivery path – a tale of two teams - Lyndsay Prewer (JAXLondon_Conference)
This document discusses best practices for continuous delivery. It describes two teams - a .NET monolith team and a Scala microservices team. The monolith team deploys weekly while the microservices team deploys multiple times per day. The document then outlines best practices for continuous delivery, including healthy continuous integration, testing as an activity, maintaining a "tear drop" shape for test automation, enabling low-cost deployments and rollbacks, and implementing effective metrics and monitoring. It also discusses challenges teams may face and potential accelerators for different environments.
This document provides an introduction to BNP Paribas Commercial Finance and their asset-based lending products. It discusses BNP's presence globally and activities. It then summarizes BNP's evolution from offering simple receivables financing products to developing an asset-based lending proposition. Asset-based lending involves financing against working capital assets like receivables, inventory, and fixed assets. The document describes how BNP finances each type of asset and what types of clients typically use asset-based lending, such as for growth, acquisitions, or changing financial positions.
The document discusses Cassandra and Spark. It describes Cassandra as a distributed database that is good for handling large and changing amounts of data in an always-on system. It then shows how to model transaction data in a Cassandra table, using the card number as the partition key and transaction time as the clustering column. It introduces Spark as a framework for distributed computing that allows processing of data across clusters in-memory using resilient distributed datasets (RDDs). It demonstrates how to read Cassandra data into Spark using the Cassandra connector and perform operations like filtering, mapping, and aggregating before writing results back to Cassandra.
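The table design described (card number as partition key, transaction time as clustering column) might be sketched in CQL as follows; the exact column names and types are assumptions for illustration, not taken from the document:

```sql
-- Hypothetical schema: partition by card number so all of a card's
-- transactions live together, cluster by transaction time (newest first).
CREATE TABLE transactions (
    card_number text,
    txn_time    timestamp,
    merchant    text,
    amount      decimal,
    PRIMARY KEY ((card_number), txn_time)
) WITH CLUSTERING ORDER BY (txn_time DESC);
```

This layout makes "all transactions for a card, most recent first" a single-partition read, which is the access pattern the summary describes.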
The document discusses common mistakes made when benchmarking systems and provides recommendations for improving benchmarking methodology. Some key issues covered include failing to account for hardware caches, using too few samples which can miss outliers, and testing with unrealistic workloads not representative of real-world usage. The document recommends approaches like profiling tools to better understand system behavior, ensuring workloads are representative, eliminating sources of noise, and reproducing results.
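A toy Python illustration of the "too few samples" point: a mean over latency samples can look healthy while a high percentile exposes the tail. The numbers below are fabricated for illustration:

```python
def percentile(samples, p):
    """Nearest-rank percentile over a small in-memory sample set."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, round(p / 100 * (len(s) - 1))))
    return s[k]

# Fabricated data: 98 fast requests at 1 ms, two stalls at 500 ms.
latencies_ms = [1.0] * 98 + [500.0] * 2

mean = sum(latencies_ms) / len(latencies_ms)
p99 = percentile(latencies_ms, 99)
print(mean)  # 10.98 -- looks tolerable
print(p99)   # 500.0 -- the tail the mean hides
```

A small sample that happened to miss the two stalls would report a mean near 1 ms, which is exactly the outlier-blindness the document warns about.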
Events on the outside, on the inside and at the core - Chris Richardson (JAXLondon_Conference)
The document discusses event sourcing and how it enables an event-driven enterprise architecture. It describes how event sourcing works by having applications persist all state changes as a sequence of events rather than current state. These events can then be used to rebuild state. When requests are received, past events are replayed to determine the current state. New events generated from requests are published for other applications. This allows for distributed transaction processing without two-phase commit and enables eventual consistency across services.
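The persist-events-and-replay mechanism can be sketched in a few lines of Python; this is an illustrative sketch under invented names, not code from the talk:

```python
# Illustrative only: state is never stored directly; the current balance
# is rebuilt by replaying the full event log.
class Account:
    def __init__(self):
        self.balance = 0

    def apply(self, event):
        kind, amount = event
        if kind == "deposited":
            self.balance += amount
        elif kind == "withdrawn":
            self.balance -= amount
        return self

def replay(events):
    """Rebuild current state from the full event history."""
    account = Account()
    for event in events:
        account.apply(event)
    return account

log = [("deposited", 100), ("withdrawn", 30), ("deposited", 5)]
print(replay(log).balance)  # 75
```

Because the log, not the balance, is the source of truth, other services can subscribe to the same events, which is what enables the eventual consistency across services that the summary mentions.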
Chris Richardson presented on using a pattern language approach to discuss microservices architecture. He began by explaining the benefits of using patterns to have a more objective discussion of technology approaches. He then covered some core patterns for microservices, including decomposing monolithic applications into microservices and deploying microservices at the service per container level. Finally, he discussed communication patterns for microservices, including using client-side service discovery to address the problem of dynamically discovering services.
This three-day training for elementary teachers focuses on language proficiency and science process skills. Over the three days, teachers will learn how to plan and conduct simple investigations in teaching science. They will practice identifying variables, employing tools to gather data, and using data to explain results. The training covers proper techniques for recording data in tables and graphs and interpreting results. Teachers will participate in a sample investigation on how the height of liquid rising in a paper towel changes over time. They will construct a data table and graph to present their findings.
DevOps and the cloud: all hail the (developer) king - Daniel Bryant, Steve Poole (JAXLondon_Conference)
1) The document discusses the rise of microservices and DevOps approaches in application development and deployment. It notes both the promises and challenges of these approaches, including increased complexity and the need for new tooling.
2) It describes lessons learned from early adoption of microservices, such as the problems that can arise from shared data stores and monolithic upgrades.
3) The document advocates for a "safety first" mindset with DevOps, emphasizing the importance of security, compliance, and understanding where data is located in cloud environments.
The document outlines an agenda for a microservices workshop. It is split into four sessions over half a day covering topics such as why use a microservices architecture, principles and constraints of microservices, splitting services, integration strategies, deployment considerations and more. There will be breaks between the sessions.
This document outlines the objectives and activities of a training on disaster risk reduction. The training aims to: 1) orient participants on disaster risk management and 2) provide skills and techniques to address disaster issues. Participants will be divided into groups that alternate sessions and are expected to conduct school or district level trainings on disaster risk management afterwards. The training covers definitions of disasters, the Philippines' risk profile, and DepEd orders related to disaster risk reduction and response.
The document discusses operational definitions. It begins by stating the expected learning outcomes, which are to identify process skills like defining operationally, understand how these skills are used in teaching and everyday life, and appreciate their importance. Next, it provides examples of operational versus conceptual definitions and explains that operational definitions are based on observable actions rather than abstract concepts. The document emphasizes that operational definitions are important for collecting accurate data and outlines steps for developing them, such as identifying the characteristic of interest and selecting a measuring instrument. Several examples are then given demonstrating how to operationally define variables like enjoyment of reading, temperature effects on evaporation, and size of a person. Finally, the definitions provided are identified as either conceptual or operational.
The document provides guidance on analyzing and interpreting data in teaching elementary science. It discusses the objectives of interpreting data, which include analyzing given data, making interpretations based on evidence, organizing data in different formats, making inferences, and understanding dependent and independent variables. Examples are given of different types of graphs like pie charts, line graphs, and bar graphs that can be used to visualize and analyze data. Steps for interpreting data involve organizing it, creating a graph, looking for trends, making inferences, and checking inferences against existing knowledge. The document emphasizes that interpreting data relies on human judgment and cognition.
The document outlines the objectives and activities of a science session. The objectives are for participants to be able to: identify process skills in developing science ideas like experimenting; perform experimental activities using different variables; and explain the effect of controlled variables, independent variables, and dependent variables in experiments. The document then provides an example experimental design on determining what makes an egg float in water. It includes the problem, hypothesis, materials, procedure, observation, and conclusion.
The document describes the rules and procedures for a nutrition quiz competition with two rounds. The elimination round will have 20 multiple choice questions to be answered in 20 minutes, with only the top 5 scores advancing to the final round. In the final round, scores from the first round will not be considered and questions will be read aloud, with the highest scoring contestant declared the winner and tie-breakers determined through additional clincher questions. The document then provides a sample of 20 multiple choice nutrition questions.
This document defines drugs and explains why people use them. It states that a drug is any substance that alters the body's functions physically or psychologically when ingested. Drugs are categorized as depressants, stimulants, and hallucinogens. Common reasons for drug use include having fun, relaxing, gaining confidence, and coping with problems. The document also describes signs of drug abuse visible in the brain, teeth, and lungs and provides prevention strategies such as getting involved in other activities, talking with others, challenging craving thoughts, and remembering the consequences of drug use.
Science Teaching Approaches and Strategies (majumalon)
The document discusses various teaching approaches and strategies for science. It begins by defining science as a process of logical thinking and testing hypotheses, rather than just memorizing facts. It then outlines three components of science education: knowledge, process skills, and attitudes. Various teaching strategies are presented, including discovery learning, inquiry-based learning using the 5E model, and using discrepant events. The document also discusses characteristics of learners, reflective teaching approaches, and integrative teaching.
JAX London 2015 - Architecting a Highly Scalable Enterprise (C24 Technologies)
At JAX London, John took attendees on a journey where he explained that developers and architects need to consider the fundamentals if they want to design a truly scalable enterprise. That includes understanding the importance of network throughput, packet size, memory allocation, and read/write speeds. Only then can businesses deliver a highly scalable architecture for a real-time, data-driven future.
Speaker: Joseph Rea, Engineer, Confluent
Joseph will talk about how to visualize topics in Apache Kafka®, the difference between a stream and a table in KSQL, and his lessons learned tackling this technical challenge with millions of Kafka messages consumed per second. With such functionality, users can understand their data more easily, in a highly performant and scalable way. This talk covers understanding web workers as they relate to webpack, web socket management, debugging browser performance, and the future of the applications that can now be built.
Joseph Rea started engineering with the LAMP stack, building custom e-commerce checkouts, ERP systems, and enterprise water/sewer billing software. He worked at Yahoo as a front-end engineer in the media org before doing Android and iOS development for the video SDK. He also worked at LifeLock, building an application that updated PII on various service sites. He likes turtles. He currently works at Confluent building so much UI. He blogs at https://cnfl.io/blog-joseph-rea.
- Bill Gates helped make careers in IT possible through his work leading Microsoft and making personal computers widely available. Looking back 20 years to 1988 highlights how much prices have decreased and technology has advanced.
- The document reflects on changes in technology over the past 20 years, from the rise of the internet and web to advances in devices, operating systems, and software. It also discusses challenges still facing the industry around areas like security, interoperability, and managing increasingly complex systems.
- While Microsoft faced many lawsuits over the years, it helped drive innovation and growth in the industry overall. The future looks to continued advances in areas like cloud computing, virtualization, social networks, and making systems more efficient and environmentally friendly.
Kernel Recipes 2019 - Metrics are money (Anne Nicolas)
In I.T. we all use all kinds of metrics. Operations teams rely heavily on these, especially when things go south. These metrics are sometimes overrated. Let’s dive into a few real-life stories together.
Aurélien Rougemont
This presentation, given as part of the Software Architecture course at Brunel University, discusses the interplay between architecture and design. It argues that the designer and the architect are really different roles, and ones that often have competing goals.
The document discusses major trends in technology that are driving changes to data architecture. It covers:
1. Exponential growth in data and computing power due to Moore's Law, which is causing databases and hardware to rapidly scale up in size and capabilities.
2. Emergence of parallel processing and new hardware like GPUs, FPGAs and fast networks to handle large volumes of data in real-time. Memory and SSDs are replacing disks.
3. These changes require new architectural approaches compared to traditional centralized and client-server models. Event-driven architectures using stream processing and taking compute to the data are discussed.
4. The need to support different data flows, such as real-time and analytical.
This document provides an introduction to big data and MapReduce frameworks. It discusses:
- What big data is and examples of large datasets.
- An overview of MapReduce, including how it allows programmers to break problems into parallelizable map and reduce tasks.
- Details of how MapReduce frameworks like Apache Hadoop work, including distributed processing, fault tolerance, and the roles of mappers, reducers, and other components.
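The map/shuffle/reduce split described above can be made concrete with a toy, single-process word count in Python; this is illustrative only, since frameworks like Hadoop run these phases in parallel across many machines:

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit (word, 1) for every word in a line."""
    for word in line.split():
        yield (word, 1)

def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reducer: combine all values for one key."""
    return (key, sum(values))

lines = ["to be or not to be"]
pairs = [kv for line in lines for kv in map_phase(line)]
result = dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
print(result["to"], result["be"])  # 2 2
```

Because each mapper call and each reducer call is independent, the framework can scatter them across a cluster and restart failed tasks, which is where the distributed processing and fault tolerance mentioned above come from.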
Sql server 2016 it just runs faster sql bits 2017 editionBob Ward
SQL Server 2016 includes several performance improvements that help it run faster than previous versions:
1. Automatic Soft NUMA partitions workloads across NUMA nodes when there are more than 8 CPUs per node to avoid bottlenecks.
2. Dynamic memory objects are now partitioned by CPU to avoid contention on global memory objects.
3. Redo operations can now be parallelized across multiple tasks to improve performance during database recovery.
The document discusses a company that faced scalability issues with their infrastructure as their product Inoreader grew in popularity. To address these issues, they migrated their servers to a fully virtualized environment using OpenNebula for virtualization and StorPool for distributed storage. This provided significant performance gains and increased capacity, allowing them to run the same workloads with fewer physical servers and benefit from high availability and redundancy. The migration process involved setting up StorPool nodes, virtualizing servers incrementally, and migrating VMs and data to the new infrastructure over multiple iterations.
1. Building exascale computers requires moving to sub-nanometer scales and steering individual electrons to solve problems more efficiently.
2. Moving data is a major challenge, as moving data off-chip uses 200x more energy than computing with it on-chip.
3. Future computers should optimize for data movement at all levels, from system design to microarchitecture, to minimize energy usage.
The document discusses the key factors in designing a high performance computing system: bandwidth, throughput, IOPS, and FLOPS. It describes a project to design a system to run seismic processing jobs within 50 days. The system was designed around the tasks, with 1250 dual-socket quad-core nodes, a 10 GigE network, and direct-connected disk storage. Initial startup problems were resolved by staggering job starts and optimizing I/O. The completed system met the project needs and was able to handle additional workloads.
Making the Most of In-Memory: More than SpeedInside Analysis
The document discusses how in-memory platforms are more than just speed - they are designed to efficiently exploit RAM and are optimized for analytics workloads that involve complex "crunching" of data. It explains that analytics workloads are CPU-intensive and benefit from techniques like parallelization across CPU cores. Additionally, the document notes that declining RAM prices and interest in advanced analytics are driving more adoption of in-memory platforms for both large and small data use cases.
Based on the popular blog series, join me in taking a deep dive and a behind the scenes look at how SQL Server 2016 “It Just Runs Faster”, focused on scalability and performance enhancements. This talk will discuss the improvements, not only for awareness, but expose design and internal change details. The beauty behind ‘It Just Runs Faster’ is your ability to just upgrade, in place, and take advantage without lengthy and costly application or infrastructure changes. If you are looking at why SQL Server 2016 makes sense for your business you won’t want to miss this session.
Making Sense of Spark Performance-(Kay Ousterhout, UC Berkeley)Spark Summit
This document summarizes the key findings from a study analyzing the performance bottlenecks in Spark data analytics frameworks. The study used three different workloads run on Spark and found that: network optimizations provided at most a 2% reduction in job completion time; CPU was often the main bottleneck rather than disk or network I/O; optimizing disk performance reduced completion time by less than 19%; and many straggler causes could be identified and addressed to improve performance. The document discusses the methodology used to measure bottlenecks and blocked times, limitations of the study, and reasons why the results differed from assumptions in prior works.
Vectorization is a new database technology that provides significant performance improvements through parallel processing. It fully utilizes multiple types of parallelism including symmetric multiprocessing (SMP), massively parallel processing (MPP) clusters, graphics processing units (GPUs), and vector processing instructions in Intel CPUs. Early adopters using fully vectorized databases are seeing dramatically lower costs and ability to handle new types of workloads and applications compared to traditional database technologies.
This document summarizes a lecture on file systems and performance. It discusses the read/write process for magnetic disks involving seek time, rotational latency, and transfer time. Typical numbers for these parameters in magnetic disks are provided. Flash/SSD memory is also discussed as an alternative storage technology with advantages like low latency, no moving parts, and high throughput but also drawbacks like limited endurance. The document introduces concepts from queueing theory that can help analyze the performance of I/O systems, like modeling request arrival and service times as probabilistic distributions. Key metrics like response time and throughput are discussed for evaluating I/O performance.
Spanner : Google' s Globally Distributed DatabaseAhmedmchayaa
Spanner is Google's globally distributed database that provides synchronous replication across data centers for strong consistency. It uses TrueTime to synchronize clocks across data centers and provide a consistent view of data to users. The architecture of Spanner involves splitting tables into shards called "splits" that are replicated across multiple zones for high availability. Transactions in Spanner are globally consistent yet remain highly available and partition tolerant, making Spanner a CA (Consistent and Available) system according to the CAP theorem.
This document provides a syllabus for a 3-month basic computer course covering topics like fundamentals of computers, Windows XP, email and internet, application software, and practical activities. The fundamentals section includes components of a computer, memory devices, types of computers and their characteristics. Windows XP topics cover basic terminology, desktop settings, MS DOS, Paint, WordPad and accessories. Email and internet section discusses email accounts, search engines and surfing websites. Application software focuses on Microsoft Word, PowerPoint and Excel. Practical activities include Windows 7 overview, burning CDs/DVDs, audio/video editing, hardware/networking basics and installing Windows.
Architecting for a scalable enterprise - John Davies
2. C24
PROBLEMS DON'T CHANGE, THEY JUST GET BIGGER
I’ve been working for 30 years now and I still see the same problems over and over again - history repeating itself.
It’s true we have some new ones, but they’re just reincarnations of the old ones.
You could argue that IoT, social media and eCommerce are all new (since I started, anyway), but the problems are the same…
Massive volumes of data
It needs parsing, filtering, sorting, analysing, alerts, triggers, reporting, compliance… Same ol’ same ol’.
3. 30-ISH YEARS AGO
I can remember in 1987 we re-wrote a trading system in Objective-C.
In those days there was a “war” between Objective-C and C++. History tells us C++ won, but it wasn’t a clear victory, as Apple shows us today.
Anyway, one of the problems we had was loading in all of the exchange rates as the system started up.
We used 80386s (the latest 32-bit CPU)
The big machines had 1MB (yes, 1 megabyte) of RAM
WAN was a 2400 baud modem (2.4k bits per second)
Network was 1MB token-ring (40k/sec on a good day)
20MB hard disk on the top machines (80ms access time)
4. SOUND FAMILIAR?
Hundreds of currency pairs (USD/GBP) - remember, this was pre-Euro, so more currencies
A dozen forward rates (spot, 1 week, 1 month etc.)
Several changes per second (coming down the modem @ 2400 baud)
These are some of the problems we faced…
When the machines started up in the morning it took a long time to get the current snapshot to the traders’ machines
Each new rate (GBP/USD=1.68750) took a long time to update on the clients’ machines, something due to the new object-oriented model we’d used
Querying the data was slow
Storing and querying the historic data presented serious issues
5. C24
HARDWARE TO THE RESCUE!
You could do in a few seconds on an iPhone what took a day
to do in those days but that’s thanks to the hardware changes
NOT THE SOFTWARE
The problems and architecture remain the same, we just have
a lot more data because we’re now global
We want every trade quoted on every exchange in every
country on every trader’s desk in every office in every country
Competition has just meant that as soon as one bank does it,
you have to do better otherwise you lose the deals and go
bankrupt
6. C24
IT’S NOT JUST THE CPU
Network speeds
20 years ago we used wired modems - 38k (bits/sec) was
good; we now expect mobile networks to give us 50+Mbit/s,
that’s >1000x
Screens
My laptop (2 years old) is 2880x1800 and “full colour”; 20 years ago
1024x768 in 16 colours was good, but the real change is 3D and OpenGL
etc.
Low-res 3D pie charts needed a maths co-processor to display them in
under a second; today we expect realistic 3D in real-time at full resolution
Today’s graphics cards are computers within computers
Memory
We used to have to page 1MB of RAM in and out (memory banks); today
we have 16GB of RAM in a laptop and 128GB of SSD on a keyring
7. C24
RAW POWER
Apollo 11’s guidance computer had just 2k
of memory and 32k of read-only storage
But it got it to the moon - and back!
Most of the time :-)
The backup was a slide rule
Today you can compress a full 1080p
movie into about 1GB and watch it on your
mobile phone
Why then do we have problems getting XML into memory?
8. C24
50 YEARS ON AND MOORE’S LAW IS STILL WITH US
His article “Cramming more components onto
integrated circuits” predicted the future of transistor
density based on a simple doubling every 2 years
Every few years we’re told this
has to end but someone comes
up with a new idea and it just
keeps going
Eventually it will flatten, but it’s got a long way to go yet,
so there are exciting times ahead
10. C24
SOFTWARE IS SLOWING US DOWN
Programmers are lazy, I’m one, I gave all my demos to
someone
else to write while I write these slides (thanks Iain)
We’ve added layer upon layer of abstraction to hide the
complexity and hardware - Good but it slows things down
We simplified programming with drag-n-drop
now even kindergarten kids can program
Twenty years ago Java was introduced to the
world, it took away all the problems we had
with memory management and hardware architectures
It was cool then and I think it’s still cool now but it does
have a lot of issues, many of which we can work around
12. C24
JAVA IS VERBOSE AND SLOW
OK, don’t get offended, Java is a great language and small
applications are often as fast as C/C++, the JIT compiler is
seriously powerful
BUT
For high-volume data processing, distributed computing and
analytics, Java performance sucks
BUT
It is still the best we have so we just need to improve the way it
works
13. C24
GARBAGE COLLECTION
Programmers make mistakes; in the days of C/C++ a mistake
crashed or hung the machine, today it just hangs or
crashes the JVM
Memory management was supposed to help but you can
bring your entire machine to a grinding halt with ease…
Concatenating Strings in a loop
Adding to a collection and forgetting to clear it
Processing too much data
The JVM and garbage collection don’t fix your bad
programming, they just limit the damage it can cause
This comes at a cost too
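The String-concatenation pitfall above is easy to demonstrate. As a rough sketch (class and method names are mine, not from the deck), `+` in a loop allocates a fresh String on every iteration, while a StringBuilder grows one buffer:

```java
// Sketch: why concatenating Strings in a loop hammers the garbage collector.
// Each `s += ...` builds a brand-new String (plus a hidden StringBuilder),
// so n iterations create O(n) throw-away objects for the GC to clean up.
public class ConcatDemo {

    static String concatWithPlus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;            // new String object on every iteration
        }
        return s;
    }

    static String concatWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);      // appends into one growing buffer
        }
        return sb.toString();  // single String allocated at the end
    }

    public static void main(String[] args) {
        // Both produce the same result; only the allocation behaviour differs.
        System.out.println(concatWithPlus(5));     // 01234
        System.out.println(concatWithBuilder(5));  // 01234
    }
}
```

The results are identical; the difference only shows up as GC pressure once `n` gets large.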
14. C24
SERIALISATION
Java Serialisation sucks - full stop!
It’s so bad there are over 2 dozen open source frameworks to replace it
Almost all of the In-Memory Data Grids (IMDGs) have alternatives to
native serialisation
A serialised Java object is usually larger in size than its XML equivalent
Even the process of serialisation and de-serialisation is slow, extremely
slow
We’ll come back to this later
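The size overhead is easy to measure yourself. This sketch (the `Rate` class is a made-up example, not from the deck) serialises an object whose only payload is one 8-byte long and reports how many bytes the stream actually produces:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Sketch: measuring the overhead of built-in Java serialisation.
// A class holding a single 8-byte long serialises to far more than
// 8 bytes, because the stream embeds the class name, the
// serialVersionUID and other metadata alongside the field data.
public class SerialSize {

    static class Rate implements Serializable {
        private static final long serialVersionUID = 1L;
        long scaled;                       // e.g. 1.68750 stored as 168750
        Rate(long scaled) { this.scaled = scaled; }
    }

    static int serialisedSize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        int size = serialisedSize(new Rate(168750L));
        // The payload is 8 bytes; the serialised form is several times larger.
        System.out.println("payload: 8 bytes, serialised: " + size + " bytes");
    }
}
```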
15. C24
ANALYTICS & BIG DATA
There are essentially 4 options
Do it in memory on one machine - fast but limited size
Do it off disk on one machine - slow due to disk I/O and limited CPU
Use distributed memory - faster but not linearly faster than one
machine
Use distributed disk - fast due to more CPU but limited by disk I/O
If we could somehow improve GC, network serialisation
and disk I/O we could vastly improve both latency
(time to results) and throughput (complexity of results)
Heard of Hadoop and Spark?
16. C24
HADOOP
To run a Hadoop query…
First understand the data you’re analysing so that you can extract it
Write some code to extract, transform and load the data into HBase/
HDFS
This can take days or weeks to code
And can take hours or days to run
Now fire up Hadoop to get your answer - more time because it’s on
disk and distributed - It’s SLOW
Make one small change and you’re back to square one
Query to result can take weeks
17. C24
SPARK
Spark is faster because it runs in memory but it still has the
overhead of Java Serialisation for distribution
There are two modes, cached and un-cached
As the name would suggest un-cached is off disk so we’re back to
serialisation costs again
Spark can use Kryo to improve serialisation; this is good but
means writing code, and it’s not practical for complex data
models
Spark is an improvement on Hadoop but still limited by Java
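Switching Spark to Kryo is itself only a small configuration change. A minimal sketch (the `SparkConf` calls are from the Spark API; `Trade.class` stands in for your own data model and is hypothetical here):

```java
import org.apache.spark.SparkConf;

// Configuration sketch: enabling Kryo serialisation in Spark.
// Registering classes up front lets Kryo write compact class IDs
// instead of full class names.
SparkConf conf = new SparkConf()
    .setAppName("binary-demo")
    .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .registerKryoClasses(new Class<?>[]{ Trade.class });
```

The catch the slide describes is that every type in a complex model needs registration (or falls back to slower generic handling), which is where hand-written Kryo stops scaling.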
19. C24
IN THE MORE RECENT PAST…
At JAX Finance earlier this year I introduced the idea of
using binary instead of classic Java objects
This is really bringing the skills we used 20 years ago in
C and C++ back into the Java world
As long as your getter() returns the Object you
expected, why should you care whether it was stored in binary
or as a Java object?
This is, after all, the beauty of abstraction
Let’s just see what this does again…
20. C24
SAME API, JUST BINARY
Classic getter and setter vs. binary implementation
Identical API
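The slide's code screenshots aren't reproduced here, but the idea can be sketched in plain Java. The field layout below (a 7-byte ASCII pair name at offset 0, a double at offset 8) is an illustrative assumption, not C24's actual format; the point is that the caller sees an ordinary getter while the state lives in one binary buffer instead of object fields:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch of "same API, just binary": callers use ordinary getters and
// setters, but the state lives in a fixed-layout binary buffer rather
// than in Java object fields, so there is nothing extra to serialise.
public class BinaryRate {
    private static final int PAIR_OFFSET = 0;   // 7 bytes: "GBP/USD"
    private static final int RATE_OFFSET = 8;   // 8 bytes: double

    private final ByteBuffer buf = ByteBuffer.allocate(16);

    public void setPair(String pair) {
        byte[] b = pair.getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i < 7; i++) buf.put(PAIR_OFFSET + i, b[i]);
    }

    public String getPair() {
        byte[] b = new byte[7];
        for (int i = 0; i < 7; i++) b[i] = buf.get(PAIR_OFFSET + i);
        return new String(b, StandardCharsets.US_ASCII);
    }

    public void setRate(double rate) { buf.putDouble(RATE_OFFSET, rate); }
    public double getRate()          { return buf.getDouble(RATE_OFFSET); }

    public static void main(String[] args) {
        BinaryRate r = new BinaryRate();
        r.setPair("GBP/USD");
        r.setRate(1.68750);
        // Identical API to a classic POJO; storage is one 16-byte buffer.
        System.out.println(r.getPair() + "=" + r.getRate());
    }
}
```

Sending the object over the wire or to disk is then just copying the 16 bytes, with no serialisation step at all.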
22. C24
TIME TO ACCESS DATA…
I scaled by 1 million times simply because that’s roughly the ratio between a modern airplane and the
speed of light
Event                                  Latency (approx.)   Scaled x 1 million
1 CPU cycle                            0.3 ns (3.5 GHz)    0.3 ms (3.5 kHz)
Level 1 cache access                   0.9 ns              0.9 ms
Level 2 cache access                   2.8 ns              2.8 ms
Level 3 cache access                   12.9 ns             12.9 ms
Main memory access (DRAM)              120 ns              120 ms (1/8th sec)
Solid-state disk I/O (SSD)             50-150 µs           50-150 seconds
Read 1MB sequentially from SSD         1 ms                17 minutes
Rotational disk I/O                    1-10 ms             17 mins-2.8 hours
Read 1MB sequentially from disk        20 ms               5.6 hours
Network SF to NY (round trip)          40 ms               11 hours
Network London to Tokyo (round trip)   81 ms               1 day
Network SF to Oz (round trip)          183 ms              2 days
TCP packet retransmit                  1-3 s               2-6 weeks
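The "scaled" column is just each latency multiplied by one million and rendered in a human-friendly unit; a small sketch (names are mine) reproducing a few rows:

```java
import java.util.Locale;

// Sketch: reproducing the table's "scaled x 1 million" column.
// Multiplying by 1e6 means nanoseconds read as milliseconds, and
// millisecond-scale events blow up to minutes, hours or days.
public class LatencyScale {

    // Scale a latency (given in seconds) by one million and render it.
    static String scaled(double seconds) {
        double s = seconds * 1_000_000;
        if (s < 1)     return String.format(Locale.ROOT, "%.1f ms", s * 1000);
        if (s < 60)    return String.format(Locale.ROOT, "%.0f seconds", s);
        if (s < 3600)  return String.format(Locale.ROOT, "%.0f minutes", s / 60);
        if (s < 86400) return String.format(Locale.ROOT, "%.1f hours", s / 3600);
        return String.format(Locale.ROOT, "%.0f days", s / 86400);
    }

    public static void main(String[] args) {
        System.out.println(scaled(120e-9)); // DRAM access, 120 ns -> 120.0 ms
        System.out.println(scaled(1e-3));   // 1MB from SSD, 1 ms -> 17 minutes
        System.out.println(scaled(40e-3));  // SF-NY round trip -> 11.1 hours
    }
}
```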
23. C24
COMPARING…
These two graphs show the GC pause time during
message creation and serialisation
Left is “classic” Java
Right is the binary version
The top of the right hand graph is lower than the first rung of the left
(50ms)
25. C24
SERIALISATION
Serialisation was compared (by a client) to several dozen
serialisation frameworks
The test framework can be found here:
https://github.com/eishay/jvm-serializers/
Preon was either at the top or within 5%
of the top
However the use-case was very
simple, SDOs work better with more
complex models
[Charts: serialise and deserialise times across ~45 frameworks - c24-sdo, wobly, protostuff, protobuf, kryo and thrift at the fast end, through fst, the jackson/json/xml variants and hessian, down to java-built-in at the bottom, slowest by a wide margin]
27. C24
SDOS & IMDGS
Performance and memory gains can be very significant
The following demonstrates the different storage capacity of XML,
standard Java objects and binary when storing the same XML (in this
case FpML and ISO-20022)
The IMDG in this specific case is Coherence, but others are similar
28. C24
BINARY WITH SPARK
Compare the red (cached memory) and there’s no
difference, but compare cached disk
Since binary serialisation to/from disk is almost cost-free,
we see almost no degradation from memory to disk
Using binary with disk is about 20 times faster than the
best alternative and less than half the speed of cached
memory
Using binary on disk with Spark offers the volumes of
Hadoop and the performance of Spark
29. C24
WHAT’S LEFT TO TALK ABOUT?
Spring, Groovy, C#, Go, Scala, Clojure, Swift
MicroServices, Docker, CloudFoundry
ESBs, SOA, JEE
Databases, RDBMS and NoSQL
In-Memory Data Grids (IMDGs)
Cloud
IoT
PaaS / SaaS / IaaS / YaSaaS
Virtualisation
30. C24
NO MORE TIME SADLY
31. C24
SPRING BOOT
Spring Boot is becoming the de facto framework for Java-
based applications
With very little else, a Spring Boot application can be deployed
on a local machine, onto a server, into a data centre (private cloud)
or the cloud (a data centre where you don’t know the address)
Unless you’re going enterprise scale with configurable
workflow, high availability and automated scalability, it’s
difficult to justify MicroServices
Given the option we use Spring Boot with our clients and they
seem to get hooked on the simplicity and power
32. C24
MICROSERVICES
There are two types of MicroService or two main “needs”
To be able to package the entire application and deploy it in one go
To be able to manage deployment and life-cycle of large scale systems
The Java Virtual Machine with Spring Boot and Maven
(cough splutter) goes a long way to providing all the
functionality of the first need
At enterprise scale then we need more than a packaged
application - Cloud Foundry / BlueMix etc.
This is usually the realm of the DevOps guys (and gals), not the Java/
Spring programmer
In a perfect world programmers need an abstraction from MicroServices
implementations; Spring Boot goes a long way to providing this