Microservices: Terabytes in Microseconds
Per Minborg
CTO Speedment, Inc.
Malin Weiss
CMO Speedment, Inc.
About Us
Microservices
• Microservices bring together best practices from a variety of
areas
• Most likely you are already using some of these best practices
Microservices
• It sounds like marketing hype
• It all sounds pretty familiar
• It is just a rebranding of stuff we already do
• There is room for improvement in what we do
• There are some tools and ideas we could apply to our
systems without changing too much
• Simple component based design
• Asynchronous messaging
• Automated, dynamic deployment of services
• Service private data sets
• Transparent messaging
• Teams can develop independently based on well defined
contracts
Microservices
Terabytes in Microseconds Solution
[Chart: latency (μs, ms, s, min, hour, days) versus data volume (GB, TB, EB), with Speedment and Chronicle marked]
Terabytes in Microseconds Solution
Maps the data in a key-value store
Synchronizes SQL Data into an in-JVM-memory data store
Maps the data in Column key stores
Provides application APIs
Terabytes
1. Synchronization
2. Key Value Store
3. Column Key Stores
4. API
Microservice 3
Microservice 2
Microservice
1. Synchronization
• Synchronizes SQL Data with the in-JVM-memory solution:
• Initially
• Over time
• Periodically
• Reactively
1. Synchronization
Periodically
• Dumps are reloaded periodically
• All data elements are reloaded
• Data remains unchanged between reloads
• System restart is just a reload
• MVCC so that several dumps can be active at the same time
or
Reactively
• Changed data is captured in the database
• Changed data events are pushed into memory
• Events are grouped in transactions
• Cache updates are persisted
• Data changes all the time
• System restart: replay the missed events
1. Synchronization
Property | Periodically | Reactively | Poll
Max Data Age | Dump period | Replication latency (ms) | Eviction time
Lookup Performance | Consistently instant | Consistently instant | 20% very slow
Consistency | Eventually consistent | Eventually consistent | Inconsistent (stale data)
Database Cache Update Load | ~ Total size / Dump period | ~ Rate of change | ~ Cache size * Eviction time
Restart | Complete reload | Down time * update rate -> 10% of down time | Heat-up time
1. Synchronization Properties
• Detect changes in a database
• Buffer the changes
• Can replay the changes later on
• Preserve order
• Preserve transactions
• See data as it was persisted
• Detect changes from any source
JVM
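The properties listed above (buffer the changes, preserve order, preserve transactions, replay later) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class `ChangeReplaySketch`, its `capture` and `replay` methods, and the use of a plain on-heap `HashMap` are hypothetical and are not the Speedment API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names are illustrative, not from any real library.
class ChangeReplaySketch {

    // One captured row change; a null value stands for a delete.
    static final class Change {
        final long key;
        final String value;
        Change(long key, String value) { this.key = key; this.value = value; }
    }

    // Buffered transactions, kept in the order they were committed.
    private final List<List<Change>> log = new ArrayList<>();
    // The in-memory copy of the data.
    private final Map<Long, String> store = new HashMap<>();
    // Index of the first transaction that has not been applied yet.
    private int nextTx = 0;

    // Buffer one whole transaction as it is detected in the database.
    void capture(List<Change> transaction) {
        log.add(new ArrayList<>(transaction));
    }

    // Apply every missed transaction in order; returns how many were applied.
    int replay() {
        int appliedNow = 0;
        while (nextTx < log.size()) {
            for (Change c : log.get(nextTx)) {
                if (c.value == null) {
                    store.remove(c.key);
                } else {
                    store.put(c.key, c.value);
                }
            }
            nextTx++;
            appliedNow++;
        }
        return appliedNow;
    }

    String get(long key) {
        return store.get(key);
    }

    public static void main(String[] args) {
        ChangeReplaySketch sync = new ChangeReplaySketch();
        sync.capture(Arrays.asList(new Change(1L, "alice"), new Change(2L, "bob")));
        sync.capture(Arrays.asList(new Change(1L, null)));
        System.out.println(sync.replay() + " transactions applied, key 2 = " + sync.get(2L));
    }
}
```

Because `replay` only advances past whole transactions, a restarted consumer can pick up from its last applied position without seeing a half-applied transaction.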
1. Synchronization
How is data synchronized?
•Periodic MVCC reload
•Transaction log harvesting
•Event Sourcing and CQRS
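As a rough illustration of the periodic MVCC reload idea, the sketch below publishes each freshly loaded dump as an immutable snapshot, so readers holding an older snapshot keep a consistent view while a newer one is already active. The class `SnapshotReloadSketch` and the `dumpLoader` supplier are hypothetical names, and a real implementation would load the dump from the database rather than from a lambda.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch of periodic reload with snapshot isolation.
class SnapshotReloadSketch {

    // The currently published snapshot; replaced as a whole on every reload.
    private final AtomicReference<Map<Long, String>> current =
            new AtomicReference<>(Collections.<Long, String>emptyMap());

    // Load a complete new dump and publish it atomically.
    void reload(Supplier<Map<Long, String>> dumpLoader) {
        Map<Long, String> fresh =
                Collections.unmodifiableMap(new HashMap<>(dumpLoader.get()));
        current.set(fresh);
    }

    // Readers take a snapshot once and use it for the whole query.
    Map<Long, String> snapshot() {
        return current.get();
    }

    public static void main(String[] args) {
        SnapshotReloadSketch cache = new SnapshotReloadSketch();
        cache.reload(() -> Collections.singletonMap(1L, "v1"));
        Map<Long, String> old = cache.snapshot();
        cache.reload(() -> Collections.singletonMap(1L, "v2"));
        // The old snapshot is unchanged; new readers see the new dump.
        System.out.println(old.get(1L) + " " + cache.snapshot().get(1L)); // prints "v1 v2"
    }
}
```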
2. Key Value Store
Makes it possible to:
• Have Maps that can exceed the server’s RAM by up to 40x
• Open up the path to mammoth JVMs with tens of terabytes
• Restart fast, because data is persisted inside the JVM
• Share the data between multiple instances of a service, with a single copy in memory
How Much Data Can We Have?
• JVM (32), 32-bit (2-3 GB)
• Java 7 (64), -XX:+UseCompressedOops, 35 bit (32 GB)
• Java 8 (64), -XX:ObjectAlignmentInBytes=16, 36 bit (64 GB)
• JVM (64), 64 bit references, 100 GB+ (G1 collector, GC concerns)
• NUMA Regions (40 bits or 46 bits), GC over NUMA borders is a
problem
• Virtual Address Space (48-bit)
• Memory Mapped Files (48+ bits)
• Peta Byte JVMs (50+ bits) with library support
Virtual Address Space (48-bit)
Peta Byte JVMs (50+ bits)
3. Column Key Store
• Column Key Stores are used much like database indexes
• The Column Key Stores can produce data lazily
• Can be mapped onto memory mapped files
3. Column Key Store
• Features over ConcurrentNavigableMap
• Handles null values
• Can handle several values for one key
• Can serialize data off-heap
• Can map onto files
• Can compress its keys
• Contains a bi-directional skip dictionary
• Remains performant for billions and trillions of entries
• Supports transactions
3. Column Key Store
[Diagram: a mapped buffer in the Microservice JVM (user space) is page-mapped onto filesystem pages in the page cache (kernel space), which correspond to memory pages in physical memory and to disk blocks on SSD]
3. Column Key Store
[Diagram: mapped buffers in Microservice JVM 1 and Microservice JVM 2 (user space) are page-mapped onto the same filesystem pages in the page cache (kernel space), which correspond to memory pages in physical memory and to disk blocks on SSD]
3. Column Key Store
• O(1) operations:
• equals, notEquals
• isNull, isNotNull
• in, notIn
• sort
• O(log N) operations
• <, <=, >, >=
• between
• startsWith(String)
• O(N)
• matches(regexp)
3. Column Key Store
3. Column Key Store
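[Diagram: the User entity has a Key-Value Store keyed on ID, plus one Column Key Store each for AGE and NAME. The predicate User.AGE.greaterThan(42) is applied to the AGE Column Key Store, which yields keys; the keys are then resolved to values in the Key-Value Store]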
3. Column Key Store - The Stream Pipeline
[Diagram: greaterThan(42) is applied to the AGE Column Key Store, which yields keys; the keys are resolved to values in the Key-Value Store]
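The index-then-lookup flow can be sketched with standard collections. This is only an approximation under stated assumptions: a `TreeMap` stands in for the Column Key Store and a `HashMap` for the Key-Value Store, both on-heap, and the names `ColumnIndexSketch`, `put`, and `ageGreaterThan` are hypothetical. The sketch does not handle updates of an existing user.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch: a sorted column index in front of a key-value store.
class ColumnIndexSketch {

    static final class User {
        final long id;
        final String name;
        final int age;
        User(long id, String name, int age) { this.id = id; this.name = name; this.age = age; }
    }

    // Key-value store: primary key -> entity.
    private final Map<Long, User> keyValueStore = new HashMap<>();
    // Column index: AGE value -> primary keys of users with that age.
    private final NavigableMap<Integer, List<Long>> ageIndex = new TreeMap<>();

    void put(User u) {
        keyValueStore.put(u.id, u);
        ageIndex.computeIfAbsent(u.age, a -> new ArrayList<>()).add(u.id);
    }

    // Range predicate: an O(log N) seek into the index, then key -> value lookups.
    List<User> ageGreaterThan(int age) {
        return ageIndex.tailMap(age, false).values().stream()
                .flatMap(List::stream)        // keys produced by the column index
                .map(keyValueStore::get)      // values resolved from the key-value store
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ColumnIndexSketch store = new ColumnIndexSketch();
        store.put(new User(1L, "ann", 30));
        store.put(new User(2L, "bo", 43));
        store.put(new User(3L, "cy", 42));
        store.put(new User(4L, "di", 50));
        for (User u : store.ageGreaterThan(42)) {
            System.out.println(u.id + " " + u.name + " " + u.age);
        }
    }
}
```

Only the matching part of the index is visited, so users aged 42 or younger are never touched by the query.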
3. Column Key Store - Parallelism
[Diagram: greaterThan(42) is applied to the AGE Column Key Store; keys are produced in parallel and values are resolved in parallel from the Key-Value Store, with an optional parallel strategy]
4. API
• Provides application APIs
• Java 8 Stream API
• REST API
4. API
• Generates Java code
automatically from SQL metadata
• Free from hand coding errors
• Reduces development time
• Improves the application quality
• Reduces maintenance for
applications
• Uses standard streams for
querying
• Custom code possibilities
4. API
Java 8 Stream API
users.stream()
    .filter(AGE.greaterThan(42))
    .forEach(someOperation());
REST API
https://mysite/user/?filter=[{"property":"age","operator":"gt","value":42}]
Querying In-Memory Data Using Streams
Pipeline
• Source: Map<Long, User> keyValueStore.values()
• Filter: AGE.greaterThan(42)
• Terminal operation: forEach()
users.stream()
    .filter(AGE.greaterThan(42))
    .forEach(someOperation());
Querying In-Memory Data Using Streams
Pipeline
• Source: AGE ColumnStore<Integer, Long>.greaterThan(42)
• Filter: AGE.greaterThan(42)
• Terminal operation: forEach()
keyValueStore::getMap
Querying In-Memory Data Using Streams
Pipeline
• Source: AGE ColumnStore<Long, User>.greaterThan(42)
• Terminal operation: forEach()
keyValueStore::getMap
4. API
Java 8 Stream API
users.stream()
    .filter(AGE.greaterThan(42))
    .skip(10)
    .limit(100)
    .sorted(NAME.comparator())
    .forEach(someOperation());
REST API
https://mysite/user/?filter=[{"property":"age","operator":"gt","value":42}]&skip=10&limit=100&sort=[{"property":"name","direction":"ASC"}]
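For readers without the generated `AGE` and `NAME` fields at hand, the same kind of query can be approximated with nothing but `java.util.stream`. This is a sketch under assumptions: the `User` class and the method `namesOlderThan` are hypothetical, the predicate is a plain lambda rather than a generated field predicate, and the sort is placed before `skip` and `limit` (a choice made here so that paging is applied to the sorted result).

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch using only the standard Java 8 Stream API.
class StreamQuerySketch {

    static final class User {
        final String name;
        final int age;
        User(String name, int age) { this.name = name; this.age = age; }
    }

    // Filter by age, sort by name, then page through the sorted result.
    static List<String> namesOlderThan(Collection<User> users, int age, long skip, long limit) {
        return users.stream()
                .filter(u -> u.age > age)
                .sorted(Comparator.comparing((User u) -> u.name))
                .skip(skip)
                .limit(limit)
                .map(u -> u.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("dave", 50), new User("alice", 43), new User("bob", 30),
                new User("carol", 60), new User("erin", 44));
        System.out.println(namesOlderThan(users, 42, 1, 2)); // prints [carol, dave]
    }
}
```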
4. API - Parallelism
• Java 8 Stream API
users.stream()
    .parallel()
    .filter(AGE.greaterThan(42))
    .flatMap(users::findLogins)
    .forEach(expensiveOperation());
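A standalone approximation of the parallel pipeline, using only the standard library, might look like the sketch below. The `User` class with an embedded list of logins and the method `countLogins` are assumptions made for illustration; the terminal operation is `count()` rather than `forEach` so that the result is deterministic regardless of thread scheduling.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a parallel filter + flatMap pipeline.
class ParallelStreamSketch {

    static final class User {
        final int age;
        final List<String> logins;
        User(int age, List<String> logins) { this.age = age; this.logins = logins; }
    }

    // Count the logins of all users older than the given age, in parallel.
    static long countLogins(List<User> users, int age) {
        return users.parallelStream()
                .filter(u -> u.age > age)
                .flatMap(u -> u.logins.stream())
                .count();
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User(50, Arrays.asList("mon", "tue")),
                new User(30, Arrays.asList("wed")),
                new User(43, Arrays.asList("thu")));
        System.out.println(countLogins(users, 42)); // prints 3
    }
}
```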
Demo 1: Equality
• 300,000 Users
• Off Heap Data Store and Key-Value map
• Test
• Find all users with id = 42
• Count them
• Add the count to a sum
• Iterate 1,000,000 times
• JVM Warmup and then Benchmark
• Sum will be 1,000,000
• A standard MySQL database will process 2,000 TPS on a 4
CPU laptop like mine
Demo 2: Mixed Predicates
• 1,000,000 Users
• Tests
• ID.equal(42)
• ID.greaterOrEqual(42)
• ID.in(42, 43, 44)
• ID.between(42, 45)
Demo 3: Parallelism
• 300,000 Users
• 4 threads
• Tests
• ID.equal(42)
• ID.greaterOrEqual(42)
• ID.in(42, 43, 44)
• ID.between(42, 45)
Compare Latencies Using the Speed
of Light
During the time a database makes a 1 s query,
how far will the light move?
Database CPU L1 cache
Conclusion : Do not place your data on the moon,
keep it close by using in-JVM-memory technology!
Epic Threshold for Mankind Passed
in 2016
Conclusion: Quit drinking coffee
and buy more RAM for your application
server instead!
Microservices Solution
[Diagram: three groups of microservice instances (MS1, MS2, MS3), each group with its own Speedment and Chronicle layer]
Microservices JVM
• The solution maps Java NavigableMaps++ to persistent stores
(files)
• The views of the maps are available for any JVM that has
access to these files
• Microservice JVMs can be started or restarted very rapidly
• Access to these maps is gained within milliseconds
• These maps can be shared instantly between several
microservice JVMs
• New microservice instances can be added, removed, or restarted
very quickly
If You Only Remember One Thing...
[Chart: latency (μs, ms, s, min, hour, days) versus data volume (GB, TB, EB)]
Thank you!
Per Minborg
minborg@speedment.com
Malin Weiss
malin@speedment.com
www.speedment.com
www.speedment.org
Demo 1, Tool Work Flow
• Connect to an existing database
• Extract the domain model
• Generate code
32 bit Operating System (31-bit
heap)
Compressed Oops in Java 7 (35-bit)
• Using the default of
-XX:+UseCompressedOops
• In a 64-bit JVM, it can use “compressed” memory references
• This allows the heap to be up to 32 GB without the overhead of 64-bit object
references. The Oracle/OpenJDK JVM still uses 64-bit class references by default
• As all objects must be 8-byte aligned, the lower 3 bits of the address are always 000
and don’t need to be stored. This allows the heap to reference 4 billion * 8 bytes, or
32 GB
• Uses 32-bit references
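The arithmetic behind the 35-bit and 36-bit figures can be checked with a few lines of Java. The class and method names are hypothetical; the calculation simply multiplies the 2^32 values a 32-bit reference can take by the object alignment.

```java
// Hypothetical helper that reproduces the compressed-reference arithmetic.
class CompressedOopsMath {

    // Bytes addressable by a 32-bit reference that counts aligned slots instead of bytes.
    static long maxHeapBytes(int alignmentBytes) {
        return (1L << 32) * alignmentBytes;
    }

    // Effective address width: 32 stored bits plus the implied zero bits of the alignment.
    static int addressBits(int alignmentBytes) {
        return 32 + Integer.numberOfTrailingZeros(alignmentBytes);
    }

    public static void main(String[] args) {
        for (int alignment : new int[] {8, 16}) {
            long gib = maxHeapBytes(alignment) >> 30;
            System.out.println(alignment + "-byte alignment: "
                    + addressBits(alignment) + " bits, " + gib + " GiB");
        }
        // prints "8-byte alignment: 35 bits, 32 GiB" and "16-byte alignment: 36 bits, 64 GiB"
    }
}
```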
Compressed Oops with 8 Byte
Alignment
Compressed Oops in Java 8 (36 bits)
• Using the default of
-XX:+UseCompressedOops
-XX:ObjectAlignmentInBytes=16
• In a 64-bit JVM, it can use “compressed” memory references
• This allows the heap to be up to 64 GB without the overhead of 64-bit object
references. The Oracle/OpenJDK JVM still uses 64-bit class references by default
• As all objects must be 8-byte or 16-byte aligned, the lower 3 or 4 bits of the address are
always zeros and don’t need to be stored. This allows the heap to reference 4 billion
* 16 bytes, or 64 GB
• Uses 32-bit references
64-bit References in Java (100+ GB)
• A small but significant overhead on main memory use
• Reduces the efficiency of CPU caches as fewer objects can fit in
• Can address up to the limit of the free memory. Limited to main memory
• GC pauses become a real concern and can take tens of seconds or many minutes
• For larger heap sizes, using the G1 collector may be a good choice.
NUMA Regions (~40 bits)
• Large machines are limited in how large a single bank of memory can be. This varies
based on the architecture
• Ivy Bridge and Sandy Bridge Xeon processors are limited to addressing 40 bits of real
memory
• In Haswell and Broadwell this has been lifted to 46 bits
• Each socket has “local” access to a bank of memory; however, to access another bank it
may need to use a bus. This is much slower
• The GC of a JVM can perform very poorly if it doesn’t sit within one NUMA region.
Ideally you want a JVM to use just one NUMA region
NUMA Regions (~40 bits)
Virtual Address Space (48-bit)
Memory Mapped Files (48+ bits)
• Memory mappings are not limited to main memory size
• 64-bit operating systems support 128 TiB to 256 TiB of virtual memory at once
• For larger data sizes, memory mappings need to be managed and cached manually
• Can be shared between processes
• A library can hide the 48-bit limitation by caching memory mappings
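In plain Java, a file can be mapped into memory with `FileChannel.map`. The sketch below is a minimal illustration, not the library mentioned on the slide: the class `MappedFileSketch` and its method `writeThenRead` are hypothetical names. It writes a value through one mapping and reads it back through a second mapping of the same region; how quickly changes made through one mapping become visible through another is operating-system dependent, so the sketch calls `force()` first.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: two mappings over the same file region.
class MappedFileSketch {

    static long writeThenRead(Path file, long value) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping in READ_WRITE mode grows the file to the mapped size if needed.
            MappedByteBuffer writer = channel.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
            writer.putLong(0, value);
            writer.force(); // flush the change to the underlying file

            // A second, read-only view of the same bytes.
            MappedByteBuffer reader = channel.map(FileChannel.MapMode.READ_ONLY, 0, Long.BYTES);
            return reader.getLong(0);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-sketch", ".bin");
        file.toFile().deleteOnExit();
        System.out.println(writeThenRead(file, 42L));
    }
}
```

A second process that maps the same file would share the same pages of the operating system's page cache, which is the mechanism the slides rely on for sharing data between microservice JVMs.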
Peta Byte JVMs (50+ bits)
• If you are receiving 1 GB/s down a 10 Gig-E line, in two weeks you will have received
over 1 PB
• Managing this much data in large servers is more complex than your standard JVM
• Replication is critical. Large complex systems are more likely to fail and take longer
to recover
• You can have systems which cannot be recovered in the normal way. That is, unless you
recover faster than new data is added, you will never catch up
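The "over 1 PB in two weeks" figure follows from simple arithmetic, reproduced below with decimal units (1 PB = 1,000,000 GB). The class and method names are hypothetical.

```java
// Hypothetical helper reproducing the ingest arithmetic.
class IngestMath {

    // Petabytes received at a steady rate over a number of days (decimal units).
    static double petabytesReceived(double gigabytesPerSecond, int days) {
        long seconds = days * 24L * 60 * 60;          // 14 days = 1,209,600 s
        return gigabytesPerSecond * seconds / 1_000_000.0;
    }

    public static void main(String[] args) {
        // 1 GB/s for 14 days is 1,209,600 GB, i.e. roughly 1.2 PB.
        System.out.println(petabytesReceived(1.0, 14));
    }
}
```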
Peta Byte JVMs (50+ bits)

Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 

Recently uploaded (20)

What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 

Terabytes in Microseconds: A Microservices Solution

  • 1. Microservices: Terabytes in Microseconds Per Minborg CTO Speedment, Inc. Malin Weiss CMO Speedment, Inc.
  • 3. Microservices • Microservices bring together best practices from a variety of areas • Most likely you are already using some of these best practices
  • 4. Microservices • It sounds like marketing hype • It all sounds pretty familiar • It is just a rebranding of stuff we already do • There is room for improvement in what we do • There are some tools and ideas we could apply to our systems without changing too much
  • 5. Microservices • Simple component-based design • Asynchronous messaging • Automated, dynamic deployment of services • Service-private data sets • Transparent messaging • Teams can develop independently based on well-defined contracts
  • 6. Terabytes in Microseconds Solution [Chart: latency (μs, ms, s, min, hour, days) versus data volume (GB, TB, EB)]
  • 7. Speedment Chronicle Terabytes in Microseconds Solution Maps the data in a key-value store Synchronizes SQL Data into an in-JVM-memory data store Maps the data in Column key stores Provides application APIs Terabytes 1. Synchronization 2. Key Value Store 3. Column Key Stores 4. API Microservice 3 Microservice 2 Microservice
  • 8. 1. Synchronization • Synchronizes SQL Data with the in-JVM-memory solution: • Initially • Over time • Periodically • Reactively
  • 9. 1. Synchronization: Periodically • Dumps are reloaded periodically • All data elements are reloaded • Data remains unchanged between reloads • System restart is just a reload • MVCC so that several dumps can be active at the same time or Reactively • Changed data is captured in the database • Changed data events are pushed into memory • Events are grouped in transactions • Cache updates are persisted • Data changes all the time • On system restart, the missed events are replayed
  • 10. 1. Synchronization (columns: Periodically | Reactively | Poll)
      Max data age: Dump period | Replication latency (ms) | Eviction time
      Lookup performance: Consistently instant | Consistently instant | 20% very slow
      Consistency: Eventually consistent | Eventually consistent | Inconsistent (stale data)
      Database cache update load: ~ Total size / Dump period | ~ Rate of change | ~ Cache size * Eviction time
      Restart: Complete reload | Down time * update rate (-> 10% of down time) | Heat-up time
  • 11. 1. Synchronization Properties • Detect changes in a database • Buffer the changes • Can replay the changes later on • Preserve order • Preserve transactions • See data as it was persisted • Detect changes from any source JVM
  • 12. 1. Synchronization How is data synchronized? •Periodic MVCC reload •Transaction log harvesting •Event Sourcing and CQRS
  • 13. 2. Key Value Store Makes it possible to • Have Maps that can exceed the server’s RAM by up to 40x • Open up the path to mammoth JVMs with tens of terabytes • Restart fast, because data is persisted inside the JVM • Share the data between multiple instances of a service with a single copy in memory
  • 14. How Much Data Can We Have? • JVM (32), 32-bit (2-3 GB) • Java 7 (64), -XX:+UseCompressedOops, 35 bit (32 GB) • Java 8 (64), -XX:ObjectAlignmentInBytes=16, 36 bit (64 GB) • JVM (64), 64 bit references, 100 GB+ (G1 collector, GC concerns) • NUMA Regions (40 bits or 46 bits), GC over NUMA borders is a problem • Virtual Address Space (48-bit) • Memory Mapped Files (48+ bits) • Peta Byte JVMs (50+ bits) with library support
  • 16. Peta Byte JVMs (50+ bits)
  • 17. 3. Column Key Store • Column Key Stores are used much like database indexes • The Column Key Stores can produce data lazily • Can be mapped onto memory mapped files
  • 18. 3. Column Key Store • Features over ConcurrentNavigableMap • Handles null values • Can handle several values for one key • Can serialize data off-heap • Can map onto files • Can compress its keys • Contains a bi-directional skip dictionary • Remains performant for billions and trillions of entries • Supports transactions
  • 19. 3. Column Key Store [Diagram: the mapped buffer of a microservice JVM (user space) maps onto filesystem pages in the page cache (kernel space), which are backed by memory pages (physical memory) and disk blocks (SSD)]
  • 20. 3. Column Key Store [Diagram: the mapped buffers of Microservice JVM 1 and Microservice JVM 2 (user space) map onto the same filesystem pages in the page cache (kernel space), which are backed by memory pages (physical memory) and disk blocks (SSD)]
  • 21. 3. Column Key Store • O(1) operations: • equals, notEquals • isNull, isNotNull • in, notIn • sort • O(log N) operations • <, <=, >, >= • between • startsWith(String) • O(N) • matches(regexp)
  • 22. 3. Column Key Store
  • 23. 3. Column Key Store Key-Value Store ID Column Key Store User.AGE.greaterThan(42) AGE Column Key Store NAME Column Key Store greaterThan(42) Keys Values User
  • 24. 3. Column Key Store - The Stream Pipeline Key-Value Store AGE Column Key Store greaterThan(42) Keys Values
  • 25. 3. Column Key Store - Parallelism Key-Value Store AGE Column Key Store greaterThan(42) Parallel Keys Parallel Values Optional Parallel Strategy
  • 26. 4. API • Provides application APIs • Java 8 Stream API • REST API
  • 27. 4. API • Generates Java code automatically from SQL metadata • Free from hand coding errors • Reduces development time • Improves the application quality • Reduces maintenance for applications • Uses standard streams for querying • Custom code possibilities
  • 28. 4. API Java 8 Stream API: users.stream() .filter(AGE.greaterThan(42)) .forEach(someOperation()); REST API: https://mysite/user/ ?filter=[{"property":"age","operator":"gt","value":42}]
  • 29. Querying In-Memory Data Using Streams Map<Long, User> keyValueStore.values() AGE.greaterThan(42) forEach() Source Filter Term. Pipeline users.stream() .filter(AGE.greaterThan(42)) .forEach(someOperation());
  • 30. Querying In-Memory Data Using Streams AGE ColumnStore<Integer, Long>.greaterThan(42) AGE.greaterThan(42) forEach() Source Filter Term. Pipeline keyValueStore::getMap
  • 31. Querying In-Memory Data Using Streams AGE ColumnStore<Long, User>.greaterThan(42) forEach() Source Term. Pipeline keyValueStore::getMap
  • 32. 4. API Java 8 Stream API: users.stream() .filter(AGE.greaterThan(42)) .skip(10) .limit(100) .sorted(NAME.comparator()) .forEach(someOperation()); REST API: https://mysite/user/ ?filter=[{"property":"age", "operator":"gt","value":42}] &skip=10 &limit=100 &sort=[{"property":"name", "direction":"ASC"}]
  • 33. 4. API - Parallelism • Java 8 Stream API users.stream() .parallel() .filter(AGE.greaterThan(42)) .flatMap(users::findLogins) .forEach(expensiveOperation());
  • 34. Demo 1: Equality • 300,000 Users • Off Heap Data Store and Key-Value map • Test • Find all users with id = 42 • Count them • Add the count to a sum • Iterate 1,000,000 times • JVM Warmup and then Benchmark • Sum will be 1,000,000 • A standard MySQL database will process 2,000 TPS on a 4 CPU laptop like mine
  • 35. Demo 2: Mixed Predicates • 1,000,000 Users • Tests • ID.equal(42) • ID.greaterOrEqual(42) • ID.in(42, 43, 44) • ID.between(42, 45)
  • 36. Demo 3: Parallelism • 300,000 Users • 4 threads • Tests • ID.equal(42) • ID.greaterOrEqual(42) • ID.in(42, 43, 44) • ID.between(42, 45)
  • 37. Compare Latencies Using the Speed of Light During the time a database makes a 1 s query, how far will the light move? Database CPU L1 cache Conclusion : Do not place your data on the moon, keep it close by using in-JVM-memory technology!
  • 38. Epic Threshold for Mankind Passed in 2016 Conclusion: Quit drinking coffee and buy more RAM for your application server instead!
  • 39. Microservices Solution MS1 Speedment Chronicle MS1 MS1 MS2 Speedment Chronicle MS2 MS2 MS3 Speedment Chronicle MS3 MS3
  • 40. Microservices JVM • The solution maps Java NavigableMaps++ to persistent stores (files) • The views of the maps are available for any JVM that has access to these files • Microservice JVMs can be started or restarted very rapidly • Access to these maps will be gained in millisecond time • These maps can be shared instantly between several microservice JVMs • New microservice instances are added, removed, or restarted very quickly
  • 41. If You Only Remember One Thing…. [Chart: latency (μs, ms, s, min, hour, days) versus data volume (GB, TB, EB)]
  • 42. Thank you! Per Minborg minborg@speedment.com Malin Weiss malin@speedment.com www.speedment.com www.speedment.org
  • 43. Demo 1, Tool Work Flow • Connect to an existing database • Extract the domain model • Generate code
  • 44. 32 bit Operating System (31-bit heap)
  • 45. Compressed Oops in Java 7 (35-bit) • Using the default of -XX:+UseCompressedOops • In a 64-bit JVM, it can use “compressed” memory references • This allows the heap to be up to 32 GB without the overhead of 64-bit object references. The Oracle/OpenJDK JVM still uses 64-bit class references by default • As all objects must be 8-byte aligned, the lower 3 bits of the address are always 000 and don’t need to be stored. This allows the heap to reference 4 billion * 8 bytes or 32 GB • Uses 32-bit references
  • 46. Compressed Oops with 8 Byte Alignment
  • 47. Compressed Oops in Java 8 (36 bits) • Using the default of -XX:+UseCompressedOops together with -XX:ObjectAlignmentInBytes=16 • In a 64-bit JVM, it can use “compressed” memory references • This allows the heap to be up to 64 GB without the overhead of 64-bit object references. The Oracle/OpenJDK JVM still uses 64-bit class references by default • As all objects must be 8- or 16-byte aligned, the lower 3 or 4 bits of the address are always zeros and don’t need to be stored. This allows the heap to reference 4 billion * 16 bytes or 64 GB • Uses 32-bit references
  • 48. 64-bit References in Java (100+ GB) • A small but significant overhead on main memory use • Reduces the efficiency of CPU caches as fewer objects can fit in • Can address up to the limit of the free memory. Limited to main memory • GC pauses become a real concern and can take tens of seconds or many minutes • For larger heap sizes, using the G1 collector may be a good choice.
  • 49. NUMA Regions (~40 bits) • Large machines are limited in how large a single bank of memory can be. This varies based on the architecture • Ivy Bridge and Sandy Bridge Xeon processors are limited to addressing 40 bits of real memory • In Haswell and Broadwell this has been lifted to 46 bits • Each socket has “local” access to a bank of memory; to access another bank it may need to use a bus, which is much slower • The GC of a JVM can perform very poorly if it doesn’t sit within one NUMA region. Ideally you want a JVM to use just one NUMA region
  • 52. Memory Mapped Files (48+ bits) • Memory mappings are not limited to main memory size • 64-bit operating systems support 128 TiB to 256 TiB of virtual memory at once • For larger data sizes, memory mappings need to be managed and cached manually • Can be shared between processes • A library can hide the 48-bit limitation by caching memory mappings
  • 53. Peta Byte JVMs (50+ bits) • If you are receiving 1 GB/s down a 10 Gig-E line, in two weeks you will have received over 1 PB • Managing this much data in large servers is more complex than your standard JVM • Replication is critical. Large complex systems are more likely to fail and take longer to recover • You can have systems which cannot be recovered in the normal way, i.e. unless you recover faster than new data is added, you will never catch up
  • 54. Peta Byte JVMs (50+ bits)
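
The column key store slides (17 to 25) describe a sorted index over one column whose keys point back into the key-value store. The following is a minimal on-heap sketch of that idea, not Speedment's or Chronicle's implementation: the class name, the `User` record, and the methods `put` and `ageGreaterThan` are hypothetical, and a plain `TreeMap` stands in for the off-heap, file-mapped structure the slides describe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ColumnKeyStoreSketch {

    // Hypothetical entity; requires Java 16+ for records.
    record User(long id, String name, int age) {}

    // Key-value store: primary key -> entity.
    private final Map<Long, User> keyValueStore = new HashMap<>();

    // AGE column key store: column value -> primary keys, kept sorted by value.
    private final NavigableMap<Integer, List<Long>> ageColumn = new TreeMap<>();

    void put(User user) {
        User previous = keyValueStore.put(user.id(), user);
        if (previous != null) {
            // Remove the stale index entry so the column store stays consistent.
            List<Long> ids = ageColumn.get(previous.age());
            ids.remove(Long.valueOf(previous.id()));
            if (ids.isEmpty()) {
                ageColumn.remove(previous.age());
            }
        }
        ageColumn.computeIfAbsent(user.age(), k -> new ArrayList<>()).add(user.id());
    }

    // Range predicate: an O(log N) seek into the sorted column,
    // then a lazy walk over matching keys that are resolved to values.
    Stream<User> ageGreaterThan(int age) {
        return ageColumn.tailMap(age, false).values().stream()
                .flatMap(List::stream)
                .map(keyValueStore::get);
    }

    public static void main(String[] args) {
        ColumnKeyStoreSketch users = new ColumnKeyStoreSketch();
        users.put(new User(1L, "Ann", 30));
        users.put(new User(2L, "Bob", 50));
        users.put(new User(3L, "Cid", 43));

        List<String> names = users.ageGreaterThan(42)
                .map(User::name)
                .collect(Collectors.toList());
        System.out.println(names);
    }
}
```

Because the column is sorted by value, the stream produced by `ageGreaterThan` is already ordered by age, and the filter never touches entities that fall outside the range.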

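The size limits quoted on the memory slides follow from simple arithmetic: a 32-bit compressed reference addresses 2^32 aligned slots, so the reachable heap is 2^32 times the object alignment, and a sustained 1 GB/s feed accumulates more than 1 PB in two weeks. A small sketch of that arithmetic is given here; the class and method names are hypothetical and chosen only for illustration.

```java
public class AddressableMemory {

    // Heap reachable with compressed references:
    // 2^referenceBits slots, each objectAlignmentBytes wide.
    static long compressedOopsHeapBytes(int referenceBits, int objectAlignmentBytes) {
        return (1L << referenceBits) * objectAlignmentBytes;
    }

    // Data received at a constant rate, in decimal petabytes (1 PB = 1,000,000 GB).
    static double petabytesReceived(double gigabytesPerSecond, int days) {
        double seconds = days * 24.0 * 60.0 * 60.0;
        return gigabytesPerSecond * seconds / 1_000_000.0;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        long tib = 1L << 40;

        // 8-byte alignment: 3 implied zero bits, 35-bit addresses.
        System.out.println(compressedOopsHeapBytes(32, 8) / gib + " GiB");
        // 16-byte alignment: 4 implied zero bits, 36-bit addresses.
        System.out.println(compressedOopsHeapBytes(32, 16) / gib + " GiB");
        // 48-bit virtual address space.
        System.out.println((1L << 48) / tib + " TiB");
        // Two weeks at 1 GB/s.
        System.out.println(petabytesReceived(1.0, 14) + " PB");
    }
}
```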
Editor's Notes

  1. I live in Palo Alto, California.
  2. Progressive adaptation of the microservice concept