SlideShare a Scribd company logo
1 of 31
Download to read offline
In-Memory Computing Essentials
for Java Developers and Architects
Your Speaker: Denis Magda
➔ Distributed in-memory system
◆ Apache Ignite Committer and PMC Member
◆ Head of DevRel at GridGain
➔ Java engineering and architecture
◆ Java engineering at Oracle
◆ Technology evangelism at Sun Microsystems
Agenda
• Introduction
– Why In-Memory Computing?
– Apache Ignite, Brief Overview
• Essentials
– Data Partitioning
– Affinity Co-location
– Co-located Computations
Why In-Memory
Computing?
Speed and Scale
In-Memory Software Scales Horizontally
Application
Cluster
Comparing System-Event Latencies
System Event Actual Latency Scaled Latency
One CPU cycle 0.4 ns 1 s
Level 1 cache access 0.9 ns 2 s
Level 2 cache access 2.8 ns 7 s
Level 3 cache access 28 ns 1 min
Main memory access (DDR DIMM) ~100 ns 4 min
Intel Optane DC persistent memory access ~350 ns 15 min
Intel Optane DC SSD I/O <10 µs 7 hrs
NVMe SSD I/O ~25 µs 17 hrs
SSD I/O 50 – 150 µs 1.5 – 4 days
Rotational disk I/O 1 – 10 ms 1 – 9 months
Internet: SF to NY 65 ms 5 years
Memory Versus Disk Latency
System Event Actual Latency Scaled Latency
One CPU cycle 0.4 ns 1 s
Level 1 cache access 0.9 ns 2 s
Level 2 cache access 2.8 ns 7 s
Level 3 cache access 28 ns 1 min
Main memory access (DDR DIMM) ~100 ns 4 min
Intel Optane DC persistent memory access ~350 ns 15 min
Intel Optane DC SSD I/O < 10 µs 7 hrs
NVMe SSD I/O ~25 µs 17 hrs
SSD I/O 50 – 150 µs 1.5 – 4 days
Rotational disk I/O 1 – 10 ms 1 – 9 months
Internet: SF to NY 65 ms 5 years
Apache Ignite
Brief Overview
Apache Ignite In-Memory Computing Platform
Mainframe NoSQL HadoopDistributed Ignite Persistence
Disk Tier
RDBMS
Machine and Deep Learning
EventsStreamingMessaging
Transaction
s
SQLKey-Value
Service GridCompute Grid
Application Layer
Web SaaS SocialMobile IoT
RollingUpgrades
Security&Auditing
Monitoring&Management
SegmentationProtection
DataCenterReplication
NetworkBackups
Full,Incremental,ContinuousBackups
Point-in-TimeRecovery
HeterogeneousRecovery
Distributed In-Memory Tier
Apache Ignite as a Cache or as a Database
Ignite as a Cache and Data Grid
Ignite as a Database
Apache Ignite Is In Top 5 Apache Projects
...
8.
Data Partitioning
Using all of the cluster’s memory and CPUs
Partitioning and Horizontal Scalability
RDBMS
Partitioned Database
record#1
record#2
record#3
record#4
record#1
record#2
record#3
record#4
partitioning
The World Database Schema
Country
City CountryLanguage
https://dev.mysql.com/doc/world-setup/en/
Partitioning
Country
p0
...
p1 p2
p3 p4 p5
p1022 p1023 p1024
p0
p1024
p4
p2
p1022
p5
Record -> Partition -> Node Mapping
Country
p0
...
p1
p3 p4
p1022 p1023 p1024
p0
p1024
p4
p2
p1022
p5
p2
p5
1. primary key to partition
2.partition to node
Record -> Partition -> Node Mapping
Country
p0
...
p1
p3 p4
p1022 p1023 p1024
p0
p1024
p4
p2
p1022
p5
p2
p5
1. primary key to partition
Code = ‘USA’
2.partition to node
Affinity Co-Location
Reducing network utilization for complex requests
Default Data Distribution
Canada
Toronto
Calgary
Paris
France
Marseille
Montreal
Ottawa
Country Table City Table
Data is Shuffled During the JOIN phase
Thick Client
Canada
Toronto
Calgary
Paris
France
Marseille
Ottawa
Montreal
Paris
Ottawa
Montreal
1 & 4
2
2
3
1. Initiating Execution
2. Execution on Servers (map phase)
3. Data Shuffling
4. Reduce Phase
Disk Versus Network Latency Latency
System Event Actual Latency Scaled Latency
One CPU cycle 0.4 ns 1 s
Level 1 cache access 0.9 ns 2 s
Level 2 cache access 2.8 ns 7 s
Level 3 cache access 28 ns 1 min
Main memory access (DDR DIMM) ~100 ns 4 min
Intel Optane DC persistent memory access ~350 ns 15 min
Intel Optane DC SSD I/O < 10 µs 7 hrs
NVMe SSD I/O ~25 µs 17 hrs
SSD I/O 50 – 150 µs 1.5 – 4 days
Rotational disk I/O 1 – 10 ms 1 – 9 months
Internet: SF to NY 65 ms 5 years
Co-Located Data Distribution
Canada
Toronto
Calgary
France
Marseille
Country Table City Table
Montreal
Ottawa Paris
How to Group Related Data
JOINs with Co-Located Data
Thick Client
Canada
Toronto
Calgary
France
Marseille
1 & 3
2
2
1. Initiating Execution
2. Execution on Servers (map phase)
3. Reduce Phase
Ottawa
Paris
Co-Located Computations
Executing data-intensive logic on cluster nodes
Executing Custom Logic in Cluster
Story About Millions Savings Accounts
All
savings accounts
RDBMS
Application
1. Read all accounts
3. Write changes back
2. Interest
calculation
Story About Millions Savings Accounts
Ignite Cluster
Application
1. Send compute task
2 2
2
Summary
In-Memory Computing Essentials
In-Memory Computing Essentials

More Related Content

What's hot

Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...
Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...
Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...Databricks
 
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesApache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesDataWorks Summit
 
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...Databricks
 
How to Use Telegraf and Its Plugin Ecosystem
How to Use Telegraf and Its Plugin EcosystemHow to Use Telegraf and Its Plugin Ecosystem
How to Use Telegraf and Its Plugin EcosystemInfluxData
 
Elastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent MemoryElastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent MemoryDatabricks
 
Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...
Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...
Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...Big Data Spain
 
Next Generation Scheduling for YARN and K8s: For Hybrid Cloud/On-prem Environ...
Next Generation Scheduling for YARN and K8s: For Hybrid Cloud/On-prem Environ...Next Generation Scheduling for YARN and K8s: For Hybrid Cloud/On-prem Environ...
Next Generation Scheduling for YARN and K8s: For Hybrid Cloud/On-prem Environ...DataWorks Summit
 
Bring Your Own Container: Using Docker Images In Production
Bring Your Own Container: Using Docker Images In ProductionBring Your Own Container: Using Docker Images In Production
Bring Your Own Container: Using Docker Images In ProductionDatabricks
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Bringing Real-Time to the Enterprise with Hortonworks DataFlow
Bringing Real-Time to the Enterprise with Hortonworks DataFlowBringing Real-Time to the Enterprise with Hortonworks DataFlow
Bringing Real-Time to the Enterprise with Hortonworks DataFlowDataWorks Summit
 
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And CloudYARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And CloudDataWorks Summit
 
Building A Diverse Geo-Architecture For Cloud Native Applications In One Day
Building A Diverse Geo-Architecture For Cloud Native Applications In One DayBuilding A Diverse Geo-Architecture For Cloud Native Applications In One Day
Building A Diverse Geo-Architecture For Cloud Native Applications In One DayVMware Tanzu
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...Databricks
 
Assigning Responsibility for Deteriorations in Video Quality with Henry Milne...
Assigning Responsibility for Deteriorations in Video Quality with Henry Milne...Assigning Responsibility for Deteriorations in Video Quality with Henry Milne...
Assigning Responsibility for Deteriorations in Video Quality with Henry Milne...Databricks
 
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...Spark Summit
 
Future Architecture of Streaming Analytics: Capitalizing on the Analytics of ...
Future Architecture of Streaming Analytics: Capitalizing on the Analytics of ...Future Architecture of Streaming Analytics: Capitalizing on the Analytics of ...
Future Architecture of Streaming Analytics: Capitalizing on the Analytics of ...DataWorks Summit
 
Best Practices for Using Alluxio with Apache Spark with Gene Pang
Best Practices for Using Alluxio with Apache Spark with Gene PangBest Practices for Using Alluxio with Apache Spark with Gene Pang
Best Practices for Using Alluxio with Apache Spark with Gene PangSpark Summit
 
Apache Ignite - Distributed Database Orchestration
Apache Ignite - Distributed Database OrchestrationApache Ignite - Distributed Database Orchestration
Apache Ignite - Distributed Database OrchestrationAriel Jatib
 

What's hot (20)

Introduction to AWS Services
Introduction to AWS ServicesIntroduction to AWS Services
Introduction to AWS Services
 
Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...
Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...
Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...
 
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesApache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
 
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
 
How to Use Telegraf and Its Plugin Ecosystem
How to Use Telegraf and Its Plugin EcosystemHow to Use Telegraf and Its Plugin Ecosystem
How to Use Telegraf and Its Plugin Ecosystem
 
Novinky v Oracle Database 18c
Novinky v Oracle Database 18cNovinky v Oracle Database 18c
Novinky v Oracle Database 18c
 
Elastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent MemoryElastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent Memory
 
Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...
Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...
Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...
 
Next Generation Scheduling for YARN and K8s: For Hybrid Cloud/On-prem Environ...
Next Generation Scheduling for YARN and K8s: For Hybrid Cloud/On-prem Environ...Next Generation Scheduling for YARN and K8s: For Hybrid Cloud/On-prem Environ...
Next Generation Scheduling for YARN and K8s: For Hybrid Cloud/On-prem Environ...
 
Bring Your Own Container: Using Docker Images In Production
Bring Your Own Container: Using Docker Images In ProductionBring Your Own Container: Using Docker Images In Production
Bring Your Own Container: Using Docker Images In Production
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Bringing Real-Time to the Enterprise with Hortonworks DataFlow
Bringing Real-Time to the Enterprise with Hortonworks DataFlowBringing Real-Time to the Enterprise with Hortonworks DataFlow
Bringing Real-Time to the Enterprise with Hortonworks DataFlow
 
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And CloudYARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
 
Building A Diverse Geo-Architecture For Cloud Native Applications In One Day
Building A Diverse Geo-Architecture For Cloud Native Applications In One DayBuilding A Diverse Geo-Architecture For Cloud Native Applications In One Day
Building A Diverse Geo-Architecture For Cloud Native Applications In One Day
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...
 
Assigning Responsibility for Deteriorations in Video Quality with Henry Milne...
Assigning Responsibility for Deteriorations in Video Quality with Henry Milne...Assigning Responsibility for Deteriorations in Video Quality with Henry Milne...
Assigning Responsibility for Deteriorations in Video Quality with Henry Milne...
 
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
 
Future Architecture of Streaming Analytics: Capitalizing on the Analytics of ...
Future Architecture of Streaming Analytics: Capitalizing on the Analytics of ...Future Architecture of Streaming Analytics: Capitalizing on the Analytics of ...
Future Architecture of Streaming Analytics: Capitalizing on the Analytics of ...
 
Best Practices for Using Alluxio with Apache Spark with Gene Pang
Best Practices for Using Alluxio with Apache Spark with Gene PangBest Practices for Using Alluxio with Apache Spark with Gene Pang
Best Practices for Using Alluxio with Apache Spark with Gene Pang
 
Apache Ignite - Distributed Database Orchestration
Apache Ignite - Distributed Database OrchestrationApache Ignite - Distributed Database Orchestration
Apache Ignite - Distributed Database Orchestration
 

Similar to In-Memory Computing Essentials

SQL Server It Just Runs Faster
SQL Server It Just Runs FasterSQL Server It Just Runs Faster
SQL Server It Just Runs FasterBob Ward
 
Fallacies of Distributed Computing
Fallacies of Distributed Computing Fallacies of Distributed Computing
Fallacies of Distributed Computing Arnon Rotem-Gal-Oz
 
Sql server 2016 it just runs faster sql bits 2017 edition
Sql server 2016 it just runs faster   sql bits 2017 editionSql server 2016 it just runs faster   sql bits 2017 edition
Sql server 2016 it just runs faster sql bits 2017 editionBob Ward
 
Building a High Performance Analytics Platform
Building a High Performance Analytics PlatformBuilding a High Performance Analytics Platform
Building a High Performance Analytics PlatformSantanu Dey
 
G rpc talk with intel (3)
G rpc talk with intel (3)G rpc talk with intel (3)
G rpc talk with intel (3)Intel
 
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBEVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBScott Mansfield
 
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark - Demi Be...
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark  - Demi Be...S3, Cassandra or Outer Space? Dumping Time Series Data using Spark  - Demi Be...
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark - Demi Be...Codemotion
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community
 
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Community
 
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...Ernie Souhrada
 
Building an open memory-centric computing architecture using intel optane
Building an open memory-centric computing architecture using intel optaneBuilding an open memory-centric computing architecture using intel optane
Building an open memory-centric computing architecture using intel optaneUniFabric
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...javier ramirez
 
Hardware planning & sizing for sql server
Hardware planning & sizing for sql serverHardware planning & sizing for sql server
Hardware planning & sizing for sql serverDavide Mauri
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed_Hat_Storage
 
Impact of Intel Optane Technology on HPC
Impact of Intel Optane Technology on HPCImpact of Intel Optane Technology on HPC
Impact of Intel Optane Technology on HPCMemVerge
 
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...Haidee McMahon
 
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...In-Memory Computing Summit
 
Application Caching: The Hidden Microservice (SAConf)
Application Caching: The Hidden Microservice (SAConf)Application Caching: The Hidden Microservice (SAConf)
Application Caching: The Hidden Microservice (SAConf)Scott Mansfield
 

Similar to In-Memory Computing Essentials (20)

SQL Server It Just Runs Faster
SQL Server It Just Runs FasterSQL Server It Just Runs Faster
SQL Server It Just Runs Faster
 
Fallacies of Distributed Computing
Fallacies of Distributed Computing Fallacies of Distributed Computing
Fallacies of Distributed Computing
 
Sql server 2016 it just runs faster sql bits 2017 edition
Sql server 2016 it just runs faster   sql bits 2017 editionSql server 2016 it just runs faster   sql bits 2017 edition
Sql server 2016 it just runs faster sql bits 2017 edition
 
Building a High Performance Analytics Platform
Building a High Performance Analytics PlatformBuilding a High Performance Analytics Platform
Building a High Performance Analytics Platform
 
G rpc talk with intel (3)
G rpc talk with intel (3)G rpc talk with intel (3)
G rpc talk with intel (3)
 
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBEVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
 
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark - Demi Be...
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark  - Demi Be...S3, Cassandra or Outer Space? Dumping Time Series Data using Spark  - Demi Be...
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark - Demi Be...
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph
 
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
 
IO Dubi Lebel
IO Dubi LebelIO Dubi Lebel
IO Dubi Lebel
 
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
 
MYSQL
MYSQLMYSQL
MYSQL
 
Building an open memory-centric computing architecture using intel optane
Building an open memory-centric computing architecture using intel optaneBuilding an open memory-centric computing architecture using intel optane
Building an open memory-centric computing architecture using intel optane
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
 
Hardware planning & sizing for sql server
Hardware planning & sizing for sql serverHardware planning & sizing for sql server
Hardware planning & sizing for sql server
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
 
Impact of Intel Optane Technology on HPC
Impact of Intel Optane Technology on HPCImpact of Intel Optane Technology on HPC
Impact of Intel Optane Technology on HPC
 
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
 
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
 
Application Caching: The Hidden Microservice (SAConf)
Application Caching: The Hidden Microservice (SAConf)Application Caching: The Hidden Microservice (SAConf)
Application Caching: The Hidden Microservice (SAConf)
 

More from Denis Magda

Apache Spark and Apache Ignite: Where Fast Data Meets the IoT
Apache Spark and Apache Ignite: Where Fast Data Meets the IoTApache Spark and Apache Ignite: Where Fast Data Meets the IoT
Apache Spark and Apache Ignite: Where Fast Data Meets the IoTDenis Magda
 
Distributed Database DevOps Dilemmas? Kubernetes to the Rescue
Distributed Database DevOps Dilemmas? Kubernetes to the RescueDistributed Database DevOps Dilemmas? Kubernetes to the Rescue
Distributed Database DevOps Dilemmas? Kubernetes to the RescueDenis Magda
 
In-Memory Computing Essentials for Architects and Engineers
In-Memory Computing Essentials for Architects and EngineersIn-Memory Computing Essentials for Architects and Engineers
In-Memory Computing Essentials for Architects and EngineersDenis Magda
 
Apache Ignite: In-Memory Hammer for Your Data Science Toolkit
Apache Ignite: In-Memory Hammer for Your Data Science ToolkitApache Ignite: In-Memory Hammer for Your Data Science Toolkit
Apache Ignite: In-Memory Hammer for Your Data Science ToolkitDenis Magda
 
Apache Spark and Apache Ignite: Where Fast Data Meets IoT
Apache Spark and Apache Ignite: Where Fast Data Meets IoTApache Spark and Apache Ignite: Where Fast Data Meets IoT
Apache Spark and Apache Ignite: Where Fast Data Meets IoTDenis Magda
 
Microservices Architectures With Apache Ignite
Microservices Architectures With Apache IgniteMicroservices Architectures With Apache Ignite
Microservices Architectures With Apache IgniteDenis Magda
 

More from Denis Magda (6)

Apache Spark and Apache Ignite: Where Fast Data Meets the IoT
Apache Spark and Apache Ignite: Where Fast Data Meets the IoTApache Spark and Apache Ignite: Where Fast Data Meets the IoT
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT
 
Distributed Database DevOps Dilemmas? Kubernetes to the Rescue
Distributed Database DevOps Dilemmas? Kubernetes to the RescueDistributed Database DevOps Dilemmas? Kubernetes to the Rescue
Distributed Database DevOps Dilemmas? Kubernetes to the Rescue
 
In-Memory Computing Essentials for Architects and Engineers
In-Memory Computing Essentials for Architects and EngineersIn-Memory Computing Essentials for Architects and Engineers
In-Memory Computing Essentials for Architects and Engineers
 
Apache Ignite: In-Memory Hammer for Your Data Science Toolkit
Apache Ignite: In-Memory Hammer for Your Data Science ToolkitApache Ignite: In-Memory Hammer for Your Data Science Toolkit
Apache Ignite: In-Memory Hammer for Your Data Science Toolkit
 
Apache Spark and Apache Ignite: Where Fast Data Meets IoT
Apache Spark and Apache Ignite: Where Fast Data Meets IoTApache Spark and Apache Ignite: Where Fast Data Meets IoT
Apache Spark and Apache Ignite: Where Fast Data Meets IoT
 
Microservices Architectures With Apache Ignite
Microservices Architectures With Apache IgniteMicroservices Architectures With Apache Ignite
Microservices Architectures With Apache Ignite
 

Recently uploaded

What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in NoidaBuds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in Noidabntitsolutionsrishis
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 

Recently uploaded (20)

What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in NoidaBuds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 

In-Memory Computing Essentials

  • 1. In-Memory Computing Essentials for Java Developers and Architects
  • 2. Your Speaker: Denis Magda ➔ Distributed in-memory system ◆ Apache Ignite Committer and PMC Member ◆ Head of DevRel at GridGain ➔ Java engineering and architecture ◆ Java engineering at Oracle ◆ Technology evangelism at Sun Microsystems
  • 3. Agenda • Introduction – Why In-Memory Computing? – Apache Ignite, Brief Overview • Essentials – Data Partitioning – Affinity Co-location – Co-located Computations
  • 6. In-Memory Software Scales Horizontally Application Cluster
  • 7. Comparing System-Event Latencies System Event Actual Latency Scaled Latency One CPU cycle 0.4 ns 1 s Level 1 cache access 0.9 ns 2 s Level 2 cache access 2.8 ns 7 s Level 3 cache access 28 ns 1 min Main memory access (DDR DIMM) ~100 ns 4 min Intel Optane DC persistent memory access ~350 ns 15 min Intel Optane DC SSD I/O <10 µs 7 hrs NVMe SSD I/O ~25 µs 17 hrs SSD I/O 50 – 150 µs 1.5 – 4 days Rotational disk I/O 1 – 10 ms 1 – 9 months Internet: SF to NY 65 ms 5 years
  • 8. Memory Versus Disk Latency System Event Actual Latency Scaled Latency One CPU cycle 0.4 ns 1 s Level 1 cache access 0.9 ns 2 s Level 2 cache access 2.8 ns 7 s Level 3 cache access 28 ns 1 min Main memory access (DDR DIMM) ~100 ns 4 min Intel Optane DC persistent memory access ~350 ns 15 min Intel Optane DC SSD I/O < 10 µs 7 hrs NVMe SSD I/O ~25 µs 17 hrs SSD I/O 50 – 150 µs 1.5 – 4 days Rotational disk I/O 1 – 10 ms 1 – 9 months Internet: SF to NY 65 ms 5 years
  • 10. Apache Ignite In-Memory Computing Platform Mainframe NoSQL HadoopDistributed Ignite Persistence Disk Tier RDBMS Machine and Deep Learning EventsStreamingMessaging Transaction s SQLKey-Value Service GridCompute Grid Application Layer Web SaaS SocialMobile IoT RollingUpgrades Security&Auditing Monitoring&Management SegmentationProtection DataCenterReplication NetworkBackups Full,Incremental,ContinuousBackups Point-in-TimeRecovery HeterogeneousRecovery Distributed In-Memory Tier
  • 11. Apache Ignite as a Cache or as a Database Ignite as a Cache and Data Grid Ignite as a Database
  • 12. Apache Ignite Is In Top 5 Apache Projects ... 8.
  • 13. Data Partitioning Using all of the cluster’s memory and CPUs
  • 14. Partitioning and Horizontal Scalability RDBMS Partitioned Database record#1 record#2 record#3 record#4 record#1 record#2 record#3 record#4 partitioning
  • 15. The World Database Schema Country City CountryLanguage https://dev.mysql.com/doc/world-setup/en/
  • 16. Partitioning Country p0 ... p1 p2 p3 p4 p5 p1022 p1023 p1024 p0 p1024 p4 p2 p1022 p5
  • 17. Record -> Partition -> Node Mapping Country p0 ... p1 p3 p4 p1022 p1023 p1024 p0 p1024 p4 p2 p1022 p5 p2 p5 1. primary key to partition 2.partition to node
  • 18. Record -> Partition -> Node Mapping Country p0 ... p1 p3 p4 p1022 p1023 p1024 p0 p1024 p4 p2 p1022 p5 p2 p5 1. primary key to partition Code = ‘USA’ 2.partition to node
  • 19. Affinity Co-Location Reducing network utilization for complex requests
  • 21. Data is Shuffled During the JOIN phase Thick Client Canada Toronto Calgary Paris France Marseille Ottawa Montreal Paris Ottawa Montreal 1 & 4 2 2 3 1. Initiating Execution 2. Execution on Servers (map phase) 3. Data Shuffling 4. Reduce Phase
  • 22. Disk Versus Network Latency Latency System Event Actual Latency Scaled Latency One CPU cycle 0.4 ns 1 s Level 1 cache access 0.9 ns 2 s Level 2 cache access 2.8 ns 7 s Level 3 cache access 28 ns 1 min Main memory access (DDR DIMM) ~100 ns 4 min Intel Optane DC persistent memory access ~350 ns 15 min Intel Optane DC SSD I/O < 10 µs 7 hrs NVMe SSD I/O ~25 µs 17 hrs SSD I/O 50 – 150 µs 1.5 – 4 days Rotational disk I/O 1 – 10 ms 1 – 9 months Internet: SF to NY 65 ms 5 years
  • 24. How to Group Related Data
  • 25. JOINs with Co-Located Data Thick Client Canada Toronto Calgary France Marseille 1 & 3 2 2 1. Initiating Execution 2. Execution on Servers (map phase) 3. Reduce Phase Ottawa Paris
  • 28. Story About Millions Savings Accounts All savings accounts RDBMS Application 1. Read all accounts 3. Write changes back 2. Interest calculation
  • 29. Story About Millions Savings Accounts Ignite Cluster Application 1. Send compute task 2 2 2