AN INTRO TO APACHE IGNITE
WHO IS THIS HUMAN?
DANI TRAPHAGEN - @DTRAPEZOID
▸ Probably most important - I have perfected the art of the awkward dog family photo (see image at right)

▸ I spend a lot of time with databases

▸ I used to consult & train folks on NoSQL databases (C*) and now I don’t…because…
MEMORY IS
WELL WHAT ARE
THE TRENDS?
SOURCE HERE
MEMORY IS DECREASING IN COST:
SO, WHAT IS THIS APACHE IGNITE?
BUT SERIOUSLY, DON’T DO IT…WE’RE TIRED.
HISTORY OF APACHE IGNITE
APACHE IGNITE’S HISTORY
▸ Oct. 2014: Apache Ignite enters the Apache Incubator, donated by GridGain

▸ Aug. 2015: Ignite graduates to a top-level project, the 2nd fastest to graduate after Apache Spark

▸ Today: 100+ contributors internationally and a rapidly growing community

▸ Over 1M lines of code
• HIGH AVAILABILITY
• PEER TO PEER
• SCALE OUT
© 2017 GridGain Systems, Inc.
Agenda
• Introduction - what is this thing?
• Use Cases - when do I use this thing?
• Who else has used this thing?
• Does it apply to my thing I am doing?
• Features of this thing, a lot of them, not all.
• Demo - show me where to get started with this thing.
• Q&A
Introduction
Apache Ignite: the in-memory computing platform that is durable, strongly consistent and highly available, with powerful SQL, key-value and processing APIs
Apache Ignite In-Memory Computing Platform
Applications: Financial Services, IoT, Pharma & Healthcare, E-Commerce, Travel & Logistics, Telco
SQL, Key/Value, Transactions, Compute, Services, ML, Streaming
Memory-Centric Storage
Ignite Native Persistence (Flash, SSD, Intel 3D XPoint)
Third-Party Persistence (RDBMS, HDFS, NoSQL)
Use Cases
Apache Ignite Users
FinTech
Financial Services
Software
Logistics & Travel
E-commerce
Telco
IoT
Pharma & Healthcare
Adtech
JacTravel - Real-time Search at Scale (Travel & Retail)
JacTravel is a global B2B travel firm providing real-time services for over 15k city hotels worldwide.
Problem
• Could not meet latency and throughput SLAs
• Could no longer scale and was costly to maintain

Apache Ignite Solution
• More than 550M searches per day
• Sub-second response times on a 4-node cluster
• Savings of over $500K per year on infrastructure
[Diagram: Load Balancer, REST API, SQL API, DB, DB updates via MQ, Apache Ignite Cluster (four in-memory nodes), Distributed In-Memory Partitioned Cache]
Large IoT Provider - Smart Metering and Utilities
The company develops IoT solutions that transmit energy consumption data between meters, consumers and utilities in real time.
Problem
• Could not meet latency and throughput SLAs
• Missing scalability and elasticity

GridGain Solution
• 50 million meters stream data back in real time
• Collocated in-memory processing
• Advanced security and multi-tenancy
[Diagram: Smart Meters, Company’s Platform, GridGain Ignite Cluster (four in-memory nodes; SQL, Compute, Transactions), GridGain Advanced Security, DB]
Feature Overview
• SQL
• Key / Value
• Collocated Processing
• ACID Transactions
• Micro-Services
• Data Streaming
• Machine Learning
[Diagram: Client Nodes, Server Nodes, Durable Memory (×6)]
Durable Memory
• Off-heap: removes noticeable GC pauses
• Automatic defragmentation
• Stores a superset of data
• Predictable memory consumption
• Fully transactional (write-ahead log)
• Instantaneous restarts
[Diagram: Ignite Cluster of three Server Nodes, each with Durable Memory]
Ignite Native Persistence
1. Update (RAM)
2. Persist (Write-Ahead Log)
3. Ack
4. Checkpointing (Partition File 1 … Partition File N)
[Diagram: Server Node]
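The four steps on this slide can be sketched with a toy single-node store. This is a plain-Python illustration of the write-ahead-log idea only, not Ignite's implementation or API; the class name TinyNode, the file names and the JSON encoding are all assumptions made up for the example.

```python
import json
import os
import tempfile


class TinyNode:
    """Toy store: update RAM, append to a WAL, ack, checkpoint later."""

    def __init__(self, directory):
        self.ram = {}  # in-memory state
        self.wal_path = os.path.join(directory, "wal.log")            # hypothetical name
        self.partition_path = os.path.join(directory, "part-0.json")  # hypothetical name
        self._recover()

    def put(self, key, value):
        self.ram[key] = value                      # 1. update RAM
        with open(self.wal_path, "a") as wal:      # 2. persist to the write-ahead log
            wal.write(json.dumps([key, value]) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        return "ack"                               # 3. ack the caller

    def checkpoint(self):
        # 4. copy the in-memory state to the partition file, then truncate the WAL.
        # A real system would make this step crash-safe; this sketch does not.
        with open(self.partition_path, "w") as part:
            json.dump(self.ram, part)
        open(self.wal_path, "w").close()

    def _recover(self):
        # On restart: load the last checkpoint, then replay the WAL on top of it.
        if os.path.exists(self.partition_path):
            with open(self.partition_path) as part:
                self.ram = json.load(part)
        if os.path.exists(self.wal_path):
            with open(self.wal_path) as wal:
                for line in wal:
                    key, value = json.loads(line)
                    self.ram[key] = value


def demo():
    with tempfile.TemporaryDirectory() as directory:
        node = TinyNode(directory)
        node.put("a", 1)
        node.checkpoint()          # "a" now lives in the partition file
        node.put("b", 2)           # "b" lives only in RAM and the WAL
        restarted = TinyNode(directory)
        return restarted.ram
```

Calling demo() restarts the node after one checkpointed write and one write that only reached the log; both keys come back, which is the point of persisting to the log before acknowledging.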
Data Grid
• Distributed key-value store
• Distributed partitioned hash map
• Dynamic scaling
• ACID transactions
• JCache & SQL
• 3rd party storage caching (RDBMS, NoSQL, HDFS)
[Diagram: JCache, Transactions, Compute and SQL over three Server Nodes, each with Durable Memory]
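A "distributed partitioned hash map" can be pictured as two mappings: key to partition, then partition to owning node. The sketch below is a plain-Python toy under assumed names (PARTITIONS, partition_of, owner_of, PartitionedMap); it uses rendezvous hashing for the second mapping as one reasonable choice and is not Ignite's API.

```python
import zlib

PARTITIONS = 16  # hypothetical partition count for the sketch


def partition_of(key):
    """Map a key to a fixed partition number."""
    return zlib.crc32(str(key).encode("utf-8")) % PARTITIONS


def owner_of(partition, nodes):
    """Rendezvous hashing: the node with the highest score owns the partition."""
    return max(nodes, key=lambda node: zlib.crc32(f"{node}:{partition}".encode("utf-8")))


class PartitionedMap:
    """Toy key-value store whose entries are spread over named nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.storage = {node: {} for node in self.nodes}

    def put(self, key, value):
        node = owner_of(partition_of(key), self.nodes)
        self.storage[node][key] = value

    def get(self, key):
        node = owner_of(partition_of(key), self.nodes)
        return self.storage[node].get(key)
```

With this scheme, adding a node only moves the partitions that the new node wins; every other partition keeps its owner, which is what makes dynamic scaling cheap in this toy model.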
Distributed SQL
• DDL, DML support: SELECT, UPDATE, INSERT, MERGE, DELETE, CREATE and ALTER
• Cross-platform compatibility
• Indexes in RAM or on disk
• Dynamic scaling
• JDBC, ODBC, SQL API
• Java, .NET, C++, BI Tools
[Diagram: Apache Ignite Cluster of three Server Nodes, each with Durable Memory]
Collocated Joins
1. Initial Query
2. Query execution over local data
3. Reduce multiple results in one

Non-Collocated Joins
1. Initial Query
2. Query execution (local + remote data)
3. Potential data movement
4. Reduce multiple results in one
[Diagram, both cases: one Client Node and two Server Nodes (on-disk); numbers mark the steps above]
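The difference between the two join modes comes down to where related rows are placed. The sketch below is a plain-Python toy, not Ignite SQL: the tables, the modulo placement rule and the function names are invented for illustration. It places persons either by their organization id (collocated) or by their own id (non-collocated) and counts how many lookups have to leave the local node.

```python
ORGS = {1: "Acme", 2: "Globex", 3: "Initech"}
# (person_id, name, org_id)
PERSONS = [(10, "Ann", 1), (11, "Bob", 2), (12, "Cy", 3), (13, "Di", 1)]


def node_for(key, node_count=2):
    """Toy placement rule: key modulo node count."""
    return key % node_count


def place(collocated, node_count=2):
    """Distribute both tables; persons follow org_id only when collocated."""
    nodes = [{"orgs": {}, "persons": []} for _ in range(node_count)]
    for org_id, name in ORGS.items():
        nodes[node_for(org_id, node_count)]["orgs"][org_id] = name
    for person in PERSONS:
        person_id, _, org_id = person
        affinity_key = org_id if collocated else person_id
        nodes[node_for(affinity_key, node_count)]["persons"].append(person)
    return nodes


def join(nodes):
    """Join persons to orgs node by node, counting remote lookups."""
    results, remote_lookups = [], 0
    for node in nodes:
        for _, person_name, org_id in node["persons"]:
            if org_id in node["orgs"]:          # local data only
                org_name = node["orgs"][org_id]
            else:                               # data movement between nodes
                owner = nodes[node_for(org_id, len(nodes))]
                org_name = owner["orgs"][org_id]
                remote_lookups += 1
            results.append((person_name, org_name))
    return sorted(results), remote_lookups      # reduce into one result
```

Both placements return the same joined rows; with this sample data the collocated layout needs no remote lookups, while the non-collocated layout needs three.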
Compute Grid
• C = Compute, R = Result
• C = C1 + C2, R = R1 + R2, in T/2 time
• Automatic Failover
• Load Balancing
• Zero Deployment
[Diagram: Ignite Cluster of two nodes with Durable Memory; one runs C1 and returns R1, the other runs C2 and returns R2]
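The C = C1 + C2, R = R1 + R2 picture is a split-and-reduce: cut the job into sub-computations, run them in parallel, add up the partial results. The sketch below stands in a local thread pool for the cluster, so it shows the shape of the computation rather than Ignite's compute API; the function names are made up for the example, and the T/2 figure on the slide is the ideal case for two equal halves.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Cut the job C into sub-jobs C1..Cn by striding over the input."""
    return [items[i::parts] for i in range(parts)]


def count_characters(words):
    """One sub-computation: returns its partial result Ri."""
    return sum(len(word) for word in words)


def run_on_cluster(words, workers=2):
    """Run the sub-jobs in parallel and reduce: R = R1 + R2 + ..."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_characters, split(words, workers)))
    return sum(partials)
```

For example, run_on_cluster("count characters using ignite".split()) adds the partial counts from two workers and gives the same total as counting on one machine.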
Client-Server Processing
1. Initial Request
2. Fetch data from remote nodes
3. Process entire data-set

Co-located Processing
1. Initial Request
2. Co-located processing with data
3. Reduce multiple results in one
[Diagram, both cases: one Client Node and two Server Nodes (on-disk) holding Data 1 and Data 2; numbers mark the steps above]
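A rough way to compare the two processing styles is to count what crosses the network. In the plain-Python sketch below (sample data and function names are assumptions for illustration), the client-server version pulls every record to the client before summing, while the co-located version sums on each node and ships back one partial result per node.

```python
NODES = [
    {"name": "server-1", "data": [3, 5, 8]},
    {"name": "server-2", "data": [13, 21]},
]


def client_server_sum(nodes):
    """Fetch all records to the client, then process the whole data set there."""
    fetched = []
    for node in nodes:
        fetched.extend(node["data"])       # every record is transferred
    return sum(fetched), len(fetched)      # (result, items transferred)


def colocated_sum(nodes):
    """Process where the data lives, then reduce the partial results."""
    partials = [sum(node["data"]) for node in nodes]   # one value per node
    return sum(partials), len(partials)    # (result, items transferred)
```

Both functions return the same total; the second element of the tuple shows that the co-located version transfers one value per node instead of one value per record.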
Service Grid
• Microservices foundation
• Lifecycle management
• Load balancing
• Automatic failover
[Diagram: cluster nodes hosting Node Singleton and Cluster Singleton services]
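The two deployment modes in the diagram differ only in how many instances exist: a node singleton runs once on every node, a cluster singleton runs once in the whole cluster and is redeployed if its host leaves. The toy model below illustrates that rule in plain Python; ToyCluster and its methods are hypothetical names, and picking the first surviving node is an arbitrary choice made for the sketch.

```python
class ToyCluster:
    """Toy model of where service instances live."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def node_singleton_hosts(self):
        """Node singleton: one instance on every node."""
        return list(self.nodes)

    def cluster_singleton_host(self):
        """Cluster singleton: exactly one instance cluster-wide (first node here)."""
        return self.nodes[0] if self.nodes else None

    def fail(self, node):
        """Remove a failed node; singleton placement is recomputed on the next call."""
        self.nodes.remove(node)


cluster = ToyCluster(["node-1", "node-2", "node-3"])
before = cluster.cluster_singleton_host()
cluster.fail(before)                       # the host of the cluster singleton dies
after = cluster.cluster_singleton_host()   # failover: another node takes it over
```

After the failure the cluster singleton is still available on a surviving node, and the node singleton simply runs on the two nodes that remain.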
Machine Learning Grid
• Distributed algorithms: K-Means, Regressions, Decision Trees, Random Forest
• Distributed core algebra: dense and sparse
• Large-scale parallelization
• Multi-language support: R, C++, Python, Java, Scala, REST
• No ETL
[Diagram: three Server Nodes, each with Durable Memory]
Genetic Algorithms Grid
• Biological evolution simulation
• Collocated computation
• Chromosome and genes cluster
• F = Fitness Calculation, C = Crossover, M = Mutation
• F = F1 + F2, C = C1 + C2, M = M1 + M2
[Diagram: Ignite Cluster of two nodes with Durable Memory; one runs F1, C1, M1 and the other F2, C2, M2]
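The three operations named on this slide (fitness calculation, crossover, mutation) can be shown with a very small genetic algorithm. The sketch below runs in a single process on a made-up problem (maximize the number of 1-genes); on the grid the slide describes, each of the three steps would be split across nodes. All names and parameters are illustrative assumptions, not Ignite's GA API.

```python
import random

GENES = 12  # hypothetical chromosome length


def fitness(chromosome):
    """F: score a chromosome by its number of 1-genes."""
    return sum(chromosome)


def crossover(mother, father, rng):
    """C: single-point crossover of two parents."""
    point = rng.randrange(1, GENES)
    return mother[:point] + father[point:]


def mutate(chromosome, rng, rate=0.05):
    """M: flip each gene with a small probability."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]


def evolve(generations=40, size=20, seed=7):
    """Return (best fitness at start, best fitness at end)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(GENES)] for _ in range(size)]
    start_best = max(map(fitness, population))
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: size // 2]      # keep the fitter half unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(size - len(parents))
        ]
        population = parents + children
    return start_best, max(map(fitness, population))
```

Because the fitter half of each generation is carried over unchanged, the best fitness never drops from one generation to the next in this sketch.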
Ignite and Spark Integration
• In-memory shared RDD or DataFrame
• Share RDD across jobs on the host
• Share RDD globally
• In-memory indexes
• SQL on top of RDDs
[Diagram: a Spark Application with three Spark Workers, each running two Spark Jobs, over three Ignite Nodes; also labeled Yarn, Mesos, Docker, HDFS]
GEE THANKS!
DANI TRAPHAGEN - @DTRAPEZOID

Nike tech-talk-intro-to-apache-ignite
