Codemotion 2015 Infinispan Tech lab

Infinispan Tech Lab for Codemotion 2015

  1. 1. ROME 27-28 march 2015 – Ugo Landini Quick Start Lab JBoss Data Grid Ugo Landini
 Senior Solution Architect
 ugol@redhat.com
 March 26th 2015
  2. 2. Quick Start Lab - JBoss Data Grid2 • Big Data & NoSQL: super quick introduction to terminology • What developers do to scale out • Consistent Hashing • What’s a Data Grid • DEMO • Infinispan/JDG features • Q&A Agenda
  3. 3. Quick Start Lab - JBoss Data Grid3 • Big Data & NoSQL: super quick introduction to terminology • What developers do to scale out • Consistent Hashing • What’s a Data Grid • DEMO • Infinispan/JDG features • Q&A Agenda
  4. 4. Quick Start Lab - JBoss Data Grid4 new generation of technologies ... designed to economically extract value from very large volumes of a wide variety of data, by enabling high velocity capture, discovery and/or analysis IDC, 2012 Big Data
  5. 5. Quick Start Lab - JBoss Data Grid5 Not Only SQL Just an alternative to RDBMS NoSQL
  6. 6. Quick Start Lab - JBoss Data Grid6 K/V Store Document Store Column based DB Graph DB XML, Object DB, Multidimensional, Grid/Cloud, … see map on https://451research.com/images/Marketing/dataplatformsmapoctober2014.pdf NoSQL
  7. 7. Quick Start Lab - JBoss Data Grid7 NoSQL
  8. 8. Quick Start Lab - JBoss Data Grid8 We’re here NoSQL
  9. 9. Quick Start Lab - JBoss Data Grid9 •Very hard to categorise in a systematic way •Many nuances •Many cases of “Evolutionary Convergence” •i.e. evolving similar features having to adapt to similar environments NoSQL
  10. 10. Quick Start Lab - JBoss Data Grid10 CAP Theorem
  11. 11. Quick Start Lab - JBoss Data Grid11 •Brewer’s Theorem (2000, proven in 2002) •Three guarantees of a Distributed System •Consistency •Availability •Partition Tolerance CAP Theorem
  12. 12. Quick Start Lab - JBoss Data Grid12 All nodes see the same data at the same time Consistency
  13. 13. Quick Start Lab - JBoss Data Grid13 A guarantee that every request receives a response about whether it succeeded or failed Availability
  14. 14. Quick Start Lab - JBoss Data Grid14 The system continues to operate despite arbitrary message loss or failure of part of the system Partition Tolerance
  15. 15. Quick Start Lab - JBoss Data Grid15 The system continues to operate despite arbitrary message loss or failure of part of the system Partition Tolerance
  16. 16. Quick Start Lab - JBoss Data Grid16 Consistency: Transactions Availability: Redundancy Partition Tolerance: Scaleout CAP: Popular Version
  17. 17. Quick Start Lab - JBoss Data Grid17 Consistency: Transactions Availability: Redundancy Partition Tolerance: Scaleout NO GO CAP: Popular Version
  18. 18. Quick Start Lab - JBoss Data Grid18 Consistency: Transactions Availability: Redundancy Partition Tolerance: Scaleout RDBMS CAP: Popular Version
  19. 19. Quick Start Lab - JBoss Data Grid19 Consistency: Transactions Availability: Redundancy Partition Tolerance: Scaleout NoSQL CAP: Popular Version
  20. 20. Quick Start Lab - JBoss Data Grid20 Brewer wrote an essay in 2012 to clarify some of the CAP implications http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed CAP: Modern Version
  21. 21. Quick Start Lab - JBoss Data Grid21 The "two out of three" concept can be misleading or misapplied and it should be considered as a tautology Many vendors used CAP theorem just as an excuse to sacrifice Consistency CAP: Modern Version
  22. 22. Quick Start Lab - JBoss Data Grid22 Partitions are rare, so there is little reason to forfeit C or A when the system is not partitioned The choice between C and A can occur many times within the same system at very fine granularity CAP: Modern Version
  23. 23. Quick Start Lab - JBoss Data Grid23 Different decisions about C and A: •for different operations •for different data •in different moments CAP: Modern Version
  24. 24. Quick Start Lab - JBoss Data Grid24 Finally, C, A and P are more continuous than binary: •A is obviously continuous •Many levels of Consistency (think isolation level in classic DB) •Even Partitions have nuances, including disagreement within the system about whether a partition exists CAP: Modern Version
  25. 25. Quick Start Lab - JBoss Data Grid25 • Big Data & NoSQL: super quick introduction to terminology • What developers do to scale out • Consistent Hashing • What’s a Data Grid • DEMO • Infinispan/JDG features • Q&A Agenda
  26. 26. 26 Virtual Machine 1 Client Cache RDBMS read & write Local Caching
  27. 27. 27 Virtual Machine 1 Client Cache RDBMS read & write •Single JVM •little memory •no HA Local Caching
  28. 28. 28 Virtual Machine 1 Client 1 Cache 1 RDBMS Virtual Machine 2 Client 2 Cache 2 1. Client 1 reads A First try at distributed caching
  29. 29. 29 Virtual Machine 1 Client 1 Cache 1 RDBMS Virtual Machine 2 Client 2 Cache 2 2. Client 1 writes A to Cache 1 First try at distributed caching
  30. 30. 30 Virtual Machine 1 Client 1 Cache 1 RDBMS Virtual Machine 2 Client 2 Cache 2 3. Client 2 writes A2 to RDBMS First try at distributed caching
  31. 31. 31 Virtual Machine 1 Client 1 Cache 1 RDBMS Virtual Machine 2 Client 2 Cache 2 4. Client 1 reads A from Cache 1 First try at distributed caching
  32. 32. 32 Distributed Caching on many nodes What about dirty reads? (i.e. how to cope with multiple writes, invalidation, etc.) First try at distributed caching
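The four steps of the first try can be replayed in a few lines. This is a minimal plain-Java sketch, not JDG code: the `database` map stands in for the RDBMS, and the class and method names are hypothetical. It only shows why Client 1 keeps reading the old value once Client 2 has written to the database.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a JVM-local cache in front of a shared store.
class StaleReadSketch {
    // Stands in for the RDBMS of the slides.
    static final Map<String, String> database = new HashMap<>();

    // Read-through cache with no invalidation and no synchronisation.
    static class LocalCache {
        private final Map<String, String> entries = new HashMap<>();

        String read(String key) {
            // Go to the database only on a cache miss.
            return entries.computeIfAbsent(key, database::get);
        }
    }

    // Returns { first read, second read, value in the database }.
    static String[] run() {
        database.clear();
        database.put("A", "v1");
        LocalCache cache1 = new LocalCache();

        String first = cache1.read("A");   // steps 1-2: Client 1 reads A, Cache 1 now holds it
        database.put("A", "v2");           // step 3: Client 2 writes the new value to the RDBMS
        String second = cache1.read("A");  // step 4: Client 1 reads A from Cache 1 again

        return new String[] { first, second, database.get("A") };
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println("cache1=" + r[1] + " database=" + r[2]);
    }
}
```

The second read still returns `v1` while the database holds `v2`: nothing tells Cache 1 that its entry is out of date.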
  33. 33. 33 Virtual Machine 1 Client 1 Cache 1 RDBMS Virtual Machine 2 Client 2 Cache 2 1. Client 2 writes A2 to RDBMS Second try at distributed caching
  34. 34. 34 Virtual Machine 1 Client 1 Cache 1 RDBMS Virtual Machine 2 Client 2 Cache 2 2. Client 2 updates Cache 2 Second try at distributed caching
  35. 35. 35 Virtual Machine 1 Client 1 Cache 1 RDBMS Virtual Machine 2 Client 2 Cache 2 3. sync Caches Second try at distributed caching
  36. 36. 36 Virtual Machine 1 Client 1 Cache 1 RDBMS Virtual Machine 2 Client 2 Cache 2 4. Client 1 reads A2 from Cache 1 Second try at distributed caching
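The second try can be sketched the same way. Again this is hypothetical plain Java, not the JDG API: every write goes to the database, then to the local cache, then to every peer cache. The in-process `peers` list is an assumption that stands in for whatever network synchronisation a real setup would use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: local caches that push every write to their peers.
class SyncedCachesSketch {
    // Stands in for the RDBMS of the slides.
    static final Map<String, String> database = new HashMap<>();

    static class SyncedCache {
        private final Map<String, String> entries = new HashMap<>();
        private final List<SyncedCache> peers = new ArrayList<>();

        void addPeer(SyncedCache peer) {
            peers.add(peer);
        }

        void write(String key, String value) {
            database.put(key, value);          // 1. write to the RDBMS
            entries.put(key, value);           // 2. update the local cache
            for (SyncedCache peer : peers) {
                peer.entries.put(key, value);  // 3. sync the other caches
            }
        }

        String read(String key) {
            return entries.computeIfAbsent(key, database::get);
        }
    }

    // Returns what Client 1 reads after Client 2 has written the new value.
    static String run() {
        database.clear();
        database.put("A", "v1");
        SyncedCache cache1 = new SyncedCache();
        SyncedCache cache2 = new SyncedCache();
        cache1.addPeer(cache2);
        cache2.addPeer(cache1);

        cache1.read("A");         // Cache 1 holds the old value
        cache2.write("A", "v2");  // Client 2 writes the new value
        return cache1.read("A");  // Client 1 now sees the new value
    }
}
```

The read is no longer stale, but every write now costs one message per peer, and the sketch ignores failures and concurrent writers.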
  37. 37. Quick Start Lab - JBoss Data Grid37 New Cache topology Startup time State transfers Incompatible JVM tunings GCs Non Java clients Second try at distributed caching
  38. 38. Quick Start Lab - JBoss Data Grid38 • Big Data & NoSQL: super quick introduction to terminology • What developers do to scale out • Consistent Hashing • What’s a Data Grid • Infinispan/JDG features • Q&A Agenda
  39. 39. Quick Start Lab - JBoss Data Grid39 Hashing Wheel: a mathematical “wheel” on which you hash Ks (keys) and Ns (nodes). The relative position of Ks and Ns determines which Node is the “owner” of that particular K in a topology Consistent Hashing
  40. 40. Quick Start Lab - JBoss Data Grid40 N1 Node 1 N2 N3 Node 2 Node 3 Consistent Hashing
  41. 41. Quick Start Lab - JBoss Data Grid41 Ns (nodes) on the “wheel” partition the hash space in segments Every segment contains a range of Ks Consistent Hashing
  42. 42. Quick Start Lab - JBoss Data Grid42 N1 Node 1 N2 N3 Node 2 Node 3 K250 Consistent Hashing
  43. 43. Quick Start Lab - JBoss Data Grid43 N1 Node 1 N2 N3 Node 2 Node 3 K250 owner = N2 Consistent Hashing
  44. 44. Quick Start Lab - JBoss Data Grid44 N1 Node 1 N2 N3 Node 2 Node 3 K250 K570 K700 K900 K53 Consistent Hashing
  45. 45. Quick Start Lab - JBoss Data Grid45 Going clockwise from the K: •the first N is the owner •next N is the replica •next next N could be another replica, and so on Consistent Hashing
  46. 46. Quick Start Lab - JBoss Data Grid46 N1 Node 1 N2 N3 Node 2 Node 3 K250 K570 K700 K900 K53 owner = N2 replica = N3 Consistent Hashing
  47. 47. Quick Start Lab - JBoss Data Grid47 What happens if a node dies? Consistent Hashing
  48. 48. Quick Start Lab - JBoss Data Grid48 N1 Node 1 N3 Node 2 Node 3 K250 K570 K700 K900 K53 owner = N2 replica = N3 Consistent Hashing
  49. 49. Quick Start Lab - JBoss Data Grid49 N1 Node 1 N3Node 3 K250 K570 K700 K900 K53 Consistent Hashing
  50. 50. Quick Start Lab - JBoss Data Grid50 N1 Node 1 N3Node 3 K250 K570 K700 K900 K53 owner = N3 replica = N1 Consistent Hashing
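The clockwise walk on the wheel fits in a sorted map. The sketch below is a textbook hashing wheel in plain Java, not the algorithm JDG actually ships. The wheel positions (N1 at 100, N2 at 400, N3 at 800) are assumptions chosen so that K250 behaves as in the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of a hashing wheel: nodes and keys share one position space.
class HashWheelSketch {
    // Position on the wheel -> node name, kept sorted by position.
    private final TreeMap<Integer, String> wheel = new TreeMap<>();

    void addNode(int position, String name) {
        wheel.put(position, name);
    }

    void removeNode(String name) {
        wheel.values().removeIf(name::equals);
    }

    // Walks clockwise from the key: the first node is the owner,
    // the following ones are the replicas.
    List<String> owners(int keyPosition, int copies) {
        List<String> result = new ArrayList<>();
        if (wheel.isEmpty()) {
            return result;
        }
        int wanted = Math.min(copies, wheel.size());
        Integer pos = wheel.ceilingKey(keyPosition);
        if (pos == null) {
            pos = wheel.firstKey(); // wrap around the wheel
        }
        while (result.size() < wanted) {
            result.add(wheel.get(pos));
            pos = wheel.higherKey(pos);
            if (pos == null) {
                pos = wheel.firstKey(); // wrap around the wheel
            }
        }
        return result;
    }

    public static void main(String[] args) {
        HashWheelSketch w = new HashWheelSketch();
        w.addNode(100, "N1");
        w.addNode(400, "N2");
        w.addNode(800, "N3");
        System.out.println(w.owners(250, 2)); // owner N2, replica N3
        w.removeNode("N2");
        System.out.println(w.owners(250, 2)); // owner N3, replica N1
    }
}
```

Only the keys that N2 owned or replicated change placement when it dies; keys in the other segments stay where they were.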
  51. 51. Quick Start Lab - JBoss Data Grid51 The real CH algorithm implemented in JDG is slightly different CH is optimized to minimize state transfer (i.e. number of keys moving when a node dies or a new one joins the cluster) Consistent Hashing
  52. 52. Quick Start Lab - JBoss Data Grid52 • Big Data & NoSQL: super quick introduction to terminology • What developers do to scale out • Consistent Hashing • What’s a Data Grid • DEMO • Infinispan/JDG features • Q&A Agenda
  53. 53. Quick Start Lab - JBoss Data Grid53 Distributed Memory Storage Engine Networked Memory A Distributed Cache “on steroids” A Transactional NoSQL What’s a Data Grid?
  54. 54. Quick Start Lab - JBoss Data Grid54 •Key/Value storage •Search Engine (from K/V to Document storage) •Linear Scalability, Elasticity and Fault tolerance •Thanks to CH •Memory based •Persistence engines are optional What’s a Data Grid?
  55. 55. Quick Start Lab - JBoss Data Grid55 •Different Topologies •Querying •Task Execution & Map/Reduce •Partition Handling •Data Affinity (to squeeze every bit of performance) Data Grid > Distributed Caching
  56. 56. Quick Start Lab - JBoss Data Grid56 •LOCAL •INVALIDATION •REPLICATED •DISTRIBUTED JDG Cache Topologies (Cluster modes)
  57. 57. Quick Start Lab - JBoss Data Grid57 •LOCAL •simple cache (EHCache-like) •INVALIDATION •REPLICATED •DISTRIBUTED JDG Cache Topologies (Cluster modes)
  58. 58. Quick Start Lab - JBoss Data Grid58 •LOCAL •INVALIDATION •no sharing •REPLICATED •DISTRIBUTED JDG Cache Topologies (Cluster modes)
  59. 59. Quick Start Lab - JBoss Data Grid59 •LOCAL •INVALIDATION •REPLICATED •All nodes are equal •4 Nodes @ 8 GB = 8 GB •DISTRIBUTED JDG Cache Topologies (Cluster modes)
  60. 60. Quick Start Lab - JBoss Data Grid60 •LOCAL •INVALIDATION •REPLICATED •DISTRIBUTED •For example: 1 Replica •4 Nodes @ 8 GB = 16 GB JDG Cache Topologies (Cluster modes)
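The two capacity figures follow from one division: total memory over the number of copies kept of each entry. A small sketch, assuming identical nodes and ignoring any per-entry overhead (the class and method names are hypothetical):

```java
// Hypothetical sketch: usable grid capacity for the two clustered modes.
class GridCapacitySketch {
    // Total memory divided by how many copies of each entry the grid keeps.
    static int usableGb(int nodes, int gbPerNode, int copiesPerEntry) {
        return nodes * gbPerNode / copiesPerEntry;
    }

    // REPLICATED: every node holds every entry, so copies = nodes.
    static int replicatedGb(int nodes, int gbPerNode) {
        return usableGb(nodes, gbPerNode, nodes);
    }

    // DISTRIBUTED: one owner plus the configured number of replicas.
    static int distributedGb(int nodes, int gbPerNode, int replicas) {
        return usableGb(nodes, gbPerNode, 1 + replicas);
    }

    public static void main(String[] args) {
        System.out.println(replicatedGb(4, 8));     // 8
        System.out.println(distributedGb(4, 8, 1)); // 16
    }
}
```

Adding a node to a replicated cache adds no capacity, while a distributed cache with one replica gains half of the new node's memory.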
  61. 61. 61 Server B JDG 3 JDG 4 Server A JDG 1 JDG 2 cluster 4 JDG Nodes on 2 servers A Simple Grid
  62. 62. 62 JDG 1 JDG 2 JDG 3 JDG 4 K0 K1 K6 K3 K8 K2 K4 K9 K5 K7 Distributed without Replica
  63. 63. 63 JDG 1 JDG 2 JDG 3 JDG 4 K0 K1 K6 K3 K8 K2 K4 K9 K5 K7 K5 K2 K9 K7 K4 K3 K1 K0 K8 K6 Distributed with Replica
  64. 64. 64 JDG 1 JDG 2 JDG 3 JDG 4 K0K1 K6 K3 K8 K2 K4 K9K5 K7 K0K1 K6 K3 K8 K2 K4 K9K5 K7 K0K1 K6 K3 K8 K2 K4 K9K5 K7 K0K1 K6 K3 K8 K2 K4 K9K5 K7 Replicated
  65. 65. Quick Start Lab - JBoss Data Grid65 •Replicated: •“Small” set of data with high % of reads vs writes •Distributed: •“Big” set of data: linear scaling •You need M/R & Distexec How do I choose?
  66. 66. Quick Start Lab - JBoss Data Grid66 •You can have different Cache configurations in the same CacheManager •mix&match Replicated and Distributed as needed JDG Cache Topologies (Cluster modes)
  67. 67. Quick Start Lab - JBoss Data Grid67 •Default hashing (Distributed mode): MurmurHash3. •It’s a simple and standard hash function: •you can change it as you like, for example if your key already identifies a partitioning criterion Tuning your hashing
  68. 68. Quick Start Lab - JBoss Data Grid68 •Can be “fine tuned” in 4 different ways: •Server Hinting •Virtual Servers •Grouping •Key Affinity Tuning your hashing
  69. 69. Quick Start Lab - JBoss Data Grid69 •A triple (site, rack, server) •You increase availability by keeping replicas from ending up in the same (site, rack, server) as the master Server Hinting
  70. 70. Quick Start Lab - JBoss Data Grid70 •Number of “segments” into which the cluster is partitioned •Improves the node distribution on the hashing wheel for a better distribution of keys •Default: 60 Virtual Servers
  71. 71. Quick Start Lab - JBoss Data Grid71 •Data colocation •A cache node contains K but also other relevant data related to K •Example: a customer and their bank transactions •You just have to define a group, JDG will colocate all data of the same group in the same node Grouping
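One way to picture grouping: the node is chosen by hashing the group instead of the key. The sketch below is hypothetical plain Java, not the JDG grouping API, and uses `String.hashCode()` modulo the node count in place of MurmurHash3 and the real segment mapping.

```java
// Hypothetical sketch: colocating entries by hashing their group, not their key.
class GroupingSketch {
    // Returns the index of the node that stores the entry.
    // When a group is given it replaces the key as the hashed value.
    static int nodeFor(String key, String group, int nodeCount) {
        String hashed = (group != null) ? group : key;
        return Math.floorMod(hashed.hashCode(), nodeCount);
    }

    public static void main(String[] args) {
        String group = "customer-42";
        // The customer and both transactions land on the same node index.
        System.out.println(nodeFor("customer-42", group, 4));
        System.out.println(nodeFor("transaction-1", group, 4));
        System.out.println(nodeFor("transaction-2", group, 4));
    }
}
```

All three calls print the same index, so code running on that node can read the customer and the related transactions without a network hop.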
  72. 72. Quick Start Lab - JBoss Data Grid72 •Like Grouping, but from another perspective: •You just ask a node for a key that will be hashed on that node •Grouping/Affinity are your best friends if you want to reach JDG Nirvana! Key Affinity
  73. 73. Quick Start Lab - JBoss Data Grid73 •All data needed by a node of your application is local, a single Java method call away JDG Nirvana
  74. 74. Quick Start Lab - JBoss Data Grid74 • Big Data & NoSQL: super quick introduction to terminology • What developers do to scale out • Consistent Hashing • What’s a Data Grid • DEMO • Infinispan/JDG features • Q&A Agenda
  75. 75. Quick Start Lab - JBoss Data Grid75 •Small self-contained projects that can be used to simply explain JDG to customers •https://github.com/redhat-italy/jdg-quickstarts JDG Quickstarts
  76. 76. Quick Start Lab - JBoss Data Grid76 • Big Data & NoSQL: super quick introduction to terminology • What developers do to scale out • Consistent Hashing • What’s a Data Grid • DEMO • Infinispan/JDG features • Q&A Agenda
  77. 77. Quick Start Lab - JBoss Data Grid77 •If JDG detects a split brain, partitions enter degraded mode •A degraded partition can read/write ONLY fully owned keys •A partition fully owns a key if it contains the master and replica nodes for that key •You’ll get an AvailabilityException for other keys Partition Handling
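The "fully owned" rule reduces to a subset check. The sketch below is hypothetical plain Java: `KeyUnavailableException` is a local stand-in for the `AvailabilityException` named on the slide, and the owner list for a key is passed in by hand rather than computed by the grid.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch: access check inside a degraded partition.
class DegradedPartitionSketch {
    // Local stand-in for the AvailabilityException mentioned on the slide.
    static class KeyUnavailableException extends RuntimeException {
        KeyUnavailableException(String message) {
            super(message);
        }
    }

    // Nodes that can still see each other after the split.
    private final Set<String> reachableNodes;

    DegradedPartitionSketch(Set<String> reachableNodes) {
        this.reachableNodes = reachableNodes;
    }

    // A key is fully owned when its master and all its replicas are in this partition.
    boolean fullyOwns(List<String> ownersOfKey) {
        return reachableNodes.containsAll(ownersOfKey);
    }

    // Reads and writes are allowed only for fully owned keys.
    void checkAccess(String key, List<String> ownersOfKey) {
        if (!fullyOwns(ownersOfKey)) {
            throw new KeyUnavailableException("key " + key + " is not fully owned by this partition");
        }
    }
}
```

With a partition made of N1 and N2, a key owned by (N1, N2) stays usable, while a key owned by (N2, N3) is rejected on both sides of the split, which trades availability for consistency.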
  78. 78. Quick Start Lab - JBoss Data Grid78 •Cache Store •Not only in memory! •Write through & write behind (ACK sync or async) •Pluggable “drivers” •File System, JPA, LevelDB (supported) •MongoDB, Cassandra, BerkeleyDB, etc. (community) Persistence
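The difference between write through and write behind is when the store sees the write. A minimal plain-Java sketch, not the JDG CacheStore API: the `store` map stands in for any of the drivers listed above, and `flush()` is a hypothetical hook that a background thread would call in a real write-behind setup.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch: one cache, two persistence strategies.
class CacheStoreSketch {
    private final Map<String, String> memory = new HashMap<>();
    private final Map<String, String> store = new HashMap<>();  // stands in for a cache store
    private final Queue<String[]> pending = new ArrayDeque<>(); // queued {key, value} writes
    private final boolean writeBehind;

    CacheStoreSketch(boolean writeBehind) {
        this.writeBehind = writeBehind;
    }

    void put(String key, String value) {
        memory.put(key, value);
        if (writeBehind) {
            pending.add(new String[] { key, value }); // acknowledged before it is persisted
        } else {
            store.put(key, value);                    // persisted before the call returns
        }
    }

    // Drains the queued writes into the store.
    void flush() {
        String[] write;
        while ((write = pending.poll()) != null) {
            store.put(write[0], write[1]);
        }
    }

    boolean persisted(String key) {
        return store.containsKey(key);
    }
}
```

Write behind gives faster puts at the price of a window in which an acknowledged write exists only in memory.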
  79. 79. Quick Start Lab - JBoss Data Grid79 •To avoid Out Of Memory •Entry can be “passivated” on disk (you’ll need a CacheStore) Eviction
  80. 80. Quick Start Lab - JBoss Data Grid80 •To avoid Out Of Memory •Entry can be “passivated” on disk (you’ll need a CacheStore) Eviction
  81. 81. Quick Start Lab - JBoss Data Grid81 •You assign a lifespan or a max idle time to a key •The key will then be automatically removed after that time •You don’t need to write cleanup code yourself Expiry
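Lifespan and max idle time are two independent clocks on the same entry. The sketch below is hypothetical plain Java, not the JDG API: the current time is passed in explicitly to keep the example deterministic, and expired entries are dropped lazily on read.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: per-key expiry with a lifespan and a max idle time.
class ExpirySketch {
    private static final class Entry {
        final String value;
        final long createdAt;
        final long lifespanMs;
        final long maxIdleMs;
        long lastUsedAt;

        Entry(String value, long now, long lifespanMs, long maxIdleMs) {
            this.value = value;
            this.createdAt = now;
            this.lastUsedAt = now;
            this.lifespanMs = lifespanMs;
            this.maxIdleMs = maxIdleMs;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    void put(String key, String value, long lifespanMs, long maxIdleMs, long now) {
        entries.put(key, new Entry(value, now, lifespanMs, maxIdleMs));
    }

    // Returns the value, or null when the entry is missing or has expired.
    String get(String key, long now) {
        Entry e = entries.get(key);
        if (e == null) {
            return null;
        }
        boolean tooOld = now - e.createdAt >= e.lifespanMs;  // lifespan counts from creation
        boolean tooIdle = now - e.lastUsedAt >= e.maxIdleMs; // max idle counts from last access
        if (tooOld || tooIdle) {
            entries.remove(key);
            return null;
        }
        e.lastUsedAt = now;
        return e.value;
    }
}
```

An entry that is read often can survive its max idle time indefinitely, but never its lifespan.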
  82. 82. Quick Start Lab - JBoss Data Grid82 Expiry
  83. 83. Quick Start Lab - JBoss Data Grid83 •Both avoid Out Of Memory •“Evicted” data can be maintained in the Grid with Passivation •Eviction is a Cache configuration •Expiration is a Key configuration •Expiration could be a business requirement •Eviction is a system feature Eviction/Expiry: differences
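Eviction with passivation can be pictured as a bounded in-memory map that spills its least recently used entry to a store. The sketch below is hypothetical plain Java built on `LinkedHashMap`, not the JDG eviction API: the `store` map stands in for a CacheStore on disk, and LRU is just one possible eviction policy.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: LRU eviction that passivates entries instead of losing them.
class EvictionSketch {
    // Stands in for a CacheStore on disk.
    private final Map<String, String> store = new HashMap<>();
    private final LinkedHashMap<String, String> memory;

    EvictionSketch(final int maxEntries) {
        // accessOrder = true keeps the least recently used entry first.
        memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() > maxEntries) {
                    store.put(eldest.getKey(), eldest.getValue()); // passivate to the store
                    return true;                                   // then drop from memory
                }
                return false;
            }
        };
    }

    void put(String key, String value) {
        memory.put(key, value);
    }

    String get(String key) {
        String value = memory.get(key);
        if (value == null) {
            value = store.remove(key);  // activate: bring the entry back from the store
            if (value != null) {
                memory.put(key, value); // may passivate another entry in turn
            }
        }
        return value;
    }

    boolean inMemory(String key) {
        return memory.containsKey(key);
    }
}
```

The memory bound is a property of the whole cache, which is the sense in which eviction is a cache configuration while expiry is set per key.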
  84. 84. Quick Start Lab - JBoss Data Grid84 •JDG has full support for transactions •Local Transactions •Global Transactions (XA): if running inside an AS automatically uses its TX Manager •Batching API Transactions
  85. 85. Quick Start Lab - JBoss Data Grid85 •Cache/CacheManager events •Topology changes •Entries being added, removed, modified •Cluster listeners Listener/Notifications
  86. 86. Quick Start Lab - JBoss Data Grid86 •Infinispan-query module •Hibernate Search & Lucene •Querying via DSL •Lucene indexes can be kept in memory, on disk or in the grid Querying the grid
  87. 87. Quick Start Lab - JBoss Data Grid87 •With M/R you can implement distributed global operations on the grid •Each node works on its data (Map) •Results are later aggregated (Reduce) Map/Reduce
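A word count shows the two phases. The sketch below is hypothetical plain Java, not the JDG Map/Reduce API: each list passed to `mapOnNode` stands for the values one node holds locally, and the phases run in one JVM instead of across the grid.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: word count split into a per-node Map phase and a Reduce phase.
class MapReduceSketch {
    // Map: a node counts the words found in its own local values only.
    static Map<String, Integer> mapOnNode(List<String> localValues) {
        Map<String, Integer> counts = new HashMap<>();
        for (String value : localValues) {
            for (String word : value.split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Reduce: the partial counts of all nodes are summed into one result.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, count) -> total.merge(word, count, Integer::sum));
        }
        return total;
    }
}
```

Only the small partial maps travel between nodes; the values themselves never leave the node that owns them.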
  88. 88. Quick Start Lab - JBoss Data Grid88 Map/Reduce
  89. 89. Quick Start Lab - JBoss Data Grid89 Map/Reduce
  90. 90. Quick Start Lab - JBoss Data Grid90 •JDG 7 will implement HDFS API •So it will be able to act as a super fast Hadoop store Hadoop, coming soon…
  91. 91. Quick Start Lab - JBoss Data Grid91 •With Distexec you can submit “tasks” to the Grid •The task can be executed on each node or on a subset of the nodes •The task can modify data in the Grid Distributed Execution (Distexec)
  92. 92. Quick Start Lab - JBoss Data Grid92 Cross Site Replication
  93. 93. Quick Start Lab - JBoss Data Grid93 •“Follow the Sun” architectures •Many different clusters that can be kept in sync Cross Site Replication
  94. 94. Quick Start Lab - JBoss Data Grid94 •JSR-107 •Java Temporary Caching API •Confirmed in January 2015 •In roadmap for JDG 6.5 •JSR-347 •Data Grids for the Java Platform •JSR Retired in January 2015 Standard APIs
  95. 95. Quick Start Lab - JBoss Data Grid95 •Command Line Console •JMX •JON Plugin Management Tooling
  96. 96. Quick Start Lab - JBoss Data Grid96 •User Authentication •SASL •Role Based Access Control (RBAC) •Users, Roles and mapping between roles and operations on Cache / Cache-Manager •Node Authentication & Authorisation •Encrypted communication between nodes Data Security
  97. 97. Quick Start Lab - JBoss Data Grid97 •Library mode •Embedded in your JVM •C/S mode •REST •Memcached •Hot Rod Embedded vs Client/Server
  98. 98. Quick Start Lab - JBoss Data Grid98 Embedded vs Client/Server
  99. 99. Quick Start Lab - JBoss Data Grid99 Protocol Comparison
      Mode | Protocol | Client Libs | Smart Routing | Load Balancing/Failover | TX | Listeners | M/R | Dist | Querying | Separated Cluster
      Library mode | inVM | N/A | Yes | Dynamic | Yes | Yes | Yes | Yes | Yes | No
      REST | Text | HTTP | No | Any HTTP load balancer | No | No | No | No | No | Yes
      Memcached | Text | Many | No | Predefined server list | No | No | No | No | No | Yes
      Hot Rod | Binary | Java/Python/C++ | Yes | Dynamic | Local w MVCC | Yes (6.4) | No | No | Yes (6.3) | Yes
  100. 100. ROME 27-28 march 2015 – Ugo Landini Q&A
  101. 101. ROME 27-28 march 2015 – Ugo Landini Thank You! Leave your feedback on Joind.in! https://joind.in/event/view/3347 Quick Start Lab JBoss Data Grid Ugo Landini
 Senior Solution Architect
 ugol@redhat.com
 March 26th 2015