From Terracotta To Hazelcast 
Introduction on Hazelcast for Terracotta Users 
© 2014 Hazelcast Inc. 
AUTUMN 2014
About me 
© 2014 Hazelcast Inc. 
Rahul Gupta 
@wildnez 
! 
Senior Solutions Architect for Hazelcast 
Worked with Terracotta and Coherence 
Worked for Major Investment Banks 
Java Programmer since 1996 
Started programming in VHDL, later on the 8048 and 80286 CPUs 
2
How this is going down 
© 2014 Hazelcast Inc. 
3 
• Limitations of Terracotta 
• What is Hazelcast 
• Migration 
• Important Features
Limitations of Terracotta 
! 
• Only an in-memory data store 
! 
• Complex APIs to use distributed collections 
! 
• No capability to process data in memory 
! 
• Data needs to be fetched by the application resulting in network hops 
! 
• Inflexible - Only Client-Server architecture 
! 
• Dedicated environment for backup, Extra Server Licenses (Passive/Mirror) 
! 
• Requires dedicated environment 
! 
• Requires downtime to scale 
© 2014 Hazelcast Inc. 
4
Hazelcast overcomes the limitations 
• True IMDG space 
© 2014 Hazelcast Inc. 
– In-Memory Distributed Data Caching 
» Native Memory 
– In-Memory Parallel Data Processing 
» Distributed ExecutorService 
» EntryProcessors 
– In-Memory Map-Reduce 
! 
• Distributed Pub-Sub Messaging Model 
! 
• Simple Access to Distributed Collections 
5
Hazelcast overcomes the limitations 
• Highly Flexible Deployments 
© 2014 Hazelcast Inc. 
– Client-Server 
» Servers run in a separate tier in dedicated 
environment 
» Does not require dedicated infrastructure for 
running backup 
– Embedded 
» Hazelcast node runs within the application 
JVM 
» Application nodes made distributed by 
Hazelcast running within their JVM 
» Does not require dedicated environment 
6
Hazelcast overcomes the limitations 
• Backups also serve as Main nodes 
! 
• No extra licenses for backup 
! 
• Scales on the fly 
! 
• No downtime required to add/remove nodes 
© 2014 Hazelcast Inc. 
7
Configuring & forming a cluster 
© 2014 Hazelcast Inc. 
8
Forming a Cluster 
• Hazelcast clusters run on the JVM 
• Hazelcast discovers other instances via Multicast 
(Default) 
• Use TCP/IP lists when Multicast not possible 
• Segregate Clusters on same network via configuration 
• Hazelcast can form clusters on Amazon EC2. 
© 2014 Hazelcast Inc. 
9
Hazelcast Configuration - Server 
• Only one jar - look for hazelcast-all-x.x.x.jar 
! 
• Hazelcast searches for hazelcast.xml on class path 
! 
• Falls back to hazelcast-default.xml when hazelcast.xml is not found 
! 
• Hazelcast can be configured via XML, API or Spring 
! 
• Configure Networks, Data Structures, 
Indexes, Compute 
© 2014 Hazelcast Inc. 
10
Form a cluster 
1. A sample hazelcast.xml looks like this: 
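A minimal sketch of such a file (group, network and map sections follow the Hazelcast 3.x schema; the names and values are illustrative): 

    <hazelcast>
        <group>
            <name>dev</name>
            <password>dev-pass</password>
        </group>
        <network>
            <port auto-increment="true">5701</port>
            <join>
                <multicast enabled="true"/>
            </join>
        </network>
        <map name="default">
            <backup-count>1</backup-count>
        </map>
    </hazelcast>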
© 2014 Hazelcast Inc. 
11
TOP TIP 
• The <group> configuration element is your friend. 
! 
• It will help you isolate your cluster on the multicast 
network. 
! 
• Don’t make the mistake of joining another developer’s 
cluster, or worse still a Production Cluster! 
© 2014 Hazelcast Inc. 
12
Configuration via API 
1. Add GroupConfig to the Config instance. 
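A minimal sketch of the programmatic equivalent (the group name and password are illustrative):

    import com.hazelcast.config.Config;
    import com.hazelcast.config.GroupConfig;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;

    Config config = new Config();
    // the group name/password isolate this cluster from others on the same network
    config.setGroupConfig(new GroupConfig("dev", "dev-pass"));
    HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);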
© 2014 Hazelcast Inc. 
13
TOP TIP 
• You can run multiple Hazelcast instances in one JVM. 
! 
• Handy for unit testing. 
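A minimal sketch of what a unit test might do (assumes the standard Hazelcast start-up API):

    // Two members started in the same JVM form a two-node cluster
    HazelcastInstance member1 = Hazelcast.newHazelcastInstance();
    HazelcastInstance member2 = Hazelcast.newHazelcastInstance();
    assert member1.getCluster().getMembers().size() == 2;
    Hazelcast.shutdownAll();   // tear the cluster down between tests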
! 
© 2014 Hazelcast Inc. 
14
Configure Cluster to use TCP/IP 
1. Set multicast enabled="false" 
2. Add a tcp-ip element with your IP addresses 
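A minimal sketch of the join section (member addresses are illustrative):

    <network>
        <join>
            <multicast enabled="false"/>
            <tcp-ip enabled="true">
                <member>192.168.1.21</member>
                <member>192.168.1.22</member>
            </tcp-ip>
        </join>
    </network>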
© 2014 Hazelcast Inc. 
15
Client Configuration 
© 2014 Hazelcast Inc. 
16
Hazelcast Configuration - Client 
• Hazelcast searches for hazelcast-client.xml on class path 
• Full API stack - Client API same as Server API 
• Clients in Java, C#, C++, Memcache, REST 
© 2014 Hazelcast Inc. 
17
Starting as a Client or Cluster JVM 
© 2014 Hazelcast Inc. 
18 
Notice that the client and the cluster member both return the same HazelcastInstance type.
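A minimal sketch of both start-up paths (the map name is illustrative):

    // Cluster member JVM - reads hazelcast.xml from the classpath
    HazelcastInstance member = Hazelcast.newHazelcastInstance();

    // Client JVM - reads hazelcast-client.xml from the classpath
    HazelcastInstance client = HazelcastClient.newHazelcastClient();

    // Both expose the same HazelcastInstance API
    IMap<String, String> map = client.getMap("customers");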
Code Migration 
© 2014 Hazelcast Inc. 
19
Migration - Terracotta to Hazelcast 
• Terracotta implementation of cache puts/gets: 
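A minimal sketch of that style, assuming the Ehcache API that Terracotta-clustered caches are typically accessed through (the cache name and the Customer value type are illustrative):

    CacheManager cacheManager = CacheManager.newInstance();  // reads ehcache.xml from the classpath
    Cache cache = cacheManager.getCache("customers");
    cache.put(new Element("42", customer));                   // value has to be wrapped in an Element
    Element element = cache.get("42");
    Customer value = (Customer) element.getObjectValue();     // unwrap and cast on the way out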
! 
! 
! 
! 
• Replace Terracotta implementation by Hazelcast 
code: 
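A minimal sketch of the Hazelcast equivalent (same illustrative names as above):

    HazelcastInstance hz = Hazelcast.newHazelcastInstance();
    IMap<String, Customer> customers = hz.getMap("customers");
    customers.put("42", customer);             // plain java.util.Map semantics
    Customer value = customers.get("42");      // no wrapper object, no cast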
© 2014 Hazelcast Inc. 
20
Migration - Terracotta to Hazelcast 
• Terracotta implementation of a Blocking Queue (notice the 
complex APIs): 
! 
! 
! 
! 
• Replace Terracotta by Hazelcast Queue (notice the 
simplicity): 
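A minimal sketch of the Hazelcast side (the queue name is illustrative; handling of the checked InterruptedException is omitted):

    HazelcastInstance hz = Hazelcast.newHazelcastInstance();
    IQueue<String> orders = hz.getQueue("orders");  // IQueue extends java.util.concurrent.BlockingQueue
    orders.put("order-1");                          // blocks if a configured maximum size is reached
    String next = orders.take();                    // blocks until an element is available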
© 2014 Hazelcast Inc. 
21
Topologies 
© 2014 Hazelcast Inc. 
22
Hazelcast Topologies 
• Traditional Client -> Server (Client -> Cluster) 
! 
• Clients do not take part in standard cluster communications. 
! 
• Consider Client -> Cluster topology to segregate service 
from storage. 
! 
• Smart Clients connect to all cluster nodes. Operations go 
directly to the node holding the data. 
! 
• Embedded model, for example in a J2EE container: service 
and storage within one JVM. 
© 2014 Hazelcast Inc. 
23
Terracotta Cluster 
© 2014 Hazelcast Inc. 
24
Embedded Hazelcast 
© 2014 Hazelcast Inc. 
25
Client Server -> (Client -> Cluster) 
© 2014 Hazelcast Inc. 
26
Distributed Collections 
© 2014 Hazelcast Inc. 
27
Maps 
© 2014 Hazelcast Inc. 
28
Distributed Maps - IMap 
• Conforms to the java.util.Map interface 
! 
• Conforms to the java.util.concurrent.ConcurrentMap interface 
! 
• The Hazelcast IMap interface provides extra features (see the sketch below): 
– EntryListeners 
– Aggregators 
– Predicate Queries 
– Locking 
– Eviction 
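A minimal sketch of one of those extras - per-key locking (the map, key and Account type are illustrative; hz is a HazelcastInstance as in the earlier sketches):

    IMap<String, Account> accounts = hz.getMap("accounts");
    accounts.lock("acc-1");               // pessimistic lock held on a single key, cluster-wide
    try {
        Account account = accounts.get("acc-1");
        account.debit(100);
        accounts.put("acc-1", account);
    } finally {
        accounts.unlock("acc-1");
    }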
© 2014 Hazelcast Inc. 
29
Wildcard Configuration 
• Hazelcast supports wildcards for config (see the sketch below). 
© 2014 Hazelcast Inc. 
30 
• Beware of ambiguous config though. 
• Hazelcast doesn’t pick the best match; the match it picks is 
effectively random, not determined by the order in the config.
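A minimal sketch of a wildcard map configuration (the name pattern and values are illustrative):

    <map name="com.acme.trades.*">
        <backup-count>1</backup-count>
        <time-to-live-seconds>3600</time-to-live-seconds>
    </map>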
Properties 
• Hazelcast supports property replacement in XML config 
© 2014 Hazelcast Inc. 
31 
• Uses System properties by default 
• A properties loader can also be configured (see the sketch below)
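A minimal sketch (the property name is illustrative; its value would come from a system property, e.g. -Dorders.backup.count=2, or from a supplied Properties object):

    <map name="orders">
        <backup-count>${orders.backup.count}</backup-count>
    </map>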
Near Cache 
• Terracotta L1 Cache -> Hazelcast Near Cache 
! 
• Highly recommended for read-mostly maps 
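A minimal sketch of a near-cache configuration (the map name and limits are illustrative):

    <map name="reference-data">
        <near-cache>
            <max-size>5000</max-size>
            <eviction-policy>LRU</eviction-policy>
            <invalidate-on-change>true</invalidate-on-change>
        </near-cache>
    </map>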
© 2014 Hazelcast Inc. 
32
Replicated Map 
© 2014 Hazelcast Inc. 
33 
• Does not partition data. 
• Copies Map Entry to every Cluster JVM. 
! 
• Consider it for immutable, slow-moving data such as configuration. 
! 
• ReplicatedMap interface supports EntryListeners. 
!
Data Distribution and Resource 
Management 
© 2014 Hazelcast Inc. 
34
Data Distribution 
• Data (primary + backup) is distributed in cluster using partitions 
! 
• 271 default partitions 
! 
• Partitions are divided among running cluster JVMs. 
! 
• The client discovers the partition that owns a key before 
sending out update/get calls 
! 
• In a smart-client setup, requests go directly to the node hosting the data 
! 
• Hazelcast places a backup of the Map Entry on another node 
as part of the Map.put 
© 2014 Hazelcast Inc. 
35
Data Distribution 
• The backup operation can be sync (default) or async to the 
Map.put 
! 
• Each node acts as both Primary and Backup, compared to Terracotta’s 
Active-Passive setup on dedicated resources - a more efficient use of 
resources 
! 
• When a cluster JVM joins or leaves, partitions are 
rebalanced 
! 
• In the event of a node failure - 
– Primary data is retrieved from the backup and distributed across 
the remaining nodes in the cluster 
– New backups are created on the surviving nodes 
© 2014 Hazelcast Inc. 
36
Data Distribution 
© 2014 Hazelcast Inc. 
37 
Fixed number of partitions (default 271) 
Each key falls into a partition 
partitionId = hash(keyData)%PARTITION_COUNT 
Partition ownerships are reassigned when membership changes 
A B C
New Node Added 
A B C D 
© 2014 Hazelcast Inc.
Migration 
A B C D 
© 2014 Hazelcast Inc.
Migration Complete 
A B C D 
© 2014 Hazelcast Inc. 
Fault Tolerance & Recovery 
© 2014 Hazelcast Inc. 
46
Node Crashes 
A B C D 
© 2014 Hazelcast Inc. 
Crash
Backups are Restored 
A B C D 
© 2014 Hazelcast Inc. 
Crash
Data is Recovered from backup 
A B C D 
© 2014 Hazelcast Inc. 
Crash
Backup for Recovered Data 
A B C D 
© 2014 Hazelcast Inc. 
Crash
All Safe 
© 2014 Hazelcast Inc. 
A C D
In Memory Format & Serialization 
© 2014 Hazelcast Inc. 
58
In Memory Format 
• Flexibility in data store format compared to Terracotta’s 
binary only 
• By default, data is held in memory in binary (serialised) format. 
• Local processing on a node then has to keep deserialising it. 
• Use OBJECT for local processing (entry processors, executors) - see the config sketch below 
• Use BINARY if you mostly get data over the network 
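A minimal sketch of switching a map to OBJECT format (the map name is illustrative):

    <map name="positions">
        <in-memory-format>OBJECT</in-memory-format>
    </map>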
© 2014 Hazelcast Inc. 
59
Serialization 
© 2014 Hazelcast Inc. 
60 
• Custom serialization, as opposed to the Serializable-only option in Terracotta (see the sketch below) 
– DataSerializable 
• Fine-grained control over serialization 
• Uses reflection to create the class instance 
• “implements DataSerializable” 
– public void writeData(ObjectDataOutput out) 
– public void readData(ObjectDataInput in) 
– IdentifiedDataSerializable 
• An improved version of DataSerializable 
• Avoids reflection - faster serialization 
• extends DataSerializable 
• Two new methods - 
– int getId() - used instead of the class name 
– int getFactoryId() - used to locate the factory that creates the class for a given id
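A minimal sketch of an IdentifiedDataSerializable class (the class, fields and id values are illustrative):

    import java.io.IOException;
    import com.hazelcast.nio.ObjectDataInput;
    import com.hazelcast.nio.ObjectDataOutput;
    import com.hazelcast.nio.serialization.IdentifiedDataSerializable;

    public class Trade implements IdentifiedDataSerializable {
        private String symbol;
        private int quantity;

        public Trade() { }                        // no-arg constructor needed for deserialization

        public void writeData(ObjectDataOutput out) throws IOException {
            out.writeUTF(symbol);
            out.writeInt(quantity);
        }

        public void readData(ObjectDataInput in) throws IOException {
            symbol = in.readUTF();
            quantity = in.readInt();
        }

        public int getFactoryId() { return 1; }   // resolves the registered DataSerializableFactory
        public int getId() { return 1; }          // class id used instead of the class name
    }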
Distributed Compute 
© 2014 Hazelcast Inc. 
61
Distributed Executor Service 
© 2014 Hazelcast Inc. 
62
Distributed Executor 
© 2014 Hazelcast Inc. 
63 
• IExecutorService extends 
java.util.concurrent.ExecutorService 
• Send a Runnable or Callable into the Cluster 
• Targetable execution (see the sketch below): 
– All Members 
– Member 
– MemberSelector 
– KeyOwner 
• Sync/async: block on the returned Future 
• Or an ExecutionCallback notifies onResponse 
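A minimal sketch of both result styles (PriceTask is an illustrative Callable<Integer> that is also Serializable; checked exceptions are omitted):

    IExecutorService executor = hz.getExecutorService("default");

    // Target the member owning a key and block on the Future
    Future<Integer> future = executor.submitToKeyOwner(new PriceTask(), "IBM");
    Integer price = future.get();

    // Or register a callback instead of blocking
    executor.submit(new PriceTask(), new ExecutionCallback<Integer>() {
        public void onResponse(Integer response) { System.out.println(response); }
        public void onFailure(Throwable t) { t.printStackTrace(); }
    });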
!
Distributed Executor 
• If system resources permit, you can scale up the number of 
threads the ExecutorService uses. 
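A minimal sketch of the executor configuration (the pool size is illustrative):

    <executor-service name="default">
        <pool-size>16</pool-size>
    </executor-service>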
© 2014 Hazelcast Inc. 
64 
!! 
! 
!
Distributed Executor 
© 2014 Hazelcast Inc. 
65 
! 
! 
• Each Member creates its own work queue. 
• Tasks are not partitioned or load balanced. 
• If a member dies while a task is enqueued on it, the task is lost. 
• You need to lock any data you access, but beware of 
Deadlocks! 
!
EntryProcessor 
© 2014 Hazelcast Inc. 
66
EntryProcessor 
• A distributed function applied to a Map Entry 
• Provides locking guarantees 
• Works directly on the Entry object in a node 
• Executed on the Partition Thread rather than by the Executor 
• Submitted via the IMap (see the sketch below) 
• The best way to apply delta updates without moving the object 
across the network 
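A minimal sketch (the class, map and key names are illustrative; stockLevels is an IMap<String, Integer>):

    import java.util.Map;
    import com.hazelcast.map.AbstractEntryProcessor;

    public class IncrementStock extends AbstractEntryProcessor<String, Integer> {
        public Object process(Map.Entry<String, Integer> entry) {
            entry.setValue(entry.getValue() + 1);   // delta applied where the data lives
            return null;
        }
    }

    // submitted via the IMap - the value never crosses the network
    stockLevels.executeOnKey("SKU-1", new IncrementStock());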
© 2014 Hazelcast Inc. 
67
EntryProcessor 
© 2014 Hazelcast Inc. 
68 
• An EntryProcessor also mutates the Backup copy 
• Use AbstractEntryProcessor for the default backup 
behaviour 
• Implement EntryProcessor directly to provide your own 
backup behaviour, for example sending the delta only 
! 
• The closest alternative to Terracotta DSO
EntryProcessor 
© 2014 Hazelcast Inc. 
69
EntryProcessor 
© 2014 Hazelcast Inc. 
70 
• Other tasks run on Partition Thread (Puts, Gets) 
• It is important to yield the EntryProcessor 
• hazelcast.entryprocessor.batch.max.size: 
defaults to 10,000 
• Hazelcast will not interrupt a running operation; it only 
yields once the current Key has been processed.
In-memory Map Reduce 
© 2014 Hazelcast Inc. 
71
Map Reduce 
© 2014 Hazelcast Inc. 
72 
• In-memory Map/Reduce, compared to disk-bound M/R 
• Similar paradigm to Hadoop Map/Reduce 
• Familiar nomenclature for ease of understanding and 
use (see the sketch after this list) 
– JobTracker 
– Job 
– Mapper 
– CombinerFactory 
– Reducer 
– Collator
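A minimal sketch of wiring a job together (TokenizerMapper and WordCountReducerFactory are illustrative Mapper/ReducerFactory implementations; checked exceptions are omitted):

    JobTracker tracker = hz.getJobTracker("default");
    KeyValueSource<String, String> source = KeyValueSource.fromMap(hz.<String, String>getMap("articles"));
    Job<String, String> job = tracker.newJob(source);

    ICompletableFuture<Map<String, Long>> future = job
            .mapper(new TokenizerMapper())            // emits (word, 1) pairs
            .reducer(new WordCountReducerFactory())   // sums the counts per word
            .submit();

    Map<String, Long> wordCounts = future.get();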
Distributed Aggregation 
© 2014 Hazelcast Inc. 
73
Aggregators 
© 2014 Hazelcast Inc. 
74 
• Ready-to-use in-memory data aggregation algorithms 
• Implemented on top of the Hazelcast MapReduce framework 
• More convenient than raw MR for a large set of standard operations 
• Work on both IMap and MultiMap 
• Types of aggregation (see the sketch below): 
– Average, Sum, Min, Max, DistinctValues, Count
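A minimal sketch, assuming the 3.x aggregation API (Supplier / Aggregations; the map name and Order type are illustrative):

    IMap<String, Order> orders = hz.getMap("orders");
    // counts every entry in the map, computed in memory across the cluster
    long orderCount = orders.aggregate(Supplier.all(), Aggregations.count());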
Querying 
© 2014 Hazelcast Inc. 
75
Querying with Predicates 
• Rich Predicate API that can be run against an IMap, similar to 
criterion-based Terracotta Search (sketch below the method list) 
Collection<V> IMap.values(Predicate p) 
Set<K> IMap.keySet(Predicate p) 
Set<Map.Entry<K,V>> IMap.entrySet(Predicate p) 
Set<K> IMap.localKeySet(Predicate p) 
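A minimal sketch (the field names "active" and "age" are illustrative; customers is an IMap<Long, Customer>):

    Collection<Customer> result = customers.values(
            Predicates.and(
                    Predicates.equal("active", true),
                    Predicates.greaterThan("age", 30)));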
© 2014 Hazelcast Inc. 
76 
!!
Querying with Predicates 
© 2014 Hazelcast Inc. 
77 
notEqual, instanceOf, like (%, _), greaterThan, greaterEqual, 
lessThan, lessEqual, between, in, isNot, regex
Querying with Predicates 
© 2014 Hazelcast Inc. 
78 
• Create your own Predicates 
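A minimal sketch of a custom Predicate (the class and field names are illustrative):

    import java.io.Serializable;
    import java.util.Map;
    import com.hazelcast.query.Predicate;

    public class EvenIdPredicate implements Predicate<Long, Customer>, Serializable {
        public boolean apply(Map.Entry<Long, Customer> entry) {
            return entry.getKey() % 2 == 0;     // any condition over the key or value
        }
    }

    Collection<Customer> matches = customers.values(new EvenIdPredicate());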
!!
SQL like queries 
© 2014 Hazelcast Inc. 
79 
• SqlPredicate class. 
• Runs only on Values. 
• Converts the String to a set of concrete Predicates. 
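A minimal sketch (the field names are illustrative):

    Collection<Customer> result =
            customers.values(new SqlPredicate("active AND age BETWEEN 25 AND 35"));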
!!
Indexes 
© 2014 Hazelcast Inc. 
80 
• Prevent full Map scans. 
• Indexes can be ordered or unordered. 
• Indexes can work along the Object Graph (x.y.z). 
• When indexing non-primitives they must implement 
Comparable. 
• Indexes can be created at runtime. 
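A minimal sketch of adding indexes at runtime (the attribute names are illustrative):

    IMap<String, Customer> customers = hz.getMap("customers");
    customers.addIndex("age", true);             // ordered - good for range queries
    customers.addIndex("address.city", false);   // unordered - works along the object graph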
!!
Conclusion 
© 2014 Hazelcast Inc. 
81
Conclusion 
• Hazelcast is easy to use 
! 
• Easy to migrate from Terracotta 
! 
• Familiar naming convention 
! 
• Many more features and use cases than just a data store 
! 
• Scales on the fly 
! 
• Zero downtime 
! 
• No single point of failure 
© 2014 Hazelcast Inc. 
82
© 2014 Hazelcast Inc. 
Thank You 
! 
@wildnez 
! 
rahul@hazelcast.com 
83
