Adjusting carbon topology to match high availability scenario requirements


Afkham Azeez
Director of Architecture
WSO2 Inc

1

About Me
• PMC member, Apache Axis; Committer, Synapse & Web Services
• Member, Apache Software Foundation
• Co-author, Axis2 Web Services
• Director of Architecture, WSO2 Inc
• Blog: http://blog.afkham.org
• Twitter: afkham_azeez

2

Agenda
• A brief look at the WSO2 platform
• Carbon clustering for availability
• Cost of availability & related topologies

3

WSO2 Offerings
• WSO2 Carbon
• Full platform of servers for deployment on-premise, in private or public cloud
• Products share a consistent architecture and core platform services (e.g.
logging, management, security, identity, caching) through OSGi and the “Carbon
Core”
• Includes ESB, AppServer, Data Services, Governance, Identity, Business
Process, and more

• WSO2 Stratos
• Platform-as-a-Service (PaaS) Foundation
• Supports running servers as elastic, metered, billed, multi-tenant with self-service
• Including all Carbon Servers, PHP, Jetty, and a growing list through a standard Cartridge
model

• WSO2 StratosLive
• http://stratoslive.wso2.com
• WSO2’s Public PaaS
• An instance of Stratos running in the cloud with all Carbon Servers available

4

Consistent Architecture
• Carbon: A consistent set of class-leading enterprise servers
• The same products run either on-premise or in the cloud, single-tenant or multi-
tenant
• Utilize the same Carbon core runtime for a seamless experience
• Stratos: A cloud platform for enterprise, hybrid and public deployment
• Extends the deployment to support full self-service, elastic scaling, metering and
billing
• Supports Carbon and native server runtimes
• Including Java and non-Java servers such as Jetty and PHP
• Re-uses the same core Carbon architecture to offer core PaaS services including:
• Identity, Logging, File, Relational Storage, Column Storage, Code Deployment, etc
• Both projects share a common set of OSGi modules and a core runtime
architecture

5

WSO2 SOA Platform

6

Availability
The degree to which a system, subsystem, or
equipment is in a specified operable and
committable state at the start of a mission, when
the mission is called for at an unknown, i.e., a
random, time.

Simply put, availability is the proportion of time a
system is in a functioning condition.
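
As a rough illustration of "proportion of time", the sketch below computes availability from observed uptime and downtime, and the downtime a given availability target allows per year. The class and method names are hypothetical and not part of any WSO2 API.

```java
// Illustrative sketch only: names are hypothetical, not a WSO2 API.
final class AvailabilityCalc {

    /** Availability = uptime / (uptime + downtime), as a fraction in [0, 1]. */
    static double availability(double uptimeHours, double downtimeHours) {
        double total = uptimeHours + downtimeHours;
        if (total <= 0) {
            throw new IllegalArgumentException("total observed time must be positive");
        }
        return uptimeHours / total;
    }

    /** Downtime permitted per 365-day year for a given availability fraction. */
    static double allowedDowntimeHoursPerYear(double availability) {
        return (1.0 - availability) * 365 * 24;
    }

    public static void main(String[] args) {
        System.out.printf("availability = %.4f%n", availability(8751.24, 8.76));
        System.out.printf("downtime budget at 99.9%% = %.2f h/year%n",
                allowedDowntimeHoursPerYear(0.999));
    }
}
```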

8

High Availability (HA)
A system that is designed for continuous operation in the
event of a failure of one or more components. The system
may display some degradation of service, but will continue
to perform correctly.

The proportion of time during which the service is
accessible with reasonable response times should be
close to 100%.

All single points of failure should be eliminated

11

HA, CO & CA
• Continuous Operation (CO)
• Ability to avoid planned outages.
• Hardware and software maintenance is carried out
while applications remain available to users.
• Continuous Availability (CA)
• Combines the characteristics of HA and CO to keep
the applications running without any noticeable
downtime
• Hot update/ graceful round-robin restart

12

High Availability Techniques
• Redundancy
• Time – retransmit
• Data – e.g. parity bits
• Processing – e.g. redundant nodes
• Diversity
• e.g. Hybrid deployments, do the same thing using
different implementations
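
To put a number on processing redundancy, the sketch below estimates the availability of a group of redundant nodes. It assumes that node failures are independent and that one surviving node is enough to serve requests. Both are simplifying assumptions, and the names are hypothetical.

```java
// Illustrative sketch: assumes independent failures and that any one node suffices.
final class RedundancyCalc {

    /** Combined availability of n redundant nodes: 1 - (1 - a)^n. */
    static double combinedAvailability(double nodeAvailability, int nodes) {
        if (nodeAvailability < 0 || nodeAvailability > 1 || nodes < 1) {
            throw new IllegalArgumentException("need 0 <= availability <= 1 and nodes >= 1");
        }
        // The group is down only when every node is down at the same time.
        return 1.0 - Math.pow(1.0 - nodeAvailability, nodes);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 3; n++) {
            System.out.printf("%d node(s): %.6f%n", n, combinedAvailability(0.99, n));
        }
    }
}
```

Correlated failures (shared power, network, or data center) break the independence assumption, which is where diversity and multi-zone topologies come in.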

13

How to decide required availability?
• Average throughput (TPS)
• Max throughput (TPS)
• Monetary value of a transaction
• Average loss & max loss per second of
downtime
• Decide on how much to invest in availability
based on cost vs. benefit tradeoff
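
A minimal sketch of the cost vs. benefit comparison above, assuming loss scales linearly with throughput and transaction value. All names and figures are hypothetical.

```java
// Illustrative sketch: a linear loss model with hypothetical inputs.
final class DowntimeCost {

    /** Loss per second of downtime = transactions per second * value per transaction. */
    static double lossPerSecond(double tps, double valuePerTransaction) {
        return tps * valuePerTransaction;
    }

    /** Expected loss over a 365-day year at the given availability fraction. */
    static double expectedAnnualLoss(double avgTps, double valuePerTransaction,
                                     double availability) {
        double downtimeSeconds = (1.0 - availability) * 365 * 24 * 3600;
        return lossPerSecond(avgTps, valuePerTransaction) * downtimeSeconds;
    }

    /** An upgrade pays off when the loss it avoids exceeds its yearly cost. */
    static boolean worthInvesting(double currentAnnualLoss, double improvedAnnualLoss,
                                  double annualCostOfUpgrade) {
        return (currentAnnualLoss - improvedAnnualLoss) > annualCostOfUpgrade;
    }

    public static void main(String[] args) {
        double now = expectedAnnualLoss(100, 0.5, 0.99);
        double better = expectedAnnualLoss(100, 0.5, 0.999);
        System.out.println("invest? " + worthInvesting(now, better, 1_000_000));
    }
}
```

Running the same calculation with the maximum TPS gives the worst-case loss, to set beside the average.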

14

Patching Production Deployments
[Figure: a Patch Distribution Coordinator distributing patches to the cluster nodes]

1. Check patch list
2. Pull new patch
3. Push patch (to each node)

15

Patching Production Deployments
[Figure: the Patch Distribution Coordinator working through the nodes in round-robin order]

4. Maintenance mode
5. Graceful restart
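
Steps 4 and 5 can be sketched as a loop that takes one node out of service at a time, so the rest of the cluster keeps serving requests. This is a simulation under assumed names (`Node`, `patchRoundRobin`), not the coordinator's actual implementation.

```java
import java.util.List;

// Illustrative simulation only: hypothetical types, no real restart is performed.
final class RollingRestart {
    enum State { ACTIVE, MAINTENANCE }

    static final class Node {
        final String name;
        State state = State.ACTIVE;
        boolean patched = false;

        Node(String name) {
            this.name = name;
        }
    }

    /**
     * Patches nodes one after another and returns the largest number of nodes
     * that were out of service at the same time.
     */
    static int patchRoundRobin(List<Node> nodes) {
        int maxOutOfService = 0;
        for (Node node : nodes) {
            node.state = State.MAINTENANCE;   // step 4: stop sending new requests here
            int out = 0;
            for (Node n : nodes) {
                if (n.state == State.MAINTENANCE) {
                    out++;
                }
            }
            maxOutOfService = Math.max(maxOutOfService, out);
            node.patched = true;              // step 5: graceful restart applies the patch
            node.state = State.ACTIVE;        // node rejoins before the next one leaves
        }
        return maxOutOfService;
    }
}
```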

16

Clustering
• Clustering for scalability

• Clustering for availability

17

Clustering for scalability

18

Clustering for availability

Group Communication Channel/State replication

19

Carbon Clustering
• Membership types
• Static
• Dynamic
• Hybrid
• Membership modes
• Multicast
• Well-known address

20

Static membership
• Predefined members
• Other (non-predefined) nodes cannot join

[Figure: static group of members M1 to M4; node N is outside the group]

21

Dynamic membership
• No predefined members
• Nodes can join & leave

[Figure: dynamic group of members M1 to M4; new node N joins]

22

Hybrid membership
• Some predefined (well-known) members, and some
dynamic members
• Nodes can join & leave
• Membership revolves around the static members

[Figure: hybrid group with static members M1 to M4 and dynamic members M5 to M7; new node N joins, labeled Join (IP, Port)]
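
The three membership types differ mainly in who is allowed to join. The sketch below models that rule with hypothetical names. It is not the Carbon clustering API, and it ignores leave and failure handling.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Illustrative model only: not the Carbon clustering API.
final class Membership {
    enum Type { STATIC, DYNAMIC, HYBRID }

    private final Type type;
    private final Set<String> predefined;              // "host:port" of predefined members
    private final Set<String> current = new HashSet<>();

    Membership(Type type, Set<String> predefined) {
        this.type = type;
        this.predefined = new HashSet<>(predefined);
    }

    /** Static groups admit only predefined members; dynamic and hybrid admit anyone. */
    boolean join(String member) {
        boolean allowed = (type != Type.STATIC) || predefined.contains(member);
        if (allowed) {
            current.add(member);
        }
        return allowed;
    }

    Set<String> members() {
        return Collections.unmodifiableSet(current);
    }
}
```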

23

Multicast based membership management

[Figure: members M1 to M4; new node N joins the group, labeled Join (IP, Port)]

24

Well-known Address (WKA) based membership management

[Figure: hybrid group with static well-known members WK1 to WK4 and dynamic members M5 to M7; new node N joins, labeled Join (IP, Port), and a Notify message goes out to the group]

25

Multicast vs. WKA
Multicast | WKA
All nodes should be in the same subnet | Nodes can be in different networks
All nodes should be in the same multicast domain | No multicasting requirement
Multicasting should not be blocked | -
No fixed IP addresses or hosts required | At least one well-known IP address or host required
Failure of any member does not affect membership discovery | New members can join with some WKA nodes down, but not if all WKA nodes are down
Does not work on IaaSs such as Amazon EC2 | IaaS-friendly
- | Requires keepalived, elastic IPs or some other mechanism for remapping IP addresses of WK members in cases of failure
26

Multicast vs. WKA – how to decide?
• Multicast
• Cluster is going to be set up in a network where
multicasting is allowed
• WKA
• Cloud based deployment
• Members are distributed across datacenters &
regions
• Multicasting blocked
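
The decision rule on this slide is small enough to write down directly. The sketch below is one possible encoding with hypothetical names: any of the three WKA conditions selects WKA, otherwise multicast is used.

```java
// Illustrative encoding of the slide's rule of thumb; names are hypothetical.
final class MembershipSchemeChooser {
    enum Scheme { MULTICAST, WKA }

    static Scheme choose(boolean cloudDeployment,
                         boolean spansDatacentersOrRegions,
                         boolean multicastBlocked) {
        if (cloudDeployment || spansDatacentersOrRegions || multicastBlocked) {
            return Scheme.WKA;
        }
        // Single network where multicasting is allowed.
        return Scheme.MULTICAST;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, false, false));
        System.out.println(choose(false, false, false));
    }
}
```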

27

HTTP Session Replication
• catalina-server.xml
• <Cluster className="org.wso2.carbon.core.session.CarbonTomcatSimpleTcpCluster"/>
• <Valve className="org.wso2.carbon.tomcat.ext.valves.CarbonTomcatSessionReplicationValve"/>

• web.xml
• <distributable/>
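
In standard Tomcat session replication, session attributes must implement java.io.Serializable so they can be copied to other nodes. The sketch below assumes the Carbon cluster and valve configured above behave the same way. It uses a hypothetical `Cart` attribute and simulates replication with plain Java serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative sketch: simulates replication with plain Java serialization.
final class SessionAttributeCheck {

    /** Hypothetical session attribute; Serializable so it can be copied to other nodes. */
    static final class Cart implements Serializable {
        private static final long serialVersionUID = 1L;
        final String item;
        final int quantity;

        Cart(String item, int quantity) {
            this.item = item;
            this.quantity = quantity;
        }
    }

    private static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    /** True if the value can be written with Java serialization. */
    static boolean isReplicable(Object value) {
        try {
            serialize(value);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Simulates shipping an attribute to another node and reading it back. */
    static Object roundTrip(Object value) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(serialize(value)))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("attribute cannot be replicated", e);
        }
    }
}
```

A check like `isReplicable` in a unit test can catch non-serializable attributes before they reach a clustered deployment.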

28

State Replication
JSR-107/JCache
A standard Java Caching API for use by developers and a standard SPI ("Service Provider
Interface") for use by implementers.

import javax.cache.*;

…

CacheManager cacheMgr = Caching.getCacheManager();

Cache<String, Integer> cache = cacheMgr.getCache(cacheName);
cache.put("key", sampleValue);
Integer i = cache.get("key");

29

State Replication
CarbonContext based API

Cache cache = CarbonContext.getCurrentContext().getCache();
cache.put("key", sampleValue);
Integer i = (Integer) cache.get("key"); // raw Cache returns Object, so cast

Axis2 Contexts
Using Axis2 clustering StateManager – axis2.xml
<stateManager class="org.apache.axis2.clustering.state.DefaultStateManager" enable="true">

30

Elastic Load Balancer 2.0
• New sysadmin-friendly configuration language
• High performance PassThrough transport
• Tenant-aware load balancing
• Ability to dedicate clusters for tenants (private
jet mode)
• Improved auto-scaler
• Separate IaaS-aware Cloud controller takes care of
spawning new instances on different IaaSs

31

Private Jet mode

• Analogy
• Economy class
• no SLA management, only elasticity
• Business class
• elasticity plus SLA guarantees
• Private Jet
• Guaranteed isolated VMs or machines for a specific
tenant
• Still elastically scaled
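
Tenant-aware load balancing with a private jet option can be pictured as a lookup from tenant to cluster, with a shared cluster as the fallback. The sketch below is a hypothetical model, not the Elastic Load Balancer's configuration or code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only: not the ELB's actual routing code.
final class TenantRouter {
    private final Map<String, String> dedicatedClusters = new HashMap<>();
    private final String sharedCluster;

    TenantRouter(String sharedCluster) {
        this.sharedCluster = sharedCluster;
    }

    /** Private jet mode: reserve a cluster for one tenant. */
    void dedicate(String tenantDomain, String cluster) {
        dedicatedClusters.put(tenantDomain, cluster);
    }

    /** Tenants without a dedicated cluster share the default one. */
    String clusterFor(String tenantDomain) {
        return dedicatedClusters.getOrDefault(tenantDomain, sharedCluster);
    }
}
```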

Private Jet Mode

34

Topologies
• Single node
• Multi-node with LB
• Multi-node with elasticity using ELB
• Management & worker node separated
• Multi-zone or multi-datacenter deployment
• Multi-region

35

Single node
[Figure: single node; availability and cost scale, LOWEST to HIGHEST]

36

Primary-secondary
[Figure: primary and secondary nodes; availability and cost scale, LOWEST to HIGHEST]
37

Primary-secondary, multiple LB
[Figure: primary and secondary nodes behind multiple load balancers, keepalived; availability and cost scale, LOWEST to HIGHEST]
38

Active cluster, multiple LB
[Figure: three active nodes behind multiple load balancers, keepalived; availability and cost scale, LOWEST to HIGHEST]
39

Management & Worker Node Separation
• Proper separation of concerns - management nodes
specialize in management of the setup while worker nodes
specialize in serving requests to deployment artifacts
• Only management nodes are authorized to add new artifacts
into the system or make configuration changes
• Worker nodes can only deploy artifacts & read configuration
• Lower memory footprint in the worker nodes because the
management console related OSGi bundles are not loaded
• Improved security - management nodes can be behind the
internal firewall & be exposed to clients running within the
organization only, while worker nodes can be exposed to
external clients.
• Isolation of failures
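
The authorization rule above (only management nodes add artifacts, worker nodes only read and deploy them) can be sketched as a role check. The types below are hypothetical and far simpler than Carbon's deployment synchronization.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical types, not Carbon's deployment code.
final class NodeRoles {
    enum Role { MANAGER, WORKER }

    static final class ArtifactRepository {
        private final List<String> artifacts = new ArrayList<>();

        /** Only management nodes may add artifacts to the system. */
        void add(Role caller, String artifact) {
            if (caller != Role.MANAGER) {
                throw new SecurityException("only management nodes may add artifacts");
            }
            artifacts.add(artifact);
        }

        /** Any node may read the artifact list in order to deploy it. */
        List<String> read(Role caller) {
            return new ArrayList<>(artifacts);
        }
    }
}
```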
40

Management & Worker Node Separation
[Figure: separated management and worker nodes; availability and cost scale, LOWEST to HIGHEST]

41

Stratos 2.0 Architecture

43

Multi-zone or multi-datacenter Deployment

[Figure: Cloud Controller with Zone 1 and Zone 2 in Region X; availability and cost scale, LOWEST to HIGHEST]

44

Multi-region deployment
[Figure: Zone 1 and Zone 2 in each of Region X and Region Y; availability and cost scale, LOWEST to HIGHEST]

45

Multi-IaaS Deployment

Cloud Controller

46

Multiple IaaS (hybrid) Deployment
[Figure: Zone 1 and Zone 2 in each of a private cloud (data center), Amazon EC2, and Rackspace Cloud; availability and cost scale, LOWEST to HIGHEST]

47

Cost of Availability
[Figure: cost-of-availability chart covering the following topologies]
• Single node
• Primary-secondary, single LB
• Primary-secondary with multiple LBs
• Multi-node active cluster, single zone
• Multi-zone
• Multi-region
• Multi-IaaS
48

HA for the Load Balancer
• Load balancer cluster
• Keepalived
• Elastic IP address
• Round Robin DNS
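
Keepalived-style failover amounts to moving a virtual address to the highest-priority load balancer that is still healthy. The sketch below models only that selection, with hypothetical names. Real keepalived uses VRRP advertisements rather than a shared health set.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Illustrative model of priority-based failover; not how keepalived is implemented.
final class LoadBalancerFailover {

    /** Returns the first healthy load balancer in priority order, if any. */
    static Optional<String> activeLoadBalancer(List<String> byPriority, Set<String> healthy) {
        for (String lb : byPriority) {
            if (healthy.contains(lb)) {
                return Optional.of(lb);
            }
        }
        return Optional.empty();   // every load balancer is down: total outage
    }
}
```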

49

Monitoring Servers
• Monit
• Automatically provides alerts & restarts processes
when monitored items (e.g. latency) cross
certain thresholds.
• New Relic
• Nagios
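
The alert-and-restart behaviour can be sketched as a threshold check that fires only after several consecutive breaches, so a single slow sample does not trigger a restart. The class below is a hypothetical illustration, not Monit's configuration language or API.

```java
// Illustrative sketch only: not Monit's configuration language or API.
final class ThresholdMonitor {
    private final double maxLatencyMs;
    private final int cyclesBeforeRestart;
    private int consecutiveBreaches = 0;

    ThresholdMonitor(double maxLatencyMs, int cyclesBeforeRestart) {
        if (cyclesBeforeRestart < 1) {
            throw new IllegalArgumentException("cyclesBeforeRestart must be at least 1");
        }
        this.maxLatencyMs = maxLatencyMs;
        this.cyclesBeforeRestart = cyclesBeforeRestart;
    }

    /** Records one sample; returns true when an alert and restart should fire. */
    boolean observe(double latencyMs) {
        if (latencyMs > maxLatencyMs) {
            consecutiveBreaches++;
        } else {
            consecutiveBreaches = 0;      // a healthy sample resets the count
        }
        if (consecutiveBreaches >= cyclesBeforeRestart) {
            consecutiveBreaches = 0;      // start fresh after the restart
            return true;
        }
        return false;
    }
}
```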

50

References
Information on tenant-aware load balancing
http://sanjeewamalalgoda.blogspot.com/2012/03/tenant-aware-load-balancer-is-upcoming.html

http://sanjeewamalalgoda.blogspot.com/2012/05/tenant-aware-load-balancer.html

Scaling Stratos
http://srinathsview.blogspot.com/2012/06/scaling-wso2-stratos.html

http://blog.afkham.org/2011/09/how-to-setup-wso2-elastic-load-balancer.html

http://blog.afkham.org/2011/09/wso2-load-balancer-how-it-works.html

51

Auto-scaler service deployment
http://nirmalfdo.blogspot.com/2012/07/autoscaler-service-deployment.html

Auto-scaler service
http://nirmalfdo.blogspot.com/2012/07/wso2-autoscaler-service-part-i.html

Automatic failover for WSO2 ELB
http://gonesimple.org/2012/09/24/automatic-fail-over-for-wso2-elb/

52

Questions?

http://www.flickr.com/photos/oberazzi/
53
