High Availability + GEO
SLE12 SP2
Antoine Giniès
Project Manager / Release Manager
SUSE / aginies@suse.com
Expert Days Paris
Feb 2017
2
SUSE Enterprise Server
HA SLE12 SP2
Main Features
• Policy Driven Cluster
• Cluster Aware FS
• Continuous Data Replication
• Setup and Installation bootstrap
• Simple
3
HA Cluster Stack Architecture
4
Cluster Scenarios
Main Features
• Active / Active (A / A)
• Active / Passive (A / P)
• Hybrid (physical / virtual)
• Geo Cluster
5
Key Concept
Terminology
• Primitive
• RA, Resource Agent ( React )
• Group ( other Node, order )
• Clone ( multiple Node, ie: FS )
• Constraint ( condition )
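
The terms above map onto crm shell objects roughly as follows. This is a minimal sketch, not taken from the slides: the resource names (vip, web, dlm, g-web, cl-dlm), the node name node1, the IP address and the Apache config path are all hypothetical and must be adapted to the actual cluster.

```shell
# Primitive: a single resource, driven by a resource agent (RA).
# The IP address and all resource names here are hypothetical.
crm configure primitive vip ocf:heartbeat:IPaddr2 \
    params ip=192.168.1.100 cidr_netmask=24 \
    op monitor interval=10s
crm configure primitive web ocf:heartbeat:apache \
    params configfile=/etc/apache2/httpd.conf \
    op monitor interval=30s

# Group: members start in the listed order and run on the same node.
crm configure group g-web vip web

# Clone: the same resource runs on several nodes at once
# (for example DLM underneath a cluster-aware file system).
crm configure primitive dlm ocf:pacemaker:controld \
    op monitor interval=60s
crm configure clone cl-dlm dlm meta interleave=true

# Constraint: a condition on where or when a resource runs.
# Here: g-web prefers the (hypothetical) node node1 with a score of 100.
crm configure location l-web-prefers-node1 g-web 100: node1
```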
6
STONITH / FENCING
Shoot The Other Node In The Head
• UPS (Uninterruptible Power Supply)
• PDU (Power Distribution Unit)
• Blade Power Control
• Lights-out devices (iLO, DRAC, etc.)
• SBD Devices (>1)
• crm configure ra list/info stonith:XXX
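
As an illustration of the last two bullets, the commands below list the STONITH agents and set up an SBD fencing resource. This is a sketch under assumptions: it presumes the SBD device has already been initialised on shared storage and declared in /etc/sysconfig/sbd, and the resource name stonith-sbd and the delay value are illustrative only.

```shell
# List the available STONITH resource agents, then show the
# parameters of one of them (here the SBD agent).
crm ra list stonith
crm ra info stonith:external/sbd

# Assumption: SBD is already initialised on a shared disk and
# configured in /etc/sysconfig/sbd. "stonith-sbd" is an illustrative name.
crm configure primitive stonith-sbd stonith:external/sbd \
    params pcmk_delay_max=30s

# Make sure fencing is enabled cluster-wide.
crm configure property stonith-enabled=true
```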
7
HAWK2 (new)
High Availability Web Konsole
• Status
• Dashboard
• History
• Resources
• Constraint
• Wizard ( resources,
• Configuration ( constraints, cluster etc...)
• Logs ( debug )
• Access Control (Target = users / Role = access to CIB)
• Batch mode ( staging / simulate )
8
GEO Concept
• Ticket ~= service
• Boothd ~= manage ticket
• Arbitrator ~= take decision
• Dead Man Dependency ~= Fence?
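
A ticket dependency with a "dead man" loss policy could look like the sketch below. The ticket name ticketA and the resource group g-web are hypothetical, and a working booth setup (sites plus arbitrator) is assumed to exist already.

```shell
# Hypothetical names: ticket "ticketA", resource group "g-web".
# g-web may only run at the site that currently holds ticketA.
# loss-policy=fence: if the ticket is lost, the nodes running
# g-web are fenced (the dead man dependency).
crm configure rsc_ticket g-web-req-ticketA ticketA: g-web loss-policy=fence

# Ask boothd which site currently holds which ticket.
booth list
```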
9
HA features by Difference
10
Maintenance / Standby
How to Update
• Standby
– still a R
– Eligible as a R
• Maintenance
– Un-managed
– No more a R
– monitor
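
In crm shell terms, the two modes can be toggled roughly as follows. This is a sketch, with node1 as a hypothetical node name; the comments describe general Pacemaker behaviour rather than anything stated on the slide.

```shell
# "node1" is a hypothetical node name.

# Standby: node1 stays a cluster member, but its resources are
# moved to other nodes so it can be updated.
crm node standby node1
# ... update the node here ...
crm node online node1

# Maintenance: resources on node1 keep running where they are,
# but the cluster stops managing them.
crm node maintenance node1
crm node ready node1

# The same switch exists for the whole cluster.
crm configure property maintenance-mode=true
crm configure property maintenance-mode=false
```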
11
Cluster-MD (new)
Cluster Multi-device
• Software based raid storage
• Improved performance compared to cLVM mirroring
• RAID1 (redundancy)
• Replace at Runtime
• Requires: corosync / DLM
• On top of 2 SAN storage devices → no more SPOF
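
Creating a clustered RAID1 could be sketched as below. The device names /dev/sda and /dev/sdb, the array name /dev/md0 and the resource name raider are assumptions; corosync and DLM must already be running, as the requirements bullet states.

```shell
# Hypothetical devices: two shared disks, ideally one per SAN.
# --bitmap=clustered is what makes this a Cluster-MD array.
mdadm --create /dev/md0 --bitmap=clustered --metadata=1.2 \
    --raid-devices=2 --level=mirror /dev/sda /dev/sdb

# Print the array definition so it can be added to /etc/mdadm.conf.
mdadm --detail --scan

# Cluster resource that assembles the array; force_clones=true
# allows it to be cloned across nodes. "raider" is an illustrative name.
crm configure primitive raider ocf:heartbeat:Raid1 \
    params raidconf=/etc/mdadm.conf raiddev=/dev/md0 force_clones=true \
    op monitor interval=20s
```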
12
DRBD
Distributed Replicated Block Device
• Data replication
• Mirror of 2 block Devices
• Stacked DRBD
• 8 → 9 ! (metadata)
13
DRBD VS Cluster-MD
• DRBD
– SAN storage
– 2 nodes only
– Possible Regular FS
– Primary / Primary with cluster aware FS
• Cluster-MD
– Classical or SAN storage
– > 2 nodes
– Cluster Aware FS
14
OCFS2 VS GFS2
• OCFS2
– Fast: small/large data files on different nodes
– No quota support
– No online-resize (not mounted)
• GFS2
– Fast: Large data files
– Perf issue accessing small files on different nodes
– Quota support
– Online-resize (mounted)
15
Testing HA easily
Get everything on GitHub
• HA cluster testing using VM
• Fully automatic
• Testing scenarios
• https://github.com/krig/Deploy_HA_SLE_cluster