How to design a Disaster Recovery Plan for HDP (Hortonworks Data Platform) Clusters?
Mohamed Mehdi BEN AISSA, Big Data Practice Manager at FINAXYS and Big Data ITO at CACIB
For HDP clusters, we first present several Disaster Recovery Plan (DRP) solutions, chosen according to the SLA (Service-Level Agreement) targets: RPO (Recovery Point Objective) and RTO (Recovery Time Objective). We then focus on the stretch cluster solution: its advantages, its drawbacks and the impact of this choice on the overall architecture. Finally, we explain in detail how to configure and deploy this solution and how to integrate each layer (storage layer, processing layer, ...) into the architecture.
2. OUR SINGULARITY? WE ARE PLURAL!
Mohamed Mehdi BEN AISSA
Big Data Practice Manager – Finaxys
Big Data ITO - CACIB
linkedin.com/in/mehdi-ben-aissa/ @Ben_Aissa_mehdi
3. HOW TO DESIGN A DISASTER RECOVERY PLAN FOR HDP CLUSTERS?
2017-09
4. PLAN
I. INTRODUCTION
II. BIG DATA DRP ARCHITECTURES
III. STRETCH CLUSTER ARCHITECTURE
IV. HDP ARCHITECTURE
V. HDFS: STRETCH CLUSTER CONFIGURATION
VI. YARN: STRETCH CLUSTER CONFIGURATION
VII. CONCLUSION
6. INTRODUCTION
• SLA (Service-Level Agreement): the agreed aspects of the service (quality, availability, responsibilities), including:
o RTO (Recovery Time Objective): the targeted duration of time, and the service level, within which a business process must be restored after a disaster
o RPO (Recovery Point Objective): the maximum targeted period during which data might be lost
• Goals: 24/7 availability, RTO = 0, RPO = 0, Cost = 0, Consistency, Performance
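To make the two objectives concrete, here is a minimal Python sketch that measures the RPO and RTO actually achieved during an incident and compares them with the SLA targets. The `Incident` type, the function names and the timestamps are hypothetical and only illustrate the definitions above; they are not part of HDP or of the original slides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical record of one disaster, used only for illustration."""
    last_replicated: datetime  # last point in time safely copied to the recovery site
    disaster: datetime         # moment the primary site failed
    restored: datetime         # moment the service was available again


def achieved_rpo(incident: Incident) -> timedelta:
    # Data written after the last replicated point and before the disaster is lost.
    return incident.disaster - incident.last_replicated


def achieved_rto(incident: Incident) -> timedelta:
    # Time during which the business process was unavailable.
    return incident.restored - incident.disaster


def meets_sla(incident: Incident, rpo_target: timedelta, rto_target: timedelta) -> bool:
    # The SLA holds only if both objectives are met.
    return (achieved_rpo(incident) <= rpo_target
            and achieved_rto(incident) <= rto_target)


if __name__ == "__main__":
    incident = Incident(
        last_replicated=datetime(2017, 9, 1, 2, 0),
        disaster=datetime(2017, 9, 1, 3, 30),
        restored=datetime(2017, 9, 1, 5, 30),
    )
    print("RPO achieved:", achieved_rpo(incident))
    print("RTO achieved:", achieved_rto(incident))
    print("SLA met:", meets_sla(incident, timedelta(hours=1), timedelta(hours=4)))
```

In this example 1 h 30 of data is lost and the service is down for 2 h, so an SLA with a 1 h RPO target is missed even though a 4 h RTO target is met.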
8. BIG DATA DRP ARCHITECTURES : MULTI-CLUSTER ARCHITECTURE VS STRETCH CLUSTER
[Figure: (1) Multi-cluster Architecture vs (2) Stretch Cluster. Labels: Cluster 1, Cluster 2; Data Center 1, Data Center 2, Data Center 3; Data Replication.]
33. CONCLUSION
• No single architecture can satisfy every requirement at once:
o RPO = 0
o RTO = 0
o Performance
o Consistency
• Several architectures can be combined in the same cluster: hybrid architectures
• Monitoring tools are required to track the replication process and to give a global view of the cluster status
• Resiliency and performance tests are required to validate the DRP architecture
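As an illustration of the monitoring point, the following minimal Python sketch flags datasets whose replication lag exceeds the RPO target. The function name, the dataset paths and the idea of feeding it a map of last successful replication times are assumptions made for this example; a real deployment would obtain these timestamps from whatever replication tooling is in use.

```python
from datetime import datetime, timedelta
from typing import Dict


def stale_datasets(last_success: Dict[str, datetime],
                   now: datetime,
                   rpo_target: timedelta) -> Dict[str, timedelta]:
    """Return {dataset: lag} for every dataset whose lag exceeds rpo_target."""
    stale = {}
    for name in sorted(last_success):
        lag = now - last_success[name]  # time since the last successful replication
        if lag > rpo_target:
            stale[name] = lag
    return stale


if __name__ == "__main__":
    # Hypothetical dataset paths and timestamps.
    now = datetime(2017, 9, 1, 12, 0)
    last_success = {
        "/data/raw": datetime(2017, 9, 1, 11, 45),
        "/data/curated": datetime(2017, 9, 1, 9, 0),
    }
    for name, lag in stale_datasets(last_success, now, timedelta(hours=1)).items():
        print(f"{name} is behind by {lag}")
```

A check of this kind can run on a schedule and raise an alert whenever the returned map is not empty.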