My experience writing a DR service for CloudStack
Alena Prokharchyk
Citrix
@Lemonjet
What is a disaster for the cloud
• A disaster for the cloud is a hardware/software failure, a network/power outage, or physical damage to the data center (DC)
• A disaster can cause partial or complete DC failure
• As a result, VMs become unresponsive and need to be restored in another data center
• The goal of DR products is to prepare VMs for failover and recover them in a short time frame
Existing DR solutions in CS
• Recurring snapshots feature
! No out-of-the-box cross-zone recovery solution
What new DR service does
• Lets the admin configure the recovery service without adding extra scripts and config files
• Prepares for disaster and restores the VM and all its metadata - networks and networking rules
• Recovers VMs across zones
• Updates the recovery VMs' metadata in real time - helps keep MTTR (Mean Time to Repair) low
• Provides a tiered DR service - the most important apps/accounts can be recovered first
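The tiered service in the last bullet can be pictured as a priority ordering over VMs. The following is a minimal sketch, assuming each VM carries a numeric tier taken from its DR offering, with lower tiers recovered first. The `DrVm` type, the `tier` field, and the VM names are hypothetical and are not part of CloudStack or the DR service API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrVm:
    name: str
    tier: int  # hypothetical: 1 = most important, recovered first


def recovery_order(vms):
    """Return VMs sorted so that lower tiers are recovered first.

    sorted() is stable, so VMs within one tier keep their input order.
    """
    return sorted(vms, key=lambda vm: vm.tier)


vms = [DrVm("batch-worker", 3), DrVm("billing-db", 1), DrVm("web-frontend", 2)]
order = [vm.name for vm in recovery_order(vms)]
print(order)
```

A stable sort keeps the order predictable when several VMs share a tier, which matters if the admin lists VMs in a deliberate sequence.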
Things DR service doesn’t cover
• The DR service does no storage replication, only metadata replication
Storage replication is handled by the admin outside of CS (NetApp's SnapMirror)
Which version of CloudStack is supported by DR?
DR works with:
• CloudStack 4.5
• The next Citrix CloudPlatform release, based on ASF 4.4
Design principles followed while writing the DR service
• Develop as a CS plugin in V1, with the ability to run as a separate service in future versions
• No changes to core/server CS code that are specific only to DR
• No direct access to the CS DB. All data manipulation goes through CS APIs only
• The DR service doesn't have its own DB in V1. All DR data is stored in the CS DB in the form of resource metadata
• Rely on MTBF (Mean Time Between Failures) being high. Never fail the VM in the original zone if its preparation fails; let the admin fix things and retry
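The last two principles can be illustrated together: a failed preparation is recorded as metadata, and the original VM is left alone so the admin can retry. This is a minimal sketch that uses an in-memory dict in place of the VM's resource details. The keys `DR_STATE`, `DR_ALERT`, and `DR_RECOVERY_ID` and the value `FAILED_TO_PREPARE_FOR_DR` come from the metadata slide; the function names, the `READY_TO_FAILOVER` value, and the callback signature are assumptions made for illustration.

```python
def prepare_for_dr(vm, details, create_recovery_vm):
    """Prepare one VM for DR, recording the outcome in its details.

    On failure the original VM dict is left untouched; only the details
    change, so an admin can fix the cause and call this again.
    """
    try:
        recovery_id = create_recovery_vm(vm)
    except Exception as exc:
        details["DR_STATE"] = "FAILED_TO_PREPARE_FOR_DR"
        details["DR_ALERT"] = str(exc)
        return False
    details["DR_RECOVERY_ID"] = str(recovery_id)
    details["DR_STATE"] = "READY_TO_FAILOVER"  # hypothetical value
    details.pop("DR_ALERT", None)  # clear any alert left by an earlier try
    return True


def failing_step(vm):
    """Hypothetical preparation step that always fails."""
    raise RuntimeError("Failed to attach Nic to the Recovery vm")


vm = {"id": 1, "name": "VM-user1", "state": "Running"}
details = {}
first_try = prepare_for_dr(vm, details, failing_step)
state_after_failure = details["DR_STATE"]
retry = prepare_for_dr(vm, details, lambda vm: 2)
print(first_try, state_after_failure, retry, details)
```

In the real service the details would be written through CS APIs rather than a local dict, in line with the "no direct DB access" principle.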
DR Service deployment
[Diagram: the DR service components (DR UI plugin, DR API plugin, DR Events listener, DR Server) shown alongside the CloudStack components (CS UI, CS API, CS Orchestration engine, CS Services/Plugins, Event message bus).]
DR process
• Configuration - configuring the DR service
• Preparation - preparing the VM for failover
• Failover - failing the VM over to the Recovery zone
• Failback - failing the VM back to its Original zone
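The four phases can be read as a per-VM life cycle. This is a minimal sketch, assuming a VM moves through the phases in order and that failback returns it to the prepared state. The phase names and the transition table are illustrative assumptions, not the service's actual states.

```python
from enum import Enum


class DrPhase(Enum):
    CONFIGURED = "configured"
    PREPARED = "prepared"
    FAILED_OVER = "failed_over"


# Hypothetical transition table: action -> (required phase, resulting phase).
TRANSITIONS = {
    "prepare": (DrPhase.CONFIGURED, DrPhase.PREPARED),
    "failover": (DrPhase.PREPARED, DrPhase.FAILED_OVER),
    "failback": (DrPhase.FAILED_OVER, DrPhase.PREPARED),
}


def apply_action(phase, action):
    """Return the next phase, or raise if the action is not valid now."""
    required, result = TRANSITIONS[action]
    if phase is not required:
        raise ValueError(f"cannot {action} from {phase.value}")
    return result


phase = DrPhase.CONFIGURED
for action in ("prepare", "failover", "failback"):
    phase = apply_action(phase, action)
print(phase)
```

Keeping the transitions in one table makes it easy to reject out-of-order requests, such as a failover for a VM that was never prepared.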
DR configuration
• Set up the Active zone with its Recovery zone
• Configure DR offerings (SLAs)
• Tag storage for placement of the DR VMs' volumes
Preparing VM for failover
• The DR service listens to events from CS and deploys/updates the recovery VM's metadata in the Recovery zone
• The recovery VM doesn't occupy physical resources on the CS side
• The recovery VM is invisible to the end user
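The event-driven preparation described above can be sketched as a listener that keeps a metadata-only copy of each VM. This is a minimal in-memory sketch: the event dictionaries and the event type strings (`vm.created`, `nic.added`, `vm.destroyed`) are hypothetical stand-ins, not CloudStack's actual event names, and events are fed by a plain loop in place of the message bus.

```python
class RecoveryMirror:
    """Keeps metadata-only copies of VMs for a recovery zone (in memory)."""

    def __init__(self):
        self.recovery_vms = {}  # original VM id -> mirrored metadata

    def on_event(self, event):
        """Apply one event dict to the mirrored metadata."""
        kind, vm_id = event["type"], event["vm_id"]
        if kind == "vm.created":
            self.recovery_vms[vm_id] = {"name": event["name"], "nics": []}
        elif kind == "nic.added":
            self.recovery_vms[vm_id]["nics"].append(event["nic"])
        elif kind == "vm.destroyed":
            self.recovery_vms.pop(vm_id, None)
        # Unknown event types are ignored.


mirror = RecoveryMirror()
for event in [
    {"type": "vm.created", "vm_id": 1, "name": "VM-user1"},
    {"type": "nic.added", "vm_id": 1, "nic": "Nic1"},
    {"type": "nic.added", "vm_id": 1, "nic": "Nic2"},
]:
    mirror.on_event(event)
print(mirror.recovery_vms[1])
```

Because only metadata is mirrored, the copy costs no compute or storage until failover, which matches the point that the recovery VM does not occupy physical resources.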
Preparing VM for failover
[Diagram: a UserVm with Nic1 and Nic2 in the Active zone and a matching UserVm with Nic1 and Nic2 in the Recovery zone, along with the DR Service.]
Failover process
The process of restoring a failed VM in the recovery zone
• DR does not automatically detect that a disaster has happened
• The DR admin triggers failover for the VM by calling the DR API
• The DR service performs the failover process
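Since failover is never automatic, the entry point is an explicit admin call. This is a minimal sketch, assuming the VM is a plain dict and the caller's role is passed in as a string. The function name, the dict fields, and the role check are hypothetical; the state strings "Ready to Failover" and "FailedOver" are the ones shown to end users later in the deck.

```python
def trigger_failover(vm, caller_role):
    """Fail a prepared VM over to its recovery zone; admin only."""
    if caller_role != "admin":
        raise PermissionError("only the DR admin can trigger failover")
    if vm["dr_state"] != "Ready to Failover":
        raise RuntimeError(f"VM {vm['name']} is not ready to fail over")
    vm["dr_state"] = "FailedOver"
    vm["active_zone"] = vm["recovery_zone"]
    return vm


vm = {
    "name": "VM-user1",
    "dr_state": "Ready to Failover",
    "active_zone": 1,
    "recovery_zone": 2,
}
trigger_failover(vm, caller_role="admin")
print(vm["dr_state"], vm["active_zone"])
```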
Failover process
[Diagram: in the Active zone, a UserVm with Volume1 and Volume2 on CS storage1 and Physical storage1; in the Recovery zone, a UserVm with Volume1 and Volume2 on CS storage2 and Physical storage2. Other labels: DR Service, NetApp SnapMirror, and UUID1 on both sides.]
Failback process
The process of moving the VM back to its original zone
• VM metadata is preserved in the original zone and reused when the VM is recovered
• The recovery VM's volumes are reintroduced to the original zone and attached to the original VM
• The VM in the recovery zone gets disabled
• The VM in the original zone gets enabled
• A UUID swap happens
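The failback steps above can be sketched on two in-memory VM records. This is one reading of the slide, not the service's implementation: it assumes the UUID swap exists so that the user-facing UUID follows whichever VM is enabled. The dict fields and the `UUID2` value are hypothetical.

```python
def failback(original, recovery):
    """Move the active role back to the original-zone VM record."""
    # Volumes used in the recovery zone are re-attached to the original VM.
    original["volumes"] = recovery["volumes"]
    recovery["volumes"] = []
    # Disable the recovery-zone VM and enable the original one.
    recovery["enabled"] = False
    original["enabled"] = True
    # Swap UUIDs so the user-facing UUID follows the enabled VM (assumption).
    original["uuid"], recovery["uuid"] = recovery["uuid"], original["uuid"]


original = {"zone": 1, "uuid": "UUID2", "enabled": False, "volumes": []}
recovery = {
    "zone": 2,
    "uuid": "UUID1",
    "enabled": True,
    "volumes": ["Volume1", "Volume2"],
}
failback(original, recovery)
print(original)
print(recovery)
```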
DR metadata in CS DB
user_vm (CS DB)
id   name       zone_id
1    VM-user1   1
2    VM-user1   2

user_vm_details
vm_id   detail_name      detail_value
1       DR_RECOVERY_ID   2
1       DR_STATE         FAILED_TO_PREPARE_FOR_DR
1       DR_ALERT         Failed to attach Nic to the Recovery vm
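The two tables are linked through the `DR_RECOVERY_ID` detail, which holds the id of the recovery VM. This is a minimal sketch of that lookup over the rows shown above, held as in-memory lists. The helper names are hypothetical, and the real service would read this data through CS APIs rather than from the tables directly.

```python
user_vm = [
    {"id": 1, "name": "VM-user1", "zone_id": 1},
    {"id": 2, "name": "VM-user1", "zone_id": 2},
]
user_vm_details = [
    {"vm_id": 1, "detail_name": "DR_RECOVERY_ID", "detail_value": "2"},
    {"vm_id": 1, "detail_name": "DR_STATE",
     "detail_value": "FAILED_TO_PREPARE_FOR_DR"},
    {"vm_id": 1, "detail_name": "DR_ALERT",
     "detail_value": "Failed to attach Nic to the Recovery vm"},
]


def dr_details(vm_id):
    """Collect the DR_* detail rows of one VM into a dict."""
    return {
        row["detail_name"]: row["detail_value"]
        for row in user_vm_details
        if row["vm_id"] == vm_id and row["detail_name"].startswith("DR_")
    }


def recovery_vm(vm_id):
    """Follow DR_RECOVERY_ID to the recovery VM row, or return None."""
    recovery_id = dr_details(vm_id).get("DR_RECOVERY_ID")
    if recovery_id is None:
        return None
    return next((vm for vm in user_vm if vm["id"] == int(recovery_id)), None)


print(recovery_vm(1))
```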
Who controls the DR process
• The admin controls the recovery process on behalf of users' VMs
• The end user can monitor:
- DR state of their VMs - “Ready to Failover”/“FailedOver”
- Recovery zone info - the zone to which the VM recovers in case of failure
- Recovery public IP address(es) info - to reconfigure their public DNS
CS API enhancements
• Added some missing data to CS API responses
• Added missing “resource_details” tables for some CS resources
• Added support for CS services to publish alerts via CS APIs
• Introduced external UUID management
• Implemented resource creation with delayed start for some objects (VPC)
Things yet to fix on CS
• Single sign-on is missing
• Resource creation in the DB and the actual implementation are not granular enough
Summary
If you are an API developer for an open source IaaS product:
• Always think from the end user/customer use case perspective when adding or modifying end-user APIs
• Watch what plugins, services, and bug fixes people write for your software. This helps identify missing pieces and common problems in it
