NICTA, Disaster Recovery Using OpenStack

Jorke Odolphi, NICTA, Disaster Recovery Solution using OpenStack, Thurs, 3:50 pm session

Published in: Technology, Business

  1. Building a Disaster Recovery Solution using OpenStack
      Jorke Odolphi, Principal Research Engineer, NICTA
      jorke.odolphi@nicta.com.au, @jorke
  2. http://bionicvision.org.au/eye
  3. The Team
  4. Yuru – ‘cloud’, Gamilaraay People NSW
  5. Problem
      The cloud can fail. Online businesses that rely on and benefit most from the cloud don’t have the skills to handle failure.
  6. Disaster Recovery
      Process, policies and procedures related to preparing for recovery or continuation of technology infrastructure critical to an organisation after a natural or human-induced disaster (according to Wikipedia).
  7. RPO: Recovery Point Objective
      “maximum tolerable period in which data might be lost from an IT Service due to a Major incident…” (according to Wikipedia)
  8. RTO: Recovery Time Objective
      “duration of time and a service level within which a business process must be restored after a disaster…” (according to Wikipedia)
  9. [Diagram relating Recovery Point Objective and Recovery Time Objective. Labels: “Somewhere..”, “Sometime...”, “Realtime recovery/failover”, “0 downtime”.]
  10. Our Goal
      Without re-architecting your application, provide a configurable warm standby solution with a known, consistent RPO, reducing RTO and minimising business impact.
  11. Goals and Challenges
      Replicate application over to OpenStack in case of a disaster
        – Preserve the running environment of the application, this includes:
          • Compute instances
          • Networks
          • DNS
      Minimise RTO and RPO AND cost!
  12. Example application (three tiers)
        – mypizzashop.com.au: Public IP / Load Balanced, Web front end (Apache/Nginx/IIS)
        – app.mypizzashop.com.au: Private IP, Application Processing/memcache
        – db.mypizzashop.com.au: Private IP, Database (MySQL/PostgreSQL/MSSQL)
  13. Architecting for DR in Cloud
      Virtualise your servers
        – snapshotting support in hypervisor, primarily at the disk
      Use Dynamic DNS solutions
        – E.g. Route 53, Anycast DNS
  14. Compatibility across IaaS Clouds

      Cloud Provider | Framework  | Compute Instance | Object Store | Block Storage | Network | Security Group
      AWS            | Custom     | ✓                | ✓            | ✓             | DHCP    | ✓
      Rackspace      | Custom     | ✓                | ✓            | ✗             | STATIC  | ✗
      Ninefold       | CloudStack | ✓                | ✓            | ✓             | DHCP    | ✓
      TryStack       | OpenStack  | ✓                | ✓            | ✓             | DHCP    | ✓
      HP Cloud       | OpenStack  | ✓                | ✓            | ✗             | DHCP    | ✓

      • Replication from one cloud to another is NOT always possible
      • Some clouds do not have all the technology pieces (e.g., Block Storage)
      • Minimum requirements for replicating application servers:
          • Compute instance and persistent storage, such as object store or block storage
          • Snapshot service (to ensure point-in-time consistency)
          • Hypervisor support (e.g., PVGrub)
  15. Overview of DR Process
      [Diagram: DR process between AWS and OpenStack. Step labels: Take snapshot; Create volume; Send to storage; Download from storage; Partition; Mount new instance.]
  16. Building DR using OpenStack
      Progress:
        – Deploying OpenStack in our NICTA lab
        – Successfully replicated AWS compute instances to OpenStack
          • In Rackspace OpenStack public cloud (private beta)
          • Instances created from standard 64-bit EXT3 AWS OpenSuse image
      Requirements:
        – Xen support for PVGrub
        – Write access to partition table
        – Network support
  17. Problems
        – Latency
        – Point in Time
        – Log and replay / transactional
        – How do modern databases handle broken transactions / problem disks?
        – Rollback
  18. Optimisations: Incremental Backup
      Typical AWS system volume is around 10GB
      Replication is tricky for large data volumes
        – Initial backup:
          • Send the whole data volume (unavoidable!)
          • Optimise by compression and skipping empty space (0’s)
        – Subsequent backups:
          • Incremental: partition a volume into chunks and resend only the difference (the ‘delta’)
  19. Large Data Transfer Across Cloud Datacenters
      Why so slow?
  20. Optimisations: Large Data Transfer Across Cloud Datacenters for DR
      Problem: Transferring large data volumes is slow
        – Where is the bottleneck?
          • Reading from the source volume? YES!!
          • Transferring across LAN/WAN?
          • Writing to destination volume?
      Our solution: Rapidly cloning data volumes from snapshots
        – Parallel transfers
      [Chart: Data Transfer Evaluations, comparing 1 Clone and 4 Clones on Volume Scan (MB/s) and End-to-end Transfer (MB/s). Values shown: 190, 140, 50, 40.]
  21. Reversing..
  22. How it works
        – Point us to your instances
        – Replicate to new cloud/region
        – Automatically sync changes every hour
        – If the worst happens: failover
  23. Questions? Or answers?
      Jorke Odolphi, Jorke.odolphi@nicta.com.au, @jorke
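
The "Optimisations: Incremental Backup" slide describes partitioning a volume into chunks, resending only the difference, compressing data and skipping empty space. A minimal Python sketch of that idea follows. It is not the NICTA implementation: the chunk size, the use of SHA-256 digests to detect changed chunks, zlib compression, and the function names `chunk_digests`, `build_delta` and `apply_delta` are all assumptions made for illustration, and the volume is assumed to keep the same size between backups.

```python
import hashlib
import zlib
from typing import Dict, List, Optional

# Hypothetical chunk size; the slides do not state one.
CHUNK_SIZE = 4 * 1024 * 1024


def chunk_digests(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Return one SHA-256 digest per fixed-size chunk of the volume image."""
    return [
        hashlib.sha256(data[offset:offset + chunk_size]).hexdigest()
        for offset in range(0, len(data), chunk_size)
    ]


def build_delta(
    data: bytes,
    previous_digests: List[str],
    chunk_size: int = CHUNK_SIZE,
) -> Dict[int, Optional[bytes]]:
    """Map chunk index -> payload for every chunk that changed.

    An all-zero chunk is sent as None (empty space is skipped);
    any other changed chunk is sent zlib-compressed.
    Passing an empty digest list produces the initial full backup.
    """
    delta: Dict[int, Optional[bytes]] = {}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if index < len(previous_digests) and previous_digests[index] == digest:
            continue  # unchanged since the last backup, nothing to send
        if chunk.count(0) == len(chunk):
            delta[index] = None
        else:
            delta[index] = zlib.compress(chunk)
    return delta


def apply_delta(
    replica: bytearray,
    delta: Dict[int, Optional[bytes]],
    chunk_size: int = CHUNK_SIZE,
) -> None:
    """Patch the replica in place (assumes it has the same size as the source)."""
    for index, payload in delta.items():
        offset = index * chunk_size
        length = min(chunk_size, len(replica) - offset)
        chunk = bytes(length) if payload is None else zlib.decompress(payload)
        replica[offset:offset + length] = chunk
```

In this sketch the sender keeps the digest list from the previous run, so each subsequent backup only has to scan the volume once and ship the changed chunks.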

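The "Large Data Transfer" slide identifies reading from the source volume as the bottleneck and proposes cloning volumes from snapshots and transferring in parallel. The sketch below illustrates the general pattern only: each clone is stood in for by a local file path, each worker thread copies one byte range from its own clone, and the destination is a single pre-sized file. The names `split_ranges`, `copy_range` and `parallel_transfer`, the block size, and the use of threads are assumptions, not details taken from the talk.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

BLOCK = 1024 * 1024  # hypothetical read size per I/O call


def split_ranges(total_size: int, parts: int) -> List[Tuple[int, int]]:
    """Split [0, total_size) into at most `parts` contiguous (start, end) ranges."""
    step = max(1, -(-total_size // parts))  # ceiling division, never zero
    return [
        (start, min(start + step, total_size))
        for start in range(0, total_size, step)
    ]


def copy_range(clone_path: str, dest_path: str, start: int, end: int) -> int:
    """Copy bytes [start, end) from one clone to the same offset in the destination."""
    copied = 0
    with open(clone_path, "rb") as src, open(dest_path, "r+b") as dst:
        src.seek(start)
        dst.seek(start)
        while copied < end - start:
            block = src.read(min(BLOCK, end - start - copied))
            if not block:
                break  # clone is shorter than expected
            dst.write(block)
            copied += len(block)
    return copied


def parallel_transfer(clone_paths: List[str], dest_path: str, total_size: int) -> int:
    """Read one byte range from each clone concurrently; return bytes copied."""
    if not clone_paths:
        raise ValueError("at least one clone is required")
    # Pre-size the destination so every worker can seek to its own offset.
    with open(dest_path, "wb") as dst:
        dst.truncate(total_size)
    ranges = split_ranges(total_size, len(clone_paths))
    with ThreadPoolExecutor(max_workers=len(clone_paths)) as pool:
        futures = [
            pool.submit(copy_range, clone, dest_path, start, end)
            for clone, (start, end) in zip(clone_paths, ranges)
        ]
        return sum(future.result() for future in futures)
```

With a single clone this degenerates to a sequential copy; with several clones, each reader works against its own copy of the data, which is the situation the slide's "1 Clone" versus "4 Clones" comparison refers to.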