VMworld 2013: VMware Disaster Recovery Solution with Oracle Data Guard and Site Recovery Manager
VMworld 2013

Kannan Mani, VMware
Brad Pinkston, VMware

  1. Disaster Recovery Solution with Oracle Data Guard and Site Recovery Manager
     Kannan Mani, VMware
     Brad Pinkston, VMware
     BCO4905 (#BCO4905)
  2. Agenda
     • Introduction
     • SRM and Oracle Data Guard
     • Architecture Overview
     • Demo
     • Best Practices
     • Summary
     • Q&A
  3. Introduction
  4. Kannan Mani
     • 15+ years of Oracle experience: Oracle RAC, ASM, clustering, CRM, ERP, business intelligence, performance and scalable enterprise application architecture, benchmarking and performance, technical solutions marketing and management, virtualization and cloud solutions
     • Oracle ACE – Applications, DB
     • Speaker at Oracle OpenWorld, IOUG, VMworld, VMware Partner Exchange, EMC World, and webinars
     • Industry-recognized expert in Oracle and virtualization technologies
     • Blog: http://blogs.vmware.com/apps/oracle
  5. SRM and Oracle Data Guard
  6. SRM Provides Broad Choice of Replication Options
     • vSphere Replication: simple, cost-efficient replication for Tier 2 applications and smaller sites
     • Storage-based Replication: high-performance replication for business-critical applications in larger sites
     [Diagram: Site A (Primary) and Site B (Recovery), each with vCenter Server, Site Recovery Manager, and vSphere, connected by vSphere Replication and storage-based replication]
  7. vSphere Replication Complements Storage-Based Replication
     vSphere Replication (VMware)
     • Cost: low-end storage supported; no additional replication software
     • Management: VM-level granularity; managed directly in vCenter
     • Performance: 15-minute RPOs; scales to 500 VMs; file-level consistency; no automated failback, FT, linked clones, or physical RDM
     Storage-based Replication
     • Cost: higher-end replicating storage; additional replication software
     • Management: LUN – VM layout; storage team coordination
     • Performance: synchronous replication; high data volumes; application consistency possible
  8. Oracle Data Guard
     http://www.oracle.com/technetwork/database/features/availability/twp-dataguard-11gr2-1-131981.pdf
     • Oracle Data Guard provides the management, monitoring, and automation software infrastructure to create and maintain one or more standby databases to protect Oracle data from failures, disasters, errors, and data corruptions. Data Guard is unique among Oracle replication solutions in supporting both synchronous (zero data loss) and asynchronous (near-zero data loss) configurations.
     • Administrators can choose either manual or automatic failover of production to a standby system if the primary system fails, in order to maintain high availability for mission-critical applications.
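The synchronous/asynchronous distinction above corresponds to the Data Guard protection mode. As a minimal sketch, assuming the mode string has already been read from V$DATABASE.PROTECTION_MODE, the helper below maps it to the redo transport it typically implies. The function name `transport_for_mode` is hypothetical and not part of the original deck.

```shell
#!/bin/sh
# transport_for_mode MODE
# Hypothetical helper: prints the redo transport typically associated with a
# Data Guard protection mode. MAXIMUM PROTECTION and MAXIMUM AVAILABILITY
# require synchronous transport; MAXIMUM PERFORMANCE normally runs
# asynchronously.
transport_for_mode() {
    case "$1" in
        "MAXIMUM PROTECTION"|"MAXIMUM AVAILABILITY")
            echo "SYNC"
            ;;
        "MAXIMUM PERFORMANCE")
            echo "ASYNC"
            ;;
        *)
            # Anything else is not a protection mode this sketch knows about.
            echo "UNKNOWN"
            return 1
            ;;
    esac
}
```

A callout script could use such a check to log whether a failover is expected to be zero data loss (SYNC) or near-zero data loss (ASYNC) before it proceeds.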
  9. Architecture Overview
  10. Oracle Database (SAP) – Oracle Data Guard and SRM
     [Diagram: Site A (Primary) hosts the primary SAP DB, SAP CS, and SAP PAS; Site B (Recovery) hosts the standby SAP DB, SAP CS, and SAP PAS. Each site runs vCenter Server, Site Recovery Manager, and vSphere. vSphere Replication links the two sites, and Oracle Data Guard log shipping links the primary and standby databases.]
  11. Steps Tested
     1. Oracle primary DB at Site A and standby DB at Site B with Data Guard
     2. SAP application connected to the primary DB at Site A
     3. Site A down: SAP application and Central Services VMs replicated to Site B using vSphere Replication
     4. Fail over the Oracle primary to the standby using an SRM callout script from the SAP application VM at Site B
     5. Connect/resume the SAP application to the Oracle database at Site B
  12. Demo
  13. SRM Callout Script – odgfail.sh (Example)

     ~ # cat odgfail.sh
     #! /bin/sh
     #######################################################################################
     # file name   : odgfail.sh
     # location    : /scripts
     # called from : Application VM on Site B
     #######################################################################################
     echo "Job `basename $0`: started at `date`"
     #
     # Set up standard ORACLE environment variables
     ORACLE_SID=stdby; export ORACLE_SID
     ORACLE_BASE=/oracle; export ORACLE_BASE
     ORACLE_HOME=/oracle/PRD/102_64; export ORACLE_HOME
     PATH=/oracle/PRD/102_64/bin:.:/oracle/PRD:/usr/sap/PRD/SYS/exe/run:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin; export PATH
     LD_LIBRARY_PATH=/usr/sap/PRD/SYS/exe/run:/oracle/client/10x_64/instantclient; export LD_LIBRARY_PATH
     #
     # Failover to Standby
     $ORACLE_HOME/bin/sqlplus /nolog <<EOFarch1
     connect / as sysdba
     --shutdown Primary database (in case of RAC, shutdown all RAC instances)
     --Initiate failover to Standby Database:
     ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;
     --Convert the physical standby database to the production role:
     ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
     --Comment/Uncomment either of the 2 sets of commands below
     --If the database was never opened read-only since the last time it was started,
     --open new production database via:
     ALTER DATABASE OPEN;
     --If the physical standby database has been opened in read-only mode since the last time it was started,
     --shutdown standby database and restart it
     --SHUTDOWN IMMEDIATE
     --STARTUP pfile=initSTDBY.ora
     exit
     EOFarch1
     echo "Job `basename $0`: ended at `date`"
     ########################## end of script
     ~ #
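The example script issues the failover statements unconditionally. One possible safeguard, sketched here as an assumption rather than as part of the original script, is to confirm that the local database is still a physical standby before failing over. Both function names (`current_role`, `should_failover`) are hypothetical; `current_role` assumes OS-authenticated SYSDBA access and that ORACLE_HOME and ORACLE_SID are already exported as in odgfail.sh.

```shell
#!/bin/sh
# current_role
# Hypothetical helper: prints V$DATABASE.DATABASE_ROLE for the local instance.
# Blank lines in the sqlplus output are stripped.
current_role() {
    "$ORACLE_HOME/bin/sqlplus" -S "/ as sysdba" <<'EOF' | sed '/^[[:space:]]*$/d'
set heading off feedback off pagesize 0
select database_role from v$database;
exit
EOF
}

# should_failover ROLE
# Succeeds (exit status 0) only when ROLE is a physical standby, so the
# failover statements are never issued against a database that is already
# the primary.
should_failover() {
    case "$1" in
        "PHYSICAL STANDBY") return 0 ;;
        *) return 1 ;;
    esac
}
```

A callout script could wrap its sqlplus failover block in `if should_failover "$(current_role)"; then ... fi`, so that running the recovery plan a second time does not repeat the role transition.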
  14. EMC Reference Architecture
  15. EMC RA – Storage Replication Solution Overview
  16. Oracle Database Configuration – Storage Layouts
  17. Solution Testing Findings
     • Integration of RecoverPoint with vCenter Site Recovery Manager enables DR testing to be carried out in isolated environments on the recovery site so that production can remain active and replication can continue uninterrupted. SRM also documents the recovery process.
     • RecoverPoint enables replication of entire virtualized Oracle environments between data centers for disaster recovery.
     • The RecoverPoint splitter supports replication across heterogeneous storage platforms.
     http://www.emc.com/collateral/hardware/white-papers/h8207-dr-oracle-vmaxe-recoverpoint-srm-wp.pdf
  21. Best Practices
  22. Oracle DB on VMware Technical Best Practices
     • Server selection
     • Storage selection
     • vSphere version
     • vSphere operations
     • Performance monitoring
     • Guest operating system configuration
     • Virtual storage presentation
     • Workload and datastore fan-in ratios
     • vCPU allocation
     • Memory
     • Network
     • Security
     • Cloning
     • Disaster recovery
  23. General Best Practices
     • Create a computing environment optimized for vSphere
     • Enable required settings in the ESX host BIOS, for example VT, Turbo Mode, and hyper-threading
     • Disable unnecessary foreground and background processes on the guest operating system
     • Create golden images of optimized operating systems using vSphere cloning technologies
     • Upgrade to vSphere ESX 5 for a 10–20% performance boost
     • Allow vSphere to choose the best virtual machine monitor based on the CPU and guest operating system combination: set the virtual machine's CPU/MMU Virtualization option to Automatic
     • Use Oracle's recommended installation guidelines for the respective operating system, the same as on physical hardware
     • To minimize time drift in virtual machines, follow the guidelines in these KB articles:
       - Timekeeping best practices for Linux guests: http://kb.vmware.com/kb/1006427
       - Timekeeping best practices for Windows, including NTP: http://kb.vmware.com/kb/1318
  24. Virtual CPUs
     Best Practices for vCPUs
     • Do not over-allocate vCPUs; try to match the exact workload
     • If the exact workload is unknown, start with fewer vCPUs initially and increase later if necessary
     • For larger production workloads, the total number of vCPUs assigned to all virtual machines should be less than or equal to the total number of cores on the ESX host
     • Enable hyper-threading for Intel Core i7 processors
     • For 5500 series processors, enabling hyper-threading is recommended
     • If unsure of the workload, use the hardware vendor's recommended Oracle sizing guidelines
     • Avoid remote NUMA access by sizing the number of vCPUs to be no greater than the number of cores on a NUMA node (processor socket)
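Two of the vCPU rules above are simple arithmetic checks. The sketch below encodes them in a hypothetical helper, `vcpu_check`, which is not from the original deck. All four inputs are assumed to be gathered by the administrator, and cores mean physical cores, not hyper-threads.

```shell
#!/bin/sh
# vcpu_check TOTAL_VCPUS HOST_CORES VM_VCPUS CORES_PER_NUMA_NODE
# Hypothetical helper applying two sizing rules:
#   1. total vCPUs across all VMs on the host <= physical cores on the host
#   2. vCPUs of a single VM <= cores on one NUMA node
vcpu_check() {
    vc_total=$1
    vc_host_cores=$2
    vc_vm=$3
    vc_numa_cores=$4

    if [ "$vc_total" -gt "$vc_host_cores" ]; then
        echo "overcommitted"
        return 1
    fi
    if [ "$vc_vm" -gt "$vc_numa_cores" ]; then
        echo "spans-numa"
        return 1
    fi
    echo "ok"
}
```

For example, on a two-socket host with 8 cores per socket, `vcpu_check 12 16 10 8` reports `spans-numa`: the host is not overcommitted, but a 10-vCPU VM cannot fit inside one 8-core NUMA node.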
  25. Virtual Memory Best Practices
     • Do not overcommit memory until vCenter reports that steady-state usage is below the amount of physical memory on the server
     • Do not disable the balloon driver (installed with VMware Tools)
     • Set the memory reservation to the SGA size plus the OS requirement (reservation and configured memory might be the same)
     • Enable hardware-assisted virtualization in the ESX host BIOS and on the VM
     • Set the CPU/MMU virtualization option to Automatic
       - vSphere will choose the best virtual machine monitor option based on the CPU and guest OS
     • Use large memory pages
     • Consult the Oracle Administration Guide for sizing of the SGA
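The reservation rule above (SGA size plus OS) can be worked through as a small calculation. This is a minimal sketch with hypothetical function names; the megabyte figures in the usage note are illustrative and not taken from the presentation.

```shell
#!/bin/sh
# memory_reservation_mb SGA_MB OS_MB
# Prints the suggested memory reservation: SGA size plus the OS requirement.
memory_reservation_mb() {
    echo $(( $1 + $2 ))
}

# reservation_fits SGA_MB OS_MB CONFIGURED_MB
# Succeeds when the suggested reservation does not exceed the memory
# configured for the VM (the two may be equal).
reservation_fits() {
    [ "$(memory_reservation_mb "$1" "$2")" -le "$3" ]
}
```

With an assumed 16384 MB SGA and 2048 MB for the guest OS, `memory_reservation_mb 16384 2048` prints 18432, so the VM would need at least 18432 MB configured for the reservation to fit.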
  26. Network Best Practices
     • Separate infrastructure traffic from virtual machine traffic for security and isolation
     • Use NIC teaming for availability and load balancing
     • Take advantage of Network I/O Control (NIOC) to converge network and storage traffic onto 10GbE
     • For “chatty” virtual machines on the same host, connect them to the same vSwitch to avoid NIC traffic
     • Use VMXNET3 paravirtualized network adapter drivers to increase performance
       - Reduces overhead versus vlance or E1000 emulation
       - Must have VMware Tools to enable VMXNET3
     • Use jumbo frames
       - To configure, see “iSCSI and Jumbo Frames configuration on ESX 3.x and ESX 4.x”: http://kb.vmware.com/kb/1007654
     • Separate the RAC interconnect network to isolate it from other traffic
  27. Storage Virtualization Concepts
     • Storage array: consists of physical disks that are presented as logical disks (storage array volumes or LUNs) to the ESX host
     • Storage array LUNs: formatted as VMware vSphere® VMFS volumes
     • Virtual disks: presented to the guest operating system, and can be partitioned and used in guest file systems
  28. Storage Best Practices
     • Use vSphere VMFS for single-instance Oracle database deployments
     • For IP-based storage (iSCSI and NFS), enable jumbo frames
     • Create dedicated datastores to service database workloads
     • Align VMFS properly: use vCenter to create VMFS partitions, because it aligns the partitions automatically
     • Use Oracle Automatic Storage Management (ASM)
     • Follow your storage vendor’s best practices documentation when laying out the Oracle database
     • Use paravirtualized SCSI adapters for Oracle datafiles with demanding workloads
     http://www.vmware.com/files/pdf/partners/oracle/Oracle_Databases_on_VMware_-_Best_Practices_Guide.pdf
  29. Summary
  30. Benefits of Oracle Databases on VMware
     Performance
     • I/O is not an issue
     • Scale up and out
     • Newer hardware can increase performance
     Rapid Provisioning
     • Streamline activation, deployment, and validation of servers
     • Avoid manual configuration errors
     Server Consolidation
     • Fully utilize hardware
     • Maintain application isolation
     • Scale dynamically and right-size infrastructure
     Workload Management
     • Zero downtime maintenance
     • Migrate live databases
     High Availability
     • VMware vSphere® vMotion®, VMware vSphere High Availability (HA), VMware vSphere® Fault Tolerance (FT), VMware vSphere Distributed Resource Scheduler (DRS)
     • Without clustering or RAC
     Business Continuity
     • VMware vCenter Site Recovery Manager™
     • Hardware reduction at failover site
     • Comprehensive testing of DR solution
  31. Where Can I Learn More?
     • vCenter Site Recovery Manager
       - Product page: www.vmware.com/products/srm
       - Overview, datasheet, webinars, docs, community links
     • Oracle Data Guard
       - Overview: http://www.oracle.com/technetwork/database/features/availability/dataguardoverview-083155.html
     • Virtualizing Oracle with VMware
       - External solution page: http://www.vmware.com/solutions/business-critical-apps/oracle-virtualization/oracle-database.html
     • Blog
       - http://blogs.vmware.com/apps/oracle/
  32. Questions
  33. Disaster Recovery Solution with Oracle Data Guard and Site Recovery Manager
     VMware, Inc.
     3401 Hillview Ave
     Palo Alto, CA 94304
     Tel: 1-877-486-9273 or 650-427-5000
     Fax: 650-427-5001
  34. Other VMware Activities Related to This Session
     • HOL: HOL-SDC-1305, Business Continuity and Disaster Recovery In Action
     • Group Discussions: BCO1003-GD, Disaster Recovery and Replication with Ken Werneburg
  35. Thank You
  37. Backup Slides
  38. Failover
     • A failover is performed when the production database fails and one of the standby databases is transitioned to take over the production role, allowing business operations to continue. Once the failover is complete and applications have resumed, the administrative staff can turn its attention to resolving the problems with the failed system. Failover may or may not result in data loss, depending on the Data Guard protection mode in effect at the time of the failover. There are two distinct types of failover: manual failover and fast-start failover.
     • Steps on the standby site after the primary database crashes:
       1. Initiate failover to the standby database:
          ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE
          In rare circumstances, DBAs may wish to avoid waiting for the standby to finish applying redo from the current standby redo log file before performing the failover. In that case they can issue ALTER DATABASE ACTIVATE STANDBY DATABASE to perform an immediate failover; any unapplied redo in the standby redo log is lost.
       2. Convert the physical standby database to the production role:
          ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY
       3. If the database was never opened read-only since the last time it was started, open the new production database:
          ALTER DATABASE OPEN
          If the physical standby database has been opened in read-only mode since the last time it was started, shut down the standby database and restart it:
          SHUTDOWN IMMEDIATE
          STARTUP
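The branching in the failover steps above can be sketched as a small generator that prints the statements to feed to sqlplus. The function name `failover_sql` and its two arguments are hypothetical. Skipping the COMMIT TO SWITCHOVER step after ACTIVATE STANDBY DATABASE is an assumption of this sketch, since the slide does not spell out what follows an immediate failover.

```shell
#!/bin/sh
# failover_sql MODE OPENED_READ_ONLY
#   MODE             "finish"    - apply remaining redo, then convert (steps 1-2)
#                    "immediate" - ACTIVATE STANDBY DATABASE; unapplied redo is lost
#   OPENED_READ_ONLY "yes" if the standby was opened read-only since it was
#                    last started, anything else otherwise
# Prints the statements in order; it does not connect to any database.
failover_sql() {
    case "$1" in
        finish)
            echo "ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;"
            echo "ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;"
            ;;
        immediate)
            # Assumption: ACTIVATE converts the standby directly, so the
            # COMMIT TO SWITCHOVER step is not emitted here.
            echo "ALTER DATABASE ACTIVATE STANDBY DATABASE;"
            ;;
        *)
            echo "unknown mode: $1" >&2
            return 1
            ;;
    esac

    if [ "$2" = "yes" ]; then
        # Opened read-only since last start: restart instead of opening.
        echo "SHUTDOWN IMMEDIATE"
        echo "STARTUP"
    else
        echo "ALTER DATABASE OPEN;"
    fi
}
```

Printing the statements rather than running them keeps the decision logic reviewable on its own; the output can be reviewed first and then piped into a sqlplus session connected as SYSDBA.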
  39. Switchover
     • Switchover is a planned role reversal between the production database and one of its standby databases, used to avoid downtime during scheduled maintenance on the production system or to test readiness for future role transitions. A switchover guarantees no data loss.
     • Steps:
       1. Primary site: get the status of the primary database:
          SELECT NAME, DB_UNIQUE_NAME, LOG_MODE, OPEN_MODE, PROTECTION_MODE, PROTECTION_LEVEL, DATABASE_ROLE, SWITCHOVER_STATUS FROM V$DATABASE
          Ensure both log_archive_dest_state_1 (local archiving) and log_archive_dest_state_2 (archiving to standby) are enabled.
          Standby site: get the status of the standby database:
          SELECT NAME, DB_UNIQUE_NAME, LOG_MODE, OPEN_MODE, PROTECTION_MODE, PROTECTION_LEVEL, DATABASE_ROLE, SWITCHOVER_STATUS FROM V$DATABASE
          Ensure log_archive_dest_state_1 (local archiving) is enabled and log_archive_dest_state_2 (archiving to primary) is disabled. Ensure there are no gaps in redo on the standby database.
       2. Primary site: verify that it is possible to perform a switchover operation:
          SELECT SWITCHOVER_STATUS FROM V$DATABASE
          If the output is ‘SESSIONS ACTIVE’, either disconnect all sessions manually or append the WITH SESSION SHUTDOWN clause when performing step 3.
       3. Primary site: convert the current primary database to the new physical standby:
          ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN
  40. Switchover (cont’d)
       4. Primary site: shut down the former primary and mount it as a standby database:
          SHUTDOWN IMMEDIATE
          STARTUP NOMOUNT PFILE=initPRD.ora
          ALTER DATABASE MOUNT STANDBY DATABASE
          Defer the remote archive destination on the old primary:
          ALTER SYSTEM SET log_archive_dest_state_2=DEFER
          Standby site: verify that the old physical standby can be converted to the new primary:
          SELECT SWITCHOVER_STATUS FROM V$DATABASE
       5. Standby site: convert the old physical standby to the new primary:
          ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN
          If the physical standby database has not been opened in read-only mode since the last time it was started:
          ALTER DATABASE OPEN
          Shut down and start up the new primary database:
          SHUTDOWN IMMEDIATE
          STARTUP PFILE=initSTDBY.ora
       6. Primary site (new standby): start managed recovery on the new standby database:
          ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION
          Standby site (new primary): enable remote archiving from the new primary to the new standby:
          ALTER SYSTEM SET log_archive_dest_state_2=ENABLE
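Steps 2 and 4 of the switchover hinge on the value of SWITCHOVER_STATUS. As a rough sketch, assuming the status string has already been read from V$DATABASE, the hypothetical helper `switchover_action` turns it into a go/no-go decision. It covers only the values discussed in these slides plus a catch-all.

```shell
#!/bin/sh
# switchover_action STATUS
# Hypothetical helper: interprets V$DATABASE.SWITCHOVER_STATUS.
#   TO STANDBY / TO PRIMARY -> the role transition can proceed as-is
#   SESSIONS ACTIVE         -> proceed, but use WITH SESSION SHUTDOWN
#                              (or disconnect sessions manually first)
#   anything else           -> stop and investigate
switchover_action() {
    case "$1" in
        "TO STANDBY"|"TO PRIMARY")
            echo "proceed"
            ;;
        "SESSIONS ACTIVE")
            echo "proceed-with-session-shutdown"
            ;;
        *)
            echo "stop"
            return 1
            ;;
    esac
}
```

Treating every unrecognized status as `stop` is a deliberately conservative choice: a switchover is planned work, so pausing to inspect an unexpected status costs little compared with a half-completed role reversal.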
