The Zero Data Loss Recovery Appliance (ZDLRA) is the newest member of Oracle's Engineered Systems family. It backs up as many as hundreds of Oracle databases to a single central system and addresses the biggest challenges that arise when implementing an enterprise-wide, centralized backup & recovery concept:
- Minimal impact on production from backup activity
- Optimization of Recovery Time Objective (RTO) and Recovery Point Objective (RPO)
- End-to-end monitoring and control of all B&R activities
- High scalability
This talk introduces the Zero Data Loss Recovery Appliance (ZDLRA) and explains how it differs from conventional backup & recovery concepts.
Moe: Emphasize: how many times have you attempted a restore/recovery and it did not work? 1) incomplete backup, 2) missing archives, 3) corrupted data. It's like going on vacation for 3 months and then starting your car for the first time.
Moe: If DR is not being validated and opened, how can you trust it to come up in your time of need? Stories of why remote mirroring failed: 1) data corruptions, 2) hardware/software bugs impacting the media, 3) user errors destroying the media.
Here are several examples where traditional HA infrastructure failed to achieve the objective of eliminating risk of downtime and data loss. In each case the customer had made considerable investment in HA infrastructure and procedures. In each case, when called upon, it failed. I wouldn’t want to be the guys who justified the purchase of this infrastructure for each of these organizations.
Tieto – a very visible outage. They had strict service level agreements with high-profile banking clients in Sweden. The outage was covered in a number of trade journals, with lasting negative impact on their reputation and business. They were counting on reliability built into a storage array, but no matter what level of internal redundancy is implemented within a system, it is still a single point of failure. Then, when they went to plan B – restore from tape backup – that also failed. How confident can anyone be that a restore from tape will be successful when needed? You can't know until you try.
American Eagle Outfitters – 8-day outage of their website. Cascading disk failures made on-disk copies unusable, backups were found to be corrupt, and the DR site maintained with storage remote mirroring did not work. Interesting quote from the article: "I know they were supposed to have completed it with Oracle Data Guard, but apparently it must have fallen off the priority list in the past few months," the source told Schuman.
State of Virginia – a SAN failure took out all applications serving state residents, and the standby SAN was also impacted by the failure. A time-consuming restore from backup was required before service could be resumed.
Database optimization example:
- Oracle Secure Backup (OSB) performance differentiators for backup to tape
  - Zero-copy shared buffer between RMAN and OSB
  - NUMA-aware for Oracle database shadow processes
Storage Location: Area to hold backups
The Recovery Appliance retains sufficient backups to satisfy the recovery window goal of each protected database.
If space is available, backups older than the recovery window goal may still be present, effectively extending how far back point-in-time recovery is available.
As new backups arrive and space comes under pressure, the Recovery Appliance can begin purging backups in the following order:
1) Expired archival (KEEP .. UNTIL TIME) backups
2) Backups that are older than the recovery window goal
3) Backups of databases that exceed their reserved space, with the database that exceeds its reservation by the highest percentage purged first, e.g. Database B in the diagram
Administrator can be notified if reserved space is not sufficient to meet recovery window goal for individual or all databases
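The purge order above can be sketched in a few lines of Python. This is only an illustration of the three tiers as described in these notes, not the appliance's actual algorithm: the `Backup` record, the `purge_order` function, and the per-database `window`, `used` and `reserved` dictionaries are all hypothetical names. Two further assumptions are made loudly: unexpired archival backups are never purge candidates, and within a tier the oldest backup goes first.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Backup:
    """Hypothetical backup record; not a Recovery Appliance data structure."""
    db: str
    created: datetime
    keep_until: Optional[datetime] = None  # set only for archival (KEEP) backups


def purge_order(backups: List[Backup], now: datetime,
                window: Dict[str, timedelta],
                used: Dict[str, float],
                reserved: Dict[str, float]) -> List[Backup]:
    """Return purge candidates in the order described in the notes."""

    def overuse(db: str) -> float:
        # Fraction by which a database exceeds its reserved space.
        return (used[db] - reserved[db]) / reserved[db]

    ranked = []
    for b in backups:
        if b.keep_until is not None:
            if b.keep_until < now:
                ranked.append((0, 0.0, b.created, b))   # tier 1: expired archival
            continue  # assumption: unexpired archival backups are never purged
        if b.created < now - window[b.db]:
            ranked.append((1, 0.0, b.created, b))       # tier 2: outside the window
        elif overuse(b.db) > 0:
            # tier 3: over reservation, highest percentage exceeded first
            ranked.append((2, -overuse(b.db), b.created, b))

    # Sort by tier, then by overuse (descending), then oldest first (assumption).
    ranked.sort(key=lambda r: r[:3])
    return [r[3] for r in ranked]
```

Backups that fall into none of the three tiers are simply left out of the result, which matches the idea that the appliance keeps whatever it still has room for.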
Suppose total database size is X.
Must have one full backup – size X
Assume 10 day recovery window
Assume 10% DB change rate per day – add X times 10% for each of the 10 recovery window days, i.e. another 1X – size now 2X
Assume redo generation rate is also 10% of DB size per day – over 10 days another 1X – size now 3X
Assume 50% compression – size now 1.5X
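The rule of thumb above is simple arithmetic, and a small Python helper makes it easy to try other assumptions. The function name `estimate_storage` and its parameters are made up for this sketch; the defaults are the figures used in these notes (10-day window, 10% daily change, 10% daily redo, 50% compression). It is a back-of-the-envelope estimate, not an official sizing tool.

```python
def estimate_storage(db_size: float,
                     recovery_window_days: int = 10,
                     daily_change_rate: float = 0.10,
                     daily_redo_rate: float = 0.10,
                     compression_savings: float = 0.50) -> float:
    """Rough space estimate in the same unit as db_size (hypothetical helper)."""
    full_backup = db_size                                          # one full backup: X
    changed_blocks = db_size * daily_change_rate * recovery_window_days  # + X -> 2X
    redo = db_size * daily_redo_rate * recovery_window_days              # + X -> 3X
    uncompressed = full_backup + changed_blocks + redo
    # 50% compression halves the footprint: 3X -> 1.5X
    return uncompressed * (1.0 - compression_savings)


if __name__ == "__main__":
    # Example: a 20 TB database with the default assumptions.
    print(estimate_storage(20.0))
```

With the defaults the result is 1.5 times the database size, matching the figures above; changing the window or the change rate shows how quickly the estimate moves.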
* Scales as disk drives get bigger over time – unlike Terabyte licensing
* Software license can be migrated to newer Recovery Appliance machines
* Recovery Appliance Hardware contractually restricted to running Recovery Appliance software (no Exadata)
Describe what MAA is – a Best Practices Blueprint and Integrated HA Architecture
Understand benefits & opportunity on customer basis