The document discusses various disaster recovery scenarios for a BI solution involving Azure Synapse, Data Lake, and Data Share. Scenario 2 involves provisioning these services in a paired secondary region, then synchronizing the Data Lake, restoring the SQL Pool, activating Synapse pipelines, and data share triggers to enable a standby environment. A step-by-step guide is provided for implementing scenario 2 with phases for provisioning, synchronization, restore, activation of pipelines and triggers, and notification of consumers. References are also included.
4. BI007 - DR Scenarios
RTO : Recovery Time Objective
RPO : Recovery Point Objective
Normal BI Solution
RPO = 0 -> data can be re-imported from the sources
RTO = how long can the solution be down?
5. Data Share
To be prepared for a data center outage, the data provider can
have a data share environment provisioned in a secondary
region. Measures can be taken to ensure a smooth failover in the
event that a data center outage does occur.
In this context, data consumers can keep an active share
subscription idle, reserved for DR purposes.
https://docs.microsoft.com/en-us/azure/data-share/disaster-recovery
6. Data Lake
Storage accounts that have hierarchical namespace enabled (such
as for Data Lake Storage Gen2) are not supported for failover at
this time.
https://docs.microsoft.com/en-us/azure/storage/common/storage-disaster-recovery-guidance
Trigger Failover – not available
Failover for storage accounts with hierarchical namespace enabled (Azure Data Lake Storage Gen2 storage accounts) is not supported at this time.
7. Copying data as an alternative to failover
If your storage account is configured for read access to the secondary, then you can design
your application to read from the secondary endpoint. If you prefer not to fail over in the
event of an outage in the primary region, you can use tools such as AzCopy, Azure
PowerShell, or the Azure Data Movement library to copy data from your storage account in
the secondary region to another storage account in an unaffected region. You can then
point your applications to that storage account for both read and write availability.
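The copy described above can be sketched with AzCopy. Account names, container, and SAS tokens below are placeholders, not values from this environment; the script only builds and echoes the command so it is safe to run without Azure credentials:

```shell
SRC_ACCOUNT="mydatalake"     # hypothetical account with read access to secondary enabled
DST_ACCOUNT="mydatalakedr"   # hypothetical account in an unaffected region
CONTAINER="curated"          # hypothetical container name
SAS="<sas_token>"            # generate per account; keep short-lived

# RA-GRS/RA-GZRS exposes the replica at the "-secondary" endpoint:
SRC_URL="https://${SRC_ACCOUNT}-secondary.blob.core.windows.net/${CONTAINER}?${SAS}"
DST_URL="https://${DST_ACCOUNT}.blob.core.windows.net/${CONTAINER}?${SAS}"

# Echo keeps the sketch dry-run; drop the echo to execute for real.
CMD="azcopy copy \"${SRC_URL}\" \"${DST_URL}\" --recursive"
echo "${CMD}"
```

Once the copy completes, applications can be pointed at the destination account for both reads and writes.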
8. Data Lake
RA on secondary
Geo-redundant storage (with GRS or GZRS) replicates your data to another physical location
in the secondary region to protect against regional outages. However, that data is available
to be read only if the customer or Microsoft initiates a failover from the primary to
secondary region.
When you enable read access to the secondary region, your data is available to be
read at all times, including in a situation where the primary region becomes unavailable.
For read access to the secondary region, enable read-access geo-redundant storage (RA-GRS) or read-access geo-zone-redundant storage (RA-GZRS).
9. Synapse
Geo-backups and disaster recovery
A geo-backup is created once per day to a paired data center. The RPO for a geo-restore is 24
hours. You can restore the geo-backup to a server in any other region where dedicated SQL pool
is supported. A geo-backup ensures you can restore data warehouse in case you cannot access
the restore points in your primary region.
You can also create a user-defined restore point and restore from the newly created restore point
to a new data warehouse in a different region. After you have restored, you have the data
warehouse online and can pause it indefinitely to save compute costs. The paused database
incurs storage charges at the Azure Premium Storage rate. Another common pattern for a
shorter recovery point is to ingest data into primary and secondary instances of a data
warehouse in parallel. In this scenario, data is ingested from a source (or sources) and persisted
to two separate instances of the data warehouse (primary and secondary). To save on compute
costs, you can pause the secondary instance of the warehouse. If you need an active copy of the
data warehouse, you can resume, which should take only a few minutes.
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/backup-and-restore
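The pause/resume pattern for the secondary warehouse can be sketched with the Azure CLI. Resource group, workspace, and pool names below are hypothetical, and the commands are echoed rather than executed so the sketch runs without an Azure login:

```shell
RG="bi007-rg"                  # hypothetical resource group
WORKSPACE="bi007-synapse-dr"   # hypothetical secondary workspace
POOL="bi007pool"               # hypothetical dedicated SQL pool

# Pause the standby pool to stop compute billing (storage is still charged):
PAUSE="az synapse sql pool pause --name $POOL --workspace-name $WORKSPACE --resource-group $RG"
# Resume when the copy must come online (typically a few minutes):
RESUME="az synapse sql pool resume --name $POOL --workspace-name $WORKSPACE --resource-group $RG"
echo "$PAUSE"
echo "$RESUME"
```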
10. Synapse
Move Synapse from one region to another
https://docs.microsoft.com/en-us/azure/synapse-analytics/how-to-move-workspace-from-one-region-to-another
Summary of steps:
• Provision new Synapse instance and restore your last state to it from
the Automated or User-Defined snapshot.
• Proper permissions should be granted
• Connections to Azure Services should be reestablished
• New connection parameters should be propagated to the end-users
• Model drift should be mitigated
11. Synapse
[Diagram] Current Scenario: the Main Region runs the Synapse Workspace, SQL Pool, Data Lake (GZRS) and Data Share; the Pair Region holds the secondary Data Lake (LRS) and an automated snapshot copy of the SQL Pool. RPO < 24h (geo-backup), RPO = 8h (automated snapshots), RPO < 15m for the data lake (no SLA). Data lake failover is not available due to the hierarchical namespace.
12. Current scenario
Data lake (RPO < 15 minutes):
• Activate RA-GRS – allows read-only access to the secondary region
SQL Pool (RPO: 8 h restore points + 24 h geo-backup):
• possibility to use user-defined snapshot backups
13. Synapse
[Diagram] Scenario 1: recover from current scenario. Main Region: Synapse Workspace, SQL Pool, Data Lake (GZRS), Data Share. Pair Region: after a Microsoft-activated failover, the read-only Data Lake (LRS) becomes available; a new Synapse Workspace, SQL Pool, writable Data Lake and Data Share are provisioned; the SQL Pool is restored from the copied snapshot and the writable data lake is synced from the read-only copy.
14. Scenario 1 : recover from current scenario
• Provision new Azure services – BI007 infrastructure
(network, roles, data lake permissions…)
• Wait for the Microsoft-initiated failover – allows read-only access
to the secondary region
• Restore the SQL Pool database
15. Synapse
[Diagram] Scenario 2: pair region provisioned standby, with stepped activation. Main Region: Synapse Workspace, SQL Pool, Data Lake (GZRS), Data Share. Pair Region: Synapse Workspace, SQL Pool, Data Share and a writable Data Lake stand ready; after the Microsoft-activated failover, the read-only Data Lake (LRS) is synced to the writable one and the SQL Pool is restored from the copied snapshot (step 1), Synapse pipeline triggers are activated (step 2) and Data Share triggers are activated (step 3).
16. Scenario 2 : Pair region provisioned standby
Deployment is made to both regions – via CI/CD
All Azure services are deployed and configured in advance
1. Activate data lake synchronization – read-only to read/write –
and restore the SQL Pool from a snapshot
2. Activate Synapse pipeline triggers
3. Activate Data Share triggers
17. Synapse
[Diagram] Scenario 3: current replicated hot standby. Main Region: Synapse Workspace, SQL Pool, Data Lake (GZRS), Data Share. Pair Region: a second, daily-activated production environment with its own Synapse Workspace, SQL Pool, Data Share and writable Data Lake (GZRS).
18. Scenario 3 : current replicated hot standby
2 production systems, one in each region – CI/CD deploy
Data Share and Synapse pipelines activated in both
Start/pause the SQL Pool as needed
Possibility to have 2 online systems
2 production systems to maintain
If data lake replication is removed, data lake costs are roughly equal
Additional cost from running Synapse pipelines in both regions
Low RPO and RTO – fully replicated environment
20. Synapse
[Diagram] Scenario 2: pair region provisioned standby, with stepped activation (same diagram as slide 15).
21. Scenario 2 : Pair region provisioned standby
Pre-Requisites
Phase 0 : Provisioning and failover validation
Phase 1 : Data Lake synchronization and SQL Pool restore
Phase 2 : Activate Synapse trigger pipelines
Phase 3 : Activate Data Share triggers
Phase 4 : Adjust/Notify consumers for new endpoints
References
22. Pre-Requisites
Ensure that the current redundant Storage Account on secondary
is RA activated (RA-GZRS)
Ensure that a Read/Write Storage Account is provisioned on
secondary region, using ZRS
Ensure that Synapse Dedicated SQL Pool geo-backup policy is
enabled
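The storage pre-requisites can be checked from the Azure CLI. Account names below are hypothetical, and the commands are echoed rather than executed so the sketch runs without an Azure login:

```shell
RA_ACCOUNT="bi007dl"     # hypothetical current redundant account; expected SKU: Standard_RAGZRS
RW_ACCOUNT="bi007dldr"   # hypothetical read/write account in the secondary region; expected SKU: Standard_ZRS

# Querying sku.name reveals the configured redundancy for each account:
CHECK_RA="az storage account show --name $RA_ACCOUNT --query sku.name --output tsv"
CHECK_RW="az storage account show --name $RW_ACCOUNT --query sku.name --output tsv"
echo "$CHECK_RA"   # expect Standard_RAGZRS
echo "$CHECK_RW"   # expect Standard_ZRS
```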
23. Phase 0 : Provisioning and failover validation
• Use current automation deployment strategy (terraform)
• Provision all Azure Services that represent the current MD Data hub
infrastructure (as in previous diagram)
• Ensure network, roles, data lake permissions, etc.
• Replicate Data Share subscription(s)
• Consider secondary backup region from data provider(s)
• Use current automation (DevOps CI/CD) to replicate last
developments
• Periodic failover validation
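For the periodic validation step, the secondary's replication lag can be inspected from the Azure CLI: `lastSyncTime` bounds the data loss to expect from the read-only replica. Account and resource group names are hypothetical, and the command is echoed rather than executed:

```shell
ACCOUNT="bi007dl"   # hypothetical storage account
RG="bi007-rg"       # hypothetical resource group

# geoReplicationStats.lastSyncTime: all writes before this instant
# are guaranteed to be present on the secondary.
CHECK="az storage account show --name $ACCOUNT --resource-group $RG --expand geoReplicationStats --query geoReplicationStats.lastSyncTime --output tsv"
echo "$CHECK"
```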
24. Phase 1 : Data Lake synchronization and SQL Pool
restore
Evaluate possible data loss (regarding RPO)
These steps can run in parallel:
• Sync Data Lake (avg 3s/GB)
• Execute an AzCopy script to sync the Read Only Storage Account with the Read/Write Storage Account
azcopy sync "https://<source_storage>.blob.core.windows.net/?<sas_token>" "https://<destination_storage>.blob.core.windows.net/?<sas_token>" --recursive --delete-destination=true
https://docs.microsoft.com/en-us/azure/storage/common/storage-ref-azcopy-sync?toc=/azure/storage/blobs/toc.json
• SQL Pool
• restore (using PowerShell or the Azure portal)
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-restore-from-geo-backup#restore-from-an-azure-geographical-region-through-powershell
25. Phase 2 : Activate Synapse trigger pipelines
Using Synapse Studio:
• Open Synapse Studio
• Go to Manage / Integration / Triggers (menu)
• Start triggers
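The same activation can be scripted with the Azure CLI (`az synapse trigger start`), which is useful if this phase should run unattended. Workspace and trigger names below are hypothetical, and the commands are echoed rather than executed:

```shell
WORKSPACE="bi007-synapse-dr"   # hypothetical secondary workspace

# Start each pipeline trigger in turn (trigger names are hypothetical):
for TRIGGER in DailyLoad HourlyRefresh; do
  CMD="az synapse trigger start --workspace-name $WORKSPACE --name $TRIGGER"
  echo "$CMD"
done
```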
26. Phase 3 : Activate Data Share triggers
Using Azure Portal:
• Go to Data Share Service
• Select Received Shares (on left menu)
• For each share subscription:
• on the snapshot schedule, enable the recurrence interval
27. Phase 4 : Adjust/Notify consumers for new endpoints
• Distribute the new endpoints to consumers, or
• If DNS-based, point DNS to the new endpoints
29. Ricardo Linhares
BI Specialist | Data & AI Solutions @ DevScope
Started with SQL Server 2000
r.linhas@gmail.com
https://www.linkedin.com/in/r-linhares/
https://twitter.com/RLinhas