SRDF Topology Discussion Document for 
Deutsche Bank 
James Ridley /Jan Jedynak 
EMC Corporation 
VERSION 1.2 
Page 1 of 21
Introduction
Executive Overview
SRDF Overview
What is SRDF?
Why SRDF?
How does SRDF work?
SRDF Storage Protocol used by Deutsche Bank
Where SRDF is used at Deutsche Bank
DWDM Technology
Overview
Nortel Networks OPTera Metro
Latency
SRDF induced delays
Example
Recommendations for Handling High Activity Data
Types of applications that may not be suited to synchronous replication
SRDF best practices in use at Deutsche Bank
Alternative strategies
1) SRDF Semi-Synchronous mode
2) SRDF Adaptive Copy mode
3) SRDF Multi-Hop mode
4) Oracle8i Automated Standby Database
Conclusion
Introduction 
Recent engineering work by EMC has validated the Nortel Optera DWDM
(Dense Wavelength Division Multiplexer) for use with EMC SRDF to a distance of
200KM. This document explains which applications can benefit from this 
extended distance mirroring, and which cannot. It also offers alternative 
solutions to support the protection of data if the application cannot support 
extended distance mirroring. 
Executive Overview 
EMC Engineering has recently validated the Nortel Optera DWDM for use with 
EMC SRDF up to 200KM. 
DWDM technology allows multiple SRDF links to be ‘packaged up’ onto a
smaller number of physical telecomms fibre cables, reducing the number
of cables required without reducing the bandwidth or
efficiency. This leads to greatly reduced costs.
The Nortel Optera DWDM is not the DWDM currently selected by Deutsche 
Bank ‘New World’. 
It is envisaged that this 200KM distance will be increased very shortly after 
further validation by EMC engineering. 
EMC SRDF works by forwarding write IO from a host to a Symmetrix onto a 
second remote Symmetrix. This is done transparently to the host, which only 
‘sees’ a slightly slower write IO. 
During normal non-BCP operation, only write IO is sent to the remote
Symmetrix. As a general rule of thumb, an application does 90-95% reads and
only 5-10% writes.
To calculate the additional latency we add the fixed ‘overhead’
of writing to 2 Symmetrix units, as opposed to 1, plus a variable value
that depends on the distance to be replicated. The variable value is
proportional to the distance and is bounded by the speed of light in the
fibre, so it is not envisaged that EMC engineering will be able to
improve on this in the near future. The DWDM units and Connectrix switch
units required add negligible overhead. The latency overhead is incurred
per write IO.
Applications with a heavy write IO, or during batch runs, may experience ‘IO 
Queuing’ as write IOs queue to be sent across the SRDF link to the second 
Symmetrix. 
Various host best practices can greatly reduce the potential for IO Queuing,
and these are in use by Deutsche Bank.
SRDF can operate in a number of modes, and these modes can be 
interactively switched on a very granular basis. These different modes can 
again greatly reduce or even eliminate IO Queuing. 
In addition, SRDF in conjunction with EMC TimeFinder can be architected into
various ‘multi-hop’ and ‘time sync enabled’ solutions.
In summary, SRDF can be extended across greater distances than previously 
possible using DWDM technology, but this extra distance is not without cost to 
the efficiency of the applications using it. The service levels required and
the IO profile of the application must be examined in detail to see if
it is practical to use SRDF over extended distances in synchronous mode. If
the application is not suitable for synchronous mode SRDF over the distance 
then there are other architected solutions available which may provide the 
required level of protection. 
SRDF Overview 
What is SRDF? 
SRDF generates a mirror image of the data at the logical volume level in one 
or more remote Symmetrix systems. These remote volumes can be made 
addressable to remote hosts via software commands. SRDF Synchronous 
mode (which is the default mode of operation at Deutsche Bank) was first 
developed for Disaster Recovery within the customer’s campus. SRDF 
Adaptive Copy modes were later developed to support long distance bulk data 
transfers for data center relocations and content replication. Technology has 
evolved to support Wide Area Networking (WAN) and multiple transports, thus 
increasing distance and throughput for a wider variety of applications of 
SRDF. Additional customer uses for SRDF include remote data warehousing, 
remote test beds, remote report generation, remote backup and workload 
sharing between hosts at the same or geographically remote sites. 
Why SRDF? 
SRDF allows companies to maintain access to data, so that revenue
producing or supporting applications continue to serve business functions.
SRDF can be used in several key areas including, but not limited to: 
Business continuance: business applications continue running despite 
possible disk failures. 
Disaster recovery: data recovery at the disaster recovery site in minutes 
rather than days. 
Data centre migrations: application outage reduced to minutes instead of 
hours. 
Work load migrations: similar to the data centre migrations; especially useful 
for minimizing outages during preventative maintenance of hardware or 
software, or even data center powerdowns. 
Shortening or eliminating backup windows: eliminate the backup window 
by utilizing SRDF’s second data copy. 
How does SRDF work? 
SRDF works in 3 different modes: synchronous, semi-synchronous, and
adaptive copy.
- Synchronous. Data on the source (R1) and target (R2) volumes are
always fully synchronized at the completion of an I/O sequence.
- Semi-synchronous. Data on remotely mirrored volumes are always 
synchronized between the source (R1) and the target (R2) prior to 
initiating the next write operation to these volumes. 
- Adaptive copy. Adaptive Copy modes transfer data from the source 
(R1) volume to the target (R2) volume and do not wait for receipt 
acknowledgment and synchronization to occur. 
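The practical difference between the three modes is when the host sees its write complete. The following is a minimal Python sketch of that difference for a single isolated write, not a description of Symmetrix internals. The names `SrdfMode`, `host_write_time_ms` and `wait_before_next_write_ms`, and the `local_ms` and `link_ms` parameters, are illustrative assumptions, and queuing is ignored.

```python
from enum import Enum


class SrdfMode(Enum):
    SYNCHRONOUS = "synchronous"
    SEMI_SYNCHRONOUS = "semi-synchronous"
    ADAPTIVE_COPY = "adaptive copy"


def host_write_time_ms(mode: SrdfMode, local_ms: float, link_ms: float) -> float:
    """Time until the host sees one isolated write complete.

    local_ms: hypothetical time to land the write in the source cache.
    link_ms:  hypothetical time to send it to the target cache and be acknowledged.
    """
    if mode is SrdfMode.SYNCHRONOUS:
        # R1 and R2 are synchronized before the I/O sequence completes.
        return local_ms + link_ms
    # Semi-synchronous and adaptive copy complete the host write first;
    # the transfer to the target happens afterwards.
    return local_ms


def wait_before_next_write_ms(mode: SrdfMode, link_ms: float) -> float:
    """Extra wait a back-to-back write to the SAME volume may see."""
    if mode is SrdfMode.SEMI_SYNCHRONOUS:
        # The previous write must be synchronized before the next one starts.
        return link_ms
    # Synchronous has already paid the link time; adaptive copy does not wait.
    return 0.0
```

With illustrative values of 1 msec local time and 2 msec link time, a synchronous write takes 3 msec as seen by the host, while the other two modes return after 1 msec.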
SRDF writes are from cache to cache. When data is written from local
Symmetrix cache to remote Symmetrix cache over the SRDF link, the
production Symmetrix waits for an acknowledgement from the remote
Symmetrix before the data is written to local disk.
SRDF Storage Protocol used by Deutsche Bank 
SRDF at Deutsche Bank uses a storage protocol based upon either the 
ESCON™ or Fibre Channel FC-4 specifications to remotely mirror data 
between Symmetrix units. The host attachment, I/O protocol, and disk data
structures required by each host are independent of the SRDF operation
between Symmetrix units. All existing production implementations at
Deutsche Bank use ESCON, though all future implementations, including the 
new datacentre at Hayes, will use Fibre Channel. 
The benefits of SRDF over Fibre Channel Point-to-Point include increased 
SRDF throughput for all host types and increased connectivity options for 
Open Systems. In addition, Fibre Channel maintains a peer-to-peer 
relationship as opposed to the ESCON channel and control unit relationship 
used at the ESCON RA director level. This increases the flexibility of SRDF in 
cases where it is desired to have primary and secondary volumes located at 
each side of the SRDF link. 
Where SRDF is used at Deutsche Bank 
SRDF is deployed between all the major MERs in the London campus, in 
point-to-point configurations. 
DWDM Technology 
Overview 
Dense Wavelength Division Multiplexing (DWDM) is a process in which
multiple individual channels of data are carried at different
wavelengths over one pair of fiber links. This contrasts with conventional
fiber optic systems, in which just one channel is carried over a fiber pair.
For EMC customers this means that multiple SRDF channels and server 
channels can be transferred over one pair of fiber links along with traditional 
network traffic! This is especially important in locations where fiber links are at 
a premium. For example, a customer may be leasing fiber, so the more traffic 
they can run over a single link, the more cost effective the solution. With 
today’s technology, the capacity of a single pair of fiber strands is virtually 
unlimited. The limitation comes from the DWDM itself. Optical to electrical 
transfers for switching and channel protection are required and limit the input 
traffic per channel. 
SRDF over Fibre Channel does not currently support direct connections 
between RF directors using WDM or DWDM unit port connections, due to 
performance limitations and the relatively variable latencies of such links over 
long distances. 
DWDM units, however, are supported for SRDF traffic via ISL connections 
using Fibre Channel switches such as the Connectrix family of Fibre Channel 
switches. 
Nortel Networks OPTera Metro 
High capacity is inherent in the Nortel Networks OPTera Metro DWDM (Dense
Wavelength Division Multiplexing) solution. Each wavelength can support up
to 2.5Gb/s, while 32 or more such wavelengths can be multiplexed onto a
single fiber. The resulting aggregate supports capacities of 80Gb/s to
provide high capacity trunks between network elements.
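The 80Gb/s aggregate can be checked against the per-wavelength maximum of 2.5 Gbps given in the OPTera Metro feature list. A small sketch of the arithmetic, using constant names chosen here for illustration:

```python
# Figures taken from the OPTera Metro description and feature list.
GBPS_PER_WAVELENGTH = 2.5  # maximum rate per wavelength
WAVELENGTHS = 32           # wavelengths multiplexed onto a single fiber

aggregate_gbps = GBPS_PER_WAVELENGTH * WAVELENGTHS
print(f"{WAVELENGTHS} wavelengths x {GBPS_PER_WAVELENGTH} Gb/s = {aggregate_gbps} Gb/s")
```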
Nortel Networks OPTera Metro provides the ability to route wavelengths, and
therefore has the same survivability capabilities as current TDM rings when
deployed in a ring topology. OPTera Metro provides a reliable DWDM platform
for enterprises with large-scale connectivity requirements. OPTera’s
transparent capabilities enable these enterprises to control the cost and
management requirements of connectivity, ensure network integrity, increase
network robustness, and easily accommodate emerging communications
protocols.
[Figure: DWDM acts as an “optical funnel”, carrying 8 to 64 wavelengths on a fiber]
Features and Benefits 
• Support of SONET/SDH and non-SONET/SDH interfaces
• Protocol and bit-rate independence
• 32 protected wavelengths, 64 unprotected wavelengths
• Per-wavelength flexible protection switching
• Scalable from 16 Mbps to 2.5 Gbps per wavelength
• Point-to-point and survivable ring up to 120km
• In-band, per wavelength Optical Service Channel
• Point and click GUI management system
• Open systems management platform
• NEBS and ETSI compliant
• Dense wavelength division multiplexing
  – Multiple protocol independent streams on a single fiber-optic cable pair
  – Each wavelength represents a unique stream of data which may have a
    different data rate
Latency 
SRDF induced delays 
Synchronous or even semi-synchronous mirroring of data can cause impacts 
to customer workloads. The impact to any given workload will vary according 
to: 
- The blocksize of the data being remote mirrored 
- The distance over which the remote mirroring is being done 
- The remote mirroring mode used (e.g. synchronous, semi-synchronous,
adaptive copy)
- The type of connection between the source and target Symmetrix units 
- The arrival rate of the write IOs at the source Symmetrix 
The degree to which a customer workload is impacted by delays induced by 
SRDF mirroring will not only vary according to the amount of the delay, but 
also due to the nature of the workload. Some workloads will not be impacted 
by extended response times on workload components that are critical for 
recovery. Other workloads could be severely impacted if the affected
component is on the critical path for end user transaction response time
(e.g. an increase in response time to the online Redo logs in an Oracle
environment will invariably cause end user transaction response time to
degrade).
In order to approximate the amount of delay likely to be introduced by 
SRDF’ing the data for any given workload, one should: 
- Determine the type of SRDF implementation that is likely to be installed
- Calculate the propagation delay induced by the link. This is the round
trip link distance in kilometres multiplied by 0.005 msec/km, and
then by 3 if campus ESCON is to be used, by 1 if a telco link
(e.g. T3, ATM, etc) is to be used, or by 2 for SRDF over Fibre Channel.
To this it will be necessary to add an allowance for protocol time within
both the source and target Symmetrix, as well as allowances for
delays induced by protocol converters, network equipment, etc.
- Add the approximated SRDF link delay times to the current or 
anticipated non SRDF’ed IO response times. 
- Determine the likely impact on the customer workload, remembering
that the impact will inevitably follow Little’s Law (see note 1).
Note 1: Little’s Law is the basis upon which a lot of queuing theory is built.
In general terms, Little’s Law relates the average queue length (Q) to the
arrival rate of transactions (a) and the average response time (R).
Specifically, Little’s Law states:
Q = a * R
Consequently, it can be seen that any increase in IO response time may well cause a
significant blowout in the queue length within the application, which may or may not be
supportable from a customer business perspective.
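The propagation delay rule of thumb in the steps above can be written as a short calculation. This is a sketch only: the function names, the dictionary keys and the `allowance_ms` parameter (standing in for the protocol and equipment allowances, for which the text gives no figures) are assumptions made for illustration.

```python
MS_PER_KM = 0.005  # propagation delay per kilometre of link, in msec

# Multipliers quoted in the text for each link type.
LINK_MULTIPLIER = {
    "campus_escon": 3,
    "telco": 1,
    "fibre_channel": 2,
}


def propagation_delay_ms(one_way_km: float, link_type: str) -> float:
    """Approximate link propagation delay for one write IO, in msec."""
    round_trip_km = 2 * one_way_km
    return round_trip_km * MS_PER_KM * LINK_MULTIPLIER[link_type]


def estimated_write_response_ms(non_srdf_ms: float, one_way_km: float,
                                link_type: str, allowance_ms: float = 0.0) -> float:
    """Non-SRDF response time plus link delay plus a caller-supplied allowance."""
    return non_srdf_ms + propagation_delay_ms(one_way_km, link_type) + allowance_ms
```

For a 100KM Fibre Channel link, `propagation_delay_ms(100, "fibre_channel")` evaluates 200 x 0.005 x 2, which is about 2.0 msec.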
Example 
This document concentrates on SRDF over Fibre Channel. Write IO is
transmitted using SCSI over Fibre Channel, and so according to the SCSI
protocol every IO to be transmitted actually requires 2 round trips. The
first carries the SCSI command word (for SRDF this will be WRITE), for which
the remote Symmetrix returns an acknowledgement. The second carries the
actual data, followed by the acknowledgement from the remote Symmetrix
that the data has been written to cache and confirmed. This leads to the x2
propagation delay described above.
The accompanying figure illustrates the host response time without SRDF
(Baseline), and the overhead of running SRDF over zero distance (Campus),
for 4K and 27K blocksizes.
Working through a 4K blocksize example, we have a 2.0MS host response
time for zero distance. Add to this the delay for a 100KM distance, the
approximate distance from London to Milton Keynes:
(100KM + 100KM + 100KM + 100KM) * 0.005 = 2.0MS. This gives a total of
4.0MS response time per write IO.
Heavy write activity on 1 volume may mean that IOs are queued waiting for 
the previous IO to be acknowledged from the remote Symmetrix, and so you 
may get IO elongation, with IOs waiting on IOs on IOs (see Little’s Law 
above). 
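To see why a longer response time lengthens queues, Little's Law (Q = a * R) can be applied to the two response times in the example. The sketch below assumes a hypothetical steady arrival rate of 100 write IOs per second; that rate and the function name are illustrative only.

```python
def average_queue_length(arrival_rate_per_sec: float, response_time_ms: float) -> float:
    """Little's Law, Q = a * R, with R converted from msec to seconds."""
    return arrival_rate_per_sec * (response_time_ms / 1000.0)


ARRIVAL_RATE = 100.0  # hypothetical write IOs per second

at_zero_distance = average_queue_length(ARRIVAL_RATE, 2.0)  # 2.0MS response time
at_100km = average_queue_length(ARRIVAL_RATE, 4.0)          # 4.0MS response time

print(f"Average IOs in flight at zero distance: {at_zero_distance:.1f}")
print(f"Average IOs in flight at 100KM:         {at_100km:.1f}")
```

At the same arrival rate, doubling the response time doubles the average number of IOs outstanding, which is the elongation effect described above.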
Note: There is no significant Latency through Switches or DWDMs 
[Figure: host response time for Baseline and Campus at 4K and 27K blocksizes; labelled values 2.1MS and 3.9MS]
Recommendations for Handling High Activity Data 
As a general rule of thumb, and depending on the nature of the application
being supported, the distance over which the data is to be remote mirrored, etc,
in order to ensure acceptable overall IO response times it is desirable that no
single logical volume involved in a remote mirroring relationship be required to
handle more than 100 write IOs/sec at 200KM. This figure is derived from the
maximum number of IOs that a logical volume can sustain at that distance (4K
blocksize: max 175 write IOs per second; 27K blocksize: max 125 write IOs
per second). It must be remembered that only 1 IO for a given volume can be
in the SRDF ‘pipe’ at a time, though IOs for different volumes can be in the
‘pipe’ at the same time.
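Because only one IO per volume can be in flight at a time, the per-volume ceiling is roughly the reciprocal of the write response time. The sketch below is a back-of-envelope estimate that reuses the 2.0MS zero-distance figure and the four-leg delay from the earlier example. It is an approximation under those assumptions, not the method used to derive the maxima quoted above.

```python
def max_serial_writes_per_sec(response_time_ms: float) -> float:
    """Upper bound on back-to-back writes per second to a single volume."""
    if response_time_ms <= 0:
        raise ValueError("response time must be positive")
    return 1000.0 / response_time_ms


ZERO_DISTANCE_MS = 2.0           # 4K zero-distance response time from the example
LINK_DELAY_MS = 4 * 200 * 0.005  # four one-way legs of 200KM at 0.005 msec/km

ceiling = max_serial_writes_per_sec(ZERO_DISTANCE_MS + LINK_DELAY_MS)
print(f"Estimated ceiling at 200KM: {ceiling:.0f} write IOs/sec per volume")
```

The estimate lands in the same region as the 4K maximum quoted above, which supports keeping the planning figure at 100 write IOs/sec per volume to leave headroom.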
In order to reduce the IO rate to any given logical volume to this sort of level, it 
may be necessary to implement some of the following. 
- Wherever possible, high activity data should be spread over as many
logical volumes as possible, so as to reduce the overall IO rate per
volume, i.e. host level striping.
- If possible, increase host level buffering and blocksizes so as to reduce 
the number of IOs done by the application. 
- When dealing with high activity IO caused by large, single address 
space tasks (e.g. database control regions, etc), it may be necessary to 
break the tasks into multiple smaller tasks, so as to reduce the amount 
of data generated on a per region basis to more manageable levels. 
This is a non-trivial task, as it may have significant impact on the 
customer’s application architecture, and will require significant 
involvement from customer personnel such as Data Base Analysts, etc. 
- If necessary, re-design the application so as to achieve the desired IO 
rate on a per volume basis. 
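As a rough sizing aid for the first recommendation, the number of volumes to stripe across can be estimated from the total write rate and the 100 write IOs/sec guideline. This sketch assumes the writes spread evenly across the stripe set, which real workloads may not do; the function name and default are illustrative.

```python
import math


def volumes_needed(total_write_ios_per_sec: float,
                   per_volume_limit: float = 100.0) -> int:
    """Minimum number of volumes so that no volume exceeds the guideline rate.

    Assumes an even spread of writes across all volumes in the stripe set.
    """
    return max(1, math.ceil(total_write_ios_per_sec / per_volume_limit))


# A hypothetical workload issuing 350 write IOs/sec would need 4 volumes.
print(volumes_needed(350))
```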
Types of applications that may not be suited to 
synchronous replication. 
1) Database applications which exhibit very high transaction throughput and
therefore a high number of log writes.
2) Database applications that have a high transaction rate and perform an
excessive number of consistency point operations (perhaps as a result of
frequent log switch operations).
3) Applications which exhibit high volumes of I/O writes.
4) Applications that are highly sensitive to synchronous write I/O performance
(non-buffered synchronous writes).
5) Any highly time-bound, write intensive application process where any elongation
of write I/O would impact application performance.
SRDF best practices in use at Deutsche Bank 
Various best practices can reduce the impact of IO Queuing and IO 
elongation. 
The simplest is to make sure that all filesystems are built on host level striped 
volumes. The reason for this is that the SRDF 'pipe' or queue can only have 1 
IO for a Symmetrix volume going across it at any time. The pipe can contain 
more than 1 IO, but not for the same Symmetrix volume. By creating a striped 
volume set at the host level you get 2 immediate effects when the host writes 
an IO. If we were to write IOs to a striped filesystem spread over 4 Symmetrix 
volumes then the 2 benefits would be: 
1) the host knows it is writing to a striped set and issues more IOs to
the disk subsystem, as it knows it is actually writing to 4 volumes
2) more IOs can go across the SRDF 'pipe' to the remote Symmetrix as
the IOs are to 4 Symmetrix volumes rather than just 1. This reduces queuing
for the pipe.
Host level LVM striping is being used as a best practice by nearly all projects 
based on EMC Symmetrix. 
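The effect of striping on the SRDF 'pipe' can be illustrated with a simple round-robin mapping from file offset to volume. This is a generic sketch of striping, not the layout used by any particular LVM; the 64K stripe unit and the function name are assumptions.

```python
def volume_for_offset(offset_bytes: int, stripe_unit_bytes: int, volume_count: int) -> int:
    """Index of the volume that holds the stripe unit containing this offset."""
    return (offset_bytes // stripe_unit_bytes) % volume_count


STRIPE_UNIT = 64 * 1024  # hypothetical stripe unit size
VOLUMES = 4              # the 4-volume stripe set from the text

# Eight consecutive stripe-unit writes land on volumes 0,1,2,3,0,1,2,3,
# so up to four of them can be in the SRDF pipe at once instead of one.
targets = [volume_for_offset(i * STRIPE_UNIT, STRIPE_UNIT, VOLUMES) for i in range(8)]
print(targets)
```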
Alternative strategies 
The latency overhead can also be masked from the user if an alternative
replication strategy is adopted, namely Semi-Synchronous or Multi-Hop
replication.
Another strategy would be combining the benefits of SRDF with an Oracle 
automated standby database. This solution requires only that the online redo 
logs be synchronously replicated, thus drastically reducing communication 
needs. 
The following strategies could help alleviate latency overhead with SRDF 
deployed over extended distances. 
1) SRDF Semi-Synchronous mode 
This is used primarily in extended distance environments. In this mode of 
operation, data on the remotely mirrored volumes are always synchronized 
between the source (R1) volume and the target (R2) volume prior to initiating 
the next write operation to these volumes. 
The sequence of operations is: 
1. An I/O write is received from the host/server into the cache of the source.
2. An ending status is presented to the host/server.
3. The I/O is transmitted to the cache of the target.
4. A receipt acknowledgment is provided by the target back to the cache of
the source.
Semi-Synchronous mode masks the impact of distance in the general case, 
because it allows read operations while write operations are in transit. 
SRDF uses a first-in, first-out queue. 
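The sequence above can be turned into a small timeline for a burst of back-to-back writes to one volume. This is a simplified model under stated assumptions: all writes are ready at time zero, `local_ms` and `link_ms` are hypothetical fixed costs, and the function name is chosen for illustration.

```python
def host_ack_times_ms(n_writes: int, local_ms: float, link_ms: float,
                      semi_sync: bool) -> list[float]:
    """Times at which the host sees each write in a burst to ONE volume complete."""
    acks = []
    start = 0.0
    for _ in range(n_writes):
        cached = start + local_ms  # step 1: write lands in the source cache
        synced = cached + link_ms  # steps 3-4: sent to the target and acknowledged
        # Step 2: semi-synchronous presents ending status as soon as the write
        # is cached; synchronous waits for the target's acknowledgement.
        acks.append(cached if semi_sync else synced)
        # The next write to this volume cannot start until this one is synchronized.
        start = synced
    return acks


print(host_ack_times_ms(3, 1.0, 2.0, semi_sync=True))
print(host_ack_times_ms(3, 1.0, 2.0, semi_sync=False))
```

In this model each semi-synchronous acknowledgement arrives one link time earlier than its synchronous counterpart, but sustained writes to a single volume are still paced by the link. That is why the mode masks distance "in the general case" rather than for heavy single-volume write streams.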
[Figure: SRDF Semi-Synchronous mode, showing steps 1 to 4 between Source and Target across the SRDF links. Target behind at most one write operation per source logical volume]
2) SRDF Adaptive Copy mode 
SRDF Adaptive Copy mode is used primarily for data migrations and data 
centre moves. This operational mode is not recommended for use when 
mirroring for disaster recovery. 
SRDF Adaptive Copy mode allows the source (R1) volumes and target (R2) 
volumes to be a few or many I/Os out of synchronization. The number of 
tracks out of synchronization (skew) is user selectable. 
There are two types of adaptive copy: Write Pending mode and Disk mode. 
The sequence of operations is: 
1. An I/O write is received from the host/server into the cache of the 
source Symmetrix 
2. The I/O is acknowledged as completed to the host/server 
3. The I/O is placed in the SRDF queue 
4. The I/O is de-staged from cache to the source (R1) volume, and an 
issue request is sent to the SRDF link 
5. The I/O is transmitted to the cache of the target 
6. A receipt acknowledgment is provided by the target back to the cache 
of the source. 
Adaptive Copy Write Pending mode allows the transmission to take place 
before the data is de-staged from cache to the R1 disk volumes. 
Adaptive Copy Disk mode de-stages the data from the cache to the R1 
volume and then keeps track-level information as to what data is owed to the 
remote side so that information can be subsequently sent a track at a time. 
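The track-level bookkeeping described for Adaptive Copy Disk mode can be pictured as a set of tracks still owed to the remote side, compared against the user-selected skew. The class below is a conceptual sketch only: its name and methods are hypothetical, and it deliberately does not model what the Symmetrix does when the skew is exceeded.

```python
class OwedTracks:
    """Tracks written on R1 that have not yet been sent to R2."""

    def __init__(self, max_skew: int) -> None:
        self.max_skew = max_skew      # user-selected number of tracks allowed out of sync
        self.owed: set[int] = set()

    def host_write(self, track: int) -> None:
        # Rewriting a track that is already owed does not increase the skew.
        self.owed.add(track)

    def track_sent(self, track: int) -> None:
        # Called once the track has been transferred and acknowledged.
        self.owed.discard(track)

    @property
    def skew(self) -> int:
        return len(self.owed)

    def within_skew(self) -> bool:
        return self.skew <= self.max_skew


tracker = OwedTracks(max_skew=2)
for t in (1, 2, 2, 3):
    tracker.host_write(t)
print(tracker.skew, tracker.within_skew())
```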
Beyond data migrations and data centre moves, SRDF Adaptive Copy mode is
also used in conjunction with SRDF over Internet Protocol (IP) links. This
mode of operation can also be used in an SRDF Multi-Hop configuration to
mirror TimeFinder Business Continuance Volumes (BCVs)/R1 changed tracks
between the intermediate target site and the final (Multi-Hop) target site.
N.B. The threshold for how far out of synchronization the volumes are
allowed to be is selectable by the user with the “skew” command.
3) SRDF Multi-Hop mode 
TimeFinder software works by configuring multiple, independently 
addressable online Business Continuance Volumes (BCVs) for information 
storage. The BCV is a Symmetrix device with special attributes created when 
the Symmetrix is configured. It can function either as an additional mirror to a 
Symmetrix logical volume or as an independent, host-addressable volume. 
Establishing BCV devices as mirror images of active production volumes 
allows you to run multiple simultaneous business continuance tasks in 
parallel. The principal device, known as the standard device, remains on line
for regular Symmetrix operation from the original production server. Each
BCV contains a unique host address, making it accessible to a separate
backup/recovery server. When you establish a BCV as a mirror of a standard
device, that relationship is known as a BCV pair. The BCV is temporarily
inaccessible to its host until you split the BCV pair.
The multi-hop restart solution is applicable when you want zero data loss in 
the event of a disaster at the local site. Zero data loss means that the state of 
the data at the Hop 2 restart site (after being propagated from the Hop 1 
bunker site) is the same as it is at the local source site at the beginning of
a rolling disaster.
Automated replication with the BCVs at Hop 2 is applicable if you want a zero
data loss solution but cannot risk the loss of both the local source site and
the Hop 1 bunker site at the same time. With this configuration, there are two
disaster restart possibilities:
- If only the local source site is lost, the result is zero data loss at the 
Hop 2 restart site. 
- If both the local source site and the Hop 1 bunker site are lost, the 
result is a DBMS restartable copy at the Hop 2 restart site with 
controlled data loss. The amount of data loss will be a function of the 
replicate copy cycle time between the Hop 1 bunker site and the Hop 2 
restart site. 
[Figure: SRDF Multi-Hop configuration showing the Local, Hop1 and Hop2 Symmetrix units with their R1, R2 and BCV devices and numbered steps 1 to 4]
SRDF Multi-Hop is another approach to the issues introduced by
distance-based latency. Here, TimeFinder is used to create a point-in-time
BCV of the production volume. SRDF Multi-Hop would then treat the BCV as an
R1 or source device. Its R2 target would be at the other end of the link.
In Multi-Hop scenarios, the links between the first location and the 
intermediate location are run synchronously. Then the TimeFinder software 
performs the splits described above. The links between the intermediate site 
and the distant site are usually Adaptive Copy mode due to the issues of 
latency. 
Multi-Hop is the best of both worlds: fully synchronous for performance 
between sites A and B but Adaptive Copy to keep line costs down between B 
and C, the disaster recovery site. 
4) Oracle8i Automated Standby Database
The automated standby database is one of the prime solutions to ensure
business continuity after a disaster. It achieves this with reduced amounts of
inter site traffic by only shipping archived redo logs. In the event of a disaster,
a standby database can take over the processing and data serving
responsibility from the primary database, providing near continuous database
availability. The Oracle8i automated standby database and SRDF provide
the means to create, and automatically maintain, one or more copies of a
Production database as protection against disasters.
A standby database is initially created by copying, or cloning, the Production
database at a remote site. Archived Redo Logs are copied by SRDF to the
remote site. Once the next archived log generated by the Primary database
arrives, the Standby database applies it in managed recovery mode.
[Figure: Primary DB on-line redo logs are archived; the archived redo logs are copied over the SRDF link and applied to the Failover DB]
Conclusion 
EMC Engineering has validated the Nortel Optera DWDM for use with EMC SRDF 
up to 200KM in a point-to-point configuration. 
For Deutsche Bank to replicate data in a Synchronous copy mode between sites, 
careful consideration must be given as to whether the nature and characteristics of the 
application are suited to a Synchronous copy mode configuration, or whether the 
application user response times will be adversely affected by the latency issues
described in this document. 
If an application or its components exhibit high I/O writes, or high transaction rates, 
then alternative SRDF replication modes should be considered to avoid these latency 
issues. 
Sansymphony v10-psp1-new-features-overviewSansymphony v10-psp1-new-features-overview
Sansymphony v10-psp1-new-features-overview
 
IRJET-An Efficient Real-Time Controller for Retrieving Multimedia Data from S...
IRJET-An Efficient Real-Time Controller for Retrieving Multimedia Data from S...IRJET-An Efficient Real-Time Controller for Retrieving Multimedia Data from S...
IRJET-An Efficient Real-Time Controller for Retrieving Multimedia Data from S...
 
Demystify Edge Computing Vs. Cloud Computing
Demystify Edge Computing Vs. Cloud ComputingDemystify Edge Computing Vs. Cloud Computing
Demystify Edge Computing Vs. Cloud Computing
 
5G Edge Computing Whitepaper, FCC Advisory Council
5G Edge Computing Whitepaper, FCC Advisory Council5G Edge Computing Whitepaper, FCC Advisory Council
5G Edge Computing Whitepaper, FCC Advisory Council
 
High Res CIO Review Article
High Res CIO Review ArticleHigh Res CIO Review Article
High Res CIO Review Article
 
Time finder
Time finderTime finder
Time finder
 
SungardASRaaS_WhitePaper_Final
SungardASRaaS_WhitePaper_FinalSungardASRaaS_WhitePaper_Final
SungardASRaaS_WhitePaper_Final
 
White Paper: DB2 and FAST VP Testing and Best Practices
White Paper: DB2 and FAST VP Testing and Best Practices   White Paper: DB2 and FAST VP Testing and Best Practices
White Paper: DB2 and FAST VP Testing and Best Practices
 
White Paper: DB2 and FAST VP Testing and Best Practices
White Paper: DB2 and FAST VP Testing and Best Practices   White Paper: DB2 and FAST VP Testing and Best Practices
White Paper: DB2 and FAST VP Testing and Best Practices
 
ComputerNetworksAssignment
ComputerNetworksAssignmentComputerNetworksAssignment
ComputerNetworksAssignment
 
IEEE 2014 NS2 Projects
IEEE 2014 NS2 ProjectsIEEE 2014 NS2 Projects
IEEE 2014 NS2 Projects
 
IEEE 2014 NS2 Projects
IEEE 2014 NS2 ProjectsIEEE 2014 NS2 Projects
IEEE 2014 NS2 Projects
 

Recently uploaded

Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一F La
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...ThinkInnovation
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 

Recently uploaded (20)

Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 

Srdf overview latency_v.52

  • 1. SRDF Topology Discussion Document for Deutsche Bank James Ridley /Jan Jedynak EMC Corporation VERSION 1.2 Page 1 of 21
  • 2. Introduction, 3
    Executive Overview, 4
    SRDF Overview, 6
    What is SRDF?, 6
    Why SRDF?, 6
    How does SRDF work?, 6
    SRDF Storage Protocol used by Deutsche Bank, 7
    Where SRDF is used at Deutsche Bank, 7
    DWDM Technology, 8
    Overview, 8
    Nortel Networks OPTera Metro, 8
    Latency, 10
    SRDF induced delays, 10
    Example, 11
    Recommendations for Handling High Activity Data, 12
    Types of applications that may not be suited to synchronous replication, 13
    SRDF best practices in use at Deutsche Bank, 14
    Alternative strategies, 15
    1) SRDF Semi-Synchronous mode, 15
    2) SRDF Adaptive Copy mode, 16
    3) SRDF Multi-Hop mode, 17
    4) Oracle8i Automated Standby Database, 19
    Conclusion, 21
  • 3. Introduction
    Recent engineering work by EMC has validated the Nortel OPTera DWDM (Dense Wavelength Division Multiplexing) unit for use with EMC SRDF at distances of up to 200 km. This document explains which applications can benefit from this extended-distance mirroring and which cannot. It also offers alternative solutions for protecting data when the application cannot support extended-distance mirroring.
  • 4. Executive Overview
    EMC Engineering has recently validated the Nortel OPTera DWDM for use with EMC SRDF at distances of up to 200 km. DWDM technology allows multiple SRDF links to be 'packaged up' onto a smaller number of physical telecoms fibre cables, reducing the number of cables required without reducing bandwidth or efficiency. This leads to greatly reduced costs. The Nortel OPTera DWDM is not the DWDM currently selected by Deutsche Bank 'New World'. It is envisaged that the 200 km distance will be increased very shortly, after further validation by EMC Engineering.
    EMC SRDF works by forwarding write IO from a host to a Symmetrix onto a second, remote Symmetrix. This is done transparently to the host, which only 'sees' a slightly slower write IO. During normal non-BCP operation, only write IO is sent to the remote Symmetrix. As a general rule of thumb, an application's IO is 90-95% reads and only 5-10% writes.
    To calculate the additional latency we add the fixed 'overhead' of writing to 2 Symmetrix units, as opposed to 1, plus a variable value that depends on the distance to be replicated. The variable value is proportional to the distance and is governed by the speed of light in the fibre, so it is not envisaged that EMC Engineering will be able to improve on it in the near future. The DWDM units and Connectrix switch units required add negligible overhead. The additional latency is incurred on every write IO. Applications with heavy write IO, or batch runs, may experience 'IO Queuing' as write IOs queue to be sent across the SRDF link to the second Symmetrix.
    Firstly, various host best practices can greatly reduce the potential for IO Queuing, and these are in use by Deutsche Bank. Secondly, SRDF can operate in a number of modes, and these modes can be switched interactively on a very granular basis. The different modes can again greatly reduce or even eliminate IO Queuing. Thirdly, SRDF in conjunction with EMC TimeFinder can be architected into various 'multi-hop' and 'time sync enabled' solutions.
  • 5. In summary, SRDF can be extended across greater distances than previously possible using DWDM technology, but this extra distance is not without cost to the efficiency of the applications using it. The service levels required and the IO profile of the application must be examined in detail to see whether it is practical to use SRDF over extended distances in synchronous mode. If the application is not suitable for synchronous mode SRDF over the distance, there are other architected solutions available which may provide the required level of protection.
  • 6. SRDF Overview
    What is SRDF?
    SRDF generates a mirror image of the data at the logical volume level in one or more remote Symmetrix systems. These remote volumes can be made addressable to remote hosts via software commands. SRDF Synchronous mode (which is the default mode of operation at Deutsche Bank) was first developed for Disaster Recovery within the customer's campus. SRDF Adaptive Copy modes were later developed to support long-distance bulk data transfers for data centre relocations and content replication. Technology has evolved to support Wide Area Networking (WAN) and multiple transports, thus increasing distance and throughput for a wider variety of SRDF applications. Additional customer uses for SRDF include remote data warehousing, remote test beds, remote report generation, remote backup, and workload sharing between hosts at the same or geographically remote sites.
    Why SRDF?
    SRDF allows companies to maintain access to data, so that revenue-producing or supporting applications continue to serve business functions. SRDF can be used in several key areas including, but not limited to:
    - Business continuance: business applications continue running despite possible disk failures.
    - Disaster recovery: data recovery at the disaster recovery site in minutes rather than days.
    - Data centre migrations: application outage reduced to minutes instead of hours.
    - Workload migrations: similar to data centre migrations; especially useful for minimizing outages during preventative maintenance of hardware or software, or even data centre powerdowns.
    - Shortening or eliminating backup windows: eliminate the backup window by utilizing SRDF's second data copy.
    How does SRDF work?
    SRDF works in 3 different modes: synchronous, semi-synchronous, and adaptive copy.
    - Synchronous. Data on the source (R1) and target (R2) volumes is always fully synchronized at the completion of an I/O sequence.
    - Semi-synchronous. Data on remotely mirrored volumes is always synchronized between the source (R1) and the target (R2) prior to initiating the next write operation to these volumes.
    - Adaptive copy. Adaptive Copy modes transfer data from the source (R1) volume to the target (R2) volume and do not wait for receipt acknowledgment and synchronization to occur.
    SRDF writes are from cache to cache. When data is written from local Symmetrix cache to remote Symmetrix cache over the SRDF link, the production Symmetrix waits for an acknowledgement from the remote Symmetrix before data is written to local disk.
  • 7. SRDF Storage Protocol used by Deutsche Bank
    SRDF at Deutsche Bank uses a storage protocol based upon either the ESCON™ or Fibre Channel FC-4 specifications to remotely mirror data between Symmetrix units. The host attachment, I/O protocol, and disk data structures required by each host are independent of the SRDF operation between Symmetrix units. All existing production implementations at Deutsche Bank use ESCON, though all future implementations, including the new datacentre at Hayes, will use Fibre Channel.
    The benefits of SRDF over Fibre Channel Point-to-Point include increased SRDF throughput for all host types and increased connectivity options for Open Systems. In addition, Fibre Channel maintains a peer-to-peer relationship, as opposed to the ESCON channel and control unit relationship used at the ESCON RA director level. This increases the flexibility of SRDF in cases where it is desired to have primary and secondary volumes located at each side of the SRDF link.
    Where SRDF is used at Deutsche Bank
    SRDF is deployed between all the major MERs in the London campus, in point-to-point configurations.
  • 8. DWDM Technology
    Overview
    Dense Wavelength Division Multiplexing (DWDM) is a process in which multiple independent channels of data are carried at different wavelengths over one pair of fiber links. This contrasts with conventional fiber optic systems, in which just one channel is carried over a fiber pair. For EMC customers this means that multiple SRDF channels and server channels can be transferred over one pair of fiber links along with traditional network traffic. This is especially important in locations where fiber links are at a premium. For example, a customer may be leasing fiber, so the more traffic they can run over a single link, the more cost effective the solution.
    With today's technology, the capacity of a single pair of fiber strands is virtually unlimited. The limitation comes from the DWDM itself: optical-to-electrical conversions for switching and channel protection are required, and these limit the input traffic per channel.
    SRDF over Fibre Channel does not currently support direct connections between RF directors using WDM or DWDM unit port connections, due to performance limitations and the relatively variable latencies of such links over long distances. DWDM units, however, are supported for SRDF traffic via ISL connections using Fibre Channel switches such as the Connectrix family.
    Nortel Networks OPTera Metro
    High capacity is inherent in the Nortel Networks OPTera Metro DWDM solution. Each wavelength can support up to 2.5Gb/s, while 32 or more such wavelengths can be multiplexed onto a single fiber. The resulting aggregate supports capacities of 80Gb/s to provide high capacity trunks between network elements.
  • 9. Nortel Networks OPTera Metro provides the ability to route wavelengths, and therefore has the same survivability capabilities as current TDM rings when deployed in a ring topology. OPTera Metro provides a reliable DWDM platform for enterprises with large-scale connectivity requirements. OPTera's transparent capabilities enable these enterprises to control the cost and management requirements of connectivity, ensure network integrity, increase network robustness, and easily accommodate emerging communications protocols.
    Features and Benefits
    • Support of SONET/SDH and non-SONET/SDH interfaces
    • Protocol and bit-rate independence
    • 32 protected wavelengths, 64 unprotected wavelengths
    • Per-wavelength flexible protection switching
    • Scalable from 16 Mbps to 2.5 Gbps per wavelength
    • Point-to-point and survivable ring up to 120km
    • In-band, per-wavelength Optical Service Channel
    • Point and click GUI management system
    • Open systems management platform
    • NEBS and ETSI compliant
    [Figure: DWDM acts as an "optical funnel", carrying 8 to 64 wavelengths on one fiber.]
    • Dense wavelength division multiplexing
      - Multiple protocol-independent streams on a single fiber-optic cable pair
      - Each wavelength represents a unique stream of data which may have a different data rate
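The aggregate capacity quoted for OPTera Metro can be checked with simple arithmetic. The sketch below assumes every wavelength runs at the 2.5 Gbps per-wavelength maximum given in the feature list; the function name is illustrative only, and the 64-wavelength figure is an extrapolation that the document does not state.

```python
def aggregate_capacity_gbps(wavelengths: int, rate_gbps: float) -> float:
    """Total capacity of one fiber when every wavelength runs at rate_gbps."""
    return wavelengths * rate_gbps


# 32 protected wavelengths at the 2.5 Gbps maximum from the feature list.
protected = aggregate_capacity_gbps(32, 2.5)
# 64 unprotected wavelengths at the same rate (extrapolation, not a quoted figure).
unprotected = aggregate_capacity_gbps(64, 2.5)

print(f"32 wavelengths: {protected:.0f} Gb/s")
print(f"64 wavelengths: {unprotected:.0f} Gb/s")
```

The 32-wavelength case gives the 80Gb/s aggregate mentioned in the OPTera Metro description.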
  • 10. Latency
    SRDF induced delays
    Synchronous or even semi-synchronous mirroring of data can impact customer workloads. The impact on any given workload will vary according to:
    - The blocksize of the data being remote mirrored
    - The distance over which the remote mirroring is being done
    - The remote mirroring mode used (e.g. synchronous, semi-synchronous, adaptive copy)
    - The type of connection between the source and target Symmetrix units
    - The arrival rate of the write IOs at the source Symmetrix
    The degree to which a customer workload is impacted by delays induced by SRDF mirroring will vary not only according to the amount of the delay, but also according to the nature of the workload. Some workloads will not be impacted by extended response times on workload components that are critical for recovery. Other workloads could be severely impacted if the affected component is on the critical path for end user transaction response time (e.g. an increase in response time to the online Redo logs in an Oracle environment will invariably cause end user transaction response time to degrade).
    In order to approximate the amount of delay likely to be introduced by SRDF'ing the data for any given workload, one should:
    - Determine the type of SRDF implementation that is likely to be installed.
    - Calculate the propagation delay induced by the link: multiply the round trip link distance in kilometres by 0.005 msec/km, and then by 3 if campus ESCON is to be used, by 1 if a telco link (e.g. T3, ATM, etc.) is to be used, or by 2 for SRDF over Fibre Channel. To this it will be necessary to add an allowance for protocol time within both the source and target Symmetrix, as well as allowances for delays induced by protocol converters, network equipment, etc.
    - Add the approximated SRDF link delay times to the current or anticipated non-SRDF'ed IO response times.
    - Determine the likely impact on the customer workload, remembering that the impact will inevitably follow Little's Law [1].
    [1] Little's Law is the basis upon which a lot of queuing theory is built. In general terms, Little's Law relates the average queue length (Q) to the arrival rate of transactions (a) and the average response time (R). Specifically, Little's Law states: Q = a * R. Consequently, any increase in IO response time may well cause a significant blowout in the queue length within the application, which may or may not be supportable from a customer business perspective.
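The estimation steps above can be sketched in a few lines of Python. The 0.005 msec/km figure and the link-type multipliers (3, 1, 2) come from the text; the function names, the dictionary keys, and the sample workload numbers are assumptions made for illustration. Protocol time inside the Symmetrix units and delays in converters or network equipment are deliberately left out.

```python
MS_PER_KM = 0.005  # propagation delay per kilometre of link, from the text

# Multipliers quoted in the text for each link type (key names are illustrative).
LINK_MULTIPLIER = {"campus_escon": 3, "telco": 1, "fibre_channel": 2}


def propagation_delay_ms(one_way_km: float, link_type: str) -> float:
    """Round trip distance x 0.005 msec/km x link-type multiplier."""
    round_trip_km = 2 * one_way_km
    return round_trip_km * MS_PER_KM * LINK_MULTIPLIER[link_type]


def queue_length(arrival_rate_per_sec: float, response_time_ms: float) -> float:
    """Little's Law, Q = a * R, with R converted from milliseconds to seconds."""
    return arrival_rate_per_sec * response_time_ms / 1000.0


# Hypothetical workload: 100 write IOs/sec against a 100 km Fibre Channel link,
# with an assumed 2.0 ms response time before any distance is added.
delay = propagation_delay_ms(100, "fibre_channel")
print(f"propagation delay: {delay:.1f} ms")
print(f"queue without distance: {queue_length(100, 2.0):.2f}")
print(f"queue with distance:    {queue_length(100, 2.0 + delay):.2f}")
```

Because Q scales linearly with R, doubling the response time doubles the average number of outstanding IOs at the same arrival rate.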
  • 11. Example
    This document concentrates on SRDF over Fibre Channel. Write IO is transmitted using SCSI over Fibre Channel, and under the SCSI protocol every IO to be transmitted requires 2 round trips. The first round trip carries the SCSI command word (for SRDF this will be WRITE) and the acknowledgement returned by the remote Symmetrix. The second round trip carries the actual data, followed by the acknowledgement from the remote Symmetrix that the data has been written to cache and confirmed. This is the reason for the x2 multiplier applied to the propagation delay for SRDF over Fibre Channel.
    [Figure: host response time without SRDF (Baseline) and with SRDF over zero distance (Campus), for 4K and 27K blocksizes. Values shown: 2.1 ms, 3.9 ms.]
    The picture illustrates the host response time without SRDF (Baseline), and the overhead of running SRDF over zero distance (Campus), for 4K and 27K blocksizes.
    Working through a 4K blocksize example, we have a 2.0 ms host response time for zero distance. Add to this a 100 km distance, the approximate distance from London to Milton Keynes: (100 km + 100 km + 100 km + 100 km) * 0.005 ms/km = 2.0 ms, giving a total of 4.0 ms response time per write IO.
    Heavy write activity on 1 volume may mean that IOs are queued waiting for the previous IO to be acknowledged by the remote Symmetrix, and so you may get IO elongation, with IOs waiting on IOs waiting on IOs (see Little's Law above).
    Note: there is no significant latency through switches or DWDMs.
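The worked example can be written out as a short derivation. The symbols are notation introduced here for illustration and do not appear in the original: d is the one-way distance in km, T_0 is the zero-distance host response time, and T_prop is the added propagation delay for the four one-way trips of a SCSI write over Fibre Channel.

```latex
\begin{align*}
T_{\text{prop}}  &= 4\,d \times 0.005\ \text{ms/km}
                  = 4 \times 100 \times 0.005
                  = 2.0\ \text{ms} \\
T_{\text{write}} &= T_{0} + T_{\text{prop}}
                  = 2.0 + 2.0
                  = 4.0\ \text{ms}
\end{align*}
```

Under the same assumptions, each additional 100 km of one-way distance adds a further 2.0 ms to every write IO.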
  • 12. Recommendations for Handling High Activity Data
    As a general rule of thumb, and depending on the nature of the application being supported, the distance over which the data is to be remote mirrored, etc., it is desirable that no single logical volume involved in a remote mirroring relationship be required to handle more than 100 write IOs/sec at 200 km, in order to ensure acceptable overall IO response times. This figure is derived from the maximum number of IOs that a logical volume can sustain at that distance (4K blocksize: max 175 write IOs per second; 27K blocksize: max 125 write IOs per second). It must be remembered that only 1 IO for any given volume can be in the SRDF 'pipe' at a time, though multiple IOs for different volumes can be in the 'pipe' at the same time.
    In order to reduce the IO rate to any given logical volume to this sort of level, it may be necessary to implement some of the following:
    - Wherever possible, high activity data should be spread over as many logical volumes as possible, so as to reduce the overall IO rate per volume, i.e. host level striping.
    - If possible, increase host level buffering and blocksizes so as to reduce the number of IOs done by the application.
    - When dealing with high activity IO caused by large, single address space tasks (e.g. database control regions), it may be necessary to break the tasks into multiple smaller tasks, so as to reduce the amount of data generated on a per region basis to more manageable levels. This is a non-trivial task, as it may have significant impact on the customer's application architecture, and will require significant involvement from customer personnel such as Data Base Analysts.
    - If necessary, re-design the application so as to achieve the desired IO rate on a per volume basis.
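Because only one write per volume can be in the SRDF pipe at a time, a volume's write rate is capped at roughly one write per response time. The sketch below is a simplified model, not the method used to obtain the document's figures: it assumes the 2.0 ms zero-distance response time from the 4K example and the Fibre Channel propagation rule, and all names are illustrative.

```python
MS_PER_KM = 0.005        # propagation delay per km, from the text
FC_ROUND_TRIPS = 2       # SCSI over Fibre Channel needs 2 round trips per write


def write_response_ms(base_ms: float, one_way_km: float) -> float:
    """Zero-distance response time plus Fibre Channel propagation delay."""
    return base_ms + 2 * one_way_km * MS_PER_KM * FC_ROUND_TRIPS


def max_serial_write_iops(response_ms: float) -> float:
    """Ceiling on writes/sec when each write must finish before the next starts."""
    return 1000.0 / response_ms


response = write_response_ms(base_ms=2.0, one_way_km=200)
print(f"response time at 200 km: {response:.1f} ms")
print(f"serial ceiling per volume: {max_serial_write_iops(response):.0f} writes/sec")
```

This simple model gives a ceiling of the same order as the maximum quoted for a 4K blocksize at 200 km; the recommended limit of 100 write IOs/sec sits comfortably below it.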
  • 13. Types of applications that may not be suited to synchronous replication
    1) Database applications which exhibit very high transaction throughput and therefore a high number of log writes.
    2) Database applications that have a high transaction rate and perform an excessive number of consistency point operations (perhaps as a result of frequent log switch operations).
    3) Applications which exhibit high volumes of write I/O.
    4) Applications that are highly sensitive to synchronous write I/O performance (non-buffered synchronous writes).
    5) Any highly time-bound, write-intensive application process where any elongation of write I/O would impact application performance.
  • 14. SRDF best practices in use at Deutsche Bank
    Various best practices can reduce the impact of IO queuing and IO elongation. The simplest is to make sure that all filesystems are built on host level striped volumes. The reason is that the SRDF 'pipe' or queue can only have 1 IO for a given Symmetrix volume going across it at any time. The pipe can contain more than 1 IO, but not for the same Symmetrix volume.

    Creating a striped volume set at the host level has 2 immediate effects when the host writes an IO. If we were to write IOs to a striped filesystem spread over 4 Symmetrix volumes, the 2 benefits would be:
    1) The host knows it is writing to a striped set and issues more IOs to the disk subsystem, as it knows it is actually writing to 4 volumes.
    2) More IOs can go across the SRDF 'pipe' to the remote Symmetrix, as the IOs are to 4 Symmetrix volumes rather than just 1. This reduces queuing for the pipe.

    Host level LVM striping is being used as a best practice by nearly all projects based on EMC Symmetrix.
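The effect of striping on queuing can be illustrated with a deliberately simple model. It assumes that writes are spread evenly (round-robin) over the volumes, that each volume sends its writes across the link one at a time, and that the link itself is not the bottleneck. The function name and the 4.0 ms response time are illustrative assumptions.

```python
import math


def drain_time_ms(num_writes, num_volumes, response_ms):
    """Time to complete a burst of writes when each volume can have only
    one write in the SRDF pipe at a time and volumes proceed in parallel."""
    writes_per_volume = math.ceil(num_writes / num_volumes)
    return writes_per_volume * response_ms


# A burst of 100 writes at 4.0 ms per synchronous write.
print(drain_time_ms(100, 1, 4.0))  # single volume
print(drain_time_ms(100, 4, 4.0))  # striped over 4 volumes
```

In this model the burst completes four times faster on the 4-volume stripe, because the per-volume queues are a quarter of the length.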
  • 15. Alternative strategies
    The latency overhead can also be masked from the user if an alternative replication strategy is adopted, namely Semi-Synchronous or Multi-Hop replication. Another strategy would be to combine the benefits of SRDF with an Oracle automated standby database. This solution requires only that the online redo logs be synchronously replicated, thus drastically reducing communication needs. The following strategies could help alleviate latency overhead with SRDF deployed over extended distances.

    1) SRDF Semi-Synchronous mode
    This is used primarily in extended distance environments. In this mode of operation, data on the remotely mirrored volumes is always synchronized between the source (R1) volume and the target (R2) volume prior to initiating the next write operation to these volumes. The sequence of operations is:
    1. An I/O write is received from the host/server into the cache of the source.
    2. An ending status is presented to the host/server.
    3. The I/O is transmitted to the cache of the target.
    4. A receipt acknowledgment is provided by the target back to the cache of the source.

    Semi-Synchronous mode masks the impact of distance in the general case, because it allows read operations while write operations are in transit. SRDF uses a first-in, first-out queue.

    [Figure: SRDF Semi-Synchronous mode. Source and target connected by SRDF links, steps 1 to 4. Target behind at most one write operation per source logical volume.]
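The Semi-Synchronous sequence can be sketched as a small timing model for a single volume. It assumes, as described above, that the host receives ending status as soon as the write is in the source cache, and that the next write to the same volume cannot start until the target has acknowledged the previous one. The function name and the `local_ms` and `link_ms` values are hypothetical, chosen only for illustration.

```python
def semi_sync_host_latencies(arrivals_ms, local_ms, link_ms):
    """Host-visible latency of each write to ONE volume in a simplified
    semi-synchronous model.

    arrivals_ms -- times at which the host issues each write (ascending)
    local_ms    -- time to accept the write into source cache (steps 1-2)
    link_ms     -- time to transmit to the target and get the ack (steps 3-4)
    """
    previous_remote_done = 0.0
    latencies = []
    for issued_at in arrivals_ms:
        # A new write must wait until the previous one is acknowledged remotely.
        start = max(issued_at, previous_remote_done)
        host_ack = start + local_ms
        previous_remote_done = host_ack + link_ms
        latencies.append(host_ack - issued_at)
    return latencies


# Two well-spaced writes see only the local latency; the third arrives
# while the second is still in transit and has to wait for it.
print(semi_sync_host_latencies([0.0, 10.0, 11.0], local_ms=2.0, link_ms=2.0))
```

The model shows why the mode masks distance only in the general case: back-to-back writes to the same volume still pay for the link time of the write ahead of them.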
  • 16. 2) SRDF Adaptive Copy mode
    SRDF Adaptive Copy mode is used primarily for data migrations and data centre moves, and in conjunction with SRDF over Internet Protocol (IP) links. This operational mode is not recommended for use when mirroring for disaster recovery. SRDF Adaptive Copy mode allows the source (R1) volumes and target (R2) volumes to be a few or many I/Os out of synchronization. The number of tracks out of synchronization (skew) is user selectable. There are two types of adaptive copy: Write Pending mode and Disk mode. The sequence of operations is:
    1. An I/O write is received from the host/server into the cache of the source Symmetrix.
    2. The I/O is acknowledged as completed to the host/server.
    3. The I/O is placed in the SRDF queue.
    4. The I/O is de-staged from cache to the source (R1) volume, and an issue request is sent to the SRDF link.
    5. The I/O is transmitted to the cache of the target.
    6. A receipt acknowledgment is provided by the target back to the cache of the source.

    Adaptive Copy Write Pending mode allows the transmission to take place before the data is de-staged from cache to the R1 disk volumes. Adaptive Copy Disk mode de-stages the data from the cache to the R1 volume and then keeps track-level information about what data is owed to the remote side, so that the data can subsequently be sent a track at a time.

    This mode of operation can also be used in an SRDF Multi-Hop configuration to mirror TimeFinder Business Continuance Volume (BCV)/R1 changed tracks between the intermediate target site and the final (Multi-Hop) target site.

    N.B. The threshold for how far out of synchronization the volumes are allowed to be is selectable by the user with the “skew” command.
  • 17. 3) SRDF Multi-Hop mode
    TimeFinder software works by configuring multiple, independently addressable online Business Continuance Volumes (BCVs) for information storage. The BCV is a Symmetrix device with special attributes created when the Symmetrix is configured. It can function either as an additional mirror to a Symmetrix logical volume or as an independent, host-addressable volume. Establishing BCV devices as mirror images of active production volumes allows you to run multiple business continuance tasks in parallel.

    The principal device, known as the standard device, remains online for regular Symmetrix operation from the original production server. Each BCV has a unique host address, making it accessible to a separate backup/recovery server. When you establish a BCV as a mirror of a standard device, that relationship is known as a BCV pair. The BCV is temporarily inaccessible to its host until you split the BCV pair.

    The multi-hop restart solution is applicable when you want zero data loss in the event of a disaster at the local site. Zero data loss means that the state of the data at the Hop 2 restart site (after being propagated from the Hop 1 bunker site) is the same as it is at the local source site at the beginning of a rolling disaster.

    Automated replication with the BCVs at Hop 2 is applicable if you want a zero data loss solution but cannot risk the loss of both the local source site and the Hop 1 bunker site at the same time. With this configuration, there are two disaster restart possibilities:
    - If only the local source site is lost, the result is zero data loss at the Hop 2 restart site.
    - If both the local source site and the Hop 1 bunker site are lost, the result is a DBMS restartable copy at the Hop 2 restart site with controlled data loss. The amount of data loss will be a function of the replicate copy cycle time between the Hop 1 bunker site and the Hop 2 restart site.
  • 18. [Figure: SRDF Multi-Hop configuration showing three EMC Symmetrix systems (Local, Hop 1, Hop 2) with R1, R2 and BCV devices; steps 1 to 4.]
    SRDF Multi-Hop is another approach to the issues introduced by distance-based latency. Here, TimeFinder is used to create a point-in-time BCV of the production volume. SRDF Multi-Hop then treats the BCV as an R1 or source device. Its R2 target is at the other end of the link.

    In Multi-Hop scenarios, the links between the first location and the intermediate location are run synchronously. The TimeFinder software then performs the splits described above. The links between the intermediate site and the distant site usually run in Adaptive Copy mode because of the latency issues.

    Multi-Hop is the best of both worlds: fully synchronous for performance between sites A and B, but Adaptive Copy to keep line costs down between B and C, the disaster recovery site.
  • 19. 4) Oracle8i Automated Standby Database
    The automated standby database is one of the prime solutions for ensuring business continuity after a disaster. It achieves this with a reduced amount of inter-site traffic by shipping only archived redo logs. In the event of a disaster, a standby database can take over the processing and data serving responsibility from the primary database, providing near-continuous database availability.

    The Oracle8i automated standby database and SRDF provide the means to create and automatically maintain one or more copies of a production database as protection against disasters. A standby database is initially created by copying, or cloning, the production database at a remote site. Archived redo logs are copied by SRDF to the remote site. The standby database is able to begin managed recovery when the next archived log generated by the primary database is applied in managed recovery mode.

    [Figure: Primary DB and Failover DB. On-line redo logs and archived redo logs at the primary; archived redo logs copied over the SRDF link and applied at the failover database.]
  • 20. Conclusion
    EMC Engineering has validated the Nortel OPTera DWDM for use with EMC SRDF up to 200 km in a point-to-point configuration.

    For Deutsche Bank to replicate data in a Synchronous copy mode between sites, careful consideration must be given to whether the nature and characteristics of the application are suited to a Synchronous copy mode configuration, or whether the application user response times will be adversely affected by the latency issues described in this document.

    If an application or its components exhibit high write I/O rates or high transaction rates, then alternative SRDF replication modes should be considered to avoid these latency issues.