Network infrastructures are speeding up and your business needs to keep pace. You will not be able to test your network's high-speed performance if you keep hitting traffic jams. Access this white paper to learn how to reduce the limits on your infrastructure to increase overall performance and improve the quality of your storage infrastructure.
1.516.427.5210 | info@cdsi.us.com | www.cdsi.us.com
Keep Your Business
Running in the Fast Lane
Business today depends upon its network infrastructure. Network availability
and performance can have a meaningful impact on a business’s success,
and businesses know it. Underscoring the importance of information
technology (IT) to business, industry analyst firm Gartner expects
worldwide IT spending to reach over $3.5 trillion in 2017, up 2.7 percent from
2016. With that kind of investment, optimizing resources is critical.
In today’s economy, one of the most important aspects of business
operations is management of the IT infrastructure. Ensuring applications
run with optimal performance is a crucial task, but fast, dependable
performance doesn’t necessarily come easily. Even with dedicated host
servers, all-flash storage, lots of dynamic random-access memory (DRAM)
and storage host bus adapter (HBA) ports, the fastest available switches,
and essentially infinite funding, there can be other reasons for a lack of
performance, such as access contention. Imagine you’re driving a premium
sports car on a 10-lane super highway with no speed limits. You aren’t going
to be able to test its high-speed performance if you keep hitting traffic jams.
When production storage must be migrated while the production
workload is still active (what is generally called “online storage migration”),
traffic jams on the storage networking highway will happen, inevitably
impacting application performance. These input/output (I/O) traffic
slowdowns occur because the client hosts’ application processes
are writing to and reading from the disks of the production storage
at the same time as the migration process performs heavy reads from the
same disks. This adds a significant amount of input/output operations
per second (IOPS) to the workload of the storage controller, consumes and
disrupts the limited amount of cache, and increases the randomness of
access. Simply put, an online storage migration process can consume a
large amount of available storage bandwidth, much like rush hour
traffic that can paralyze a 10-lane highway. Unfortunately, host-based
migration tools such as built-in logical volume manager (LVM) mirroring,
third-party disaster recovery (DR) tools, and VMware’s Storage vMotion can
all have a significant, negative impact on application storage performance.
This is especially true when these tools are used to perform large-scale
migrations of 100 terabytes (TB) or more.
The performance of the storage, as experienced by the host application, is
called the Quality of Service (QoS) of the storage. It is measured either in
IOPS, for small blocks of data such as database transactions, or in
megabytes per second (MB/s), for large blocks such as streaming video. On a
well-tuned system where IOPS and MB/s are optimized for a particular
application, adding a lot of reads on the production storage during
migration can push the storage beyond its limit, resulting in a severe drop
in the IOPS and/or MB/s available to the host application. This leads to
traffic jams on the storage path, and the end result is a frustrating user
experience.
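The two QoS units are related through the I/O block size, which is why small-block workloads are usually quoted in IOPS and large-block workloads in MB/s. The short Python sketch below illustrates the conversion. The function name and the workload figures are illustrative assumptions, not measurements from this paper.

```python
def throughput_mb_per_s(iops: float, block_size_kb: float) -> float:
    """Convert an IOPS figure to MB/s for a given block size (1 MB = 1000 KB)."""
    return iops * block_size_kb / 1000.0


# Hypothetical workloads, chosen only to show the arithmetic.
small_block = throughput_mb_per_s(iops=20_000, block_size_kb=8)    # database-style I/O
large_block = throughput_mb_per_s(iops=400, block_size_kb=1000)    # streaming-style I/O

print(f"8 KB blocks at 20,000 IOPS: {small_block} MB/s")
print(f"1 MB blocks at 400 IOPS:    {large_block} MB/s")
```

In this sketch the small-block workload issues 50 times as many operations per second yet moves less data, so a migration that adds large sequential reads can crowd out either kind of traffic.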
It can seem that guaranteed application storage QoS and the fast, effective
migration of online storage are mutually exclusive goals. Artificially
throttling down the migration process can help with the storage QoS,
but also reduces the migration performance and slows the overall data
migration. With these competing priorities in mind, Cirrus Data Solutions’
Data Migration Server (DMS) was designed to deliver QoS during a 24x7
online data migration. This intelligent QoS (iQoS) mechanism provides the
best of both worlds: guaranteed application QoS for storage, as well as the
ability to maximize storage bandwidth utilization by the migration process.
In an attempt to preserve the QoS, businesses sometimes introduce an
arbitrary limit on the amount of migration traffic. This is NOT iQoS. While
setting a limit on the maximum amount of migration MB/s (or IOPS) can
alleviate impact to storage QoS, it also creates new problems. Referring
back to our car analogy, it is similar to the highway entrance ramp traffic
light that is set to allow only one car to enter the highway every 10 seconds.
There are multiple problems with this rate limiting approach. How do you
determine the appropriate rate limit? What if you want to change the rate
during the migration? To really protect the client host application QoS,
the limit must be set extremely low – the equivalent of the traffic light only
allowing one car a minute onto the highway. This very conservative rate
limit creates another problem – the never-ending data migration.
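To see how quickly a conservative rate limit stretches a project, the rough arithmetic can be sketched in a few lines of Python. The 100 TB data size matches the large-scale migrations mentioned earlier. The 20 MB/s limit is a hypothetical "extremely low" setting, the function name is ours, and the calculation assumes the limit is sustained around the clock.

```python
SECONDS_PER_DAY = 86_400


def migration_days(data_tb: float, rate_mb_per_s: float) -> float:
    """Days needed to move data_tb terabytes at a constant rate (1 TB = 1,000,000 MB)."""
    seconds = data_tb * 1_000_000 / rate_mb_per_s
    return seconds / SECONDS_PER_DAY


for rate in (1000, 200, 20):  # MB/s; 20 MB/s is a hypothetical very conservative limit
    print(f"100 TB at {rate:>4} MB/s: {migration_days(100, rate):.1f} days")
```

Under these assumptions, each fivefold reduction in the rate limit multiplies the migration time by five, with no regard for whether the production disks are actually busy.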
On the other hand, if the limit is set high (enabling a faster migration),
the application could be severely impacted, which will result in angry calls
from the application manager. In fact, this is one of the biggest headaches
for storage migration professionals: once you get a call to pause the
migration due to unacceptable impact on production, you may not be able
to track down the application owner again to get permission to resume the
migration, and now your project is in a state of limbo with no end date
in sight. Additionally, many host-based migration products require
you to restart a migration session from the beginning if it is paused. CDS
DMS allows you to pause a migration session for any reason and resume
the session from the point where it was paused. There is no need to restart
the migration session from the beginning, which eliminates false starts and
a significant amount of wasted time.
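The general idea behind resuming from the point of pause is a checkpoint that records how far the copy has progressed. The Python sketch below illustrates that idea on an in-memory buffer. It is not a description of how DMS is implemented: the class, its fields, and the block size are hypothetical, and a real online migration would also have to track blocks that the application changes while the session is paused.

```python
from dataclasses import dataclass


@dataclass
class MigrationSession:
    """Toy block copier that remembers its position between runs (hypothetical)."""
    source: bytes
    block_size: int = 4
    next_offset: int = 0   # checkpoint: first byte not yet copied
    paused: bool = False

    def __post_init__(self) -> None:
        self.dest = bytearray(len(self.source))

    def run(self, max_blocks=None) -> int:
        """Copy from the checkpoint onward; pause after max_blocks if given."""
        self.paused = False
        copied = 0
        while self.next_offset < len(self.source):
            if max_blocks is not None and copied >= max_blocks:
                self.paused = True   # stop here and keep the checkpoint
                break
            end = min(self.next_offset + self.block_size, len(self.source))
            self.dest[self.next_offset:end] = self.source[self.next_offset:end]
            self.next_offset = end
            copied += 1
        return copied


session = MigrationSession(source=b"abcdefghij")
session.run(max_blocks=1)            # copy one block, then pause
blocks_after_resume = session.run()  # resume from the checkpoint, not from zero
```

Because the second call starts at `next_offset`, only the remaining blocks are copied. A tool without such a checkpoint would have to copy every block again.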
With these considerations in mind, those responsible for migration tend
to set very low limits on data migration throughput to avoid the negative
impact on the applications’ operating performance. This explains why most
migrations are conducted at only a fraction of the maximum possible
rate and usually during off-peak hours such as nights and weekends.
Unfortunately, this also means even when there are blocks of time where
production applications are hardly accessing the disks, the migration
process is still running slowly due to throttling, resulting in unnecessarily
prolonged migration projects and additional overtime labor costs.
Cirrus Data Solutions’ iQoS is designed to eliminate the above dilemma.
Rather than using a rate-based limit, iQoS from CDS provides the
equivalent of “automatic pause and resume” capability, based on the actual
I/O conditions of each disk being migrated. DMS monitors the read or
write commands that are queued up on each disk, pending execution by
the storage controller. The number of outstanding commands provides an
accurate calculation of the specific disk’s activity level or “busy-ness.” Based
on this calculation, an “intelligent” limit is set on the activity level, above
which the disk is considered “busy.”
CDS’s iQoS algorithm also determines, within a measurement window,
what percentage of time the disk is busy. With this data, it is now possible
for the migration process to define how much impact is acceptable to the
application storage traffic, enabling the business to maintain true “Quality
of Service” within the production environment. When the migration
process uses a low impact setting (for example, an impact setting value of
5 percent), the iQoS algorithm will yield to the application storage
traffic even if the “busy” percentage is low. On the other hand, if there is a
pressing need to complete the migration quickly and the application owner
agrees ahead of time, the impact setting can be set to 95 percent. In this
scenario, the migration will continue as long as the percentage of time
that the disk is “busy” stays below 95 percent. When DMS is set in this
aggressive mode, it will migrate between 8 TB and 12 TB per hour.
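The decision described above can be sketched in Python as follows. This is a minimal illustration based only on the description in this paper, not the actual DMS algorithm. The function names, the sampled queue depths, and the queue-depth threshold of 4 are assumptions chosen for the example.

```python
from typing import Sequence


def busy_percentage(queue_depth_samples: Sequence[int], busy_queue_depth: int) -> float:
    """Percentage of samples in the window where outstanding commands exceed the limit."""
    if not queue_depth_samples:
        return 0.0
    busy = sum(1 for depth in queue_depth_samples if depth > busy_queue_depth)
    return 100.0 * busy / len(queue_depth_samples)


def should_migrate(queue_depth_samples: Sequence[int],
                   busy_queue_depth: int,
                   impact_setting_pct: float) -> bool:
    """Keep migrating only while the disk's busy percentage is below the impact setting."""
    return busy_percentage(queue_depth_samples, busy_queue_depth) < impact_setting_pct


# Hypothetical measurement window: outstanding commands sampled on one disk.
window = [0, 1, 6, 8, 2, 0, 0, 9, 1, 0]

print(busy_percentage(window, busy_queue_depth=4))                         # busy 30% of the window
print(should_migrate(window, busy_queue_depth=4, impact_setting_pct=5))    # low impact: yield
print(should_migrate(window, busy_queue_depth=4, impact_setting_pct=95))   # aggressive: continue
```

With the same measurements, the 5 percent setting pauses the migration while the 95 percent setting lets it continue, which is the trade-off the impact setting is meant to expose.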
What’s the difference between iQoS and rate-based QoS? Let’s look at one
straightforward example.
• Rate-based QoS methodology:
o Maximum Migration Rate = 200MB/s
o Potential Migration Rate = 1000MB/s
o Total Usage = 20 percent
• iQoS methodology:
o Maximum Migration Rate = 1000MB/s
o Activity Threshold = 5 percent
In the rate-based QoS model, regardless of the business activity level
the data migration will never exceed 200MB/s. Only 20 percent of the
available storage bandwidth will ever be used, including those periods of
time that the production disks (the source) are totally idle with zero I/O
from the application.
Harnessing the CDS iQoS functionality, when the production disks are idle,
data migration will increase to 1000MB/s. If activity increases, the data
migration will drop back to a very low impact level of 5 percent.
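A rough calculation shows how the two approaches compare over one day. The Python sketch below uses the 200MB/s and 1000MB/s figures from the example. The rest is assumed for illustration: production disks that are busy from 9:00 to 17:00 and idle otherwise, and an iQoS-style migration that yields completely during busy hours (a simplification of the behavior described above).

```python
MAX_RATE_MB_S = 1000    # potential migration rate from the example
RATE_LIMIT_MB_S = 200   # rate-based cap from the example
SECONDS_PER_HOUR = 3600

# Hypothetical day: production disks are busy 9:00-17:00 and idle otherwise.
host_busy_by_hour = [9 <= hour < 17 for hour in range(24)]


def rate_based_mb(busy_flags) -> int:
    """Fixed cap: the same rate every hour, whether or not the host is busy."""
    return sum(RATE_LIMIT_MB_S * SECONDS_PER_HOUR for _ in busy_flags)


def iqos_style_mb(busy_flags) -> int:
    """Simplified yielding: full rate when idle, nothing while the host is busy."""
    return sum(0 if busy else MAX_RATE_MB_S * SECONDS_PER_HOUR for busy in busy_flags)


rate_based_tb = rate_based_mb(host_busy_by_hour) / 1_000_000
iqos_style_tb = iqos_style_mb(host_busy_by_hour) / 1_000_000

print(f"Rate-based: {rate_based_tb:.2f} TB per day")
print(f"iQoS-style: {iqos_style_tb:.2f} TB per day")
```

Under these assumptions the yielding approach moves more than three times as much data per day while sending no migration traffic at all during business hours.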
DMS iQoS monitors the number of commands outstanding on
each of the LUNs being migrated and uses this information to gauge
impact. For easy comparison, let’s assume that a “Minimum Impact”
setting on iQoS translates to a threshold of approximately 200MB/s on
average. The dramatically better use of available bandwidth for migration
is shown in the graphs below.
[Figure: Rate-based Throttling Migration Rate. Y-axis: MB/s (0 to 1200); x-axis: time of day over 24 hours (12:00 to 12:00). Series: Host IO, Migration IO.]
[Figure: DMS iQoS Migration Rate. Y-axis: MB/s (0 to 1200); x-axis: time of day over 24 hours (12:00 to 12:00). Series: Host IO, DMS Yielding, DMS IO.]
The improvements to the data migration with iQoS are unquestionable.
iQoS enables the business to better utilize the available storage bandwidth
for migration, while at the same time assuring the precise amount of QoS
for the application. Without iQoS, the steady rate limit results in a prolonged
migration time and still does not totally eliminate impact to production.
With iQoS, the production I/O is protected. iQoS yields to the production
I/O when it arrives, yet takes full advantage of the periods of time where
production I/O is low for full speed migration. Everybody is happy.
The iQoS feature of DMS takes intelligence to the next level. This feature
provides three adjustable migration impact settings of “low,” “moderate,”
and “aggressive” for each set of disks. In addition to these three adjustable
migration modes, iQoS also allows you to establish different impact
settings for different dates and time periods via an iQoS calendar. The
calendar is provided because in the real world, an application owner’s
tolerance for impact is different at different times of the day, as well as for
different days of the week, month, quarter end, and year end.
For example, a business with traditional 9am – 5pm, Monday – Friday hours
might configure a reasonable impact setting as follows:
• Monday-Thursday:
o Low Impact migration from 9:00am – 5:00pm;
o Aggressive migration from 5:00pm – 8:00am
• Saturday-Sunday and Holidays:
o Moderate migration from 12:00am – 12:00pm;
o Aggressive migration from 12:01pm – 11:59pm
• Fridays and day-before-holidays:
o Low Impact migration from 9:00am – 3:00pm;
o Aggressive migration from 3:01pm – 8:00am
In this example, there is heavy activity on the production storage during
business hours (9-5) on normal workdays (Monday to Thursday). At these
times, migration should proceed in the Low Impact mode where the iQoS
is set by default to yield whenever the storage is more than 5 percent busy.
For weekends or holidays, migration is set to moderate during the first half
of the day and set to aggressive during the second half of the day. These
recommended settings will accommodate backup jobs running during
the slower business activity level and then maximize the time when the
network is quiet. The business is even able to define its own holiday or
special days in a custom calendar.
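A calendar like the one in this example can be expressed as a small lookup function. The Python sketch below is illustrative only and does not reflect the DMS configuration interface. The function and parameter names are hypothetical. Where the example leaves a gap (8:00am to 9:00am on workdays), the sketch assumes the conservative "low" mode, and each day is evaluated by its own rule.

```python
from datetime import date, datetime, time


def impact_mode(when: datetime,
                holidays: frozenset = frozenset(),
                early_close_days: frozenset = frozenset()) -> str:
    """Return 'low', 'moderate', or 'aggressive' for the given date and time."""
    day, t = when.date(), when.time()

    # Weekends (Saturday=5, Sunday=6) and holidays: moderate in the morning,
    # aggressive in the afternoon and evening.
    if day in holidays or when.weekday() >= 5:
        return "moderate" if t <= time(12, 0) else "aggressive"

    # Fridays and days before a holiday: the low-impact window ends at 3:00pm
    # instead of 5:00pm.
    early = when.weekday() == 4 or day in early_close_days
    end_low = time(15, 0) if early else time(17, 0)

    # Assumption: 8:00am-9:00am is not covered by the example, so treat it as low.
    if time(8, 0) <= t < end_low:
        return "low"
    return "aggressive"


print(impact_mode(datetime(2017, 6, 5, 10, 0)))   # Monday morning
print(impact_mode(datetime(2017, 6, 9, 16, 0)))   # Friday afternoon
print(impact_mode(datetime(2017, 6, 10, 9, 0)))   # Saturday morning
```

Holidays and early-closing days are passed in as sets of dates, which mirrors the idea of a custom calendar agreed with each application owner ahead of time.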
For days before a holiday – like New Year’s Eve or Fridays before a holiday
weekend – businesses often have employees leave work early, which results
in a much quieter network activity level. The migration mode can then be
set to aggressive starting at 3PM (1500H) instead of 5PM. The three modes
of migration, combined with a customizable calendar, make it possible to
negotiate with each of the application owners ahead of time to define the
impact control based on the calendar and their level of business activity.
Once the settings are confirmed, the migration server will know exactly
how aggressively it can migrate data at different times and on different
days. With iQoS, you will never get an angry phone call because the
storage QoS is being brought to its knees by migration traffic.
Having an intelligent mechanism to guarantee QoS for applications is a
good thing. It’s like implementing a sensor on the highway entrance ramp
that enables better control of onramp traffic, thereby ensuring better
utilization of the highway bandwidth.
Additionally, if there were a way to redirect all the rush-hour (i.e.,
migration) traffic onto additional reserved lanes so that the extra traffic
volume by-passes the highway system, that would be even
better. At some busy highway intersections, tunnels, and bridges, one
or two lanes from the opposite traffic direction are often repurposed to
allow for extra traffic in the congested direction. In the IT world, the rough
equivalent of adding extra lanes is “offloaded copying.” A good example is
VMware’s Storage vMotion.
For a local Storage vMotion, where the source and destination storage
LUNs (logical disks) are both on the same physical storage frame (i.e.,
managed by the same storage controller), vMotion makes use of XCOPY so
that the storage controller will perform the block copy from source LUN to
destination LUN. When this is the case, the source LUN blocks are read into
the storage controller’s memory and then written to the destination LUN,
eliminating the need for a massive amount of block level data being moved
into the ESX server’s memory space and then being pushed out. This is
like having extra by-pass lanes on the highway for the rush hour traffic.
According to Virtual Geek’s lab test report, the difference with or without
XCOPY can be significant.
Storage vMotion performance is more than 5 times faster with XCOPY.
Unfortunately, VMware has only implemented XCOPY for moving data
between LUNs that are under the same storage controller (which means
within a single storage system space). This is NOT ideal for a real storage
migration scenario, since VMware simply cannot support XCOPY if the
source LUN and destination LUN are on different physical storage systems
(even for the same vendor and same model). This is understandable, because
such a migration project would require a complex setup
on the FC fabric to allow the source and destination storage to “see” each
other, and would require a significant amount of compatibility testing
across storage controllers from various vendors, a Herculean task at best.
Cirrus Data’s DMS, in addition to providing iQoS to intelligently
control migration aggressiveness, offloads 100 percent of the migration
traffic onto the DMS appliance, away from VMware’s ESX hosts. This is
like having a universal XCOPY capability implemented across all storage,
regardless of where the source and destination LUNs reside. With DMS
capabilities, why would you ever want to use VMware to migrate data?
Cirrus Data’s DMS is the only data migration tool that can guarantee the
Quality of Service for applications while utilizing all available migration slots
to compress the length of time to complete the migration. DMS iQoS
uses a unique approach that ensures the lowest possible impact on active
applications while accelerating the timeline to completion for large-scale,
online storage migrations.
Compared to rate-based throttling methods, iQoS and the offloaded
copying features of Cirrus Data’s DMS provide a much more effective
method for moving large amounts of data across a network. Cirrus
iQoS relieves the negative performance impact on the application resulting
from conflicting migration traffic on the disks being migrated. DMS delivers
guaranteed application storage performance with the highest QoS while
allowing migration projects to be completed at a much faster rate by more
intelligently using all available bandwidth. Cirrus Data’s DMS provides the
functionality to ensure that all data migration projects are completed on
time, every time, without negative impact to the application owners.