MTI Technology Corporation
DOUBLECLICK SITE AUDIT & SAN ANALYSIS
Report
SO#: PS310018
9ba6b2b1-6e35-418a-91c8-d3f885d8e6b9-170117023025.doc
Laurence E. Wilson
12 February 2003
REV-2
PREFACE
1. The purpose of these SAN analyses is to document hardware,
software, control systems and common practices.
2. Customers have stated in the past that what they seek in a SAN
analysis is vendor neutrality, as much as possible.
3. With respect to 1 & 2 above, the goal is to recommend improvements
in hardware, software, procedures, or infrastructure. Where possible,
several vendor solutions are listed when they are found to be
compatible and functional.
Much of the information in this document is simply documentation of
the customer data center. It may seem that you already know much of this
information, but it is collected for 2 primary reasons:
1. So that if the customer decided to use another vendor (other than,
or in addition to, MTI), he could provide copies of this analysis to
those vendors. Since you already paid for a SAN analysis, why pay
for a second one if you use more than one vendor?
2. So that MTI Sales Engineers (and the author) can gain a
sufficiently high-level understanding of your site that they can
implement whatever solutions you choose.
Each chapter has a summary reviewing that section and
recommendations from that chapter.
If you wish, please read Chapter 9 (Summary Road Map) and Chapter
10 (Recommendations / Summary) first. The rest of the chapters and
appendixes provide supporting documentation for the suggestions /
conclusions in 9 & 10.
Table of Contents
PAGE
PREFACE - 02
1.0 Overview - 07
1.1 Customer Summary - 07
1.2 Current Environment Summary - 08
1.3 NYC Data Center - 09
1.4 Primary Data Center - 10
2.0 Projections & Growth - 11
2.1 Current / Future Application - 11
2.2 Current / Future Users - 12
2.3 Future Hardware Growth - 12
2.3.1 TAPE - 13
2.3.2 DISK - 13
2.4 Issues - 14
2.5 Summary - 14
3.0 Proposed SAN - 15
3.1 SAN Topology - 15
3.2 Hardware Base – Directors - 15
3.3 Multiple Operating Systems - 16
3.4 FCP/SCSI 3 SAN (Core SAN #1) - 16
3.5 IP/FIBRE Backup SAN (Core SAN #2) - 16
3.5.1 Combined FCP/IP SAN (Core SAN #3) - 17
3.6 Standardization - 18
3.6.2 Switches/Directors - 18
3.6.3 HBA - 18
3.6.4 Server Hardware - 18
3.6.5 Backup Hardware - 18
3.6.6 Backup Software - 19
3.7 NAS - 19
3.8 Summary - 20
4.0 Backup - 21
4.1 Current - 21
4.2 Proposed - 21
4.2.1 LAN Free - 23
4.2.2 Server Free - 24
4.2.3 Veritas - 26
4.2.4 ATL / StorageTek – DLT (Tape Libraries) - 28
4.3 Near Line Data - 28
4.4 Summary - 29
5.0 Performance & Applications - 30
5.1 Current Applications - Performance Impact - 30
5.2 Proposed SAN - Impact on Applications Perf - 30
5.3 Proposed SAN - Future Growth Impact - 31
5.4 Recommended Enhancements - 31
5.4.1 V-Cache - 31
5.5 Tuning Oracle Database with V-Cache - 32
5.5.1 Using V-Cache with Oracle - 33
5.5.2 Determining The Need For V-Cache - 37
5.5.3 V-Cache Oracle Summary - 39
5.6 Summary - 40
6.0 Disaster Recovery - 41
6.1 Current - 41
6.2 Proposed - 41
6.2.1 Off Site Disaster Recovery - 41
6.2.2 Create a DoubleClick-Owned Datacenter - 41
6.2.3 Off Site Storage - 42
6.3 Summary - 43
7.0 Control / Virtualization / Data Migration - 44
7.1 MSM / Fusion - 44
7.2 Veritas NBU - 45
7.3 Veritas SANPOINT / Disk Management - 46
7.4 Fujitsu (Softek) SANVIEW - 47
7.4.1 Fujitsu (Softek) Storage Manager - 47
7.4.2 Fujitsu (Softek) TDMF - 48
7.5 MTI DataSentry2 - 49
7.5.1 QuickSnap (Point In Time Images) - 50
7.5.2 QuickDR (Data Replication & Migration) - 50
7.5.3 QuickCopy (Full Point In Time Image Vol) - 52
7.5.4 QuickPath (Server free & Zero impact Bup) - 52
Appendix
Appendix 1 - Contact Information
Appendix 2 - MTI Hardware Primary Data Center
Appendix 3 - MTI Hardware Secondary Data Center
Appendix 4 - Hardware Environment WINTEL
Appendix 5 - Hardware UNIX / SUN
Appendix 6 - SAN Implementation Best Practices
Appendix 7 - HBA Count (MTI)
Appendix 8 - JBOD Inventory (Primary Data Center)
Appendix 9 – Wintel Storage Utilization (SAN)
1.0 Overview
1.1 CUSTOMER SUMMARY
DoubleClick contracted with MTI Professional Services group to
perform an enterprise storage and SAN implementation analysis. The
purpose of this analysis was to suggest to DoubleClick a method for
implementing a centralized SAN storage architecture, while maximizing
their current investment.
This document reviews DoubleClick’s data center in New York City and
its equipment. It outlines current practices and hardware at DoubleClick,
as well as recommended best practices and a plan to migrate to a SAN
topology.
DoubleClick develops the tools that advertisers, direct marketers and
web publishers use to plan, execute and analyze marketing programs.
These powerful technologies have become leading tools for campaign
management, online advertising and email delivery, offline database
marketing and marketing analytics. Designed to improve the performance
and simplify the complexities of marketing, DoubleClick tools are a
complete suite of integrated applications.
DoubleClick and MTI have been partners since December of 1998. In
that time DoubleClick has gone through a complete business cycle: rapid,
runaway growth at the end of the 90’s caused the data center to expand
quickly without much consideration for best practices.
Since the economy started to contract, DoubleClick has gone from not
enough storage to a temporary glut. Much of this storage is now 3 and
4 years old, and is a mixture of systems that in some cases are
approaching block obsolescence. DoubleClick has taken some steps to
correct this, upgrading older V20’s to the S240 series and moving some
storage to other sites in the company where it can be better utilized.
DoubleClick is now focusing on consolidating their data center
operations in Colorado, and is in the process of moving their New York
City data center equipment. They plan to complete this move on or about
1 June 2003.
This document will focus on helping DoubleClick identify methods to
consolidate their storage into a central SAN. Areas where DoubleClick
can focus, such as standardizing hardware and upgrading older arrays
instead of moving them, will be addressed.
1.2 Current Environment Summary
The current environment at DoubleClick consists of a mixture of direct
attached storage and SAN attached storage. These environments are
distributed and are both SCSI attached and fibre attached.
The hardware in the SAN consists mostly of Gladiator 3600 series and
V20/30 arrays undergoing conversion to the S240 series. Also included are
older Ancor switches, Gadzoox hubs, and a Hitachi 9000 series SAN. Far
from being one complete SAN, these devices are actually “pools” of SAN
storage connected to 2 or 3 hosts each (on average). Thus the economies
and redundancies of a core-switch-based SAN are not seen here.
Locally attached storage consists of MTI 2500 series JBODs, Compaq
direct attach storage, and internal server drives.
A matrix of MTI and customer systems and storage can be found in the
Appendix section of this document.
A topical overview of the New York City Data Center follows.
1.3 New York City Data Center
The New York City Data Center resides on the 12th floor of the
customer facility. It has OC3 capability to the customer’s Colorado Data
Center (in Thornton, Colorado), and the customer is looking into a second
OC3 pipe to aid in migrating data for the upcoming relocation from NYC to
Colorado.
The center may be broken into 3 facilities for reference purposes.
Those facilities would be:
1. The Primary Data Center (PDC)
2. The Secondary Data Center (SDC)
3. The Cage (a small secure area within the PDC)
The customer subdivides operations on these systems into 3 primary
groups (or operations); they are:
1. Production
2. QA
3. DEV
Interviews with the customer on site have indicated that the Primary
Data Center (less the cage) and the systems dedicated to production are
the primary focus of the data migration and SAN analysis. The systems
that provide QA and DEV functions will remain in NYC when the production
systems are moved to Colorado.
1.4 Primary Data Center
The primary data center holds most of the production equipment and
comprises the following.
a. 8 SUN 6500
b. 2 EMC Symmetrix (offline)
c. 1 Hitachi 9960 SAN Array (15 TB)
d. 52 19" x 70" cabinets of MTI storage (112 TB on-floor)
   a. Mainly 3600 with some S240 & V20/30 series
   b. 112 TB on-floor / raw (includes offline cabinets)
   c. 51 TB online raw (26 cabinets of 3600)
   d. 34 TB online raw (9 cabinets of V20/30/S24x)
e. 3 ADIC Scalar 1000 DLT tape libraries (offline)
f. 2 StorageTek L700 robot tape libraries
g. 15 SUN 4500
h. 43 SUN Sparc / Ultrasparc
i. 285 Compaq Proliant (models mainly 1850/550/6400R, plus some newer)
j. 28 HP Netserver 600R
k. 5 NetApp F760 (7.5 TB)
l. 4 SUN 220R
m. Compaq JBODs (various; approx. 8.1 TB)
Note: These are systems that appear alive. Due to the size of the site,
some may actually look alive but be “retired”. All attempts were made not
to count “dead” or “retired” systems. Also, due to day-to-day operations,
the count may fluctuate. The above represents a good snapshot of the
types and numbers of equipment “in production”.
2.0 Projections & Growth
The MTI statement of work (SOW) for this project stated a goal of
planned 30 - 40% annual growth in storage requirements. That figure is
the baseline for all assumptions made in this document unless otherwise
noted, and the basis for a scalable SAN storage proposal.
The customer is currently involved in a move to consolidate their
facilities in Colorado. At the New York Data Center, the customer has a
Primary Data Center and a Secondary Data Center on the 12th floor of
their building.
The customer has stated that the focus is on the systems identified in
the Primary Data Center as moving to Colorado. These are the
parameters/assumptions for the recommendations and notes found
within.
2.1 Current / Future Applications
Current applications we are considering for the purposes of SAN
design and implementation are:
1. ORACLE – Database Application
2. ABINITIO – Database Application / Works with Oracle
3. Microsoft Exchange
4. DART - (Customer Applications).
5. STANDARD – (standard data center processes, i.e. file
serving, print serving, web serving, etc.)
a. FS Archive
b. FSSIM Archive
c. Custom/Report DB
d. Corporate Systems
e. NetBackup
f. NXP
g. Auto PT
h. Arch Sure
i. DF Paycheck
j. EPS
k. UBR
l. OLTP
m. Rep Svr
n. FSADS
o. SPOD DB
p. DART W
q. DT
Future applications include but are not limited to:
1. All standard SUN & WINTEL based applications that will
execute on current systems using current Direct
Attached or SAN storage.
These applications and their performance impact will be used to
assess the required performance parameters of the SAN.
2.2 Current / Future Users
The Statement of Work (SOW) states that the solution must be able to
handle the storage needs represented by a 30% - 40% growth rate.
DoubleClick is now suffering from the same contraction in customer base
as other information-based service companies. While 30 to 40% growth
will be supported by the solution, the customer does not foresee growth
at that rate in the near term.
The customer has stated a concern about the block obsolescence of his
current storage, and the inherent costs of maintaining this type of older
architecture. Interviews on site have also revealed that the customer and
his data center employees understand that current storage (pools of SAN
and direct attach) lacks flexibility and is rapidly approaching EOL (End
of Life). While the storage solution will be inherently capable of
supporting growth up to and beyond the stated 30 - 40%, modernization
and economies of scale will also be used to maximize the cost benefit to
DoubleClick.
2.3 Future Hardware Growth
As above, the plan is for 30 - 40% growth per the SOW unless stated
otherwise. The proposed SAN must be capable of expanding at this rate
into the near future.
2.3.1 TAPE
Currently the customer has 2 StorageTek L700 robot tape libraries
with DLT7000 drives. Backups are controlled via Veritas NetBackup
software.
All SUN and WINTEL systems back up to these central vaults over a
dedicated backup SAN, with smaller systems going across the 100BaseT
network to central media servers. This is a good setup, and adequate for
their needs. We suggest their next tape vault be fibre channel SAN
capable, and that they upgrade to newer (DLT8000 / SDLT) tape
technology as they outgrow their current system.
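To put that upgrade path in rough numbers, a simple backup-window estimate can be sketched. The native drive rates used below (about 5 MB/s for DLT7000, 6 MB/s for DLT8000, and 11 MB/s for SDLT) and the 10 TB / 10 drive workload are illustrative assumptions, not measurements from this site:

```python
def backup_hours(data_tb: float, drives: int, mb_per_sec: float) -> float:
    """Hours to stream data_tb terabytes across `drives` tape drives
    running in parallel at mb_per_sec native (uncompressed) each."""
    total_mb = data_tb * 1024 * 1024          # TB -> MB
    return total_mb / (drives * mb_per_sec) / 3600

# Illustrative 10 TB full backup across 10 drives:
for name, rate in [("DLT7000", 5.0), ("DLT8000", 6.0), ("SDLT", 11.0)]:
    print(f"{name}: {backup_hours(10, 10, rate):.1f} h")
```

Even as a back-of-the-envelope figure, this shows why moving to SDLT roughly halves the backup window for the same drive count.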
2.3.2 DISK (Note: The Secondary Data Center has 18.3 TB of 3600 online)
Current disk capacity (Primary Data Center) is broken down as
follows:
ONSITE MTI ONLINE 3600 = 51 TB
ONSITE MTI ONLINE Vxx/S2xx (-V30 2/1/3) = 30 TB
HITACHI 9960 ONLINE ONSITE = 15 TB
DIRECT ATTACH JBOD (COMPAQ, ETC.) = 8.1 TB
NETAPP F760 (5 ARRAYS) = 7.5 TB
TOTAL (PDC) = 111.6 TB
NOTE 1. Compaq JBODs are mainly older 9 GB and 18 GB drive
technology.
NOTE 2. The customer has plans to install a HITACHI 9980 with 50 TB
in the near future.
NOTE 3. This does not include Proliants that have from 2 to 4 local
drives attached (mostly 9 & 18 GB drives) and operate either as
independent systems, or use the drives for the operating system and a
mirror. Operating systems are recommended to remain on local disks
with a local mirror copy. Putting operating systems on the SAN is never
a good idea, as it eats up valuable throughput and creates expensive,
unnecessary bottlenecks.
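As a sanity check on the SOW growth figures used throughout this document, the 111.6 TB PDC baseline above can be compounded at 30% and 40% annually; the 3-year horizon below is an illustrative assumption:

```python
def projected_tb(base_tb: float, annual_rate: float, years: int) -> float:
    """Compound a storage baseline at a fixed annual growth rate."""
    return base_tb * (1 + annual_rate) ** years

# 111.6 TB PDC baseline, 3-year horizon:
for rate in (0.30, 0.40):
    print(f"{rate:.0%}: {projected_tb(111.6, rate, 3):.1f} TB")
```

At the stated rates the PDC would need to house roughly 245 - 306 TB within 3 years, which is the scale the proposed SAN must be able to reach.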
2.4 ISSUES – DOUBLECLICK
1. Many pools of storage, both SAN and direct attach.
2. Need to support 1 & 2 GIG speeds in the same SAN.
3. Need to migrate backups off 100BaseT and onto the SAN where
possible/necessary.
4. A considerable amount of storage (Compaq JBOD / MTI 3600 /
NetApp F760) is obsolete and near end of life. The cost to move it across
the country may not be justifiable or prudent.
5. Symmetrix systems are being phased out (currently offline).
2.5 SUMMARY
Applications are stable, with no major surprises planned. The customer
has a large investment in StorageTek backup hardware (L700) and Veritas
software. Both are adequate for the time being.
The SOW said to plan on a growth rate of 30 to 40%. The customer has
stated (and I agree) that while we may not be seeing that rate during this
analysis (or in the near future), the SAN needs to be able to grow
quickly while avoiding all the problems DoubleClick had during their last
period of explosive growth.
3.0 Proposed SAN
3.1 SAN Topology
DoubleClick’s proposed SAN backbone is to be based on 2 Brocade
12000 core switches running in fabric mode.
3.2 Hardware Base - Directors
The proposed SAN for DoubleClick will be based on Brocade 12000
director class switches. They can provide a high-speed backbone for both
FCP/SCSI (DISK & TAPE) and IP traffic.
Brocade Silkworm 12000
Silkworm 12000 Highlights:
• Simplifies enterprise SAN deployment by combining high port density with
exceptional scalability, performance, reliability, and availability.
• Delivers industry-leading port density with up to 128 ports in a single 14U
enclosure and up to 384 ports in a single rack, facilitating SAN fabrics
composed of thousands of ports.
• Meets high-availability requirements with redundant, hot-pluggable
components, no single points of failure within the switch, and non-disruptive
software upgrades.
• Supports emerging storage networking technologies with a unique
Multiprotocol architecture.
• Provides 1 Gbit/sec and 2 Gbit/sec operation today with seamless extension to
10 Gbit/sec in the future.
• Employs Brocade Inter-Switch Link (ISL) Trunking™ to provide a high-speed
data path between switches.
• Leverages Brocade Secure Fabric OS™ to ensure a comprehensive security
platform for the entire SAN fabric.
3.3 Multiple Operating Systems
Current: DoubleClick has multiple operating systems and platforms.
The systems on site currently comprise SUN and Windows 2000
(Wintel) platforms.
Proposed: The proposed SAN needs to have the capability to support
Solaris, Windows, and, in the future, Linux.
3.4 FCP/SCSI 3 SAN – Core SAN #1
We are proposing 3 core SAN solutions. The first would be to split the
SAN in 2 (Core SAN #1, disk, and Core SAN #2, tape). One SAN, built
around 1 Silkworm 12000, would run only the FCP/SCSI 3 protocol and
handle all disk operations.
3.5 IP/Fibre – Backup SAN – Core SAN #2
A second SAN built around 1 Silkworm 12000-class director would
handle FCP/IP traffic. It would in effect handle all IO pertaining to
direct attach tape libraries and all IP traffic pertaining to backups.
This provides what is known as LAN free backup, offloading some of the
current Ethernet traffic and relocating it to the fibre network.
3.5.1 – Combined FCP/IP SAN – Core SAN #3 (Alternate)
If a cost/performance trade-off is desired, the customer can go with a
combined SAN. 2 x 128-port Brocade 12000 director class switches can
be used to handle both disk and tape IO and IP fibre traffic, utilizing
Emulex Multiprotocol drivers.
These drivers allow an Emulex Lightpulse 8000 / 9000 / 92XX
series Host Bus Adapter to handle both protocols simultaneously in the
same HBA. This would have a small impact on performance, but would
save money in acquisition costs of switches and HBAs, while still being
fully functional (it still provides LAN free backups). It could also be
expanded or modified at a later date into the split SAN proposed in 3.4 &
3.5 (above).
This alternative could also use an Ethernet dedicated IP network to
route the index files and housekeeping files in Veritas. The fibre network
would still run the bulk of the data I/O transfer traffic.
3.5.2 – Combined SAN – (Preferred recommendation)
1. Use the 2 Brocade 12000 core switches to build a mesh fabric.
2. Install 2 GIG HBA in high I/O servers (i.e. SUN 4500/6500) and
connect to core switches.
3. Install V400 2 GIG arrays and dedicate to HI I/O servers.
4. Decommission Gladiator 3600 & NetApps boxes.
5. Leave 1 GIG HBA’s used in current back up SAN and Silkworm
2800 switches connected as is (add ISL links to core switches)
6. Use Brocade Zoning; build “1 GIG ZONE”, “2 GIG ZONE”, “Backup
Zone” etc.
7. Use S240/S270 (V20/V30 conversions) to replace older SCSI
JBODS on COMPAQ / WINTEL systems (or leave them in NYC to
support the QA and DEV systems that are not moving).
8. Implement a central san control/viewing solution such as Veritas or
Fujitsu (SANVIEW type) software.
9. Install Vcache SAN-ready solid-state disk. Connect Vcache to the
core switches and use it to accelerate database hot files and indexes.
(Do this last; you may not require Vcache once you see the results
of the V400’s performance on your SAN.)
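The zoning in step 6 can be staged offline before touching the fabric. The sketch below simply assembles Brocade Fabric OS zoning commands (zonecreate / cfgcreate / cfgenable) for review; the zone names, domain,port members, and config name are hypothetical, and the exact syntax should be confirmed against the Fabric OS release on the 12000s.

```python
def zoning_commands(zones: dict, cfg_name: str) -> list:
    """Build a reviewable list of Fabric OS zoning command strings from a
    {zone_name: [members]} plan. Members may be aliases or domain,port pairs."""
    cmds = []
    for zone, members in zones.items():
        member_list = "; ".join(members)
        cmds.append(f'zonecreate "{zone}", "{member_list}"')
    cmds.append(f'cfgcreate "{cfg_name}", "{"; ".join(zones)}"')
    cmds.append(f'cfgenable "{cfg_name}"')
    return cmds

# Hypothetical zones mirroring step 6 (domain,port members are placeholders):
plan = {
    "ZONE_1GIG":   ["1,0", "1,1"],
    "ZONE_2GIG":   ["2,0", "2,1"],
    "ZONE_BACKUP": ["1,2", "2,2"],
}
for cmd in zoning_commands(plan, "DCLK_CFG"):
    print(cmd)
```

Generating and reviewing the command list this way lets the zone layout be sanity-checked (and kept under change control) before it is applied to the live fabric.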
3.6 Standardization
Given DoubleClick’s current environment, all proposals will standardize
on the following.
3.6.2 Switches/Directors (Standardization)
Switches will be based on Brocade’s family of fabric-enabled switches.
They can include Silkworm models 2040/50 (8 port), 2400 (8 port), 2800
(16 port), 6400 (64 port) and the 12000 (128 port).
Customer needs and budget constraints will dictate the ultimate
selection. DoubleClick already has a number of 2800’s on site in their
current backup solution, and they have purchased 2 Brocade 12000s
located in Colorado.
3.6.3 HBA (Standardization)
Most customer systems can be handled by the Emulex Lightpulse
8000 / 9000 / 92XX (PCI/Sbus) family of dual protocol adapters.
3.6.4 Server Hardware (Standardization)
We strongly suggest that DoubleClick consider standardization of
hardware and operating systems. While we can support all the hardware
in their current environment, a mixed topology carries its own overhead
in management inefficiencies and the personnel required.
It seems from a review of the New York City facility that Windows
2000 and Solaris (2.8 or later on SUN and Compaq platforms) would be
a logical choice for standardization, given the preponderance of those
systems already on site. We realize there may be orphan systems
remaining for business and other reasons, and our proposal will allow
for their support, and the support of the current systems mix.
3.6.5 Backup Hardware (Standardization)
Current backup hardware is centered around 2 StorageTek L700
robot tape libraries running on a combination of a dedicated backup SAN
(using Brocade 2800 switches) and IP backup (Ethernet) for non-SAN
systems and Veritas housekeeping traffic.
3.6.6 Backup Software (Standardization)
DoubleClick has already standardized on Veritas Netbackup. The
current system runs well and we would only suggest some additional
software packages and a future plan to transition to 2 GIG fibre.
A faster 2GIG backup SAN with IP and SCSI/FCP over fibre, or with
SCSI/FCP over fibre and a dedicated Ethernet backup network for index
and Veritas IP housekeeping traffic might also be an alternative.
3.7 NAS
Currently NAS / file sharing is handled by 5 NetApp F760 NAS
boxes providing approximately 7.5 TB of storage using a total of 210
36 GB drives.
We recommend that the NetApp boxes be decommissioned. This site
is primarily a SAN environment, and the NetApp boxes are slow, old,
and a maintenance problem in terms of both cost and reliability.
3.8 Summary
The current site already has a large investment in Brocade switches
and 1 GIG Host Bus Adapters. We can leverage this current investment
and achieve the benefits of standardization by moving all the slower,
less intensive applications to older 1 GIG HBAs and technology (S240),
and migrating the high speed, high I/O applications to new 2 GIG HBAs
and storage (i.e. V400).
There is no reason for the customer to totally scrap his current
investment and operation. Customer onsite discipline is good, and we
suggest that the customer commence moving to hardware and software
standardization (using the best practices noted in Appendix 6).
We feel the current move to Colorado is the perfect opportunity for
the customer to upgrade his older equipment during the data migration.
Savings in maintenance contracts on older equipment (NetApps &
3600s) as well as dramatic fibre performance improvements can more
than justify upgrading selected parts of the infrastructure now.
4.0 Backup
4.1 Current
The current environment (NYC) comprises 2 StorageTek L700
tape libraries running DLT7000 tape drives. The libraries are connected
to primary SUN servers via a dedicated 1 GIG fabric using Brocade
Silkworm 2800 switches (see Figure 4-1). This backs up their ORACLE,
ABINITIO (EPS1-4), and DART systems, which form the core of their
online systems.
They are currently using Veritas NetBackup to control these backups.
They are backing up the Oracle database using the offline method. All
NT/WIN2K systems are backed up over the 100BaseT internal IP
network. All backup operations in NYC are under Veritas control.
4.2 Proposed
The proposal is to maintain the Veritas / StorageTek system on site. It
is an excellent solution in design and implementation. The customer could
save some money here by using the system as is.
Connect the Brocade 2800 Silkworms to the Brocade 12000s as edge
switches, and designate a backup zone. Then, as time and money permit,
add 2 GIG HBAs to replace the 1 GIG backup HBAs on a one for one
basis. Also upgrade to later DLT technology in the libraries (DLT8000
and SDLT) as time and money permit.
The following proposals are modest improvements to fine-tune and
improve the current customer backup infrastructure, and to provide
information on methods and techniques that may not have been
previously considered. They purposely avoid any major changes.
Figure 4-1
4.2.1 LAN Free (To speed up backups and reduce IP network
congestion)
This is somewhat of a misnomer. When you set up IP over fibre, you
are actually building a Fibre Channel LAN. All backup traffic then moves
over fibre IP instead of the regular 10/100BaseT LAN.
VERITAS has stated that when you use a Fibre Channel attached
tape library (i.e. an L700), you still need a LAN to pass the backup
index files over. A fibre channel tape library does show up as a locally
attached fibre device to a host using FCP/SCSI3. However, the indexes
can be passed over the Fibre Channel SAN using IP/Fibre.
There are basically 2 ways to do a so-called LAN free backup.
The first method is to use a Fibre Channel attached tape library
connected to a SAN via a switch. Backup index files (Veritas) are then
passed over the IP/Fibre Channel network. This requires all systems in
the backup network to also have a fibre channel HBA and to be running
IP/Fibre. In this implementation the backup files are sent over fibre via
the SCSI3/FCP protocol. The backup indexes and Veritas/Legato
control traffic also go over the same fibre, but using the IP protocol.
Much the same can be accomplished by using a Fibre Channel tape
library and building a separate backend 10/100 to GIG-E network that
handles nothing but backup traffic.
The second method is to have a SCSI attached tape library hanging
off a backup server. The backup server and clients are all members of
a Fibre Channel SAN running IP over fibre. In this setup, all backup
and index files are transferred over the fibre channel LAN using IP
protocols. (This method is nowhere in evidence at the NYC data center,
except in a few pieces of offline backup hardware.)
4.2.2 Server Free (This is only applicable if you keep your NetApp
Boxes)
NetBackup accomplishes backup and recovery of Network Appliance
filers using NDMP (Network Data Management Protocol), an open
protocol co-developed by Network Appliance. The NDMP protocol
allows NDMP client software, such as VERITAS NetBackup, to send
messages over the network to an NDMP server application to initiate
and control backup and restore operations. The NDMP server
application resides on an NDMP host, such as a NetApp F760. The
NetBackup for NDMP Server is where the NetBackup for NDMP
software is installed; it may be a master server, a media server or both.
If it is only a media server, however, the assumption is that there is a
separate NetBackup master server to control backup and restore
operations.
Benefits:
• High-speed local backup
Perform backups from NetApp filers directly to locally attached tape devices for
performance without increasing network traffic (refer to red arrow on figure below).
• Alternate Client Restore
Data backed up on one NetApp filer may be restored to a different NetApp filer. This
helps simplify both recovery simulation and general restoration of data, especially when
the NetApp filer that performed the initial backup is unavailable (refer to blue arrows on
figure below).
System Requirements: NetBackup for NDMP supports Sun Solaris and
Windows NT/Windows 2000 platforms, both of which DoubleClick
maintains in their environment. One condition required by the latest
release of NetBackup is that the qualified Network Appliance Data
ONTAP™ release be either 5.3.6R1 or 5.3.4. DoubleClick currently
maintains two versions of ONTAP™: 5.3.4R3 and 5.3.7R3. It is possible
that their current versions may be supported. [Detailed and/or updated
information would have to be verified through MTI's Veritas
representative.] The other condition regards supported tape libraries
and tape drives. As of this writing, supported tape devices include any
general DLT tape library or drive, DLT tape stacker, 8 mm tape library
or drive, and StorageTek ACSLS-controlled tape drives or libraries.
[Detailed and/or updated information would have to be verified through
MTI's Veritas representative.]
4.2.3 Veritas
MTI has inventoried DoubleClick’s environment, It is well thought out
and efficient. I suggest DoubleClick look at adding the following Veritas
products at some point in the future.
Features & Benefits of VERITAS NetBackup™ DataCenter for
DoubleClick.
VERITAS Global Data Manager™
VERITAS Global Data Manager™ is a graphical management layer for centrally
managing a single VERITAS NetBackup™ storage domain or multiple storage domains
anywhere in the world. In addition, Global Data Manager offers a monitoring view into
Backup Exec™ for Windows NT and Backup Exec for NetWare environments, providing
customers with a single view into mixed Backup Exec and VERITAS NetBackup
environments.
VERITAS NetBackup Advanced Reporter™
VERITAS NetBackup Advanced Reporter™ delivers a comprehensive suite of
easy-to-follow graphical reports that help simplify monitoring, verifying, and
troubleshooting VERITAS NetBackup™ environments. Advanced Reporter provides intuitive charts,
graphs, and reports that allow administrators to easily uncover problem areas or patterns
in backup and restore operations. Reports can be accessed through a web browser from
virtually anywhere in the enterprise.
• Reports are delivered via Web browser to virtually anywhere in the enterprise (even
over dial-up phone lines)
• Many reports embed hyperlinks that allow users to search for more granular
information without having to open or move to an additional screen
• Over 30 new graphical reports available
VERITAS NetBackup™ for Microsoft Exchange Server
VERITAS NetBackup™ for Microsoft Exchange Server simplifies database backup
and recovery without taking the Exchange server offline or disrupting local or remote
systems. A multi-level backup and recovery approach ensures continued availability of
Exchange services and data during backups. Central administration, automation options,
and support for all popular storage devices create the flexibility administrators need to
maximize performance.
VERITAS NetBackup™ for NDMP option (if you
decommission your NetApp boxes, ignore
this)
Using the open Network Data Management Protocol, VERITAS NetBackup™ NDMP
delivers high-performance backup and recovery of NAS systems and Internet messaging
servers, such as Network Appliance, EMC, Auspex, ProComm, and Mirapoint. VERITAS
NetBackup sends NDMP messages to these systems to initiate and control backup and
recovery operations.
VERITAS NetBackup™ for Oracle
VERITAS NetBackup™ for Oracle simplifies the backup and recovery of all parts of
the Oracle database, including archived re-do logs and control files. Users can perform
hot, non-disruptive backups—full or incremental—of Oracle databases. VERITAS
NetBackup's job scheduler and tape robotic support allow for fully
database backups. In addition, VERITAS NetBackup for Oracle utilizes a high
performance parallel process engine that provides the fastest mechanism possible for
the backup and recovery of data, resulting in increased database availability.
VERITAS NetBackup™ for Oracle Advanced BLI Agent
VERITAS NetBackup™ for Oracle Advanced BLI Agent delivers high-performance
protection for Oracle databases by greatly reducing the amount of data moved
during backup and recovery. Block Level Incremental (BLI) backup technology
backs up only changed blocks (not the entire file), dramatically reducing
CPU and network usage and enabling
more frequent backups. The Advanced BLI Agent is fully integrated with the Oracle 8i
Recovery Manager (RMAN) interface and augments the increased manageability and
simplified recovery that RMAN provides.
• Leveraging your existing hardware results in lower cost implementation compared to
split-mirror solutions
• Online, non-disruptive, full, differential and cumulative block level incremental backup
options provide flexible implementation and free up bandwidth
• Backup and restore at the database, tablespace and data file levels, including control
files
VERITAS NetBackup™ Shared Storage Option™
VERITAS NetBackup™ Shared Storage Option™ (SSO) is the first heterogeneous
SAN-ready storage solution working on both UNIX and Windows. It enables dynamic
sharing of individual tape drives—standalone or in an automated tape library—by
"virtualizing" tape resources. VERITAS NetBackup™ SSO reduces overall costs by
providing better hardware utilization and optimization of resources during backup and
recovery operations.
4.2.4 ATL / STORAGETEK - DLT (TAPE LIBRARIES)
Existing tape libraries do not need any upgrades as of this writing. In
the future, when considering tape libraries from the above manufacturers,
consider the following:
- Backward read compatibility with DLT7000
- DLT format (DLT 8000 or SDLT): it is really too late at this point (and
too expensive) to change media formats.
- 2 GIG Fibre speeds. (Anything slower will just mire you in the past.)
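The backward-read consideration above can be captured as a small lookup table. This is a minimal sketch in Python; the drive/media pairings shown are illustrative assumptions and should be confirmed against the manufacturers' media compatibility matrices:

```python
# Illustrative backward-read-compatibility lookup for DLT-family drives.
# These pairings are assumptions for the sketch; verify against the
# vendor media matrices before making purchase decisions.

READ_COMPAT = {
    "DLT7000": {"DLT7000"},
    "DLT8000": {"DLT8000", "DLT7000"},             # reads DLTtape IV media written by DLT7000
    "SDLT220": {"SDLT220", "DLT8000", "DLT7000"},  # assumed backward read of DLT IV media
}

def can_read(drive, media_written_by):
    """True if 'drive' can read media written by 'media_written_by'."""
    return media_written_by in READ_COMPAT.get(drive, set())
```

A check like can_read("DLT8000", "DLT7000") lets you confirm that a proposed new drive can still restore the existing DLT7000 tape pool.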
4.3 Near Line Data
The customer currently has no Near Line Data (CD-RW / DVD-RW
Libraries) on this site, or under consideration.
4.4 Summary
1. Add the NetBackup Oracle Agent to your current backup suite.
2. My primary suggestion on the NetApp boxes is to get rid of them (due
to age and performance), but if you retain them, consider Veritas
NDMP for serverless backup of these systems.
3. Leave your backups on a dedicated fibre fabric of Silkworm 2800’s
(as it is now). Use ISL links to connect to your core switches.
Migrate to 2 GIG technology as the L700’s reach end of life in an
orderly transition. (You can also upgrade to newer, denser DLT
drive technology at the same time.)
4. Build a backend IP network (IP over Fibre or Ethernet); dedicate it
to passing index files on fibre-optic-attached clients, and to
passing index files and backup traffic on your IP-only clients
(mostly the Wintel and SPARC equipment).
5.0 Performance & Applications
5.1 Current Applications – Performance Impact - Current
DoubleClick currently has a somewhat standard mix of applications.
The primary ones are Oracle, Ab Initio, Microsoft Exchange, DART (in-
house software), and web server applications. The systems also
provide for the traditional mix of file services, print services and backups
(Veritas NBU).
The current direct-attached storage is marginally adequate from a
performance standpoint. The systems are also slowly becoming
obsolescent, and in some cases reaching capacity. Currently the
customer’s problems are mostly volumes going over 90% capacity, and
block obsolescence of storage hardware.
Moving to a core switch SAN based on new 2 GIG bandwidth
technology will improve the performance of the customer’s current
applications, and ameliorate some of their overall performance issues.
However, it is our opinion that a controlled migration to a core switch
SAN is prudent. The current customer configuration is based not only on
pools of obsolete storage but on small SAN pools that are also fast
reaching the end of their service lives. One major exception to this
is the Hitachi 9000 series SAN that is installed in the Primary
Data Center.
5.2 Proposed SAN – Impact on Applications Performance
The proposed SAN will have a marked impact on current application
performance. DoubleClick has stated that they have two Brocade
12000 core switches available for the Colorado Data Center.
As part of this review, I recommend DoubleClick look at removing all
1 GIG technology (LP8000’s, Ancor switches, etc.) and eliminating edge
switches where practical (except for current backup operations). The two
Brocade 12000’s could be made into a redundant fabric (for path
failover capability), and all current SAN equipment (Hitachi, V20/30/S240)
as well as all new equipment (i.e. V400) could be connected via the core
switches.
5.3 Proposed SAN – Future Growth Impact
The CORE SWITCH SAN will allow for a more controlled growth
pattern, and flexibility in storage assignment. Currently the admins have to
throw disks (DAS/NAS/JBOD) at many storage problems on a case-by-
case basis.
Basing a SAN on director class switches will allow us to start with a
Fabric SAN backbone, and expand from there. Hosts and Storage may be
added to the backbone and allocated as needed. Basing the SAN on
Directors and High Density switches allows for almost limitless growth and
flexibility.
5.4 Recommended Enhancements
Recommended enhancements to hardware and proposed topologies
are covered in Chapter 9.
Proposed improvements and considerations for such things as
Database performance are covered in Chapter 8.0.
The only hardware performance enhancements (other than the SAN
itself, and LAN-FREE backup) would be to implement V-Cache.
5.4.1 V-Cache - Hardware
MTI V-Cache is a SAN-enabled solid-state disk. Like direct-
attached solid-state disks, it is an I/O device that can be used to store and
safeguard data. Solid-state disk technologies like V-Cache use random
access memory (RAM) to store data, thus eliminating the access
latency of mechanical disk drives. The near-instantaneous response and
the ability to place a RAM disk within a SAN allow for quantum
improvements in database applications by moving those small but I/O-
intensive segments of the database off of traditional mechanical disks.
DoubleClick could realize significant performance gains by moving
small but highly I/O-intensive files to V-Cache. V-Cache would fit into the
SAN proposal since it is made to be plugged directly into a fibre switch or
director. Think of it as a RAM disk that is SAN-enabled. Its other
advantage is that it can be cut into RAM LUNs and served to specific
hosts (LUN mapping to WWNN), just like the MTI S200 storage does with its
disks/LUNs.
VCACHE - (General Characteristics)
- Physical: 5U high in a dual-unit enclosure
- Base unit goes to 25.8 GB, so 2 units in a 5U enclosure = 51.6 GB
- VCache SAN is for a unit that will not be attached to one of our
products (it comes with its own switches in a mini SAN)
- VCache Expansion is a box to integrate into one of our
SANs (S2xx or Vxx)
- A VCache unit can hold up to 3 memory modules of mixed
types (4.3 or 8.6 GB)
- SYSTEMS: Solaris 2.7/2.8 and Windows NT4/W2K
- LUNS: VCache supports 16 LUNs per controller (8 per RCU).
You can add as many controllers to VCache as the customer
wants.
The nice thing about V-Cache (on the SAN) is that it can be shared
among systems on the SAN, or dedicated completely to any one system.
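The capacity and module figures in the list above can be sanity-checked with a short sketch. This is purely illustrative arithmetic, not an MTI configuration tool; the helper names are hypothetical:

```python
# Hypothetical sketch of V-Cache capacity math from the figures above.
# Module sizes and limits come from the characteristics list; nothing
# here is an actual MTI sizing utility.

MODULE_SIZES_GB = (4.3, 8.6)   # supported memory module types
MAX_MODULES = 3                # modules per V-Cache unit
MAX_LUNS_PER_CONTROLLER = 16   # 8 per RCU

def unit_capacity_gb(modules):
    """Capacity of one V-Cache unit given its installed modules."""
    if len(modules) > MAX_MODULES:
        raise ValueError("a unit holds at most %d modules" % MAX_MODULES)
    for m in modules:
        if m not in MODULE_SIZES_GB:
            raise ValueError("unsupported module size: %s GB" % m)
    return round(sum(modules), 1)

# Fully populated unit: 3 x 8.6 GB = 25.8 GB, matching the base-unit figure;
# two units in one 5U enclosure gives 51.6 GB.
full_unit = unit_capacity_gb([8.6, 8.6, 8.6])
dual_enclosure = round(2 * full_unit, 1)
```

The arithmetic confirms the 25.8 GB per-unit and 51.6 GB per-enclosure figures quoted in the characteristics list.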
5.5 Tuning Oracle Databases with the V-Cache Accelerator (Overview)
Disk access times are generally a key issue in most commercial data-
processing environments. It is disk access that is commonly the longest
part of the data-acquisition process. Information-processing technology
has continually improved by leaps and bounds in the case of CPUs,
memories, and other components. The hard disk arena has improved
primarily in the direction of higher areal densities (capacity) and
lower cost per megabyte. Progress has not been nearly as great in
improving the speed of data access from the disk drive. Speed of access
in a drive is mainly limited by the rotating speed of the disk or the linear
speed of the access arm. A modern-day CPU flies along at 1.5 GHz or
faster, but it must wait for data from a disk drive which is chugging along
at 15,000 rpm or less. This ever-widening speed gap is causing I/O
bottlenecks to become more troublesome than ever.
V-Cache application accelerators use high-speed memory organized
as a drive. To the server, it appears as a conventional disk drive when in
reality the server is dealing with a mass of external memory. In a V-Cache
accelerator, there are no performance-robbing mechanical limitations due
to moving heads or spinning platters. Data can be written to V-Cache at
memory speeds. There is no seeking of tracks and data can be found and
transferred to the server in a few microseconds, which is 200-350 times
faster than the 7-10 milliseconds a regular hard drive takes to seek to a
track and transfer data. Data files on a Unix file system can be created on
the V-Cache accelerator and the operating system will not know the
difference.
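The quoted speedup range can be verified with back-of-the-envelope arithmetic. The access times below are assumptions drawn from the text (30 microseconds is an assumed value for "a few microseconds"):

```python
# Back-of-the-envelope check of the speedup figures quoted above.
# Access times are illustrative assumptions, not measured values.

disk_access_s = (7e-3, 10e-3)   # 7-10 ms seek-and-transfer for a mechanical drive
vcache_access_s = 30e-6         # "a few microseconds"; 30 us assumed here

# Speedup = mechanical access time / solid-state access time.
speedups = [d / vcache_access_s for d in disk_access_s]
```

With a 30-microsecond assumption, the ratios land inside the 200-350x range claimed in the text.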
However, data residing in memory is volatile and must be
protected against loss of power to the memory. To prevent potential
data loss due to power disruptions, V-Cache incorporates both battery
backup and an internal hard disk drive that automatically detect power
abnormalities. In the event of power fluctuation, circuits inside the V-
Cache accelerator sense the loss and automatically switch over to
power from the internal batteries. If the power disruption persists longer
than a user-prescribed period of time (i.e. 10 seconds, 30 seconds, 1
minute, etc.), all data from the V-Cache memory is then saved to the
internal disk drive. Once data is safely saved to the internal disk, the
device will switch off and wait for the power to return as the data is now
protected. When stable power is re-established, data is copied from the
internal disk drive back to the V-Cache memory area and the files may
again be accessed by the applications. There is virtually no risk of ever
losing data with these safeguards in place. A file created on a device
like this is at least as safe as a file created on a regular disk
drive.
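The power-protection sequence described above can be summarized as a tiny decision function. This is an illustrative sketch, not V-Cache firmware; the grace period and return values are hypothetical:

```python
# Minimal sketch of the power-protection behavior described above,
# assuming a user-configurable grace period. Illustrative simulation
# only; the action names are hypothetical.

def on_power_event(outage_seconds, grace_period_seconds=30):
    """Return the action the accelerator takes for a given outage length."""
    if outage_seconds <= 0:
        return "run-on-mains"                 # normal operation
    if outage_seconds <= grace_period_seconds:
        return "run-on-battery"               # ride out a short fluctuation
    # Outage exceeded the user-prescribed period: dump RAM contents to
    # the internal disk, then power off and wait for mains to return.
    return "dump-to-internal-disk"
```

A five-second blip is absorbed by the batteries; an outage longer than the configured grace period triggers the dump-to-disk sequence.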
5.5.1 Using V-Cache With Oracle
In a flat file based application, it is easy to identify the HOT (heavily
used) files and place them on a V-Cache accelerator to derive maximum
and immediate performance benefit. When a database application is
involved, it is a more complex scenario. A thorough understanding of the
database and the read and write pattern of the application is required
before anything should be attempted. Many times the application itself
may not provide room for improvement due to internal locking or waits for
other processing to complete.
For an application to show benefit it must be performing disk access
at a very fast pace and must be writing or reading fresh data. Oracle
performs its own smart optimization by buffering all writes and reads in
the System Global Area. Data is written to the disks only on a “need to”
basis. Over and above this database functionality, the operating system
performs its own buffering. This means there are two sets of buffers to be
scanned before a physical disk access is required. This double-buffering
strategy works well in many applications, adding a second level of
caching on top of the database’s own caching algorithm. Oracle always
does a synchronized write, meaning the operating system buffers are
bypassed for all writes to disk. These buffers are used for reads only if
they are not marked as old or dirty. This is largely true except in some
implementations of Oracle where the operating system buffers are
bypassed completely. For example, Parallel Server bypasses operating
system buffers to ensure read consistency across instances.
Under Oracle, the greatest I/O activity exists with the rollback
segments, redo logs and temporary segments. These files are excellent
candidates for a V-Cache implementation depending on the application.
Oracle data files can be placed on a V-Cache accelerator when there are
a lot of random reads and writes to specific tables.
Ensuing performance benefits can be extremely varied and generally
depend on how much V-Cache memory is configured for the application.
The following paragraphs explain how to use V-Cache accelerators with
various Oracle files and the types of applications that can enjoy improved
performance benefits.
Redo Logs
When an application commits data to the database, Oracle writes the
commit information to the redo log files. An Oracle process called the log
writer performs the task of writing the data to disk from the SGA (System
Global Area). The user data could still be in Oracle’s SGA, but the
transaction is deemed complete. The above scheme allows Oracle to
support high transaction rates by saving only bare minimum data to the
disk. The user data is saved to disk later on by the DBWR process, which
wakes up at predefined intervals.
The following performance improvements can be expected from placing
Redo Logs on a V-Cache accelerator:
• High Transaction Rates -- An application with high transaction
rates will write large volumes of data to the redo logs in a very short time.
Redo logs, when placed on a V-Cache accelerator, can provide dramatic
performance benefits in such a scenario. Performance gains of up to
2,000% have been seen, though it is much more common to see
improvements of up to 80%.
• Transaction Processing -- In a transaction-processing
environment, data is written to both redo logs and rollback segments.
Performance benefits will be greatly enhanced when both redo logs and
rollback segments are placed on V-Cache, particularly for applications
with a high volume of inserts, updates and deletes.
• Decision Support — In a decision support environment, which
typically only reads data, V-Cache accelerators will generally not show a
performance benefit.
Rollback Segments
Oracle stores previous images of data in the rollback segments. When
an application makes changes to data, it is stored in the rollback
segments until the user commits the transaction. All processes read the
previous image of the data from the rollback segments until the
transaction is committed. Oracle does its own buffering of the rollback
segments in the SGA.
However, when large transactions occur, they are written to the disk,
and if other processes need this previous image of the data, it will be read
from the rollback segments. Rollback segments on a V-Cache accelerator
will speed up both the reads and the writes.
Generally, a transaction-processing environment will substantially
benefit from putting rollback segments on a V-Cache accelerator. On the
other hand, a decision support system, which primarily analyzes data, will
typically not show performance benefits.
Temporary Segments
Oracle uses temporary segments to store intermediate files. These
files could be the result of sub-queries or temporary sort files. Data is both
written to and read from the temporary segments by the various Oracle
processes. Small sorts are entirely performed in the SGA, while large
sorts are performed by using disk space in the temporary segments.
The following performance improvements can be expected from
placing Temporary Segments on a V-Cache accelerator:
• Large Sorts / Parallel Queries -- An application performing large
sorts or making heavy use of the parallel query option can show
immediate results with a V-Cache accelerator.
• Transaction Processing -- Transaction processing environments
with complex queries could also benefit from using a V-Cache
accelerator. However, transaction-processing environments with low
volume sorts will generally not show similar benefits.
• Decision Support -- Decision support systems that retrieve or
sort large volumes of data can show immediate results with a V-Cache
accelerator.
Data Files
Oracle stores data from tables and indexes in tablespaces residing in
Oracle’s data files. All data is buffered and stored in the SGA and, as
indicated earlier, will go through the operating system’s buffers as well.
Putting data files on a V-Cache accelerator will provide benefit only in
unique situations. The writes to the data files are done by the DBWR at
predefined intervals and are therefore not an immediate priority.
Excessive reads can be avoided by increasing the number of buffers and
creating the heavily used tables with the CACHE option. This improves
the possibility of the table data blocks being available in the SGA
on most occasions. This technique can be used for smaller tables by
increasing the size of main memory and the SGA.
The following performance improvements can be expected from
placing Data Files on a V-Cache accelerator:
• Large Tables -- Large tables that are randomly read or written at
high frequencies should be placed on a V-Cache accelerator. Such tables
can be created in separate tablespaces set up on the V-Cache
accelerator.
5.5.2 Determining The Need For V-Cache
Oracle applications that have consistently high I/O rates saturating
both the SGA and the operating system buffers will benefit the most from
a V-Cache accelerator. Saturation of the SGA is indicated by the Hit
Ratio, and can be determined by using a monitoring tool like Oracle
Monitor or by looking directly at the V$ tables. High transfer rates to a
particular data file or rollback segment could indicate possible
performance benefits.
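The Hit Ratio mentioned above is conventionally derived from three counters in Oracle's V$SYSSTAT view (db block gets, consistent gets, physical reads). A minimal sketch follows; the sample counter values are made up for illustration:

```python
# Classic Oracle buffer cache hit ratio, computed from V$SYSSTAT-style
# counters. The sample numbers below are hypothetical, not measurements
# from the customer's databases.

def buffer_cache_hit_ratio(db_block_gets, consistent_gets, physical_reads):
    """Fraction of logical reads satisfied without a physical disk read:
    1 - physical_reads / (db_block_gets + consistent_gets)."""
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        return 1.0   # no activity: treat as a perfect hit ratio
    return 1.0 - (physical_reads / logical_reads)

# Example: 90,000 logical reads with 9,000 physical reads -> 0.90 hit ratio.
ratio = buffer_cache_hit_ratio(db_block_gets=30_000,
                               consistent_gets=60_000,
                               physical_reads=9_000)
```

A persistently low ratio under a steady workload is the kind of SGA saturation signal that makes a data file or segment a candidate for V-Cache.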
The operating system should be monitored for I/O activity, and utilities
such as sar and iostat can be used to analyze operating system buffer
activity. If the operating system buffers are large enough, there will be
more logical than physical accesses. If the database (Oracle’s monitor)
shows a high number of physical reads but the operating system does
not, then the chances of obtaining any meaningful benefit from a V-Cache
accelerator are generally low. On the other hand, an application that
causes physical reads both at the database level and at the operating
system level can benefit substantially from V-Cache’s performance
attributes. The exception to this is Parallel Server, which always
bypasses the operating system buffers.
An application with intermittently high I/O rates for short periods of time
will not generally show much performance improvement with V-Cache,
because the database could work on flushing the buffers during the lean
periods. I/O transfer rate, rather than how much I/O is performed,
determines the need for a V-Cache accelerator. The reads and writes
must also slip through the buffering schemes before any benefits can be
seen.
Raw Devices
V-Cache accelerator can be configured as a raw device under Unix.
Data written or read from any disk configured as a raw device always
bypasses the operating system buffers. This will eliminate the overhead of
buffer management. Raw devices are good for bulk volume writes. A
process shows good I/O bandwidth when large volumes of data are
written at one time, because it is now bypassing the buffering scheme.
The reverse is true with processes that write data at random in
small bits and pieces. These processes typically perform poorly because
they have lost the benefit of using the buffers and have to wait for the disk
to respond. For these environments, using a V-Cache accelerator as a
raw device under Oracle will increase the throughput, because the data
will move directly from Oracle’s SGA to V-Cache.
RAM Disks
RAM disks have been available with every operating system for a long
time. A RAM disk is a part of main server memory that can be designated
as a drive, and the system will use that area to store files or any other
user data. The RAM disk will cost the same as main memory and can be
good for storing small volumes of data or doing some high-speed sorts in
a flat file environment. However, the main disadvantage of using a RAM disk
is that the CPU now performs the I/O rather than delegating the work to the
I/O controller. A process performing I/O with RAM disks will run very fast
but it will start loading the CPU with read and write requests, which can
significantly slow down any other processes. The whole system could
appear sluggish, especially when large data volumes are written to the
RAM disk, because the workload on the CPU has increased.
Placing Oracle’s redo logs, rollbacks or temporary segments on RAM
disks could potentially slow the entire system down in a heavy
transaction-processing environment. Also, the effect of RAM disks on
SMP (Symmetric Multi Processing) machines with multiple CPUs must be
evaluated. An SMP environment with multiple CPUs and RAM disks
could potentially saturate the system bus and cause a rapid slow down of
the whole system in some hardware architectures.
Herein lies another advantage of the V-Cache accelerator: the ability to
share contents across multiple servers in clustered scenarios, which is not
possible with internal server memory.
Also recall that anything written to RAM disks is volatile, and servers
do not usually come with the automatic battery backup and disk backup
capabilities of the V-Cache accelerator. If power is lost or the machine is
rebooted, data will be lost unless there are other planned alternatives.
From a hardware point of view, a RAM disk will use up more slots in
the backplane of the computer, because more memory boards will have to
be installed, thereby limiting the number of slots available for other
purposes.
5.5.3 V-Cache Oracle Summary
When intelligently applied, V-Cache accelerators can provide
outstanding performance improvements in Oracle environments. DBAs
can obtain much faster data access times by placing an application’s
“HOT” files on a V-Cache accelerator as opposed to traditional disk
drives. V-Cache does not replace disk drives or cached arrays, but is used
in key areas to resolve I/O bottlenecks.
5.6 Summary
From an MTI hardware perspective, I see the New York City Data
Center as having major potential application bottlenecks going
forward if the current pools-of-storage-hardware approach is retained after
the move to Colorado. We cannot speak for other vendors, but I doubt that
there are any applications that could not be moved to, or would not benefit
from moving to, a core switch based SAN. Moving to a core switch 2 GIG
SAN will increase overall performance in and of itself.
We do suggest that DoubleClick start the transition to newer SAN
hardware in Colorado at their earliest convenience. This would permit
them to move to newer, more reliable, and faster technology as they transfer
their data and applications to Colorado.
DoubleClick should also consider installing new Wintel platforms in
Colorado to replace some of the obsolete Compaq ProLiant
systems that they would otherwise move. I believe the benefits in speed,
flexibility, storage capacity, reliability and lower maintenance costs
would more than justify these actions.
6.0 Disaster Recovery
6.1 Current
The customer currently backs up to DLT7000 tape in the two tape
libraries. In case of a disaster in the New York Data Center, their plan is to
recover NYC operations at the Colorado Data Center.
Data is currently stored offsite by Iron Mountain Incorporated
(http://www.ironmountain.com/Index.asp).
Once the move to Colorado is complete, DoubleClick may want to look
at a new disaster recovery plan.
6.2 Proposed
The current move to Colorado leaves 3 possible alternatives.
1. Off Site Disaster Recovery firm (i.e. Comdisco or IBM
Business recovery services)
2. Create another DoubleClick-owned data center, use it in
the event the Colorado center fails, and do replication using
software (Fujitsu TDMF) or a SAN appliance/software
solution (DataSentry 2).
3. Or simply use an offsite storage facility for tapes, and buy
computer time from one of the many disaster recovery
centers run by various vendors.
6.2.1 Off Site Disaster Recovery
This is my first recommendation. Purchasing services from any
reputable business recovery service is an expensive proposition;
however, building a DoubleClick-owned duplicate data center is even more
expensive. Doing without some kind of disaster recovery is a non-starter.
If DoubleClick loses its data center, a significant portion of its revenue
will be lost, making recovery even more difficult.
This solution becomes feasible if DoubleClick does not wish to
invest in the hardware to duplicate the Colorado Data Center.
6.2.2 Create a DoubleClick-owned disaster recovery data center.
The first step in this would be to decide if a fully duplicate center is
necessary or if certain key application capabilities would suffice.
If DoubleClick goes this route, they can customize their solution, and
perhaps come up with a more viable and cheaper solution than an off site
Business Recovery Firm. There are also some other advantages.
DoubleClick could have empty tape libraries and a duplicate SAN in their
recovery location. The benefit is if the disk drives in the primary Colorado
data center location survive a disaster, these drives could be plugged into
empty MTI SAN Arrays at the Disaster Recovery Data Center. This way
the data could be recovered without a tape restore.
Depending on the scale of the duplicate data center, full time data
replication over the WAN could be implemented. That of course carries
its own cost in WAN infrastructure costs.
If DoubleClick decides they can live with a little extra time involved in
getting back online, then a scaled-down (critical systems only) solution
using tape restore might be the best option.
6.2.3 Offsite Storage Facility
Many reputable companies sell fire- and disaster-tolerant (notice no
one ever says “proof” anymore) facilities that are environmentally
controlled for tape backup storage. This is not technically disaster
recovery; it falls under data preservation.
This is probably the least expensive solution, and is the minimum
that DoubleClick should employ. The offsite storage would provide some
immediate protection from unhappy employees, computer viruses, and
less devastating forms of data loss.
In the event of a complete data loss, DoubleClick would still be left
with finding a data center and systems to restore to. But this solution,
when combined with a contract with a disaster recovery computer
systems provider, has been found to be cost effective.
DoubleClick has stated an intention to at least retain this level of data
protection. They are happy with Iron Mountain’s services and may seek to
continue with them after the expansion/relocation to Colorado.
6.3 SUMMARY
It is our recommendation to take a best-of-both-worlds approach. Buy
enough hardware to replicate all the critical functions of the Colorado Data
Center (once the New York Center is closed). Break these critical
functions into two categories: first, mission critical; second, mission
essential.
Mission Critical. Have those must-have functions (revenue-
generating, business continuance) and databases up and replicated
from the Colorado data center to the disaster recovery data center. This
would allow an immediate continuance of corporate business functions in the
case of a disaster.
Mission Essential. Things like Microsoft Outlook. E-mail is essential to
business nowadays, but losing use of your current inbox and files can
be worked around and is not a showstopper. Designate these non-critical
applications and allow them to remain offsite on tape. Plan either to
acquire hardware to run them when the need arises (natural disaster) or
to re-allocate the hardware (on the disaster recovery site) to
accommodate the must-have functions.
In this manner DoubleClick could stay in business, with critical
services intact, and restore those non-critical capabilities as
time/resources permit.
7.0 Control / Virtualization / Data Migration
No perfect product currently exists to handle all the management of a
SAN. For the future, software vendors promise packages that can
perform hardware configuration, backups, hardware monitoring,
performance monitoring, and system administration.
For now there are a lot of vendors trying to pull this off. Some vendors
claim to have this functionality, but when you read between the lines, or
ask hard questions, it becomes obvious that they do not.
The following recommendations reflect our best opinion on software
that exists today and will allow reliable control, monitoring, system
administration, and data migration.
7.1 MSM / FUSION
MSM & FUSION are MTI products for controlling MTI’s family of
hardware SANs. Every vendor has its own version of hardware control
software.
1. MSM – MTI SAN Manager (MSM) is a Web GUI specifically for
administering the MTI S200 & S400 series of arrays. It allows for LUN
masking, LUN initialization, array performance monitoring, and other
housekeeping/administrative functions.
2. FUSION – Fusion is MTI’s dedicated graphical user interface for the
Vivant Vxx series Enterprise RAID systems. It performs all the same
functions as DAM, but is able to do such things as array initialization and
array re-configuration in the background. Some operations (LUN
creation) in DAM require a controller reset, which would affect other hosts
using that array. FUSION & the Vivant Vxx have no such limitation: the
array can be re-configured on the fly with no impact to any other hosts
using it. (As DoubleClick upgrades their V20/V30 series to the S2xx
series, Fusion will no longer apply.)
7.2 Veritas NBU
Currently DoubleClick is using Veritas NetBackup to manage their
backups and tape libraries. There is currently no SAN-wide control tool
that addresses backups of the size and complexity of those run at
DoubleClick. Veritas is one of the best software backup automation and
control suites available. We suggest no change here: continue using
Veritas to control backups.
Figure: NETBACKUP CONSOLE OVERVIEW
Single console: a single-system view of all NetBackup servers, covering
backup schedules & classes, volume & tape management, remote device
management, and a consolidated backup catalog.
7.3 Veritas SANPOINT / Disk Management
Veritas SANPOINT / Foundation Suite allows a more flexible control of
the customer’s storage and allows for retention of some of the legacy
storage.
It will allow storage to be grown, shrunk, and modified dynamically.
Any storage on site may be used (JBOD, RAID, internal disks),
regardless of vendor.
Figure: SAN Virtualization with VxVM – non-disruptive on-line storage
management. Application servers run VxVM (NT beta, Solaris, HP/UX)
across a fabric of SAN switches, providing:
•Disk group ownership
•DMP path failover & load balancing
•Remote Mirror & SW RAID
•Performance optimization
•Dynamic grow/shrink of logical volumes
7.4 Fujitsu (Softek) SAN View
Simplify the management of multi-vendor SAN devices
Softek SANView is a proactive Storage Area Network (SAN) fabric
manager that simplifies SAN troubleshooting. It monitors the health and
performance of your SAN-connected devices so you can diagnose and
avoid outages. Through automatic discovery, the software travels into the
SAN, identifies each of the various devices and then draws the topology
of your SAN.
See your SAN connections
Equipped with this graphical representation, you can quickly view
interoperability and connectivity of the different SAN devices. Softek
SANView then monitors all vital functions and reports on the health of
your SAN, proactively alerting you to issues before they impact
application availability.
Without the clear picture of SAN connectivity that Softek SANView
provides, system administrators are operating in the dark: problems
can't be anticipated, and when problems do occur, there is no clear
place to begin troubleshooting.
Virtualize Your SAN Infrastructure
Unique in the industry, Softek SANView's synergy with Softek
Virtualization enables customers to virtualize and visualize their SAN
infrastructure.
7.4.1 Fujitsu (Softek) Storage Manager
Fujitsu SANView is a SAN monitoring and viewing tool. To add
control functionality, you must add the Storage Manager option.
• Centralize storage management for mainframe and open systems
—Centrally monitors, reports and manages storage across the
enterprise from a single product in a single console. Simplifies and
manages more storage with fewer resources.
• Improve productivity through automation—Monitors and automates
actions (archive, delete, provision) to allow management by exception.
Lets you set storage policies across servers or groups of servers.
Schedules actions to run based on pre-defined criteria. Eliminates the risk
of human error or application downtime.
• Manage multi-vendor, heterogeneous storage with one solution—
Hardware and software independence eliminates vendor lock-in or
proprietary solutions. Supports all storage types from all vendors including
DAS, SAN and NAS.
• Define business-process views of storage—Views data from an
application perspective so decisions can be based on business
requirements, not technology alone. Group servers by your logical
business model. Manages your business, not your servers.
• Automate storage provisioning—Maximizes uptime by automatically
provisioning storage to applications that need more space. Users never
experience application downtime due to “out-of-space” conditions.
Know what you have, so it can be effectively managed
Increase utilization rates through threshold monitoring
Analyze storage across all media and easily maintain your desired
storage state
Increase productivity through automation
7.4.2 Fujitsu (Softek) TDMF
Softek TDMF, Mainframe Edition simplifies storage migration for
mainframe data. It ensures that an organization's revenue-generating
applications suffer zero interruption due to the addition of new storage.
Without it, the addition of new storage might require a five- to six-hour
interruption before being able to access mission-critical business
applications. DoubleClick was evaluating this product during the week of
13 – 17 January 2003. Initial results moving data from NYC to Colorado
over the OC3 pipe were promising.
Vendor independent
Completely transparent to end-users, operating on any mainframe-
compatible equipment regardless of manufacturer. Such vendor
independence allows you to reduce your maintenance costs by replacing
expensive hardware/software migration and/or replication solutions.
Ensure business continuance with Point-In-Time Copies
Softek TDMF, Mainframe Edition features a base component with add-on
modules that extend product functionality. The first module, Offline
Volume Access (OVA), enables the processing of Point-In-Time copies
using standard in-house utilities such as DFSMSdss or FDR to maximize
the availability of critical production data.
Immediate Return on Investment
Maintain continuous data availability during data movement and
backup
Replace scheduled outages with business continuance
Satisfy service-level agreements
7.5 MTI Data Sentry 2
MTI’s DataSentry2 product is an appliance-based data management
solution. Typical applications include remote disaster recovery; creation of
off-line volumes for development, business operations, and data center
maintenance usage; and remote server-free, zero-impact backup.
Data Sentry2 solutions include:
- QuickSnap: Point In Time Images
- QuickDR: Short & Long Haul Replication
- QuickCopy: Full Point In Time Image Volumes
- QuickPath: Server Free Backup
DataSentry2 Console:
The DataSentry2 console provides single-point control of Quick functions
for DataSentry2-enabled systems throughout the enterprise. From a
central SAN console the administrator can use the drag-and-drop GUI to:
create point-in-time images, create a mirror, extend volumes, define
replication relationships, kick off QuickSnap, quiesce databases, start
QuickPath backup, and restore data after disasters.
7.5.1 QuickSnap (Point In Time Images)
QuickSnap is used to create a DAV (Data Availability Volume). This
virtual volume represents the source volume at a specific point in time.
Once the QuickSnap is taken, the resulting DAV can be mounted to a
host, and subsequent writes to the source volume cause a physical
copy of the older data from the source to be kept aside for use during
access of the QuickSnap DAV. This scheme uses storage space
efficiently, since only the writes since the last QuickSnap have to be kept
rather than an entire physical copy of the volume. The facility is often
used for periodic recording of an image for later roll-back/restore in
case of a user-induced write error or accidental delete. Since creating a
QuickSnap DAV happens instantaneously, the technology eliminates the
need for an incremental resynchronization: it will always be faster
to take a new QuickSnap than to perform a traditional incremental
resynchronization operation.
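As a rough illustration of the copy-on-write scheme described above, the following Python sketch (our own simplification, not MTI's implementation; the class and method names are invented) preserves only the pre-snapshot contents of blocks that change after the snapshot is taken:

```python
# Illustrative copy-on-write snapshot logic (hypothetical, not MTI code).
# A snapshot saves only the *old* contents of blocks that change after it
# is taken; unchanged blocks are read directly from the live volume.

class Volume:
    def __init__(self, nblocks):
        self.blocks = ["" for _ in range(nblocks)]
        self.snapshots = []

    def snapshot(self):
        snap = {}                  # block index -> preserved old data
        self.snapshots.append(snap)
        return snap                # instantaneous: nothing is copied yet

    def write(self, idx, data):
        # Before overwriting, preserve the old block in any snapshot
        # that has not already saved a copy of it.
        for snap in self.snapshots:
            if idx not in snap:
                snap[idx] = self.blocks[idx]
        self.blocks[idx] = data

    def read_snapshot(self, snap, idx):
        # Snapshot view: preserved old block if it changed, else live block.
        return snap.get(idx, self.blocks[idx])

vol = Volume(4)
vol.write(0, "v1")
dav = vol.snapshot()               # point-in-time image (the "DAV")
vol.write(0, "v2")                 # triggers copy-on-write of block 0
print(vol.read_snapshot(dav, 0))   # "v1": the pre-snapshot contents
```

This is why the space cost of a snapshot is proportional to the writes made after it, not to the volume size.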
7.5.2 QuickDR (Data Replication & Migration)
MTI’s DataSentry2 supports creation of remote site hot-standby
volumes for use during maintenance cycles and local disasters or to
migrate data over short or long distances.
Using DataSentry2, a replication set is established within the local
storage system. Next, a replica is created inside the remote storage
system using either the internal or external (in this case OC3) network.
This one-time copy may take a while, as it is a complete physical copy.
During replication set-up, you choose a time of day or watermark
(number of new writes) to establish a trigger for automatic initiation of
update. When the trigger occurs, a QuickSnap is automatically taken and
the updates are sent over the long-distance communication channel to
update the replica with the last set of changes. In this way the replica
(remote) volume is kept synchronized.
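The watermark trigger described above can be sketched as follows. This is a hypothetical simplification in Python, not MTI's QuickDR code; the WATERMARK value and all names are invented. After the initial full copy, only the blocks written since the last trigger are shipped to the remote replica:

```python
# Illustrative watermark-triggered asynchronous replication (hypothetical).
WATERMARK = 3   # number of new writes that fires an update cycle

class ReplicationSet:
    def __init__(self, source):
        self.source = source
        self.replica = dict(source)   # the initial (slow) full copy
        self.dirty = set()            # blocks written since the last sync

    def write(self, block, data):
        self.source[block] = data
        self.dirty.add(block)
        if len(self.dirty) >= WATERMARK:
            self.sync()               # the trigger fires automatically

    def sync(self):
        # A point-in-time set of changes is snapped and sent to the replica.
        for block in sorted(self.dirty):
            self.replica[block] = self.source[block]   # send over the link
        self.dirty.clear()

rs = ReplicationSet({0: "a", 1: "b", 2: "c"})
rs.write(0, "a2")
rs.write(1, "b2")
print(rs.replica[0])   # still "a": watermark not yet reached
rs.write(2, "c2")      # third write reaches the watermark, sync fires
print(rs.replica[2])   # "c2": replica now synchronized
```

The same structure applies whether the trigger is a write count or a time of day; only the condition in `write` changes.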
When replication is complete, the customer can choose when to break
the volume and use the replicated volume in Colorado to take over
operations. QuickDR is used in this fashion either to migrate data over
long distances, or to keep a replica copy of critical data, either locally or at
a remote data center for disaster recovery.
7.5.3 QuickCopy (Full Point In Time Image Volumes)
When a QuickCopy is issued to create a DAV, access to the DAV is
available to a host only after the physical copy process completes. (In this
way it differs from a QuickSnap, where data is available immediately.)
Multiple QuickCopies can be taken from the same source volume and can
be mounted to any host in the data center SAN or replicated to another
site for use. The advantage here is that, because it is a complete copy,
access time is the same as for the original volume. This is useful for
making copies of databases or applications to test changes without
corrupting or changing live data.
7.5.4 QuickPath (Server Free & Zero Impact Backup)
For high-end storage architectures, one can create off-site rotating
storage (on-line) disaster images, keeping them constantly refreshed in
case of a need to quickly switch to the standby site. If the remote backup
image is kept on-line, then it can be updated frequently, and the time to
bring up the remote site can be minimized since lengthy tape restorations
can be avoided. This technology uses QuickSnap features to allow the
customer to complete zero-impact backups.
7.5.5 DataSentry2 Migration Considerations
Existing storage systems can be field-upgraded to support
DataSentry2. Systems that have been previously configured by the host
operating systems can be imported into DataSentry2. This allows for a
seamless migration to DataSentry2. The imported volumes act just like
newly created virtual volumes. All “Quick” functions can be run against
the imported volumes.
7.6 Software Review
Currently DoubleClick has the typical problems you would expect to see
in a data center that has faced its challenges. The solutions listed
below represent the best of what is actually available, vice “vaporware”.
1. Hardware Control (of MTI, Hitachi, or other SAN): MSM (MTI),
or the native hardware GUI if they go with another vendor’s SAN
(e.g. Hitachi).
2. BACKUP – Stick with Veritas NBU (NetBackup).
3. Storage Virtualization – For systems hooked up to an MTI Vxx
class array, DataSentry2 or Fujitsu Storage Manager provides much
of this functionality, which currently does not exist in the data
center.
4. Fujitsu SANVIEW – when combined with Fujitsu Storage
Manager, allows for not only control but also viewing of your SAN,
down to the individual components and firmware that make it
up.
5. VERITAS SANPOINT / VxVM (Virtualization & Viewing) – allows
integration of legacy storage and dynamic re-allocation of storage.
It is broadly analogous in functionality to Fujitsu SANVIEW and
Fujitsu Storage Manager combined.
6. BROCADE WEBTOOLS (SAN ADMINISTRATION) – The
Brocade web-browser Java GUI comes with Brocade switches and
is Brocade’s native switch control / monitoring / configuration
software. It is a Java web-based GUI and is quite flexible and
robust. DoubleClick already has this.
All of the above software allows for running in a consolidated mode, a
distributed mode, or in some cases both. In consolidated mode, all the
above software could run under Windows on 1 or 2 NOC consoles to allow
for system monitoring and control. In distributed mode, the software could
be set up to be accessed remotely or from personnel desktops located
around the Colorado Data Center.
7.7 SUMMARY - ENTERPRISE STORAGE
My recommendation for data migration is to go with either Fujitsu
TDMF (pending the outcome of onsite testing) or DataSentry2 to migrate
the data to Colorado. For now you could use TDMF or DS2 to migrate
your data, and proceed with SANVIEW and control software
improvements after the move is complete and your systems are
running in Colorado. If you implement V400’s and Brocade 12000
core switches in Colorado, you can take a phased approach to improving
your SAN control and consolidation software.
Once the data is migrated to Colorado, the customer may want to
look at a centralized control and monitoring system for all their storage.
Fujitsu SANVIEW with STORAGE MANAGER is one option and may be
the best. If DoubleClick goes with TDMF to migrate their data to
Colorado, then standardizing on Fujitsu for the SAN software makes
sense.
The alternative would be SANVIEW to view and monitor the SAN and
DataSentry2 to allow control and virtualization of all the onsite
storage. That could include any direct-attach (JBOD) storage that
remains.
The nice thing about DataSentry2 is that it not only virtualizes all the
storage you wish to put under its control, but also brings SnapCopy,
Zero Impact Backup and Disaster Recovery (via replication) to the table.
8.0 Special Applications Considerations
8.1 Database - Considerations
Customer database considerations for a SAN implementation are fairly
straightforward. Oracle, Sybase, and SAP all have common areas that
can benefit from performance tuning, including the underlying
database system’s transaction logs, journal files, and temp space. To
optimize the performance of transaction logs, redo logs, journal files, and
temp space or rollback segments, place these files or objects on fast
SAN-RAID arrays configured for RAID 0+1; this will yield the best
performance. Also important is the need to isolate database storage from
other activities, placing it on arrays and controllers not being used for
storing other active data tables.
SAN-RAID (i.e. MTI V400 / S240 etc.) is designed for storing data in a
cost effective centralized manner, with performance being a by-product.
Since redo logs, transaction logs, temporary tables, journal files, and
rollback segments tend to be I/O bound, any steps that can be taken to
improve I/O performance will be of great benefit to DoubleClick.
MTI’s V-Cache is a SAN-enabled solid-state disk. Like direct-
attached solid-state disks, it is an I/O device that can be used to store and
safeguard data. Solid-state disk technologies like V-Cache use random
access memory (RAM) to store data, thus eliminating the access
latency of mechanical disk drives. The near-instantaneous response and
the ability to place a RAM disk within a SAN allow for quantum
improvements in database applications by moving those small but I/O-
intensive segments of the database off of traditional mechanical disks.
Traditional storage is measured in dollars per megabyte, while solid-
state disk and V-Cache should be measured in dollars per I/O. In
other words, you could throw a lot of disks at an I/O problem and still not
fix the underlying problem. On the other hand, if you target those I/O-
bound files with a V-Cache or solid-state disk, you gain a substantial
performance increase.
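To illustrate the dollars-per-I/O argument, here is a small worked example in Python. Every price and IOPS figure below is invented purely for illustration and does not reflect actual hardware pricing:

```python
# Purely illustrative arithmetic (all prices and IOPS figures are invented):
# why I/O-bound files are cheaper to fix with solid-state storage even
# though its cost per megabyte is far higher.

# Hypothetical mechanical drive: cheap per GB, limited IOPS per spindle.
disk_cost, disk_gb, disk_iops = 2000.0, 73.0, 150      # one drive
# Hypothetical solid-state device: expensive per GB, very high IOPS.
ssd_cost, ssd_gb, ssd_iops = 50000.0, 8.0, 20000       # one unit

need_iops = 10000   # an I/O-bound redo log's requirement

drives_needed = -(-need_iops // disk_iops)   # ceiling division
print("disks to reach %d IOPS: %d ($%.0f)" %
      (need_iops, drives_needed, drives_needed * disk_cost))
print("disk $/IOPS: %.2f" % (disk_cost / disk_iops))
print("ssd  $/IOPS: %.2f" % (ssd_cost / ssd_iops))
```

With these made-up numbers, dozens of spindles are needed to reach the target I/O rate, and the dollars-per-I/O of the mechanical array comes out several times higher than the solid-state device, even though the disks win decisively on dollars per megabyte.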
Solid-state disks/V-Cache and SAN-RAID complement each other,
particularly in database and SAP environments. SAN-RAID is used for
tables, indices, and some journal files with the appropriate RAID level,
while solid-state disk is used for redo logs and key tables requiring very
low latency and high performance. Performance involving temporary and
system table segments can also benefit from being placed on an SSD or
a SAN-RAID array configured as RAID 0+1.
DoubleClick’s current Oracle database could benefit from some
of these tuning considerations (i.e. RAID 0+1) when a centralized SAN is
implemented. However, there may be a cost/benefit consideration in
doing hardware performance improvements like a solid-state disk or
V-Cache.
8.2 Microsoft Exchange 2000 - Considerations
There are no special considerations for employing Exchange 2000 on
a SAN. However, DoubleClick may want to consider some kind of
software snapshot technology for capturing point-in-time backups of
Exchange.
Software snapshot technology is implemented through a copy-on-
write technique that takes a picture of a file system or data set. The
process of taking a snapshot is almost instantaneous, and data is
moved to a backup medium at the block level. Any block that changes
while this snapshot image is being backed up is moved to cache. This
important process allows the file system to remain in production while
also enabling the snapshot to preserve its exact moment in time by
flushing the cache to the backup medium. In the event of a restore,
data can be restored more efficiently at the file level instead of the
volume level. This is made possible by the data mapping that is
preserved along with the backup image.
Veritas Netbackup supports this functionality in Microsoft Exchange
(API level integration). Also MTI’s Data Sentry 2 SAN appliance can
provide this capability in a robust hardware appliance.
9.0 Summary - Roadmap
We are submitting two basic topologies as possible candidates for
DoubleClick’s transition to an integrated SAN.
9.1 Proposal Overview
The proposal is based on the following assumptions / recommendations:
- Decommission all 3600 arrays.
- Leave S240 (Vivant Conversions) in NYC to support QA & DEV
and replace 3600’s.
- Transition all high throughput Disk I/O to 2 GIG operations.
- Leave backups on the 1 GIG SAN and migrate to the 2 GIG
architecture as necessary, as time and $$ permit, after the relocation.
- All Brocade switches in Colorado to be installed in core switch
fabric.
- Recycle Ancor switches into meshed fabric to support remaining
systems in NYC.
- Upgrade Vivant V20/30s to 73GB drive technology (as necessary)
to gain space and eliminate older JBODs and/or NetApps.
Chap 9 - Figure #1: Preferred SAN Config
12 x V400 in 3 x 70” cabinets cross-connected to 2 x Brocade 12000s
(56TB total raw, 47TB RAID 5); 3 management processors & 4 V400s
(146GB x 32) per cabinet; high-speed Ethernet trunk. Attached: Wintel
systems, 2 x L700 tape libraries, the 3500/4500/E450 Sun farm, and the
6500 primary Suns.
NOTE: All 1 GIG adapters for backup retained and built into a Backup
zone on the 12000s. New 2 GIG HBA’s added and dedicated to disk
traffic only. Wintel systems could use 1 or 2 GIG HBAs as I/O dictates.
Allows for the largest re-use of 1 GIG HBAs on hand. Add V-Cache later
if more database performance is required; place hot files and index files
there.
Topology #1
Topology #1 starts with the installation of 2 director-class Brocade
switches (Brocade 12000s). Unix systems (software-clustered or stand-
alone) and Windows systems (clustered or stand-alone) are connected
directly to the Brocade 12000s.
Tape libraries are connected to the core switches and are built into a
backup zone on the switches, along with the 1 GIG HBA’s that support
backup operations. In this topology, migration of the backups to the 2 GIG
topology is easier to accomplish in phases, but disk and tape I/O are
mixed on the core switches.
Three 70” cabinets of V400’s (12 V400’s with 32 drives of 146GB per
V400) provide 56 TB of raw storage in the new 2 GIG technology (this
effectively replaces 26 cabinets of 3600s). Each V400 cabinet would have
1 management processor for phone-home and equipment monitoring.
Money can be saved by not buying the Brocade switches that normally
come with the V400’s (typically 2 switches per V400) and instead direct-
connecting them to the Brocade core switches. Add 2 GIG HBA’s to
systems requiring storage and build a 2 GIG zone in Brocade.
Install the V400’s in Colorado and migrate data with DataSentry2 or
Fujitsu TDMF.
Add the V-Cache solid-state database accelerator only if you require
more speed than the V400 upgrade will produce.
A management console in the NOC can monitor the SAN via MSM
(MTI-supplied) or another product (i.e. Fujitsu SANVIEW). Backups are
via a direct fibre-attached tape library. If desired, multi-protocol (IP over
Fibre) support can be run to move some of the backup traffic for non-
SAN-capable systems off the network. Virtualization tools (beyond what
comes with the V400), like MTI DataSentry2 or Fujitsu SANVIEW with the
Storage Control option, can be added after the move as money permits.
Chap 9 - Figure #2: Alternate SAN Config
Same basic layout as Figure #1: 12 x V400 in 3 x 70” cabinets cross-
connected to 2 x Brocade 12000s (56TB total raw, 47TB RAID 5); 3
management processors & 4 V400s (146GB x 32) per cabinet; high-speed
Ethernet trunk; Wintel systems and the Sun 3500/4500/E450 farm and
6500 primary Suns attached. 2 x L700 libraries and the current backup
infrastructure are maintained, with 2800 Silkworms as edge switches.
NOTE: All 1 GIG adapters for backup retained and connected as in the
current backup topology. Silkworms connected to the 12000’s to allow a
phased transition of the backups to 2 GIG speeds. Wintel systems could
use 1 or 2 GIG HBAs as I/O dictates. Segregates backup & disk I/O for
better disk performance. Add V-Cache later if more database
performance is required; place hot files and index files there.
Topology #2
Topology #2 is basically the same as Topology #1. The main
difference is that edge switches (Brocade 2800’s) are used for the
backups. In effect the backup topology remains almost unchanged,
except that the Brocade 2800s are connected to the Brocade 12000 core
switches via inter-switch links (ISLs).
This has the advantage of totally segregating Disk and Tape I/O
operations, while allowing for the staged movement of tape operations
over to 2 GIG in a controlled fashion, as time and money permit.
All other advantages and functionality are as per the description for
Topology #1. I prefer this one, as it keeps the tape system I/O unchanged.
There will be enough changes made during the move; this approach
allows the tape backup architecture to be modified later, after the dust
settles from the Colorado move.
9.2 Roadmap
• Start with a director switch (SANs are a lot like networks, so you
need a core to start from)
• Add 2 GIG HBA’s only to systems where 2 GIG throughput is
necessary (i.e. high I/O servers SUN 4500 / 6500 etc.) Use 2 GIG
HBA’s for Disk I/O.
• Bring new enterprise storage into Director SAN (This would allow
for a more centralized storage plan going forward vice the DAS and
small SANs currently installed)
• Integrate tape/backup fibre environment into director (via ISL Inter
Switch Links)
• Install multiprotocol drivers for IP/SCSI (this can be done after
relocation). This allows you to move IP backup traffic off the
Ethernet network and onto the fibre backbone. (It can run
concurrently with FCP/SCSI.)
• Determine host qualification for the SAN and its requirements, i.e.
backup, replication, etc. (See 9.4 Transition Points for guidance.)
• Connect new hosts (SAN candidates) to the director switch (12000)
or NYC Ancor fabric as necessary.
• Document the host’s location; label fibre connections; record RAID
type, application, WWN, volume-mapping table, and replicated
volume. (See Appendix 6, SAN Best Practices.)
• Use volume mapping / hard zoning to ensure data security.
• If the host is due to be replaced, replication can be used to
maintain uptime by replicating data to the new host and its storage.
(See 9.3 Data Migration.)
Storage and hosts can be added on the fly in a non-disruptive
manner in this environment.
• You can scale to more storage by adding another core switch or
edge switches, and continue growing your storage using tools like
Veritas SANPoint Foundation Suite, Fujitsu SANVIEW, or MTI
DataSentry2.
9.3 Data Migration (DAS-SAN)
In moving data from one storage system to another (as you eliminate
JBODs or migrate to Colorado), there are going to be several issues: the
time window available, the impact to online resources during the move,
and the reliability of the data after the move. For example, some data may
need to remain online 24x7 during the move, while other data can be
offline during the migration process.
The migration process is fairly straightforward for systems that do not
need 24x7 availability. Each of these systems can be connected to the
SAN via a host bus adapter. Storage can be assigned to that host while
maintaining its direct attached storage. The process at that point is a
simple disk-to-disk copy/backup or mirror operation. The storage LUN
allocated from the SAN merely looks like another local disk. The
applications can be stopped, the data copied to the SAN, the application
re-started and tested. The advantage to this is your data is still on your
old DAS should something go wrong with the transition. There have
been instances during migrations where time constraints, or unforeseen
problems have caused customer sites to fail back to their DAS, and this
is a nice option to have. Of course backing up the data prior to the
migration operation is a nice added bit of insurance.
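The stop/copy/verify/restart sequence described above can be sketched as follows. This is a hypothetical illustration in Python; the file paths are invented, and a plain file copy stands in for the real LUN-to-LUN operation. Note that the original direct-attached copy is deliberately left in place as the fallback:

```python
# Illustrative offline DAS-to-SAN migration (hypothetical paths and
# mechanism). The DAS copy is kept untouched so a failed cutover can
# be reversed by repointing the application at it.

import hashlib
import shutil

def checksum(path):
    """Incrementally hash a file so large volumes do not need to fit in RAM."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def migrate(das_path, san_path):
    # 1. The application has already been stopped at this point.
    shutil.copy2(das_path, san_path)           # disk-to-disk copy
    # 2. Verify the copy before the application is repointed at the SAN.
    if checksum(das_path) != checksum(san_path):
        raise RuntimeError("verification failed: fall back to the DAS copy")
    # 3. The DAS data is deliberately NOT deleted.
    return san_path

# Example: migrate one data file between two mount points.
with open("/tmp/das_data", "wb") as f:
    f.write(b"application data")
print(migrate("/tmp/das_data", "/tmp/san_data"))
```

The verification step is what makes the fallback meaningful: the application is only restarted against the SAN copy once it has been proven identical to the original.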
Where data needs to be 24x7, the procedure is similar. There would
still be a need for a small maintenance window to install host bus
adapters and drivers into the host being moved to the SAN. After that
the host could be brought back online. After the host is online a
mirroring application (i.e. Veritas Replication) or FUJITSU TDMF or MTI
DataSentry2 could be used to replicate the data from the DAS storage
to the SAN storage while allowing the application/host to remain online.
DoubleClick will have to weigh the benefits and impact of the downtime
required for migration using a direct copy vs. replication, and decide
which is appropriate on a case-by-case basis.
9.4 Transition Determination Points (DAS-SAN)
A basic rule set should be utilized by the storage administrator / system
administrator to determine a server’s or application’s qualification for SAN
placement in your environment. These should include, but are not limited
to, these basic rules.
1. Does the application require high availability? (yes/no)
2. Does the application have high growth expectations, i.e. dynamic
storage needs? (yes/no)
3. The existing server / application has reached its end of life and is
scheduled to be replaced by newer equipment.
4. The application has high performance requirements (video
streaming, imaging).
5. The server / application has a storage need of 100GB or more.
6. The server is able to benefit from the Fibre Channel SAN
architecture (no workstations: SPARC 5/10/20, Ultra 1, etc.).
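The rule set above can be encoded directly so the check is applied consistently across hosts. The following Python sketch is illustrative only: the rules and the 100GB threshold come from the text, but the field names and data structure are invented:

```python
# Illustrative encoding of the SAN-qualification rules (hypothetical
# field names; the rules themselves are from section 9.4).

def qualifies_for_san(host):
    """Return (qualifies, reasons) for a host described as a dict."""
    # Rule 6 is a disqualifier: workstations do not benefit from FC SAN.
    if host.get("workstation"):
        return False, ["workstation-class hardware"]
    reasons = []
    if host.get("high_availability"):          # rule 1
        reasons.append("requires high availability")
    if host.get("high_growth"):                # rule 2
        reasons.append("dynamic storage needs")
    if host.get("end_of_life"):                # rule 3
        reasons.append("scheduled for replacement")
    if host.get("high_performance"):           # rule 4
        reasons.append("high performance requirements")
    if host.get("storage_gb", 0) >= 100:       # rule 5
        reasons.append("storage need of 100GB or more")
    return bool(reasons), reasons

ok, why = qualifies_for_san({"storage_gb": 250, "high_availability": True})
print(ok, why)
```

A host matching any one rule qualifies, and the returned reasons give the administrator the justification to record alongside the host's SAN documentation.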