Storage Architectures And Options


Published in: Technology, Business
  • This presentation is concerned with providing an overview of mainstream storage options for large-volume data storage.

    Optical storage technology has been around for two or more decades and has never delivered on its promise of high-capacity, low-cost storage. There have been lots of hybrid technologies, such as optical tape, that never became mainstream.

    Current 12-speed BluRay drives write at 54 MB/sec (432 Mbps), but not for BD-RE, which is the format used for standard data storage. In this case the speed is about 9 MB/sec. This is 40 times slower than tape technologies and much slower again than disk.

    UDO had the sponsorship of large vendors such as HP and IBM but died when Plasmon failed.

    BluRay disks have a capacity of 25 GB or 50 GB. Even UDO reached 60 GB and promised higher capacities.

    LTO5 tape offers 3.2 TB (2:1 compression) and data transfer rates of up to 360 MB/second (again assuming a 2:1 compression).

    It is a matter of defining your storage requirements and constraints: capacity, write window, growth plans, data type, file sizes, application software. Then look for a solution to meet these requirements.
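    The speed gap quoted above can be turned into write-window figures with a few lines of arithmetic. This is a minimal sketch: the two rates come from the comment, while the 1 TB volume and the function name are illustrative assumptions.

```python
# Rates quoted in the comment above; the 1 TB workload is an assumption.
BD_RE_MB_PER_S = 9     # BD-RE write speed
LTO5_MB_PER_S = 360    # LTO5 transfer rate, assuming 2:1 compression


def write_hours(volume_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to write volume_gb at a sustained rate (1 GB = 1,000 MB)."""
    return volume_gb * 1000 / rate_mb_per_s / 3600


if __name__ == "__main__":
    volume_gb = 1000  # hypothetical 1 TB to be written
    print(f"BD-RE: {write_hours(volume_gb, BD_RE_MB_PER_S):.1f} hours")
    print(f"LTO5:  {write_hours(volume_gb, LTO5_MB_PER_S):.2f} hours")
    print(f"Ratio: {LTO5_MB_PER_S / BD_RE_MB_PER_S:.0f}x")
```

    Sustained real-world rates are usually lower than the quoted maximums, so the results are best read as lower bounds on the write window.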
  • You only mention Plasmon UDO for optical, though this is an obsolete and dead technology. I see no mention of Blu Ray or Holographic here, though both have much broader support by the industry and major manufacturers like GE, Sony, Panasonic, Toshiba, etc.

Transcript

  • 1. Storage Architectures and Options Alan McSweeney
  • 2. Objectives
    • To provide high-level information on storage options and architectures for storing and managing digital camera data
    • To provide indicative sample solutions
    • To initiate discussions on storage configurations and options
  • 3. Agenda
    • Confirmation of Storage Requirements
    • Data Flows and Processes
    • Storage Management Architectures and Options
    • Storage Management Operation, Management and Use
    • Sample Solutions
  • 4. Understanding of Requirements
    • Storage solution to manage raw and processed map image data
    • Store raw and processed data
      • No requirement to store intermediate pre-processed data
    • Keep 6 months' raw and processed data on primary storage
    • Keep online copy of additional data
    • Keep all raw and processed data indefinitely
    • Size for at least 5 years
    • Deliverables
      • Draft data management/storage policy
      • SLA options on data retrieval from non-primary storage
      • Set of practical options
      • Storage management policy document
  • 5. Objectives of Storage Management
    • Data availability to meet service level commitments even during failures, disasters, or other forms of primary data loss
    • Data protection against loss and to prevent unauthorised access
    • Data retention that is compliant with regulations and standards in an unalterable state, fully audited for long periods of time
    • Cost-effective storage management infrastructure
  • 6. Backup and Data Archival
    • Backup
      • Ensure efficient recoverability of data
      • Does not make backup data directly available
      • Optimised to bring large amounts of data back online quickly for system recovery
      • Retention management at the volume level
      • Not oriented to long-term management beyond life of current environment and media
    • Archiving
      • Copy from online environment to separately managed (secure) storage to reduce cost of storage and enforce retention
      • Provides easy (ideally transparent) access for retrieval
      • Optimised to write and retrieve data at file granularity
      • File-level retention management
      • Designed to manage data over long-term, through media migration and with access auditing and controls
      • Designed to manage multiple copies of data on different media types
  • 7. High Level Storage Management Architectures
    • Multi-tier data storage architectures
      • Primary/Secondary
      • Primary/Secondary/Tertiary
      • Primary/Secondary and Tertiary in parallel
      • Secondary disk storage layer is purely for convenience to allow recall of data
    • Advantages and disadvantages in terms of cost and service
  • 8. Hierarchical Storage Management (HSM)
    • HSM is a key requirement of effective (and cost-effective) storage management
    • Data is migrated (moved or copied) from one storage layer to another, usually less expensive, layer
    • A stub is created for and replaces each migrated file
      • On the local system, a stub file looks and acts like a regular file
    • When user action restores a file but the user does not change the file, that file is "re-stubbed" during the next migration process
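The stub, recall and re-stub cycle described on this slide can be sketched with a toy in-memory model. This is an illustration only, not how any particular HSM product works: the class `HsmStore`, its method names, and the use of `None` to represent a stub are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class HsmStore:
    """Toy two-layer store: primary holds file data or stubs, secondary holds migrated copies."""
    primary: dict = field(default_factory=dict)    # name -> bytes, or None for a stub
    secondary: dict = field(default_factory=dict)  # name -> bytes
    changed: set = field(default_factory=set)      # recalled files modified since the last pass

    def write(self, name: str, data: bytes) -> None:
        """Store new or modified data on primary."""
        if name in self.secondary:
            self.changed.add(name)
        self.primary[name] = data

    def read(self, name: str) -> bytes:
        """Reading a stub transparently recalls the data from secondary."""
        if self.primary[name] is None:
            self.primary[name] = self.secondary[name]
        return self.primary[name]

    def is_stub(self, name: str) -> bool:
        return self.primary[name] is None

    def migrate(self) -> None:
        """One migration pass: copy new or changed files, then leave stubs on primary."""
        for name, data in list(self.primary.items()):
            if data is None:
                continue  # already a stub
            if name not in self.secondary or name in self.changed:
                self.secondary[name] = data
            # An unchanged recalled file needs no copy; it is simply re-stubbed.
            self.primary[name] = None
        self.changed.clear()


if __name__ == "__main__":
    store = HsmStore()
    store.write("tile_001.tif", b"raw image data")
    store.migrate()
    print(store.is_stub("tile_001.tif"))   # stub after migration
    store.read("tile_001.tif")             # user action recalls the file
    store.migrate()                        # unchanged, so it is re-stubbed
    print(store.is_stub("tile_001.tif"))
```

A real HSM product would also select files by age or size rather than migrating everything on each pass; that policy is omitted here to keep the stub mechanics visible.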
  • 9. Primary/Secondary
    • Primary Storage
      • High speed fibre-channel disk
      • Data is directly accessible
    • Secondary Storage
      • Offline/nearline storage
      • Retain data indefinitely
      • Tape/optical media
    • Migrate After Defined Interval
  • 10. Primary/Secondary
    • Primary Storage
    • Secondary Storage
    • Migrate After Defined Interval
    • Retrieve from Secondary to Primary
  • 11. Primary/Secondary/Tertiary
    • Primary Storage
      • High speed fibre-channel disk
      • Data is directly accessible
    • Secondary Storage
      • High capacity ATA (SATA/FATA) disk
      • Data is directly accessible
    • Tertiary Storage
      • Data resides on offline/nearline storage
      • Retain data indefinitely
      • Tape/optical media
    • Migrate After Defined Interval (Primary to Secondary)
    • Migrate After Defined Interval (Secondary to Tertiary)
  • 12. Primary/Secondary/Tertiary
    • Primary Storage
    • Secondary Storage
    • Tertiary Storage
    • Migrate After Defined Interval (Primary to Secondary)
    • Migrate After Defined Interval (Secondary to Tertiary)
    • Retrieve from Secondary/Tertiary to Primary
  • 13. Primary/Secondary and Tertiary in Parallel
    • Primary Storage
    • Secondary Storage
    • Tertiary Storage
    • Migrate After Defined Interval
    • Take Copy Immediately
  • 14. Hardware Options
    • Disk Storage
    • Tape Storage – Manual or Automated
    • Optical Storage – Manual or Automated
    • Hybrid devices
      • VTL (Virtual Tape Library)
      • EMC Centera
      • IBM DR550
      • Storage gateways
  • 15. Hardware Options - Disk
    • Disk – Advantages
    • Speed - FC and SATA disk technologies allow the data to be housed on the appropriate disks
    • SATA drive technology has matured and can lead to decreased acquisition costs
    • FC and SATA can be used within the same storage system for primary and secondary data
    • Storage Virtualisation
      • Virtualise disk arrays within a storage system
      • Virtualise storage systems within a fabric
      • Thin provisioning allows over commitment of disk – reducing acquisition costs
      • Single Instance Storage (Deduplication) can be used but its effectiveness depends on the nature of the data
  • 16. Hardware Options - Disk
    • Disk – Disadvantages
    • Acquisition cost
    • Disk systems do not interoperate well
    • Management - multiple skill sets may be required even if all storage systems are from the same vendor
    • Most hardware vendors focus on ensuring hardware resilience; data resilience is not their concern
    • Operating costs – power, air conditioning, maintenance
  • 17. Hardware Options – Removable Media
    • Advantages
      • Control of costs
      • Keep fixed number of media within automated library unit (could keep none)
    • Disadvantages
      • External media needs media management and control
        • Media management is greater for smaller capacity optical disks
      • Manual costs of media management
  • 18. Hardware Options – Optical Storage
    • Optical Storage
    • UDO (Ultra Density Optical)
      • 60 GB media capacity
    • UDO media have a 50+ year life
    • UDO technology roadmap – 120 GB and 240 GB media capacities
    • Main vendor – Plasmon
    • Resold by other vendors: HP and IBM
    • WORM media option
  • 19. Optical Library and Drive Performance
    • Poor performance relative to tape
    • Direct access medium
    • Use depends on data read (retrieval) and write volumes
  • 20. Single Drive/Path Tape and Optical Read and Write Performance
  • 21. Hardware Options – Optical Storage
    • Optical – Advantages
    • Reduced cost over disk
    • Larger capacity media planned for the future
    • Can have embedded encryption
    • Long media shelf life before refresh is required
    • Very reliable medium
    • True WORM option
  • 22. Hardware Options – Optical Storage
    • Optical – Disadvantages
    • Low capacity
    • Media must be managed offline unless multiple libraries are bought
    • Low data access speed – not suited to large data volume restores
  • 23. Hardware Options – Optical Storage
    • Optical Storage Issues
    • Low medium capacity
      • UDO – 60 GB currently, 120 GB and 240 GB planned
    • Tape
      • LTO-4 Ultrium 1840 – 800 GB uncompressed
      • LTO-3 Ultrium 960 – 400 GB uncompressed
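The media-management consequence of these capacities can be estimated by counting the media needed for a given archive size. A minimal sketch follows: the capacities come from the slide, while the 10 TB archive and the 90% fill ratio (an allowance for partly filled media) are assumptions chosen for illustration.

```python
import math

# Uncompressed capacities in GB, taken from the slide.
MEDIA_GB = {
    "UDO": 60,
    "LTO-3": 400,
    "LTO-4": 800,
}


def media_needed(data_gb: float, media_gb: float, fill_ratio: float = 0.9) -> int:
    """Media count for data_gb, assuming each medium is filled to fill_ratio on average."""
    return math.ceil(data_gb / (media_gb * fill_ratio))


if __name__ == "__main__":
    archive_gb = 10_000  # hypothetical 10 TB archive
    for name, capacity in MEDIA_GB.items():
        print(f"{name}: {media_needed(archive_gb, capacity)} media")
```

Under these assumptions the same archive needs more than ten times as many UDO platters as LTO-4 cartridges, which is the handling overhead the slide refers to.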
  • 24. Tape and Optical Media Capacities
    • Optical media capacity cumulative annual increase of c. 31%
    • Tape media capacity cumulative annual increase of c. 64%
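One way to read these two growth rates is as doubling times. The sketch below assumes the rates compound annually and stay constant, which the slide does not guarantee; the function name is illustrative.

```python
import math


def doubling_years(annual_increase: float) -> float:
    """Years for capacity to double at a constant compound annual increase."""
    return math.log(2) / math.log(1 + annual_increase)


if __name__ == "__main__":
    print(f"Optical (31% a year): {doubling_years(0.31):.1f} years to double")
    print(f"Tape (64% a year):    {doubling_years(0.64):.1f} years to double")
```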
  • 25. Hardware Options – Tape
    • Tape – Advantages
    • Cost
    • Very well defined road map for LTO
      • LTO4 (Dec 2006) - 1.6TB (2:1 compression) and data transfer rates of up to 240 MB/second (2:1 compression)
      • LTO5 (Planned) - 3.2 TB (2:1 compression) and data transfer rates of up to 360 MB/second (assuming a 2:1 compression)
      • LTO6 (Planned) - 6.4 TB (2:1 compression) and data transfer rates of up to 540 MB/second (assuming a 2:1 compression)
    • High capacity media
    • Designed for large data volume restore
    • Multiple media can be streamed to aggregate capacity and speed
    • Can have embedded encryption
  • 26. Hardware Options – Tape
    • Tape – Disadvantages
    • Media shelf life – medium
    • Media long-term reliability
    • Cumbersome single file restores
    • Sequential access medium
  • 27. Hardware Options – Tape Library
    • Widely available from large number of vendors: Dell, HP, IBM, Quantum
      • IBM System Storage TS3500 Tape Library
      • One base frame, and up to 15 expansion frames
      • Up to 12 drives per frame (up to 192 per library)
      • Up to 5.5 PB with LTO 4 cartridges
      • LTO Fibre Channel interface for server attachment
    • Very high capacity automated data management
    • Long-term data storage
  • 28. VTL (Virtual Tape Library)
    • Hybrid units that emulate tape libraries
    • Use low cost disk (and possibly tape)
    • Works with existing tape backup software
    • Improved backup speeds
    • No removable medium backup
    • Sample products
      • IBM
        • IBM Virtualization Engine TS7510
        • IBM Virtualization Engine TS7520
      • HP
        • StorageWorks Virtual Library System (VLS)
        • VLS1000i
        • VLS6000
  • 29. IBM Virtualization Engine TS75x0
    • TS7510
    • 96 TB Capacity at 2:1 Compression
    • Maximum number of virtual libraries – 128
    • Maximum number of virtual drives – 1,024
    • Maximum number of virtual cartridges – 8,192
    • Maximum number of concurrent backups – 32
    • TS7520
    • 2.6 PB Capacity at 2:1 Compression
    • Maximum number of virtual libraries – 512
    • Maximum number of virtual drives – 4,096
    • Maximum number of virtual cartridges – 64,000
    • Maximum number of concurrent backups – 32
  • 30. HP StorageWorks Virtual Library System (VLS)
    • VLS1000i
    • 3 TB Capacity at 2:1 Compression
    • Maximum number of virtual libraries – 6
    • Maximum number of virtual drives – 12
    • VLS6000
    • 105 TB Capacity at 2:1 Compression
    • Maximum number of virtual libraries – 16
    • Maximum number of virtual drives – 128
  • 31. IBM DR550
    • Uses multiple storage tiers (disk, tape, optical) within an archive
    • Software - System Storage Archive Manager
    • Two models
      • DR1 - 36.88 TB raw
      • DR2 - 168 TB raw
    • Attached devices – support for PB capacities
      • Tape systems
      • Optical systems
    • Awards
      • Data Protection Summit—Information Lifecycle Management (ILM)—Best of Show, 2007
      • AIIM (The Enterprise Content Management Association)—Best in Show, 2005, 2006
  • 32. Software Options
    • HSM
    • HSM is a principle; most products offer the same basic functionality
      • Automatic migration and management of data from one medium to another
      • Stubs or pointers are left in place of migrated files
      • Speed of retrieval depends on the speed of the hardware to which the files have been migrated; this gives online, near-line and off-line options
  • 33. Software Options
    • Bridgehead Software
    • Small company, employee owned
      • Can they offer the level of service and support required when really needed?
      • Are they a possible acquisition target?
    • Ideal for mid-sized to large customers
      • Can it handle the levels of data over time?
    • CaminoSoft
    • Major corporation – publicly listed and governed by SEC rules and regulations
    • Primary focus is on managing file server type data
    • Repackaged by vendors such as CA
  • 34. Software Options
    • Symantec
    • Major corporation
    • Two products:
      • NetBackup
      • Enterprise Vault
    • NetBackup
      • HSM does not support Windows
    • Enterprise Vault
      • KVS staff still provide support, separate entity within Symantec
      • Focus is largely on email and compliance
      • Some integration with NetBackup
      • Files to be migrated are collected into CAB files
      • Entire CAB file recalled
      • Poor support for tape as archival medium
        • Recommended that you only use tape for data that is seldom or never accessed
  • 35. Software Options
    • IBM – Tivoli
    • Major corporation
    • Vast knowledge within the company
    • Extensive R&D budgets
    • Agents and options from most major software and hardware vendors
  • 36. Software Options
    • HP – File Archiver
    • Major corporation
    • Vast knowledge within the company
    • Extensive R&D budgets
    • “Simple Lightweight Solution” according to HP
  • 37. Software Options
    • HSM Product
    • What is Required from chosen vendor / application?
    • Stable and functionally bullet proof solution
    • Easy to use
    • Capable of handling files
    • Capable of handling data volumes
    • Must integrate with backup application (so that NetBackup does not initiate a restore when backing up or restoring stubs)
    • Expert support knowledge
    • Expert integration knowledge
      • These products are dependent on hardware vendors' solutions
  • 38. Data Deduplication
    • Store only one copy of data
    • The deduplication process should be granular
      • The smaller the data block examined, the more likely it is that duplicate data will be found
    • The deduplication process should be designed with minimal overhead when deduplicating (storing) and un-deduplicating (retrieving) data
      • Hardware better than software
    • The deduplication process should provide resiliency to ensure that all data can be reliably stored and retrieved, even in the event of system failure
  • 39. Data Deduplication
    • Available for range of storage – hardware and software
      • Symantec Enterprise Vault creates an MD5 fingerprint for every file that is archived
        • If multiple files have the same hash code, only one copy of the file is physically stored
      • IBM N Series has Advanced Single Instance Storage (ASIS)
        • Hardware and block-based deduplication
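File-level single-instance storage keyed on an MD5 fingerprint, as described for Enterprise Vault above, can be sketched as follows. This is a generic illustration of the idea, not Symantec's implementation; the class `SingleInstanceStore` and its methods are hypothetical names.

```python
import hashlib


class SingleInstanceStore:
    """Toy archive that physically stores one copy per distinct file content."""

    def __init__(self) -> None:
        self.blobs = {}  # MD5 hex digest -> file content (stored once)
        self.index = {}  # file name -> MD5 hex digest

    def archive(self, name: str, content: bytes) -> str:
        """Fingerprint the file and store its content only if it is new."""
        digest = hashlib.md5(content).hexdigest()
        self.blobs.setdefault(digest, content)
        self.index[name] = digest
        return digest

    def retrieve(self, name: str) -> bytes:
        return self.blobs[self.index[name]]


if __name__ == "__main__":
    store = SingleInstanceStore()
    store.archive("report.doc", b"quarterly figures")
    store.archive("copy of report.doc", b"quarterly figures")
    store.archive("notes.txt", b"meeting notes")
    print(len(store.index), "files archived,", len(store.blobs), "copies stored")
```

Because the fingerprint covers the whole file, a one-byte edit produces a new digest and a full second copy, which is why the block-based approach in the next example can save more space on edited files.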
  • 40. Deduplication in Action
    • Sales ed.ppt – 20 x 4K blocks
    • Client.ppt – Identical file – 20 blocks
    • White paper.doc – Different file – 10 blocks
    • Sales ed v2.ppt – Edited file – 24 blocks
    • With ASIS – 38 total blocks
    • Without ASIS – 74 total blocks
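The block totals on this slide can be reproduced with a small block-level deduplication sketch. Two things are assumptions rather than facts from the slide: the edited file is modelled as sharing 16 blocks with the original and adding 8 new ones (inferred from 38 - 20 - 10 = 8), and blocks are compared by SHA-256 hash, which is not necessarily how ASIS identifies duplicates.

```python
import hashlib

BLOCK_SIZE = 4096  # 4K blocks, as on the slide


def make_blocks(label: str, count: int) -> list:
    """Build `count` distinct 4K blocks of synthetic content."""
    return [f"{label}-{i}".encode().ljust(BLOCK_SIZE, b"\0") for i in range(count)]


def split_blocks(data: bytes) -> list:
    """Split file content into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def block_counts(files: dict) -> tuple:
    """Return (blocks stored without deduplication, blocks stored with it)."""
    total = 0
    unique = set()
    for data in files.values():
        for block in split_blocks(data):
            total += 1
            unique.add(hashlib.sha256(block).digest())
    return total, len(unique)


sales = make_blocks("sales", 20)
files = {
    "Sales ed.ppt": b"".join(sales),
    "Client.ppt": b"".join(sales),                                    # identical file
    "White paper.doc": b"".join(make_blocks("paper", 10)),            # different file
    "Sales ed v2.ppt": b"".join(sales[:16] + make_blocks("edit", 8)), # edited file
}

if __name__ == "__main__":
    without_dedup, with_dedup = block_counts(files)
    print(f"Without deduplication: {without_dedup} blocks")  # 74
    print(f"With deduplication:    {with_dedup} blocks")     # 38
```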
  • 41. Potential Deduplication Savings – Dependent on Data Types
  • 42. Software and Solution Design Constraints and Issues
    • Bottom Line
    • Produce a realistic design before implementation and validate design
    • The solution must be fully tested to ensure it works as expected
    • Decisions can then easily be made on the basis of the tests
    • NetBackup integration must be thoroughly tested with any solution
    • Primary to secondary to tertiary migration and retrievals must be tested and documented
    • Misconfiguration or lack of understanding can lead to data loss or primary production system failure
    • Need to look at the total cost of ownership – maintenance, power, manual effort – put a cost on all elements and activities to ensure fair comparison
    • Reduced complexity – fewer components, vendors – means long-term ease of operation and use and has a genuine value
  • 43. Sample Storage Capacity Planning
    • Sizing issues and assumptions
      • Annual growth rate
      • Overhead for determination of actual disk storage requirements (RAID overhead, etc.)
      • Archival storage medium utilisation overhead (allowance for unfilled tapes, optical platters, RAID for VTL, etc.)
      • Storage lifecycle
      • Number of storage layers – 2 or 3
    • Sample storage capacity planning scenarios
      • Annual growth rates – 0%, 10%, 20%, 30%
      • Translated into monthly growth rates for calculations - 20% annual growth = 1.531% monthly
      • Three tiers
      • Migrate from Tier 1 to Tier 2 after 6 months
      • Migrate from Tier 2 to Tier 3 after further 6 months
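The annual-to-monthly conversion and the three-tier migration rule above can be expressed in a few lines. This is a sketch under the slide's stated assumptions (compound growth, migration at 6 and 12 months); the function names are illustrative.

```python
def monthly_rate(annual_rate: float) -> float:
    """Monthly compound growth rate equivalent to an annual growth rate."""
    return (1 + annual_rate) ** (1 / 12) - 1


def tier_for_age(age_months: int) -> int:
    """Tier holding data of a given age: Tier 1 for 6 months, Tier 2 for a further 6, then Tier 3."""
    if age_months < 6:
        return 1
    if age_months < 12:
        return 2
    return 3


if __name__ == "__main__":
    for annual in (0.0, 0.10, 0.20, 0.30):
        print(f"{annual:.0%} annual = {monthly_rate(annual):.3%} monthly")
    for age in (0, 5, 6, 11, 12, 60):
        print(f"Data aged {age} months is on tier {tier_for_age(age)}")
```

For 20% annual growth this gives the 1.531% monthly figure quoted on the slide.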
  • 44. Disk Space Calculations
    • Storage estimates expressed as raw capacities required to accommodate data
    • Includes overhead for effective usability, RAID, snapshots, online spare, less than 100% utilisation, etc.
    • Primary storage after 5 years with 10% annual growth = 25,580 GB
    • Equates to at least 34,533 GB of raw disk capacity
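The two figures above imply an overhead factor of about 1.35 between usable and raw capacity. The sketch below derives that factor and applies it; treating the overhead as a single multiplier is an assumption, since the slide lists several separate contributors (RAID, snapshots, spares, utilisation) without breaking them down.

```python
USABLE_GB = 25_580  # primary storage after 5 years at 10% annual growth (from the slide)
RAW_GB = 34_533     # raw disk capacity quoted for that volume (from the slide)


def implied_overhead(raw_gb: float, usable_gb: float) -> float:
    """Overhead factor implied by a raw/usable capacity pair."""
    return raw_gb / usable_gb


def raw_capacity_gb(usable_gb: float, overhead_factor: float = 1.35) -> float:
    """Raw capacity needed for usable_gb, using a single combined overhead factor."""
    return usable_gb * overhead_factor


if __name__ == "__main__":
    print(f"Implied overhead factor: {implied_overhead(RAW_GB, USABLE_GB):.2f}")
    print(f"Raw capacity for {USABLE_GB:,} GB: {raw_capacity_gb(USABLE_GB):,.0f} GB")
```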
  • 45. Sample Storage Capacity Planning – 0% Annual Growth Rate
  • 46. Capacities - Annual Growth Rate – 0%
  • 47. Storage Capacities - 0% Annual Growth Rate
  • 48. Media Requirements - 0% Annual Growth Rate
  • 49. Sample Storage Capacity Planning – 10% Annual Growth Rate
  • 50. Capacities - Annual Growth Rate – 10%
  • 51. Storage Capacities - 10% Annual Growth Rate
  • 52. Media Requirements - 10% Annual Growth Rate
  • 53. Sample Storage Capacity Planning – 20% Annual Growth Rate
  • 54. Capacities - Annual Growth Rate – 20%
  • 55. Storage Capacities - 20% Annual Growth Rate
  • 56. Media Requirements - 20% Annual Growth Rate
  • 57. Sample Storage Capacity Planning – 30% Annual Growth Rate
  • 58. Capacities - Annual Growth Rate – 30%
  • 59. Storage Capacities - 30% Annual Growth Rate
  • 60. Media Requirements - 30% Annual Growth Rate
  • 61. 10 Year Data Storage Capacities – Different Growth Rates
  • 62. Single Drive/Path Tertiary Layer Data Write Times – Tape and Optical
  • 63. Implementation Options
    • Factors:
      • 2 or 3 tiers
      • Optical, tape or VTL as the last tier
      • Use of existing storage (HP/Dell) or new storage
      • DR or no DR
        • Offsite manual copy or replication
      • Software HSM – use existing NetBackup or other: HT FileStore, CaminoSoft, IBM Tivoli
  • 64. Spectrum of Options All disk DR option with replicated data Primary disk Secondary - tape Mixed disk/tape/optical/VTL/manual/automated
  • 65. Data Retrieval Operation
    • Secondary disk
      • Data is retrieved to primary immediately – available within seconds/minutes
    • Secondary/tertiary VTL
      • Data is retrieved to primary immediately – available within minutes
    • Secondary/tertiary tape library
      • Data is retrieved to primary immediately – available within minutes
    • Secondary/tertiary optical library
      • Data is retrieved to primary immediately – available within hours
    • Manual media retrieval
      • Retrieval time depends on media location and staff allocated to media handling
  • 66. Sample Options
    • Three tiers – optical or tape library as third tier
    • All disk
    • Reuse/expand existing hardware
    • Low cost ATA disks for secondary storage
    • Not all available options – presented for review and feedback
  • 67. Physical Option 1 – Three Tiers – Optical or Tape
  • 68. Physical Option 1 – Three Tiers – Optical or Tape
  • 69. Physical Option 1 - Components
    • Primary storage – SAN with fibre disk
    • Secondary storage – SAN with ATA disk
    • Tertiary storage – optical library
    • Software
      • HT Filestore
      • Caminosoft
      • NetBackup Storage Migrator
      • Tivoli Storage Manager
  • 70. Resilience
    • Primary storage mirrored for resilience
  • 71. Operation and Service Level Agreement
  • 72. Physical Option 2 – All Disk Configuration
    • All disk storage option
    • Two mirrored sites with realtime replication
    • Multiple replicated components for resilience
    • Sample configuration
      • Primary Storage
        • Clustered SAN Controllers with 504 x 300 GB Fibre Channel Drives = 151 TB Raw Storage
      • Secondary Storage
        • Clustered SAN Controllers with 336 x 750 GB SATA Drives = 252 TB Raw Storage
      • Total 403 TB of Raw Storage capacity (doubled for DR)
  • 73. All Disk Configuration
  • 74. Resilience – Multiple Points of Redundancy
  • 75. Resilience
    • SAN switches
    • SAN controllers
    • Two disks per shelf
    • Entire site
  • 76. All Disk Configuration
    • Indicative hardware and software (replication, snapshot) cost
      • € 1.8 million
      • € 4,460 per TB (doubled for DR)
    • 5 standard racks in each location
    • Does not include
      • HSM software
      • Installation and commissioning
    • Represents high water mark in terms of costs and functionality
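The per-TB figure can be checked against the headline cost. This sketch assumes the € 1.8 million total is a rounded figure and uses the 403 TB raw capacity quoted for this configuration, so the result only approximates the € 4,460 on the slide.

```python
def cost_per_tb(total_cost_eur: float, raw_tb: float) -> float:
    """Indicative acquisition cost per TB of raw capacity."""
    return total_cost_eur / raw_tb


if __name__ == "__main__":
    total_cost_eur = 1_800_000  # rounded headline figure from the slide
    raw_tb = 403                # raw capacity quoted for the all-disk configuration
    print(f"EUR {cost_per_tb(total_cost_eur, raw_tb):,.0f} per TB")
```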
  • 77. All Disk Configuration
    • Advantages
    • High performance
    • Low manual intervention
    • Highly resilient
    • Disadvantages
    • High cost of acquisition and operation
    • Growth in data volumes means additional expense
    • No upper limit on cost
  • 78. Physical Option 3 – Existing Hardware
    • Raw, pre-processed and processed data resides on HP EVA
    • Replicated continuously to second EVA
    • Dell CX disk array used as secondary location
    • Existing ADIC LTO drives used for tertiary and long term offsite storage
  • 79.  
  • 80. Existing Hardware
    • Advantages
    • Cost
    • Some skill sets already in organisation
    • Disadvantages
    • Investment in old technology
    • Software based HSM product skills required
  • 81. Introduction of Tertiary Device
    • Existing HP and Dell storage still employed
    • UDO or LTO device used as final destination before removal to offsite archive
  • 82.  
  • 83. Introduction of Tertiary Device
    • Advantages
    • Cost – use of existing hardware
    • Some skill sets already in organisation
    • Media life is increased with UDO
    • Disadvantages
    • Cost – UDO or new tape library
    • Management of archived media – especially UDO as they are low capacity
    • Investment in old technology
    • Software based HSM product skills required
    • UDO retrieval speeds
  • 84. Virtual Tape Library
    • VTL device will act as a tape library
    • VTL will be secondary location
    • HSM product skills may not be required
    • NetBackup could manage this process
    • VTL data will ultimately be archived to tape via ADIC tape library
  • 85.  
  • 86. Virtual Tape Library
    • Advantages
    • Some skill sets already in organisation
    • No new third party migration tool absolutely necessary
    • Extension of NetBackup system using NetBackup Storage Migrator
    • Disadvantages
    • Cost – VTL with required capacity can be expensive
    • Cannot take VTL backups offsite – tertiary solution still required
    • Lack of vendor implementation experience
  • 87. Physical Option 4 – Disk Based Secondary Information Store
    • Single storage device with multiple PB of data scalability
    • Data can be retained on information store for 15+ years
    • 1 TB disks make this possible
    • Data can be moved to storage attached tape
    • Internal backup features of information store can aid NetBackup routine (SnapShots, Vaulting)
  • 88.  
  • 89. Disk Based Information Store
    • Advantages
    • Speed of retrieval
    • No new third party migration tool absolutely necessary
    • Simplicity
    • Integration with NetBackup – no effect on daily backup routines
    • Information store can be split across multiple information stores to give multiple PB capacity if required
    • Disadvantages
    • Cost – may be expensive initially but storage can be added over time as needed
  • 90. Central Management – Storage Virtualisation
    • Controller sits above storage systems
    • Handle day to day management of storage across all platforms
    • Advantages
    • Skill set consolidation
    • Costs
    • Disadvantages
    • Vendor-based skills are still ultimately required
  • 91.  
  • 92. Key Questions
    • Number of storage tiers and preferred configuration
    • Use of tape/optical/VTL
    • Software HSM option
    • Disaster recovery/business continuity requirements and options
    • Capacity planning constraints and assumptions
    • New hardware or reuse of existing hardware
    • Level of automation required for archival level
    • Financial constraints and budget available
    • Implementation schedule
  • 93. More Information
        • Alan McSweeney
        • [email_address]
