How Data Instant Replay and Data Progression Work Together

C-Drive 2009 presentation by Scott DesBles about how Compellent's Data Instant Replay and Data Progression work together to create an efficient data storage system.

Published in: Technology
Slide notes
  • Assumptions behind the "74% less disk costs" figure: compares an all-Tier 1 configuration against 20% Tier 1 (active data) plus 80% Tier 2 (inactive data). Tier 1 = 15K rpm FC drives, RAID 10. Tier 2 = 750GB SATA drives, RAID 5-9.
  • Time 2 cannot be expired, even though it is a Replay, because Volume 2 relies on it.

    1. 1. How Data Instant Replay and Data Progression Work Together Scott DesBles
    2. 2. Agenda <ul><li>Dynamic Block Architecture </li></ul><ul><li>Data Progression </li></ul><ul><li>Fast Track </li></ul><ul><li>Data Instant Replay </li></ul><ul><li>Volume Configuration Best Practices </li></ul><ul><li>Questions </li></ul>
    3. 3. The Compellent Advantage: Dynamic Block Architecture  The only storage solution that manages data inside the volume <ul><li>Dynamic Block Architecture  </li></ul><ul><li>Metadata </li></ul><ul><ul><li>Creation, Access, and Modification time </li></ul></ul><ul><ul><li>Frequency of access </li></ul></ul><ul><ul><li>Disk drive type/RAID level </li></ul></ul><ul><li>Sophisticated data movement engine </li></ul><ul><li>No performance impact </li></ul><ul><li>U.S. Patents issued - more pending </li></ul><ul><ul><li>Automated Tiered Storage </li></ul></ul><ul><ul><li>Thin replication and snapshots </li></ul></ul>Compellent 72°F 70°F 68°F 72°F Traditional SANs
    4. 4. How The Blocks Are Managed <ul><li>Dynamic Block Architecture – the Page Pool </li></ul><ul><ul><li>Collection of allocated and unallocated disk blocks </li></ul></ul><ul><ul><li>Maps pages to volumes </li></ul></ul><ul><ul><li>Maintains metadata </li></ul></ul><ul><ul><li>Default size is 2MB per page (4,096 blocks) </li></ul></ul><ul><li>Fast data movement on array </li></ul><ul><ul><li>Change RAID levels by re-pointing pages </li></ul></ul><ul><ul><ul><li>System manages data mapping across </li></ul></ul></ul><ul><ul><ul><li>multiple RAID levels </li></ul></ul></ul><ul><ul><ul><li>Individual files can span multiple drive </li></ul></ul></ul><ul><ul><ul><li>types and RAID levels </li></ul></ul></ul><ul><ul><li>Re-stripe data when adding drives </li></ul></ul><ul><li>Page Pool grows/shrinks as needed </li></ul><ul><ul><li>Self defragmentation and tuning </li></ul></ul>Volume Page pool
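As a rough illustration of the page pool on this slide, the sketch below maps volume block addresses to 2 MB pages and "moves" a page by re-pointing the volume at a new one. It assumes 512-byte blocks (which is what makes 2 MB equal 4,096 blocks). The class, field, and method names are hypothetical and do not describe Compellent's actual implementation.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 512                              # bytes per block (assumption)
PAGE_SIZE = 2 * 1024 * 1024                   # default page size from the slide
BLOCKS_PER_PAGE = PAGE_SIZE // BLOCK_SIZE     # 4096


@dataclass
class Page:
    page_id: int
    raid_level: str                           # e.g. "RAID 10" or "RAID 5-9"
    tier: int


@dataclass
class PagePool:
    """Hypothetical pool: hands out pages and tracks volume-to-page maps."""
    next_id: int = 0
    # (volume name, page index within the volume) -> Page
    mapping: dict = field(default_factory=dict)

    def page_for_block(self, volume: str, lba: int) -> Page:
        """Return the page backing a block, allocating it on first use."""
        key = (volume, lba // BLOCKS_PER_PAGE)
        if key not in self.mapping:
            self.mapping[key] = Page(self.next_id, "RAID 10", tier=1)
            self.next_id += 1
        return self.mapping[key]

    def repoint(self, volume: str, page_index: int,
                raid_level: str, tier: int) -> None:
        """Change RAID level or tier by pointing the volume at a new page."""
        self.mapping[(volume, page_index)] = Page(self.next_id, raid_level, tier)
        self.next_id += 1
```

Because the volume only holds pointers, changing a page's RAID level touches one map entry rather than the whole volume.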
    5. 5. Pages <ul><li>Data pages are classified into the following categories: </li></ul><ul><ul><li>Accessible, Recently Accessed </li></ul></ul><ul><ul><ul><li>These are the active pages the volume is using the most </li></ul></ul></ul><ul><ul><li>Accessible, Non-Recently Accessed </li></ul></ul><ul><ul><ul><li>Read-write pages that have not been recently used </li></ul></ul></ul><ul><ul><li>Historical Accessible </li></ul></ul><ul><ul><ul><li>Read-only pages that may be read by a volume </li></ul></ul></ul><ul><ul><ul><li>Applies to Snapshot Volumes only </li></ul></ul></ul><ul><ul><li>Historical Non-Accessible </li></ul></ul><ul><ul><ul><li>Read-only data pages that are not currently being accessed by a volume </li></ul></ul></ul><ul><ul><ul><li>Applies to Snapshot Volumes only </li></ul></ul></ul><ul><ul><ul><li>Snapshot maintains these pages for recovery purposes, and they should be placed on the lowest cost storage possible </li></ul></ul></ul>
    6. 6. Types Of Pages <ul><li>Data Progression defines two types of pages, accessible and historical. </li></ul><ul><li>Accessible pages are pages that can be read or written by a server at the current time. </li></ul><ul><li>Historical pages are read-only pages (set by Data Instant Replay). </li></ul><ul><li>Data Progression uses the accessibility to determine the class of storage a page should use. </li></ul>
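The four page categories from the previous two slides can be expressed as a small classifier. This is only a sketch: the three boolean inputs are assumed names for the metadata the slides describe, not fields of a real API.

```python
from enum import Enum


class PageClass(Enum):
    ACCESSIBLE_RECENT = "Accessible, recently accessed"
    ACCESSIBLE_NOT_RECENT = "Accessible, non-recently accessed"
    HISTORICAL_ACCESSIBLE = "Historical, accessible"
    HISTORICAL_NON_ACCESSIBLE = "Historical, non-accessible"


def classify(frozen_by_replay: bool, readable_by_volume: bool,
             recently_accessed: bool) -> PageClass:
    """Classify a page from three (hypothetical) metadata flags."""
    if frozen_by_replay:
        # Historical pages are read-only; they differ in whether any
        # volume can still read them.
        if readable_by_volume:
            return PageClass.HISTORICAL_ACCESSIBLE
        return PageClass.HISTORICAL_NON_ACCESSIBLE
    # Accessible pages are read-write; they differ by recency of use.
    if recently_accessed:
        return PageClass.ACCESSIBLE_RECENT
    return PageClass.ACCESSIBLE_NOT_RECENT
```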
    7. 7. Data Progression Benefits <ul><li>Patented technology automatically moves inactive data to lower tiers </li></ul><ul><li>Eliminates RAID 5 write penalty </li></ul><ul><li>74% less disk costs </li></ul><ul><li>Reduced power, cooling and floorspace </li></ul><ul><li>Automatically move snapshots to lower tiers </li></ul><ul><li>No server side agents required </li></ul>Data Progression Advantage: The ONLY SAN to automate ILM inside the volume. Traditional Approach to Tiered Storage: <ul><li>Entire Volume movement ONLY </li></ul><ul><li>Manual data classification </li></ul><ul><li>Manual data movement </li></ul><ul><li>Extra software on each server </li></ul>
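The slide notes state that the 74% figure compares an all-Tier 1 layout with 20% Tier 1 plus 80% Tier 2. The sketch below shows the arithmetic behind such a comparison. The presentation gives no prices, so the cost ratio is a free parameter here, and the back-solved value is an inference from the stated percentages rather than a number from the source.

```python
def disk_cost_savings(tier1_fraction: float, tier2_cost_ratio: float) -> float:
    """Fractional savings of a two-tier layout versus all Tier 1.

    tier2_cost_ratio is the cost per usable GB of Tier 2 divided by the
    cost per usable GB of Tier 1 (an assumed input, not from the slides).
    """
    blended = tier1_fraction + (1.0 - tier1_fraction) * tier2_cost_ratio
    return 1.0 - blended


def implied_cost_ratio(tier1_fraction: float, savings: float) -> float:
    """Tier 2 / Tier 1 cost ratio that would produce the given savings."""
    return (1.0 - savings - tier1_fraction) / (1.0 - tier1_fraction)
```

With 20% of capacity on Tier 1, a 74% saving implies that Tier 2 costs about 7.5% as much per usable GB as Tier 1, since 0.2 + 0.8 × 0.075 = 0.26.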
    8. 8. Multiple Tiers Working Together <ul><li>Multiple Data Tiers </li></ul><ul><li>Maximize the usage of fast (expensive) disks for all volumes </li></ul><ul><li>Maximize the usage of all disks in the system </li></ul><ul><li>Spreading I/O to many disks is a primary advantage of virtualization </li></ul><ul><li>Minimize the overall system cost by needing less high-speed disk </li></ul><ul><li>Data Progression moves infrequently accessed data to the slowest (cheapest) disks in the system </li></ul>
    9. 9. When Does Data Move? <ul><li>Data Progression runs once per day </li></ul><ul><ul><li>Default start time is 7pm </li></ul></ul><ul><li>Data is moved per page </li></ul><ul><li>Historical Replay pages eligible to move to lowest tier immediately </li></ul><ul><li>Default setting – 12 days down, 3 up </li></ul><ul><li>RAID Restripe </li></ul><ul><ul><li>When adding additional drives </li></ul></ul><ul><ul><li>When RAID extents score low </li></ul></ul>
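One possible reading of the "12 days down, 3 up" default is sketched below: a page idle for 12 consecutive days drops one tier, and a page accessed on 3 consecutive days rises one tier. The slide does not define the counters precisely, so this interpretation, the function name, and the parameters are all assumptions.

```python
def next_tier(tier: int, idle_days: int, active_days: int,
              lowest_tier: int = 3, days_down: int = 12, days_up: int = 3) -> int:
    """Return the tier a page should occupy after the daily run.

    Tier 1 is the fastest tier; larger numbers are slower and cheaper.
    idle_days and active_days are hypothetical per-page counters.
    """
    if idle_days >= days_down and tier < lowest_tier:
        return tier + 1          # move down one tier
    if active_days >= days_up and tier > 1:
        return tier - 1          # move up one tier
    return tier                  # stay in place
```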
    10. 10. Movement Summary
        Movement Summary
        Movement | Cases Covered | Notes
        Full Demotion | Historical Progression | The page is not accessible and is kept for recovery purposes. It is placed on the lowest cost storage available.
        Demotion | Progression | Most common action.
        Promotion | Page returning to active use | A page that has not been used in a while is moved to more expensive/faster storage.
        Configuration Change | De-fragmentation; Volume/History Configuration Change; Removal of a Storage Class | This may demote or promote the page. Catch-all case for configuration changes.

        Decision Descriptions
        Decision | Notes
        Is Historical | Does the page belong to a Replay that is currently read-only?
        Recently Accessed | Checks the tracking information to see whether the page has been recently accessed by the volumes. The criteria for recent access can change when page resources are getting low. This allows DP to move pages more aggressively.
        Is Accessible | Checks whether the page is accessible by any volume. This requires examining the Replay Remap Table for each volume in a history.
        Is Lower Cost Storage Available | A page has not been accessed recently and is a candidate to move to lower cost storage. Is such storage available? If not, the page has reached the least cost storage possible.
        Is Higher Cost Storage Available | When a page that was not accessed for a long time is now being accessed, DP checks whether higher cost storage is available. If it is, the page is promoted.
        Is Storage Class Available | Validates whether the current storage class may be used for this type of page. A change in configuration to the Volume/History or the system may cause a page to move. This also checks whether the page is part of a device that is being removed and needs its page allocations moved off the device. A change in configuration may lead to the demotion or promotion of a page.
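The decisions on this slide can be chained into a single function. The slide lists the questions but not the order in which they are evaluated, so the ordering below is an assumption, and the function and return values are illustrative names only.

```python
def decide_movement(is_historical: bool, is_accessible: bool,
                    recently_accessed: bool, lower_cost_available: bool,
                    higher_cost_available: bool,
                    storage_class_available: bool = True) -> str:
    """Pick a movement for one page (sketch; evaluation order is assumed)."""
    if not storage_class_available:
        # The current storage class can no longer hold this page.
        return "configuration change"
    if is_historical and not is_accessible:
        # Kept only for recovery: send to the lowest cost storage.
        return "full demotion"
    if not recently_accessed:
        return "demotion" if lower_cost_available else "stay"
    # Page is in active use again.
    return "promotion" if higher_cost_available else "stay"
```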
    11. 11. Fast Track Fast Track Data Placement <ul><li>Operates inside the volume at the block-level </li></ul><ul><li>Reduce the need for costly, high-power drives </li></ul><ul><li>Improve disk utilization </li></ul><ul><li>Optimize performance on actual usage patterns </li></ul><ul><li>Significantly lower latency on read/write operations </li></ul><ul><li>Automatic function, no user intervention needed </li></ul>Fast Track Benefits Reduces the number of drives while maintaining performance Traditional Data Placement <ul><li>Volume-level placement </li></ul><ul><li>Active data distributed across the drives </li></ul><ul><li>Allocated but empty space on outer tracks </li></ul><ul><li>Wasted inner tracks on the drive </li></ul>Data on the Fast Track Minimize Seek Time
    12. 12. Fast Track – Understanding Disk Performance Rotational Latency Seek Time <ul><li>Disk drives are physical devices that are the slowest part of a storage system </li></ul><ul><li>Reducing physical delays (rotational + seek time) provides significant performance gains </li></ul>
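To put numbers on the rotational delay mentioned here: on average the platter must turn half a revolution before the target sector arrives under the head. The helper below computes that figure from spindle speed. The seek time passed to the second function is whatever the drive's data sheet states; no seek figures appear in the presentation.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds (half a revolution)."""
    return 0.5 * 60_000.0 / rpm


def avg_access_time_ms(avg_seek_ms: float, rpm: float) -> float:
    """Approximate average access time: seek plus rotational latency."""
    return avg_seek_ms + avg_rotational_latency_ms(rpm)
```

A 15K rpm drive averages 2 ms of rotational latency, while a 7,200 rpm drive averages roughly 4.2 ms.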
    13. 13. Fast Track – Disk Geometry 20% Amount of drive capacity that provides the fastest data transfer rate Fastest tracks Slowest tracks
    14. 14. Traditional Disk Usage New Disk Drive All blocks are written from the outer edge in
    15. 15. Traditional Disk Usage Full Disk Drive Active blocks are scattered all over the drive creating reduced performance due to seek and rotational delays Active blocks
    16. 16. Fast Track – What It Does Full Disk Drive With Fast Track Active blocks are dynamically and automatically placed on outer edge
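A minimal sketch of the placement idea: rank pages by how often they are accessed and put the hottest ones in the fast zone, taken here as the outer 20% of the drive per the disk geometry slide. The access counts, the slot model, and the function name are assumptions made for illustration.

```python
def place_pages(access_counts: dict, total_slots: int, fast_percent: int = 20):
    """Split pages into (fast_zone, standard_zone), hottest pages first.

    access_counts maps a page name to its access count (hypothetical
    metadata). total_slots is the drive capacity in pages.
    """
    if len(access_counts) > total_slots:
        raise ValueError("more pages than the drive can hold")
    fast_slots = total_slots * fast_percent // 100
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    return ranked[:fast_slots], ranked[fast_slots:]
```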
    17. 17. Data Progression and Fast Track Together <ul><li>Performance Optimization </li></ul><ul><ul><li>Dynamic Block Architecture provides metadata management </li></ul></ul><ul><ul><li>Data Progression analyzes metadata and provides sophisticated data movement </li></ul></ul><ul><ul><li>Fast Track reserves outer tracks for increased performance </li></ul></ul><ul><li>Benefits </li></ul><ul><ul><li>Shrink overall drive counts without sacrificing performance </li></ul></ul><ul><ul><li>Reduce the need for costly, high-power drives </li></ul></ul><ul><ul><li>Unmatched value proposition </li></ul></ul>
    18. 18. Levels Of System Optimization <ul><li>Tiers </li></ul><ul><ul><li>Tier 1, Tier 2, Tier 3 </li></ul></ul><ul><li>Disk Zones </li></ul><ul><ul><li>Fast and Standard </li></ul></ul><ul><li>RAID Levels </li></ul><ul><ul><li>RAID 10, RAID 5-5 (4+1), RAID 5-9 (8+1) </li></ul></ul><ul><li>Page Sizes </li></ul><ul><ul><li>512KB, 2MB, 4MB </li></ul></ul>
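The RAID levels listed on this slide differ in how much raw capacity ends up usable. The sketch below computes that fraction from the stripe layouts given in parentheses on the slide (RAID 10 is treated as one data copy plus one mirror copy). The function name is illustrative.

```python
# (data units, total units) per stripe for each level named on the slide
RAID_LAYOUTS = {
    "RAID 10": (1, 2),     # mirrored: one data copy, one mirror copy
    "RAID 5-5": (4, 5),    # 4 data + 1 parity
    "RAID 5-9": (8, 9),    # 8 data + 1 parity
}


def usable_fraction(raid_level: str) -> float:
    """Fraction of raw capacity available for data."""
    data, total = RAID_LAYOUTS[raid_level]
    return data / total
```

RAID 10 yields 50% usable capacity, RAID 5-5 yields 80%, and RAID 5-9 yields about 89%, which is one reason inactive data is cheaper to keep on the wider RAID 5 stripes.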
    19. 19. Data Instant Replay
        Data Instant Replay Advantage: Industry’s most advanced snapshots reduce downtime and save time
        Data Instant Replay Benefits <ul><li>Readable and writeable instantly </li></ul><ul><li>Space efficient – freeze pointers </li></ul><ul><li>No space pre-allocation </li></ul><ul><li>Integration with common applications </li></ul><ul><li>Reduce dependence on tape </li></ul><ul><li>Patented technology </li></ul>
        Compellent Approach (timeline figure: Replays at 8:00, 8:15, 8:30, 8:45)
        Feature | Replays | Traditional Snapshots
        Mount by any server | Yes | No
        Recovery time | Seconds | Hours – cloning required
        Writeable | Yes | No
        Boot volume | Yes | No
        Maximum allowed | Unlimited | Limited (8-12)
        Pre-allocation | None | 20-50% required
        Copy-on-write | No | Yes
        Drive type used | Any – user defines | Same as active volume
    20. 20. Data Instant Replay – How It Works <ul><li>The following outlines the page management that occurs with replays </li></ul><ul><li>Assumptions </li></ul><ul><ul><li>Storage system has two tiers of disk </li></ul></ul><ul><ul><li>Volumes are configured to use Recommended Storage Profile </li></ul></ul><ul><ul><li>Replays occur once per day at 6pm and are retained for 3 days </li></ul></ul><ul><ul><li>Volume was created today </li></ul></ul>A B C D E DATA
    21. 21. Data Instant Replay – How It Works A B C D E Time 0 C1 Time 1 READ A READ C Write C1 DATA DIR
    22. 22. Data Instant Replay – How It Works C2 E1 A B C D E Time 0 C1 READ A READ C1 Write C2 DATA DIR Δ Changes DIR Δ Changes READ E Write E1 Time 1 Time 2 <ul><li>Even with multiple Replays space is only the changes </li></ul>
    23. 23. Data Instant Replay – How It Works C2 E1 A B C D E Time 0 A B C1 D E READ A READ C2 Write C2 DATA DIR New oldest replay DIR Δ Changes READ E1 Write E1 Time 1 Time 2 Expire Time 0 Release space back to pool <ul><li>When a Replay expires the information is coalesced </li></ul><ul><li>Unneeded pages are released back into the common pool </li></ul>
    24. 24. Data Instant Replay – How It Works C2 E1 A B C1 D E READ A READ C2 Write C2 DATA DIR Δ Changes READ E1 Write E1 Time 1 Time 2 <ul><li>Expiration complete </li></ul>
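The sequence in the preceding diagrams (freeze pointers, write changes to new pages, coalesce on expiry) can be modelled in a few lines. This is a toy model under stated assumptions: each Replay is stored as a dictionary of only the pages written since the previous Replay, and all class and method names are hypothetical.

```python
class Volume:
    """Toy model of Replays as frozen layers of page pointers."""

    def __init__(self):
        self.replays = []    # oldest first; each layer maps slot -> page
        self.active = {}     # pages written since the last Replay
        self.released = []   # pages returned to the page pool

    def write(self, slot, page):
        # New writes land in the active layer; frozen layers never change.
        self.active[slot] = page

    def take_replay(self):
        # Freeze the current pointers and start a new, empty active layer.
        self.replays.append(self.active)
        self.active = {}

    def read(self, slot):
        # The newest layer that holds the slot wins.
        for layer in [self.active] + self.replays[::-1]:
            if slot in layer:
                return layer[slot]
        return None

    def expire_oldest(self):
        """Coalesce the oldest Replay into the next layer, freeing old pages."""
        oldest = self.replays.pop(0)
        next_layer = self.replays[0] if self.replays else self.active
        for slot, page in oldest.items():
            if slot in next_layer:
                self.released.append(page)   # superseded, no longer needed
            else:
                next_layer[slot] = page      # still the current version
```

Replaying the slides' example (write A to E, take a Replay, write C1, take a Replay, write C2 and E1, then expire the oldest Replay) releases only the original page C, and the new oldest Replay holds A, B, C1, D, E.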
    25. 25. Data Instant Replay – How It Works C3 E2 A B C1 D E Time 1 C2 E1 READ A READ C2 Write C3 DIR DIR Δ Changes READ E1 Write E2 Time 2 Time 3 DATA DATA Recovery of Volume 1 -> Volume 2 DATA Volume 1 <ul><li>Recover a Replay </li></ul><ul><ul><li>This becomes a new branch </li></ul></ul><ul><ul><li>The new branch shares read-only blocks </li></ul></ul>
    26. 26. Data Instant Replay – How It Works C3 E2 A B C1 D E Time 1 C2 E1 READ A READ C2 Write C3 DIR READ E1 Write E2 Time 2 Time 3 DATA DATA C3 E2 Volume 1 Volume 2 READ A READ C2 Write C3 READ E1 Write E2 Time 0 Δ Changes Recovery View
    27. 27. Data Instant Replay – How It Works C4 C3 E2 A B C1 D E Time 1 C2 E1 READ A READ C3 Write C4 DIR READ E2 Time 2 Time 4 DATA DATA B1 C3 E2 Volume 1 Volume 2 READ D READ B Write B1 READ E2 Time 0 DIR Δ Changes Δ Changes
    28. 28. Data Instant Replay – How It Works C4 C3 E2 A B C1 D E Time 1 A B C2 D E1 READ A READ C4 Write C4 DIR Coalesced Version READ E2 Time 2 Time 4 DATA DATA B1 C3 E2 Volume 1 Volume 2 READ A READ C3 Write C3 READ E2 Write E2 Time 0 DIR Δ Changes Release space back to pool
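The slide notes point out that a Replay cannot be expired while a recovered volume (Volume 2 in the diagrams) still relies on it. One simple way to model that rule is to track dependents per Replay, as sketched below. The names are hypothetical, and the slides do not show how the real system tracks this dependency.

```python
from dataclasses import dataclass, field


@dataclass
class Replay:
    name: str
    dependents: list = field(default_factory=list)  # volumes branched from it


def recover_to_volume(replay: Replay, volume_name: str) -> str:
    """Create a new branch that shares the Replay's read-only pages."""
    replay.dependents.append(volume_name)
    return volume_name


def can_expire(replay: Replay) -> bool:
    """A Replay may expire only when no branch still depends on it."""
    return not replay.dependents
```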
    29. 29. Data Instant Replay - Application Integration <ul><li>Integrate Replays with applications to provide consistency </li></ul><ul><ul><li>Microsoft applications – Replay Manager </li></ul></ul><ul><ul><ul><li>Windows Server 2003 and above </li></ul></ul></ul><ul><ul><ul><li>Exchange 2003 and above </li></ul></ul></ul><ul><ul><ul><li>SQL 2000 and above </li></ul></ul></ul><ul><ul><li>Compellent Command Utility </li></ul></ul><ul><ul><ul><li>Command line control of DIR </li></ul></ul></ul><ul><ul><ul><li>Free, downloadable from Knowledge Center </li></ul></ul></ul>
    30. 30. Volume Configuration Best Practices <ul><li>The combination of Data Progression, Fast Track, and Data Instant Replay provides a powerful, automated storage system </li></ul><ul><li>Compellent recommends following our best practices to take full advantage of these features </li></ul><ul><li>Replay best practices </li></ul><ul><ul><li>All volumes should have at least one Replay scheduled per day with a retention of 48 hours or longer </li></ul></ul><ul><ul><li>Exception is log, swap, and pagefile volumes – no need to progress this data </li></ul></ul><ul><li>Volume best practices </li></ul><ul><ul><li>Configure all volumes for the Recommended Storage Profile </li></ul></ul><ul><ul><li>This profile uses R10 for writes and R5 across tiers for Replays </li></ul></ul><ul><ul><li>Exception is log, swap, and pagefile volumes – suggest the High Priority (Tier1) </li></ul></ul>
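The best practices above are mechanical enough to check in a script. The sketch below encodes them against a hypothetical volume description; the `VolumeConfig` fields and profile strings are assumptions for illustration and do not correspond to a real Compellent API or export format.

```python
from dataclasses import dataclass
from typing import List

EXEMPT_ROLES = {"log", "swap", "pagefile"}


@dataclass
class VolumeConfig:
    name: str
    role: str                # e.g. "data", "log", "swap", "pagefile"
    replays_per_day: int
    retention_hours: int
    storage_profile: str     # e.g. "Recommended" or "High Priority"


def check_best_practices(vol: VolumeConfig) -> List[str]:
    """Return a list of deviations from the slide's recommendations."""
    issues = []
    if vol.role in EXEMPT_ROLES:
        # Log, swap, and pagefile volumes: no Replays needed, Tier 1 suggested.
        if vol.storage_profile != "High Priority":
            issues.append(f"{vol.name}: suggest the High Priority profile")
        return issues
    if vol.replays_per_day < 1:
        issues.append(f"{vol.name}: schedule at least one Replay per day")
    if vol.retention_hours < 48:
        issues.append(f"{vol.name}: retain Replays for 48 hours or longer")
    if vol.storage_profile != "Recommended":
        issues.append(f"{vol.name}: use the Recommended Storage Profile")
    return issues
```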
    31. 31. Storage Profile Details
    32. 32. Storage Profile Details
    33. 33. Custom Storage Profiles
