Introduction to Storage Deduplication for the SQL Server DBA

  1. Introduction to Storage Deduplication for the SQL Server DBA (SQLDBApros: SQL Server DBA Professionals)
  2. Introduction to deduplication: SQL Server DBAs across the industry increasingly face requests to place database backups on deduplication storage, and must also weigh whether compressing backups that land on deduplication storage is a good idea.
  3. Let us explain: Deduplication is not a new term; it has circulated widely over the past few years as major companies released deduplicating storage devices. Deduplication simply means not saving the same data repeatedly.
  4. Imagine this: Picture the original data as separate files. The same files multiply as multiple users save copies to their home directories, creating an excess of duplicate files that contain identical information. The object of deduplication is to store unique information once rather than repeatedly: once a copy of a file has been saved, subsequent copies simply point to the original data instead of saving the same information again and again.
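A minimal sketch of this whole-file idea, in Python (the class and method names are illustrative, not any vendor's API): files are keyed by a hash of their content, so saving an already-seen file just records another pointer to the bytes stored the first time.

```python
import hashlib

class FileStore:
    """Toy whole-file deduplicating store: one copy per unique content."""

    def __init__(self):
        self.blobs = {}   # content hash -> file bytes (stored once)
        self.paths = {}   # user-visible path -> content hash (a pointer)

    def save(self, path, data):
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:      # first copy: store the bytes
            self.blobs[digest] = data
        self.paths[path] = digest         # every later copy is just a pointer

    def read(self, path):
        return self.blobs[self.paths[path]]

store = FileStore()
report = b"quarterly sales figures..."
store.save("/home/alice/report.xlsx", report)
store.save("/home/bob/report.xlsx", report)   # no second copy is stored
assert len(store.blobs) == 1
```

Real appliances work below the file level, as the next slide explains.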
  5. Chunking: Files enter the process intact, as they would in any storage system, and are then deduplicated and compressed; on many appliances this happens in real time. Deduplication appliances are more sophisticated than the file example above: most offer an architecture in which incoming data is deduplicated inline (before it hits the disk), and the unique data is then compressed and stored. Sophisticated algorithms break files into smaller pieces, a process called "chunking." Most chunking algorithms offer "variable block" processing and "sliding windows" that tolerate changes within files without much loss of deduplication.
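A minimal sketch of content-defined ("variable block") chunking, assuming a simple polynomial rolling hash over a sliding window; real appliances use more elaborate fingerprinting (Rabin-style schemes are common), but the boundary-selection idea is the same: cut wherever the window's hash matches a pattern, so boundaries depend on content rather than on fixed offsets.

```python
def chunk(data, window=48, mask=(1 << 13) - 1, min_size=2048, max_size=65536):
    """Split bytes into variable-size chunks at content-defined boundaries."""
    BASE, MOD = 257, (1 << 61) - 1
    pow_w = pow(BASE, window - 1, MOD)   # weight of the byte leaving the window
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        if i - start >= window:
            # slide the window: drop the oldest byte, admit the new one
            h = ((h - data[i - window] * pow_w) * BASE + b) % MOD
        else:
            h = (h * BASE + b) % MOD
        size = i - start + 1
        # cut where the hash's low bits are zero (content-defined boundary),
        # with min/max bounds to keep chunk sizes reasonable (~8 KiB average)
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Because boundaries are chosen by content, inserting or editing a few bytes only shifts the boundaries near the edit; chunks elsewhere in the file keep their old fingerprints and still deduplicate.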
  6. Fingerprints: For instance, if a one-line change is made to one of several nearly identical files, sliding windows and variable block sizes break the larger file into smaller pieces and store only the changed information, rather than treating the entire file as new because of one changed line. Each chunk of data is hashed; think of the hash as a fingerprint. If the system encounters a piece of data bearing a fingerprint it recognizes, it merely updates the file map and reference count without having to save that data again. Unique data is saved and compressed.
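A minimal sketch of the fingerprint index, reusing chunk() from the sketch above (again with illustrative names): a recognized fingerprint only bumps a reference count and extends the file map, while unique chunks are compressed and stored once.

```python
import hashlib
import zlib

class ChunkStore:
    """Toy inline-dedup store: fingerprint every chunk, keep refcounts.
    Illustrative only; uses chunk() from the chunking sketch above."""

    def __init__(self):
        self.chunks = {}    # fingerprint -> compressed unique chunk
        self.refcount = {}  # fingerprint -> number of references
        self.files = {}     # file name -> ordered fingerprint list (file map)

    def write(self, name, data):
        recipe = []
        for piece in chunk(data):
            fp = hashlib.sha256(piece).hexdigest()      # the "fingerprint"
            if fp in self.chunks:
                self.refcount[fp] += 1                  # seen before: no new bytes
            else:
                self.chunks[fp] = zlib.compress(piece)  # unique: compress and store
                self.refcount[fp] = 1
            recipe.append(fp)
        self.files[name] = recipe                       # update the file map
```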
  7. Reduce storage needs: Both target-side and source-side (at the server itself) deduplication reduce demand for storage by eliminating redundant data, and source-side deduplication also reduces the amount of data sent across the network. With source-side deduplication, vendor APIs can quickly query the storage device to learn whether chunks of data already reside there, rather than sending all of the bytes across the network for the storage device to process.
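A sketch of that query-before-send exchange, reusing chunk() from the chunking sketch and assuming a hypothetical remote_store object (query_missing, put_chunk, and put_recipe are invented stand-ins, not any real vendor's API):

```python
import hashlib

def backup(name, data, remote_store):
    """Source-side dedup sketch: fingerprint locally, ask the target which
    chunks it lacks, then send only those. remote_store and its methods
    are hypothetical stand-ins for a vendor API."""
    pieces = [(hashlib.sha256(p).hexdigest(), p) for p in chunk(data)]
    recipe = [fp for fp, _ in pieces]
    missing = set(remote_store.query_missing(recipe))  # one cheap round trip
    for fp, piece in pieces:
        if fp in missing:
            remote_store.put_chunk(fp, piece)  # only new bytes cross the network
            missing.discard(fp)                # don't resend a repeated chunk
    remote_store.put_recipe(name, recipe)      # the small file map, sent each time
```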
  8. Replication: Replication is another key feature. The replication features found in today's deduplication appliances are a boon for DBAs, letting them ensure that backup data from one data center is easily replicated to another by moving only the needed deduplicated and compressed chunks.
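A sketch of that idea between two ChunkStore-like sites from the fingerprint sketch above (reference-count bookkeeping is omitted for brevity): only the chunks the destination lacks travel over the wire, plus the small file maps.

```python
def replicate(src, dst):
    """Ship only chunks the destination site does not already hold.
    src and dst are ChunkStore instances from the sketch above."""
    missing = [fp for fp in src.chunks if fp not in dst.chunks]
    for fp in missing:
        dst.chunks[fp] = src.chunks[fp]  # already compressed; ships as-is
    dst.files.update(src.files)          # file maps are tiny by comparison
    return len(missing)                  # chunks that actually crossed the WAN
```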
  9. Rehydration: Deduplication does mean that a file must eventually be "rehydrated" (put back together) when it is needed: its chunks are read, decompressed, and reassembled. Reads may be slower than from non-deduplicated storage because of the required decompression, reassembly, and transmission over the network, and fragmentation on the storage device can add further rehydration overhead.
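A minimal rehydration sketch against the toy ChunkStore above: walk the file map in order, fetch each referenced chunk, decompress it, and concatenate. Each fingerprint may point anywhere on disk, which is why fragmentation hurts restore speed on real appliances.

```python
import zlib

def read_file(store, name):
    """Rehydrate a file from the toy ChunkStore: the file map (recipe)
    lists fingerprints in order; fetch, decompress, and reassemble."""
    recipe = store.files[name]
    return b"".join(zlib.decompress(store.chunks[fp]) for fp in recipe)
```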
  10. Learn More: Backup Compression Tool (free trial download), Compression Whitepaper, and follow us on Twitter @SQLDBApros.
