Unlock the potential of Erasure Coding with this comprehensive exploration, designed to enhance your understanding of how this advanced data protection technique can significantly improve storage efficiency and reliability. This guide delves into the mechanics of Erasure Coding, comparing it to traditional redundancy methods like RAID, and showcases its benefits in terms of scalability, fault tolerance, and cost-effectiveness. To learn more: https://stonefly.com/white-papers/innovative-method-data-protection-disaster-recovery/
2. Erasure Coding
Table of Contents
Summary
What is Erasure Coding?
Replication versus Erasure Coding
Benefits of Erasure Coding
Uses of Erasure Coding
Prospects of Erasure Coding
StoneFly NAS and Erasure Coding
Summary
Rapid technological advances are giving the world ever more insight into methods of data protection and disaster recovery. As we move into the world of big data, understanding backup and recovery has become a top priority.
Business continuity surpasses every other concern, so IT experts have to make smart choices about their storage and backup needs. Businesses need advanced disaster recovery methods to keep their operations running.
Experts are moving away from older storage methods towards better ones, taking steps to ensure that a disaster is mitigated easily and quickly, without stoppages to the business. One of the up-and-coming methods of data protection used in NAS is ‘Erasure Coding’. This method stores data in such a way that it remains recoverable over the long term.
Achieving this calls for storage solutions that include replication, snapshots and cloning. Compression is also crucial for conserving network bandwidth and storage capacity. The network must be set up with the right level of granularity to enhance efficiency and control disaster recovery and data protection costs.
Primary storage must provide the tools needed to monitor and control disaster recovery and data protection processes. It should give you the flexibility to choose from a variety of secondary storage targets, and the ability to recover and replicate from a backup repository.
This model describes what a good storage service provides to the customer. This guide goes on to explain what Erasure Coding is, how it works, what its benefits are and what its growth prospects are.
What is Erasure Coding?
Erasure Coding is a method of data protection in which data is broken down into fragments, expanded and encoded with redundant data pieces. The fragments are then stored across a set of different locations inside the cloud or on storage media.
Erasure codes are a proven technology: they are used in everyday items such as DVDs and cell phones. Erasure codes, which are also known as forward error correction (FEC) codes, were developed more than 50 years ago.
The data is stored on bricks or disks which together make up a node; the more data there is, the greater the probability that some nodes will go down. Erasure Coding mitigates the impact of such failures. It splits the data into many fragments so that, if a node does go down, the data can be reconstructed from the redundant pieces of information held elsewhere in the cloud.
Because each brick or disk holds redundancy information, the data can readily be reconstructed in case of failures.
The configuration of Erasure Coding can be broken down as follows:
n = k + m
n: total number of bricks
k: minimum number of bricks required to reconstruct the data
m: number of bricks which can go down
Erasure Coding allows for the reconstruction of data using redundant pieces of
information in the cloud which are dispersed as per the configuration the user
wants.
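The relationship n = k + m can be turned into a quick capacity calculation. The following is a minimal sketch, not part of any StoneFly product; the function name `describe_layout` and the dictionary keys are hypothetical and chosen only for illustration.

```python
def describe_layout(k: int, m: int) -> dict:
    """Summarise an erasure-coded layout with k data bricks and m redundant bricks."""
    if k < 1 or m < 0:
        raise ValueError("k must be >= 1 and m must be >= 0")
    n = k + m  # total number of bricks, as in n = k + m
    return {
        "total_bricks": n,
        "tolerated_failures": m,       # bricks which can go down
        "min_bricks_to_rebuild": k,    # bricks needed to reconstruct the data
        "raw_per_usable": n / k,       # raw capacity consumed per unit of user data
        "usable_fraction": k / n,      # share of raw capacity holding user data
    }


# Example: 4 data bricks plus 2 redundant bricks.
print(describe_layout(k=4, m=2))
```

With k = 4 and m = 2, two bricks can fail and the raw capacity used is 1.5 times the user data, whereas keeping three full copies would use 3 times the user data for the same tolerance of two failures.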
Replication versus Erasure Coding
The greater the amount of data, the greater the chance that some of it will be lost or damaged. Failure cannot be avoided entirely; over a long enough period, some data is bound to be damaged. However, there are various methods to resolve the problem after it has occurred.
Replication
Replication involves real-time duplication of stored or archived data over a Storage Area Network (SAN). The basic purpose of replication is disaster recovery. A storage replication service manages disaster recovery for the users. Replication typically occurs between a primary storage location and a secondary storage location.
Apart from a SAN, replication can take place over a Local Area Network (LAN), a Wide Area Network (WAN) and the Cloud. Replication is a key technology for disaster recovery. It allows users to retrieve data that has been lost over time.
Problem
Replication supports business continuity, but it also imposes considerable cost. It involves making full physical copies of the data at other locations, so it takes up a lot of storage space.
Erasure Coding
Erasure Coding is another method to avoid the loss of data from disasters. Recovery with Erasure Coding uses redundant pieces of information to reconstruct the data. Beyond protecting data, Erasure Coding draws on the information dispersed in the cloud to reconstitute the data and ensure business continuity.
The goal of Erasure Coding is to enable data that becomes corrupted at some point in the disk storage process to be reconstructed. The data is reconstructed using information about the data that is stored elsewhere in the array.
Erasure Coding improves on traditional data storage methods by reducing the time and overhead required to reconstruct the data. We now look at an example of Erasure Coding and see how it saves space in comparison to replication.
The Example
Original: a = 2, b = 3
Copy: a = 2, b = 3
In replication, the original data remains as it is and other copies of the data are
made. Replication hence requires twice the storage size of the original data.
Data fragments: a = 2, b = 3
Parity fragment: a + b = 5
In Erasure Coding, the original piece of data is used and a parity fragment is
added. The parity fragment has linear dependency on the original data.
Suppose the fragment a = 2 is lost. In replication, the user just has to go to the copy and read the data back.
In Erasure Coding, however, the value of the lost data is not directly available, but the user has the equations:
a + b = 5
b = 3
The user can read the surviving fragments and use these equations to deduce the value of ‘a’: a = 5 - 3 = 2.
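The recovery step in this example can be sketched in a few lines of Python. This is an illustration of the arithmetic only, under the assumption that the parity fragment is a plain integer sum as in the example; the function names `encode` and `recover` are hypothetical, and real storage systems typically use XOR or finite-field arithmetic instead of integer addition.

```python
def encode(a: int, b: int) -> dict:
    """Store the two data fragments plus one parity fragment (their sum)."""
    return {"a": a, "b": b, "parity": a + b}


def recover(fragments: dict) -> dict:
    """Rebuild one missing fragment (marked None) from the two survivors."""
    missing = [name for name in ("a", "b", "parity") if fragments.get(name) is None]
    if len(missing) > 1:
        raise ValueError("this layout tolerates only one lost fragment")
    restored = dict(fragments)
    if missing == ["a"]:
        restored["a"] = fragments["parity"] - fragments["b"]
    elif missing == ["b"]:
        restored["b"] = fragments["parity"] - fragments["a"]
    elif missing == ["parity"]:
        restored["parity"] = fragments["a"] + fragments["b"]
    return restored


stored = encode(2, 3)      # a = 2, b = 3, parity = 5
stored["a"] = None         # simulate losing fragment a
print(recover(stored))     # {'a': 2, 'b': 3, 'parity': 5}
```

Any single fragment can be lost and rebuilt, while only three values are stored instead of the four that a full copy would need.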
Both replication and Erasure Coding tolerate the failure. Replication, however, stores twice the size of the original data. In this example, Erasure Coding adds redundancy equal to only half the size of the original data (1.5 times the original in total), and so saves space.
Benefits of Erasure Coding
Erasure Coding creates a mathematical function to describe a set of numbers so they can be checked for accuracy and recovered if one is lost. Referred to as polynomial interpolation or oversampling, this is the key concept behind erasure codes. In mathematical terms, the protection offered by Erasure Coding can be represented in simple form.
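The polynomial interpolation idea can be illustrated with a small sketch. It assumes the k data values are treated as samples of a polynomial at x = 0 .. k-1 and that the m redundant fragments are extra samples of the same polynomial ("oversampling"). The helper names `interpolate`, `encode` and `decode` are hypothetical, and exact fractions are used for readability; production codes such as Reed-Solomon perform the same kind of arithmetic over finite fields.

```python
from fractions import Fraction


def interpolate(points, x):
    """Evaluate at x the lowest-degree polynomial through the (xi, yi) points."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= Fraction(x - xj, xi - xj)  # Lagrange basis factor
        total += term
    return total


def encode(data, m):
    """Return n = k + m fragments: the k data values followed by m extra samples."""
    k = len(data)
    points = list(enumerate(data))  # data value i sits at x = i
    extra = [interpolate(points, x) for x in range(k, k + m)]
    return list(data) + extra


def decode(survivors, k):
    """Rebuild the k data values from any k surviving (index, value) pairs."""
    if len(survivors) < k:
        raise ValueError("not enough fragments survive")
    points = survivors[:k]
    return [int(interpolate(points, x)) for x in range(k)]


fragments = encode([2, 3, 7], m=2)  # k = 3, n = 5
survivors = [(i, v) for i, v in enumerate(fragments) if i not in (0, 2)]
print(decode(survivors, k=3))       # [2, 3, 7]
```

Here two of the five fragments are lost, including two of the original data values, and the remaining three samples are enough to recover the data.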
Storage Space
Erasure Coding saves storage space: the redundant data can be as little as half the size of the original data while providing the same level of redundancy as 3 copies. Up to 50% more space is saved.
Storage Data
Each data block is encoded with redundancy, which increases data protection and integrity and allows the block to be reconstructed.
Greater Reliability
Data pieces are distributed across independent fault domains. This ensures there are no dependent or correlated failures.
Suitability
Erasure Coding can be used for any file size, ranging from small block sizes of kilobytes to large block sizes going up to petabytes.
Erasure Coding serves data storage in a way that is effective, efficient and space saving. Its method of reconstructing data is helping it become globally recognized and used.
Uses of Erasure Coding
Sequential Data
Sequential data is written in order, usually by one writer and only once. Access to sequential data also follows a predetermined pattern set by the structure of the data.
Erasure Coding performs best with sequential data writes. The Erasure Coding engine writes the original data to remote disks immediately as the data streams in. It computes the coding parts on the fly and writes them alongside. The read pattern for erasure-coded data is the same as for replicated data, because Erasure Coding also stores the data in its original form. This avoids any alteration of the stored data.
Erasure Coding reads data over the network, and it makes sure that the redundant pieces of information match the data when it is reconstructed. Erasure Coding therefore also provides users with increased data safety and minimal storage blow-up.
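The streaming behaviour described above, where the original data is written unchanged and the coding parts are computed on the fly, can be sketched as follows. This is a simplified illustration that assumes a single XOR parity chunk per stripe of k data chunks; it is not the StoneFly engine, and the names `xor_bytes` and `stream_encode` are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def xor_bytes(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(p ^ q for p, q in zip(x, y))


def stream_encode(
    chunks: Iterable[bytes], k: int, chunk_size: int
) -> Iterator[Tuple[str, bytes]]:
    """Yield ('data', chunk) as each chunk arrives, and ('parity', p) per stripe."""
    parity = bytes(chunk_size)  # all-zero running parity
    count = 0
    for chunk in chunks:
        if len(chunk) > chunk_size:
            raise ValueError("chunk larger than chunk_size")
        padded = chunk.ljust(chunk_size, b"\0")
        yield ("data", padded)              # original data is written as-is
        parity = xor_bytes(parity, padded)  # coding part updated on the fly
        count += 1
        if count == k:
            yield ("parity", parity)
            parity = bytes(chunk_size)
            count = 0
    if count:
        yield ("parity", parity)            # parity for a final partial stripe


written = list(stream_encode([b"abcd", b"efgh", b"ij"], k=2, chunk_size=4))
print([kind for kind, _ in written])
# A lost data chunk is rebuilt from the parity and the surviving chunk:
print(xor_bytes(written[2][1], written[1][1]))  # b'abcd'
```

Because the data chunks are emitted unmodified, a reader that finds all data chunks intact never needs to touch the parity, which is why the read pattern matches that of replicated data.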
Archiving
As a user, you want all your data at your fingertips whenever it is needed. Large datasets need to be archived, and the loss of valuable information can harm the business.
Erasure Coding works best with large datasets and a large number of storage elements. It requires high CPU utilization and incurs greater latency, which makes it most suitable for archiving applications, where storage is long-term and latency matters less.
Object Storage
Erasure Coding reconstitutes data from pieces of information dispersed in the cloud, which makes it well suited to object storage. Very large-volume cloud operators can use it to maximize the efficiency of their storage.
Latency
Archiving is not sensitive to latency, so Erasure Coding is the most suitable option for such applications. Storage media cannot guarantee how long stored data will last. Erasure Coding allows the data to be reconstructed when a medium fails, extending the useful life of the stored data.
Prospects of Erasure Coding
The future holds rapid growth for Erasure Coding as its latency improves. It provides users with levels of resilience and reliability much higher than those of traditional storage methods.
Erasure Coding can be used for large-scale dispersed storage systems, which opens up a whole new world for it as big data becomes ever more important.
Health facilities, government organizations, oil companies and any other companies that manage huge datasets can reap the advantages of Erasure Coding.
A lack of familiarity among storage managers and buyers does create some hindrance for Erasure Coding. It might not be adopted instantly by such users, but with innovation and growing data volumes, Erasure Coding has a bright future.
StoneFly NAS and Erasure Coding
Since the turn of the century, datasets have grown enormously. The large storage capacities being created increase the risk of data failure and impose high costs on users. Methods have been designed to keep users' storage under control.
StoneFly’s Scale-out NAS storage provides a complete cloud backup and disaster recovery solution. It has been developed to reduce the storage space required and to provide an ideal platform for the storage, protection and management of data. The NAS uses Erasure Coding, which helps reduce the storage space taken by up to 40%. The redundant codes used to reconstruct data take up half the size of the original data.
The scale-out architecture of the NAS helps in achieving increased scalability, performance and redundancy. Erasure Coding uses redundant pieces of information dispersed in the cloud to reconstruct original data. It uses independent failure domains to increase reliability, ensuring long-term retention of data.
Alongside smaller datasets, the NAS supports petabytes of data. Erasure Coding is well suited to large data: it saves space as it expands, transforms, slices and disperses the data across a network of storage nodes in the cloud.
The redundant information dispersed in the cloud makes the reconstruction of data fast and easy.
We hope the Erasure Coding essential guide got you thinking. Now is the time to
shift to StoneFly for your storage needs.
For more information about products built specifically for the modern enterprise, just visit: www.Stonefly.com/products
You can see how easy it is to guarantee performance, scaling-out, reliability and
durability in storage.
Organizations including the US Department of Defense, National Institute of Health, United States Army Research Laboratory, Ecker Enterprises, Life Care Assurance and others have said “No to LUNs & legacy Storage”. With StoneFly they manage only virtual machines, in a fraction of the footprint and at far lower costs than traditional storage.
For more information, visit www.Stonefly.com
Follow us on Twitter @StoneFlyInc
Thanks for Reading