The Case for RAID 4: Cloud-RAID Integration
with Local Storage
Christopher Hansen
Electrical and Computer Engineering
Brigham Young University
Provo, UT, USA
Email: cghansen@protonmail.com
James Archibald
Electrical and Computer Engineering
Brigham Young University
Provo, UT, USA
Email: james archibald@byu.edu
Abstract—The proliferation of the Internet of Things (IoT)
requires innovative solutions for all aspects of computing,
including storage. The small footprint of IoT devices limits
their capacity for reliable local storage. We present a solution
that combines local and cloud storage in a RAID-like (Redundant
Array of Independent Disks) configuration, increasing the
amount of storage, access speed, and/or data reliability and
availability of systems that implement the discussed configurations.
Cloud-RAID, in which data is distributed across multiple
cloud storage providers, has previously been proposed and implemented.
However, current architectures emphasize RAID 0, and the
application of other RAID levels to cloud storage
has not been thoroughly explored. We present a novel architecture for
local+cloud-RAID storage and discuss its benefits
in the areas of availability, reliability, and
security. We quantify the reliability
of various configurations of RAID, cloud-RAID, and hybrid
local+cloud-RAID levels. While RAID 4 has been
widely regarded as obsolete and supplanted by RAID 5, we argue
that RAID 4 can be useful in a local+cloud-RAID configuration.
We also propose a new RAID level based on RAID 4, with the addition of a second
dedicated parity drive, which we term RAID 4.5. We
conclude that, from the perspectives of availability,
reliability, security, and performance, cloud storage is beneficial to include in
various RAID configurations that include local drives.
Keywords—RAID, cloud, reliability, availability, security
I. INTRODUCTION AND RELATED WORK
Through the decades, the storage landscape has changed
significantly. The importance of secure and reliable storage
has increased, and simultaneously, bandwidth demands have
followed suit. Architects seek to optimize system efficiency,
security, reliability, and performance. Reliability is achieved
through redundancy, at the cost of space efficiency, using techniques
such as RAID, parity, ECC, or mirroring. Three metrics
govern the selection of a storage architecture: performance,
reliability, and feasibility. Feasibility includes space efficiency,
cost, and hardware requirements.
The need for online storage has steadily increased as devices
supplement local storage with cloud storage. While many
new systems rely on the cloud, such reliance can bring
disadvantages. Cloud access can be slow, negatively impacting
performance. Reliance on a single cloud provider can subject a
system to unavailability or compromise of critical information
[1], [2], [3], [4], [5], [6], [7], [8], [9], [10].
Cloud storage increases the importance of data integrity,
privacy, and reliability. RAID has found use in cloud drives
[1], [3], [4], [5], [6], [11], [12], [13], [14], [15]. Schnjakin et al.
describe a method of creating a RAID 0 array which includes
only cloud storage [16]. Other approaches also increase data
reliability and availability in the cloud [1], [7], [8], [9], [10],
[17]. Schemes which implement dynamic selection of cloud-based
providers have also been discussed [5], [18].
In this paper, we extend the analysis of local and cloud
storage, and contribute the following:
1) We show how inclusion of cloud drives in a local RAID
array improves reliability and security.
2) We analyze RAID levels to identify appropriate
local+cloud configurations.
3) We show how RAID 4 has promising application in
local+cloud-RAID configurations.
4) We propose a new RAID level, 4.5, which utilizes
multiple cloud parity drives combined with local storage.
II. OVERVIEW OF BASIC RAID LEVELS
RAID 0: This configuration enhances read/write performance
at the cost of reliability, as any single drive failure
causes array failure [19].
RAID 1: This configuration enhances reliability at the cost
of space efficiency. One or more drives can fail as long as a single
healthy drive remains [19].
RAID 2-3: These configurations introduce striping with
redundancy, RAID 2 at the bit level and RAID 3 at the byte level.
RAID 2 stores Hamming-code check bits on dedicated drives, while RAID 3
uses a single dedicated parity drive. Reliability is increased at the expense of
performance. These RAID levels tolerate a single drive failure
[19].
RAID 4: This configuration also uses a dedicated parity
drive, but stripes data at the block level, which improves performance
over RAID 2 and 3. However, the dedicated parity drive
can become a bandwidth bottleneck. RAID 4 also tolerates a single drive
failure [19].
RAID 5: This configuration distributes parity among all
array drives, and is much more commonly implemented than
RAID levels 2-4 due to performance improvements [19].
RAID 6: This configuration improves upon RAID 5 by
adding an orthogonal layer of distributed parity, enabling
tolerance of two drive failures [20].
2016 11th International Conference on Availability, Reliability and Security
978-1-5090-0990-9/16 $31.00 © 2016 IEEE
DOI 10.1109/ARES.2016.100
Fig. 1. A serial reliability model for RAID 0 with N elements (elements 1, 2, ..., N in series)
Fig. 2. A Markov failure model of RAID 0 with N striped drives and constant
failure rate λ (states Start and Fail; transition label NλΔt)
RAID 10: This configuration consists of two mirrored arrays
which are then striped [21].
RAID 01: This configuration consists of two striped arrays
which are then mirrored, and is less reliable than RAID 10
[21].
Higher-level RAID configurations exist, but their discussion
exceeds our scope.
III. RAID CONFIGURATION ANALYSIS
For each RAID level discussed in Section II, we analyze
reliability and comment, where applicable, on performance,
feasibility, and security in the following
scenarios:
1) RAID excluding cloud drives
2) RAID including both local and cloud drives
3) RAID including only cloud drives
Cloud storage providers publish availability data for their
services. Determining the probability of permanent data loss
for cloud providers, rather than the currently published
availability figures, remains an area of open research. Our models assume that
the probability of permanent cloud storage failure is
vanishingly small compared with the probability of local drive failure.
A. RAID Level 0
Figure 1 depicts a serial reliability model for RAID 0 for N
disks. Any state other than state 1 represents failure. A serial
model implies that no redundancy exists within the system;
parallel models include redundancy. All implementations of
RAID 0 can be modeled serially.
RAID 0 excluding cloud drives: This is the traditional
configuration for RAID 0. Figure 2 is a Markov model of the
reliability for RAID 0. In Figure 2, N is the number of distinct
drives in the RAID 0 array, and λ is the constant failure rate.
We assume equivalent and constant failure rates for all drives.
Thus, the overall failure rate for the array is Nλ, which is
proportional to the number of drives over which RAID is
configured; the reliability of the array therefore decreases as drives are added.
Serial reliability behavior is described in the reliability
equation presented by Shooman [21]:
$$R(t) = P(x_1) P(x_2) \cdots P(x_N) = \prod_{i=1}^{N} P(x_i) \tag{1}$$
Fig. 3. A parallel reliability model for RAID 1 with two local drives (states Start and Fail; transition labels 2λΔt and λΔt)
For components with constant failure rates λ_i, taken here to be
identical (λ_i = λ), the reliability equation is:
$$R(t) = \prod_{i=1}^{N} e^{-\lambda_i t} = \exp\left(-\sum_{i=1}^{N} \lambda_i t\right) = e^{-N\lambda t} \tag{2}$$
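As a minimal sketch of Equation (2), the following evaluates the reliability of a local RAID 0 array for a few array sizes. The function names and the five-year horizon are illustrative assumptions; the failure rate of 0.02 failures per year is the value the paper uses later for its MTTF plots. The MTTF helper uses the standard result that integrating e^{-Nλt} over t ≥ 0 gives 1/(Nλ).

```python
import math


def raid0_reliability(n_drives: int, failure_rate: float, t_years: float) -> float:
    """Equation (2): R(t) = exp(-N * lambda * t) for N identical drives in series."""
    return math.exp(-n_drives * failure_rate * t_years)


def raid0_mttf(n_drives: int, failure_rate: float) -> float:
    """Integrating R(t) over all t >= 0 gives MTTF = 1 / (N * lambda)."""
    return 1.0 / (n_drives * failure_rate)


if __name__ == "__main__":
    lam = 0.02  # assumed failures per drive-year
    for n in (1, 2, 4, 8):
        r5 = raid0_reliability(n, lam, t_years=5.0)
        print(f"N={n}: R(5 years)={r5:.4f}, MTTF={raid0_mttf(n, lam):.2f} years")
```

Each added drive multiplies the array failure rate, so both the reliability at a fixed time and the MTTF fall as N grows.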
RAID 0 including both local and cloud drives: This configuration
is infeasible. Read and write speeds would be
bottlenecked by the cloud drive within the array, as the read
and write performance differential between local and cloud
storage is large.
RAID 0 including only cloud drives: Cloud-RAID 0 has
been analyzed previously, as discussed in Section I [7], [8],
[9], [10], [16], [17].
To calculate the availability of this system, multiply the
availabilities of the cloud storage media which are utilized:
$$A(t) = \prod_{i=1}^{N} A_i(t) \tag{3}$$
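Equation (3) can be sketched in a few lines. The provider availabilities below are hypothetical placeholders chosen for illustration, not published figures, and the function name is an assumption.

```python
import math


def series_availability(availabilities):
    """Equation (3): a striped array is available only if every provider is."""
    values = list(availabilities)
    if any(not 0.0 <= a <= 1.0 for a in values):
        raise ValueError("each availability must lie in [0, 1]")
    return math.prod(values)


if __name__ == "__main__":
    # Hypothetical provider availabilities, not published figures.
    providers = [0.999, 0.9995, 0.9999]
    print(f"cloud-RAID 0 availability: {series_availability(providers):.6f}")
```

The product is never larger than the smallest individual availability, which is why striping across providers without redundancy lowers availability.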
B. RAID Level 1
RAID 1 excluding cloud drives: This configuration requires
a parallel reliability model, which is depicted in Figure 3. This
figure depicts a Markov model of a RAID 1 system with two
drives, one mirroring the other.
The reliability for this system follows the equation outlined
by Shooman [21]:
$$R(t) = P(x_1 + x_2 + \cdots + x_N) = 1 - P(\bar{x}_1 \bar{x}_2 \cdots \bar{x}_N) \tag{4}$$
For constant failure rate components, the reliability equation
becomes:
$$R(t) = 1 - \prod_{i=1}^{N} \left(1 - e^{-\lambda_i t}\right) \tag{5}$$
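A minimal sketch of Equation (5) follows, assuming independent drives with constant failure rates and no repair. The function name and the sample rates are illustrative assumptions.

```python
import math


def raid1_reliability(failure_rates, t_years: float) -> float:
    """Equation (5): the mirror fails only if every drive has failed by time t."""
    prob_all_failed = math.prod(1.0 - math.exp(-lam * t_years) for lam in failure_rates)
    return 1.0 - prob_all_failed


if __name__ == "__main__":
    lam = 0.02  # assumed failures per drive-year
    for n in (1, 2, 3):
        r = raid1_reliability([lam] * n, t_years=5.0)
        print(f"{n} mirrored drive(s): R(5 years)={r:.6f}")
```

With a single drive the expression reduces to e^{-λt}, and every additional mirror shrinks the probability that all copies are lost.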
RAID 1 including both local and cloud drives: This RAID
configuration is essentially a cloud-based backup of a local
drive. Examples include Google Drive, Dropbox, et cetera.
RAID 1 including only cloud drives: This configuration
consists of a cloud-based backup of another cloud drive, e.g.
using Google Drive to back up a Dropbox account. A model
for this is the same as a model for a RAID 1 array that
contains only local drives, with availability numbers used in
calculations rather than individual drive failure rates.
Fig. 4. Local+cloud-RAID 4 with dedicated cloud parity
Fig. 5. Markov reliability models of traditional RAID 5 and local+cloud-RAID 4.
S1 represents degraded operation. This model assumes constant
failure rates. (States: Start, S1, Fail; transition labels λ'Δt, λ''Δt, and μ'Δt)
C. RAID Levels 2-3
Cloud storage has no meaningful impact on the feasibility
of these RAID levels due to the resolution at which Hamming
parity codes must be generated. No analysis will be performed.
D. RAID Level 4
RAID 4 is perhaps the most intriguing of all RAID levels
when it comes to the addition of cloud drives as part of the
array. RAID 4 is rarely used in practice, with RAID 5 and 6
preferred for local drives. We observe potential for RAID 4
as outlined below.
RAID 4 excluding cloud drives: Local-only RAID 4 configurations
remain uninteresting, having been replaced by RAID
5 and 6.
RAID 4 including both local and cloud drives: This RAID
configuration shows great promise. Suppose a RAID 4 array
is created with two local drives and one cloud drive. This
configuration is depicted in Figure 4. The cloud drive is
selected to serve as the dedicated parity drive for the array.
Thus, the cloud drive absorbs all of the overhead incurred by
parity generation. Complete and quick reads can be performed
exclusively with the local drives, as the parity drive is read
only when the array runs in degraded mode or for data
verification. RAID 0 read performance can be achieved in this
configuration. Parity is dedicated to the cloud storage, so local
space is conserved. Components such as flash have limited
write cycles, and utilize error correcting codes and wear
distribution algorithms to ensure writes are evenly distributed
throughout the device fabric [22]. Traditional RAID levels 5
and 6, with interspersed parity, incur many extra write cycles
as data is updated. Our proposed configuration mitigates this
problem because all extra parity writes are performed to the
cloud. The writes do not cause wear to local physical drives,
nor do they degrade read performance of the system.
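The arrangement described above can be illustrated with ordinary XOR parity. This is a minimal sketch that models two local data blocks and one cloud-held parity block as byte strings; the helper names are hypothetical and no real cloud API is involved.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("blocks must have equal length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def update_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Small-write update: fold the change in one data block into the parity."""
    return xor_blocks(old_parity, old_data, new_data)


if __name__ == "__main__":
    local_a, local_b = b"block-A0", b"block-B0"   # one stripe on two local drives
    cloud_parity = xor_blocks(local_a, local_b)   # the only block sent to the cloud

    # Normal reads touch local_a and local_b only.
    # Degraded mode: local drive B is lost and is rebuilt from A and the parity.
    assert xor_blocks(local_a, cloud_parity) == local_b

    # Overwriting A changes only the parity block held in the cloud.
    new_a = b"block-A1"
    cloud_parity = update_parity(cloud_parity, local_a, new_a)
    assert cloud_parity == xor_blocks(new_a, local_b)
    print("parity checks passed")
```

Every extra write caused by parity maintenance lands on the cloud block, which is the property the paragraph above relies on to spare local flash.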
An additional benefit of this system is data privacy. With
only parity information stored on a cloud drive, an attacker who gains
access to the cloud drive alone cannot read the contents of any individual local drive.
One potential challenge in this system is keeping up with
writes to the array. Because parity information
is generated on the fly, it would need to be written to the
cloud at the same pace as data is written to the local drives,
and sustained writes could outstrip the available upload bandwidth.
We propose an asynchronous parity scheme which generates
and stores parity data as a background task, relaxing the
requirements of standard RAID. Parity information could be
calculated and buffered in real time, and uploaded as
bandwidth permits.
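One way such an asynchronous scheme might be organized is sketched below. This is an illustration under stated assumptions, not the authors' implementation: the class name, the fixed block size, the dirty-stripe table, and the `budget` parameter that stands in for available bandwidth are all hypothetical, and the cloud drive is modeled as a plain dictionary.

```python
BLOCK_SIZE = 8  # bytes per block; an arbitrary size for this sketch


def xor_pair(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


class AsyncParityArray:
    """Two local data drives and one cloud parity drive, all modeled as dicts."""

    def __init__(self):
        self.local = [{}, {}]    # one dict per local drive: stripe index -> block
        self.cloud_parity = {}   # stand-in for a cloud object store
        self.dirty = {}          # stripes whose parity upload is pending (insertion-ordered)

    def write(self, stripe: int, drive: int, block: bytes) -> None:
        if len(block) != BLOCK_SIZE:
            raise ValueError("block has the wrong size")
        self.local[drive][stripe] = block   # the local write completes immediately
        self.dirty[stripe] = True           # parity for this stripe is now stale

    def upload_pending(self, budget: int) -> int:
        """Recompute and upload parity for at most `budget` stale stripes."""
        uploaded = 0
        zero = bytes(BLOCK_SIZE)            # unwritten blocks count as zeros
        while self.dirty and uploaded < budget:
            stripe = next(iter(self.dirty))
            del self.dirty[stripe]
            blocks = [drive.get(stripe, zero) for drive in self.local]
            self.cloud_parity[stripe] = xor_pair(blocks[0], blocks[1])
            uploaded += 1
        return uploaded


if __name__ == "__main__":
    array = AsyncParityArray()
    array.write(0, 0, b"AAAAAAAA")
    array.write(0, 1, b"BBBBBBBB")
    array.write(0, 0, b"CCCCCCCC")   # rewrites stripe 0 before any upload happened
    print("pending stripes:", len(array.dirty))
    print("uploaded:", array.upload_pending(budget=4))
```

Repeated writes to one stripe collapse into a single pending upload, which reduces cloud traffic. The trade-off in this sketch is that a stripe is not protected by parity between the local write and the completed upload.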
Reliability for this system can be calculated with the aid of
a Markov model (see Figure 5). For local+cloud-RAID 4, λ'
= 2λ and λ'' = λ. The repair rate of the system is μ'. This
model was solved by Shooman [21]. The probability, in the
Laplace domain, of being in each state is:
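$$P_{Start}(s) = \frac{s + \lambda + \mu'}{s^2 + (3\lambda + \mu')s + 2\lambda^2} \tag{6}$$
$$P_{S1}(s) = \frac{2\lambda}{s^2 + (3\lambda + \mu')s + 2\lambda^2} \tag{7}$$
$$P_{Fail}(s) = \frac{2\lambda^2}{s\left[s^2 + (3\lambda + \mu')s + 2\lambda^2\right]} \tag{8}$$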
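The MTTF of this system is the limit of the sum of P_Start(s)
and P_S1(s) as s approaches 0, which results in:
$$\text{MTTF} = \lim_{s \to 0} \left(P_{Start}(s) + P_{S1}(s)\right) \tag{9}$$
$$= \lim_{s \to 0} \frac{s + 3\lambda + \mu'}{s^2 + (3\lambda + \mu')s + 2\lambda^2} = \frac{3\lambda + \mu'}{2\lambda^2} \tag{10}$$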
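The closed form in Equation (10) can be cross-checked numerically. The sketch below solves the mean-time-to-absorption equations for the three-state chain of Figure 5 directly (the standard relation Q_T T = -1 over the transient states) and compares the result with the closed form. Function names are assumptions, and `mu` stands for the repair rate μ'.

```python
def mttf_markov(lam1: float, lam2: float, mu: float) -> float:
    """Mean time from Start to Fail for Start -> S1 -> Fail with repair S1 -> Start.

    Solves Q_T * T = -1 over the transient states (Start, S1) by Cramer's rule,
    where lam1 is the Start -> S1 rate, lam2 the S1 -> Fail rate, mu the repair rate.
    """
    a11, a12 = -lam1, lam1
    a21, a22 = mu, -(lam2 + mu)
    det = a11 * a22 - a12 * a21
    return (a12 - a22) / det


def mttf_local_cloud_raid4(lam: float, mu: float) -> float:
    """Equation (10): MTTF = (3*lambda + mu') / (2*lambda^2)."""
    return (3.0 * lam + mu) / (2.0 * lam ** 2)


if __name__ == "__main__":
    lam, mu = 0.02, 100.0   # failures per year and repairs per year, as in Section III-F
    numeric = mttf_markov(2.0 * lam, lam, mu)
    closed = mttf_local_cloud_raid4(lam, mu)
    print(f"numeric: {numeric:.3f} years, closed form: {closed:.3f} years")
```

Passing λ' = 3λ and λ'' = 2λ to the same solver gives the corresponding value for the RAID 5 model.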
RAID 4 including only cloud drives: This configuration
provides a meaningful way to increase the reliability of cloud
storage and is similar to erasure-coding techniques for data in
the cloud, though existing systems utilize distributed parity,
which is akin to RAID 5 or RAID 6 [1], [7], [8], [9].
E. RAID Level 5
RAID 5 excluding cloud drives: This configuration can be
modeled the same way as local+cloud-RAID 4, with modified
failure rates. See Figure 5. For a three-drive RAID 5 array, λ' in this figure
becomes 3λ, and λ'' becomes 2λ.
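$$P_{Start}(s) = \frac{s + 2\lambda + \mu'}{s^2 + (5\lambda + \mu')s + 6\lambda^2} \tag{11}$$
$$P_{S1}(s) = \frac{3\lambda}{s^2 + (5\lambda + \mu')s + 6\lambda^2} \tag{12}$$
$$P_{Fail}(s) = \frac{6\lambda^2}{s\left[s^2 + (5\lambda + \mu')s + 6\lambda^2\right]} \tag{13}$$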
Likewise, the MTTF of this system is the limit of the sum
of P_Start(s) and P_S1(s) as s approaches 0, which results in:
$$\text{MTTF} = \frac{5\lambda + \mu'}{6\lambda^2} \tag{14}$$
Fig. 6. MTTF of RAID 5 and local+cloud-RAID 4 versus Repair Rate (MTTF in years on a logarithmic axis versus repairs per year from 0 to 3000, with the failure rate held at 1/50 failures/year; curves for RAID 5, RAID 4, and Simplex)
Fig. 7. MTTF of RAID 5 and local+cloud-RAID 4 versus Failure Rate (MTTF in years on a logarithmic axis versus failure rate per year from 0 to 1, with the repair rate held at 100 repairs/year; curves for RAID 5, RAID 4, and Simplex)
RAID 5 including both local and cloud drives: RAID 5 is
inferior to RAID 4 when RAID is performed across local and
cloud drives. RAID 5 requires all drives in the array to service
read requests. When one or more drives in the array exist in the
cloud, read performance would be throttled by cloud access.
RAID 5 also distributes the parity writes across all drives,
increasing write cycle wear on local devices.
RAID 5 including only cloud drives: This configuration
is similar to RAID 4 including only cloud drives, and as
previously discussed, has been implemented.
Fig. 8. Example implementation of proposed local+cloud-RAID 4.5 with
two dedicated cloud parity drives. Two local drives can fail and data is still
recoverable.
F. Traditional RAID 5 vs. local+cloud-RAID 4 Reliability
Figure 5 and the above discussion lead to an interesting
result: local+cloud-RAID 4 reliability is always higher than
traditional RAID 5 reliability for equivalent constant failure
rates between drives. The reliability improvement, obtained by
dividing Equation (10) by Equation (14), is 3(3λ + μ') / (5λ + μ').
Assuming a failure rate of λ = 0.02 failures
/ year, and μ' = 100 repairs / year, the reliability improvement
is approximately a factor of 3. Figure 6 plots the MTTFs
of local+cloud-RAID 4, traditional RAID 5, and a simplex
system against increasing repair rates, holding the failure
rate constant at 1 failure per 50 years. Figure 7 plots these
MTTFs against failure rates, holding repair rate constant at
100 repairs/year.
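The comparison can be reproduced from Equations (10) and (14). The sketch below evaluates both MTTFs and their ratio over a range of repair rates; the function names are assumptions, `mu` stands for μ', and the simplex MTTF of 1/λ is the standard result for a single drive with a constant failure rate.

```python
def mttf_local_cloud_raid4(lam: float, mu: float) -> float:
    """Equation (10)."""
    return (3.0 * lam + mu) / (2.0 * lam ** 2)


def mttf_raid5(lam: float, mu: float) -> float:
    """Equation (14)."""
    return (5.0 * lam + mu) / (6.0 * lam ** 2)


def mttf_simplex(lam: float) -> float:
    """A single drive with constant failure rate lambda."""
    return 1.0 / lam


def improvement(lam: float, mu: float) -> float:
    """Ratio of Equation (10) to Equation (14)."""
    return mttf_local_cloud_raid4(lam, mu) / mttf_raid5(lam, mu)


if __name__ == "__main__":
    lam = 0.02  # failures per year
    for mu in (1.0, 10.0, 100.0, 1000.0, 3000.0):  # repairs per year
        print(f"mu={mu:7.1f}  RAID4+cloud={mttf_local_cloud_raid4(lam, mu):12.1f}  "
              f"RAID5={mttf_raid5(lam, mu):12.1f}  ratio={improvement(lam, mu):.4f}")
```

The ratio simplifies to 3(3λ + μ') / (5λ + μ'), which exceeds 1 for all positive rates and approaches 3 when the repair rate dominates the failure rate.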
G. RAID Level 6
RAID 6 excluding cloud drives: This RAID configuration
will not be analyzed.
RAID 6 including both local and cloud drives: RAID
6 consists of RAID 5 with a second orthogonal layer of
distributed parity. In a local+cloud configuration, it suffers
from the same bottlenecks and write cycle increases as RAID
5, and will not be analyzed. We note that a RAID 4-like
configuration with a second dedicated parity drive in the
cloud offers reliability enhancement with a low penalty, if the
asynchronous parity mechanism is used for both parity layers.
We refer to this configuration as RAID 4.5. See Figure 8 for
an example implementation of RAID 4.5. This level of RAID
can offer read performance comparable to RAID 0, as well as
enhanced data privacy.
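The text does not specify how the second parity layer is computed. As an illustration only, the sketch below assumes a RAID 6-style pair: P is plain XOR parity and Q is a Reed-Solomon-style syndrome over GF(2^8) with reduction polynomial 0x11D and generator 2. All function names and the choice of code are assumptions, not the authors' design.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) with reduction polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse by exhaustive search; adequate for a sketch."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse")
    return next(c for c in range(1, 256) if gf_mul(a, c) == 1)


COEFFS = (1, 2)  # g^0 and g^1 for local drives 0 and 1, with assumed generator g = 2


def encode(d0: bytes, d1: bytes):
    """Return the two parity blocks (P, Q) destined for the cloud parity drives."""
    p = bytes(a ^ b for a, b in zip(d0, d1))
    q = bytes(gf_mul(COEFFS[0], a) ^ gf_mul(COEFFS[1], b) for a, b in zip(d0, d1))
    return p, q


def recover_both(p: bytes, q: bytes):
    """Rebuild both local blocks from P and Q after losing both local drives."""
    # P ^ Q = (g^0 ^ g^1) * d1 because the d0 terms cancel (g^0 = 1).
    inv = gf_inv(COEFFS[0] ^ COEFFS[1])
    d1 = bytes(gf_mul(inv, a ^ b) for a, b in zip(p, q))
    d0 = bytes(a ^ b for a, b in zip(p, d1))
    return d0, d1


if __name__ == "__main__":
    d0, d1 = b"local-d0", b"local-d1"
    p, q = encode(d0, d1)
    assert recover_both(p, q) == (d0, d1)
    print("both local blocks recovered from cloud parity")
```

In this two-data-drive sketch, the two parity blocks together determine both data blocks, which is what allows recovery after both local drives fail.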
RAID 6 including only cloud drives: Like RAID levels 4 and
5, this configuration is meaningful to increase the reliability
of cloud data, as previously described.
H. RAID Levels 10 and 01
RAID 10 excluding cloud drives: Traditional RAID 10
merits comparison against implementations of RAID 10 and
RAID 01 that include both local and cloud drives. Traditional
RAID 01 is supplanted by traditional RAID 10, and will not
be modeled. The reliability of RAID 10 follows the Markov
model in Figure 9. Solving, the reliability equation for this
model is:
$$P(s) = \frac{22\lambda^2 + 9\lambda s + 7\lambda\mu + s^2 + 2\mu s + \mu^2}{24\lambda^3 + (26s + 4\mu)\lambda^2 + s(9s + 7\mu)\lambda + s(s + \mu)^2} \tag{15}$$
Fig. 9. Traditional RAID 10 reliability model. S1 and S2 represent degraded
operation states. This model assumes constant failure rates and one repairman. (States: Start, S1, S2, Fail; transition labels 4λΔt, 2λΔt, 2λΔt, λΔt, μΔt, μΔt)
Fig. 10. Local+cloud-RAID 10 (stripe of mirrors) with cloud mirroring
To obtain the MTTF of this configuration, we take the limit
of this equation as s approaches 0, yielding:
$$\text{MTTF} = \frac{22\lambda^2 + 7\lambda\mu + \mu^2}{24\lambda^3 + 4\mu\lambda^2} \tag{16}$$
RAID 10 including both local and cloud drives: Consider
the stripe of mirrors array of Figure 10 where mirrors exist
in the cloud. Compare this to a mirror of stripes (RAID
01) in Figure 11 where the mirrors also exist in the cloud.
These configurations are practically identical functionally as
well as from a reliability perspective when two cloud drives
and two local drives are used. The reliability model for these
configurations is the same as in Figure 5, with λ' = 2λ and
λ'' = λ.
Modifying the λ values of Figure 9 for local+cloud-RAID
10, and plotting the MTTFs against varying drive failure rates
while holding repair rate constant, we observe in Figure 12
that local+cloud-RAID 10 always has a higher MTTF than
traditional RAID 10. Similarly, this is true if we hold failure
rate constant and vary the repair rate, as in Figure 13. This
configuration offers RAID 0-like performance.
RAID 10 including only cloud drives: This configuration
will not be analyzed.
Fig. 11. Local+cloud-RAID 01 (mirror of stripes) with cloud mirroring
Fig. 12. MTTF of RAID 10 and local+cloud-RAID 10 versus Failure Rate (MTTF in years on a logarithmic axis versus failures per year from 0 to 1, with 100 repairs/year; curves for Cloud-RAID 10, RAID 10, and Simplex)
Fig. 13. MTTF of RAID 10 and local+cloud-RAID 10 versus Repair Rate (MTTF in years on a logarithmic axis versus repairs per year from 0 to 3000, with 1/50 failures/year; curves for Cloud-RAID 10, RAID 10, and Simplex)
I. Traditional RAID 10 vs. local+cloud-RAID 10 Reliability
As Figure 12 shows, the MTTF for local+cloud-RAID 10
is always higher than for traditional RAID 10. To obtain the
reliability improvement, we divide equation 10 by equation 16.
Assuming a reasonable failure rate of 0.02 failures / year and
a repair rate of 100 possible repairs / year, the reliability
improvement is approximately 2.
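The quoted factor can be checked by evaluating Equations (10) and (16) at the stated rates. This is a minimal sketch with assumed function names; it treats the repair rate as the same value in both models.

```python
def mttf_raid10(lam: float, mu: float) -> float:
    """Equation (16): traditional four-drive RAID 10."""
    numerator = 22.0 * lam ** 2 + 7.0 * lam * mu + mu ** 2
    denominator = 24.0 * lam ** 3 + 4.0 * mu * lam ** 2
    return numerator / denominator


def mttf_local_cloud_raid10(lam: float, mu: float) -> float:
    """Equation (10), which also models local+cloud-RAID 10 with two local drives."""
    return (3.0 * lam + mu) / (2.0 * lam ** 2)


def improvement(lam: float, mu: float) -> float:
    """Ratio of Equation (10) to Equation (16)."""
    return mttf_local_cloud_raid10(lam, mu) / mttf_raid10(lam, mu)


if __name__ == "__main__":
    lam, mu = 0.02, 100.0  # failures per year, repairs per year
    print(f"RAID 10:             {mttf_raid10(lam, mu):.1f} years")
    print(f"local+cloud-RAID 10: {mttf_local_cloud_raid10(lam, mu):.1f} years")
    print(f"improvement:         {improvement(lam, mu):.4f}")
```

Algebraically the ratio reduces to (36λ² + 18λμ + 2μ²) / (22λ² + 7λμ + μ²), which is greater than 1 for all positive rates and tends to 2 as the repair rate grows relative to the failure rate.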
IV. CONCLUSIONS
Cloud storage offers meaningful improvements in reliability
for many different RAID configurations. While cloud-RAID
has been a focus, local+cloud-RAID has not seen the same
attention. We conclude that local+cloud-RAID configurations
are viable, reducing connected hardware requirements, minimizing
the impact of parity-induced stress and overhead on
local drives, and increasing performance. RAID 4 is more
viable than RAID levels 5 and 6 when utilized in a combined
local+cloud configuration. Local+cloud-RAID 4, with a cloud
drive as the dedicated parity drive, achieves a higher MTTF
than traditional RAID 5. We also propose RAID 4.5, which surpasses
RAID 6 in combined local+cloud configurations. RAID 0 read
performance is achieved with local+cloud-RAID levels 4, 4.5,
10, and 01, and substantial reliability gains are realized over
RAID 0 with the implementation of these levels. The cloud
can and ought to be extensively used as a resource to increase
the reliability of local storage, including within large arrays,
and also as a resource to increase the performance and security
of systems which include only cloud-based media.
V. FUTURE WORK
A Cost-Aware Risk Management Framework for Users of
Cloud Storage: Simplifying the decision-making process for
storage architecture selection would enable system architects
to better balance cost, reliability, security,
and performance. New levels and configurations of RAID
can complicate system design by presenting options that
are currently unknown to architects. Quantifying the risk
of maintaining data in the cloud could enable architects to
better mitigate risk through proper storage architecture and
by leveraging the reliability of the cloud.
Improving the Write Performance of local+cloud-RAID 4
and 4.5: Schemes to improve upon the write performance of
newly proposed RAID configurations will fully enable RAID
4 and RAID 4.5 to be adopted in storage systems which incur
a large number of writes.
Implementation and analysis of new RAID configurations:
Arguments have been made in favor of new RAID configurations.
However, performance of these systems in the field has
yet to be measured. Implementation of these configurations
will fuel further research.
VI. ACKNOWLEDGMENT
We thank Dr. Michael Wirthlin of BYU for providing
guidance on the techniques used to analyze RAID reliability.
REFERENCES
[1] C. Wang, Q. Wang, K. Ren, and W. Lou, “Ensuring data storage
security in cloud computing,” in Quality of Service, 2009. IWQoS. 17th
International Workshop on, July 2009, pp. 1–9.
[2] H. Abu-Libdeh, L. Princehouse, and H. Weatherspoon, “RACS:
A case for cloud storage diversity,” in Proceedings of the 1st
ACM Symposium on Cloud Computing, ser. SoCC ’10. New
York, NY, USA: ACM, 2010, pp. 229–240. [Online]. Available:
http://doi.acm.org/10.1145/1807128.1807165
[3] M. Schnjakin, D. Korsch, M. Schoenberg, and C. Meinel, “Implementation
of a secure and reliable storage above the untrusted clouds,” in
Computer Science Education (ICCSE), 2013 8th International Conference
on, April 2013, pp. 347–353.
[4] H. Graupner, K. Torkura, P. Berger, C. Meinel, and M. Schnjakin,
“Secure access control for multi-cloud resources,” in Local Computer
Networks Conference Workshops (LCN Workshops), 2015 IEEE 40th,
Oct 2015, pp. 722–729.
[5] Y. Chi, W. Cai, Z. Hong, H. C. B. Chan, and V. C. M. Leung, “A privacy
and price-aware inter-cloud system,” in 2015 IEEE 7th International
Conference on Cloud Computing Technology and Science (CloudCom),
Nov 2015, pp. 298–305.
[6] C. W. Ling and A. Datta, “InterCloud RAIDer: A do-it-yourself
multi-cloud private data backup system,” in Proceedings of the 15th
International Conference on Distributed Computing and Networking
- Volume 8314, ser. ICDCN 2014. New York, NY, USA: Springer-
Verlag New York, Inc., 2014, pp. 453–468. [Online]. Available:
http://dx.doi.org/10.1007/978-3-642-45249-9_30
[7] K. D. Bowers, A. Juels, and A. Oprea, “HAIL: A high-availability
and integrity layer for cloud storage,” in Proceedings of the 16th ACM
Conference on Computer and Communications Security, ser. CCS ’09.
New York, NY, USA: ACM, 2009, pp. 187–198. [Online]. Available:
http://doi.acm.org/10.1145/1653662.1653686
[8] G. Chockler, R. Guerraoui, I. Keidar, and M. Vukolic, “Reliable
distributed storage,” Computer, vol. 42, no. 4, pp. 60–67, April 2009.
[9] B. Mao, S. Wu, and H. Jiang, “Improving storage availability in cloud-
of-clouds with hybrid redundant data distribution,” in Parallel and
Distributed Processing Symposium (IPDPS), 2015 IEEE International,
May 2015, pp. 633–642.
[10] Q. Zhang, S. Li, Z. Li, Y. Xing, Z. Yang, and Y. Dai, “CHARM: A
cost-efficient multi-cloud data hosting scheme with high availability,”
IEEE Transactions on Cloud Computing, vol. 3, no. 3, pp. 372–386,
July 2015.
[11] A. Bessani, M. Correia, B. Quaresma, F. André, and P. Sousa, “DepSky:
Dependable and secure storage in a cloud-of-clouds,” in Proceedings
of the Sixth Conference on Computer Systems, ser. EuroSys ’11.
New York, NY, USA: ACM, 2011, pp. 31–46. [Online]. Available:
http://doi.acm.org/10.1145/1966445.1966449
[12] M. A. Alzain, B. Soh, and E. Pardede, “MCDB: Using multi-clouds
to ensure security in cloud computing,” in Dependable, Autonomic and
Secure Computing (DASC), 2011 IEEE Ninth International Conference
on, Dec 2011, pp. 784–791.
[13] M. Schnjakin, T. Metzke, and C. Meinel, “Applying erasure codes
for fault tolerance in cloud-RAID,” in Computational Science and
Engineering (CSE), 2013 IEEE 16th International Conference on, Dec
2013, pp. 66–75.
[14] M. Vrable, S. Savage, and G. M. Voelker, “BlueSky: A cloud-backed
file system for the enterprise,” in Proceedings of the 10th USENIX
Conference on File and Storage Technologies, ser. FAST’12. Berkeley,
CA, USA: USENIX Association, 2012, pp. 19–19. [Online]. Available:
http://dl.acm.org/citation.cfm?id=2208461.2208480
[15] C. Selvakumar, G. J. Rathanam, and M. R. Sumalatha, “PDDS - improving
cloud data storage security using data partitioning technique,” in
Advance Computing Conference (IACC), 2013 IEEE 3rd International,
Feb 2013, pp. 7–11.
[16] M. Schnjakin and C. Meinel, “Evaluation of cloud-RAID: A secure and
reliable storage above the clouds,” in Computer Communications and
Networks (ICCCN), 2013 22nd International Conference on, July 2013,
pp. 1–9.
[17] B. Mao, S. Wu, and H. Jiang, “Exploiting workload characteristics and
service diversity to improve the availability of cloud storage systems,”
IEEE Transactions on Parallel and Distributed Systems, vol. PP, no. 99,
pp. 1–1, 2015.
[18] C. W. Chang, P. Liu, and J. J. Wu, “Probability-based cloud storage
providers selection algorithms with maximum availability,” in Parallel
Processing (ICPP), 2012 41st International Conference on, Sept 2012,
pp. 199–208.
[19] D. A. Patterson, G. Gibson, and R. H. Katz, A case for redundant arrays
of inexpensive disks (RAID). ACM, 1988, vol. 17, no. 3.
[20] P. M. Chen, E. K. Lee, G. A. Gibson, R. H. Katz, and D. A. Patterson,
“RAID: High-performance, reliable secondary storage,” ACM Computing
Surveys (CSUR), vol. 26, no. 2, pp. 145–185, 1994.
[21] M. L. Shooman, Reliability of Computer Systems and Networks: Fault
Tolerance, Analysis and Design. New York, USA: Wiley-Interscience,
2002, pp. 112–114,438–441.
[22] S. Gregori, A. Cabrini, O. Khouri, and G. Torelli, “On-chip error
correcting techniques for new-generation flash memories,” Proceedings
of the IEEE, vol. 91, no. 4, pp. 602–616, April 2003.
240

07784576

  • 1.
    The Case forRAID 4: Cloud-RAID Integration with Local Storage Christopher Hansen Electrical and Computer Engineering Brigham Young University Provo, UT, USA Email: cghansen@protonmail.com James Archibald Electrical and Computer Engineering Brigham Young University Provo, UT, USA Email: james archibald@byu.edu Abstract—The proliferation of the Internet of Things (IoT) requires innovative solutions for all aspects of computing, including storage. The small footprint of IoT devices limits their capacity for local reliable storage. A solution is presented which combines local and cloud storage in a RAID-like (Redun- dant Array of Independent Disks) configuration, increasing the amount of storage, access speed, and/or data reliability and avail- ability for systems which implement the discussed configurations. Previously, cloud-RAID, where data is distributed across multiple cloud storage providers, has been proposed and implemented. However, the current architectures place an emphasis on RAID 0, and other levels of RAID with their application to cloud storage have not been thoroughly explored. A novel architecture for local+cloud-RAID storage is presented, and benefits provided by the architecture in the areas of availability, reliability, and security are discussed. An effort to quantify the reliability of various configurations of RAID, cloud-RAID, and hybrid local+cloud-RAID levels will be made. While RAID 4 has been widely regarded as obsolete and supplanted by RAID 5, we argue that RAID 4 can be useful in a local+cloud-RAID configuration. A new RAID level based on RAID 4, with the addition of a second dedicated parity drive, is proposed, and is deemed RAID 4.5. We conclude that cloud storage, from the perspectives of availability, reliability, security, and performance, is beneficial to include in various RAID configurations which include local drives. Keywords—RAID, cloud, reliability, availability, security I. 
INTRODUCTION AND RELATED WORK Through the decades, the storage landscape has changed significantly. The importance of secure and reliable storage has increased, and simultaneously, bandwidth demands have followed suit. Architects seek to optimize system efficiency, security, reliability, and performance. Reliability is achieved through redundancy, at the cost of space efficiency, using tech- niques such as RAID, parity, ECC, or mirroring. Three metrics govern the selection of a storage architecture: performance, reliability, and feasibility. Feasibility includes space efficiency, cost, and hardware requirements. The need for online storage has steadily increased as devices supplement local storage with cloud storage. While many new systems rely on the cloud, such reliance can bring disadvantages. Cloud access can be slow, negatively impacting performance. Reliance on a single cloud provider can subject a system to unavailability or compromise of critical information. [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]. Cloud storage increases the importance of data integrity, privacy, and reliability. RAID has found use in cloud drives. [1], [3], [4], [5], [6], [11], [12], [13], [14], [15]. Schnjakin et al. describe a method of creating a RAID 0 array which includes only cloud storage. [16]. Other approaches also increase data reliability and availability in the cloud. [1], [7], [8], [9], [10], [17]. Schemes which implement dynamic selection of cloud- based providers have also been discussed. [5], [18]. In this paper, we extend the analysis of local and cloud storage, and contribute the following: 1) We show how inclusion of cloud drives in a local RAID array improves reliability and security. 2) We analyze RAID levels to identify appropriate local+cloud configurations. 3) We show how RAID 4 has promising application in local+cloud-RAID configurations. 4) We propose a new RAID level, 4.5, which utilizes multiple cloud parity drives combined with local storage. II. 
OVERVIEW OF BASIC RAID LEVELS

RAID 0: This configuration enhances read/write performance at the cost of reliability, as any single drive failure causes array failure [19].

RAID 1: This configuration enhances reliability at the cost of efficiency. One or more drives can fail as long as a single healthy drive remains [19].

RAID 2-3: These configurations introduce striping and parity, RAID 2 at the bit level and RAID 3 at the byte level. RAID 3 uses a single dedicated parity drive, while RAID 2 stores Hamming-code check bits on multiple dedicated check drives. Reliability is increased at the expense of performance. These RAID levels tolerate a single drive failure [19].

RAID 4: This configuration also uses a dedicated parity drive, but stripes data at the block level, which improves performance over RAID 2 and 3. However, the dedicated parity drive can bottleneck bandwidth. RAID 4 also tolerates a single drive failure [19].

RAID 5: This configuration distributes parity among all array drives, and is much more commonly implemented than RAID levels 2-4 due to performance improvements [19].

RAID 6: This configuration improves upon RAID 5 by adding an orthogonal layer of distributed parity, enabling tolerance of two drive failures [20].

RAID 10: This configuration consists of two mirrored arrays which are then striped [21].

RAID 01: This configuration consists of two striped arrays which are then mirrored, and is less reliable than RAID 10 [21].

Higher level RAID configurations exist, but their discussion exceeds our scope.

2016 11th International Conference on Availability, Reliability and Security. 978-1-5090-0990-9/16 $31.00 © 2016 IEEE. DOI 10.1109/ARES.2016.100

Fig. 1. A serial reliability model for RAID 0 with N elements (blocks 1, 2, ..., N in series).

Fig. 2. A Markov failure model of RAID 0 with N striped drives and constant failure rate λ (states Start and Fail, with transition label NλΔt).

III. RAID CONFIGURATION ANALYSIS

For each RAID level discussed in Section II, we analyze reliability and comment on performance, feasibility, and security for the following scenarios:
1) RAID excluding cloud drives
2) RAID including both local and cloud drives
3) RAID including only cloud drives

Cloud storage providers publish availability data for their services. An area of open research is determining the probability of permanent data loss for cloud providers, rather than the currently provided availability figures. Our models assume that the probability of permanent cloud storage failure is vanishingly small compared with the probability of local drive failure.

A. RAID Level 0

Figure 1 depicts a serial reliability model for RAID 0 for N disks. Any state other than state 1 represents failure. A serial model implies that no redundancy exists within the system; parallel models include redundancy. All implementations of RAID 0 can be modeled serially.

RAID 0 excluding cloud drives: This is the traditional configuration for RAID 0. Figure 2 is a Markov model of the reliability for RAID 0. In Figure 2, N is the number of distinct drives in the RAID 0 array, and λ is the constant failure rate. We assume equivalent and constant failure rates for all drives. Thus, the overall failure rate for the array is Nλ. The failure rate of this configuration is therefore proportional to the number of drives over which RAID is configured, so reliability falls as drives are added.
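As an illustrative sketch of the serial model (not part of the original analysis), the Python below computes the aggregate failure rate Nλ, the survival probability under the standard exponential-lifetime assumption, and the resulting mean time to failure. The function names and the sample rate of 0.02 failures per drive-year are assumptions chosen for illustration.

```python
import math


def raid0_failure_rate(n_drives: int, lam: float) -> float:
    """Aggregate failure rate of N striped drives, each with constant rate lam."""
    return n_drives * lam


def raid0_reliability(n_drives: int, lam: float, t: float) -> float:
    """Probability that every drive survives to time t (exponential lifetimes)."""
    return math.exp(-raid0_failure_rate(n_drives, lam) * t)


def raid0_mttf(n_drives: int, lam: float) -> float:
    """Mean time to failure of the array: the reciprocal of its failure rate."""
    return 1.0 / raid0_failure_rate(n_drives, lam)


if __name__ == "__main__":
    lam = 0.02  # assumed value: failures per drive-year
    for n in (1, 2, 4, 8):
        print(n, raid0_mttf(n, lam), raid0_reliability(n, lam, 5.0))
```

Under these assumptions, doubling the number of striped drives halves the array MTTF, which is why RAID 0 is treated as the least reliable baseline.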
Serial reliability behavior is described in the reliability equation presented by Shooman [21]:

$$R(t) = P(x_1)P(x_2)\cdots P(x_N) = \prod_{i=1}^{N} P(x_i) \qquad (1)$$

For components with identical constant failure rates, the reliability equation is:

$$R(t) = \prod_{i=1}^{N} e^{-\lambda_i t} = \exp\left(-\sum_{i=1}^{N} \lambda_i t\right) = e^{-N\lambda t} \qquad (2)$$

RAID 0 including both local and cloud drives: This configuration is infeasible. Read and write speeds would be bottlenecked by the cloud drive within the array, as the read and write performance differential between local and cloud storage is large.

RAID 0 including only cloud drives: Cloud-RAID 0 has been analyzed previously, as discussed in Section I [7], [8], [9], [10], [16], [17]. To calculate the availability of this system, multiply the availabilities of the cloud storage media which are utilized:

$$A(t) = \prod_{i=1}^{N} A_i(t) \qquad (3)$$

B. RAID Level 1

RAID 1 excluding cloud drives: This configuration requires a parallel reliability model, which is depicted in Figure 3. This figure depicts a Markov model of a RAID 1 system with two drives, one mirroring the other.

Fig. 3. A parallel reliability model for RAID 1 with two local drives (states Start and Fail; transition labels 2λΔt and λΔt).

The reliability for this system follows the equation outlined by Shooman [21]:

$$R(t) = P(x_1 + x_2 + \cdots + x_N) = 1 - P(\bar{x}_1 \bar{x}_2 \cdots \bar{x}_N) \qquad (4)$$

For constant failure rate components, the reliability equation becomes:

$$R(t) = 1 - \prod_{i=1}^{N} \left(1 - e^{-\lambda_i t}\right) \qquad (5)$$

RAID 1 including both local and cloud drives: This RAID configuration is essentially a cloud-based backup of a local drive. Examples include Google Drive, Dropbox, et cetera.

RAID 1 including only cloud drives: This configuration consists of a cloud-based backup of another cloud drive, e.g. using Google Drive to back up a Dropbox account. A model for this is the same as a model for a RAID 1 array that contains only local drives, with availability numbers used in calculations rather than individual drive failure rates.

Fig. 4. Local+cloud-RAID 4 with dedicated cloud parity.

Fig. 5. Markov reliability models of traditional RAID 5 and local+cloud-RAID 4. S1 represents degraded operation. This model assumes constant failure rates. (States Start, S1, and Fail; transition labels λ''Δt, μ'Δt, and λ'Δt.)

C. RAID Levels 2-3

Cloud storage has no meaningful impact on the feasibility of these RAID levels due to the resolution at which Hamming parity codes must be generated. No analysis will be performed.

D. RAID Level 4

RAID 4 is perhaps the most intriguing of all RAID levels when it comes to the addition of cloud drives as part of the array. RAID 4 is rarely used in practice, with RAID 5 and 6 preferred for local drives. We observe potential for RAID 4 as outlined below.

RAID 4 excluding cloud drives: Local-only RAID 4 configurations remain uninteresting, having been replaced by RAID 5 and 6.

RAID 4 including both local and cloud drives: This RAID configuration shows great promise. Suppose a RAID 4 array is created with two local drives and one cloud drive. This configuration is depicted in Figure 4. The cloud drive is selected to serve as the dedicated parity drive for the array. Thus, the cloud drive absorbs all of the overhead incurred by parity generation. Complete and quick reads can be performed exclusively with the local drives, as the parity drive is read only when the array runs in degraded mode or for data verification. RAID 0 read performance can be achieved in this configuration. Parity is dedicated to the cloud storage, so local space is conserved.

Components such as flash have limited write cycles, and utilize error correcting codes and wear distribution algorithms to ensure writes are evenly distributed throughout the device fabric [22].
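The dedicated-parity arrangement described above can be illustrated with a minimal sketch. It assumes bytewise XOR parity over equally sized blocks, which is the usual single-parity construction; the function names and sample block contents are hypothetical, and no cloud storage API is modeled.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of one or more equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_parity(local_blocks: list[bytes]) -> bytes:
    """Parity block destined for the (assumed) cloud parity drive."""
    return xor_blocks(*local_blocks)


def rebuild_block(surviving: list[bytes], parity: bytes) -> bytes:
    """Degraded mode: recover the one missing local block from parity."""
    return xor_blocks(parity, *surviving)


if __name__ == "__main__":
    d0, d1 = b"local-drive-0!!!", b"local-drive-1???"
    cloud_parity = make_parity([d0, d1])
    # Normal reads touch only d0 and d1; parity is needed only after a failure.
    print(rebuild_block([d1], cloud_parity) == d0)
```

In this sketch, normal reads never touch the parity block, which mirrors the claim that the cloud drive is read only in degraded mode or for verification.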
Traditional RAID levels 5 and 6, with interspersed parity, incur many extra write cycles as data is updated. Our proposed configuration mitigates this problem because all extra parity writes are performed to the cloud. The writes do not cause wear to local physical drives, nor do they degrade read performance of the system.

An additional benefit of this system is data privacy. With only parity information stored on a cloud drive, attackers with access to a cloud drive cannot access or compromise data.

One potential challenge in this system is the difficulty in keeping up with writes to the array. Because parity information is generated on the fly in real time, it would need to be written to the cloud at the same pace as data is written to local drives. Repeated writes must not outstrip the ability of the system to keep up. We propose an asynchronous parity scheme which generates and stores parity data as a background task, relaxing the requirements of standard RAID. Parity information could be calculated and buffered in real time, and uploaded as bandwidth permits.

Reliability for this system can be calculated with the aid of a Markov model. See Figure 5. For local+cloud-RAID 4, λ' = 2λ and λ'' = λ. The repair rate of the system is μ'. This model was solved by Shooman [21].
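As a numerical cross-check of this three-state model, the sketch below computes the mean time to reach the Fail state by solving the two expected-time equations for the transient states. It assumes the transitions Start to S1 at rate λ', S1 to Fail at rate λ'', and S1 back to Start at rate μ'; the function name and the sample values (λ = 0.02 failures/year, μ' = 100 repairs/year) are chosen for illustration.

```python
def mttf_three_state(lam1: float, lam2: float, mu: float) -> float:
    """Mean time to absorption in Fail, starting from Start.

    Expected-time equations for the transient states:
        lam1 * T_start - lam1 * T_s1         = 1
        -mu  * T_start + (lam2 + mu) * T_s1  = 1
    """
    a11, a12 = lam1, -lam1
    a21, a22 = -mu, lam2 + mu
    det = a11 * a22 - a12 * a21  # algebraically equal to lam1 * lam2
    # Cramer's rule for T_start with right-hand side (1, 1).
    return (a22 - a12) / det


if __name__ == "__main__":
    lam = 0.02   # assumed failures per drive-year
    mu = 100.0   # assumed repairs per year
    # Local+cloud-RAID 4 with two local drives: lam1 = 2*lam, lam2 = lam.
    print(mttf_three_state(2 * lam, lam, mu))
```

Algebraically the result reduces to (λ' + λ'' + μ') / (λ'λ''). With no repair (μ' = 0) it becomes 1/λ' + 1/λ'', the sum of the mean waiting times for the two successive failures.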
The probability, in the Laplace domain, of being in each state is:

$$P_{Start}(s) = \frac{s + \lambda + \mu'}{s^2 + (3\lambda + \mu')s + 2\lambda^2} \qquad (6)$$

$$P_{S1}(s) = \frac{2\lambda}{s^2 + (3\lambda + \mu')s + 2\lambda^2} \qquad (7)$$

$$P_{Fail}(s) = \frac{2\lambda^2}{s\left[s^2 + (3\lambda + \mu')s + 2\lambda^2\right]} \qquad (8)$$

The MTTF of this system is the limit of the sum of P(Start) and P(S1) as s approaches 0, which results in:

$$\mathrm{MTTF} = \lim_{s \to 0} \left(P_{Start}(s) + P_{S1}(s)\right) \qquad (9)$$

$$= \lim_{s \to 0} \frac{s + 3\lambda + \mu'}{s^2 + (3\lambda + \mu')s + 2\lambda^2} = \frac{3\lambda + \mu'}{2\lambda^2} \qquad (10)$$

RAID 4 including only cloud drives: This configuration provides a meaningful way to increase the reliability of cloud storage and is similar to erasure-coding techniques for data in the cloud, though existing systems utilize distributed parity, which is akin to RAID 5 or RAID 6 [1], [7], [8], [9].

E. RAID Level 5

RAID 5 excluding cloud drives: This configuration can be modeled the same way as local+cloud-RAID 4, with modified failure rates. See Figure 5. With RAID 5, λ' in this figure becomes 3λ, and λ'' becomes 2λ.

$$P_{Start}(s) = \frac{s + 2\lambda + \mu'}{s^2 + (5\lambda + \mu')s + 6\lambda^2} \qquad (11)$$

$$P_{S1}(s) = \frac{3\lambda}{s^2 + (5\lambda + \mu')s + 6\lambda^2} \qquad (12)$$

$$P_{Fail}(s) = \frac{6\lambda^2}{s\left[s^2 + (5\lambda + \mu')s + 6\lambda^2\right]} \qquad (13)$$

Likewise, the MTTF of this system is the limit of the sum of P(Start) and P(S1) as s approaches 0, which results in:

$$\mathrm{MTTF} = \frac{5\lambda + \mu'}{6\lambda^2} \qquad (14)$$

Fig. 6. MTTF of RAID 5 and local+cloud-RAID 4 versus Repair Rate. (Plot: MTTF of Traditional RAID 5, Local+Cloud-RAID 4, and Simplex System vs. Repair Rate. Horizontal axis: repairs per year, 0 to 3000. Vertical axis: MTTF in years with 1/50 failures/year, logarithmic. Series: RAID 5, RAID 4, Simplex.)

Fig. 7. MTTF of RAID 5 and local+cloud-RAID 4 versus Failure Rate. (Plot: MTTF of Traditional RAID 5, Local+Cloud-RAID 4, and Simplex System vs. Failure Rate. Horizontal axis: failure rate per year, 0 to 1. Vertical axis: MTTF in years with repair rate = 100 repairs/year, logarithmic. Series: RAID 5, RAID 4, Simplex.)

RAID 5 including both local and cloud drives: RAID 5 is inferior to RAID 4 when RAID is performed across local and cloud drives. RAID 5 requires all drives in the array to service read requests. When one or more drives in the array exist in the cloud, read performance would be throttled by cloud access. RAID 5 also distributes the parity writes across all drives, increasing write cycle wear on local devices.

RAID 5 including only cloud drives: This configuration is similar to RAID 4 including only cloud drives, and as previously discussed, has been implemented.

Fig. 8. Example implementation of proposed local+cloud-RAID 4.5 with two dedicated cloud parity drives. Two local drives can fail and data is still recoverable.

F. Traditional RAID 5 vs. local+cloud-RAID 4 Reliability

Figure 5 and the above discussion lead to an interesting result: local+cloud-RAID 4 reliability is always higher than traditional RAID 5 reliability for equivalent constant failure rates between drives. The reliability improvement is the ratio of Equation (10) to Equation (14):

$$\frac{(3\lambda + \mu') / (2\lambda^2)}{(5\lambda + \mu') / (6\lambda^2)} = \frac{3(3\lambda + \mu')}{5\lambda + \mu'}$$

Assuming a failure rate of λ = 0.02 failures/year and a repair rate of μ' = 100 repairs/year, the reliability improvement is approximately a factor of 3. Figure 6 plots the MTTFs of local+cloud-RAID 4, traditional RAID 5, and a simplex system against increasing repair rates, holding the failure rate constant at 1 failure per 50 years.
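The comparison can be reproduced with a short worked calculation. The sketch below evaluates Equations (10) and (14) directly and prints their ratio for a few repair rates; the function names and the particular repair rates sampled are illustrative assumptions, and the simplex MTTF is taken as 1/λ for a single drive.

```python
def mttf_local_cloud_raid4(lam: float, mu: float) -> float:
    """Equation (10): two local data drives plus a cloud parity drive."""
    return (3 * lam + mu) / (2 * lam ** 2)


def mttf_raid5(lam: float, mu: float) -> float:
    """Equation (14): traditional three-drive RAID 5."""
    return (5 * lam + mu) / (6 * lam ** 2)


def mttf_simplex(lam: float) -> float:
    """Single drive with constant failure rate lam (assumed baseline)."""
    return 1.0 / lam


def improvement(lam: float, mu: float) -> float:
    """MTTF ratio of local+cloud-RAID 4 over traditional RAID 5."""
    return mttf_local_cloud_raid4(lam, mu) / mttf_raid5(lam, mu)


if __name__ == "__main__":
    lam = 0.02  # failures per year, as in the text
    for mu in (0.0, 1.0, 10.0, 100.0, 1000.0):  # assumed sample repair rates
        print(mu, mttf_raid5(lam, mu), mttf_local_cloud_raid4(lam, mu),
              improvement(lam, mu))
```

The ratio is 1.8 with no repair and approaches 3 as the repair rate grows, so for realistic repair rates the factor-of-3 figure is a close approximation.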
Figure 7 plots these MTTFs against failure rates, holding repair rate constant at 100 repairs/year.

G. RAID Level 6

RAID 6 excluding cloud drives: This RAID configuration will not be analyzed.

RAID 6 including both local and cloud drives: RAID 6 consists of RAID 5 with a second orthogonal layer of distributed parity. In a local+cloud configuration, it suffers from the same bottlenecks and write cycle increases as RAID 5, and will not be analyzed. We note that a RAID 4-like configuration with a second dedicated parity drive in the cloud offers reliability enhancement with a low penalty, if the asynchronous parity mechanism is used for both parity layers. We refer to this configuration as RAID 4.5. See Figure 8 for an example implementation of RAID 4.5. This level of RAID can offer read performance comparable to RAID 0, as well as enhanced data privacy.

RAID 6 including only cloud drives: Like RAID levels 4 and 5, this configuration is a meaningful way to increase the reliability of cloud data, as previously described.

H. RAID Levels 10 and 01

RAID 10 excluding cloud drives: Traditional RAID 10 merits comparison against implementations of RAID 10 and RAID 01 that include both local and cloud drives. Traditional RAID 01 is supplanted by traditional RAID 10, and will not be modeled. The reliability of RAID 10 follows the Markov model in Figure 9. Solving, the reliability equation for this model is:

$$P(s) = \frac{22\lambda^2 + 9\lambda s + 7\lambda\mu + s^2 + 2\mu s + \mu^2}{24\lambda^3 + (26s + 4\mu)\lambda^2 + s(9s + 7\mu)\lambda + s(s + \mu)^2} \qquad (15)$$
Fig. 9. Traditional RAID 10 reliability model. S1 and S2 represent degraded operation states. This model assumes constant failure rates and one repairman. (States Start, S1, S2, and Fail; transition labels μΔt, 4λΔt, 2λΔt, μΔt, 2λΔt, and λΔt.)

Fig. 10. Local+cloud-RAID 10 (stripe of mirrors) with cloud mirroring.

To obtain the MTTF of this configuration, we take the limit of this equation as s approaches 0, yielding:

$$\mathrm{MTTF} = \frac{22\lambda^2 + 7\lambda\mu + \mu^2}{24\lambda^3 + 4\mu\lambda^2} \qquad (16)$$

RAID 10 including both local and cloud drives: Consider the stripe of mirrors array of Figure 10, where mirrors exist in the cloud. Compare this to a mirror of stripes (RAID 01) in Figure 11, where the mirrors also exist in the cloud. These configurations are practically identical functionally as well as from a reliability perspective when two cloud drives and two local drives are used. The reliability model for these configurations is the same as in Figure 5, with λ' = 2λ and λ'' = λ. Modifying the λ values of Figure 9 for local+cloud-RAID 10, and plotting the MTTFs against varying drive failure rates while holding repair rate constant, we observe in Figure 12 that local+cloud-RAID 10 always has a higher MTTF than traditional RAID 10. The same holds if we hold failure rate constant and vary the repair rate, as in Figure 13. This configuration offers RAID 0-like performance.

RAID 10 including only cloud drives: This configuration will not be analyzed.

Fig. 11. Local+cloud-RAID 01 (mirror of stripes) with cloud mirroring.

Fig. 12. MTTF of RAID 10 and local+cloud-RAID 10 versus Failure Rate. (Plot: MTTF of Traditional RAID 10, Local+Cloud-RAID 10, and Simplex System vs. Failure Rate. Horizontal axis: failures per year, 0 to 1. Vertical axis: MTTF in years with 100 repairs/year, logarithmic. Series: Cloud-RAID 10, RAID 10, Simplex.)

Fig. 13. MTTF of RAID 10 and local+cloud-RAID 10 versus Repair Rate. (Plot: MTTF of Traditional RAID 10, Local+Cloud-RAID 10, and Simplex System vs. Repair Rate. Horizontal axis: repairs per year, 0 to 3000. Vertical axis: MTTF in years with 1/50 failures/year, logarithmic. Series: Cloud-RAID 10, RAID 10, Simplex.)

I. Traditional RAID 10 vs. local+cloud-RAID 10 Reliability

As Figure 12 shows, the MTTF for local+cloud-RAID 10 is always higher than for traditional RAID 10. To obtain the reliability improvement, we divide Equation (10) by Equation (16). Assuming a reasonable failure rate of 0.02 failures/year and a repair rate of 100 possible repairs/year, the reliability improvement is approximately a factor of 2.

IV. CONCLUSIONS

Cloud storage offers meaningful improvements in reliability for many different RAID configurations. While cloud-RAID has been a focus, local+cloud-RAID has not seen the same attention. We conclude that local+cloud-RAID configurations are viable, reducing connected hardware requirements, minimizing the impact of parity-induced stress and overhead on local drives, and increasing performance. RAID 4 is more viable than RAID levels 5 and 6 when utilized in a combined local+cloud configuration. Local+cloud-RAID 4, with a cloud drive used as the dedicated parity drive, achieves a higher MTTF than traditional RAID 5. We also propose RAID 4.5, which surpasses RAID 6 in combined local+cloud configurations. RAID 0 read performance is achieved with local+cloud-RAID levels 4, 4.5, 10, and 01, and substantial reliability gains are realized over RAID 0 with the implementation of these levels. The cloud can and ought to be extensively used as a resource to increase the reliability of local storage, including within large arrays, and also as a resource to increase the performance and security of systems which include only cloud-based media.

V. FUTURE WORK

A Cost-Aware Risk Management Framework for Users of Cloud Storage: Simplifying the decision-making process for storage architecture selection would enable system architects to better compromise between cost, reliability, security, and performance. New levels and configurations of RAID can complicate system design by presenting options that are currently unknown to architects. Quantifying the risk of maintaining data in the cloud could enable architects to better mitigate risk through proper storage architecture and leveraging the reliability of the cloud.

Improving the Write Performance of local+cloud-RAID 4 and 4.5: Schemes that improve the write performance of the newly proposed RAID configurations would allow RAID 4 and RAID 4.5 to be adopted in storage systems which incur a large number of writes.

Implementation and analysis of new RAID configurations: Arguments have been made in favor of new RAID configurations. However, performance of these systems in the field has yet to be measured. Implementation of these configurations will fuel further research.

VI.
ACKNOWLEDGMENT

We thank Dr. Michael Wirthlin of BYU for providing guidance on the techniques used to analyze RAID reliability.

REFERENCES

[1] C. Wang, Q. Wang, K. Ren, and W. Lou, “Ensuring data storage security in cloud computing,” in Quality of Service, 2009. IWQoS. 17th International Workshop on, July 2009, pp. 1–9.
[2] H. Abu-Libdeh, L. Princehouse, and H. Weatherspoon, “RACS: A case for cloud storage diversity,” in Proceedings of the 1st ACM Symposium on Cloud Computing, ser. SoCC ’10. New York, NY, USA: ACM, 2010, pp. 229–240. [Online]. Available: http://doi.acm.org/10.1145/1807128.1807165
[3] M. Schnjakin, D. Korsch, M. Schoenberg, and C. Meinel, “Implementation of a secure and reliable storage above the untrusted clouds,” in Computer Science Education (ICCSE), 2013 8th International Conference on, April 2013, pp. 347–353.
[4] H. Graupner, K. Torkura, P. Berger, C. Meinel, and M. Schnjakin, “Secure access control for multi-cloud resources,” in Local Computer Networks Conference Workshops (LCN Workshops), 2015 IEEE 40th, Oct 2015, pp. 722–729.
[5] Y. Chi, W. Cai, Z. Hong, H. C. B. Chan, and V. C. M. Leung, “A privacy and price-aware inter-cloud system,” in 2015 IEEE 7th International Conference on Cloud Computing Technology and Science (CloudCom), Nov 2015, pp. 298–305.
[6] C. W. Ling and A. Datta, “InterCloud RAIDer: A do-it-yourself multi-cloud private data backup system,” in Proceedings of the 15th International Conference on Distributed Computing and Networking - Volume 8314, ser. ICDCN 2014. New York, NY, USA: Springer-Verlag New York, Inc., 2014, pp. 453–468. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-45249-9_30
[7] K. D. Bowers, A. Juels, and A. Oprea, “HAIL: A high-availability and integrity layer for cloud storage,” in Proceedings of the 16th ACM Conference on Computer and Communications Security, ser. CCS ’09. New York, NY, USA: ACM, 2009, pp. 187–198. [Online]. Available: http://doi.acm.org/10.1145/1653662.1653686
[8] G. Chockler, R. Guerraoui, I. Keidar, and M. Vukolic, “Reliable distributed storage,” Computer, vol. 42, no. 4, pp. 60–67, April 2009.
[9] B. Mao, S. Wu, and H. Jiang, “Improving storage availability in cloud-of-clouds with hybrid redundant data distribution,” in Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International, May 2015, pp. 633–642.
[10] Q. Zhang, S. Li, Z. Li, Y. Xing, Z. Yang, and Y. Dai, “CHARM: A cost-efficient multi-cloud data hosting scheme with high availability,” IEEE Transactions on Cloud Computing, vol. 3, no. 3, pp. 372–386, July 2015.
[11] A. Bessani, M. Correia, B. Quaresma, F. André, and P. Sousa, “DepSky: Dependable and secure storage in a cloud-of-clouds,” in Proceedings of the Sixth Conference on Computer Systems, ser. EuroSys ’11. New York, NY, USA: ACM, 2011, pp. 31–46. [Online]. Available: http://doi.acm.org/10.1145/1966445.1966449
[12] M. A. Alzain, B. Soh, and E. Pardede, “MCDB: Using multi-clouds to ensure security in cloud computing,” in Dependable, Autonomic and Secure Computing (DASC), 2011 IEEE Ninth International Conference on, Dec 2011, pp. 784–791.
[13] M. Schnjakin, T. Metzke, and C. Meinel, “Applying erasure codes for fault tolerance in cloud-RAID,” in Computational Science and Engineering (CSE), 2013 IEEE 16th International Conference on, Dec 2013, pp. 66–75.
[14] M. Vrable, S. Savage, and G. M. Voelker, “BlueSky: A cloud-backed file system for the enterprise,” in Proceedings of the 10th USENIX Conference on File and Storage Technologies, ser. FAST’12. Berkeley, CA, USA: USENIX Association, 2012, pp. 19–19. [Online]. Available: http://dl.acm.org/citation.cfm?id=2208461.2208480
[15] C. Selvakumar, G. J. Rathanam, and M. R. Sumalatha, “PDDS - improving cloud data storage security using data partitioning technique,” in Advance Computing Conference (IACC), 2013 IEEE 3rd International, Feb 2013, pp. 7–11.
[16] M. Schnjakin and C. Meinel, “Evaluation of cloud-RAID: A secure and reliable storage above the clouds,” in Computer Communications and Networks (ICCCN), 2013 22nd International Conference on, July 2013, pp. 1–9.
[17] B. Mao, S. Wu, and H. Jiang, “Exploiting workload characteristics and service diversity to improve the availability of cloud storage systems,” IEEE Transactions on Parallel and Distributed Systems, vol. PP, no. 99, pp. 1–1, 2015.
[18] C. W. Chang, P. Liu, and J. J. Wu, “Probability-based cloud storage providers selection algorithms with maximum availability,” in Parallel Processing (ICPP), 2012 41st International Conference on, Sept 2012, pp. 199–208.
[19] D. A. Patterson, G. Gibson, and R. H. Katz, A case for redundant arrays of inexpensive disks (RAID). ACM, 1988, vol. 17, no. 3.
[20] P. M. Chen, E. K. Lee, G. A. Gibson, R. H. Katz, and D. A. Patterson, “RAID: High-performance, reliable secondary storage,” ACM Computing Surveys (CSUR), vol. 26, no. 2, pp. 145–185, 1994.
[21] M. L. Shooman, Reliability of Computer Systems and Networks: Fault Tolerance, Analysis and Design. New York, USA: Wiley-Interscience, 2002, pp. 112–114, 438–441.
[22] S. Gregori, A. Cabrini, O. Khouri, and G. Torelli, “On-chip error correcting techniques for new-generation flash memories,” Proceedings of the IEEE, vol. 91, no. 4, pp. 602–616, April 2003.