QOS-AWARE DATA REPLICATION FOR 
DATA-INTENSIVE APPLICATIONS IN 
CLOUD COMPUTING SYSTEMS 
Presented by: 
LansA Informatics Pvt Ltd
ABSTRACT 
• Cloud computing provides scalable computing and storage resources. More and more data-intensive 
applications are developed in this computing environment. Different applications 
have different quality-of-service (QoS) requirements. To continuously support the QoS 
requirement of an application after data corruption, we propose two QoS-aware data 
replication (QADR) algorithms in cloud computing systems. 
• The first algorithm adopts the intuitive idea of high-QoS first-replication (HQFR) to perform 
data replication. However, this greedy algorithm cannot minimize the data replication cost 
and the number of QoS-violated data replicas. To achieve these two minimum objectives, the 
second algorithm transforms the QADR problem into the well-known minimum-cost 
maximum-flow (MCMF) problem. 
• By applying the existing MCMF algorithm to solve the QADR problem, the second algorithm 
can produce the optimal solution to the QADR problem in polynomial time, but it takes more 
computational time than the first algorithm. Moreover, it is known that a cloud computing 
system usually has a large number of nodes. We also propose node combination techniques 
to reduce the possibly large data replication time. Finally, simulation experiments are 
performed to demonstrate the effectiveness of the proposed algorithms in the data 
replication and recovery.
Existing System 
• Due to a large number of nodes in the cloud computing 
system, the probability of hardware failures is nontrivial based 
on the statistical analysis of hardware failures. Some hardware 
failures will damage the disk data of nodes. As a result, the 
running data-intensive applications may not read data from 
disks successfully. 
• To tolerate the data corruption, the data replication technique 
is extensively adopted in the cloud computing system to 
provide high data availability. For example, the Amazon EC2 is 
a realistic heterogeneous cloud platform, which provides 
various infrastructure resource types to meet different user 
needs in the computing and storage resources. 
• The cloud computing system has heterogeneous 
characteristics in nodes. Note that the QoS requirement of an 
application is defined from the aspect of the request 
information. For example, in, the response time of a data 
object access is defined as the QoS requirement of an 
application in the content distribution system.
DISADVANTAGES 
• The QoS requirement of an application is not taken 
into account in the data replication. When data 
corruption occurs, the QoS requirement of the 
application cannot be supported continuously. 
• The data of a high-QoS application may be 
replicated in a low-performance node (the node 
with slow communication and disk access latencies). 
Later, if data corruption occurs in the node running 
the high-QoS application, the data of the application 
will be retrieved from the low-performance node. 
• Since the low-performance node has slow 
communication and disk access latencies, the QoS 
requirement of the high-QoS application may be 
violated.
PROPOSED SYSTEM 
• We Propose QoS-aware data replication (QADR) problem for data-intensive applications 
in cloud computing systems. The QADR problem concerns how to efficiently consider the 
QoS requirements of applications in the data replication. This can significantly reduce the 
probability that the data corruption occurs before completing data replication. Due to 
limited replication space of a storage node, the data replicas of some applications may be 
stored in lower-performance nodes. This will result in some data replicas that cannot 
meet the QoS requirements of their corresponding applications. These data replicas are 
called the QoS-violated data replicas. The number of QoS-violated data replicas is 
expected to be as small as possible. 
• To solve the QADR problem, we first propose a greedy algorithm, called the high-QoS 
first-replication (HQFR) algorithm. In this algorithm, if application i has a higher QoS 
requirement, it will take precedence over other applications to perform data replication. 
However, the HQFR algorithm cannot achieve the above minimum objective. Basically, 
the optimal solution of the QADR problem can be obtained by formulating the problem as 
an integer linear programming (ILP) formulation. However, the ILP formulation 
involves complicated computation. To find the optimal solution of the QADR problem in 
an efficient manner, we propose a new algorithm to solve the QADR problem. In this 
algorithm, the QADR problem is transformed to the minimum-cost maximum-flow 
(MCMF) problem. 
• We propose a new algorithm to solve the QADR problem. In this algorithm, the 
QADR problem is transformed to the minimum-cost maximum-flow (MCMF) problem. 
Then, an existing MCMF algorithm is utilized to optimally solve the QADR problem in 
polynomial time. Compared to the HQFR algorithm, the optimal algorithm takes more 
computational time.
ADVANTAGES 
• While minimizing the data replication cost, 
the data replication can be completed 
quickly. 
• We use node combination techniques to 
suppress the computational time of the 
QADR problem without linear growth as 
increasing the number of nodes
SYSTEM ARCHITECTURE
SYSTEM CONFIGURATION 
HARDWARE REQUIREMENTS:- 
• Processor - Pentium –IV 
• Speed - 1.1 Ghz 
• RAM - 512 MB(min) 
• Hard Disk - 40 GB 
• Key Board - Standard Windows Keyboard 
• Mouse - Two or Three Button Mouse 
• Monitor - LCD/LED 
SOFTWARE REQUIREMENTS: 
• Operating system : Windows XP. 
• Coding Language : C# .Net 
• Data Base : SQL Server 2005 
• Tool : VISUAL STUDIO 2008.
REFERENCE 
• Jenn-Wei Lin, Chien-Hung Chen, and J. Morris 
Chang, “QOS-AWARE DATA REPLICATION FOR DATA-INTENSIVE 
APPLICATIONS IN CLOUD COMPUTING 
SYSTEMS” IEEE TRANSACTIONS ON CLOUD 
COMPUTING, VOL. 1, NO. 1, JANUARY-JUNE 2013
OFFICE ADDRESS: 
LansA Informatics Pvt ltd 
No 165, 5th Street, 
Crosscut Road, Gandhipuram, 
Coimbatore - 641 015 
JOIN US! 
OTHER MODE OF 
CONTACT: 
Landline: 0422 – 4204373 
Mobile : +91 90 953 953 33 
+91 91 591 159 69 
Email ID: 
studentscdc@lansainformatics.com 
web: www.lansainformatics.com 
Blog: 
www.lansastudentscdc.blogspot.com 
Facebook: 
www.facebook.com/lansainformatics

QoS-Aware Data Replication for Data-Intensive Applications in Cloud Computing Systems

  • 1.
    QOS-AWARE DATA REPLICATIONFOR DATA-INTENSIVE APPLICATIONS IN CLOUD COMPUTING SYSTEMS Presented by: LansA Informatics Pvt Ltd
  • 2.
    ABSTRACT • Cloudcomputing provides scalable computing and storage resources. More and more data-intensive applications are developed in this computing environment. Different applications have different quality-of-service (QoS) requirements. To continuously support the QoS requirement of an application after data corruption, we propose two QoS-aware data replication (QADR) algorithms in cloud computing systems. • The first algorithm adopts the intuitive idea of high-QoS first-replication (HQFR) to perform data replication. However, this greedy algorithm cannot minimize the data replication cost and the number of QoS-violated data replicas. To achieve these two minimum objectives, the second algorithm transforms the QADR problem into the well-known minimum-cost maximum-flow (MCMF) problem. • By applying the existing MCMF algorithm to solve the QADR problem, the second algorithm can produce the optimal solution to the QADR problem in polynomial time, but it takes more computational time than the first algorithm. Moreover, it is known that a cloud computing system usually has a large number of nodes. We also propose node combination techniques to reduce the possibly large data replication time. Finally, simulation experiments are performed to demonstrate the effectiveness of the proposed algorithms in the data replication and recovery.
  • 3.
    Existing System •Due to a large number of nodes in the cloud computing system, the probability of hardware failures is nontrivial based on the statistical analysis of hardware failures. Some hardware failures will damage the disk data of nodes. As a result, the running data-intensive applications may not read data from disks successfully. • To tolerate the data corruption, the data replication technique is extensively adopted in the cloud computing system to provide high data availability. For example, the Amazon EC2 is a realistic heterogeneous cloud platform, which provides various infrastructure resource types to meet different user needs in the computing and storage resources. • The cloud computing system has heterogeneous characteristics in nodes. Note that the QoS requirement of an application is defined from the aspect of the request information. For example, in, the response time of a data object access is defined as the QoS requirement of an application in the content distribution system.
  • 4.
    DISADVANTAGES • TheQoS requirement of an application is not taken into account in the data replication. When data corruption occurs, the QoS requirement of the application cannot be supported continuously. • The data of a high-QoS application may be replicated in a low-performance node (the node with slow communication and disk access latencies). Later, if data corruption occurs in the node running the high-QoS application, the data of the application will be retrieved from the low-performance node. • Since the low-performance node has slow communication and disk access latencies, the QoS requirement of the high-QoS application may be violated.
  • 5.
    PROPOSED SYSTEM •We Propose QoS-aware data replication (QADR) problem for data-intensive applications in cloud computing systems. The QADR problem concerns how to efficiently consider the QoS requirements of applications in the data replication. This can significantly reduce the probability that the data corruption occurs before completing data replication. Due to limited replication space of a storage node, the data replicas of some applications may be stored in lower-performance nodes. This will result in some data replicas that cannot meet the QoS requirements of their corresponding applications. These data replicas are called the QoS-violated data replicas. The number of QoS-violated data replicas is expected to be as small as possible. • To solve the QADR problem, we first propose a greedy algorithm, called the high-QoS first-replication (HQFR) algorithm. In this algorithm, if application i has a higher QoS requirement, it will take precedence over other applications to perform data replication. However, the HQFR algorithm cannot achieve the above minimum objective. Basically, the optimal solution of the QADR problem can be obtained by formulating the problem as an integer linear programming (ILP) formulation. However, the ILP formulation involves complicated computation. To find the optimal solution of the QADR problem in an efficient manner, we propose a new algorithm to solve the QADR problem. In this algorithm, the QADR problem is transformed to the minimum-cost maximum-flow (MCMF) problem. • We propose a new algorithm to solve the QADR problem. In this algorithm, the QADR problem is transformed to the minimum-cost maximum-flow (MCMF) problem. Then, an existing MCMF algorithm is utilized to optimally solve the QADR problem in polynomial time. Compared to the HQFR algorithm, the optimal algorithm takes more computational time.
  • 6.
    ADVANTAGES • Whileminimizing the data replication cost, the data replication can be completed quickly. • We use node combination techniques to suppress the computational time of the QADR problem without linear growth as increasing the number of nodes
  • 7.
  • 8.
    SYSTEM CONFIGURATION HARDWAREREQUIREMENTS:- • Processor - Pentium –IV • Speed - 1.1 Ghz • RAM - 512 MB(min) • Hard Disk - 40 GB • Key Board - Standard Windows Keyboard • Mouse - Two or Three Button Mouse • Monitor - LCD/LED SOFTWARE REQUIREMENTS: • Operating system : Windows XP. • Coding Language : C# .Net • Data Base : SQL Server 2005 • Tool : VISUAL STUDIO 2008.
  • 9.
    REFERENCE • Jenn-WeiLin, Chien-Hung Chen, and J. Morris Chang, “QOS-AWARE DATA REPLICATION FOR DATA-INTENSIVE APPLICATIONS IN CLOUD COMPUTING SYSTEMS” IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. 1, NO. 1, JANUARY-JUNE 2013
  • 10.
    OFFICE ADDRESS: LansAInformatics Pvt ltd No 165, 5th Street, Crosscut Road, Gandhipuram, Coimbatore - 641 015 JOIN US! OTHER MODE OF CONTACT: Landline: 0422 – 4204373 Mobile : +91 90 953 953 33 +91 91 591 159 69 Email ID: studentscdc@lansainformatics.com web: www.lansainformatics.com Blog: www.lansastudentscdc.blogspot.com Facebook: www.facebook.com/lansainformatics