SlideShare a Scribd company logo
Load Rebalancing for Distributed File Systems in Clouds
ABSTRACT:
Distributed file systems are key building blocks for cloud computing applications
based on the MapReduce programming paradigm. In such file systems, nodes
simultaneously serve computing and storage functions; a file is partitioned into a
number of chunks allocated in distinct nodes so that MapReduce tasks can be
performed in parallel over the nodes. However, in a cloud computing environment,
failure is the norm, and nodes may be upgraded, replaced, and added in the system.
Files can also be dynamically created, deleted, and appended. This results in load
imbalance in a distributed file system; that is, the file chunks are not distributed as
uniformly as possible among the nodes. Emerging distributed file systems in
production systems strongly depend on a central node for chunk reallocation. This
dependence is clearly inadequate in a large-scale, failure-prone environment
because the central load balancer is put under considerable workload that is
linearly scaled with the system size, and may thus become the performance
bottleneck and the single point of failure. In this paper, a fully distributed load
rebalancing algorithm is presented to cope with the load imbalance problem. Our
algorithm is compared against a centralized approach in a production system and a
competing distributed solution presented in the literature. The simulation results
indicate that our proposal is comparable with the existing centralized approach and
considerably outperforms the prior distributed algorithm in terms of load
imbalance factor, movement cost, and algorithmic overhead. The performance of
our proposal implemented in the Hadoop distributed file system is further
investigated in a cluster environment.
ARCHITECTURE:
EXISTING SYSTEM:
State-of-the-art distributed file systems (e.g., Google GFS and Hadoop HDFS) in
clouds rely on central nodes to manage the metadata information of the file
systems and to balance the loads of storage nodes based on that metadata. The
centralized approach simplifies the design and implementation of a distributed file
system. However, recent experience concludes that when the number of storage
nodes, the number of files and the number of accesses to files increase linearly, the
central nodes (e.g., the master in Google GFS) become a performance bottleneck,
as they are unable to accommodate a large number of file accesses due to clients
and MapReduce applications.
DISADVANTAGES OF EXISTING SYSTEM:
The most existing solutions are designed without considering both movement cost
and node heterogeneity and may introduce significant maintenance network traffic
to the DHTs.
PROPOSED SYSTEM:
In this paper, we are interested in studying the load rebalancing problem in
distributed file systems specialized for large-scale, dynamic and data-
intensive clouds. (The terms “rebalance” and “balance” are interchangeable
in this paper.) Such a large-scale cloud has hundreds or thousands of nodes
(and may reach tens of thousands in the future).
Our objective is to allocate the chunks of files as uniformly as possible
among the nodes such that no node manages an excessive number of chunks.
Additionally, we aim to reduce network traffic (or movement cost) caused
by rebalancing the loads of nodes as much as possible to maximize the
network bandwidth available to normal applications. Moreover, as failure is
the norm, nodes are newly added to sustain the overall system
performance,resulting in the heterogeneity of nodes. Exploiting capable
nodes to improve the system performance is, thus, demanded.
Our proposal not only takes advantage of physical network locality in the
reallocation of file chunks to reduce the movement cost but also exploits
capable nodes to improve the overall system performance.
ADVANTAGES OF PROPOSED SYSTEM:
 This eliminates the dependence on central nodes.
 Our proposed algorithm operates in a distributed manner in which nodes
perform their load-balancing tasks independently without synchronization or
global knowledge regarding the system.
 Algorithm reduces algorithmic overhead introduced to the DHTs as much as
possible.
ALGORITHM USED:
 Load Rebalancing Algorithm
SYSTEM CONFIGURATION:-
HARDWARE CONFIGURATION:-
 Processor - Pentium –IV
 Speed - 1.1 Ghz
 RAM - 256 MB(min)
 Hard Disk - 20 GB
 Key Board - Standard Windows Keyboard
 Mouse - Two or Three Button Mouse
 Monitor - SVGA
SOFTWARE CONFIGURATION:-
 Operating System : Windows XP
 Programming Language : JAVA
 Java Version : JDK 1.6 & above.
REFERENCE:
Hung-Chang Hsiao, Member, IEEE Computer Society, Hsueh-Yi Chung, Haiying
Shen, Member, IEEE, and Yu-Chang Chao, “Load Rebalancing for Distributed
File Systems in Clouds”, IEEE TRANSACTIONS ON PARALLEL AND
DISTRIBUTED SYSTEMS, VOL. 24, NO. 5, MAY 2013.

More Related Content

What's hot

Sector Vs Hadoop
Sector Vs HadoopSector Vs Hadoop
Sector Vs Hadoop
lilyco
 
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
Cloudera, Inc.
 
Hadoop architecture-tutorial
Hadoop  architecture-tutorialHadoop  architecture-tutorial
Hadoop architecture-tutorial
vinayiqbusiness
 
Load Rebalancing for Distributed Hash Tables in Cloud Computing
Load Rebalancing for Distributed Hash Tables in Cloud ComputingLoad Rebalancing for Distributed Hash Tables in Cloud Computing
Load Rebalancing for Distributed Hash Tables in Cloud Computing
iosrjce
 
Cassandra no sql ecosystem
Cassandra no sql ecosystemCassandra no sql ecosystem
Integrating dbm ss as a read only execution layer into hadoop
Integrating dbm ss as a read only execution layer into hadoopIntegrating dbm ss as a read only execution layer into hadoop
Integrating dbm ss as a read only execution layer into hadoop
João Gabriel Lima
 
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Cloudera, Inc.
 
Cppt
CpptCppt
Hadoop, mapreduce and yarn networks
Hadoop, mapreduce and yarn networksHadoop, mapreduce and yarn networks
Hadoop, mapreduce and yarn networks
HariniA7
 
1 rh storage - architecture whitepaper
1 rh storage - architecture whitepaper1 rh storage - architecture whitepaper
1 rh storage - architecture whitepaperAccenture
 
Energy aware load balancing and application scaling for the cloud ecosystem
Energy aware load balancing and application scaling for the cloud ecosystemEnergy aware load balancing and application scaling for the cloud ecosystem
Energy aware load balancing and application scaling for the cloud ecosystem
Kamal Spring
 
A data aware caching 2415
A data aware caching 2415A data aware caching 2415
A data aware caching 2415
SANTOSH WAYAL
 
Efficient load rebalancing for distributed file system in Clouds
Efficient load rebalancing for distributed file system in CloudsEfficient load rebalancing for distributed file system in Clouds
Efficient load rebalancing for distributed file system in Clouds
IJERA Editor
 
Shuffle phase as the bottleneck in Hadoop Terasort
Shuffle phase as the bottleneck in Hadoop TerasortShuffle phase as the bottleneck in Hadoop Terasort
Shuffle phase as the bottleneck in Hadoop Terasortpramodbiligiri
 
Hadoop data management
Hadoop data managementHadoop data management
Hadoop data management
Subhas Kumar Ghosh
 
Hadoop Network Performance profile
Hadoop Network Performance profileHadoop Network Performance profile
Hadoop Network Performance profilepramodbiligiri
 
Distributed, concurrent, and independent access to encrypted cloud databases
Distributed, concurrent, and independent access to encrypted cloud databasesDistributed, concurrent, and independent access to encrypted cloud databases
Distributed, concurrent, and independent access to encrypted cloud databases
Papitha Velumani
 

What's hot (18)

Sector Vs Hadoop
Sector Vs HadoopSector Vs Hadoop
Sector Vs Hadoop
 
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
 
Hadoop architecture-tutorial
Hadoop  architecture-tutorialHadoop  architecture-tutorial
Hadoop architecture-tutorial
 
Load Rebalancing for Distributed Hash Tables in Cloud Computing
Load Rebalancing for Distributed Hash Tables in Cloud ComputingLoad Rebalancing for Distributed Hash Tables in Cloud Computing
Load Rebalancing for Distributed Hash Tables in Cloud Computing
 
Cassandra no sql ecosystem
Cassandra no sql ecosystemCassandra no sql ecosystem
Cassandra no sql ecosystem
 
4 026
4 0264 026
4 026
 
Integrating dbm ss as a read only execution layer into hadoop
Integrating dbm ss as a read only execution layer into hadoopIntegrating dbm ss as a read only execution layer into hadoop
Integrating dbm ss as a read only execution layer into hadoop
 
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
 
Cppt
CpptCppt
Cppt
 
Hadoop, mapreduce and yarn networks
Hadoop, mapreduce and yarn networksHadoop, mapreduce and yarn networks
Hadoop, mapreduce and yarn networks
 
1 rh storage - architecture whitepaper
1 rh storage - architecture whitepaper1 rh storage - architecture whitepaper
1 rh storage - architecture whitepaper
 
Energy aware load balancing and application scaling for the cloud ecosystem
Energy aware load balancing and application scaling for the cloud ecosystemEnergy aware load balancing and application scaling for the cloud ecosystem
Energy aware load balancing and application scaling for the cloud ecosystem
 
A data aware caching 2415
A data aware caching 2415A data aware caching 2415
A data aware caching 2415
 
Efficient load rebalancing for distributed file system in Clouds
Efficient load rebalancing for distributed file system in CloudsEfficient load rebalancing for distributed file system in Clouds
Efficient load rebalancing for distributed file system in Clouds
 
Shuffle phase as the bottleneck in Hadoop Terasort
Shuffle phase as the bottleneck in Hadoop TerasortShuffle phase as the bottleneck in Hadoop Terasort
Shuffle phase as the bottleneck in Hadoop Terasort
 
Hadoop data management
Hadoop data managementHadoop data management
Hadoop data management
 
Hadoop Network Performance profile
Hadoop Network Performance profileHadoop Network Performance profile
Hadoop Network Performance profile
 
Distributed, concurrent, and independent access to encrypted cloud databases
Distributed, concurrent, and independent access to encrypted cloud databasesDistributed, concurrent, and independent access to encrypted cloud databases
Distributed, concurrent, and independent access to encrypted cloud databases
 

Viewers also liked

Optimal multicast capacity and delay tradeoffs in mane ts
Optimal multicast capacity and delay tradeoffs in mane tsOptimal multicast capacity and delay tradeoffs in mane ts
Optimal multicast capacity and delay tradeoffs in mane ts
JPINFOTECH JAYAPRAKASH
 
2013 14 ieee matlab titles
2013 14 ieee matlab titles2013 14 ieee matlab titles
2013 14 ieee matlab titles
JPINFOTECH JAYAPRAKASH
 
Trusted db a trusted hardware based database with privacy and data confidenti...
Trusted db a trusted hardware based database with privacy and data confidenti...Trusted db a trusted hardware based database with privacy and data confidenti...
Trusted db a trusted hardware based database with privacy and data confidenti...
JPINFOTECH JAYAPRAKASH
 
Scalable and secure sharing of personal health records in cloud computing usi...
Scalable and secure sharing of personal health records in cloud computing usi...Scalable and secure sharing of personal health records in cloud computing usi...
Scalable and secure sharing of personal health records in cloud computing usi...
JPINFOTECH JAYAPRAKASH
 
Enhanced olsr for defense against dos attack in ad hoc networks
Enhanced olsr for defense against dos attack in ad hoc networksEnhanced olsr for defense against dos attack in ad hoc networks
Enhanced olsr for defense against dos attack in ad hoc networks
JPINFOTECH JAYAPRAKASH
 
M privacy for collaborative data publishing
M privacy for collaborative data publishingM privacy for collaborative data publishing
M privacy for collaborative data publishing
JPINFOTECH JAYAPRAKASH
 
Cam cloud assisted privacy preserving mobile health monitoring
Cam cloud assisted privacy preserving mobile health monitoringCam cloud assisted privacy preserving mobile health monitoring
Cam cloud assisted privacy preserving mobile health monitoring
JPINFOTECH JAYAPRAKASH
 
Intrusion detection technique by using k means, fuzzy neural network and svm ...
Intrusion detection technique by using k means, fuzzy neural network and svm ...Intrusion detection technique by using k means, fuzzy neural network and svm ...
Intrusion detection technique by using k means, fuzzy neural network and svm ...
JPINFOTECH JAYAPRAKASH
 
Delay based network utility maximization
Delay based network utility maximizationDelay based network utility maximization
Delay based network utility maximization
JPINFOTECH JAYAPRAKASH
 
A load balancing model based on cloud partitioning for the public cloud
A load balancing model based on cloud partitioning for the public cloudA load balancing model based on cloud partitioning for the public cloud
A load balancing model based on cloud partitioning for the public cloud
JPINFOTECH JAYAPRAKASH
 
An efficient and robust addressing protocol for node autoconfiguration in ad ...
An efficient and robust addressing protocol for node autoconfiguration in ad ...An efficient and robust addressing protocol for node autoconfiguration in ad ...
An efficient and robust addressing protocol for node autoconfiguration in ad ...
JPINFOTECH JAYAPRAKASH
 
A rank correlation based detection against distributed reflection do s attacks
A rank correlation based detection against distributed reflection do s attacksA rank correlation based detection against distributed reflection do s attacks
A rank correlation based detection against distributed reflection do s attacks
JPINFOTECH JAYAPRAKASH
 
Localization of wireless sensor networks in the wild pursuit of ranging quality
Localization of wireless sensor networks in the wild pursuit of ranging qualityLocalization of wireless sensor networks in the wild pursuit of ranging quality
Localization of wireless sensor networks in the wild pursuit of ranging quality
JPINFOTECH JAYAPRAKASH
 
2013 2014 ieee java project titles
2013 2014 ieee java project titles2013 2014 ieee java project titles
2013 2014 ieee java project titles
JPINFOTECH JAYAPRAKASH
 
Clustering sentence level text using a novel fuzzy relational clustering algo...
Clustering sentence level text using a novel fuzzy relational clustering algo...Clustering sentence level text using a novel fuzzy relational clustering algo...
Clustering sentence level text using a novel fuzzy relational clustering algo...
JPINFOTECH JAYAPRAKASH
 
Research in progress defending android smartphones from malware attacks
Research in progress  defending android smartphones from malware attacksResearch in progress  defending android smartphones from malware attacks
Research in progress defending android smartphones from malware attacks
JPINFOTECH JAYAPRAKASH
 
A probabilistic misbehavior detection scheme towards efficient trust establis...
A probabilistic misbehavior detection scheme towards efficient trust establis...A probabilistic misbehavior detection scheme towards efficient trust establis...
A probabilistic misbehavior detection scheme towards efficient trust establis...
JPINFOTECH JAYAPRAKASH
 
A generalized flow based method for analysis of implicit relationships on wik...
A generalized flow based method for analysis of implicit relationships on wik...A generalized flow based method for analysis of implicit relationships on wik...
A generalized flow based method for analysis of implicit relationships on wik...
JPINFOTECH JAYAPRAKASH
 
Toward accurate mobile sensor network localization in noisy environments
Toward accurate mobile sensor network localization in noisy environmentsToward accurate mobile sensor network localization in noisy environments
Toward accurate mobile sensor network localization in noisy environments
JPINFOTECH JAYAPRAKASH
 

Viewers also liked (19)

Optimal multicast capacity and delay tradeoffs in mane ts
Optimal multicast capacity and delay tradeoffs in mane tsOptimal multicast capacity and delay tradeoffs in mane ts
Optimal multicast capacity and delay tradeoffs in mane ts
 
2013 14 ieee matlab titles
2013 14 ieee matlab titles2013 14 ieee matlab titles
2013 14 ieee matlab titles
 
Trusted db a trusted hardware based database with privacy and data confidenti...
Trusted db a trusted hardware based database with privacy and data confidenti...Trusted db a trusted hardware based database with privacy and data confidenti...
Trusted db a trusted hardware based database with privacy and data confidenti...
 
Scalable and secure sharing of personal health records in cloud computing usi...
Scalable and secure sharing of personal health records in cloud computing usi...Scalable and secure sharing of personal health records in cloud computing usi...
Scalable and secure sharing of personal health records in cloud computing usi...
 
Enhanced olsr for defense against dos attack in ad hoc networks
Enhanced olsr for defense against dos attack in ad hoc networksEnhanced olsr for defense against dos attack in ad hoc networks
Enhanced olsr for defense against dos attack in ad hoc networks
 
M privacy for collaborative data publishing
M privacy for collaborative data publishingM privacy for collaborative data publishing
M privacy for collaborative data publishing
 
Cam cloud assisted privacy preserving mobile health monitoring
Cam cloud assisted privacy preserving mobile health monitoringCam cloud assisted privacy preserving mobile health monitoring
Cam cloud assisted privacy preserving mobile health monitoring
 
Intrusion detection technique by using k means, fuzzy neural network and svm ...
Intrusion detection technique by using k means, fuzzy neural network and svm ...Intrusion detection technique by using k means, fuzzy neural network and svm ...
Intrusion detection technique by using k means, fuzzy neural network and svm ...
 
Delay based network utility maximization
Delay based network utility maximizationDelay based network utility maximization
Delay based network utility maximization
 
A load balancing model based on cloud partitioning for the public cloud
A load balancing model based on cloud partitioning for the public cloudA load balancing model based on cloud partitioning for the public cloud
A load balancing model based on cloud partitioning for the public cloud
 
An efficient and robust addressing protocol for node autoconfiguration in ad ...
An efficient and robust addressing protocol for node autoconfiguration in ad ...An efficient and robust addressing protocol for node autoconfiguration in ad ...
An efficient and robust addressing protocol for node autoconfiguration in ad ...
 
A rank correlation based detection against distributed reflection do s attacks
A rank correlation based detection against distributed reflection do s attacksA rank correlation based detection against distributed reflection do s attacks
A rank correlation based detection against distributed reflection do s attacks
 
Localization of wireless sensor networks in the wild pursuit of ranging quality
Localization of wireless sensor networks in the wild pursuit of ranging qualityLocalization of wireless sensor networks in the wild pursuit of ranging quality
Localization of wireless sensor networks in the wild pursuit of ranging quality
 
2013 2014 ieee java project titles
2013 2014 ieee java project titles2013 2014 ieee java project titles
2013 2014 ieee java project titles
 
Clustering sentence level text using a novel fuzzy relational clustering algo...
Clustering sentence level text using a novel fuzzy relational clustering algo...Clustering sentence level text using a novel fuzzy relational clustering algo...
Clustering sentence level text using a novel fuzzy relational clustering algo...
 
Research in progress defending android smartphones from malware attacks
Research in progress  defending android smartphones from malware attacksResearch in progress  defending android smartphones from malware attacks
Research in progress defending android smartphones from malware attacks
 
A probabilistic misbehavior detection scheme towards efficient trust establis...
A probabilistic misbehavior detection scheme towards efficient trust establis...A probabilistic misbehavior detection scheme towards efficient trust establis...
A probabilistic misbehavior detection scheme towards efficient trust establis...
 
A generalized flow based method for analysis of implicit relationships on wik...
A generalized flow based method for analysis of implicit relationships on wik...A generalized flow based method for analysis of implicit relationships on wik...
A generalized flow based method for analysis of implicit relationships on wik...
 
Toward accurate mobile sensor network localization in noisy environments
Toward accurate mobile sensor network localization in noisy environmentsToward accurate mobile sensor network localization in noisy environments
Toward accurate mobile sensor network localization in noisy environments
 

Similar to Load rebalancing for distributed file systems in clouds

Data-Intensive Technologies for Cloud Computing
Data-Intensive Technologies for CloudComputingData-Intensive Technologies for CloudComputing
Data-Intensive Technologies for Cloud Computing
huda2018
 
Simple Load Rebalancing For Distributed Hash Tables In Cloud
Simple Load Rebalancing For Distributed Hash Tables In CloudSimple Load Rebalancing For Distributed Hash Tables In Cloud
Simple Load Rebalancing For Distributed Hash Tables In Cloud
IOSR Journals
 
13
1313
13
1313
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
IJERD Editor
 
Seminar_Report_hadoop
Seminar_Report_hadoopSeminar_Report_hadoop
Seminar_Report_hadoop
Varun Narang
 
What is Scalability and How can affect on overall system performance of database
What is Scalability and How can affect on overall system performance of databaseWhat is Scalability and How can affect on overall system performance of database
What is Scalability and How can affect on overall system performance of database
Alireza Kamrani
 
Eg4301808811
Eg4301808811Eg4301808811
Eg4301808811
IJERA Editor
 
Document 22.pdf
Document 22.pdfDocument 22.pdf
Document 22.pdf
rahulsahu887608
 
System models for distributed and cloud computing
System models for distributed and cloud computingSystem models for distributed and cloud computing
System models for distributed and cloud computingpurplesea
 
H017144148
H017144148H017144148
H017144148
IOSR Journals
 
Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...
Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...
Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...
IOSR Journals
 
Load balancing in Distributed Systems
Load balancing in Distributed SystemsLoad balancing in Distributed Systems
Load balancing in Distributed Systems
Richa Singh
 
EOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperEOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperDavid Walker
 
C017311316
C017311316C017311316
C017311316
IOSR Journals
 
Data Partitioning in Mongo DB with Cloud
Data Partitioning in Mongo DB with CloudData Partitioning in Mongo DB with Cloud
Data Partitioning in Mongo DB with Cloud
IJAAS Team
 
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET Journal
 
Page a partition aware engine for parallel graph computation
Page a partition aware engine for parallel graph computationPage a partition aware engine for parallel graph computation
Page a partition aware engine for parallel graph computation
CloudTechnologies
 
Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System
cscpconf
 
Survey of Parallel Data Processing in Context with MapReduce
Survey of Parallel Data Processing in Context with MapReduce Survey of Parallel Data Processing in Context with MapReduce
Survey of Parallel Data Processing in Context with MapReduce
cscpconf
 

Similar to Load rebalancing for distributed file systems in clouds (20)

Data-Intensive Technologies for Cloud Computing
Data-Intensive Technologies for CloudComputingData-Intensive Technologies for CloudComputing
Data-Intensive Technologies for Cloud Computing
 
Simple Load Rebalancing For Distributed Hash Tables In Cloud
Simple Load Rebalancing For Distributed Hash Tables In CloudSimple Load Rebalancing For Distributed Hash Tables In Cloud
Simple Load Rebalancing For Distributed Hash Tables In Cloud
 
13
1313
13
 
13
1313
13
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Seminar_Report_hadoop
Seminar_Report_hadoopSeminar_Report_hadoop
Seminar_Report_hadoop
 
What is Scalability and How can affect on overall system performance of database
What is Scalability and How can affect on overall system performance of databaseWhat is Scalability and How can affect on overall system performance of database
What is Scalability and How can affect on overall system performance of database
 
Eg4301808811
Eg4301808811Eg4301808811
Eg4301808811
 
Document 22.pdf
Document 22.pdfDocument 22.pdf
Document 22.pdf
 
System models for distributed and cloud computing
System models for distributed and cloud computingSystem models for distributed and cloud computing
System models for distributed and cloud computing
 
H017144148
H017144148H017144148
H017144148
 
Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...
Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...
Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...
 
Load balancing in Distributed Systems
Load balancing in Distributed SystemsLoad balancing in Distributed Systems
Load balancing in Distributed Systems
 
EOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperEOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - Paper
 
C017311316
C017311316C017311316
C017311316
 
Data Partitioning in Mongo DB with Cloud
Data Partitioning in Mongo DB with CloudData Partitioning in Mongo DB with Cloud
Data Partitioning in Mongo DB with Cloud
 
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
 
Page a partition aware engine for parallel graph computation
Page a partition aware engine for parallel graph computationPage a partition aware engine for parallel graph computation
Page a partition aware engine for parallel graph computation
 
Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System
 
Survey of Parallel Data Processing in Context with MapReduce
Survey of Parallel Data Processing in Context with MapReduce Survey of Parallel Data Processing in Context with MapReduce
Survey of Parallel Data Processing in Context with MapReduce
 

Recently uploaded

The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
Delapenabediema
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Jisc
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
How to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleHow to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS Module
Celine George
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
MIRIAMSALINAS13
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
RaedMohamed3
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
Celine George
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
PedroFerreira53928
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
EugeneSaldivar
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
EverAndrsGuerraGuerr
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Sandy Millin
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
MysoreMuleSoftMeetup
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
BhavyaRajput3
 
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
AzmatAli747758
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
ESC Beyond Borders _From EU to You_ InfoPack general.pdf
ESC Beyond Borders _From EU to You_ InfoPack general.pdfESC Beyond Borders _From EU to You_ InfoPack general.pdf
ESC Beyond Borders _From EU to You_ InfoPack general.pdf
Fundacja Rozwoju Społeczeństwa Przedsiębiorczego
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 

Recently uploaded (20)

The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
How to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleHow to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS Module
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
 
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
 
ESC Beyond Borders _From EU to You_ InfoPack general.pdf
ESC Beyond Borders _From EU to You_ InfoPack general.pdfESC Beyond Borders _From EU to You_ InfoPack general.pdf
ESC Beyond Borders _From EU to You_ InfoPack general.pdf
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 

Load rebalancing for distributed file systems in clouds

  • 1. Load Rebalancing for Distributed File Systems in Clouds ABSTRACT: Distributed file systems are key building blocks for cloud computing applications based on the MapReduce programming paradigm. In such file systems, nodes simultaneously serve computing and storage functions; a file is partitioned into a number of chunks allocated in distinct nodes so that MapReduce tasks can be performed in parallel over the nodes. However, in a cloud computing environment, failure is the norm, and nodes may be upgraded, replaced, and added in the system. Files can also be dynamically created, deleted, and appended. This results in load imbalance in a distributed file system; that is, the file chunks are not distributed as uniformly as possible among the nodes. Emerging distributed file systems in production systems strongly depend on a central node for chunk reallocation. This dependence is clearly inadequate in a large-scale, failure-prone environment because the central load balancer is put under considerable workload that is linearly scaled with the system size, and may thus become the performance bottleneck and the single point of failure. In this paper, a fully distributed load rebalancing algorithm is presented to cope with the load imbalance problem. Our algorithm is compared against a centralized approach in a production system and a competing distributed solution presented in the literature. The simulation results
  • 2. indicate that our proposal is comparable with the existing centralized approach and considerably outperforms the prior distributed algorithm in terms of load imbalance factor, movement cost, and algorithmic overhead. The performance of our proposal implemented in the Hadoop distributed file system is further investigated in a cluster environment. ARCHITECTURE: EXISTING SYSTEM: State-of-the-art distributed file systems (e.g., Google GFS and Hadoop HDFS) in clouds rely on central nodes to manage the metadata information of the file systems and to balance the loads of storage nodes based on that metadata. The centralized approach simplifies the design and implementation of a distributed file
  • 3. system. However, recent experience concludes that when the number of storage nodes, the number of files and the number of accesses to files increase linearly, the central nodes (e.g., the master in Google GFS) become a performance bottleneck, as they are unable to accommodate a large number of file accesses due to clients and MapReduce applications. DISADVANTAGES OF EXISTING SYSTEM: The most existing solutions are designed without considering both movement cost and node heterogeneity and may introduce significant maintenance network traffic to the DHTs. PROPOSED SYSTEM: In this paper, we are interested in studying the load rebalancing problem in distributed file systems specialized for large-scale, dynamic and data- intensive clouds. (The terms “rebalance” and “balance” are interchangeable in this paper.) Such a large-scale cloud has hundreds or thousands of nodes (and may reach tens of thousands in the future). Our objective is to allocate the chunks of files as uniformly as possible among the nodes such that no node manages an excessive number of chunks. Additionally, we aim to reduce network traffic (or movement cost) caused by rebalancing the loads of nodes as much as possible to maximize the
  • 4. network bandwidth available to normal applications. Moreover, as failure is the norm, nodes are newly added to sustain the overall system performance,resulting in the heterogeneity of nodes. Exploiting capable nodes to improve the system performance is, thus, demanded. Our proposal not only takes advantage of physical network locality in the reallocation of file chunks to reduce the movement cost but also exploits capable nodes to improve the overall system performance. ADVANTAGES OF PROPOSED SYSTEM:  This eliminates the dependence on central nodes.  Our proposed algorithm operates in a distributed manner in which nodes perform their load-balancing tasks independently without synchronization or global knowledge regarding the system.  Algorithm reduces algorithmic overhead introduced to the DHTs as much as possible.
  • 5. ALGORITHM USED:  Load Rebalancing Algorithm SYSTEM CONFIGURATION:- HARDWARE CONFIGURATION:-  Processor - Pentium –IV  Speed - 1.1 Ghz  RAM - 256 MB(min)  Hard Disk - 20 GB  Key Board - Standard Windows Keyboard  Mouse - Two or Three Button Mouse  Monitor - SVGA SOFTWARE CONFIGURATION:-  Operating System : Windows XP  Programming Language : JAVA
  • 6.  Java Version : JDK 1.6 & above. REFERENCE: Hung-Chang Hsiao, Member, IEEE Computer Society, Hsueh-Yi Chung, Haiying Shen, Member, IEEE, and Yu-Chang Chao, “Load Rebalancing for Distributed File Systems in Clouds”, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 24, NO. 5, MAY 2013.