SlideShare a Scribd company logo
1 of 4
Download to read offline
International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2763
Issue 01, Volume 3 (January 2016) www.ijirae.com
______________________________________________________________________________________________________
IJIRAE: Impact Factor Value - ISRAJIF: 1.857 | PIF: 2.469 | Jour Info: 4.085 | Index Copernicus 2014 = 6.57
© 2014- 16, IJIRAE- All Rights Reserved Page -47
Dynamically Partitioning Big Data Using Virtual Machine
Mapping
Mrs.D.Saritha Dr.R.Veera Sudarasana Reddy G.Narasimha Reddy
Assistant .Professor/ MCA Principal Senior Manager
Narayana Engineering College, Narayana Engineering College,Gudur CTS Pvt,Ltd,Hydrabed
Andhra Pradesh Andhra Pradesh Andhra Pradesh
Abstract— Big data refers to data that is so large that it exceeds the processing capabilities of traditional systems. Big data
can be awkward to work and the storage, processing and analysis of big data can be problematic. MapReduce is a recent
programming model that can handle big data. MapReduce achieves this by distributing the storage and processing of data
amongst a large number of computers (nodes). However, this means the time required to process a MapReduce job is
dependent on whichever node is last to complete a task. This problem is bad situation by heterogeneous environments. In
this paper a methodologyis properly to improve MapReduce execution in heterogeneous environments. It is carried out
using dynamically partitioning data during the Map phase and by using virtual machine mapping in the Reduce phase in
order to maximize resource utilization.
Keywords—BigData; MapReduce; Hadoop; Virtual Machine; Heterogeneous environment;
INTRODUCTION
MapReduce is a programming model for creating distributed applications that can process big data using a large number of
commodity computers. Originally developed by Google[1,2], MapReduce enjoys wide use by both industry and academia[3] via
Hadoop[4]. The advantages of MapReduce framework is that it allows users to execute analytical tasks over big data without
worrying about the myriad of details inherent in distributed programming[3,5]. However, the efficacy of MapReduce can be
undermined by its implementation. For instance, Hadoop the most popular open source MapReduce framework[5] assumes all the
nodes in the network to be homogenous. Consequently, Hadoop’s performance is not optimal in a heterogeneous environment[6].
In this paper we focus on the Hadoop framework. We look in particular how MapReduce handles map input and reduce task
assignment in a heterogeneous environment. There are many reasons why MapReduce might execute in a heterogeneous
environment. For instance, advances in technology might mean new machines in the network are different to old ones. Alternatively,
MapReduce may be deployed on a hybrid cloud environment, where computing resources tend to be heterogeneous[7]. In summary,
this paper presents the following contributions
A method to improve virtual machine mapping for reducers 
A method to improve reducer selection on a heterogeneous systems 
The rest of this paper is organized as follows. In section 2, we present some background on MapReduce. In section 3, we present our
proposed dynamic data partitioning and virtual machine mapping methods. In section 4, we evaluate our work, present our
experimental results and discuss our findings. Finally, in section 5, we present our conclusion and prospects for future work.
MAPREDUCE
The purpose of MapReduce is to process large amounts of data on clusters of computers. At the heart of MapReduce resides two
distinct programming functions, a map function and a reduce function[8]. It is the responsibility of the programmer to provide these
functions. Two tasks known as the mapper and reducer handle the map and reduce functions respectively. In this paper, the terms
mapper and reducer are used interchangeably with the terms map task and reduce task.
Figure 1. MapReduce data flow
International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2763
Issue 01, Volume 3 (January 2016) www.ijirae.com
______________________________________________________________________________________________________
IJIRAE: Impact Factor Value - ISRAJIF: 1.857 | PIF: 2.469 | Jour Info: 4.085 | Index Copernicus 2014 = 6.57
© 2014- 16, IJIRAE- All Rights Reserved Page -48
The purpose of the map and reduce functions is to handle sets of keys-value pairs. When a user runs a MapReduce program, data
from a file or set of files is split amongst the mappers provided and read as a series of key-value pairs. The mapper then applies the
map function on these key-value pairs. It is the duty of the map function to derive meaning from the input, to manipulate or filter the
data, and to compute a list of key-value pairs. The list of key-value pairs is then partitioned based on the key, typically via a hash
function. During this process, data is stored locally in temporary intermediate file, as shown in Fig 1.
as input a key and a list of values. Once the reduce function finishes computing the data an output file is produced. Each reducer
generates a separate output file. These files can be searched, merged or handled in whatever way the user wants once all reducers
have completed their workload.
PROPOSED TECHNIQUES AND IMPLEMENTATION
The research model for this study is presented in Fig. 2, which shows a network that consists of several physical machines. Each
physical machine (PM) has a limited number of virtual machines (VM). Without losing generality, virtual machines are used as a
basic unit with which to execute a task. The virtual machine may run a map task or a reduce task. Due to the heterogenerous nature of
the environment, the processing capabilities of any particular virtual machine may differ from other virtual machines in the
environment.
Figure 3. Dataflow of MapReduce model with the proposed dynamic data partitioner and a virtual machine mapper. Six Map
tasks and three Reduce tasks run on virtual machines with differing processing capabilities.
A. Dynamic Data Partitioning
In Hadoop, a MapReduce job begins by first reading a large input file. This file is usually stored on the Hadoop Distributed File
System (HDFS). Since Hadoop assumes the environment is homogenous, the data from this file is split into fixed sized pieces.
Hadoop then creates a mapper for each split. In a homogenous cluster each node has the same processing power and capabilities. In
this case, each mapper will finish processing its split at approximately the same time. In a heterogeneous network, nodes that process
faster than others will complete their work earlier.
Since data access rates between nodes on the HDFS are inconsistent due to issues of data locality, we propose a dynamic data
partitioner that partitions data on a node irrespective of other nodes on the network. An example of the dynamic data partitioner is
shown in Fig. 3. In this example, a 600GB file is used as input data. In this scenario, the data is divided up into six equal sized pieces,
and sent to six virtual machines. Each of these virtual machines then executes a map task. Each virtual machine is given a value n that
indicates the relative processing ability of that virtual machines VPU. This is based on the number of virtual processing units (VPU)
of that virtual machine, and the physical machine it is running on. For instance, the virtual machine VM1 has an n value of 10 and the
virtual machine VM2 has an n value of 2. This means that VM1 is able to process data 5 times faster than VM2. The processing speed
of each virtual machine is calculated prior to execution using a profiling tool.
International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2763
Issue 01, Volume 3 (January 2016) www.ijirae.com
______________________________________________________________________________________________________
IJIRAE: Impact Factor Value - ISRAJIF: 1.857 | PIF: 2.469 | Jour Info: 4.085 | Index Copernicus 2014 = 6.57
© 2014- 16, IJIRAE- All Rights Reserved Page -49
Figure 3. Map/Reduce using worksets
As previously mentioned, the proportion of data to be reassigned amongst virtual machines is determined by the processing ability of
all the virtual machines running on the same physical machine. The following algorithm calculated. The machine the DDP
repartitions the data on each physical machine. On PM1 there is three virtual machines VM1, VM2 and VM3. VM1 has a VM
processing rate of 10, VM2 has a VM processing rate of 2 and VM3 has a processing rate of 8. Each virtual machine has an initial
split size of 100GB. Consequently, the total data size of the three virtual machines is 300GB, and the total speed of the three virtual
machines is 20 units. The input split is then divided into fragments. The size of fragment is calculated using the following equation:
Table I- MAP VS. MATCH VS. PREGEL BASED
TYPE FILE TIME [M]
ca.CondMat.txt Match/Reduce 0.23
ca.CondMat.txt Pregel 0.23
ca.CondMat.txt Map/Reduce 0.25
com.youtube.ungraph.txt Pregel 9.98
com.youtube.ungraph.txt Match/Reduce 16.78
com.youtube.ungraph.txt Map/Reduce 34.40
web.BerkStan.txt Pregel 14.12
web.BerkStan.txt Match/Reduce 24.70
web.BerkStan.txt Map/Reduce 298.35
A closer analysis of Figure 3 reveals that data sets with less than a hundred thousand nodes play in the hands of tratosphere. Its
lack of a wiring phase, as in Hama, where nodes have to be connected to their neighbors, and the lack of Hadoop’s serialization
makes it the fastest of the Java-based frameworks for smaller data sets
B. Virtual Machine Mapping
In Hadoop, a master node determines where mappers or reducers reside on a network. When assigning a mapper to a node, it is
important that it is located on or near the data it will access. This is because transferring data from one node to another takes time and
delays task execution. The problem of determining where to place a task on the network so that it is close to the data it uses is known
as data locality. Data locality is a key concept in MapReduce and has a major impact on its performance. Even though Hadoop uses
this concept when determining where to execute its mappers, it does not exploit this concept for its reducers and locate reducers near
these partitions. We therefore propose for this purpose a virtual machine mapper (VMM) which allocates the reducer to the
appropriate physical machine based on partition size and the availability of a virtual machine.
Figure 4. Virtual Machine Mapper
Fig. 4 shows a simple example of the VMM ascribing reduce tasks to physical machines. In this example, there are six mappers on
two physical machines, with three mappers per physical machine. Since this job requests three reduce tasks, each mapper creates
three partitions. The total amount of data to be received by each reducer is then deduced by summing up the respective partition at
each mapper. Based on the concept of data locality, reducers are assigned locations on physical machines based on where the data for
each reducer is stored. On a physical machine there are multiple virtual machines. Once a reducer is assigned to a physical machine
the reducer is assigned whichever virtual machine has the fastest processing speed. The algorithm for the virtual machine mapper is
presented as follows.
In this example, all of the reducers are allocated to a virtual machine on an appropriate physical machine. If there are no virtual
machines available, the reducer is allocated to another physical machine. This is first done using the best fit selection methodwhich
eases recovery, but hampers performance [5]. However, with the relative youth of its successors, it is cur-rently the only framework
that can actually recover from faults. Both distributed GraphLab and Stratosphere detect lost nodes quickly, however cannot recover
International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2763
Issue 01, Volume 3 (January 2016) www.ijirae.com
______________________________________________________________________________________________________
IJIRAE: Impact Factor Value - ISRAJIF: 1.857 | PIF: 2.469 | Jour Info: 4.085 | Index Copernicus 2014 = 6.57
© 2014- 16, IJIRAE- All Rights Reserved Page -50
from errors. Both display an error code on the console.
Only Stratosphere has a built-in recovery mechanism, which is at the time of writing unable to handle iterations properly. In Fig. 3
there are two physical machines. Both physical machines have three virtual machines each running a mapper. Each mapper produces
three data partitions, which are assigned to three reducers. Using Algorithm 2, reducer 1 is assigned to physical machine 1, virtual
machine 1. This is because physical machine 1 stores the largest fraction of the data designated for that reducer. Similarly, reducer 2
and reducer 3 are assigned to a virtual machine on physical machine 2. On physical machine 1 there are three viable virtual machines
(VM1, VM2, VM3). Reducer 1 is assigned to VM1 as it has the fastest processing speed. On physical machine 2 there are also three
viable virtual machines (VM4, VM5, VM6) each with different processing speeds. Since reducer 3 has more data stored on physical
machine 2, it is assigned to the fastest virtual machine (VM6). Consequently, reducer 2 is assigned to the second fastest virtual
machine (VM4). In this example, all of the reducers are allocated to a virtual machine on an appropriate physical machine. If there are
no virtual machines available, the reducer is allocated to another physical machine. This is first done using the best fit selection
method. If a physical machine has more reducers requesting a virtual machine than there are available virtual machines, the VMM has
to locate the reducer to another physical machine. Instead of rejecting reducers based on a first come first served (FCFS), the VMM
assesses the size of the data contributed to the reducer by each physical machine. The VMM then gives priority to those reducers with
the greatest difference between their largest data contribution and their second largest data contribution.both scenarios using the
analyzed vertex centric frameworks. The results for the Map/Reduce frameworks did not match the speed of their counterparts and
have been left out. The first thing to notice is that GraphLab apparently comes from a multicore background and is still optimized for
that. In that scenario, Hama performs worse than GraphLab by at least 20% for every tested data set. The gain shows up especially
using large data sets in the multicore scenario, which is an important benefit.
In the distributed scenario, the gain is negligible for a huge data set like Wiki-talk, and even reversed for the Berkeley data. As
previously mentioned, dense data set seems to exploit a shortcoming in the implementation of distributed GraphLab. For a complete
analysis of GraphLab on a multicore machine see Since data access rates between nodes on the HDFS are inconsistent due to issues
of data locality, we propose a dynamic data partitioner that partitions data on a node irrespective of other nodes on the network. An
example of the dynamic data both scenarios using the analyzed vertex centric frameworks. The results for the Map/Reduce
frameworks did not match the speed of their counterparts and have been left out. The first thing to notice is that GraphLab apparently
comes from a multicore background and is still optimized for that. In that scenario, Hama performs worse than GraphLab by at least
20% for every tested data set. The gain shows up especially using large data sets in the multicore scenario, which is an important
benefit.
CONCLUSIONS
This paper is based on MapReduce and the Hadoop framework. Its purpose is to improve the performance of MapReduce distributed
application when executing in a heterogeneous environment. By dynamically partitioning input data being read by map tasks and by
judicious assignation of reduce tasks based on data locality using a Virtual Machine Mapper. Simulation and experimental results
show an improvement in MapReduce performance, improving map task completion time by up to 44% and reduce task completion
time by up to 29%.In future research, this work can be expanded todynamically determine the number of reducers deployed onthe
MapReduceenvironment. This is an important topic,which analyzes the cost-benefits of increasing the number of reducers, and
compares whether the impact on performance.
REFERENCES
[1]. J. Alvarez-Hamelin, L. Dall Asta, A. Barrat, and A. Vespignani. Large scale networks fingerprinting and visualization using the
k-core decom-position. Advances in neural information processing systems, 18:41, 2006.
[2]. G. D. Bader and C. W. Hogue. Analyzing yeast protein-protein interaction data obtained from different sources. Nature
biotechnology, 20(10):991–997, Oct. 2002.
[3]. K.-H. Lee, Y.-J. Lee, H. Choi, Y. D. Chung, and B. Moon, "Parallel data processing with MapReduce a survey," ACM
SIGMOD Record, vol. 40, 2012, pp.11-20.
[4]. J. Cohen. Graph twiddling in a mapreduce world. Computing in Science Engineering, 11(4):29 –41, July–Aug. 2009
[5]. J. Dittrich and J.-A. Quiané-Ruiz, "Efficient big data processing in Hadoop MapReduce," Proceedings of the VLDB
Endowment, vol. 5, 2012, pp. 2014-2015.
[6]. J. Xie, S. Yin, X. Ruan, Z. Ding, Y. Tian, J. Majors, A. Manzanares,and X. Qin, "Improving mapreduce performance through
dataplacement inheterogeneous adoop clusters," in Parallel &Distributed Processing, Workshops and Phd Forum (IPDPSW),
2010IEEE International Symposium on, 2010, pp. 1-9.

More Related Content

What's hot

Survey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in CloudSurvey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in CloudAM Publications
 
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...IRJET Journal
 
Review: Data Driven Traffic Flow Forecasting using MapReduce in Distributed M...
Review: Data Driven Traffic Flow Forecasting using MapReduce in Distributed M...Review: Data Driven Traffic Flow Forecasting using MapReduce in Distributed M...
Review: Data Driven Traffic Flow Forecasting using MapReduce in Distributed M...AM Publications
 
Effective Sparse Matrix Representation for the GPU Architectures
 Effective Sparse Matrix Representation for the GPU Architectures Effective Sparse Matrix Representation for the GPU Architectures
Effective Sparse Matrix Representation for the GPU ArchitecturesIJCSEA Journal
 
Survey on load balancing and data skew mitigation in mapreduce applications
Survey on load balancing and data skew mitigation in mapreduce applicationsSurvey on load balancing and data skew mitigation in mapreduce applications
Survey on load balancing and data skew mitigation in mapreduce applicationsIAEME Publication
 
Improving performance of apriori algorithm using hadoop
Improving performance of apriori algorithm using hadoopImproving performance of apriori algorithm using hadoop
Improving performance of apriori algorithm using hadoopeSAT Journals
 
benchmarks-sigmod09
benchmarks-sigmod09benchmarks-sigmod09
benchmarks-sigmod09Hiroshi Ono
 
Geo distributed parallelization pacts in map reduce
Geo distributed parallelization pacts in map reduceGeo distributed parallelization pacts in map reduce
Geo distributed parallelization pacts in map reduceeSAT Publishing House
 
JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011Satya Ramachandran
 
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...IOSR Journals
 
Twitter Analysis of Road Traffic Congestion Severity Estimation
Twitter Analysis of Road Traffic Congestion Severity EstimationTwitter Analysis of Road Traffic Congestion Severity Estimation
Twitter Analysis of Road Traffic Congestion Severity EstimationGaurav Singh
 
Effect of countries in performance of hadoop.
Effect of countries in performance of hadoop.Effect of countries in performance of hadoop.
Effect of countries in performance of hadoop.Computer Science Journals
 
An enhanced adaptive scoring job scheduling algorithm with replication strate...
An enhanced adaptive scoring job scheduling algorithm with replication strate...An enhanced adaptive scoring job scheduling algorithm with replication strate...
An enhanced adaptive scoring job scheduling algorithm with replication strate...eSAT Publishing House
 
MULTIDIMENSIONAL ANALYSIS FOR QOS IN WIRELESS SENSOR NETWORKS
MULTIDIMENSIONAL ANALYSIS FOR QOS IN WIRELESS SENSOR NETWORKSMULTIDIMENSIONAL ANALYSIS FOR QOS IN WIRELESS SENSOR NETWORKS
MULTIDIMENSIONAL ANALYSIS FOR QOS IN WIRELESS SENSOR NETWORKSijcses
 
Operating Task Redistribution in Hyperconverged Networks
Operating Task Redistribution in Hyperconverged Networks Operating Task Redistribution in Hyperconverged Networks
Operating Task Redistribution in Hyperconverged Networks IJECEIAES
 

What's hot (20)

Survey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in CloudSurvey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in Cloud
 
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
IRJET - Evaluating and Comparing the Two Variation with Current Scheduling Al...
 
Review: Data Driven Traffic Flow Forecasting using MapReduce in Distributed M...
Review: Data Driven Traffic Flow Forecasting using MapReduce in Distributed M...Review: Data Driven Traffic Flow Forecasting using MapReduce in Distributed M...
Review: Data Driven Traffic Flow Forecasting using MapReduce in Distributed M...
 
Effective Sparse Matrix Representation for the GPU Architectures
 Effective Sparse Matrix Representation for the GPU Architectures Effective Sparse Matrix Representation for the GPU Architectures
Effective Sparse Matrix Representation for the GPU Architectures
 
Survey on load balancing and data skew mitigation in mapreduce applications
Survey on load balancing and data skew mitigation in mapreduce applicationsSurvey on load balancing and data skew mitigation in mapreduce applications
Survey on load balancing and data skew mitigation in mapreduce applications
 
J41046368
J41046368J41046368
J41046368
 
Improving performance of apriori algorithm using hadoop
Improving performance of apriori algorithm using hadoopImproving performance of apriori algorithm using hadoop
Improving performance of apriori algorithm using hadoop
 
benchmarks-sigmod09
benchmarks-sigmod09benchmarks-sigmod09
benchmarks-sigmod09
 
Geo distributed parallelization pacts in map reduce
Geo distributed parallelization pacts in map reduceGeo distributed parallelization pacts in map reduce
Geo distributed parallelization pacts in map reduce
 
Data Dimensional Reduction by Order Prediction in Heterogeneous Environment
Data Dimensional Reduction by Order Prediction in Heterogeneous EnvironmentData Dimensional Reduction by Order Prediction in Heterogeneous Environment
Data Dimensional Reduction by Order Prediction in Heterogeneous Environment
 
IJET-V3I2P24
IJET-V3I2P24IJET-V3I2P24
IJET-V3I2P24
 
IJET-V3I1P27
IJET-V3I1P27IJET-V3I1P27
IJET-V3I1P27
 
JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011
 
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
 
Twitter Analysis of Road Traffic Congestion Severity Estimation
Twitter Analysis of Road Traffic Congestion Severity EstimationTwitter Analysis of Road Traffic Congestion Severity Estimation
Twitter Analysis of Road Traffic Congestion Severity Estimation
 
Effect of countries in performance of hadoop.
Effect of countries in performance of hadoop.Effect of countries in performance of hadoop.
Effect of countries in performance of hadoop.
 
An enhanced adaptive scoring job scheduling algorithm with replication strate...
An enhanced adaptive scoring job scheduling algorithm with replication strate...An enhanced adaptive scoring job scheduling algorithm with replication strate...
An enhanced adaptive scoring job scheduling algorithm with replication strate...
 
Hadoop Mapreduce
Hadoop MapreduceHadoop Mapreduce
Hadoop Mapreduce
 
MULTIDIMENSIONAL ANALYSIS FOR QOS IN WIRELESS SENSOR NETWORKS
MULTIDIMENSIONAL ANALYSIS FOR QOS IN WIRELESS SENSOR NETWORKSMULTIDIMENSIONAL ANALYSIS FOR QOS IN WIRELESS SENSOR NETWORKS
MULTIDIMENSIONAL ANALYSIS FOR QOS IN WIRELESS SENSOR NETWORKS
 
Operating Task Redistribution in Hyperconverged Networks
Operating Task Redistribution in Hyperconverged Networks Operating Task Redistribution in Hyperconverged Networks
Operating Task Redistribution in Hyperconverged Networks
 

Viewers also liked

Verification of Computer Aided Engineering (CAE) in Optimization of Chassis f...
Verification of Computer Aided Engineering (CAE) in Optimization of Chassis f...Verification of Computer Aided Engineering (CAE) in Optimization of Chassis f...
Verification of Computer Aided Engineering (CAE) in Optimization of Chassis f...AM Publications
 
Performance Analysis of OFDM system using LS, MMSE and Less Complex MMSE in t...
Performance Analysis of OFDM system using LS, MMSE and Less Complex MMSE in t...Performance Analysis of OFDM system using LS, MMSE and Less Complex MMSE in t...
Performance Analysis of OFDM system using LS, MMSE and Less Complex MMSE in t...AM Publications
 
Simulation of Suction & Compression Process with Delayed Entry Technique Usin...
Simulation of Suction & Compression Process with Delayed Entry Technique Usin...Simulation of Suction & Compression Process with Delayed Entry Technique Usin...
Simulation of Suction & Compression Process with Delayed Entry Technique Usin...AM Publications
 
Collage_Multiple Pages_Campbell
Collage_Multiple Pages_CampbellCollage_Multiple Pages_Campbell
Collage_Multiple Pages_CampbellJanet Campbell
 
OFDM: Modulation Technique for Wireless Communication
OFDM: Modulation Technique for Wireless CommunicationOFDM: Modulation Technique for Wireless Communication
OFDM: Modulation Technique for Wireless CommunicationAM Publications
 

Viewers also liked (10)

Verification of Computer Aided Engineering (CAE) in Optimization of Chassis f...
Verification of Computer Aided Engineering (CAE) in Optimization of Chassis f...Verification of Computer Aided Engineering (CAE) in Optimization of Chassis f...
Verification of Computer Aided Engineering (CAE) in Optimization of Chassis f...
 
Performance Analysis of OFDM system using LS, MMSE and Less Complex MMSE in t...
Performance Analysis of OFDM system using LS, MMSE and Less Complex MMSE in t...Performance Analysis of OFDM system using LS, MMSE and Less Complex MMSE in t...
Performance Analysis of OFDM system using LS, MMSE and Less Complex MMSE in t...
 
Голосовое меню (IVR)
Голосовое меню (IVR) Голосовое меню (IVR)
Голосовое меню (IVR)
 
Simulation of Suction & Compression Process with Delayed Entry Technique Usin...
Simulation of Suction & Compression Process with Delayed Entry Technique Usin...Simulation of Suction & Compression Process with Delayed Entry Technique Usin...
Simulation of Suction & Compression Process with Delayed Entry Technique Usin...
 
Primum project presentation
Primum project presentationPrimum project presentation
Primum project presentation
 
Collage_Multiple Pages_Campbell
Collage_Multiple Pages_CampbellCollage_Multiple Pages_Campbell
Collage_Multiple Pages_Campbell
 
Big data Paper
Big data PaperBig data Paper
Big data Paper
 
OFDM: Modulation Technique for Wireless Communication
OFDM: Modulation Technique for Wireless CommunicationOFDM: Modulation Technique for Wireless Communication
OFDM: Modulation Technique for Wireless Communication
 
IP-телефония: варианты организации связи внутри компании
IP-телефония: варианты организации связи внутри компанииIP-телефония: варианты организации связи внутри компании
IP-телефония: варианты организации связи внутри компании
 
Mustafa cv 2016
Mustafa cv 2016Mustafa cv 2016
Mustafa cv 2016
 

Similar to Dynamically Partitioning Big Data Using Virtual Machine Mapping

Implementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big dataImplementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big dataeSAT Publishing House
 
Managing Big data using Hadoop Map Reduce in Telecom Domain
Managing Big data using Hadoop Map Reduce in Telecom DomainManaging Big data using Hadoop Map Reduce in Telecom Domain
Managing Big data using Hadoop Map Reduce in Telecom DomainAM Publications
 
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...AM Publications
 
Distributed Feature Selection for Efficient Economic Big Data Analysis
Distributed Feature Selection for Efficient Economic Big Data AnalysisDistributed Feature Selection for Efficient Economic Big Data Analysis
Distributed Feature Selection for Efficient Economic Big Data AnalysisIRJET Journal
 
Mapreduce2008 cacm
Mapreduce2008 cacmMapreduce2008 cacm
Mapreduce2008 cacmlmphuong06
 
Improving the Performance of Mapping based on Availability- Alert Algorithm U...
Improving the Performance of Mapping based on Availability- Alert Algorithm U...Improving the Performance of Mapping based on Availability- Alert Algorithm U...
Improving the Performance of Mapping based on Availability- Alert Algorithm U...AM Publications
 
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTLARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTijwscjournal
 
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...ijdpsjournal
 
High Dimensionality Structures Selection for Efficient Economic Big data usin...
High Dimensionality Structures Selection for Efficient Economic Big data usin...High Dimensionality Structures Selection for Efficient Economic Big data usin...
High Dimensionality Structures Selection for Efficient Economic Big data usin...IRJET Journal
 
Accelerated seam carving using cuda
Accelerated seam carving using cudaAccelerated seam carving using cuda
Accelerated seam carving using cudaeSAT Journals
 
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...csandit
 
IRJET- Load Balancing and Crash Management in IoT Environment
IRJET-  	  Load Balancing and Crash Management in IoT EnvironmentIRJET-  	  Load Balancing and Crash Management in IoT Environment
IRJET- Load Balancing and Crash Management in IoT EnvironmentIRJET Journal
 
IRJET- A Review on K-Means++ Clustering Algorithm and Cloud Computing wit...
IRJET-  	  A Review on K-Means++ Clustering Algorithm and Cloud Computing wit...IRJET-  	  A Review on K-Means++ Clustering Algorithm and Cloud Computing wit...
IRJET- A Review on K-Means++ Clustering Algorithm and Cloud Computing wit...IRJET Journal
 
Fast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisFast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisIRJET Journal
 
Performance comparison of row per slave and rows set per slave method in pvm ...
Performance comparison of row per slave and rows set per slave method in pvm ...Performance comparison of row per slave and rows set per slave method in pvm ...
Performance comparison of row per slave and rows set per slave method in pvm ...eSAT Journals
 

Similar to Dynamically Partitioning Big Data Using Virtual Machine Mapping (20)

Implementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big dataImplementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big data
 
Managing Big data using Hadoop Map Reduce in Telecom Domain
Managing Big data using Hadoop Map Reduce in Telecom DomainManaging Big data using Hadoop Map Reduce in Telecom Domain
Managing Big data using Hadoop Map Reduce in Telecom Domain
 
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
 
Ie3514301434
Ie3514301434Ie3514301434
Ie3514301434
 
Distributed Feature Selection for Efficient Economic Big Data Analysis
Distributed Feature Selection for Efficient Economic Big Data AnalysisDistributed Feature Selection for Efficient Economic Big Data Analysis
Distributed Feature Selection for Efficient Economic Big Data Analysis
 
Mapreduce2008 cacm
Mapreduce2008 cacmMapreduce2008 cacm
Mapreduce2008 cacm
 
Improving the Performance of Mapping based on Availability- Alert Algorithm U...
Improving the Performance of Mapping based on Availability- Alert Algorithm U...Improving the Performance of Mapping based on Availability- Alert Algorithm U...
Improving the Performance of Mapping based on Availability- Alert Algorithm U...
 
[IJET V2I5P18] Authors:Pooja Mangla, Dr. Sandip Kumar Goyal
[IJET V2I5P18] Authors:Pooja Mangla, Dr. Sandip Kumar Goyal[IJET V2I5P18] Authors:Pooja Mangla, Dr. Sandip Kumar Goyal
[IJET V2I5P18] Authors:Pooja Mangla, Dr. Sandip Kumar Goyal
 
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTLARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
 
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
TASK-DECOMPOSITION BASED ANOMALY DETECTION OF MASSIVE AND HIGH-VOLATILITY SES...
 
High Dimensionality Structures Selection for Efficient Economic Big data usin...
High Dimensionality Structures Selection for Efficient Economic Big data usin...High Dimensionality Structures Selection for Efficient Economic Big data usin...
High Dimensionality Structures Selection for Efficient Economic Big data usin...
 
Accelerated seam carving using cuda
Accelerated seam carving using cudaAccelerated seam carving using cuda
Accelerated seam carving using cuda
 
Accelerated seam carving using cuda
Accelerated seam carving using cudaAccelerated seam carving using cuda
Accelerated seam carving using cuda
 
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin...
 
IRJET- Load Balancing and Crash Management in IoT Environment
IRJET-  	  Load Balancing and Crash Management in IoT EnvironmentIRJET-  	  Load Balancing and Crash Management in IoT Environment
IRJET- Load Balancing and Crash Management in IoT Environment
 
E031201032036
E031201032036E031201032036
E031201032036
 
IRJET- A Review on K-Means++ Clustering Algorithm and Cloud Computing wit...
IRJET-  	  A Review on K-Means++ Clustering Algorithm and Cloud Computing wit...IRJET-  	  A Review on K-Means++ Clustering Algorithm and Cloud Computing wit...
IRJET- A Review on K-Means++ Clustering Algorithm and Cloud Computing wit...
 
GRID COMPUTING
GRID COMPUTINGGRID COMPUTING
GRID COMPUTING
 
Fast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisFast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data Analysis
 
Performance comparison of row per slave and rows set per slave method in pvm ...
Performance comparison of row per slave and rows set per slave method in pvm ...Performance comparison of row per slave and rows set per slave method in pvm ...
Performance comparison of row per slave and rows set per slave method in pvm ...
 

More from AM Publications

DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...AM Publications
 
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...AM Publications
 
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGNTHE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGNAM Publications
 
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...AM Publications
 
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...AM Publications
 
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISESANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISESAM Publications
 
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS AM Publications
 
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...AM Publications
 
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITIONHMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITIONAM Publications
 
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...AM Publications
 
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...AM Publications
 
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...AM Publications
 
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...AM Publications
 
OPTICAL CHARACTER RECOGNITION USING RBFNN
OPTICAL CHARACTER RECOGNITION USING RBFNNOPTICAL CHARACTER RECOGNITION USING RBFNN
OPTICAL CHARACTER RECOGNITION USING RBFNNAM Publications
 
DETECTION OF MOVING OBJECT
DETECTION OF MOVING OBJECTDETECTION OF MOVING OBJECT
DETECTION OF MOVING OBJECTAM Publications
 
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENTSIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENTAM Publications
 
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...AM Publications
 
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY AM Publications
 
DATA MINING WITH CLUSTERING ON BIG DATA FOR SHOPPING MALL’S DATASET
DATA MINING WITH CLUSTERING ON BIG DATA FOR SHOPPING MALL’S DATASETDATA MINING WITH CLUSTERING ON BIG DATA FOR SHOPPING MALL’S DATASET
DATA MINING WITH CLUSTERING ON BIG DATA FOR SHOPPING MALL’S DATASETAM Publications
 

More from AM Publications (20)

DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
 
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
 
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGNTHE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
 
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
 
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
 
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISESANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
 
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
 
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
 
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITIONHMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
 
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
 
INTELLIGENT BLIND STICK
INTELLIGENT BLIND STICKINTELLIGENT BLIND STICK
INTELLIGENT BLIND STICK
 
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
 
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
 
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
 
OPTICAL CHARACTER RECOGNITION USING RBFNN
OPTICAL CHARACTER RECOGNITION USING RBFNNOPTICAL CHARACTER RECOGNITION USING RBFNN
OPTICAL CHARACTER RECOGNITION USING RBFNN
 
DETECTION OF MOVING OBJECT
DETECTION OF MOVING OBJECTDETECTION OF MOVING OBJECT
DETECTION OF MOVING OBJECT
 
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENTSIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
 
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
 
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
 
DATA MINING WITH CLUSTERING ON BIG DATA FOR SHOPPING MALL’S DATASET
DATA MINING WITH CLUSTERING ON BIG DATA FOR SHOPPING MALL’S DATASETDATA MINING WITH CLUSTERING ON BIG DATA FOR SHOPPING MALL’S DATASET
DATA MINING WITH CLUSTERING ON BIG DATA FOR SHOPPING MALL’S DATASET
 

Recently uploaded

(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and usesDevarapalliHaritha
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 

Recently uploaded (20)

(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 

Dynamically Partitioning Big Data Using Virtual Machine Mapping

  • 1. International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2763 Issue 01, Volume 3 (January 2016) www.ijirae.com ______________________________________________________________________________________________________ IJIRAE: Impact Factor Value - ISRAJIF: 1.857 | PIF: 2.469 | Jour Info: 4.085 | Index Copernicus 2014 = 6.57 © 2014- 16, IJIRAE- All Rights Reserved Page -47 Dynamically Partitioning Big Data Using Virtual Machine Mapping Mrs.D.Saritha Dr.R.Veera Sudarasana Reddy G.Narasimha Reddy Assistant .Professor/ MCA Principal Senior Manager Narayana Engineering College, Narayana Engineering College,Gudur CTS Pvt,Ltd,Hydrabed Andhra Pradesh Andhra Pradesh Andhra Pradesh Abstract— Big data refers to data that is so large that it exceeds the processing capabilities of traditional systems. Big data can be awkward to work and the storage, processing and analysis of big data can be problematic. MapReduce is a recent programming model that can handle big data. MapReduce achieves this by distributing the storage and processing of data amongst a large number of computers (nodes). However, this means the time required to process a MapReduce job is dependent on whichever node is last to complete a task. This problem is bad situation by heterogeneous environments. In this paper a methodologyis properly to improve MapReduce execution in heterogeneous environments. It is carried out using dynamically partitioning data during the Map phase and by using virtual machine mapping in the Reduce phase in order to maximize resource utilization. Keywords—BigData; MapReduce; Hadoop; Virtual Machine; Heterogeneous environment; INTRODUCTION MapReduce is a programming model for creating distributed applications that can process big data using a large number of commodity computers. Originally developed by Google[1,2], MapReduce enjoys wide use by both industry and academia[3] via Hadoop[4]. The advantages of MapReduce framework is that it allows users to execute analytical tasks over big data without worrying about the myriad of details inherent in distributed programming[3,5]. However, the efficacy of MapReduce can be undermined by its implementation. For instance, Hadoop the most popular open source MapReduce framework[5] assumes all the nodes in the network to be homogenous. Consequently, Hadoop’s performance is not optimal in a heterogeneous environment[6]. In this paper we focus on the Hadoop framework. We look in particular how MapReduce handles map input and reduce task assignment in a heterogeneous environment. There are many reasons why MapReduce might execute in a heterogeneous environment. For instance, advances in technology might mean new machines in the network are different to old ones. Alternatively, MapReduce may be deployed on a hybrid cloud environment, where computing resources tend to be heterogeneous[7]. In summary, this paper presents the following contributions A method to improve virtual machine mapping for reducers  A method to improve reducer selection on a heterogeneous systems  The rest of this paper is organized as follows. In section 2, we present some background on MapReduce. In section 3, we present our proposed dynamic data partitioning and virtual machine mapping methods. In section 4, we evaluate our work, present our experimental results and discuss our findings. Finally, in section 5, we present our conclusion and prospects for future work. MAPREDUCE The purpose of MapReduce is to process large amounts of data on clusters of computers. At the heart of MapReduce resides two distinct programming functions, a map function and a reduce function[8]. It is the responsibility of the programmer to provide these functions. Two tasks known as the mapper and reducer handle the map and reduce functions respectively. In this paper, the terms mapper and reducer are used interchangeably with the terms map task and reduce task. Figure 1. MapReduce data flow
  • 2. International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2763 Issue 01, Volume 3 (January 2016) www.ijirae.com ______________________________________________________________________________________________________ IJIRAE: Impact Factor Value - ISRAJIF: 1.857 | PIF: 2.469 | Jour Info: 4.085 | Index Copernicus 2014 = 6.57 © 2014- 16, IJIRAE- All Rights Reserved Page -48 The purpose of the map and reduce functions is to handle sets of keys-value pairs. When a user runs a MapReduce program, data from a file or set of files is split amongst the mappers provided and read as a series of key-value pairs. The mapper then applies the map function on these key-value pairs. It is the duty of the map function to derive meaning from the input, to manipulate or filter the data, and to compute a list of key-value pairs. The list of key-value pairs is then partitioned based on the key, typically via a hash function. During this process, data is stored locally in temporary intermediate file, as shown in Fig 1. as input a key and a list of values. Once the reduce function finishes computing the data an output file is produced. Each reducer generates a separate output file. These files can be searched, merged or handled in whatever way the user wants once all reducers have completed their workload. PROPOSED TECHNIQUES AND IMPLEMENTATION The research model for this study is presented in Fig. 2, which shows a network that consists of several physical machines. Each physical machine (PM) has a limited number of virtual machines (VM). Without losing generality, virtual machines are used as a basic unit with which to execute a task. The virtual machine may run a map task or a reduce task. Due to the heterogenerous nature of the environment, the processing capabilities of any particular virtual machine may differ from other virtual machines in the environment. Figure 3. Dataflow of MapReduce model with the proposed dynamic data partitioner and a virtual machine mapper. Six Map tasks and three Reduce tasks run on virtual machines with differing processing capabilities. A. Dynamic Data Partitioning In Hadoop, a MapReduce job begins by first reading a large input file. This file is usually stored on the Hadoop Distributed File System (HDFS). Since Hadoop assumes the environment is homogenous, the data from this file is split into fixed sized pieces. Hadoop then creates a mapper for each split. In a homogenous cluster each node has the same processing power and capabilities. In this case, each mapper will finish processing its split at approximately the same time. In a heterogeneous network, nodes that process faster than others will complete their work earlier. Since data access rates between nodes on the HDFS are inconsistent due to issues of data locality, we propose a dynamic data partitioner that partitions data on a node irrespective of other nodes on the network. An example of the dynamic data partitioner is shown in Fig. 3. In this example, a 600GB file is used as input data. In this scenario, the data is divided up into six equal sized pieces, and sent to six virtual machines. Each of these virtual machines then executes a map task. Each virtual machine is given a value n that indicates the relative processing ability of that virtual machines VPU. This is based on the number of virtual processing units (VPU) of that virtual machine, and the physical machine it is running on. For instance, the virtual machine VM1 has an n value of 10 and the virtual machine VM2 has an n value of 2. This means that VM1 is able to process data 5 times faster than VM2. The processing speed of each virtual machine is calculated prior to execution using a profiling tool.
  • 3. International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2763 Issue 01, Volume 3 (January 2016) www.ijirae.com ______________________________________________________________________________________________________ IJIRAE: Impact Factor Value - ISRAJIF: 1.857 | PIF: 2.469 | Jour Info: 4.085 | Index Copernicus 2014 = 6.57 © 2014- 16, IJIRAE- All Rights Reserved Page -49 Figure 3. Map/Reduce using worksets As previously mentioned, the proportion of data to be reassigned amongst virtual machines is determined by the processing ability of all the virtual machines running on the same physical machine. The following algorithm calculated. The machine the DDP repartitions the data on each physical machine. On PM1 there is three virtual machines VM1, VM2 and VM3. VM1 has a VM processing rate of 10, VM2 has a VM processing rate of 2 and VM3 has a processing rate of 8. Each virtual machine has an initial split size of 100GB. Consequently, the total data size of the three virtual machines is 300GB, and the total speed of the three virtual machines is 20 units. The input split is then divided into fragments. The size of fragment is calculated using the following equation: Table I- MAP VS. MATCH VS. PREGEL BASED TYPE FILE TIME [M] ca.CondMat.txt Match/Reduce 0.23 ca.CondMat.txt Pregel 0.23 ca.CondMat.txt Map/Reduce 0.25 com.youtube.ungraph.txt Pregel 9.98 com.youtube.ungraph.txt Match/Reduce 16.78 com.youtube.ungraph.txt Map/Reduce 34.40 web.BerkStan.txt Pregel 14.12 web.BerkStan.txt Match/Reduce 24.70 web.BerkStan.txt Map/Reduce 298.35 A closer analysis of Figure 3 reveals that data sets with less than a hundred thousand nodes play in the hands of tratosphere. Its lack of a wiring phase, as in Hama, where nodes have to be connected to their neighbors, and the lack of Hadoop’s serialization makes it the fastest of the Java-based frameworks for smaller data sets B. Virtual Machine Mapping In Hadoop, a master node determines where mappers or reducers reside on a network. When assigning a mapper to a node, it is important that it is located on or near the data it will access. This is because transferring data from one node to another takes time and delays task execution. The problem of determining where to place a task on the network so that it is close to the data it uses is known as data locality. Data locality is a key concept in MapReduce and has a major impact on its performance. Even though Hadoop uses this concept when determining where to execute its mappers, it does not exploit this concept for its reducers and locate reducers near these partitions. We therefore propose for this purpose a virtual machine mapper (VMM) which allocates the reducer to the appropriate physical machine based on partition size and the availability of a virtual machine. Figure 4. Virtual Machine Mapper Fig. 4 shows a simple example of the VMM ascribing reduce tasks to physical machines. In this example, there are six mappers on two physical machines, with three mappers per physical machine. Since this job requests three reduce tasks, each mapper creates three partitions. The total amount of data to be received by each reducer is then deduced by summing up the respective partition at each mapper. Based on the concept of data locality, reducers are assigned locations on physical machines based on where the data for each reducer is stored. On a physical machine there are multiple virtual machines. Once a reducer is assigned to a physical machine the reducer is assigned whichever virtual machine has the fastest processing speed. The algorithm for the virtual machine mapper is presented as follows. In this example, all of the reducers are allocated to a virtual machine on an appropriate physical machine. If there are no virtual machines available, the reducer is allocated to another physical machine. This is first done using the best fit selection methodwhich eases recovery, but hampers performance [5]. However, with the relative youth of its successors, it is cur-rently the only framework that can actually recover from faults. Both distributed GraphLab and Stratosphere detect lost nodes quickly, however cannot recover
  • 4. International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2763 Issue 01, Volume 3 (January 2016) www.ijirae.com ______________________________________________________________________________________________________ IJIRAE: Impact Factor Value - ISRAJIF: 1.857 | PIF: 2.469 | Jour Info: 4.085 | Index Copernicus 2014 = 6.57 © 2014- 16, IJIRAE- All Rights Reserved Page -50 from errors. Both display an error code on the console. Only Stratosphere has a built-in recovery mechanism, which is at the time of writing unable to handle iterations properly. In Fig. 3 there are two physical machines. Both physical machines have three virtual machines each running a mapper. Each mapper produces three data partitions, which are assigned to three reducers. Using Algorithm 2, reducer 1 is assigned to physical machine 1, virtual machine 1. This is because physical machine 1 stores the largest fraction of the data designated for that reducer. Similarly, reducer 2 and reducer 3 are assigned to a virtual machine on physical machine 2. On physical machine 1 there are three viable virtual machines (VM1, VM2, VM3). Reducer 1 is assigned to VM1 as it has the fastest processing speed. On physical machine 2 there are also three viable virtual machines (VM4, VM5, VM6) each with different processing speeds. Since reducer 3 has more data stored on physical machine 2, it is assigned to the fastest virtual machine (VM6). Consequently, reducer 2 is assigned to the second fastest virtual machine (VM4). In this example, all of the reducers are allocated to a virtual machine on an appropriate physical machine. If there are no virtual machines available, the reducer is allocated to another physical machine. This is first done using the best fit selection method. If a physical machine has more reducers requesting a virtual machine than there are available virtual machines, the VMM has to locate the reducer to another physical machine. Instead of rejecting reducers based on a first come first served (FCFS), the VMM assesses the size of the data contributed to the reducer by each physical machine. The VMM then gives priority to those reducers with the greatest difference between their largest data contribution and their second largest data contribution.both scenarios using the analyzed vertex centric frameworks. The results for the Map/Reduce frameworks did not match the speed of their counterparts and have been left out. The first thing to notice is that GraphLab apparently comes from a multicore background and is still optimized for that. In that scenario, Hama performs worse than GraphLab by at least 20% for every tested data set. The gain shows up especially using large data sets in the multicore scenario, which is an important benefit. In the distributed scenario, the gain is negligible for a huge data set like Wiki-talk, and even reversed for the Berkeley data. As previously mentioned, dense data set seems to exploit a shortcoming in the implementation of distributed GraphLab. For a complete analysis of GraphLab on a multicore machine see Since data access rates between nodes on the HDFS are inconsistent due to issues of data locality, we propose a dynamic data partitioner that partitions data on a node irrespective of other nodes on the network. An example of the dynamic data both scenarios using the analyzed vertex centric frameworks. The results for the Map/Reduce frameworks did not match the speed of their counterparts and have been left out. The first thing to notice is that GraphLab apparently comes from a multicore background and is still optimized for that. In that scenario, Hama performs worse than GraphLab by at least 20% for every tested data set. The gain shows up especially using large data sets in the multicore scenario, which is an important benefit. CONCLUSIONS This paper is based on MapReduce and the Hadoop framework. Its purpose is to improve the performance of MapReduce distributed application when executing in a heterogeneous environment. By dynamically partitioning input data being read by map tasks and by judicious assignation of reduce tasks based on data locality using a Virtual Machine Mapper. Simulation and experimental results show an improvement in MapReduce performance, improving map task completion time by up to 44% and reduce task completion time by up to 29%.In future research, this work can be expanded todynamically determine the number of reducers deployed onthe MapReduceenvironment. This is an important topic,which analyzes the cost-benefits of increasing the number of reducers, and compares whether the impact on performance. REFERENCES [1]. J. Alvarez-Hamelin, L. Dall Asta, A. Barrat, and A. Vespignani. Large scale networks fingerprinting and visualization using the k-core decom-position. Advances in neural information processing systems, 18:41, 2006. [2]. G. D. Bader and C. W. Hogue. Analyzing yeast protein-protein interaction data obtained from different sources. Nature biotechnology, 20(10):991–997, Oct. 2002. [3]. K.-H. Lee, Y.-J. Lee, H. Choi, Y. D. Chung, and B. Moon, "Parallel data processing with MapReduce a survey," ACM SIGMOD Record, vol. 40, 2012, pp.11-20. [4]. J. Cohen. Graph twiddling in a mapreduce world. Computing in Science Engineering, 11(4):29 –41, July–Aug. 2009 [5]. J. Dittrich and J.-A. Quiané-Ruiz, "Efficient big data processing in Hadoop MapReduce," Proceedings of the VLDB Endowment, vol. 5, 2012, pp. 2014-2015. [6]. J. Xie, S. Yin, X. Ruan, Z. Ding, Y. Tian, J. Majors, A. Manzanares,and X. Qin, "Improving mapreduce performance through dataplacement inheterogeneous adoop clusters," in Parallel &Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010IEEE International Symposium on, 2010, pp. 1-9.