SlideShare a Scribd company logo
1 of 4
Download to read offline
Do Your Projects With Domain Experts…
Copyright © 2015 LeMeniz Infotech. All rights reserved
Page number 1
LeMeniz Infotech
36, 100 Feet Road, Natesan Nagar, Near Indira Gandhi Statue, Pondicherry-
605 005.
Call: 0413-4205444, +91 9566355386, 99625 88976.
Web : www.lemenizinfotech.com / www.ieeemaster.com
Mail : projects@lemenizinfotech.com
Hadoop Performance Modeling for Job Estimation and Resource Provisioning
ABSTRACT:
MapReduce has become a major computing model for data intensive applications.
Hadoop, an open source implementation of MapReduce, has been adopted by an
increasingly growing user community. Cloud computing service providers such as Amazon
EC2 Cloud offer the opportunities for Hadoop users to lease a certain amount of resources
and pay for their use. However, a key challenge is that cloud service providers do not have
a resource provisioning mechanism to satisfy user jobs with deadline requirements.
Currently, it is solely the user's responsibility to estimate the required amount of resources
for running a job in the cloud. This paper presents a Hadoop job performance model that
accurately estimates job completion time and further provisions the required amount of
resources for a job to be completed within a deadline. The proposed model builds on
historical job execution records and employs Locally Weighted Linear Regression (LWLR)
technique to estimate the execution time of a job. Furthermore, it employs Lagrange
Multipliers technique for resource provisioning to satisfy jobs with deadline requirements.
The proposed model is initially evaluated on an in-house Hadoop cluster and subsequently
evaluated in the Amazon EC2 Cloud. Experimental results show that the accuracy of the
proposed model in job execution estimation is in the range of 94.97% and 95.51%, and
jobs are completed within the required deadlines following on the resource provisioning
scheme of the proposed model.
INTRODUCTION
Many organizations are continuously collecting massive amounts of datasets from various
sources such as the World Wide Web, sensor networks and social networks. The ability to
perform scalable and timely analytics on these unstructured datasets is a high priority task
for many enterprises. It has become difficult for traditional network storage and database
systems to process these continuously growing datasets. MapReduce, originally
developed by Google, has become a major computing model in support of data intensive
applications. It is a highly scalable, fault-tolerant and data parallel model that automatically
distributes the data and parallelizes the computation across a cluster of computers. Among
its implementations such as Mars, Phoenix, Dryad and Hadoop , Hadoop has received a
Do Your Projects With Domain Experts…
Copyright © 2015 LeMeniz Infotech. All rights reserved
Page number 2
LeMeniz Infotech
36, 100 Feet Road, Natesan Nagar, Near Indira Gandhi Statue, Pondicherry-
605 005.
Call: 0413-4205444, +91 9566355386, 99625 88976.
Web : www.lemenizinfotech.com / www.ieeemaster.com
Mail : projects@lemenizinfotech.com
wide uptake by the community due to its open source nature.
EXISTING SYSTEM
In Existing System Hadoop MapReduce is its support of public cloud computing that
enables the organizations to utilize cloud services in a pay-as-you-go manner. This facility
is beneficial to small and medium size organizations where the setup of a large scale and
complex private cloud is not feasible due to financial constraints. Hence, executing
Hadoop MapReduce applications in a cloud environment for big data analytics has
become a realistic option for both the industrial practitioners and academic researchers.
For example, Amazon has designed Elastic MapReduce (EMR) that enables users to run
Hadoop applications across its Elastic Cloud Computing (EC2) nodes.
DisADVANTAGE OF Existing SYSTEM
 This facility is beneficial to small and medium size organizations where the setup of
a large scale and complex private cloud is not feasible due to financial constraints.
PROPOSED SYSTEM
In Proposed System we present improved HP model for Hadoop job execution
estimation and resource provisioning. The improved HP work mathematically models all
the three core phases of a Hadoop job. In contrast, the HP work does not mathematically
model the non-overlapping shuffle phase in the first wave. The improved HP model
employs Locally Weighted Linear Regression (LWLR) technique to estimate the execution
time of a Hadoop job with a varied number of reduce tasks. In contrast, the HP model
employs a simple linear regress technique for job execution estimation which restricts to a
constant number of reduce tasks. Based on job execution estimation, the improved HP
model employs Langrage Multiplier technique to provision the amount of resources for a
Hadoop job to complete within a given deadline.
Do Your Projects With Domain Experts…
Copyright © 2015 LeMeniz Infotech. All rights reserved
Page number 3
LeMeniz Infotech
36, 100 Feet Road, Natesan Nagar, Near Indira Gandhi Statue, Pondicherry-
605 005.
Call: 0413-4205444, +91 9566355386, 99625 88976.
Web : www.lemenizinfotech.com / www.ieeemaster.com
Mail : projects@lemenizinfotech.com
ADVANTAGE OF PROPOSED SYSTEM
 VMs are configured with a large number of both map slots and reduce slots.
ARCHITECTURE:
Do Your Projects With Domain Experts…
Copyright © 2015 LeMeniz Infotech. All rights reserved
Page number 4
LeMeniz Infotech
36, 100 Feet Road, Natesan Nagar, Near Indira Gandhi Statue, Pondicherry-
605 005.
Call: 0413-4205444, +91 9566355386, 99625 88976.
Web : www.lemenizinfotech.com / www.ieeemaster.com
Mail : projects@lemenizinfotech.com
HARDWARE REQUIREMENTS:
 System : Pentium IV 2.4 GHz.
 Hard Disk : 40 GB.
 Floppy Drive : 44 Mb.
 Monitor : 15 VGA Colour.
SOFTWARE REQUIREMENTS:
 Operating system : Windows 7.
 Coding Language : Java 1.7 ,Hadoop 0.8.1
 Database : MySql 5
 IDE : Eclipse

More Related Content

What's hot

mapr_case_study_experian
mapr_case_study_experianmapr_case_study_experian
mapr_case_study_experian
Erni Susanti
 
Data Science: Driving Smarter Finance and Workforce Decsions for the Enterprise
Data Science: Driving Smarter Finance and Workforce Decsions for the EnterpriseData Science: Driving Smarter Finance and Workforce Decsions for the Enterprise
Data Science: Driving Smarter Finance and Workforce Decsions for the Enterprise
DataWorks Summit
 

What's hot (20)

1.demystifying big data & hadoop
1.demystifying big data & hadoop1.demystifying big data & hadoop
1.demystifying big data & hadoop
 
BigData
BigDataBigData
BigData
 
Cascading User Group Meet
Cascading User Group MeetCascading User Group Meet
Cascading User Group Meet
 
Case Study - Komli Media Improves Utilization With Premium Big Data Platform ...
Case Study - Komli Media Improves Utilization With Premium Big Data Platform ...Case Study - Komli Media Improves Utilization With Premium Big Data Platform ...
Case Study - Komli Media Improves Utilization With Premium Big Data Platform ...
 
Trends towards the merge of HPC + Big Data systems
Trends towards the merge of HPC + Big Data systemsTrends towards the merge of HPC + Big Data systems
Trends towards the merge of HPC + Big Data systems
 
MapR and Machine Learning Primer
MapR and Machine Learning PrimerMapR and Machine Learning Primer
MapR and Machine Learning Primer
 
Introduction to Microsoft Azure HD Insight by Dattatrey Sindhol
Introduction to Microsoft Azure HD Insight by Dattatrey Sindhol Introduction to Microsoft Azure HD Insight by Dattatrey Sindhol
Introduction to Microsoft Azure HD Insight by Dattatrey Sindhol
 
Hd insight overview
Hd insight overviewHd insight overview
Hd insight overview
 
VMUGIT UC 2013 - 08a VMware Hadoop
VMUGIT UC 2013 - 08a VMware HadoopVMUGIT UC 2013 - 08a VMware Hadoop
VMUGIT UC 2013 - 08a VMware Hadoop
 
Machine Learning Using Cloud Services
Machine Learning Using Cloud ServicesMachine Learning Using Cloud Services
Machine Learning Using Cloud Services
 
mapr_case_study_experian
mapr_case_study_experianmapr_case_study_experian
mapr_case_study_experian
 
Data Science: Driving Smarter Finance and Workforce Decsions for the Enterprise
Data Science: Driving Smarter Finance and Workforce Decsions for the EnterpriseData Science: Driving Smarter Finance and Workforce Decsions for the Enterprise
Data Science: Driving Smarter Finance and Workforce Decsions for the Enterprise
 
Self-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your DataSelf-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your Data
 
MapR Unique features
MapR Unique featuresMapR Unique features
MapR Unique features
 
Big data: Descoberta de conhecimento em ambientes de big data e computação na...
Big data: Descoberta de conhecimento em ambientes de big data e computação na...Big data: Descoberta de conhecimento em ambientes de big data e computação na...
Big data: Descoberta de conhecimento em ambientes de big data e computação na...
 
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
 
Growth hacking in the age of Data
Growth hacking in the age of DataGrowth hacking in the age of Data
Growth hacking in the age of Data
 
RDBMS vs Hadoop vs Spark
RDBMS vs Hadoop vs SparkRDBMS vs Hadoop vs Spark
RDBMS vs Hadoop vs Spark
 
Combining hadoop with big data analytics
Combining hadoop with big data analyticsCombining hadoop with big data analytics
Combining hadoop with big data analytics
 
Designing Data Pipelines for Automous and Trusted Analytics
Designing Data Pipelines for Automous and Trusted AnalyticsDesigning Data Pipelines for Automous and Trusted Analytics
Designing Data Pipelines for Automous and Trusted Analytics
 

Viewers also liked

Camps gaëlle et lilou
Camps   gaëlle et lilouCamps   gaëlle et lilou
Camps gaëlle et lilou
mdrouet44
 
Naturaleza
NaturalezaNaturaleza
Naturaleza
diosyely
 
интеллект карта
интеллект картаинтеллект карта
интеллект карта
kmr17863
 
НОВАЯ ВОЛНА КОМАНДНАЯ ЭФФЕКТИВНОСТЬ_Развитие Лидеров Коучинг(1)
НОВАЯ ВОЛНА КОМАНДНАЯ ЭФФЕКТИВНОСТЬ_Развитие Лидеров Коучинг(1)НОВАЯ ВОЛНА КОМАНДНАЯ ЭФФЕКТИВНОСТЬ_Развитие Лидеров Коучинг(1)
НОВАЯ ВОЛНА КОМАНДНАЯ ЭФФЕКТИВНОСТЬ_Развитие Лидеров Коучинг(1)
Yelena Shaulova
 
April 13 2014 slideshow
April 13 2014 slideshowApril 13 2014 slideshow
April 13 2014 slideshow
Earl Oswalt
 

Viewers also liked (18)

PySaprk
PySaprkPySaprk
PySaprk
 
PySpark in practice slides
PySpark in practice slidesPySpark in practice slides
PySpark in practice slides
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and Python
 
Python and Bigdata - An Introduction to Spark (PySpark)
Python and Bigdata -  An Introduction to Spark (PySpark)Python and Bigdata -  An Introduction to Spark (PySpark)
Python and Bigdata - An Introduction to Spark (PySpark)
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
The smart quiz
The smart quizThe smart quiz
The smart quiz
 
Camps gaëlle et lilou
Camps   gaëlle et lilouCamps   gaëlle et lilou
Camps gaëlle et lilou
 
บทความประชุมวิชาการ มหาวิทยาลัยเกษตรศาสตร์ ครั้งที่ ๕๐
บทความประชุมวิชาการ มหาวิทยาลัยเกษตรศาสตร์ ครั้งที่ ๕๐บทความประชุมวิชาการ มหาวิทยาลัยเกษตรศาสตร์ ครั้งที่ ๕๐
บทความประชุมวิชาการ มหาวิทยาลัยเกษตรศาสตร์ ครั้งที่ ๕๐
 
Hiding in the mobile crowd location privacy through collaboration
Hiding in the mobile crowd location privacy through collaborationHiding in the mobile crowd location privacy through collaboration
Hiding in the mobile crowd location privacy through collaboration
 
Get Easy
Get EasyGet Easy
Get Easy
 
Cát Vạn Lợi - Nhà cung cấp vật tư thiết bị điện hàng đầu cho mọi công trình M...
Cát Vạn Lợi - Nhà cung cấp vật tư thiết bị điện hàng đầu cho mọi công trình M...Cát Vạn Lợi - Nhà cung cấp vật tư thiết bị điện hàng đầu cho mọi công trình M...
Cát Vạn Lợi - Nhà cung cấp vật tư thiết bị điện hàng đầu cho mọi công trình M...
 
Naturaleza
NaturalezaNaturaleza
Naturaleza
 
интеллект карта
интеллект картаинтеллект карта
интеллект карта
 
Bandwidth distributed denial of service attacks and defenses
Bandwidth distributed denial of service attacks and defensesBandwidth distributed denial of service attacks and defenses
Bandwidth distributed denial of service attacks and defenses
 
Ncc technology presentation 2014
Ncc technology presentation 2014Ncc technology presentation 2014
Ncc technology presentation 2014
 
НОВАЯ ВОЛНА КОМАНДНАЯ ЭФФЕКТИВНОСТЬ_Развитие Лидеров Коучинг(1)
НОВАЯ ВОЛНА КОМАНДНАЯ ЭФФЕКТИВНОСТЬ_Развитие Лидеров Коучинг(1)НОВАЯ ВОЛНА КОМАНДНАЯ ЭФФЕКТИВНОСТЬ_Развитие Лидеров Коучинг(1)
НОВАЯ ВОЛНА КОМАНДНАЯ ЭФФЕКТИВНОСТЬ_Развитие Лидеров Коучинг(1)
 
Medium access with adaptive relay selection in cooperative wireless networks
Medium access with adaptive relay selection in cooperative wireless networksMedium access with adaptive relay selection in cooperative wireless networks
Medium access with adaptive relay selection in cooperative wireless networks
 
April 13 2014 slideshow
April 13 2014 slideshowApril 13 2014 slideshow
April 13 2014 slideshow
 

Similar to Hadoop performance modeling for job estimation and resource provisioning

Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and ScienceResearch Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Science
researchinventy
 
Hadoop performance modeling for job
Hadoop performance modeling for jobHadoop performance modeling for job
Hadoop performance modeling for job
ranjith kumar
 
A Survey on Data Mapping Strategy for data stored in the storage cloud 111
A Survey on Data Mapping Strategy for data stored in the storage cloud  111A Survey on Data Mapping Strategy for data stored in the storage cloud  111
A Survey on Data Mapping Strategy for data stored in the storage cloud 111
NavNeet KuMar
 

Similar to Hadoop performance modeling for job estimation and resource provisioning (20)

Self adjusting slot configurations for homogeneous and heterogeneous hadoop c...
Self adjusting slot configurations for homogeneous and heterogeneous hadoop c...Self adjusting slot configurations for homogeneous and heterogeneous hadoop c...
Self adjusting slot configurations for homogeneous and heterogeneous hadoop c...
 
Hfsp bringing size based scheduling to hadoop
Hfsp bringing size based scheduling to hadoopHfsp bringing size based scheduling to hadoop
Hfsp bringing size based scheduling to hadoop
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and ScienceResearch Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Science
 
Hadoop performance modeling for job
Hadoop performance modeling for jobHadoop performance modeling for job
Hadoop performance modeling for job
 
B04 06 0918
B04 06 0918B04 06 0918
B04 06 0918
 
A hadoop map reduce
A hadoop map reduceA hadoop map reduce
A hadoop map reduce
 
Hadoop in the Cloud
Hadoop in the CloudHadoop in the Cloud
Hadoop in the Cloud
 
Hadoop performance modeling for job estimation and resource provisioning
Hadoop performance modeling for job estimation and resource provisioningHadoop performance modeling for job estimation and resource provisioning
Hadoop performance modeling for job estimation and resource provisioning
 
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...
 
JIT Borawan Cloud computing part 2
JIT Borawan Cloud computing part 2JIT Borawan Cloud computing part 2
JIT Borawan Cloud computing part 2
 
E5 05 ijcite august 2014
E5 05 ijcite august 2014E5 05 ijcite august 2014
E5 05 ijcite august 2014
 
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
Performance evaluation of Map-reduce jar pig hive and spark with machine lear...
 
High Performance Computing
High Performance ComputingHigh Performance Computing
High Performance Computing
 
The Big Picture on Big Data and Cognos
The Big Picture on Big Data and CognosThe Big Picture on Big Data and Cognos
The Big Picture on Big Data and Cognos
 
A Survey on Data Mapping Strategy for data stored in the storage cloud 111
A Survey on Data Mapping Strategy for data stored in the storage cloud  111A Survey on Data Mapping Strategy for data stored in the storage cloud  111
A Survey on Data Mapping Strategy for data stored in the storage cloud 111
 
Big Data Platform and Architecture Recommendation
Big Data Platform and Architecture RecommendationBig Data Platform and Architecture Recommendation
Big Data Platform and Architecture Recommendation
 
Clloud computing provisioing and benifits altanai bisht 2nd year , part ii
Clloud computing provisioing and benifits   altanai bisht 2nd year , part iiClloud computing provisioing and benifits   altanai bisht 2nd year , part ii
Clloud computing provisioing and benifits altanai bisht 2nd year , part ii
 
BIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaBIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social Media
 
Hadoop
HadoopHadoop
Hadoop
 
Design architecture based on web
Design architecture based on webDesign architecture based on web
Design architecture based on web
 

More from LeMeniz Infotech

Interleaved digital power factor correction based on the sliding mode approach
Interleaved digital power factor correction based on the sliding mode approachInterleaved digital power factor correction based on the sliding mode approach
Interleaved digital power factor correction based on the sliding mode approach
LeMeniz Infotech
 
Bumpless control for reduced thd in power factor correction circuits
Bumpless control for reduced thd in power factor correction circuitsBumpless control for reduced thd in power factor correction circuits
Bumpless control for reduced thd in power factor correction circuits
LeMeniz Infotech
 
A bidirectional three level llc resonant converter with pwam control
A bidirectional three level llc resonant converter with pwam controlA bidirectional three level llc resonant converter with pwam control
A bidirectional three level llc resonant converter with pwam control
LeMeniz Infotech
 
Efficient single phase transformerless inverter for grid tied pvg system with...
Efficient single phase transformerless inverter for grid tied pvg system with...Efficient single phase transformerless inverter for grid tied pvg system with...
Efficient single phase transformerless inverter for grid tied pvg system with...
LeMeniz Infotech
 
Highly reliable transformerless photovoltaic inverters with leakage current a...
Highly reliable transformerless photovoltaic inverters with leakage current a...Highly reliable transformerless photovoltaic inverters with leakage current a...
Highly reliable transformerless photovoltaic inverters with leakage current a...
LeMeniz Infotech
 
Grid current-feedback active damping for lcl resonance in grid-connected volt...
Grid current-feedback active damping for lcl resonance in grid-connected volt...Grid current-feedback active damping for lcl resonance in grid-connected volt...
Grid current-feedback active damping for lcl resonance in grid-connected volt...
LeMeniz Infotech
 
Delay dependent stability of single-loop controlled grid-connected inverters ...
Delay dependent stability of single-loop controlled grid-connected inverters ...Delay dependent stability of single-loop controlled grid-connected inverters ...
Delay dependent stability of single-loop controlled grid-connected inverters ...
LeMeniz Infotech
 

More from LeMeniz Infotech (20)

A fast acquisition all-digital delay-locked loop using a starting-bit predict...
A fast acquisition all-digital delay-locked loop using a starting-bit predict...A fast acquisition all-digital delay-locked loop using a starting-bit predict...
A fast acquisition all-digital delay-locked loop using a starting-bit predict...
 
A fast fault tolerant architecture for sauvola local image thresholding algor...
A fast fault tolerant architecture for sauvola local image thresholding algor...A fast fault tolerant architecture for sauvola local image thresholding algor...
A fast fault tolerant architecture for sauvola local image thresholding algor...
 
A dynamically reconfigurable multi asip architecture for multistandard and mu...
A dynamically reconfigurable multi asip architecture for multistandard and mu...A dynamically reconfigurable multi asip architecture for multistandard and mu...
A dynamically reconfigurable multi asip architecture for multistandard and mu...
 
Interleaved digital power factor correction based on the sliding mode approach
Interleaved digital power factor correction based on the sliding mode approachInterleaved digital power factor correction based on the sliding mode approach
Interleaved digital power factor correction based on the sliding mode approach
 
Bumpless control for reduced thd in power factor correction circuits
Bumpless control for reduced thd in power factor correction circuitsBumpless control for reduced thd in power factor correction circuits
Bumpless control for reduced thd in power factor correction circuits
 
A bidirectional single stage three phase rectifier with high-frequency isolat...
A bidirectional single stage three phase rectifier with high-frequency isolat...A bidirectional single stage three phase rectifier with high-frequency isolat...
A bidirectional single stage three phase rectifier with high-frequency isolat...
 
A bidirectional three level llc resonant converter with pwam control
A bidirectional three level llc resonant converter with pwam controlA bidirectional three level llc resonant converter with pwam control
A bidirectional three level llc resonant converter with pwam control
 
Efficient single phase transformerless inverter for grid tied pvg system with...
Efficient single phase transformerless inverter for grid tied pvg system with...Efficient single phase transformerless inverter for grid tied pvg system with...
Efficient single phase transformerless inverter for grid tied pvg system with...
 
Highly reliable transformerless photovoltaic inverters with leakage current a...
Highly reliable transformerless photovoltaic inverters with leakage current a...Highly reliable transformerless photovoltaic inverters with leakage current a...
Highly reliable transformerless photovoltaic inverters with leakage current a...
 
Grid current-feedback active damping for lcl resonance in grid-connected volt...
Grid current-feedback active damping for lcl resonance in grid-connected volt...Grid current-feedback active damping for lcl resonance in grid-connected volt...
Grid current-feedback active damping for lcl resonance in grid-connected volt...
 
Delay dependent stability of single-loop controlled grid-connected inverters ...
Delay dependent stability of single-loop controlled grid-connected inverters ...Delay dependent stability of single-loop controlled grid-connected inverters ...
Delay dependent stability of single-loop controlled grid-connected inverters ...
 
Connection of converters to a low and medium power dc network using an induct...
Connection of converters to a low and medium power dc network using an induct...Connection of converters to a low and medium power dc network using an induct...
Connection of converters to a low and medium power dc network using an induct...
 
Stamp enabling privacy preserving location proofs for mobile users
Stamp enabling privacy preserving location proofs for mobile usersStamp enabling privacy preserving location proofs for mobile users
Stamp enabling privacy preserving location proofs for mobile users
 
Sbvlc secure barcode based visible light communication for smartphones
Sbvlc secure barcode based visible light communication for smartphonesSbvlc secure barcode based visible light communication for smartphones
Sbvlc secure barcode based visible light communication for smartphones
 
Read2 me a cloud based reading aid for the visually impaired
Read2 me a cloud based reading aid for the visually impairedRead2 me a cloud based reading aid for the visually impaired
Read2 me a cloud based reading aid for the visually impaired
 
Privacy preserving location sharing services for social networks
Privacy preserving location sharing services for social networksPrivacy preserving location sharing services for social networks
Privacy preserving location sharing services for social networks
 
Pass byo bring your own picture for securing graphical passwords
Pass byo bring your own picture for securing graphical passwordsPass byo bring your own picture for securing graphical passwords
Pass byo bring your own picture for securing graphical passwords
 
Eplq efficient privacy preserving location-based query over outsourced encryp...
Eplq efficient privacy preserving location-based query over outsourced encryp...Eplq efficient privacy preserving location-based query over outsourced encryp...
Eplq efficient privacy preserving location-based query over outsourced encryp...
 
Analyzing ad library updates in android apps
Analyzing ad library updates in android appsAnalyzing ad library updates in android apps
Analyzing ad library updates in android apps
 
An exploration of geographic authentication scheme
An exploration of geographic authentication schemeAn exploration of geographic authentication scheme
An exploration of geographic authentication scheme
 

Recently uploaded

Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
QucHHunhnh
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
SoniaTolstoy
 

Recently uploaded (20)

Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 

Hadoop performance modeling for job estimation and resource provisioning

  • 1. Do Your Projects With Domain Experts… Copyright © 2015 LeMeniz Infotech. All rights reserved Page number 1 LeMeniz Infotech 36, 100 Feet Road, Natesan Nagar, Near Indira Gandhi Statue, Pondicherry- 605 005. Call: 0413-4205444, +91 9566355386, 99625 88976. Web : www.lemenizinfotech.com / www.ieeemaster.com Mail : projects@lemenizinfotech.com Hadoop Performance Modeling for Job Estimation and Resource Provisioning ABSTRACT: MapReduce has become a major computing model for data intensive applications. Hadoop, an open source implementation of MapReduce, has been adopted by an increasingly growing user community. Cloud computing service providers such as Amazon EC2 Cloud offer the opportunities for Hadoop users to lease a certain amount of resources and pay for their use. However, a key challenge is that cloud service providers do not have a resource provisioning mechanism to satisfy user jobs with deadline requirements. Currently, it is solely the user's responsibility to estimate the required amount of resources for running a job in the cloud. This paper presents a Hadoop job performance model that accurately estimates job completion time and further provisions the required amount of resources for a job to be completed within a deadline. The proposed model builds on historical job execution records and employs Locally Weighted Linear Regression (LWLR) technique to estimate the execution time of a job. Furthermore, it employs Lagrange Multipliers technique for resource provisioning to satisfy jobs with deadline requirements. The proposed model is initially evaluated on an in-house Hadoop cluster and subsequently evaluated in the Amazon EC2 Cloud. Experimental results show that the accuracy of the proposed model in job execution estimation is in the range of 94.97% and 95.51%, and jobs are completed within the required deadlines following on the resource provisioning scheme of the proposed model. INTRODUCTION Many organizations are continuously collecting massive amounts of datasets from various sources such as the World Wide Web, sensor networks and social networks. The ability to perform scalable and timely analytics on these unstructured datasets is a high priority task for many enterprises. It has become difficult for traditional network storage and database systems to process these continuously growing datasets. MapReduce, originally developed by Google, has become a major computing model in support of data intensive applications. It is a highly scalable, fault-tolerant and data parallel model that automatically distributes the data and parallelizes the computation across a cluster of computers. Among its implementations such as Mars, Phoenix, Dryad and Hadoop , Hadoop has received a
  • 2. Do Your Projects With Domain Experts… Copyright © 2015 LeMeniz Infotech. All rights reserved Page number 2 LeMeniz Infotech 36, 100 Feet Road, Natesan Nagar, Near Indira Gandhi Statue, Pondicherry- 605 005. Call: 0413-4205444, +91 9566355386, 99625 88976. Web : www.lemenizinfotech.com / www.ieeemaster.com Mail : projects@lemenizinfotech.com wide uptake by the community due to its open source nature. EXISTING SYSTEM In Existing System Hadoop MapReduce is its support of public cloud computing that enables the organizations to utilize cloud services in a pay-as-you-go manner. This facility is beneficial to small and medium size organizations where the setup of a large scale and complex private cloud is not feasible due to financial constraints. Hence, executing Hadoop MapReduce applications in a cloud environment for big data analytics has become a realistic option for both the industrial practitioners and academic researchers. For example, Amazon has designed Elastic MapReduce (EMR) that enables users to run Hadoop applications across its Elastic Cloud Computing (EC2) nodes. DisADVANTAGE OF Existing SYSTEM  This facility is beneficial to small and medium size organizations where the setup of a large scale and complex private cloud is not feasible due to financial constraints. PROPOSED SYSTEM In Proposed System we present improved HP model for Hadoop job execution estimation and resource provisioning. The improved HP work mathematically models all the three core phases of a Hadoop job. In contrast, the HP work does not mathematically model the non-overlapping shuffle phase in the first wave. The improved HP model employs Locally Weighted Linear Regression (LWLR) technique to estimate the execution time of a Hadoop job with a varied number of reduce tasks. In contrast, the HP model employs a simple linear regress technique for job execution estimation which restricts to a constant number of reduce tasks. Based on job execution estimation, the improved HP model employs Langrage Multiplier technique to provision the amount of resources for a Hadoop job to complete within a given deadline.
  • 3. Do Your Projects With Domain Experts… Copyright © 2015 LeMeniz Infotech. All rights reserved Page number 3 LeMeniz Infotech 36, 100 Feet Road, Natesan Nagar, Near Indira Gandhi Statue, Pondicherry- 605 005. Call: 0413-4205444, +91 9566355386, 99625 88976. Web : www.lemenizinfotech.com / www.ieeemaster.com Mail : projects@lemenizinfotech.com ADVANTAGE OF PROPOSED SYSTEM  VMs are configured with a large number of both map slots and reduce slots. ARCHITECTURE:
  • 4. Do Your Projects With Domain Experts… Copyright © 2015 LeMeniz Infotech. All rights reserved Page number 4 LeMeniz Infotech 36, 100 Feet Road, Natesan Nagar, Near Indira Gandhi Statue, Pondicherry- 605 005. Call: 0413-4205444, +91 9566355386, 99625 88976. Web : www.lemenizinfotech.com / www.ieeemaster.com Mail : projects@lemenizinfotech.com HARDWARE REQUIREMENTS:  System : Pentium IV 2.4 GHz.  Hard Disk : 40 GB.  Floppy Drive : 44 Mb.  Monitor : 15 VGA Colour. SOFTWARE REQUIREMENTS:  Operating system : Windows 7.  Coding Language : Java 1.7 ,Hadoop 0.8.1  Database : MySql 5  IDE : Eclipse