Presented by: PRATHVIRAJ (1CR11CS401), Dept. of CSE, CMRIT
Under the guidance of: Mrs. SHANTHI, Assistant Professor, Dept. of CSE, CMRIT
CMRIT 2013-14

• Abstract
• Introduction
• Existing System
• Proposed System
• System Architecture
• Implementation
• Conclusion
• References

• Hadoop employs a Java-based network transport stack on top of the Java Virtual Machine (JVM) for data shuffling and merging.
• JVM-Bypass Shuffling (JBS) for Hadoop
• JBS speeds up Hadoop data shuffling by avoiding Java-based transport protocols, removing the overhead and limitations of the JVM.

• MapReduce is a popular programming model that provides a simple and scalable parallel data processing framework.
• Hadoop is an open-source implementation of MapReduce, adopted by Yahoo and Facebook.
• Data shuffling
• However, data shuffling generates large volumes of network traffic and reduces the efficiency of data analytics applications.

• The JVM introduces significant overhead in managing Java objects, which shrinks the memory available to Hadoop.
• High-speed networks such as InfiniBand provide Remote Direct Memory Access (RDMA), which offers up to 56 Gbps of bandwidth with low CPU utilization.

• Hadoop exposes two simple interfaces:
  ◦ map
  ◦ reduce
• Its runtime system consists of four major components: JobTracker, TaskTracker, MapTask, and ReduceTask.
• Shuffling is the dominant source of network traffic: 5% of large jobs can consume more than 98% of network bandwidth.

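As a rough illustration of the two interfaces and of where shuffling sits between them, the following Python sketch runs a word count in a single process. It is not Hadoop code: `map_fn`, `reduce_fn`, `shuffle`, and `run_job` are hypothetical names, and the hash partitioning is an assumption made for the example. In a real cluster the `shuffle` step moves intermediate data across the network, which is the traffic these slides are concerned with.

```python
from collections import defaultdict
from zlib import crc32


def map_fn(line):
    """User-supplied map: emit (word, 1) for every word in a line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    """User-supplied reduce: sum the counts collected for one word."""
    return key, sum(values)


def shuffle(map_outputs, num_reducers):
    """Group intermediate pairs by key and assign each key to a reducer.

    In a cluster this is the step that crosses the network.
    """
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in map_outputs:
        # crc32 gives a partition index that is stable across runs.
        index = crc32(key.encode("utf-8")) % num_reducers
        partitions[index][key].append(value)
    return partitions


def run_job(lines, num_reducers=2):
    """Run map, shuffle, and reduce in sequence and collect the results."""
    map_outputs = [pair for line in lines for pair in map_fn(line)]
    result = {}
    for partition in shuffle(map_outputs, num_reducers):
        for key, values in partition.items():
            word, count = reduce_fn(key, values)
            result[word] = count
    return result


if __name__ == "__main__":
    print(run_job(["hadoop shuffles data", "data moves between nodes"]))
```

Only `map_fn` and `reduce_fn` correspond to what a user writes; everything in `shuffle` is done by the framework.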
• Execution time
• CPU utilization
• Network traffic
• Overhead in managing Java objects

• No change to the existing user interface (map and reduce functions)
• Remove the JVM from the critical path of intermediate data shuffling
• Portable layer on top of any network transport protocol

• Asynchronous network operations
• Increases locality of disk access
• Reduces average delay of requests

• Consolidates network fetch requests from all ReduceTasks on a single node
• The number of network connections is no longer the total number of MOF Copiers across all ReduceTasks
• Reduces the resource requirements for:
  ◦ creating and sustaining many network channels
  ◦ the associated memory to buffer data

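The consolidation idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the JBS implementation: `FetchRequest`, `consolidate`, and `connections_without_consolidation` are hypothetical names, and the host and segment identifiers are made up. The sketch groups the fetch requests of every ReduceTask on a node by remote host, so that one connection per remote host can serve all of them.

```python
from collections import defaultdict
from typing import NamedTuple


class FetchRequest(NamedTuple):
    reduce_task: str   # ReduceTask issuing the request
    remote_host: str   # node that holds the map output
    segment: str       # identifier of the requested map output segment


def consolidate(requests):
    """Group the requests of all ReduceTasks by remote host.

    One connection per remote host can then serve every ReduceTask on
    this node, instead of one connection per copier thread.
    """
    by_host = defaultdict(list)
    for request in requests:
        by_host[request.remote_host].append(request)
    return dict(by_host)


def connections_without_consolidation(num_reduce_tasks, copiers_per_task):
    """Upper bound on connections when every copier opens its own channel."""
    return num_reduce_tasks * copiers_per_task


if __name__ == "__main__":
    requests = [
        FetchRequest("reduce-0", "node-a", "map-1"),
        FetchRequest("reduce-0", "node-b", "map-2"),
        FetchRequest("reduce-1", "node-a", "map-1"),
        FetchRequest("reduce-1", "node-b", "map-2"),
    ]
    grouped = consolidate(requests)
    print(len(grouped), "connections after consolidation")
    # The copier count of 5 is an illustrative value only.
    print(connections_without_consolidation(2, 5), "possible without it")
```

With consolidation the connection count tracks the number of remote hosts rather than the number of copier threads, which is where the savings in channels and buffer memory come from.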
• Connection Establishment for RDMA and RoCE
• TCP/IP-Based Communication

• Currently, only the Reliable Connection (RC) service provided by RDMA-capable interconnects is supported
• Up to 512 active connections at a time
• When this threshold is exceeded, connections are torn down in least-recently-used (LRU) order

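The connection limit with LRU teardown can be sketched as a small cache. This is an assumption-laden illustration rather than the JBS code: `ConnectionCache`, `open_fn`, and `close_fn` are hypothetical names, and the "connection" here is just a placeholder string. Only the 512 default comes from the slide.

```python
from collections import OrderedDict


class ConnectionCache:
    """Keep at most `capacity` connections; evict the least recently used."""

    def __init__(self, capacity=512, open_fn=None, close_fn=None):
        self.capacity = capacity
        # Placeholder open/close hooks; a real transport would go here.
        self._open = open_fn or (lambda host: "conn-to-" + host)
        self._close = close_fn or (lambda conn: None)
        self._conns = OrderedDict()  # host -> connection, oldest first

    def get(self, host):
        """Return the connection for `host`, opening it if necessary."""
        if host in self._conns:
            # Mark as most recently used.
            self._conns.move_to_end(host)
            return self._conns[host]
        conn = self._open(host)
        self._conns[host] = conn
        if len(self._conns) > self.capacity:
            # Tear down the least recently used connection.
            _, oldest = self._conns.popitem(last=False)
            self._close(oldest)
        return conn

    def __len__(self):
        return len(self._conns)

    def __contains__(self, host):
        return host in self._conns


if __name__ == "__main__":
    closed = []
    cache = ConnectionCache(capacity=2, close_fn=closed.append)
    for host in ("node-a", "node-b", "node-a", "node-c"):
        cache.get(host)
    print("torn down:", closed)
```

In the usage example the capacity is lowered to 2 so that the eviction is visible: `node-b` is the least recently used entry when `node-c` arrives, so its connection is the one closed.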
• Event-driven model with multiple threads
• On the client side:
  ◦ one dedicated thread prepares connection requests
  ◦ data threads request connections to remote servers
• On the server side:
  ◦ one thread listens for client connection requests and accepts them
• Both client and server:
  ◦ use the epoll interface to monitor and detect events from concurrent connections
  ◦ rely on their data threads to perform the network communication for data transfer

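The event-monitoring part of this design can be sketched in a few lines. The example below is a simplified, single-threaded illustration and not the JBS code: it uses Python's standard `selectors` module, whose `DefaultSelector` is backed by epoll on Linux, as a stand-in for direct epoll calls, and it uses local socket pairs instead of real client and server connections. `poll_ready`, `demo`, and the connection labels are hypothetical names.

```python
import selectors
import socket


def poll_ready(selector, timeout=1.0):
    """Return the labels of all connections that currently have data."""
    return sorted(key.data for key, _ in selector.select(timeout))


def demo():
    """Watch three connections and read only from those with pending data."""
    selector = selectors.DefaultSelector()  # epoll-backed on Linux
    peers = {}
    try:
        for label in ("conn-0", "conn-1", "conn-2"):
            local, remote = socket.socketpair()
            local.setblocking(False)
            selector.register(local, selectors.EVENT_READ, data=label)
            peers[label] = (local, remote)

        # Only two of the three remote ends send data.
        peers["conn-0"][1].sendall(b"segment-0")
        peers["conn-2"][1].sendall(b"segment-2")

        # A single call reports every connection that is ready.
        ready = poll_ready(selector)
        return {label: peers[label][0].recv(1024) for label in ready}
    finally:
        for local, remote in peers.values():
            selector.unregister(local)
            local.close()
            remote.close()
        selector.close()


if __name__ == "__main__":
    print(demo())
```

The point of the pattern is that one thread can watch many connections at once and hand only the ready ones to data threads, instead of dedicating a blocked thread to each connection.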
• Simply switching to high-performance interconnects cannot effectively boost Hadoop's performance
• The overhead imposed by the JVM on Hadoop intermediate data shuffling was identified
• JVM-Bypass Shuffling (JBS):
  ◦ avoids the JVM in the critical path of Hadoop data shuffling
  ◦ is a portable library that leverages both the conventional TCP/IP protocol and the high-performance RDMA protocol
  ◦ reduces the execution time of Hadoop jobs by up to 66.3%
  ◦ lowers CPU utilization by 48.1%

• [1] Apache Hadoop Project. http://hadoop.apache.org/
• [2] Plugin for Generic Shuffle Service. https://issues.apache.org/jira/browse/MAPREDUCE-4049
• [3] F. Ahmad, S. T. Chakradhar, A. Raghunathan, and T. N. Vijaykumar. Tarazu: Optimizing MapReduce on Heterogeneous Clusters. In Proceedings of the Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '12), pages 61–74, New York, NY, USA, 2012. ACM.

THANK YOU

JVM Bypass for Efficient Hadoop Shuffling
