GPU-Enabled Hadoop Cluster Using CP Scheduler
1. CP Scheduler in GPU-Enabled Hadoop Cluster
Anandu Jayan and Bhargavi R Upadhyay
Dept. of Computer Science and Engineering,
Amrita School of Engineering, Bengaluru,
Amrita Vishwa Vidyapeetham,
India
03/06/2017 Amrita School of Engineering Bangalore
2. INTRODUCTION
• HADOOP
• GPUs increase performance.
• Communication time is reduced.
• More systems are added to the cluster to handle large datasets.
HADOOP + GPU = Increased Performance
HADOOP + GPU + Cluster = Better Performance
4. INTRODUCTION (cont.)
• Schedulers are used.
Fair scheduling:
Assigning resources to applications such that all apps get, on average, an
equal share of resources over time.
[Figure: CPU share under fair scheduling. Single task: 100% CPU. Two tasks: 50% CPU each. Four tasks: 25% CPU each.]
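The equal-share idea behind fair scheduling can be illustrated with a minimal sketch. The function name `fair_shares` and the use of a single CPU percentage as the only resource are assumptions made for illustration; this is not the Hadoop Fair Scheduler implementation, which also accounts for memory, queues, and weights.

```python
def fair_shares(num_tasks, total_cpu=100.0):
    """Split total_cpu (in percent) equally among num_tasks running tasks.

    Hypothetical helper for illustration only.
    """
    if num_tasks <= 0:
        return []
    return [total_cpu / num_tasks] * num_tasks


# One, two, and four concurrent tasks, as in the slide's example.
for n in (1, 2, 4):
    print(n, "task(s):", fair_shares(n))
```

With one task the whole CPU is assigned to it; each additional task shrinks every task's share equally.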
5. INTRODUCTION (cont.)
Capacity scheduling:
Assigning resources to applications such that all apps are allocated
resources in a timely manner, within the limits of each queue's allocated capacity.
Priority Schedulers
[Figure: Capacity scheduling. Apps are submitted to QUEUE 1, QUEUE 2, and QUEUE 3; the guaranteed resources shown are 20%, 50%, and 30%.]
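A minimal sketch of how guaranteed queue capacities translate into resources is given here. The queue names, the pairing of each queue with a percentage, the cluster size of 40 containers, and the helper `guaranteed_containers` are all assumptions for illustration; the real YARN Capacity Scheduler additionally supports elastic sharing of idle capacity and hierarchical queues.

```python
def guaranteed_containers(total_containers, capacities):
    """Return the number of containers each queue is guaranteed.

    capacities maps a queue name to its guaranteed share in percent.
    Hypothetical helper for illustration only.
    """
    if sum(capacities.values()) != 100:
        raise ValueError("queue capacities must sum to 100%")
    # Integer division: a queue is never guaranteed a fraction of a container.
    return {q: total_containers * pct // 100 for q, pct in capacities.items()}


# Assumed mapping of the three queues to the 20% / 50% / 30% shares.
capacities = {"queue1": 20, "queue2": 50, "queue3": 30}
print(guaranteed_containers(40, capacities))
```

Each queue can always claim its guaranteed portion, so apps in a lightly loaded queue are not starved by a busy one.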
6. MOTIVATION
• Big datasets that are expected to be processed quickly end up taking longer
• Not all data follows a proper schema
• Large data has to be processed in less time
• As data grows, the CPU alone will not be enough
• Existing schedulers can only be used with schema-based datasets
• Some schedulers limit how users can be defined
7. LITERATURE SURVEY
| PROJECT | CONTRIBUTORS | EXPLANATION |
|---|---|---|
| A hybrid scheduling approach for scalable heterogeneous Hadoop systems | A. Rasooli, D. G. Down | Developed the COSHH scheduler, increasing scalability |
| A Dynamic MapReduce Scheduler for Heterogeneous Workloads | Chao Tian, Haojie Zhou, Yongqiang He, Li Zha | Improves hardware utilization. Triple queues. MRPredict |
| ATLAS: An Adaptive Failure-aware Scheduler for Hadoop | Mbarka Soualhia, Foutse Khomh and Sofiène Tahar | Hadoop can change scheduling policies considering failures |
| Chronos: Failure-Aware Scheduling in Shared Hadoop Clusters | Orcun Yildiz, Shadi Ibrahim, Tran Anh Phuong | Failure-aware scheduling and recovery solution |
| Performance evaluation of fair and capacity scheduling in Hadoop YARN | Garima Sharma, Dr Anita Ganpati | Scheduling policies in YARN |
9. REASON FOR NEW SCHEDULER
• If the higher-priority job is bigger than the others, the small ones
have to wait longer. In FIFO, preemption is not possible.
• Gives unequal shares of the cluster and unpredictable turnaround times.
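The waiting problem described in the points above can be shown with a small sketch. It assumes a single execution slot, no preemption, and made-up job durations; `fifo_wait_times` is a hypothetical helper, not part of Hadoop.

```python
def fifo_wait_times(durations):
    """Return how long each job waits before starting under FIFO.

    durations lists job run times in arrival order. Jobs run one at a
    time and cannot be preempted (illustrative assumption).
    """
    waits, clock = [], 0
    for d in durations:
        waits.append(clock)  # job starts once all earlier jobs finish
        clock += d
    return waits


# A large job ahead of two small ones delays both small jobs.
print(fifo_wait_times([100, 5, 5]))
# The same jobs with the large one last: the small jobs barely wait.
print(fifo_wait_times([5, 5, 100]))
```

With the large job first, the small jobs wait 100 and 105 time units; with it last, they wait 0 and 5.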
14. CONCLUSION
• Experimental results show that the CP scheduler gives better
performance than the existing Hadoop scheduling algorithms in a
GPU-enabled Hadoop cluster.
• As future work, we plan to build a Spark cluster on GPU
nodes and analyze it with the proposed scheduler.
15. REFERENCES
1. C. I. Johnpaul and Neetha Susan Thampi, Distributed in-memory cluster computing approach in Scala for solving graph data applications, International Conference on Advances in Electronics, Computers and Communications, 2014.
2. Anandu Jayan, Akash Nair, Bhargavi R Upadhyay, Supriya M, Performance Analysis of Modified RC4 Cryptographic Algorithm using number of cores in Parallel Execution, IJCTA, 9(21), 2016, pp. 225-231.
3. A. Rasooli, D. G. Down, A hybrid scheduling approach for scalable heterogeneous Hadoop systems, 5th IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS12), Salt Lake City, Utah, USA, 2012.
4. Chao Tian, Haojie Zhou, Yongqiang He, Li Zha, A Dynamic MapReduce Scheduler for Heterogeneous Workloads, Eighth International Conference on Grid and Cooperative Computing, 2009.
5. Mbarka Soualhia, Foutse Khomh and Sofiène Tahar, ATLAS: An Adaptive Failure-aware Scheduler for Hadoop, IEEE 34th International Performance Computing and Communications Conference (IPCCC), 2015.
6. Andre Luckow, Ioannis Paraskevakos, George Chantzialexiou and Shantenu Jha, Hadoop on HPC: Integrating Hadoop and Pilot-based Dynamic Resource Management, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016.
7. Orcun Yildiz, Shadi Ibrahim, Tran Anh Phuong, Chronos: Failure-Aware Scheduling in Shared Hadoop Clusters, 2015.
16. REFERENCES
8. E. Dede, B. Sendir, P. Kuzlu, J. Weachock, M. Govindaraju, and L. Ramakrishnan, Processing Cassandra Datasets with Hadoop-Streaming Based Approaches, IEEE Transactions on Services Computing, vol. 9, no. 1, January/February 2016.
9. Garima Sharma, Dr Anita Ganpati, Performance evaluation of fair and capacity scheduling in Hadoop YARN, 978-1-4673-7910-6/15/31.00, IEEE, 2015.
10. A. C. Murthy, V. K. Vavilapalli, D. Eadline, J. Niemiec, J. Markham, Apache Hadoop YARN: Moving beyond MapReduce and batch processing with Apache Hadoop 2, Pearson Education, 2016.
11. J. Dean and S. Ghemawat, MapReduce: Simplified data processing on large clusters, 6th Symp. Operating Syst. Des. Implementation, 2004.
12. Krish K.R., Ali Anwar, Ali R. Butt, ϕSched: A Heterogeneity-Aware Hadoop Workflow Scheduler, IEEE 22nd International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems, 2014.