Presented by Ahmed Abdulhakim Al-Absi - Scaling MapReduce Applications across Hybrid Clouds to Meet Soft Deadlines
Scaling MapReduce Applications across Hybrid Clouds to Meet Soft Deadlines - by Michael Mattess, Rodrigo N. Calheiros, and Rajkumar Buyya. Proceedings of the 27th IEEE International Conference on Advanced Information Networking and Applications (AINA 2013), IEEE CS Press, USA, Barcelona, Spain, March 25-28, 2013.

Transcript

  • 1. Scaling MapReduce Applications across Hybrid Clouds to Meet Soft Deadlines Dongseo University Division of Computer & Information Engineering Intelligent Smart Systems Research Lab Presented by: Ahmed Abdulhakim Al-Absi
  • 2. About the Paper  2013 IEEE International Conference on Advanced Information Networking and Applications (AINA)  Cloud Computing and Distributed Systems (CLOUDS) Laboratory, University of Melbourne, Australia  Michael Mattess, Rodrigo N. Calheiros, and Rajkumar Buyya. Proceedings of the 27th IEEE International Conference on Advanced Information Networking and Applications (AINA 2013), IEEE CS Press, USA, Barcelona, Spain, March 25-28, 2013.
  • 3. Outline  Motivation  Proposed Policy  Proposed Policy Scenario  Policy Parameters  Policy Parameters Algorithm  Implementation  Performance Evaluation  Experimental Testbed and Sample Application  Performance Analysis  Results – Map Phase  Results – Reduce Phase  Conclusion  My Opinion
  • 4. Motivation  As MapReduce becomes the prevalent programming model for building data processing applications in Clouds, timely execution of such applications becomes a necessity.  Existing approaches for executing deadline-constrained MapReduce applications focus on meeting deadlines via admission control. However, admission control alone cannot resolve conflicts when deadline-constrained applications compete for limited resources.
  • 5. Proposed Policy  To tackle this limitation of current approaches, the authors propose a novel policy for dynamic provisioning of public Cloud resources to speed up execution of MapReduce applications.  The authors apply the soft deadline to the Map phase, because this phase contains tasks that generally have uniform execution times.
  • 6. Policy System 1. The initial state of the system consists of  local worker nodes registered with the master node and ready to execute MapReduce tasks;  datasets available to the local workers and also stored in the Cloud provider’s storage service. 2. A MapReduce application is submitted to the master node and the scheduler begins assigning Map tasks to worker nodes. 3. When a predefined number or fraction of the Map tasks completes, the master applies the provisioning policy and, based on the application deadline, decides how many public Cloud resources to request.
  • 7. Policy System (continued) 4. Requested resources register with the master node as they become available, and the scheduler assigns Map tasks to them. 5. When the Map phase completes, the scheduler begins assigning Reduce tasks to workers. 6. Each Reduce task obtains the intermediate data to be reduced from other nodes. 7. The output of the Reduce tasks may remain on the nodes in anticipation of another MapReduce application that will further process the data; alternatively, the output can be collected by the master node and sent back to the user who submitted the application.
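The steps above can be sketched as a minimal, self-contained simulation. All class and function names here are illustrative assumptions (this is not the Aneka API), and the trigger fraction and provisioning rule are placeholders:

```python
import queue

# Minimal sketch of the master-node workflow (steps 1-7 above).
# Names are illustrative, not Aneka APIs.
class Master:
    def __init__(self, workers):
        self.workers = list(workers)   # step 1: registered local workers
        self.done = []

    def run(self, map_tasks, reduce_tasks, provision, trigger=0.25):
        pending = queue.Queue()
        for t in map_tasks:
            pending.put(t)
        total, provisioned = len(map_tasks), False
        while not pending.empty():     # step 2: assign Map tasks
            self.done.append(f"map:{pending.get()}")
            if not provisioned and len(self.done) / total >= trigger:
                # step 3: once a fraction of Map tasks completes, consult
                # the provisioning policy; step 4: new workers register
                self.workers += provision(total - len(self.done))
                provisioned = True
        for t in reduce_tasks:         # steps 5-6: Reduce phase
            self.done.append(f"reduce:{t}")
        return self.done               # step 7: output collected by master

# Usage: request one public Cloud worker per 4 remaining Map tasks.
m = Master(workers=["local-1", "local-2"])
out = m.run(range(8), range(2),
            provision=lambda remaining: [f"cloud-{i}" for i in range(remaining // 4)])
```

In this toy run, the policy is consulted after 25% of Map tasks finish and one extra worker joins the pool before the remaining tasks are scheduled.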
  • 8. Proposed Policy Scenario (figure)
  • 9. Policy Parameters  MARGIN is a ‘safety margin’ subtracted from the remaining time to account for errors in predicting task execution times.  LOCAL FACTOR is a multiplier applied to the average run time of the first batch of Map tasks to estimate the expected Map task execution time on the local Cluster, accounting for variation in worker performance.  REMOTE FACTOR similarly predicts the expected Map task execution time on public Cloud resources, accounting for the expected performance difference between local and public Cloud resources.  BOOT TIME is the expected amount of time between requesting a new resource from the public Cloud and that resource becoming available to execute tasks.
  • 10. Policy Parameters Algorithm The algorithm decides the number of public Cloud resources to be allocated to a MapReduce application.
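A minimal sketch of such a decision, using the parameters from the previous slide (MARGIN, LOCAL FACTOR, REMOTE FACTOR, BOOT TIME). The exact formula in the paper may differ, so treat this as an assumption-laden illustration rather than the authors' algorithm:

```python
import math

def extra_cloud_workers(remaining_tasks, avg_task_time, local_workers,
                        time_to_deadline, margin, local_factor,
                        remote_factor, boot_time):
    """Estimate how many public Cloud workers to request so the remaining
    Map tasks can finish before the soft deadline. Illustrative only."""
    budget = time_to_deadline - margin                # MARGIN: safety buffer
    local_task_time = avg_task_time * local_factor    # LOCAL FACTOR
    remote_task_time = avg_task_time * remote_factor  # REMOTE FACTOR

    # Tasks the local workers can finish within the remaining budget.
    local_capacity = local_workers * (budget // local_task_time)
    overflow = remaining_tasks - local_capacity
    if overflow <= 0:
        return 0          # deadline can be met with local resources alone

    # Cloud workers only start contributing after BOOT TIME.
    remote_budget = budget - boot_time
    if remote_budget <= 0:
        return 0          # too late for provisioning to help
    tasks_per_remote = max(1, remote_budget // remote_task_time)
    return math.ceil(overflow / tasks_per_remote)
```

For example, with 300 Map tasks left, an average task time of 60 s, 8 local workers, 15 minutes to the deadline, a 60 s margin, and a 300 s boot time, the sketch requests additional Cloud workers; with a looser deadline it returns 0.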
  • 11. Implementation  Implemented in the Aneka Cloud Platform with some changes:  Dynamic Provisioning: the existing MapReduce Service of Aneka was extended to interact with the Provisioning Service.  Data Exchange: Aneka MapReduce was enabled to work across local and remote resources, using HTTP and IIS to serve the intermediary data files.  Remote Storage: S3 is used as the source of input data files when running on EC2.
  • 12. Performance Evaluation Experimental Testbed and Sample Application  The environment is a hybrid Cloud composed of a local Cluster and a public Cloud:  Local Cluster: 4 IBM System X3200 M3 servers running Citrix XenServer, each hosting 2 Windows 2008 virtual machines, for a total of 8 worker nodes.  Public Cloud Resources: provisioned from Amazon EC2, US East Coast data center; m1.small instances with a 1.0-1.2 GHz CPU and 1.7 GB of memory.  Dataset: 4.5 GB, replicated on the local Cluster and in S3.
  • 13. Performance Evaluation Experimental Testbed and Sample Application (figure)
  • 14. Performance Analysis Results – Map Phase The authors sequentially executed several requests for the word count application. For each request, they modified the sleep time in each Map task while keeping the deadline for completing the Map phase constant at 30 minutes per application.
  • 15. Performance Analysis Results – Map Phase
  • 16. Performance Analysis Results – Reduce Phase Two datasets were used: 352 MB (small data) and 3422 MB (big data). Similarly, a sleep was inserted into the Reduce tasks to increase their execution time and observe how different Reduce task sizes affect application execution time. Two sleep values were used: 60 seconds and 5 seconds.
  • 17. Conclusion  This paper presented a dynamic provisioning policy for MapReduce applications and a prototype in the Aneka Cloud Platform.  Results showed that the approach, despite its low complexity, delivers good results.  The policy was able to meet application deadlines, defined in terms of the completion time of the Map phase, for increasing Map task execution times.
  • 18. My Opinion
  • 19. Q & A Thank You!