IEEE CLOUD '11
1. Deadline Queries: Leveraging the Cloud to Produce On-Time Results Authors: David Ribeiro Alves, Pedro Bizarro, Paulo Marques
2. In a nutshell: Cluster computing is widely used to solve "big data" problems. Users express the computation through programming abstractions, e.g., MapReduce, but are left with some difficult questions: how many nodes? how long will it take? Proposed solution: users define a deadline; the cluster expands/contracts to meet it.
3. Introducing Deadline Queries: cluster computing tasks that complete within a deadline… while minimizing cost/resource consumption. Independently of: processing capacity per machine, faults or perturbations, initial number of nodes, data size, content or skew, computation complexity.
6. Architecture and Runtime. Ex: SELECT symbol, avg(value), avg(volume) FROM Stocks GROUP BY symbol FINISH IN 900 SEC. [Diagram: the Master Node receives the query, requests nodes from the IaaS Provider, collects metrics from the Worker Nodes (each processing one of Partitions 1 … n), and modifies the cluster accordingly.]
7. Stream Processing: Continuous processing allows phases to start before previous phases complete, and allows progress metrics about the computation as a whole to be gathered continuously. SP provides continuous load balancing, which makes it possible to: take immediate advantage of arriving nodes; deal with temporary or permanent asymmetries; deal with data skew. SP fault tolerance allows the system to respond quickly to faults.
8. MapReduce. SELECT symbol, avg(value), avg(volume) FROM Stocks GROUP BY symbol FINISH IN 900 sec. MapReduce decomposition: Fetch & Transform → Map (Select/Project) → Group → Reduce (Aggregate) → Store Results.
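The decomposition above can be illustrated with plain functions. This is a minimal sketch of the Map and Group/Reduce stages for the example query; the function names and the tuple layout of the input are my own assumptions, not the paper's implementation:

```python
from collections import defaultdict

def map_phase(rows):
    # Map (Select/Project): emit (symbol, (value, volume)) pairs.
    for symbol, value, volume in rows:
        yield symbol, (value, volume)

def reduce_phase(pairs):
    # Group + Reduce (Aggregate): average value and volume per symbol.
    groups = defaultdict(list)
    for symbol, vv in pairs:
        groups[symbol].append(vv)
    return {
        sym: (sum(v for v, _ in vvs) / len(vvs),
              sum(w for _, w in vvs) / len(vvs))
        for sym, vvs in groups.items()
    }

rows = [("AAPL", 10.0, 100), ("AAPL", 20.0, 300), ("GOOG", 5.0, 50)]
result = reduce_phase(map_phase(rows))
```

In the streaming version described later, the Group step becomes routing (tuples with the same symbol are sent to the same reducer) rather than an in-memory dictionary.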
9. Streaming MapReduce - Scaling. Stream Processing => load balancing and fault tolerance in a changing cluster. MapReduce => simple, parallel, scalable programming and execution model.
10. Progress estimation: consumed vs. remaining data + linear regression to estimate the finish time. React accordingly by either expanding or contracting the cluster.
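As a rough illustration of this estimator (my own sketch, not the system's code): fit a least-squares line to (elapsed time, fraction of data consumed) samples and extrapolate to 100%:

```python
def estimate_finish(samples):
    """samples: list of (elapsed_seconds, fraction_consumed) progress
    reports. Fit fraction = a*t + b by least squares and solve for the
    elapsed time at which fraction reaches 1.0."""
    n = len(samples)
    st = sum(t for t, _ in samples)
    sf = sum(f for _, f in samples)
    stt = sum(t * t for t, _ in samples)
    stf = sum(t * f for t, f in samples)
    a = (n * stf - st * sf) / (n * stt - st * st)  # slope: progress rate
    b = (sf - a * st) / n                          # intercept
    return (1.0 - b) / a

# Halfway through the data after 450 s at a steady rate -> ~900 s total.
eta = estimate_finish([(150, 1/6), (300, 2/6), (450, 3/6)])
```

If the estimate exceeds the deadline, the master requests more nodes; if it is comfortably below, the cluster can be contracted to save cost.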
11. Experimental Evaluation - Setup. Real-world environment experiments, on top of Amazon EC2. Running query: SELECT symbol, avg(value), avg(volume) FROM Stocks GROUP BY symbol FINISH IN 900 sec. Used between 1 and 27 machines (m1.large): 2× Dual Core Xeon (2.66 GHz), 7.5 GB of RAM. Experiments show: predicted remaining time; number of nodes.
15. Conclusions. Cloud computing, e.g., IaaS, allows new approaches to cluster computing and new optimization goals. Deadline Queries may help in expressing computation provisioning requirements beyond the number of nodes. Deadline Queries are a viable alternative for implementing hard time limits on query execution. A real implementation and evaluation show the approach is feasible and works as expected.
----- Meeting Notes (10/20/10 14:48) ----- General notes: Be sharper. Focus on the audience. I never say how I will evaluate the system. The Gantt chart is too small.
In particular I'd like to refer to two practical cases: the 1st is that of a Portuguese bank that must finish processing 10M transactions and produce the respective reports by the morning, but has no idea how much machine power it needs to do so. The 2nd is that of a Portuguese telecom company that is actually building the largest Portuguese private cloud, but still has problems allocating nodes to tasks so as to guarantee they complete in time.
Create an animation in a slide or two that describes how the problem was previously dealt with, and our solution; introduce the running example here. Story of the slide: start processing documents (start moving the doc arrow to the cluster); when the system predicts the deadline will be missed (clock turns red)… it starts discarding data or reducing accuracy (put documents in the trash). Mention orally that other systems discard data, but do not dwell here too long; mention that in many cases data cannot be thrown away (previous examples).
Story of the slide: start processing documents (start moving the doc arrow to the cluster); when we see the deadline will be missed (clock turns red)… start expanding the resources used.
Mention that we adopt streaming MapReduce so that we can deal with changes in the cluster.
Transform the task into a dataflow and split the data into partitions. Request nodes and assign dataflow parts to them. Nodes fetch partitions from a queue and insert them into the dataflow. Nodes send report updates to the master, which decides if more nodes are needed and, if so… **CLICK** New nodes are added to the computation. The fact that we use streaming MapReduce allows us to: deal with data skew, by using streaming routing techniques; deal with faults relatively quickly.
Transform the task into a dataflow and split the data into partitions. Request nodes and assign dataflow parts to them. Nodes fetch partitions from a queue and insert them into the dataflow. Nodes send report updates to the master, which decides if more nodes are needed and, if so… **CLICK** New nodes are added to the computation. Use load-balanced content-insensitive routing where possible; use load-balanced content-sensitive routing where needed. ----- Meeting Notes (6/27/11 19:18) ----- Put in the machine names. ----- Meeting Notes (6/29/11 14:33) ----- "Efficient" is too relative a word.
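The two routing policies in the notes above can be sketched as follows. This is a minimal illustration; the class and method names are mine, not the system's:

```python
class Router:
    """Two routing policies for a streaming MapReduce dataflow:
    - content-insensitive: any worker may take the tuple, so pick the
      least loaded one (fine for stateless operators such as maps);
    - content-sensitive: tuples with the same grouping key must reach
      the same worker (needed for reduces), so route by key hash."""

    def __init__(self, workers):
        self.workers = list(workers)
        self.load = {w: 0 for w in self.workers}

    def route_insensitive(self):
        # Least-loaded worker wins; this balances load continuously.
        w = min(self.workers, key=self.load.get)
        self.load[w] += 1
        return w

    def route_sensitive(self, key):
        # Deterministic within one run: same key -> same worker.
        return self.workers[hash(key) % len(self.workers)]
```

Content-sensitive routing is what replaces the Group step of batch MapReduce; under data skew, a hot key concentrates load on one worker, which is why the insensitive policy is used wherever the operator allows it.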
Experiment 1 – Varying Initial Cluster Size. Click 1 – The experiment starts with 1 node. Click 2 – At first we
Series 1. Let's see the experiments. We begin by executing the query starting with one node. **CLICK** The system starts execution with 1 node. At first there are no statistics on progress, so nothing can be said about whether the deadline will be met. **CLICK** As soon as the system detects the deadline will be missed ----- Meeting Notes (6/29/11 14:33) ----- Threshold on deadline fault detection. Threshold on max number of machines. Interesting story to tell about how it would behave under various cost models. speculOS
Clarify orally how the perturbations were injected (Linux commands).
Normal operation: Maps process single partitions and tag results with part_id. Partial reduces maintain per-partition windows. Total reduces maintain a tentative set, where results are separated partition-wise. Upon receiving a part_end punctuation. When faults occur: the master notifies the remaining nodes that the node has failed (so they know not to receive data from that node). Nodes discard all data from that partition (partial reduces discard the partition's window set and total reduces discard the partition's group in the tentative set). ----- Meeting Notes (6/27/11 18:55) ----- Turn this into two slides.
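The partition-wise tentative set and the fault-discard step described above might look like this in outline. This is an illustrative sketch under my own naming; the actual system is a distributed streaming MapReduce engine:

```python
class TotalReduce:
    """Partial results are kept 'tentative', keyed by partition id.
    A partition only becomes final when its part_end punctuation
    arrives; on a worker fault, the master can therefore discard the
    failed node's partitions wholesale and requeue them."""

    def __init__(self):
        self.tentative = {}   # part_id -> {key: running aggregate}
        self.final = {}       # key -> committed aggregate

    def receive(self, part_id, key, value):
        # Accumulate a tagged result into that partition's group.
        agg = self.tentative.setdefault(part_id, {})
        agg[key] = agg.get(key, 0) + value

    def part_end(self, part_id):
        # Punctuation: the partition is fully processed; commit it.
        for key, value in self.tentative.pop(part_id, {}).items():
            self.final[key] = self.final.get(key, 0) + value

    def fault(self, part_ids):
        # Master reports a failed node: discard the tentative state of
        # its partitions so they can be reassigned and reprocessed.
        for p in part_ids:
            self.tentative.pop(p, None)
```

Because a failed partition never touched the final set, recovery is simply "discard and reprocess", which is what lets the system respond to faults quickly while the deadline clock is running.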