SlideShare a Scribd company logo
A Unified Approach for Modeling and Optimization of 
Energy, Makespan and Reliability for Scientific Workflows 
on Large-Scale Computing Infrastructures 
Rafael&Ferreira&da&Silva1,&Thomas&Fahringer2,&Juan&J.&Durillo2,&Ewa&Deelman1& 
& 
Workshop&on&Modeling&&&SimulaBon&of&Systems&and&ApplicaBons& 
August&13H14,&2014,&University&of&Washington,&SeaLle,&Washington& 
1University of Southern California, Information Sciences Institute, Marina Del Rey, CA, USA 
2University of Innsbruck, Technikerstrasse 21a, Innsbruck, Austria
Introduction 
• Scientific workflows are often used to manage large-scale 
computations on HPC and HTC platforms 
• Several studies have been conducted to optimize workflow scheduling 
• However, most existing optimization techniques are limited to single or 
two objectives 
• Research in green computing often address cooling and 
energy usage reduction in large data-centers 
• There are few studies on how resources are used by applications 
• Green computing in scientific workflows 
• Studies are limited to the measurement of energy usage according to 
resource utilization 
• The energy consumption model is simplistic (e.g., homogeneous 
execution nodes) 
2
Research Goals 
• Development of an energy consumption model to address real 
large-scale infrastructure conditions 
• e.g., heterogeneity, resource availability, external loads 
• Validation of the model in a fully instrumented platform able to measure 
the actual temperature and energy consumed by computing, 
networking, and storage systems 
• Development of a multi-objective optimization approach to 
explore workflow execution tradeoffs 
3
Application Model: Scientific Workflows 
• Directed Acyclic Graph (DAG) 
• Nodes denote tasks 
• Edges denote task dependencies 
4 
• Tasks 
• Command-line programs that read one or 
more input files and produce one or more 
output files 
• Compute-intensive or data-intensive 
• Data dependencies 
• Result of output files from one program 
becoming input files for another program
System Model: Distributed Infrastructure 
• Infrastructure as a Service (IaaS) 
• Data and task computations are stored/performed in the infrastructure 
5 
90 
90 
6 
'LVWULEXWHG,QIUDVWUXFWXUH 
 
6XEPLW 
+RVW 
 
 
 
+RVW +RVW 
 
RPPXQLFDWLRQ1HWZRUN 
90 
90 
+ 
1: Application setup: provision of a set of 
parameters and input files uploading 
2: Workflow task scheduling 
3: Output data is stored on the storage 
server 
4: Output data required by the user is 
downloaded from the storage server
Runtime and Reliability Models 
• At Workflow Level (Our Expertise) 
• Collect and summarize performance metrics for workflow applications 
• e.g., process I/O, runtime, memory usage, CPU utilization 
• Profile data is used to build distributions of workflow applications 
• At Infrastructure Level (Looking for a Partner) 
• Collect temperature and energy consumption from execution nodes, 
storage servers, and network systems 
• Requires a fully instrumented platform 
6
Research Dimensions 
• Goal: Multi-objective optimization of energy consumption, 
makespan, and reliability for scientific workflows 
7 
Multi-objective optimization process 
• Monitoring 
• Workflow profile data has been 
collected as part of the DOE dV/dt 
project (ER26110) 
• Temperature and energy 
consumption monitoring requires 
access to a fully instrumented 
infrastructure
8 
Research Dimensions 
• Multi-Objective Optimization 
• The improvement of one optimization criteria may imply in the 
deterioration of another criteria 
• Development of heuristics to reduce the large-search space of workflow 
executions 
• Modeling (Dynamic Optimization) 
• Models will be constantly updated based on the profiling data collected 
during the workflow execution 
• Workflow Execution 
• Conducted with the Pegasus WMS (OCI SI2-SSI #1148515)
9 
Discussions 
• Major Contribution 
• Multi-objective optimization of energy consumption, makespan, and 
reliability for scientific workflows on large-scale computing infrastructures 
• Gaps in Current Research 
• There is no energy-aware profiling of scientific workflow applications 
• Research is focused on the optimization of a single or two objectives 
• Strong assumptions are made (e.g., homogeneous environments) 
• Synergistic Projects 
• dV/dT: Accelerating the Rate of Progress Towards Extreme Scale 
Collaborative Science (DOE ER26110) 
• Pegasus WMS (OCI SI2-SSI #1148515) 
• DOE Sustained Performance, Energy and Resilience (SUPER) project
A Unified Approach for Modeling and Optimization of 
Energy, Makespan and Reliability for Scientific Workflows 
on Large-Scale Computing Infrastructures 
Thankyou. 
rafsilva@isi.edu- 
- 
h/p://pegasus.isi.edu-

More Related Content

What's hot

Improving Query Processing Time of Olap Cube using Olap Operations
Improving Query Processing Time of Olap Cube using Olap OperationsImproving Query Processing Time of Olap Cube using Olap Operations
Improving Query Processing Time of Olap Cube using Olap Operations
IRJET Journal
 
Nathan's Resume - Sept 2016
Nathan's Resume - Sept 2016Nathan's Resume - Sept 2016
Nathan's Resume - Sept 2016
Nathan Sender
 
SC13 BoF: RDA and HPC
SC13 BoF: RDA and HPCSC13 BoF: RDA and HPC
SC13 BoF: RDA and HPC
John Cobb
 
Rethinking data intensive science using scalable analytics systems
 Rethinking data intensive science using scalable analytics systems Rethinking data intensive science using scalable analytics systems
Rethinking data intensive science using scalable analytics systems
newmooxx
 
DataTalks #4: Построение хранилища данных на основе платформы hadoop / Игорь ...
DataTalks #4: Построение хранилища данных на основе платформы hadoop / Игорь ...DataTalks #4: Построение хранилища данных на основе платформы hadoop / Игорь ...
DataTalks #4: Построение хранилища данных на основе платформы hadoop / Игорь ...
WG_ Events
 
Fault Tollerant scheduling system for computational grid
Fault Tollerant scheduling system for computational gridFault Tollerant scheduling system for computational grid
Fault Tollerant scheduling system for computational grid
Ghulam Asfia
 
2015 presentation
2015 presentation2015 presentation
2015 presentation
Nikhil Ghosh
 
Using Excel in project management
Using Excel in project managementUsing Excel in project management
Using Excel in project management
Long Hoàng
 
Reza Farrahi Moghaddam’s Progress Report within the Perspective of the GSTC P...
Reza Farrahi Moghaddam’s Progress Report within the Perspective of the GSTC P...Reza Farrahi Moghaddam’s Progress Report within the Perspective of the GSTC P...
Reza Farrahi Moghaddam’s Progress Report within the Perspective of the GSTC P...
Reza Farrahi Moghaddam, PhD, BEng
 
Renga: a collaborative data science platform
Renga: a collaborative data science platformRenga: a collaborative data science platform
Renga: a collaborative data science platform
rrrrrok
 

What's hot (10)

Improving Query Processing Time of Olap Cube using Olap Operations
Improving Query Processing Time of Olap Cube using Olap OperationsImproving Query Processing Time of Olap Cube using Olap Operations
Improving Query Processing Time of Olap Cube using Olap Operations
 
Nathan's Resume - Sept 2016
Nathan's Resume - Sept 2016Nathan's Resume - Sept 2016
Nathan's Resume - Sept 2016
 
SC13 BoF: RDA and HPC
SC13 BoF: RDA and HPCSC13 BoF: RDA and HPC
SC13 BoF: RDA and HPC
 
Rethinking data intensive science using scalable analytics systems
 Rethinking data intensive science using scalable analytics systems Rethinking data intensive science using scalable analytics systems
Rethinking data intensive science using scalable analytics systems
 
DataTalks #4: Построение хранилища данных на основе платформы hadoop / Игорь ...
DataTalks #4: Построение хранилища данных на основе платформы hadoop / Игорь ...DataTalks #4: Построение хранилища данных на основе платформы hadoop / Игорь ...
DataTalks #4: Построение хранилища данных на основе платформы hadoop / Игорь ...
 
Fault Tollerant scheduling system for computational grid
Fault Tollerant scheduling system for computational gridFault Tollerant scheduling system for computational grid
Fault Tollerant scheduling system for computational grid
 
2015 presentation
2015 presentation2015 presentation
2015 presentation
 
Using Excel in project management
Using Excel in project managementUsing Excel in project management
Using Excel in project management
 
Reza Farrahi Moghaddam’s Progress Report within the Perspective of the GSTC P...
Reza Farrahi Moghaddam’s Progress Report within the Perspective of the GSTC P...Reza Farrahi Moghaddam’s Progress Report within the Perspective of the GSTC P...
Reza Farrahi Moghaddam’s Progress Report within the Perspective of the GSTC P...
 
Renga: a collaborative data science platform
Renga: a collaborative data science platformRenga: a collaborative data science platform
Renga: a collaborative data science platform
 

Viewers also liked

Dynamic modeling
Dynamic modelingDynamic modeling
Dynamic modeling
Preeti Mishra
 
Ooad 3
Ooad 3Ooad 3
Seq uml
Seq umlSeq uml
Unit 3 object analysis-classification
Unit 3 object analysis-classificationUnit 3 object analysis-classification
Unit 3 object analysis-classification
gopal10scs185
 
Dynamic and Static Modeling
Dynamic and Static ModelingDynamic and Static Modeling
Dynamic and Static Modeling
Saurabh Kumar
 
Noun phrases
Noun phrasesNoun phrases
Noun phrases
Academic Supervisor
 
Presentation on Transmission Media
Presentation on Transmission MediaPresentation on Transmission Media
Presentation on Transmission Media
Syed Ahmed Zaki
 
Transmission Media
Transmission MediaTransmission Media
Transmission Media
Sourav Roy
 
Guided media and Unguided media
Guided media and Unguided mediaGuided media and Unguided media
Guided media and Unguided media
Sanaa Sial
 
Uml class-diagram
Uml class-diagramUml class-diagram
Uml class-diagram
ASHOK KUMAR PALAKI
 
Ooad unit – 1 introduction
Ooad unit – 1 introductionOoad unit – 1 introduction
Ooad unit – 1 introduction
Babeetha Muruganantham
 
Ppt for tranmission media
Ppt for tranmission mediaPpt for tranmission media
Ppt for tranmission media
Manish8976
 
Uml diagrams
Uml diagramsUml diagrams
Uml diagrams
barney92
 
UML tutorial
UML tutorialUML tutorial
UML tutorial
Eliza Wright
 
Unified Process
Unified ProcessUnified Process
Unified Process
guy_davis
 
Class diagram presentation
Class diagram presentationClass diagram presentation
Class diagram presentation
SayedFarhan110
 
Different approaches and methods
Different approaches and methodsDifferent approaches and methods
Different approaches and methods
switlu
 

Viewers also liked (17)

Dynamic modeling
Dynamic modelingDynamic modeling
Dynamic modeling
 
Ooad 3
Ooad 3Ooad 3
Ooad 3
 
Seq uml
Seq umlSeq uml
Seq uml
 
Unit 3 object analysis-classification
Unit 3 object analysis-classificationUnit 3 object analysis-classification
Unit 3 object analysis-classification
 
Dynamic and Static Modeling
Dynamic and Static ModelingDynamic and Static Modeling
Dynamic and Static Modeling
 
Noun phrases
Noun phrasesNoun phrases
Noun phrases
 
Presentation on Transmission Media
Presentation on Transmission MediaPresentation on Transmission Media
Presentation on Transmission Media
 
Transmission Media
Transmission MediaTransmission Media
Transmission Media
 
Guided media and Unguided media
Guided media and Unguided mediaGuided media and Unguided media
Guided media and Unguided media
 
Uml class-diagram
Uml class-diagramUml class-diagram
Uml class-diagram
 
Ooad unit – 1 introduction
Ooad unit – 1 introductionOoad unit – 1 introduction
Ooad unit – 1 introduction
 
Ppt for tranmission media
Ppt for tranmission mediaPpt for tranmission media
Ppt for tranmission media
 
Uml diagrams
Uml diagramsUml diagrams
Uml diagrams
 
UML tutorial
UML tutorialUML tutorial
UML tutorial
 
Unified Process
Unified ProcessUnified Process
Unified Process
 
Class diagram presentation
Class diagram presentationClass diagram presentation
Class diagram presentation
 
Different approaches and methods
Different approaches and methodsDifferent approaches and methods
Different approaches and methods
 

Similar to A Unified Approach for Modeling and Optimization of Energy, Makespan and Reliability for Scientific Workflows on Large-Scale Computing Infrastructures

High Performance Data Analytics and a Java Grande Run Time
High Performance Data Analytics and a Java Grande Run TimeHigh Performance Data Analytics and a Java Grande Run Time
High Performance Data Analytics and a Java Grande Run Time
Geoffrey Fox
 
An Introduction to Clinical Study Migrations
An Introduction to Clinical Study MigrationsAn Introduction to Clinical Study Migrations
An Introduction to Clinical Study Migrations
Perficient, Inc.
 
Cloud Strategy
Cloud StrategyCloud Strategy
Cloud Strategy
Richard Harvey
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse Optimization
Cloudera, Inc.
 
Time management
Time managementTime management
Time management
Mostafa Elgamala
 
Efficient & effective data management for research projects : ILRI's Data Ma...
Efficient & effective  data management for research projects : ILRI's Data Ma...Efficient & effective  data management for research projects : ILRI's Data Ma...
Efficient & effective data management for research projects : ILRI's Data Ma...
CIARD Movement
 
3.01.16 HIMSS- Duke RFD
3.01.16 HIMSS- Duke RFD3.01.16 HIMSS- Duke RFD
3.01.16 HIMSS- Duke RFD
Amy Nordo RN, CPHQ, LNC
 
Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks
 Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks
Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks
hydrologywebsite1
 
Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks
 Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks
Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks
hydrologyproject001
 
GouriShankar_Informatica
GouriShankar_InformaticaGouriShankar_Informatica
GouriShankar_Informatica
Gouri Shankar M
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Geoffrey Fox
 
RDM Roadmap to the Future, or: Lords and Ladies of the Data
RDM Roadmap to the Future, or: Lords and Ladies of the DataRDM Roadmap to the Future, or: Lords and Ladies of the Data
RDM Roadmap to the Future, or: Lords and Ladies of the Data
Robin Rice
 
Ariadne: Lifecycles
Ariadne: LifecyclesAriadne: Lifecycles
Ariadne: Lifecycles
ariadnenetwork
 
Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...
David Wallom
 
commissioning-PEG Engineering profile.pdf
commissioning-PEG Engineering profile.pdfcommissioning-PEG Engineering profile.pdf
commissioning-PEG Engineering profile.pdf
Vincent Bui
 
The Protein Regulatory Networks of COVID-19 - A Knowledge Graph Created by El...
The Protein Regulatory Networks of COVID-19 - A Knowledge Graph Created by El...The Protein Regulatory Networks of COVID-19 - A Knowledge Graph Created by El...
The Protein Regulatory Networks of COVID-19 - A Knowledge Graph Created by El...
Neo4j
 
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
confluent
 
TPC-DI - The First Industry Benchmark for Data Integration
TPC-DI - The First Industry Benchmark for Data IntegrationTPC-DI - The First Industry Benchmark for Data Integration
TPC-DI - The First Industry Benchmark for Data Integration
Tilmann Rabl
 
UCPath at UCOP
UCPath at UCOPUCPath at UCOP
UCPath at UCOP
Jeffery Wong
 
Alliance 2017 3891-University of California | Office of The President People...
Alliance 2017  3891-University of California | Office of The President People...Alliance 2017  3891-University of California | Office of The President People...
Alliance 2017 3891-University of California | Office of The President People...
Smart ERP Solutions, Inc.
 

Similar to A Unified Approach for Modeling and Optimization of Energy, Makespan and Reliability for Scientific Workflows on Large-Scale Computing Infrastructures (20)

High Performance Data Analytics and a Java Grande Run Time
High Performance Data Analytics and a Java Grande Run TimeHigh Performance Data Analytics and a Java Grande Run Time
High Performance Data Analytics and a Java Grande Run Time
 
An Introduction to Clinical Study Migrations
An Introduction to Clinical Study MigrationsAn Introduction to Clinical Study Migrations
An Introduction to Clinical Study Migrations
 
Cloud Strategy
Cloud StrategyCloud Strategy
Cloud Strategy
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse Optimization
 
Time management
Time managementTime management
Time management
 
Efficient & effective data management for research projects : ILRI's Data Ma...
Efficient & effective  data management for research projects : ILRI's Data Ma...Efficient & effective  data management for research projects : ILRI's Data Ma...
Efficient & effective data management for research projects : ILRI's Data Ma...
 
3.01.16 HIMSS- Duke RFD
3.01.16 HIMSS- Duke RFD3.01.16 HIMSS- Duke RFD
3.01.16 HIMSS- Duke RFD
 
Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks
 Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks
Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks
 
Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks
 Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks
Download-manuals-surface water-manual-45howtoreviewmonitoringnetworks
 
GouriShankar_Informatica
GouriShankar_InformaticaGouriShankar_Informatica
GouriShankar_Informatica
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
 
RDM Roadmap to the Future, or: Lords and Ladies of the Data
RDM Roadmap to the Future, or: Lords and Ladies of the DataRDM Roadmap to the Future, or: Lords and Ladies of the Data
RDM Roadmap to the Future, or: Lords and Ladies of the Data
 
Ariadne: Lifecycles
Ariadne: LifecyclesAriadne: Lifecycles
Ariadne: Lifecycles
 
Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...
 
commissioning-PEG Engineering profile.pdf
commissioning-PEG Engineering profile.pdfcommissioning-PEG Engineering profile.pdf
commissioning-PEG Engineering profile.pdf
 
The Protein Regulatory Networks of COVID-19 - A Knowledge Graph Created by El...
The Protein Regulatory Networks of COVID-19 - A Knowledge Graph Created by El...The Protein Regulatory Networks of COVID-19 - A Knowledge Graph Created by El...
The Protein Regulatory Networks of COVID-19 - A Knowledge Graph Created by El...
 
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
 
TPC-DI - The First Industry Benchmark for Data Integration
TPC-DI - The First Industry Benchmark for Data IntegrationTPC-DI - The First Industry Benchmark for Data Integration
TPC-DI - The First Industry Benchmark for Data Integration
 
UCPath at UCOP
UCPath at UCOPUCPath at UCOP
UCPath at UCOP
 
Alliance 2017 3891-University of California | Office of The President People...
Alliance 2017  3891-University of California | Office of The President People...Alliance 2017  3891-University of California | Office of The President People...
Alliance 2017 3891-University of California | Office of The President People...
 

More from Rafael Ferreira da Silva

Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...
Rafael Ferreira da Silva
 
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Rafael Ferreira da Silva
 
Good Practices for Developing Scientific Software Frameworks: The WRENCH fram...
Good Practices for Developing Scientific Software Frameworks: The WRENCH fram...Good Practices for Developing Scientific Software Frameworks: The WRENCH fram...
Good Practices for Developing Scientific Software Frameworks: The WRENCH fram...
Rafael Ferreira da Silva
 
WorkflowHub: Community Framework for Enabling Scientific Workflow Research a...
WorkflowHub: Community Framework for Enabling  Scientific Workflow Research a...WorkflowHub: Community Framework for Enabling  Scientific Workflow Research a...
WorkflowHub: Community Framework for Enabling Scientific Workflow Research a...
Rafael Ferreira da Silva
 
Bridging Concepts and Practice in eScience via Simulation-driven Engineering
Bridging Concepts and Practice in eScience via Simulation-driven EngineeringBridging Concepts and Practice in eScience via Simulation-driven Engineering
Bridging Concepts and Practice in eScience via Simulation-driven Engineering
Rafael Ferreira da Silva
 
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific WorkflowsAccurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Rafael Ferreira da Silva
 
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Rafael Ferreira da Silva
 
WRENCH: Workflow Management System Simulation Workbench
WRENCH: Workflow Management System Simulation WorkbenchWRENCH: Workflow Management System Simulation Workbench
WRENCH: Workflow Management System Simulation Workbench
Rafael Ferreira da Silva
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
Rafael Ferreira da Silva
 
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific WorkflowsOn the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
Rafael Ferreira da Silva
 
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Rafael Ferreira da Silva
 
Automating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific WorkflowsAutomating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific Workflows
Rafael Ferreira da Silva
 
Analysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTCAnalysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTC
Rafael Ferreira da Silva
 
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Rafael Ferreira da Silva
 
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Rafael Ferreira da Silva
 
Pegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computationsPegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computations
Rafael Ferreira da Silva
 
Task Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and WorkflowsTask Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and Workflows
Rafael Ferreira da Silva
 
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Rafael Ferreira da Silva
 
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud InfrastructuresExperiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Rafael Ferreira da Silva
 
Leveraging Semantics to Improve Reproducibility in Scientific Workflows
Leveraging Semantics to Improve Reproducibility in Scientific WorkflowsLeveraging Semantics to Improve Reproducibility in Scientific Workflows
Leveraging Semantics to Improve Reproducibility in Scientific Workflows
Rafael Ferreira da Silva
 

More from Rafael Ferreira da Silva (20)

Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...
 
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
 
Good Practices for Developing Scientific Software Frameworks: The WRENCH fram...
Good Practices for Developing Scientific Software Frameworks: The WRENCH fram...Good Practices for Developing Scientific Software Frameworks: The WRENCH fram...
Good Practices for Developing Scientific Software Frameworks: The WRENCH fram...
 
WorkflowHub: Community Framework for Enabling Scientific Workflow Research a...
WorkflowHub: Community Framework for Enabling  Scientific Workflow Research a...WorkflowHub: Community Framework for Enabling  Scientific Workflow Research a...
WorkflowHub: Community Framework for Enabling Scientific Workflow Research a...
 
Bridging Concepts and Practice in eScience via Simulation-driven Engineering
Bridging Concepts and Practice in eScience via Simulation-driven EngineeringBridging Concepts and Practice in eScience via Simulation-driven Engineering
Bridging Concepts and Practice in eScience via Simulation-driven Engineering
 
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific WorkflowsAccurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
 
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
 
WRENCH: Workflow Management System Simulation Workbench
WRENCH: Workflow Management System Simulation WorkbenchWRENCH: Workflow Management System Simulation Workbench
WRENCH: Workflow Management System Simulation Workbench
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
 
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific WorkflowsOn the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
 
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
 
Automating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific WorkflowsAutomating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific Workflows
 
Analysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTCAnalysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTC
 
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
 
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
 
Pegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computationsPegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computations
 
Task Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and WorkflowsTask Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and Workflows
 
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
 
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud InfrastructuresExperiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
 
Leveraging Semantics to Improve Reproducibility in Scientific Workflows
Leveraging Semantics to Improve Reproducibility in Scientific WorkflowsLeveraging Semantics to Improve Reproducibility in Scientific Workflows
Leveraging Semantics to Improve Reproducibility in Scientific Workflows
 

Recently uploaded

Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
Hiike
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
Shinana2
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Tatiana Kojar
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
HarisZaheer8
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
alexjohnson7307
 
Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!
GDSC PJATK
 
Trusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process MiningTrusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process Mining
LucaBarbaro3
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
Dinusha Kumarasiri
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 

Recently uploaded (20)

Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
 
Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!
 
Trusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process MiningTrusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process Mining
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 

A Unified Approach for Modeling and Optimization of Energy, Makespan and Reliability for Scientific Workflows on Large-Scale Computing Infrastructures

  • 1. A Unified Approach for Modeling and Optimization of Energy, Makespan and Reliability for Scientific Workflows on Large-Scale Computing Infrastructures Rafael&Ferreira&da&Silva1,&Thomas&Fahringer2,&Juan&J.&Durillo2,&Ewa&Deelman1& & Workshop&on&Modeling&&&SimulaBon&of&Systems&and&ApplicaBons& August&13H14,&2014,&University&of&Washington,&SeaLle,&Washington& 1University of Southern California, Information Sciences Institute, Marina Del Rey, CA, USA 2University of Innsbruck, Technikerstrasse 21a, Innsbruck, Austria
  • 2. Introduction • Scientific workflows are often used to manage large-scale computations on HPC and HTC platforms • Several studies have been conducted to optimize workflow scheduling • However, most existing optimization techniques are limited to single or two objectives • Research in green computing often address cooling and energy usage reduction in large data-centers • There are few studies on how resources are used by applications • Green computing in scientific workflows • Studies are limited to the measurement of energy usage according to resource utilization • The energy consumption model is simplistic (e.g., homogeneous execution nodes) 2
  • 3. Research Goals • Development of an energy consumption model to address real large-scale infrastructure conditions • e.g., heterogeneity, resource availability, external loads • Validation of the model in a fully instrumented platform able to measure the actual temperature and energy consumed by computing, networking, and storage systems • Development of a multi-objective optimization approach to explore workflow execution tradeoffs 3
  • 4. Application Model: Scientific Workflows • Directed Acyclic Graph (DAG) • Nodes denote tasks • Edges denote task dependencies 4 • Tasks • Command-line programs that read one or more input files and produce one or more output files • Compute-intensive or data-intensive • Data dependencies • Result of output files from one program becoming input files for another program
  • 5. System Model: Distributed Infrastructure • Infrastructure as a Service (IaaS) • Data and task computations are stored/performed in the infrastructure 5 90 90 6 'LVWULEXWHG,QIUDVWUXFWXUH 6XEPLW +RVW +RVW +RVW RPPXQLFDWLRQ1HWZRUN 90 90 + 1: Application setup: provision of a set of parameters and input files uploading 2: Workflow task scheduling 3: Output data is stored on the storage server 4: Output data required by the user is downloaded from the storage server
  • 6. Runtime and Reliability Models • At Workflow Level (Our Expertise) • Collect and summarize performance metrics for workflow applications • e.g., process I/O, runtime, memory usage, CPU utilization • Profile data is used to build distributions of workflow applications • At Infrastructure Level (Looking for a Partner) • Collect temperature and energy consumption from execution nodes, storage servers, and network systems • Requires a fully instrumented platform 6
  • 7. Research Dimensions • Goal: Multi-objective optimization of energy consumption, makespan, and reliability for scientific workflows 7 Multi-objective optimization process • Monitoring • Workflow profile data has been collected as part of the DOE dV/dt project (ER26110) • Temperature and energy consumption monitoring requires access to a fully instrumented infrastructure
  • 8. 8 Research Dimensions • Multi-Objective Optimization • The improvement of one optimization criteria may imply in the deterioration of another criteria • Development of heuristics to reduce the large-search space of workflow executions • Modeling (Dynamic Optimization) • Models will be constantly updated based on the profiling data collected during the workflow execution • Workflow Execution • Conducted with the Pegasus WMS (OCI SI2-SSI #1148515)
  • 9. 9 Discussions • Major Contribution • Multi-objective optimization of energy consumption, makespan, and reliability for scientific workflows on large-scale computing infrastructures • Gaps in Current Research • There is no energy-aware profiling of scientific workflow applications • Research is focused on the optimization of a single or two objectives • Strong assumptions are made (e.g., homogeneous environments) • Synergistic Projects • dV/dT: Accelerating the Rate of Progress Towards Extreme Scale Collaborative Science (DOE ER26110) • Pegasus WMS (OCI SI2-SSI #1148515) • DOE Sustained Performance, Energy and Resilience (SUPER) project
  • 10. A Unified Approach for Modeling and Optimization of Energy, Makespan and Reliability for Scientific Workflows on Large-Scale Computing Infrastructures Thankyou. rafsilva@isi.edu- - h/p://pegasus.isi.edu-