Characterizing a High Throughput Computing Workload:
The Compact Muon Solenoid (CMS) Experiment at LHC
Rafael Ferreira da Silva1, Mats Rynge1, Gideon Juve1, Igor Sfiligoi2,
Ewa Deelman1, James Letts1, Frank Würthwein2, Miron Livny3
International Conference on Computational Science
June 1-3, 2015, Reykjavik, Iceland
1 University of Southern California, Information Sciences Institute, Marina Del Rey, CA, USA
2 University of California at San Diego, Department of Physics, La Jolla, CA, USA
3 University of Wisconsin Madison, Madison, WI, USA
Introduction
•  Scheduling and resource provisioning methods assume that accurate estimates of task characteristics are available
•  Accurate estimates are hard to compute in production systems
•  Compact Muon Solenoid (CMS) experiment at the Large
Hadron Collider (LHC)
•  Process millions of jobs submitted by hundreds of users
•  The efficiency of the workload execution and resource utilization
depends on how these jobs are scheduled and resources are
provisioned
Diagram: Scheduling and Resource Provisioning Algorithms
Task Characteristics: Runtime, Disk Space, Memory Consumption
Overview of the Resource Provisioning Loop
Diagram components: Workload Characterization, Resource Allocation, Execution, Monitoring, Workload Archive, Execution Traces, Workload Estimation; project label: dV/dt
What is covered in this work?
Diagram components (resource provisioning loop): Workload Characterization, Resource Allocation, Execution, Monitoring, Workload Archive, Execution Traces, Workload Estimation; project label: dV/dt
Workload Characteristics
Characteristic                                        Data
General workload
  Total number of jobs                                1,435,280
  Total number of users                               392
  Total number of execution sites                     75
  Total number of execution nodes                     15,484
Jobs statistics
  Completed jobs                                      792,603
  Preempted jobs                                      257,230
  Exit code (!= 0)                                    385,447
  Average job runtime (in seconds)                    9,444.6
  Standard deviation of job runtime (in seconds)      14,988.8
  Average disk usage (in MB)                          55.3
  Standard deviation of disk usage (in MB)            219.1
  Average memory usage (in MB)                        217.1
  Standard deviation of memory usage (in MB)          659.6
Characteristics of the CMS workload for a period of a month (Aug 2014)
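The job counts in the table can be read as shares of the total, and each mean and standard deviation pair as a coefficient of variation, which makes the spread of the three characteristics easier to compare. A minimal Python sketch using only the figures listed above (the variable names are illustrative):

```python
# Figures copied from the workload table (Aug 2014).
total_jobs = 1_435_280
outcomes = {"completed": 792_603, "preempted": 257_230, "nonzero_exit": 385_447}

# (mean, standard deviation) per job characteristic.
stats = {
    "runtime_s": (9_444.6, 14_988.8),
    "disk_mb": (55.3, 219.1),
    "memory_mb": (217.1, 659.6),
}

# Share of each outcome relative to the total number of jobs.
shares = {name: count / total_jobs for name, count in outcomes.items()}

# Coefficient of variation = standard deviation / mean.
cv = {name: sd / mean for name, (mean, sd) in stats.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
for name, value in cv.items():
    print(f"{name}: CV = {value:.2f}")
```

The three outcome counts add up to the total number of jobs, and every coefficient of variation is above 1, so the standard deviation exceeds the mean for runtime, disk, and memory alike.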
Workload Execution Profiling
•  The workload shows similar behavior to the workload analysis
conducted in [Sfiligoi 2013]
•  The magnitude of the job runtimes varies among users and tasks
Figure: Job runtimes by user, sorted by per-user mean job runtime
Figure: Job runtimes by task, sorted by per-task mean job runtime
Workload Execution Profiling (2)
Figure: Job start time rate. Colors represent different execution sites; job distribution is relatively balanced among sites
Figure: Job completion time rate. Colors represent different job status
Axes: Time (hours) vs. Number of Jobs
Workload Characterization
Figure: correlation matrix (ellipse plot) of the job properties site, host, factory, exitCode, jobStatus, queueTime, startTime, completionTime, duration, remoteWallClockTime, numJobStarts, numRequestedCpus, remoteSysCpu, remoteUserCpu, diskUsage, diskRequested, diskProvisioned, inputsSize, memoryProvisioned, imageSize, memoryRequested, command, executableSize, blTaskID, user. Figure annotation: Trivial correlations
•  Correlation Statistics
•  Weak correlations suggest that none of the properties can be directly used to predict future workload behaviors
•  Two variables are correlated when their ellipse is narrow, approaching a line
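A pairwise correlation check of this kind can be sketched in plain Python. The property names come from the slide, but the records are made-up values chosen so that one pair is nearly identical by construction; they are not taken from the CMS traces. Pearson and Spearman are used here as common correlation measures; the slide does not say which measure was used.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical records (made-up values, not from the CMS traces).
jobs = {
    "duration":            [120, 950, 4000, 15000, 300, 7200],
    "remoteWallClockTime": [125, 960, 4100, 15200, 310, 7300],
    "inputsSize":          [40, 12, 33, 18, 25, 9],
}
names = list(jobs)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(f"{a} vs {b}: pearson={pearson(jobs[a], jobs[b]):+.2f} "
              f"spearman={spearman(jobs[a], jobs[b]):+.2f}")
```

A rank-based measure such as Spearman is less sensitive to heavy-tailed values than Pearson, which is one reason to compute both when the data are far from Normal.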
Workload Characterization (2)
Figure: probability density of Job Runtime (sec), Disk Usage (MB), and Memory Usage (MB), each plotted on a logarithmic x-axis
•  Correlation measures are sensitive to the data distribution
•  Probability Density Functions
•  Do not fit any of the most common families of density functions (e.g. Normal or Gamma)
•  Our approach
•  Statistical recursive partitioning method to combine properties from the workload to build Regression Trees
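The idea of recursive partitioning can be illustrated with a deliberately simplified stand-in: the sketch below splits greedily on the threshold that most reduces the squared error of the target, whereas conditional inference trees choose splits with statistical significance tests. The job records, the `min_leaf` and `depth` parameters, and all function names are hypothetical.

```python
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, target, features, min_leaf):
    """Return (gain, feature, threshold) of the best binary split, or None."""
    base = sse([r[target] for r in rows])
    best = None
    for f in features:
        # Every distinct value except the largest is a candidate threshold.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [r[target] for r in rows if r[f] <= t]
            right = [r[target] for r in rows if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = base - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def build_tree(rows, target, features, min_leaf=2, depth=3):
    """Recursively partition rows; each leaf keeps the target values it holds."""
    split = best_split(rows, target, features, min_leaf) if depth > 0 else None
    if split is None or split[0] <= 0:
        return {"leaf": [r[target] for r in rows]}
    _, f, t = split
    le_rows = [r for r in rows if r[f] <= t]
    gt_rows = [r for r in rows if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "le": build_tree(le_rows, target, features, min_leaf, depth - 1),
        "gt": build_tree(gt_rows, target, features, min_leaf, depth - 1),
    }

def find_leaf(tree, row):
    """Follow the split rules down to the leaf that matches a job."""
    while "leaf" not in tree:
        branch = "le" if row[tree["feature"]] <= tree["threshold"] else "gt"
        tree = tree[branch]
    return tree["leaf"]

# Hypothetical jobs (made-up values): runtime depends mostly on executableSize.
jobs = [
    {"executableSize": 20, "inputsSize": 5, "runtime": 900},
    {"executableSize": 22, "inputsSize": 30, "runtime": 1100},
    {"executableSize": 24, "inputsSize": 12, "runtime": 1000},
    {"executableSize": 30, "inputsSize": 8, "runtime": 20000},
    {"executableSize": 31, "inputsSize": 29, "runtime": 26000},
    {"executableSize": 33, "inputsSize": 15, "runtime": 23000},
]
tree = build_tree(jobs, "runtime", ["executableSize", "inputsSize"])
print(tree["feature"], tree["threshold"])
print(find_leaf(tree, {"executableSize": 32, "inputsSize": 10}))
```

Each leaf keeps the raw target values rather than a single mean, so a distribution can later be fitted to the values that fall into it.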
Regression Trees
•  The recursive algorithm looks for PDFs that fit a family of density functions
•  In this work, we consider the Normal and Gamma distributions
•  Measured with the Kolmogorov-Smirnov test (K-S test)
Figure: regression tree (box plots at the leaves, y-axis from 0 to 80000)
1: executableSize (p < 0.001)
   ≤ 27: 2: executableSize (p = 0.004)
         ≤ 25: Node 3 (n = 522)
         > 25: Node 4 (n = 19)
   > 27: 5: inputsSize (p < 0.001)
         ≤ 28: 6: numJobStarts (p = 0.02)
               ≤ 0: Node 7 (n = 2161)
               > 0: Node 8 (n = 152)
         > 28: 9: numJobStarts (p = 0.002)
               ≤ 1: Node 10 (n = 26411)
               > 1: Node 11 (n = 665)
Figure: probability density of Runtime (sec) for one tree node
The PDF for the tree node (in blue) fits a Gamma distribution (in grey) with the following parameters:
Shape parameter = 12
Rate parameter = 5 × 10^-4
Mean = 27414.8
p-value = 0.17
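For a Gamma distribution with shape k and rate λ the mean is k/λ. With the rounded parameters on the slide this gives 12 / (5 × 10^-4) = 24,000 s, somewhat below the listed mean of 27414.8; the listed value was presumably computed from the data or from unrounded parameters, but the slide does not say. The sketch below shows how a K-S check against a Gamma distribution could look in plain Python. It runs on synthetic samples rather than CMS data, relies on the integer shape to write the CDF in closed form (Erlang), and uses the standard asymptotic approximation for the p-value, which does not account for parameters fitted from the same data. All function names are illustrative.

```python
import math
import random

def erlang_cdf(x, shape, rate):
    """CDF of a Gamma distribution with integer shape (Erlang)."""
    if x <= 0:
        return 0.0
    z = rate * x
    term, total = 1.0, 1.0          # k = 0 term of sum(z**k / k!)
    for k in range(1, shape):
        term *= z / k
        total += term
    return 1.0 - math.exp(-z) * total

def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF and the reference CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_pvalue(d, n, terms=100):
    """Asymptotic Kolmogorov p-value with a small-sample correction."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:                   # series converges slowly; p is ~1 here
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2.0 * s))

SHAPE, RATE = 12, 5e-4              # rounded parameters from the slide
print("mean from parameters:", SHAPE / RATE)

rng = random.Random(42)
# random.gammavariate takes shape and scale, where scale = 1 / rate.
samples = [rng.gammavariate(SHAPE, 1.0 / RATE) for _ in range(500)]
d = ks_statistic(samples, lambda x: erlang_cdf(x, SHAPE, RATE))
p = ks_pvalue(d, len(samples))
print(f"D = {d:.4f}, p = {p:.3f}")
```

A large p-value means the test finds no evidence against the candidate family; it does not prove that the data follow it.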
Job Estimation Process
•  Based on the regression trees
•  We built a regression tree per user
•  Estimates are generated according to a distribution
(Normal, Gamma, or Uniform)
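Drawing an estimate from the distribution attached to a tree leaf could look like the sketch below. The rule format, the user names, and most parameter values are hypothetical; the Gamma shape and rate reuse the rounded values from the Regression Trees slide. The lookup of the matching leaf in a user's tree is omitted, so each user maps directly to one rule.

```python
import random

def draw_estimate(rule, rng):
    """Sample one estimate from the distribution attached to a tree leaf."""
    kind = rule["dist"]
    if kind == "normal":
        value = rng.gauss(rule["mean"], rule["sd"])
    elif kind == "gamma":
        # random.gammavariate expects shape and scale (scale = 1 / rate).
        value = rng.gammavariate(rule["shape"], 1.0 / rule["rate"])
    elif kind == "uniform":
        value = rng.uniform(rule["low"], rule["high"])
    else:
        raise ValueError(f"unknown distribution: {kind}")
    # Runtimes and sizes cannot be negative, so clamp Normal draws at zero.
    return max(value, 0.0)

# Hypothetical rules, one per user (illustrative values only).
rules = {
    "alice": {"dist": "gamma", "shape": 12, "rate": 5e-4},
    "bob": {"dist": "normal", "mean": 55.0, "sd": 10.0},
    "carol": {"dist": "uniform", "low": 100.0, "high": 200.0},
}

rng = random.Random(7)
for user, rule in rules.items():
    print(user, round(draw_estimate(rule, rng), 1))
```

Passing the random generator in explicitly keeps the draws reproducible, which helps when comparing estimates across experiments.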
Experimental Results
Figure panels: Job Runtime, Disk Usage, Memory Usage
Average accuracy of the workload dataset
The training set is defined as a portion of the entire workload dataset
The median accuracy increases as more data is used for the training set
Experimental Results (2)
•  Number of Rules per Distribution
•  Runtime: better fits Gamma distributions
•  Disk: better fits Normal distributions
•  Memory: better fits Normal distributions
Figure annotations: accuracy above 60%; Fits mostly Normal distributions; Specialization
Prediction of Future Workloads
•  Experiment Conditions
•  Used the workload from Aug 2014 to predict job requirements for
October 2014
•  Experiment Results
•  Median estimation accuracy
Runtime: 82% (50% 1st quartile, 94% 3rd quartile)
Disk and Memory consumption: over 98%
Characteristic                      Data
General workload
  Total number of jobs              1,638,803
  Total number of users             408
Jobs statistics
  Completed jobs                    810,567
Characteristics of the CMS workload for a period of a month (October 2014)
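Summarizing estimation accuracy by its median and quartiles can be sketched as follows. The slides do not define the accuracy metric, so the one used here (1 minus the error relative to the larger of the two values) is an assumption, and the estimate and actual pairs are made-up numbers.

```python
from statistics import quantiles

def accuracy(estimate, actual):
    """Assumed metric: 1 minus the error relative to the larger value.

    Expects non-negative inputs, which keeps the result within [0, 1].
    """
    largest = max(estimate, actual)
    if largest == 0:
        return 1.0
    return 1.0 - abs(estimate - actual) / largest

# Hypothetical (estimate, actual) runtime pairs in seconds.
pairs = [
    (900, 1000), (5000, 2500), (7200, 7000), (300, 1200),
    (10000, 9500), (60, 45), (3600, 3600), (800, 2000),
]
scores = [accuracy(e, a) for e, a in pairs]

# quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
q1, median, q3 = quantiles(scores, n=4)
print(f"median {median:.0%} ({q1:.0%} 1st quartile, {q3:.0%} 3rd quartile)")
```

Reporting quartiles alongside the median shows how widely accuracy varies across jobs, which a single average would hide.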
Conclusion
•  Contributions
•  Workload characterization of 1,435,280 jobs
•  Use of a statistical recursive partitioning algorithm and conditional inference trees to identify patterns
•  Estimation process to predict job characteristics
•  Experimental Results
•  Adequate estimates can be attained for job runtime
•  Nearly optimal estimates are obtained for disk and memory consumption
•  Remarks
•  Data collection process should be refined to gather finer information
•  Applications should provide mechanisms to distinguish custom user codes from the standard executable
Thank you.
rafsilva@isi.edu
http://pegasus.isi.edu
This work was funded by DOE under the contract number ER26110, “dV/dt - Accelerating the Rate of Progress
Towards Extreme Scale Collaborative Science”

 
Workflow fairness control on online and non-clairvoyant distributed computing...
Workflow fairness control on online and non-clairvoyant distributed computing...Workflow fairness control on online and non-clairvoyant distributed computing...
Workflow fairness control on online and non-clairvoyant distributed computing...
Rafael Ferreira da Silva
 
VIP: design and implementation of the portal and execution service
VIP: design and implementation of the portal and execution serviceVIP: design and implementation of the portal and execution service
VIP: design and implementation of the portal and execution service
Rafael Ferreira da Silva
 

Characterizing a High Throughput Computing Workload: The Compact Muon Solenoid (CMS) Experiment at LHC

  • 1. Characterizing a High Throughput Computing Workload: The Compact Muon Solenoid (CMS) Experiment at LHC
      Rafael Ferreira da Silva1, Mats Rynge1, Gideon Juve1, Igor Sfiligoi2, Ewa Deelman1, James Letts1, Frank Würthwein2, Miron Livny3
      International Conference on Computational Science
      June 1-3, 2015, Reykjavik, Iceland
      1 University of Southern California, Information Sciences Institute, Marina Del Rey, CA, USA
      2 University of California at San Diego, Department of Physics, La Jolla, CA, USA
      3 University of Wisconsin Madison, Madison, WI, USA
  • 2. Introduction
      •  Methods assume that accurate estimates are available
      •  It is hard to compute accurate estimates in production systems
      •  Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC)
          •  Process millions of jobs submitted by hundreds of users
          •  The efficiency of the workload execution and resource utilization depends on how these jobs are scheduled and resources are provisioned
      [Diagram: Scheduling and Resource Provisioning Algorithms. Task Characteristics: Runtime, Disk Space, Memory Consumption]
  • 3. Overview of the Resource Provisioning Loop
      [Diagram: dV/dt loop with the elements Workload Characterization, Resource Allocation, Execution, Monitoring, Workload Archive, Execution Traces, Workload Estimation]
  • 4. What is covered in this work?
      [Diagram: dV/dt loop with the elements Workload Characterization, Resource Allocation, Execution, Monitoring, Workload Archive, Execution Traces, Workload Estimation]
  • 5. Workload Characteristics
      Characteristic                                        Data
      General workload
        Total number of jobs                                1,435,280
        Total number of users                               392
        Total number of execution sites                     75
        Total number of execution nodes                     15,484
      Job statistics
        Completed jobs                                      792,603
        Preempted jobs                                      257,230
        Exit code (!= 0)                                    385,447
        Average job runtime (in seconds)                    9,444.6
        Standard deviation of job runtime (in seconds)      14,988.8
        Average disk usage (in MB)                          55.3
        Standard deviation of disk usage (in MB)            219.1
        Average memory usage (in MB)                        217.1
        Standard deviation of memory usage (in MB)          659.6
      Table: Characteristics of the CMS workload for a period of a month (Aug 2014)
  • 6. Workload Execution Profiling
      •  The workload shows similar behavior to the workload analysis conducted in [Sfiligoi 2013]
      •  The magnitude of the job runtimes varies among users and tasks
      [Figure: Job runtimes by user, sorted by per-user mean job runtime]
      [Figure: Job runtimes by task, sorted by per-task mean job runtime]
  • 7. Workload Execution Profiling (2)
      [Plot: Number of Jobs versus Time (hours)]
      •  Job start time rate: colors represent different execution sites; job distribution is relatively balanced among sites
      •  Job completion time rate: colors represent different job status
  • 8. Workload Characterization
      •  Correlation Statistics
          •  Weak correlations suggest that none of the properties can be directly used to predict future workload behaviors
          •  Two variables are strongly correlated when their ellipse is narrow, approaching a line
      [Figure: pairwise correlation matrix drawn as ellipses, with trivial correlations marked. Properties: blTaskID, user, site, host, factory, exitCode, jobStatus, queueTime, startTime, completionTime, duration, remoteWallClockTime, numJobStarts, numRequestedCpus, remoteSysCpu, remoteUserCpu, diskUsage, diskRequested, diskProvisioned, inputsSize, memoryProvisioned, imageSize, memoryRequested, command, executableSize]
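A minimal sketch of the pairwise correlation analysis described on this slide. The slide does not say which correlation coefficient was used, so Pearson's coefficient is an assumption here, and the `jobs` records are hypothetical toy values that only borrow property names from the figure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear relationship
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map every ordered pair of column names to its Pearson coefficient."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


# Hypothetical toy records, not values from the CMS workload.
jobs = {
    "duration": [120.0, 300.0, 45.0, 800.0, 60.0],
    "remoteWallClockTime": [125.0, 310.0, 50.0, 815.0, 66.0],
    "diskUsage": [50.0, 12.0, 75.0, 30.0, 8.0],
}
matrix = correlation_matrix(jobs)
for (a, b), r in matrix.items():
    if a < b:
        print(f"{a} vs {b}: r = {r:.3f}")
```

In this toy data, duration and remoteWallClockTime measure nearly the same quantity, which is the kind of pair the slide labels a trivial correlation.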
  • 9. Workload Characterization (2)
      [Plots: Probability Density versus Job Runtime (sec), Disk Usage (MB), and Memory Usage (MB)]
      •  Correlation measures are sensitive to the data distribution
      •  Probability Density Functions
          •  Do not fit any of the most common density families (e.g. Normal or Gamma)
      •  Our approach
          •  Statistical recursive partitioning method to combine properties from the workload to build Regression Trees
  • 10. Regression Trees
      •  The recursive algorithm looks for PDFs that fit a family of density
      •  In this work, we consider the Normal and Gamma distributions
      •  Measured with the Kolmogorov-Smirnov test (K-S test)
      [Figure: regression tree; each leaf node shows a box plot on an axis from 0 to 80000]
        1. executableSize (p < 0.001)
           ≤ 27: 2. executableSize (p = 0.004)
                   ≤ 25: Node 3 (n = 522)
                   > 25: Node 4 (n = 19)
           > 27: 5. inputsSize (p < 0.001)
                   ≤ 28: 6. numJobStarts (p = 0.02)
                           ≤ 0: Node 7 (n = 2161)
                           > 0: Node 8 (n = 152)
                   > 28: 9. numJobStarts (p = 0.002)
                           ≤ 1: Node 10 (n = 26411)
                           > 1: Node 11 (n = 665)
      [Plot: Probability Density versus Runtime (sec)]
      The PDF for the tree node (in blue) fits a Gamma distribution (in grey) with the following parameters:
        Shape parameter = 12
        Rate parameter = 5 x 10^-4
        Mean = 27414.8
        p-value = 0.17
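The goodness-of-fit check named on this slide can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it computes only the K-S distance (no p-value), it handles only Gamma distributions with an integer shape so that the CDF has a closed form, and both samples are synthetic. The shape and rate values are the ones quoted on the slide; the function names are hypothetical.

```python
import math
import random


def gamma_cdf_integer_shape(x, shape, rate):
    """CDF of a Gamma distribution whose shape is a positive integer (Erlang)."""
    if x <= 0:
        return 0.0
    lx = rate * x
    term = 1.0   # (lx)^0 / 0!
    total = 1.0
    for n in range(1, shape):
        term *= lx / n
        total += term
    return 1.0 - math.exp(-lx) * total


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of the sample and a reference CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


SHAPE, RATE = 12, 5e-4  # parameters quoted on the slide


def reference_cdf(x):
    return gamma_cdf_integer_shape(x, SHAPE, RATE)


rng = random.Random(42)
# random.gammavariate takes (shape, scale), so the scale is 1 / rate.
gamma_sample = [rng.gammavariate(SHAPE, 1.0 / RATE) for _ in range(2000)]
uniform_sample = [rng.uniform(0.0, 80000.0) for _ in range(2000)]

d_gamma = ks_statistic(gamma_sample, reference_cdf)
d_uniform = ks_statistic(uniform_sample, reference_cdf)
print(f"K-S distance, Gamma sample:   {d_gamma:.4f}")
print(f"K-S distance, Uniform sample: {d_uniform:.4f}")
```

A small distance means the sample is compatible with the candidate family; a full test would convert the distance and the sample size into a p-value, which this sketch leaves out.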
  • 11. Job Estimation Process
      •  Based on the regression trees
      •  We built a regression tree per user
      •  Estimates are generated according to a distribution (Normal, Gamma, or Uniform)
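One way to picture the last step is a function that draws an estimate from the distribution attached to a tree leaf. This is a sketch under assumptions: the `LeafFit` structure, its parameter layout, the example leaves, and the clamping of Normal draws at zero are all hypothetical and are not taken from the slides.

```python
import random
from dataclasses import dataclass


@dataclass
class LeafFit:
    """Hypothetical record of the distribution fitted at one tree leaf."""
    family: str    # "normal", "gamma", or "uniform"
    params: tuple  # (mean, stddev), (shape, rate), or (low, high)


def draw_estimate(leaf, rng):
    """Draw one estimate (for example a runtime in seconds) from the leaf's distribution."""
    if leaf.family == "normal":
        mean, stddev = leaf.params
        # Clamp at zero: a resource estimate cannot be negative (design choice of this sketch).
        return max(0.0, rng.gauss(mean, stddev))
    if leaf.family == "gamma":
        shape, rate = leaf.params
        # random.gammavariate takes (shape, scale), so the scale is 1 / rate.
        return rng.gammavariate(shape, 1.0 / rate)
    if leaf.family == "uniform":
        low, high = leaf.params
        return rng.uniform(low, high)
    raise ValueError(f"unknown distribution family: {leaf.family}")


rng = random.Random(7)
leaves = {
    "gamma_leaf": LeafFit("gamma", (12, 5e-4)),
    "normal_leaf": LeafFit("normal", (55.3, 10.0)),
    "uniform_leaf": LeafFit("uniform", (100.0, 900.0)),
}
estimates = {name: draw_estimate(leaf, rng) for name, leaf in leaves.items()}
for name, value in estimates.items():
    print(f"{name}: {value:.1f}")
```

In a per-user setup, a job would first be routed down that user's tree to a leaf, and the estimate would then be drawn from the leaf as above.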
  • 12. Experimental Results
      [Plots: Job Runtime, Disk Usage, Memory Usage]
      •  Average accuracy of the workload dataset
      •  The training set is defined as a portion of the entire workload dataset
      •  The median accuracy increases as more data is used for the training set
  • 13. Experimental Results (2)
      •  Number of Rules per Distribution
          •  Runtime: better fits Gamma distributions
          •  Disk: better fits Normal distributions
          •  Memory: better fits Normal distributions
      [Figure annotations: accuracy above 60%; fits mostly Normal distributions; Specialization]
  • 14. Prediction of Future Workloads
      •  Experiment Conditions
          •  Used the workload from Aug 2014 to predict job requirements for October 2014
      •  Experiment Results
          •  Median estimation accuracy
              Runtime: 82% (50% 1st quartile, 94% 3rd quartile)
              Disk and Memory consumption: over 98%
      Characteristic                  Data
      General workload
        Total number of jobs          1,638,803
        Total number of users         408
      Job statistics
        Completed jobs                810,567
      Table: Characteristics of the CMS workload for a period of a month (October 2014)
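The median and quartile figures above summarize per-job accuracies. The slides do not define the accuracy metric, so the sketch below assumes one (1 minus the relative error, clamped to the range 0 to 1) purely for illustration; the estimate and actual pairs are hypothetical toy values, not CMS data.

```python
import statistics


def accuracy(estimate, actual):
    """Assumed metric: 1 minus the relative error, clamped to [0, 1]."""
    if actual == 0:
        return 1.0 if estimate == 0 else 0.0
    return max(0.0, 1.0 - abs(estimate - actual) / actual)


def summarize(accuracies):
    """Return the first quartile, median, and third quartile of the accuracies."""
    q1, median, q3 = statistics.quantiles(accuracies, n=4)
    return {"q1": q1, "median": median, "q3": q3}


# Hypothetical (estimate, actual) runtime pairs in seconds.
pairs = [
    (900.0, 1000.0),
    (1500.0, 1000.0),
    (480.0, 500.0),
    (60.0, 240.0),
    (3000.0, 3100.0),
    (200.0, 200.0),
]
accuracies = [accuracy(est, act) for est, act in pairs]
summary = summarize(accuracies)
print({name: round(value, 3) for name, value in summary.items()})
```

Reporting quartiles alongside the median, as the slide does for runtime, shows how widely accuracy varies across jobs rather than only its typical value.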
  • 15. Conclusion
      •  Contributions
          •  Workload characterization of 1,435,280 jobs
          •  Use of a statistical recursive partitioning algorithm and conditional inference trees to identify patterns
          •  Estimation process to predict job characteristics
      •  Experimental Results
          •  Adequate estimates can be attained for job runtime
          •  Nearly optimal estimates are obtained for disk and memory consumption
      •  Remarks
          •  Data collection process should be refined to gather finer information
          •  Applications should provide mechanisms to distinguish custom user codes from the standard executable
  • 16. Characterizing a High Throughput Computing Workload: The Compact Muon Solenoid (CMS) Experiment at LHC
      Thank you.
      rafsilva@isi.edu
      http://pegasus.isi.edu
      This work was funded by DOE under the contract number ER26110, “dV/dt - Accelerating the Rate of Progress Towards Extreme Scale Collaborative Science”