Use of a Levy Distribution for Modeling 
Best Case Execution Time Variation 
Jonathan Beard, Roger Chamberlain 
Stream Based Supercomputing Lab (SBS)
http://sbs.wustl.edu
Work also supported by: 
Outline 
• Motivation
• Stream Processing
• Optimization Goals
• Methodology
• Distributions
• Results 
Streaming Computing 
[Figure: a streaming application drawn as kernels (Kernel 1, two instances of Kernel 2, Kernel 3) connected by streams]
Streaming Languages 
StreamIt, Auto-Pipe, Brook, Cg, S-Net, Scala-Pipe, Streams-C, and many others
Optimization 
[Figure: kernels labeled Slow, Fast, Super Fast, and Medium]
Optimization 
[Figure: Kernel 1, two instances of Kernel 2, and Kernel 3 placed onto cores 1 to 4 of multi-core A and cores 1 to 4 of multi-core B]
More allocation choices: each stream can be allocated on NUMA node A or B.
Optimization 
[Figure: kernels A, B, and C connected by streams]
A “stream” is modeled as a queue: A -> Q1 -> B -> Q2 -> C
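As a rough illustration of the queue view, the following Python sketch chains three kernels A, B, and C through two FIFO queues Q1 and Q2. The kernel bodies (emit integers, square them, sum them) are placeholders invented for this example, and the kernels run one after another here rather than concurrently as they would in a real streaming runtime.

```python
from collections import deque


def run_pipeline(n_items):
    """Push n_items through A -> Q1 -> B -> Q2 -> C and return C's result."""
    q1, q2 = deque(), deque()  # the two streams, each modeled as a FIFO queue

    # Kernel A (hypothetical source): emits the integers 0..n_items-1 into Q1.
    for i in range(n_items):
        q1.append(i)

    # Kernel B (hypothetical transform): squares each item from Q1 into Q2.
    while q1:
        q2.append(q1.popleft() ** 2)

    # Kernel C (hypothetical sink): sums everything arriving on Q2.
    total = 0
    while q2:
        total += q2.popleft()
    return total
```

The occupancy of Q1 and Q2 over time is the quantity a queueing model of this pipeline would try to predict.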
Streaming on Multi-core Systems
We want good models for streaming systems on shared multi-core systems (i.e., a cluster).
Problem: Accurate measurement is very difficult. Is there a way to decide on a model without it?
• Commodity multi-core timer availability and latency
• Frequency scaling and core migration
• Measuring modifies the application behavior
Derived Information
[Figure: Expected vs. Observed]
Is there a pattern of minimal variation within the systems we’re running on?
Avg. Service Time = E[X] + Error
Goal 
Find a distribution that characterizes the minimum expected variation of a hardware and software system.
Use this characterization to accept or reject models.
Process 
• Measurement
• Workload definition
• Find a distribution
• Utilize the distribution to aid model selection
Timer Mechanism 
[Figure: the code under test asks the timer thread for the time and receives the time back]
Timer Mechanism 
Timer thread options:
rdtsc
• x86 assembly
• varying methods to serialize
• relatively fast
• multiple drift issues
clock_gettime
• POSIX standard
• relatively accurate
• portable
• slower than rdtsc
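To make the cost of reading a timer concrete, here is a minimal Python sketch that reads the POSIX clock through `time.clock_gettime_ns` and reports the smallest gap between two back-to-back reads. It is not the authors' C++ timer: rdtsc is not reachable from pure Python, interpreter overhead dominates the numbers, and the fallback to `time.perf_counter_ns` on platforms without `clock_gettime` is an assumption of this sketch.

```python
import time


def read_clock_ns():
    """Read a monotonic clock in nanoseconds."""
    if hasattr(time, "clock_gettime_ns"):
        # POSIX clock_gettime, exposed by Python on Unix-like systems.
        return time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    # Fallback chosen for this sketch when clock_gettime is unavailable.
    return time.perf_counter_ns()


def timer_read_latency_ns(samples=1000):
    """Return the smallest gap seen between two back-to-back clock reads."""
    best = None
    for _ in range(samples):
        start = read_clock_ns()
        stop = read_clock_ns()
        gap = stop - start
        if best is None or gap < best:
            best = gap
    return best
```

Calling `timer_read_latency_ns()` gives a rough lower bound on the measurement overhead that any timed region on the same machine will include.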
Two Timing Choices 
NUMA Node Variations 
Minimize Variation 
• Restrict the timer to a single core
• Use the x86 rdtsc instruction with the processor-recommended serializers for each processor type
• Keep processes under test on the same NUMA node as the timer
• Run the timer thread with altered priority to minimize core context swaps
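The first bullet can be sketched in Python with `os.sched_setaffinity`, which is available on Linux. This is only an illustration of core pinning; NUMA placement and priority changes are not covered, and the helper name `pin_to_core` is made up for this example.

```python
import os


def pin_to_core(core):
    """Pin the calling process to one core; return the core used, or None."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # affinity control is not available on this platform
    allowed = os.sched_getaffinity(0)  # cores this process may currently use
    if core not in allowed:
        core = min(allowed)  # fall back to a core we are permitted to use
    os.sched_setaffinity(0, {core})  # pid 0 means the calling process
    return core
```

A timer or a process under test would call `pin_to_core` once at startup, before any measurement begins.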
Best Case Execution Time Variation 
• The no-op instruction is implemented in most processors
• It usually takes exactly 1 cycle
• No real functional units are involved, so it is the least taxing instruction
• Variation observed in execution time should be external to the process
Data Collection 
• No-op loops calibrated for various nominal times, tied to a single core, and run thousands of times
• Execution time measured end to end for each run, with the environment collected
• Parameters include:
  • Number of processes executing on the core
  • Number of context swaps (voluntary, involuntary)
  • Many others
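The collection loop described above might look roughly like the following Python sketch. It is not the authors' test harness: an empty Python loop stands in for the calibrated no-op loop, the record field names are invented here, and the context swap counts come from `resource.getrusage`, which is available on Unix-like systems only.

```python
import resource
import time


def busy_loop(iterations):
    """Stand-in for the calibrated no-op loop (a Python loop, not real no-ops)."""
    for _ in range(iterations):
        pass


def timed_run(iterations):
    """Time one end-to-end run and record the context swaps seen during it."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    start = time.perf_counter_ns()
    busy_loop(iterations)
    elapsed_ns = time.perf_counter_ns() - start
    after = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "elapsed_ns": elapsed_ns,
        "voluntary_swaps": after.ru_nvcsw - before.ru_nvcsw,
        "involuntary_swaps": after.ru_nivcsw - before.ru_nivcsw,
    }


def collect(runs, iterations):
    """Repeat the measurement and return one record per run."""
    return [timed_run(iterations) for _ in range(runs)]
```

Subtracting the mean of `elapsed_ns` from each record gives the kind of error term (obs - mean) whose distribution the rest of the talk studies.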
Levy Distribution 
[Figure: execution time error (obs - mean), compared in turn against Normal, Gumbel, and Levy distributions]
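For reference, the standard Levy density with location mu and scale c is f(x) = sqrt(c / (2 pi)) * exp(-c / (2 (x - mu))) / (x - mu)^(3/2) for x > mu, and its mean is infinite. The Python sketch below evaluates that density and the matching cumulative distribution function; the parameter names and default values are the textbook ones, not fitted values from the talk.

```python
import math


def levy_pdf(x, mu=0.0, c=1.0):
    """Levy density with location mu and scale c; zero for x <= mu."""
    if x <= mu:
        return 0.0
    t = x - mu
    return math.sqrt(c / (2.0 * math.pi)) * math.exp(-c / (2.0 * t)) / t ** 1.5


def levy_cdf(x, mu=0.0, c=1.0):
    """Levy cumulative distribution function, erfc(sqrt(c / (2 (x - mu))))."""
    if x <= mu:
        return 0.0
    return math.erfc(math.sqrt(c / (2.0 * (x - mu))))
```

The density peaks at mu + c/3 and then decays slowly, which is the one-sided, heavy-tailed shape being matched against the measured error.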
Levy Distribution 
• Truncation enables mean calculation, but requires fitting to each dataset to find where to truncate
• The truncation parameters are correlated with both the number of processes per core and the expected execution time
• A roughly linear relationship gives an approximate solution for the truncation parameters without refitting
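A minimal numerical sketch of these points, in Python: the mean of a Levy variable restricted to (mu, upper] is finite and can be approximated with a midpoint rule, and a linear rule can supply the truncation point. The function `truncation_point` and its coefficients are hypothetical placeholders; the talk does not give the fitted relationship.

```python
import math


def levy_pdf(x, mu, c):
    """Levy density with location mu and scale c; zero for x <= mu."""
    if x <= mu:
        return 0.0
    t = x - mu
    return math.sqrt(c / (2.0 * math.pi)) * math.exp(-c / (2.0 * t)) / t ** 1.5


def truncated_levy_mean(mu, c, upper, steps=20000):
    """Mean of a Levy(mu, c) variable restricted to (mu, upper], midpoint rule."""
    h = (upper - mu) / steps
    mass = 0.0    # integral of the density over the kept range
    moment = 0.0  # integral of x times the density over the kept range
    for i in range(steps):
        x = mu + (i + 0.5) * h
        p = levy_pdf(x, mu, c)
        mass += p * h
        moment += x * p * h
    return moment / mass


def truncation_point(processes, expected_time, a, b, intercept):
    """Hypothetical linear rule for the upper truncation point (placeholder coefficients)."""
    return intercept + a * processes + b * expected_time
```

Raising `upper` raises the mean without bound, which is why the choice of truncation point matters for the fitted model.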
Levy Fit
[Figure: Levy fits in four panels by load: 1 - 5, 6 - 10, 11 - 15, and 16 - 20 processes]
Test Setup
A -> Q1 -> B
Question: Can we use an M/M/1 queueing model to estimate the mean queue occupancy of this system?
Hypothesis: Lower Kullback-Leibler (KL) divergence between the expected and realized distributions is associated with higher model accuracy.
1. A dedicated thread of execution monitors queue occupancy
2. Calculate the estimated mean queue occupancy using the M/M/1 model
3. Calculate the KL divergence for the arrival process distribution using the truncated Levy distribution noise model
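Steps 2 and 3 rest on two textbook formulas, sketched below in Python. The M/M/1 function returns the mean number in the system, rho / (1 - rho) with rho = arrival rate / service rate; whether the talk's "queue occupancy" counts the item in service is not stated, and the mean number waiting alone would be rho^2 / (1 - rho). The KL function works on two discrete distributions over the same bins; how the authors binned their data is not given, so the inputs here are assumptions.

```python
import math


def mm1_mean_occupancy(arrival_rate, service_rate):
    """Mean number in an M/M/1 system: rho / (1 - rho), rho = lambda / mu."""
    if arrival_rate >= service_rate:
        raise ValueError("M/M/1 is unstable unless arrival_rate < service_rate")
    rho = arrival_rate / service_rate
    return rho / (1.0 - rho)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:
                return math.inf  # p puts mass where q has none
            total += pi * math.log(pi / qi)
    return total
```

Under the hypothesis above, a small `kl_divergence` between the expected and realized distributions would suggest that the M/M/1 estimate is worth trusting, and a large one that it is not.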
Convolution with Exponential 
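This slide is a figure, so the exact procedure is not recoverable from the transcript. One plausible reading, given that the conclusions compare an expected and a convolved distribution, is that the exponential distribution assumed by the M/M/1 model is convolved with the noise distribution. The Python sketch below shows only the generic operation on binned probabilities; the bin width, rate, and noise vector are assumptions for illustration.

```python
import math


def discretized_exponential(rate, width, bins):
    """Probability of an exponential falling in each bin, renormalized over the bins."""
    edges = [k * width for k in range(bins + 1)]
    mass = [math.exp(-rate * a) - math.exp(-rate * b)
            for a, b in zip(edges, edges[1:])]
    total = sum(mass)
    return [m / total for m in mass]


def convolve(p, q):
    """Discrete convolution of two probability vectors defined on the same bin width."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out
```

For example, `convolve(discretized_exponential(1.0, 0.1, 50), noise)` gives the distribution of the sum of an exponential time and an independent noise term, where `noise` is a binned noise distribution on the same 0.1-wide bins.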
Conclusions 
• The truncated Levy distribution can be used to approximate best case execution time variation (BCETV)
• The distribution of BCETV can be used as a tool to accept or reject a stochastic queueing model based on distributional assumptions
• KL divergence between the expected and convolved distributions correlates highly with queue model accuracy
Parting Notes 
Slides available here: http://sbs.wustl.edu
Timer C++ template code: http://goo.gl/ItJ3jP
Test harness used to collect data: http://goo.gl/U1VG6N

 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 

Use of a Levy Distribution for Modeling Best Case Execution Time Variation

  • 1. Use of a Levy Distribution for Modeling Best Case Execution Time Variation. Jonathan Beard, Roger Chamberlain. SBS Stream Based Supercomputing Lab, http://sbs.wustl.edu. Work also supported by: (sponsor logos not captured in this transcript)
  • 2. Outline • Motivation • Stream Processing • Optimization Goals • Methodology • Distributions • Results
  • 3. Streaming Computing
  • 4. Streaming Computing: Kernel
  • 5. Streaming Computing [diagram: Kernel 1, Kernel 2, Kernel 3, and Kernel 2, connected by four streams]
  • 6. Streaming Languages: StreamIt, Auto-Pipe, Brook, Cg, S-Net, Scala-Pipe, Streams-C, and many others
  • 7. Optimization [diagram labels: Slow, Fast Kernel, Super Fast, Medium]
  • 8. Optimization [diagram: Kernel 1, Kernel 2, Kernel 3, and Kernel 2 placed on multi-core A (cores 1 to 4) and multi-core B (cores 1 to 4)] More allocation choices: NUMA node A or B to allocate stream.
  • 11. Optimization: a “stream” is modeled as a queue [diagram: A -> B -> C becomes A -> Q1 -> B -> Q2 -> C]
  • 13. Streaming on Multi-core Systems: We want good models for streaming systems on shared multi-core systems (i.e., a cluster). Problem: accurate measurement is very difficult. Is there a way to decide on a model without it? • Commodity multi-core timer availability and latency • Frequency scaling and core migration • Measuring modifies the application behavior
  • 14. Derived Information [plot: Expected vs. Observed]
  • 15. Derived Information [plot: Expected vs. Observed] Is there a pattern of minimal variation within the systems we’re running on? Avg. Service Time = E[X] + Error
  • 16. Goal: Find a distribution that characterizes the minimum expected variation of a hardware and software system. Use this characterization to accept or reject models.
  • 17. Process • Measurement • Workload definition • Find a distribution • Utilize the distribution to aid model selection
  • 18. Timer Mechanism [diagram: the code asks the timer thread for the time and receives the time]
  • 19. Timer Mechanism: two options for the timer thread. rdtsc: • x86 assembly • varying methods to serialize • relatively fast • multiple drift issues. clock_gettime: • POSIX standard • relatively accurate • portable • slower than rdtsc
  • 20. Two Timing Choices
  • 21. NUMA Node Variations
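The cost of the timer call itself is one of the measurement concerns raised above. As a rough illustration only (this is not the authors' C++ timer, and Python cannot issue `rdtsc` directly), the sketch below reads a clock twice in a row and reports the gap. The function name `timer_call_latency_ns` is hypothetical; it uses Python's `time.clock_gettime_ns` wrapper where the platform provides it and falls back to `time.perf_counter_ns` otherwise.

```python
import time


def timer_call_latency_ns(samples=10_000):
    """Return (minimum, median) gap in ns between back-to-back timer reads."""
    # Assumption: CLOCK_MONOTONIC is a reasonable clock for this illustration.
    # The POSIX wrapper is not available on every platform, so fall back to
    # the portable performance counter when it is missing.
    if hasattr(time, "clock_gettime_ns") and hasattr(time, "CLOCK_MONOTONIC"):
        def read():
            return time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    else:
        read = time.perf_counter_ns

    gaps = []
    for _ in range(samples):
        first = read()
        second = read()
        gaps.append(second - first)
    gaps.sort()
    return gaps[0], gaps[len(gaps) // 2]


if __name__ == "__main__":
    smallest, median = timer_call_latency_ns()
    print(f"min gap: {smallest} ns, median gap: {median} ns")
```

The numbers include interpreter overhead, so they overstate the latency of the underlying system call; the point is only that reading the clock has a nonzero, variable cost.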
  • 22. Minimize Variation • Restrict the timer to a single core • Use the x86 rdtsc instruction with the processor-recommended serializers for each processor type • Keep processes under test on the same NUMA node as the timer • Run the timer thread with altered priority to minimize core context swaps
  • 23. Best Case Execution Time Variation • no-op instruction implemented in most processors • usually takes exactly 1 cycle • no real functional units are involved, so least taxing • variation observed in execution time should be external to process
  • 24. Data Collection • no-op loops calibrated for various nominal times, tied to a single core and run thousands of times • Execution time measured end to end for each run, environment collected • Parameters include: number of processes executing on core, number of context swaps (voluntary, involuntary), many others
  • 25. Levy Distribution [plot: Execution Time Error (obs - mean)]
  • 26. Levy Distribution [plot overlay: Normal Distribution]
  • 27. Levy Distribution [plot overlay: Gumbel Distribution]
  • 28. Levy Distribution [plot overlay: Levy Distribution]
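The collection procedure above can be sketched in a few lines. This is an illustration under loud assumptions: an empty Python loop stands in for a calibrated run of no-op instructions, nothing is pinned to a core, and no environment data is gathered. The names `noop_loop` and `collect_errors` are hypothetical and are not taken from the authors' test harness.

```python
import statistics
import time


def noop_loop(iterations):
    """Busy loop standing in for a calibrated run of no-op instructions."""
    for _ in range(iterations):
        pass


def collect_errors(iterations=10_000, runs=200):
    """Time the loop `runs` times and return each run's (obs - mean) in seconds."""
    observed = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        noop_loop(iterations)
        observed.append((time.perf_counter_ns() - start) / 1e9)
    mean = statistics.fmean(observed)
    # The error is the per-run deviation from the mean end-to-end time.
    return [obs - mean for obs in observed]


if __name__ == "__main__":
    errors = collect_errors()
    print(f"runs: {len(errors)}, min error: {min(errors):.3e} s, "
          f"max error: {max(errors):.3e} s")
```

A histogram of the returned values is the kind of "execution time error" data that a candidate distribution would then be fitted against.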
  • 30. Levy Distribution • Truncation enables mean calculation, but requires fitting to each dataset to find where to truncate • The truncation parameters are correlated to both the number of processes per core and the expected execution time • Roughly linear relationship gives an approximate solution to truncation parameters without refitting
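Why truncation matters: the untruncated Levy distribution has an infinite mean, while the same density restricted to a bounded interval has a finite one. The sketch below uses the standard one-sided Levy density with location `mu` and scale `c`; this parameterization, the truncation interval from `mu` to `upper`, and the function names are assumptions for illustration, not the fitted form used in the talk.

```python
import math


def levy_pdf(x, mu=0.0, c=1.0):
    """Density of the standard one-sided Levy distribution (zero for x <= mu)."""
    if x <= mu:
        return 0.0
    d = x - mu
    return math.sqrt(c / (2.0 * math.pi)) * math.exp(-c / (2.0 * d)) / d ** 1.5


def levy_cdf(x, mu=0.0, c=1.0):
    """Cumulative distribution: erfc(sqrt(c / (2 (x - mu))))."""
    if x <= mu:
        return 0.0
    return math.erfc(math.sqrt(c / (2.0 * (x - mu))))


def truncated_levy_mean(upper, mu=0.0, c=1.0, steps=100_000):
    """Mean of a Levy(mu, c) variable conditioned on x <= upper (midpoint rule)."""
    if upper <= mu:
        raise ValueError("upper must exceed mu")
    mass = levy_cdf(upper, mu, c)  # probability kept after truncation
    h = (upper - mu) / steps
    total = 0.0
    for i in range(steps):
        x = mu + (i + 0.5) * h
        total += x * levy_pdf(x, mu, c)
    return total * h / mass


if __name__ == "__main__":
    for upper in (10.0, 100.0, 1000.0):
        print(upper, truncated_levy_mean(upper))
```

The printed mean keeps growing as the truncation point moves out, which is why the truncation point has to be chosen (or, per the slide, approximated from the process count and expected execution time) before a mean can be quoted.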
  • 31. Levy Fit [four plot panels: 1 - 5 processes, 6 - 10 processes, 11 - 15 processes, 16 - 20 processes; axis tick values omitted]
  • 32. Test Setup [diagram: A -> Q1 -> B] Question: Can we use an M/M/1 queueing model to estimate the mean queue occupancy of this system? Hypothesis: Lower Kullback-Leibler (KL) divergence between expected and realized distribution is associated with higher model accuracy.
  • 33. Test Setup [diagram: A -> Q1 -> B] 1. Dedicated thread of execution monitors queue occupancy 2. Calculate the estimated mean queue occupancy using the M/M/1 model 3. Calculate KL divergence for the arrival process distribution using the truncated Levy distribution noise model
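Steps 2 and 3 rest on two textbook formulas, sketched below. For an M/M/1 queue with arrival rate λ and service rate μ, utilization is ρ = λ/μ, the mean number in the system is ρ/(1 - ρ), and the mean number waiting is ρ²/(1 - ρ); which of the two corresponds to "queue occupancy" depends on whether the item in service is counted, which the slides do not say. The KL divergence here is the discrete form over matching histogram bins. The function names and the choice of which distribution plays P and which plays Q are assumptions.

```python
import math


def mm1_occupancy(arrival_rate, service_rate):
    """Return (mean number in system, mean number waiting) for an M/M/1 queue."""
    if not 0.0 < arrival_rate < service_rate:
        raise ValueError("requires 0 < arrival_rate < service_rate")
    rho = arrival_rate / service_rate
    in_system = rho / (1.0 - rho)
    waiting = rho * rho / (1.0 - rho)
    return in_system, waiting


def kl_divergence(p, q):
    """D_KL(P || Q) in nats for two discrete distributions over the same bins."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # a zero-probability bin in P contributes nothing
        if q_i == 0.0:
            return math.inf  # P has mass where Q has none
        total += p_i * math.log(p_i / q_i)
    return total


if __name__ == "__main__":
    print(mm1_occupancy(0.5, 1.0))
    print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
```

In the setup described above, one histogram would come from the distribution the model assumes and the other from what is actually observed, and a smaller divergence would be read as support for using the M/M/1 estimate.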
  • 34. Convolution with Exponential
  • 35. Conclusions • The truncated Levy distribution can be used to approximate BCETV • The distribution of BCETV can be used as a tool to accept or reject a stochastic queueing model based on distributional assumptions • KL divergence between the expected and convolved distribution highly correlates with queue model accuracy
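The transcript keeps only this slide's title, so the following is a guess at the idea rather than a reconstruction: if an exponentially distributed time is perturbed by independent additive noise, the density of the sum is the convolution of the two densities. The sketch approximates that convolution on a uniform grid. The rate, the Levy scale, the grid spacing, the truncation points, and every function name are illustrative assumptions.

```python
import math


def exponential_pdf(x, rate):
    """Density of an exponential distribution with the given rate."""
    return rate * math.exp(-rate * x) if x >= 0.0 else 0.0


def levy_pdf(x, c):
    """Density of a one-sided Levy distribution with location 0 and scale c."""
    if x <= 0.0:
        return 0.0
    return math.sqrt(c / (2.0 * math.pi)) * math.exp(-c / (2.0 * x)) / x ** 1.5


def sample_normalized(pdf, upper, h):
    """Sample pdf on [0, upper) with spacing h and rescale to unit mass."""
    values = [pdf(i * h) for i in range(int(round(upper / h)))]
    mass = sum(values) * h
    return [v / mass for v in values]


def convolve(f, g, h):
    """Discrete approximation of the convolution of two sampled densities."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, f_i in enumerate(f):
        for j, g_j in enumerate(g):
            out[i + j] += f_i * g_j * h  # out[k] approximates the density at k * h
    return out


if __name__ == "__main__":
    h = 0.01
    service = sample_normalized(lambda x: exponential_pdf(x, 2.0), 5.0, h)
    noise = sample_normalized(lambda x: levy_pdf(x, 0.05), 1.0, h)
    combined = convolve(service, noise, h)
    print(f"total mass of convolved density: {sum(combined) * h:.6f}")
```

Truncating and renormalizing the noise density is what keeps the combined density well behaved here; a convolved density like `combined` is the kind of object that could then be compared with an observed histogram using KL divergence.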
  • 36. Parting Notes • Slides available here: sbs.wustl.edu • Timer C++ template code: http://goo.gl/ItJ3jP • Test harness used to collect data: http://goo.gl/U1VG6N