RELATING THE TIME
REQUIRED TO OBSERVE A
CERTAIN NUMBER OF EVENTS
Asoka Korale, Ph.D.
C.Eng. MIESL
MOTIVATIONS FOR RELATING TIME
AND EVENTS
APPLICATIONS OF RELATING TIME
AND EVENTS
• Call Centers
• Traffic Management
• Transportation and Logistics
• Packet Switching
• Production Scheduling
• Forecasting / Relating Time based Ev
INSIGHTS FROM RELATING TIME
AND EVENTS
• Relate an interval of observation to a sum of inter-arrival time random variables
• Relate the interval of observation to
  • the total number of events observed in the interval
  • the uncertainty associated with the average number of events in the interval
  • the sum of the number of inter-arrival time intervals that compose the interval
• Establish a probabilistic relationship for the time taken to observe a number of events
• Relate the uncertainty in the interval of observation to a number of events
NOVEL STOCHASTIC RELATIONSHIP
BETWEEN
TIME AND EVENTS
RELATE TIME TAKEN TO OBSERVE A CERTAIN NUMBER OF EVENTS
UNCERTAINTY ASSOCIATED WITH
EVENTS OVER TIME
[Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times ∆t1, ∆t2, …, ∆tN. Labels: time interval Z to observe N events; inter-arrival time random variables; distribution of inter-arrival times; events]
$Z = \Delta t_1 + \Delta t_2 + \cdots + \Delta t_N$
• the total time (Z) to observe a number of events (N) is a sum of the same number of inter-arrival time intervals
• each inter-arrival time interval is a random variable (∆t_i)
• the total uncertainty in the time interval (Z) reflects the uncertainty associated with each individual random variable (∆t_i)
• the dependence between the random variables impacts the total uncertainty associated with the sum
• the total uncertainty in the interval (Z) leads to the variance in the number of events observed in such an interval
A SUM OF INTER-ARRIVAL TIME
RANDOM VARIABLES
[Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times ∆t1, ∆t2, …, ∆tN]
$Z = \Delta t_1 + \Delta t_2 + \cdots + \Delta t_N$
• Each event inter-arrival time ∆t_i is a random variable
  • each such random variable has a certain uncertainty associated with it
• N inter-arrival time random variables are required to observe N events
• The total time Z taken to observe N events is a sum of N inter-arrival time random variables
• The uncertainty associated with this sum of random variables translates into a number of events
  • a number of events associated with the uncertainty in the total time taken to observe the events
• The distribution of the inter-arrival times may be estimated from historical data
RELATING TIME AND
EVENTS
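[Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times ∆t1, ∆t2, …, ∆tN]
$Z = \Delta t_1 + \Delta t_2 + \cdots + \Delta t_N$
When the inter-arrival times are independent and drawn from a single distribution (IID), with
$E(\Delta t_i) = \mu_{\Delta t}$
$Var(\Delta t_i) = \sigma_{\Delta t}^2$
Z has mean and variance
$E(Z) = N \mu_{\Delta t}$
$Var(Z) = N \sigma_{\Delta t}^2$
When the events are correlated, the variance of the sum of a number of inter-arrival times will feature the covariance between each pair of random variables that compose the sum
$Var(Z) = \sum_{i} Var(\Delta t_i) + \sum_{i,j:\, i \neq j} Cov(\Delta t_i, \Delta t_j)$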
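The covariance form of Var(Z) can be sketched in Python as the sum of every entry of the covariance matrix of the inter-arrival times. The covariance model used here, Cov(∆t_i, ∆t_j) = σ² ρ^|i−j|, and both function names are assumptions made for illustration; they do not come from the slides.

```python
def variance_of_sum(cov):
    """Var(Z) for Z = sum of the variables.

    Summing every entry of the covariance matrix adds the variances
    (diagonal) and the covariance of every ordered pair i != j.
    """
    return sum(sum(row) for row in cov)


def ar1_covariance(n, sigma2, rho):
    """Hypothetical covariance matrix with Cov(dt_i, dt_j) = sigma2 * rho**|i-j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n)] for i in range(n)]


# rho = 0 reproduces the IID case, Var(Z) = N * sigma^2.
iid_variance = variance_of_sum(ar1_covariance(3, 1.0, 0.0))
# Positive correlation between inter-arrival times inflates Var(Z).
correlated_variance = variance_of_sum(ar1_covariance(3, 1.0, 0.5))
```

With positively correlated inter-arrival times the total uncertainty in Z is larger than in the IID case, which is the effect the covariance term captures.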
$\hat{N} = \frac{\sqrt{Var(Z)}}{\mu_{\Delta t}} = \frac{\sqrt{N \sigma_{\Delta t}^2}}{\mu_{\Delta t}}$
$\tilde{N} = N \pm k \hat{N}$
where
• $\tilde{N}$ is the number of events to observe in a time interval of length Z
• k is a constant that scales the variance (or standard deviation)
• $\hat{N}$ is a measure of the degree of uncertainty in N, a measure of its deviation from the mean
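A minimal Python sketch of the relationship above, assuming IID inter-arrival times so that Var(Z) = Nσ². The function names and the example values (mean 0.5 s, variance 0.25 s², k = 2) are illustrative assumptions.

```python
import math


def count_uncertainty(n_events, mean_dt, var_dt):
    """Uncertainty in Z expressed as a number of events: sqrt(Var(Z)) / mu."""
    var_z = n_events * var_dt  # Var(Z) = N * sigma^2 (IID assumption)
    return math.sqrt(var_z) / mean_dt


def event_band(n_events, mean_dt, var_dt, k):
    """Band N - k*N_hat .. N + k*N_hat around the expected number of events."""
    n_hat = count_uncertainty(n_events, mean_dt, var_dt)
    return n_events - k * n_hat, n_events + k * n_hat


# Hypothetical inputs: 20 events, inter-arrival mean 0.5 s and variance 0.25 s^2.
low, high = event_band(20, 0.5, 0.25, k=2)
```

The mean and variance of the inter-arrival time would in practice be estimated from historical data.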
NOVEL STOCHASTIC MODEL OF AN
M/M/1 QUEUE SYSTEM
BY RELATING TIME AND EVENTS VIA A SUM OF INTER-ARRIVAL TIME RANDOM
VARIABLES
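Birth – Death process model of an M/M/1
Queue System
Deterministic approach –
• rates are deterministic – usually measured over an interval of time
[Figure: state-transition diagram with states n = 0, 1, 2, …, n-1, n and state probabilities P0, P1, P2, …, Pn-1, Pn; arrival rate λ moves the system up one state and service rate µ moves it down one state]
• balance equations: $\lambda P_{n-1} = \mu P_n$
• traffic intensity: $\rho = \lambda / \mu$
• probability distribution of state: $P_n = (\lambda / \mu)^n P_0$
• use sum to solve for P0: $\sum_{n=0}^{N} P_n = 1$
• probability of state: $P_n = \frac{\rho - 1}{\rho^{N+1} - 1} \rho^n$
[Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times ∆t1, ∆t2, …, ∆tN]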
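The state probabilities of the deterministic model can be computed directly from the closed form above. This is a small sketch; the function name and the example rates are assumptions, and the ρ = 1 branch (uniform distribution) is added only to avoid a division by zero.

```python
def truncated_mm1_distribution(lam, mu, max_state):
    """P_n for n = 0..N with P_n = (rho - 1) / (rho**(N + 1) - 1) * rho**n."""
    rho = lam / mu  # traffic intensity
    if rho == 1.0:
        # Limit of the closed form: every state is equally likely.
        return [1.0 / (max_state + 1)] * (max_state + 1)
    p0 = (rho - 1.0) / (rho ** (max_state + 1) - 1.0)
    return [p0 * rho ** n for n in range(max_state + 1)]


# Hypothetical rates: lambda = 1, mu = 2, states 0..3.
probs = truncated_mm1_distribution(1.0, 2.0, 3)
```

Each consecutive pair satisfies the balance equation λP(n−1) = µP(n), and the probabilities sum to one.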
approach
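[Figure: panels "Deterministic Approach" and "Stochastic Approach"; state-transition diagram with states n = 0, 1, 2, …, n-1, n and state probabilities P0, P1, P2, …, Pn-1, Pn; arrival rate λ moves the system up one state and service rate µ moves it down one state]
$\lambda_{n-1} P_{n-1} = \mu_n P_n$
$P_n = (\lambda_{n-1}/\mu_n)(\lambda_{n-2}/\mu_{n-1}) \cdots (\lambda_0/\mu_1) P_0$
$\sum_{n=0}^{N} P_n = 1$
$\rho_n = \lambda_{n-1}/\mu_n$
• instantaneous arrival and departure rates:
$\lambda_i^{i} = 1/\Delta t_i^{A}$
$\mu_i^{i} = 1/\Delta t_i^{D}$
$\lambda_i = E\{\lambda_i^{i}\} = E\{1/\Delta t_i^{A}\}$
$\mu_i = E\{\mu_i^{i}\} = E\{1/\Delta t_i^{D}\}$
• instantaneous probability of state:
$P_n^{i} = (\Delta t_n^{D}/\Delta t_{n-1}^{A}) \cdots (\Delta t_1^{D}/\Delta t_0^{A}) P_0$
• expected probability of state converges to deterministic result:
$P_n = (\lambda_{n-1}/\mu_n)(\lambda_{n-2}/\mu_{n-1}) \cdots (\lambda_0/\mu_1) P_0$
[Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times ∆t1, ∆t2, …, ∆tN]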
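The product form with state-dependent rates can be sketched as follows. The function name, the list layout of the rates and the example values are assumptions; the rates are treated here as known numbers rather than as expectations of instantaneous rates.

```python
def birth_death_distribution(arrival_rates, service_rates):
    """State probabilities P_0..P_N from P_n = prod(lambda_{k-1} / mu_k) * P_0.

    arrival_rates[k] holds lambda_k (k = 0..N-1) and
    service_rates[k] holds mu_{k+1} (k = 0..N-1).
    """
    weights = [1.0]  # unnormalized P_0
    for lam, mu in zip(arrival_rates, service_rates):
        weights.append(weights[-1] * lam / mu)
    total = sum(weights)  # normalization: the P_n must sum to 1
    return [w / total for w in weights]


# Hypothetical rates: lambda_0 = 1, lambda_1 = 2, mu_1 = 2, mu_2 = 4.
probs = birth_death_distribution([1.0, 2.0], [2.0, 4.0])
```

With constant rates the result reduces to the geometric form of the deterministic M/M/1 model.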
Probability of observing a particular sequence of
events
[Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times ∆t1, ∆t2, …, ∆tN]
Let $Z = P(\Delta t_1, \Delta t_2, \ldots, \Delta t_N)$
When inter-arrival times are independent (consistent with an M/M/1 scenario), the probability of a sequence is the product of the individual probabilities of observing a particular inter-arrival time
$P(Z) = \prod_{i=1}^{N} P(\Delta t_i)$
When inter-arrival times are independent, the expectation of the product is the product of the expectations
$E\{P(Z)\} = E\{\prod_{i=1}^{N} P(\Delta t_i)\} = \prod_{i=1}^{N} E\{P(\Delta t_i)\}$
ANOMALY DETECTION IN AN M/M/1
QUEUE SYSTEM
CHARACTERIZING PERFORMANCE OF A SOFTWARE COMPONENT
Anomaly Detection Scheme
• A system of components
• Each component a queue / server
• Component Load → Distribution of No of Messages in System
  → Arrivals – Departures in ∆T
• Dispersion of anomaly across component sy
[Figure: components Comp 1, Comp 2, Comp 3, …, Comp N with a load trigger threshold; transition matrix M(I) with rows "State N" and columns "State N+1", each indexed by Comp 1 … Comp N]
Estimating Load on a Software
Component
• Treat system as a network of components
• inter-arrival times help to characterize the performance best
• Model each component as queue – server system
• Queue – buffering messages into the component
• Server – processing all messages within a component
• No of messages in “system” (in queuing parlance)
• those waiting and in service – difference between arrivals and departures
• account for multiple queues within a component
--------------------------------------------------------------------------
• Common approach - threshold based alert system
• Thresholds commonly measure performance - at
• component level
• system level
• Typically Thresholds use – latencies, queue lengths,
Performance Measures - Software
Component
• Variation in the number of messages in “system” (in queuing parlance)
• Performance measures –
• Variance, Mean - of messages in the system
• Variance / Mean - of messages in the system
• Estimate Performance measure from the Distribution of
• no of messages
• Variance / Mean
• Threshold setting –
• detect an outlier
• a certain number of standard deviations from mean
• The time behavior of the distribution in the arrivals and departures will imp
  → envision time dependent thresholds
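The performance measures and the outlier threshold listed above can be sketched in Python as follows. The function names, the sample values and the default of three standard deviations are assumptions chosen for illustration.

```python
from statistics import mean, pvariance


def load_measures(samples):
    """Mean, variance and variance / mean of the number of messages in system."""
    m = mean(samples)
    v = pvariance(samples)
    # Variance / mean is undefined for an all-zero sample; assumes m > 0.
    return {"mean": m, "variance": v, "variance_to_mean": v / m}


def is_outlier(value, samples, k=3.0):
    """Flag a value more than k standard deviations from the sample mean."""
    m = mean(samples)
    std = pvariance(samples) ** 0.5
    return abs(value - m) > k * std


# Hypothetical message counts observed in successive windows.
history = [2, 4, 4, 4, 5, 5, 7, 9]
measures = load_measures(history)
```

A time-dependent threshold would recompute these statistics over a window of recent samples instead of the full history.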
Characterizing Variation in the load
• total number of arrivals in time interval ∆T is $N^{A}$
$Z^{A} = \Delta T = \Delta t_1^{A} + \Delta t_2^{A} + \cdots + \Delta t_N^{A}$
• total number of departures in time interval ∆T is $N^{D}$
$Z^{D} = \Delta T = \Delta t_1^{D} + \Delta t_2^{D} + \cdots + \Delta t_N^{D}$
• number of arrivals associated with the composition of $N^{A}$ events in time interval ∆T
$\hat{N}^{A} = k^{A} \frac{\sqrt{Var(Z^{A})}}{\mu_{\Delta t,A}} = k^{A} \frac{\sqrt{N^{A} \sigma_{A}^{2}}}{\mu_{\Delta t,A}}$
• number of departures associated with the composition of $N^{D}$ events in time interval ∆T
$\hat{N}^{D} = k^{D} \frac{\sqrt{Var(Z^{D})}}{\mu_{\Delta t,D}} = k^{D} \frac{\sqrt{N^{D} \sigma_{D}^{2}}}{\mu_{\Delta t,D}}$
• average number of events in the system at the end of time interval ∆T
$\bar{N} = E\{N^{A}\} - E\{N^{D}\}$
• variance in the number of events in the system at the end of time interval ∆T (arrivals and departures taken as independent)
$Var\{N^{A} - N^{D}\} = Var\{N^{A}\} + Var\{N^{D}\}$
The number of events in the system at the end of a common time interval ∆T is the difference between those that arrive and those that depart.
The variance arises from the individual uncertainties associated with the individual random variables that compose the sum ∆T.
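As a rough sketch of these relations, the mean and variance of the number in system can be computed from the inter-arrival and inter-departure statistics. It assumes k = 1, takes the expected count as N = ∆T / µ (from E(Z) = Nµ with Z = ∆T), and treats arrivals and departures as independent; the function names and input values are hypothetical.

```python
def count_stats(window, mean_dt, var_dt):
    """Expected number of events in the window and the variance of that number."""
    n = window / mean_dt               # N = dT / mu, from E(Z) = N * mu
    var_n = n * var_dt / mean_dt ** 2  # (sqrt(N * sigma^2) / mu) squared
    return n, var_n


def number_in_system(window, arrivals, departures):
    """Mean and variance of arrivals minus departures over a common window.

    arrivals and departures are (mean, variance) pairs of the respective
    inter-event times.
    """
    n_a, var_a = count_stats(window, *arrivals)
    n_d, var_d = count_stats(window, *departures)
    return n_a - n_d, var_a + var_d


# Hypothetical inputs: 10 s window, arrivals (0.5, 0.25), departures (1.0, 1.0).
mean_in_system, var_in_system = number_in_system(10.0, (0.5, 0.25), (1.0, 1.0))
```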
Components
• Model the anomaly state (yes 1 / no 0) at each component - interface
• Track anomalies across system and across time via a transition matrix (M)
• Update transition matrix entries at each change of state
• the difference between matrix M(I+1) and M(I) will provide system state at M(I-1) and also the
• The transition matrix gives insight into how
[Figure: components Comp 1, Comp 2, Comp 3, …, Comp N; transition matrices M(I) and M(I+1) with rows "State N" and columns "State N+1", each indexed by Comp 1 … Comp N. The Comp 1 row reads 1 1 1 in M(I) and 1 2 1 in M(I+1); the Comp 3 row holds a single entry 1 in both. Labels: update when system state changes; record anomaly on a link-component]
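One possible reading of this bookkeeping is sketched below. The slides do not define the matrix entries precisely, so the semantics used here are an assumption: M[i][j] counts how often component i was anomalous in the state before a change while component j was anomalous in the state after it. All names are hypothetical.

```python
def new_transition_matrix(n_components):
    """Square matrix of zero counts, rows = state N, columns = state N+1."""
    return [[0] * n_components for _ in range(n_components)]


def record_state_change(matrix, prev_state, next_state):
    """Update the matrix only when the system state changes.

    prev_state and next_state are lists of 0/1 anomaly flags, one per
    component. Returns True if the matrix was updated.
    """
    if prev_state == next_state:
        return False
    for i, was_anomalous in enumerate(prev_state):
        for j, is_anomalous in enumerate(next_state):
            if was_anomalous and is_anomalous:
                matrix[i][j] += 1
    return True


m = new_transition_matrix(3)
record_state_change(m, [1, 0, 0], [0, 1, 0])  # anomaly moves from Comp 1 to Comp 2
record_state_change(m, [0, 1, 0], [0, 1, 1])  # anomaly persists and spreads to Comp 3
```

Differencing two successive copies of the matrix then shows which entries changed at the latest update.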
RESULTS:
ANOMALY DETECTION IN AN M/M/1
QUEUE SYSTEM
TO CHARACTERIZE PERFORMANCE OF A SOFTWARE COMPONENT
Test Scenarios and Validation of model
Test Scenarios:
• Different offered load and service discipline
• Poisson arrivals (exponential inter-arrival time with independent increments)
• Exponential service time (independent increments)
Summary Results:
• Behavior of number in system
• Average number in system = difference in mean arrivals and departures
• Variance of number in system = sum of variances in arrivals and departures

Inter-Arrival Time (s)                       Scenario I   Scenario II   Scenario III
Arrivals - Mean Inter-Arrival Time           0.50         0.51          0.79
Arrivals - Variance Inter-Arrival Time       0.26         0.26          0.62
Departures - Mean Inter-Arrival Time         0.50         0.80          1.00
Departures - Variance Inter-Arrival Time     0.24         0.64          1.05

Number Over Window                           Scenario I   Scenario II   Scenario III
Mean Arrivals                                19.93        19.79         12.67
Variance in Arrivals                         19.35        18.52         11.36
Mean Departures                              20.05        12.39         9.99
Variance in Departures                       18.71        10.98         10.41
Mean (Arrivals - Departures)                 -0.13        7.39          2.68
Variance (Arrivals - Departures)             37.57        29.45         21.38
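The two summary claims can be checked against the reported figures with a few lines of Python. The numbers are copied from the "Number Over Window" table; the dictionary layout and function name are assumptions.

```python
# Rows of the "Number Over Window" table, one entry per scenario.
scenarios = {
    "I": {"mean_arr": 19.93, "var_arr": 19.35, "mean_dep": 20.05,
          "var_dep": 18.71, "mean_diff": -0.13, "var_diff": 37.57},
    "II": {"mean_arr": 19.79, "var_arr": 18.52, "mean_dep": 12.39,
           "var_dep": 10.98, "mean_diff": 7.39, "var_diff": 29.45},
    "III": {"mean_arr": 12.67, "var_arr": 11.36, "mean_dep": 9.99,
            "var_dep": 10.41, "mean_diff": 2.68, "var_diff": 21.38},
}


def summary_gaps(row):
    """Gap between the predicted and the reported mean and variance."""
    mean_gap = (row["mean_arr"] - row["mean_dep"]) - row["mean_diff"]
    var_gap = (row["var_arr"] + row["var_dep"]) - row["var_diff"]
    return mean_gap, var_gap


gaps = {name: summary_gaps(row) for name, row in scenarios.items()}
```

In all three scenarios the difference of the means matches the reported mean to within rounding, and the sum of the variances lies within about 0.5 of the reported variance.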
Arrivals / Departures Process
• Exponential service time with mean 0.5 seconds
• Distribution of number of arrivals in an interval of 10s
• The number of arrivals equivalent to the sum of a number of inter-arrival times
  • which is a sum of random variables
  • the sum converges to a normal distribution
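A small simulation illustrates the distribution of the number of arrivals in a 10 s interval. It assumes exponential inter-arrival times with mean 0.5 s, and the number of windows, the seed and the function names are arbitrary choices for this sketch.

```python
import random
from statistics import mean, pvariance


def count_arrivals(rng, mean_dt, window):
    """Count arrivals in [0, window] by summing exponential inter-arrival times."""
    t = rng.expovariate(1.0 / mean_dt)  # time of the first arrival
    n = 0
    while t <= window:
        n += 1
        t += rng.expovariate(1.0 / mean_dt)
    return n


def simulate_counts(mean_dt=0.5, window=10.0, n_windows=2000, seed=1):
    """Number of arrivals observed in each of n_windows independent windows."""
    rng = random.Random(seed)
    return [count_arrivals(rng, mean_dt, window) for _ in range(n_windows)]


counts = simulate_counts()
sample_mean, sample_var = mean(counts), pvariance(counts)
```

Under these assumptions the count is Poisson with mean 20, so the sample mean and the sample variance should both come out near 20, in line with Scenario I.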
Characterizing component load
• Use the distribution of the average number of events in the system to characterize the load
• Variance in the number of events in system
• set thresholds to trigger at a probability level
Variation in the Variance
• Use the cumulative distribution of the variance to characterize the impact of variation in the variance with window length
• Longer windows feature a larger number of events – each event an inter-arrival time random variable
• The uncertainty scales with the number of random variables in the sum
• Longer intervals have larger uncertainty associated with the composition of the time interval –
  • rightward shifting – flattening curves
$Z = \Delta t_1 + \Delta t_2 + \cdots + \Delta t_N$
$\hat{N} = \frac{\sqrt{Var(Z)}}{\mu_{\Delta t}} = \frac{\sqrt{N \sigma_{\Delta t}^2}}{\mu_{\Delta t}}$
IMPROVEMENTS AND FUTURE
WORK
MESSAGE SCHEDULING AND THRESHOLD OPTIMIZATION
SCHEDULING AND IMPACT ON
PERFORMANCE
• Introduce load balancing to intelligently route messages –
• Particularly in components with multiple queues
• Assign messages
• to queue with lowest load
• to queue that is most likely to process it fastest / most efficiently
• Characterizing
• processing time of messages as a function of
• Type of messages – and expected processing time
• messages in the queue …
• Model inter-arrival times – on a per queue basis –
• see appendix: on relating events and time taken to observe them
• Account for time dependence of statistics
ALERT THRESHOLDS
OPTIMIZATION
• Critical Stats guide uses a fixed set of thresholds
• Consider component load stat – use variation of number of messages in
• stat – based on existing / recorded measurements
• Performance at component level –
• irrespective of input conditions
• based on maximum design spec of component
• depending on input conditions – traffic / trading / time dependent
• set thresholds to account for behavior that is also depending on
• expected / normal traffic
• Determine threshold values based on Normal / Abnormal behavior
• amount of load that is historically observed
• Consider time based thresholds –
• if feasible – as offered load is time varying
• Tune anomaly threshold – based on time varying load
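Time-based thresholds could be derived from historical load along the following lines. Grouping by hour of day, the mean plus k standard deviations rule and all names are assumptions made for this sketch.

```python
from collections import defaultdict
from statistics import mean, pstdev


def time_based_thresholds(history, k=3.0):
    """Per-hour alert threshold: mean + k * standard deviation of past load.

    history is an iterable of (hour_of_day, message_count) observations.
    """
    by_hour = defaultdict(list)
    for hour, count in history:
        by_hour[hour].append(count)
    return {hour: mean(c) + k * pstdev(c) for hour, c in by_hour.items()}


# Hypothetical history: light load at 09:00, heavy load at 15:00.
history = [(9, 10), (9, 14), (15, 40), (15, 44)]
thresholds = time_based_thresholds(history)
```

A count that would trigger an alert at 09:00 can then be normal at 15:00, which is the point of tuning the threshold to the time-varying load.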
THANK YOU
APPENDIX
EVENT (INTER) ARRIVAL
TIME PROCESS
EVENT INTER –
ARRIVAL TIME
• Introduce a Feature to characterize the “Time property” in the Event based Model
• Each Event has a time stamp and between Events – an Event “Inter - Arrival time”
• Modeling this “time interval” will give insights in to “Time Patterns” of the Events in
characterizing Trading behavior
• Natural to consider basic statistics related to Inter- Arrival Time
• Descriptive Statistics – means, variances, Higher Order Statistics
• But they don’t necessarily capture the characteristics in the pattern of Event
Inter - Arrival Times
• Also fitting Distributions and estimating their characteristics may not be very viable /
reliable
• Data Dependent, too little data to estimate , degree of fit issues
[Figure: sequence of events E1, E2, E3, E4, …, EN-1, EN (Event No) of types A, B, C, B, …, A, C (Event Type), separated by inter-arrival times t1, t2, t3, …, tN-1]
MODELING THE - EVENT INTER-
ARRIVAL TIME
• This Time Series captures the time patterns in the placing of Market Orders and Trading
Event
• We characterize and quantify these patterns through Statistical Analysis that captures its
important properties
• The Randomness in the Event Inter - Arrival times – via Entropy
• Autocorrelation – measures degree of correlation between samples of inter arrival
times
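[Figure: sequence of events E1, E2, E3, E4, …, EN-1, EN (Event No) of types A, B, C, B, …, A, C (Event Type), separated by inter-arrival times t1, t2, t3, …, tN-1; below it, the time series of event inter-arrival times t1, t2, t3, …, tN-1 plotted against the sample number of the time series 1, 2, 3, …, N-2, N-1. ti - Event inter-arrival time]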
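The inter-arrival time series and the two statistics named above can be sketched in Python. The histogram-based entropy estimate, the bin width and the function names are assumptions; other entropy estimators would serve equally well.

```python
import math
from collections import Counter


def inter_arrival_times(timestamps):
    """Differences between consecutive event time stamps."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def autocorrelation(series, lag=1):
    """Sample autocorrelation at the given lag (assumes a non-constant series)."""
    n = len(series)
    m = sum(series) / n
    denom = sum((x - m) ** 2 for x in series)
    num = sum((series[i] - m) * (series[i + lag] - m) for i in range(n - lag))
    return num / denom


def histogram_entropy(series, bin_width):
    """Shannon entropy in bits of the inter-arrival times binned at bin_width."""
    bins = Counter(int(x // bin_width) for x in series)
    n = len(series)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())


# Hypothetical time stamps with alternating gaps of 1 and 2 time units.
dts = inter_arrival_times([0, 1, 3, 4, 6, 7, 9])
r1 = autocorrelation(dts, lag=1)
h = histogram_entropy(dts, bin_width=1)
```

A strongly negative lag-1 autocorrelation, as in this alternating example, is the kind of time pattern that means and variances alone do not reveal.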
A DISTRIBUTION INDEPENDENT OF
MEASUREMENT (TIME) WINDOW
• Observe the distribution of the time between each pair of events
• call it the event inter arrival time
• The distribution of this quantity does not change as it is not dependent on a
window of measurement.
• purely a function of the event arrival (generative) process
• the process will depend on the particular quantity (orders, trades etc.) we are
observing
• The underlying distribution however is fixed for a particular data set
RELATING NUMBER OF EVENTS OBSERVED TO
INTERVALS OF TIME
[Figure: events E1, E2, E3, …, EN-1, EN (Event No) separated by inter-arrival times ∆t1, ∆t2, ∆t3, …, ∆tN]
Let the mean and variance of the distribution of the inter-arrival time ∆t be
$E(\Delta t) = \mu_{\Delta t}$
$Var(\Delta t) = \sigma_{\Delta t}^2$
Let Z be the sum of N IID random variables drawn from the distribution of the inter-arrival time
$Z = \Delta t_1 + \Delta t_2 + \cdots + \Delta t_N$
$E(Z) = N \mu_{\Delta t}$
$Var(Z) = N \sigma_{\Delta t}^2$
For large N
Z is a random variable and is the time taken to observe N events.
Its expected value (average) is E(Z)
A measure of the uncertainty in Z (about its mean) is its standard deviation
RELATING NUMBER OF EVENTS TO
INTERVALS OF TIME
[Figure: events E1, E2, E3, …, EN-1, EN (Event No) separated by inter-arrival times ∆t1, ∆t2, ∆t3, …, ∆tN]
• The uncertainty in Z can be translated into an average number of events
• The total time and the total number of IID events observed in that time are related probabilistically via the distribution of the inter-arrival time
• So we may estimate an average number of events associated with this uncertainty
$\sigma_z^2 = N \sigma_{\Delta t}^2$
$\hat{N} = \frac{\sigma_z}{\mu_{\Delta t}} = \frac{\sqrt{N \sigma_{\Delta t}^2}}{\mu_{\Delta t}}$
Thus we may set a threshold “T” for the number of events observed in an interval of length
$E(Z) = N \mu_{\Delta t}$
to detect outliers
$T > N + \text{a factor} \cdot \hat{N}$

Improving predictability and performance by relating the number of events and the time over which to observe them

  • 1. RELATING THE TIME REQUIRED TO OBSERVE A CERTAIN NUMBER OF EVENTS Asoka Korale, Ph.D. C.Eng. MIESL
  • 2. MOTIVATIONS FOR RELATING TIME AND EVENTS
  • 3. APPLICATIONS OF RELATING TIME AND EVENTS Call CentersTraffic Management Transportation and Logistics Packet Switching Production Scheduling Forecasting / Relating Time based Ev
  • 4. INSIGHTS FROM RELATING TIME AND EVENTS• Relate an interval of observation to a sum of inter-arrival time random variables • Relate the interval of observation to • the total number of events observed in the interval • the uncertainty associated with the average number of events in the interval • the sum of the number of inter-arrival time intervals that compose the interval • Establish a probabilistic relationship for the time taken to observe a number of events • Relate the uncertainty in the interval of observation to a number of events
  • 5. NOVEL STOCHASTIC RELATIONSHIP BETWEEN TIME AND EVENTS RELATE TIME TAKEN TO OBSERVE A CERTAIN NUMBER OF EVENTS
  • 6. UNCERTAINTY ASSOCIATED WITH EVENTS OVER TIME E1 E2 EN-1 EN ∆𝑡1 ∆𝑡2 ∆𝑡 𝑁 𝑍 = ∆𝑡1 + ∆𝑡2 + ⋯ + ∆𝑡 𝑁  total time (Z) to observe a number of events (N) is a sum of a similar number of inter-arrival time – time intervals  each inter-arrival time – time interval a random variable (∆𝑡i)  total uncertainty in the time interval (Z) a reflection of the uncertainty associated with each individual random variable (∆𝑡i)  the dependence between random variables impacts the total uncertainty associated with the sum  total uncertainty in the interval (Z) – leads to the variance in the number of events observed in such an interval time interval Z to observe N events inter-arrival time random variables distribution of inter-arrival times events
  • 7. A SUM OF INTER-ARRIVAL TIME RANDOM VARIABLES E1 E2 EN-1 EN ∆𝑡1 ∆𝑡2 ∆𝑡 𝑁 𝑍 = ∆𝑡1 + ∆𝑡2 + ⋯ + ∆𝑡 𝑁 • Each event inter-arrival time ∆𝑡i is a random variable • each such random variable has associated with it a certain uncertainty • An N number of inter-arrival time random variables are required to observe an equivalent number of events • The total time Z taken to observe N events is a sum of N inter-arrival time random variables • The uncertainty associated with this sum of random variables – translates in to a number of events • a number of events associated with the uncertainty in the total time taken to observe the events • The distribution of the inter-arrival times may be estimated from historical data
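  • 8. RELATING TIME AND EVENTS
      [Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times $\Delta t_1, \Delta t_2, \dots, \Delta t_N$]
      $Z = \Delta t_1 + \Delta t_2 + \dots + \Delta t_N$
      • when the inter-arrival times are drawn from a single distribution and are independent (IID), Z has mean and variance
          $E(Z) = N\mu_{\Delta t}$, $\mathrm{Var}(Z) = N\sigma_{\Delta t}^2$
          where $E(\Delta t_i) = \mu_{\Delta t}$ and $\mathrm{Var}(\Delta t_i) = \sigma_{\Delta t}^2$
      • when the events are correlated, the variance of the sum of a number of inter-arrival times will feature the covariance between each pair of random variables that compose the sum
          $\mathrm{Var}(Z) = \sum_{i} \mathrm{Var}(\Delta t_i) + \sum_{i \neq j} \mathrm{Cov}(\Delta t_i, \Delta t_j)$
      • the number of events associated with the uncertainty in Z is
          $\hat{N} = \dfrac{\sqrt{\mathrm{Var}(Z)}}{\mu_{\Delta t}} = \dfrac{\sqrt{N\sigma_{\Delta t}^2}}{\mu_{\Delta t}}$
          $\tilde{N} = N \pm k \hat{N}$
      • to observe $\tilde{N}$ number of events in a time interval of length Z
      • scale the variance (or standard deviation) via constant k
      • a measure of the degree of the uncertainty in N, a measure of its deviation from the mean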
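The mean and variance relations on this slide can be checked numerically. The following is a minimal sketch, assuming exponential inter-arrival times and illustrative values (N = 20 events, mean inter-arrival time 0.5 s); the function names `variance_of_sum` and `simulate_total_time` are hypothetical and not from the slides.

```python
import random
import statistics

def variance_of_sum(cov):
    """Var(Z) from a covariance matrix of the inter-arrival times:
    the sum of the variances plus the sum of all covariances with i != j."""
    n = len(cov)
    variances = sum(cov[i][i] for i in range(n))
    covariances = sum(cov[i][j] for i in range(n) for j in range(n) if i != j)
    return variances + covariances

def simulate_total_time(n_events, mean_dt, rng):
    """One draw of Z: the sum of n_events IID exponential inter-arrival times."""
    return sum(rng.expovariate(1.0 / mean_dt) for _ in range(n_events))

# Assumed illustrative values, not taken from the slides.
N, mu_dt = 20, 0.5
sigma2_dt = mu_dt ** 2            # variance of an exponential with mean mu_dt

rng = random.Random(0)
z_samples = [simulate_total_time(N, mu_dt, rng) for _ in range(20000)]

print("E(Z):   simulated", statistics.fmean(z_samples), "theory", N * mu_dt)
print("Var(Z): simulated", statistics.pvariance(z_samples), "theory", N * sigma2_dt)

# IID case: a diagonal covariance matrix reproduces Var(Z) = N * sigma^2.
iid_cov = [[sigma2_dt if i == j else 0.0 for j in range(N)] for i in range(N)]
print("Var(Z) from covariance matrix (IID):", variance_of_sum(iid_cov))
```

With positive covariance between inter-arrival times the off-diagonal terms add to the total, so the uncertainty in Z grows beyond the IID value.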
  • 9. NOVEL STOCHASTIC MODEL OF AN M/M/1 QUEUE SYSTEM BY RELATING TIME AND EVENTS VIA A SUM OF INTER-ARRIVAL TIME RANDOM VARIABLES
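  • 10. Birth – Death process model of an M/M/1 Queue System
      Deterministic approach:
      • rates are deterministic, usually measured over an interval of time
      [Figure: birth-death state diagram with states n = 0, 1, 2, …, n−1, n and state probabilities $P_0, P_1, P_2, \dots, P_{n-1}, P_n$; arrivals at rate λ move the state up, departures at rate µ move it down]
      • balance equations: $\lambda P_{n-1} = \mu P_n$
      • probability of state: $P_n = (\lambda/\mu)^n P_0$
      • use sum to solve for $P_0$: $\sum_{n=0}^{N} P_n = 1$
      • traffic intensity: $\rho = \lambda/\mu$
      • probability distribution of state: $P_n = \dfrac{\rho - 1}{\rho^{N+1} - 1}\,\rho^n$
      [Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times $\Delta t_1, \Delta t_2, \dots, \Delta t_N$]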
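As a small illustration of the deterministic result, the sketch below evaluates the state distribution for states 0 to N and confirms that it satisfies the balance equations. The function name `state_probabilities` and the rates used are assumptions chosen for the example.

```python
def state_probabilities(lam, mu, capacity):
    """State probabilities P_0 .. P_N from the slide's closed form
    P_n = (rho - 1) / (rho**(N + 1) - 1) * rho**n, with rho = lam / mu."""
    rho = lam / mu
    if rho == 1.0:
        # Limit of the closed form when rho = 1: all states equally likely.
        return [1.0 / (capacity + 1)] * (capacity + 1)
    p0 = (rho - 1.0) / (rho ** (capacity + 1) - 1.0)
    return [p0 * rho ** n for n in range(capacity + 1)]

# Assumed example rates: lam = 2 arrivals/s, mu = 4 departures/s, N = 3.
probs = state_probabilities(2.0, 4.0, 3)
print("P_n:", probs)
print("sum:", sum(probs))

# Balance equations lam * P_{n-1} = mu * P_n hold for every n >= 1.
for n in range(1, len(probs)):
    print(n, 2.0 * probs[n - 1], 4.0 * probs[n])
```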
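  • 11. Deterministic Approach / Stochastic Approach
      [Figure: birth-death state diagram with states n = 0, 1, 2, …, n−1, n and state probabilities $P_0, P_1, P_2, \dots, P_{n-1}, P_n$; arrival rate λ, departure rate µ]
      Deterministic approach:
      • $\lambda_{n-1} P_{n-1} = \mu_n P_n$
      • $P_n = (\lambda_{n-1}/\mu_n)(\lambda_{n-2}/\mu_{n-1}) \cdots (\lambda_0/\mu_1) P_0$
      • $\sum_{n=0}^{N} P_n = 1$
      • $\rho_n = \lambda_{n-1}/\mu_n$
      Stochastic approach:
      • instantaneous arrival and departure rates: $\lambda_i^{i} = 1/\Delta t_i^{A}$, $\mu_i^{i} = 1/\Delta t_i^{D}$
      • $\lambda_i = E\{\lambda_i^{i}\} = E\{1/\Delta t_i^{A}\}$, $\mu_i = E\{\mu_i^{i}\} = E\{1/\Delta t_i^{D}\}$
      • instantaneous probability of state: $P_n^{i} = \dfrac{\Delta t_n^{D}}{\Delta t_{n-1}^{A}} \cdots \dfrac{\Delta t_1^{D}}{\Delta t_0^{A}}\, P_0$
      • expected probability of state converges to deterministic result: $P_n = (\lambda_{n-1}/\mu_n)(\lambda_{n-2}/\mu_{n-1}) \cdots (\lambda_0/\mu_1) P_0$
      [Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times $\Delta t_1, \Delta t_2, \dots, \Delta t_N$]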
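The instantaneous probability of state can be sketched directly from observed inter-arrival and inter-departure times, since each ratio of rates equals a ratio of time intervals. This is a hedged sketch: the function name, the list layout and the normalization step (solving for P_0 by requiring the probabilities to sum to one) are assumptions made for illustration.

```python
def instantaneous_state_probabilities(dt_arrivals, dt_departures):
    """Instantaneous state probabilities P_0 .. P_N.

    dt_arrivals[k]   is the inter-arrival time for the step out of state k (k = 0..N-1)
    dt_departures[k] is the inter-departure time for the step out of state k+1 (k = 0..N-1)
    Each step multiplies by dt_departure / dt_arrival, which is the ratio of the
    instantaneous arrival rate (1 / dt_arrival) to the departure rate (1 / dt_departure).
    """
    weights = [1.0]                      # unnormalized P_0
    for dt_a, dt_d in zip(dt_arrivals, dt_departures):
        weights.append(weights[-1] * dt_d / dt_a)
    total = sum(weights)                 # normalize so the probabilities sum to 1
    return [w / total for w in weights]

# Assumed constant intervals: 0.5 s between arrivals (rate 2), 0.25 s between departures (rate 4).
probs = instantaneous_state_probabilities([0.5, 0.5, 0.5], [0.25, 0.25, 0.25])
print(probs)
```

With constant intervals every ratio equals λ/µ, so the result reduces to the deterministic distribution for the same rates.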
  • 12. Probability of observing a particular sequence of events
      [Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times $\Delta t_1, \Delta t_2, \dots, \Delta t_N$]
      • when inter-arrival times are independent, the expectation of the product is the product of the expectations
      • Let $Z = P(\Delta t_1, \Delta t_2, \dots, \Delta t_N)$
      • $P(Z) = \prod_{i=1}^{N} P(\Delta t_i)$
      • $E\{P(Z)\} = E\left\{\prod_{i=1}^{N} P(\Delta t_i)\right\} = \prod_{i=1}^{N} E\{P(\Delta t_i)\}$
      • the probability of a sequence is the product of the individual probabilities of observing a particular inter-arrival time when inter-arrival times are independent, consistent with an M/M/1 scenario
  • 13. ANOMALY DETECTION IN AN M/M/1 QUEUE SYSTEM CHARACTERIZING PERFORMANCE OF A SOFTWARE COMPONENT
  • 14. Anomaly Detection Scheme
      • A system of components
      • Each component a queue / server
      [Figure: chain of components Comp 1, Comp 2, Comp 3, …, Comp N]
      • Component Load → Distribution of No of Messages in System → Arrivals – Departures in $\Delta T$
      • load trigger threshold
      [Figure: transition matrix M(I) with rows State N and columns State N+1 over Comp 1, Comp 2, Comp 3, …, Comp N]
      • Dispersion of anomaly across component sy
  • 15. Estimating Load on a Software Component
      • Treat system as a network of components
          • inter-arrival times help to characterize the performance best
      • Model each component as queue – server system
          • Queue – buffering messages into the component
          • Server – processing all messages within a component
      • No of messages in “system” (in queuing parlance)
          • those waiting and in service – difference between arrivals and departures
          • account for multiple queues within a component
      • Common approach - threshold based alert system
      • Thresholds commonly measure performance - at
          • component level
          • system level
      • Typically Thresholds use – latencies, queue lengths,
  • 16. Performance Measures - Software Component
      • Variation in the number of messages in “system” (in queuing parlance)
      • Performance measures –
          • Variance, Mean - of messages in the system
          • Variance / Mean - of messages in the system
      • Estimate Performance measure from the Distribution of
          • no of messages
          • Variance / Mean
      • Threshold setting –
          • detect an outlier
          • a certain number of standard deviations from mean
      • The time behavior of the distribution in the arrivals and departures will imp
          → envision time dependent thresholds
  • 17. Characterizing Variation in the load
      • No of events in the system at the end of a common time interval $\Delta T$ is the difference between those that arrive and those that depart
      • total number of arrivals in time interval $\Delta T$ is $N_A$
          $Z_A = \Delta T = \Delta t_1^{A} + \Delta t_2^{A} + \dots + \Delta t_N^{A}$
      • total number of departures in time interval $\Delta T$ is $N_D$
          $Z_D = \Delta T = \Delta t_1^{D} + \Delta t_2^{D} + \dots + \Delta t_N^{D}$
      • number of arrivals associated with the composition of $N_A$ events in time interval $\Delta T$
          $\hat{N}_A = k_A \dfrac{\sqrt{\mathrm{Var}(Z_A)}}{\mu_{\Delta t,A}} = k_A \dfrac{\sqrt{N_A \sigma_A^2}}{\mu_{\Delta t,A}}$
      • number of departures associated with the composition of $N_D$ events in time interval $\Delta T$
          $\hat{N}_D = k_D \dfrac{\sqrt{\mathrm{Var}(Z_D)}}{\mu_{\Delta t,D}} = k_D \dfrac{\sqrt{N_D \sigma_D^2}}{\mu_{\Delta t,D}}$
      • average number of events in the system at the end of time interval $\Delta T$
          $\bar{N} = E\{N_A\} - E\{N_D\}$
      • variance in the number of events in the system at the end of time interval $\Delta T$ (the sum form holds when arrivals and departures are uncorrelated)
          $\mathrm{Var}\{N_A - N_D\} = \mathrm{Var}\{N_A\} + \mathrm{Var}\{N_D\}$
      • The variance arises due to the contribution of the individual uncertainties associated with the individual random variables that compose the sum $\Delta T$
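A minimal sketch of how these relations combine into a load estimate for one component. It assumes that the expected count in a window is the window length divided by the mean inter-arrival time, that the uncertainty-in-events estimate with k = 1 is treated as the standard deviation of the count, and that arrivals and departures are uncorrelated. The function names and input values are hypothetical.

```python
import math

def count_stats(window, mean_dt, var_dt, k=1.0):
    """Expected number of events in the window and its associated uncertainty.

    n     = window / mean_dt                (from E(Z) = N * mu with Z = window)
    n_hat = k * sqrt(n * var_dt) / mean_dt  (uncertainty in Z expressed in events)
    """
    n = window / mean_dt
    n_hat = k * math.sqrt(n * var_dt) / mean_dt
    return n, n_hat

def number_in_system(window, arrivals, departures):
    """Mean and variance of (arrivals - departures) over a common window.

    arrivals / departures are (mean, variance) pairs of the inter-event times.
    Assumes arrivals and departures are uncorrelated, so the variances add.
    """
    n_a, n_hat_a = count_stats(window, *arrivals)
    n_d, n_hat_d = count_stats(window, *departures)
    mean = n_a - n_d
    variance = n_hat_a ** 2 + n_hat_d ** 2
    return mean, variance

# Assumed inputs: 10 s window, arrivals every 0.5 s on average, departures every 0.8 s.
mean_load, var_load = number_in_system(10.0, (0.5, 0.25), (0.8, 0.64))
print("mean number in system:", mean_load)
print("variance of number in system:", var_load)
```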
  • 18. Components
      • Model the anomaly state (yes 1 / no 0) at each component - interface
      • Track anomalies across system and across time via a transition matrix (M)
      • Update transition matrix entries at each change of state
      • the difference between matrix M(I+1) and M(I) will provide system state at M(I-1) and also the
      • The transition matrix gives insight into how
      [Figure: transition matrices M(I) and M(I+1), each with rows State N and columns State N+1 over Comp 1, Comp 2, Comp 3, …, Comp N; between M(I) and M(I+1) an entry in the Comp 1 row increases from 1 to 2; annotations: update when system state changes, record anomaly on a link-component]
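One way the transition-matrix bookkeeping might look in code is sketched below. The slide does not spell out the update rule, so the rule used here is an assumption: on every change of system state, the entry (i, j) is incremented for each component i that was anomalous in the previous state and each component j that is anomalous in the new state. The function name and the three-component example are hypothetical.

```python
def update_transition_matrix(matrix, previous, current):
    """Update anomaly transition counts in place.

    previous / current are lists of 0/1 anomaly flags, one per component.
    Rows index components in the previous state, columns in the new state.
    Nothing is recorded when the system state has not changed.
    """
    if previous == current:
        return matrix
    for i, was_anomalous in enumerate(previous):
        for j, is_anomalous in enumerate(current):
            if was_anomalous and is_anomalous:
                matrix[i][j] += 1
    return matrix

# Three components; component 1 is anomalous, then the anomaly also appears at component 2.
matrix = [[0] * 3 for _ in range(3)]
update_transition_matrix(matrix, [1, 0, 0], [1, 1, 0])
update_transition_matrix(matrix, [1, 1, 0], [1, 0, 0])
for row in matrix:
    print(row)
```

Differencing two successive matrices then shows which entries changed, which is the kind of comparison between M(I+1) and M(I) that the slide refers to.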
  • 19. RESULTS: ANOMALY DETECTION IN AN M/M/1 QUEUE SYSTEM TO CHARACTERIZE PERFORMANCE OF A SOFTWARE COMPONENT
  • 20. Test Scenarios and Validation of model
      Test Scenarios:
      • Different offered load and service discipline
      • Poisson arrivals (exponential inter-arrival time with independent increments)
      • Exponential service time (independent increments)
      Summary Results:
      • Behavior of number in system
      • Average number in system = difference in mean arrivals and departures
      • Variance of number in system = sum of variances in arrivals and departures

      | Inter-Arrival Time (s)                  | Scenario I | Scenario II | Scenario III |
      |-----------------------------------------|------------|-------------|--------------|
      | Arrivals - Mean Inter-Arrival Time      | 0.50       | 0.51        | 0.79         |
      | Arrivals - Variance Inter-Arrival Time  | 0.26       | 0.26        | 0.62         |
      | Departures - Mean Inter-Arrival Time    | 0.50       | 0.80        | 1.00         |
      | Departures - Variance Inter-Arrival Time| 0.24       | 0.64        | 1.05         |

      | Number Over Window                      | Scenario I | Scenario II | Scenario III |
      |-----------------------------------------|------------|-------------|--------------|
      | Mean Arrivals                           | 19.93      | 19.79       | 12.67        |
      | Variance in Arrivals                    | 19.35      | 18.52       | 11.36        |
      | Mean Departures                         | 20.05      | 12.39       | 9.99         |
      | Variance in Departures                  | 18.71      | 10.98       | 10.41        |
      | Mean (Arrivals - Departures)            | -0.13      | 7.39        | 2.68         |
      | Variance (Arrivals - Departures)        | 37.57      | 29.45       | 21.38        |
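The two summary relations can be checked against the reported figures with simple arithmetic. The sketch below uses only the values from the table on this slide; the dictionary layout and the helper name `predicted_difference_stats` are illustrative.

```python
# Values copied from the "Number Over Window" table on this slide.
scenarios = {
    "I":   {"mean_arr": 19.93, "var_arr": 19.35, "mean_dep": 20.05, "var_dep": 18.71,
            "mean_diff": -0.13, "var_diff": 37.57},
    "II":  {"mean_arr": 19.79, "var_arr": 18.52, "mean_dep": 12.39, "var_dep": 10.98,
            "mean_diff": 7.39, "var_diff": 29.45},
    "III": {"mean_arr": 12.67, "var_arr": 11.36, "mean_dep": 9.99, "var_dep": 10.41,
            "mean_diff": 2.68, "var_diff": 21.38},
}

def predicted_difference_stats(s):
    """Mean and variance of (arrivals - departures) predicted from the
    separate arrival and departure statistics."""
    return s["mean_arr"] - s["mean_dep"], s["var_arr"] + s["var_dep"]

for name, s in scenarios.items():
    pred_mean, pred_var = predicted_difference_stats(s)
    print(f"Scenario {name}: mean predicted {pred_mean:.2f} vs reported {s['mean_diff']:.2f}; "
          f"variance predicted {pred_var:.2f} vs reported {s['var_diff']:.2f}")
```

The predicted means agree with the reported ones to within rounding, and the summed variances fall within about two percent of the reported variance of the difference in each scenario.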
  • 21. Arrivals / Departures Process • Exponential service time with mean 0.5 seconds • Distribution of number of arrivals in an interval of 10s • The number of arrivals equivalent to the sum of a number of inter-arrival times • which is a sum of random variables • the sum converges to a normal
  • 22. Characterizing component load • Use the distribution of the average number of events in the system to characterize the load • Use the variance in the number of events in the system to set thresholds that trigger at a probability level
  • 23. Variation in the Variance
      • Use the cumulative distribution of the variance to characterize how the variance varies with window length
      • Longer windows feature a larger number of events, each event an inter-arrival time random variable
      • The uncertainty scales with the number of random variables in the sum
      • Longer intervals have larger uncertainty associated with the composition of the time interval
          • rightward shifting, flattening curves
      $Z = \Delta t_1 + \Delta t_2 + \dots + \Delta t_N$
      $\hat{N} = \dfrac{\sqrt{\mathrm{Var}(Z)}}{\mu_{\Delta t}} = \dfrac{\sqrt{N\sigma_{\Delta t}^2}}{\mu_{\Delta t}}$
  • 24. IMPROVEMENTS AND FUTURE WORK MESSAGE SCHEDULING AND THRESHOLD OPTIMIZATION
  • 25. SCHEDULING AND IMPACT ON PERFORMANCE • Introduce load balancing to intelligently route messages – • Particularly in components with multiple queues • Assign messages • to queue with lowest load • to queue that is most likely to process it fastest / most efficiently • Characterizing • processing time of messages as a function of • Type of messages – and expected processing time • messages in the queue … • Model inter-arrival times – on a per queue basis – • see appendix: on relating events and time taken to observe them • Account for time dependence of statistics
  • 26. ALERT THRESHOLDS OPTIMIZATION • Critical Stats guide uses a fixed set of thresholds • Consider component load stat – use variation of number of messages in • stat – based on existing / recorded measurements • Performance at component level – • irrespective of input conditions • based on maximum design spec of component • depending on input conditions – traffic / trading / time dependent • set thresholds to account for behavior that also depends on • expected / normal traffic • Determine threshold values based on Normal / Abnormal behavior • amount of load that is historically observed • Consider time based thresholds – • if feasible – as offered load is time varying • Tune anomaly threshold – based on time varying load
  • 29. EVENT INTER-ARRIVAL TIME
      • Introduce a Feature to characterize the “Time property” in the Event based Model
      • Each Event has a time stamp, and between Events there is an Event “Inter-Arrival time”
      • Modeling this “time interval” will give insights into “Time Patterns” of the Events in characterizing Trading behavior
      • Natural to consider basic statistics related to Inter-Arrival Time
          • Descriptive Statistics – means, variances, Higher Order Statistics
          • But they don’t necessarily capture the characteristics in the pattern of Event Inter-Arrival Times
      • Also fitting Distributions and estimating their characteristics may not be very viable / reliable
          • Data dependent, too little data to estimate, degree of fit issues
      [Figure: sequence of events E1, E2, E3, E4, …, EN-1, EN (Event No) with event types A, B, C, B, …, A, C (Event Type), separated by inter-arrival times t1, t2, t3, …, tN-1]
  • 30. MODELING THE EVENT INTER-ARRIVAL TIME
      • This Time Series captures the time patterns in the placing of Market Orders and Trading Events
      • We characterize and quantify these patterns through Statistical Analysis that captures its important properties
          • The Randomness in the Event Inter-Arrival times – via Entropy
          • Autocorrelation – measures degree of correlation between samples of inter-arrival times
      [Figure: sequence of events E1, E2, E3, E4, …, EN-1, EN (Event No) with event types A, B, C, B, …, A, C (Event Type), separated by inter-arrival times t1, t2, t3, …, tN-1; below it, the Time Series of Event Inter-Arrival Time t1, t2, t3, …, tN-1 against the sample number of the time series 1, 2, 3, …, N-2, N-1; ti is the event inter-arrival time]
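A short sketch of the two statistics named on this slide, computed on an inter-arrival time series built from event timestamps. The slides do not say how entropy is estimated; the histogram-based Shannon entropy, the bin width and the function names are assumptions made for illustration.

```python
import math
from collections import Counter

def inter_arrival_times(timestamps):
    """Time series of differences between consecutive event timestamps."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]

def shannon_entropy(values, bin_width):
    """Entropy (in bits) of the values after binning them into a histogram.
    The bin width is a free parameter and affects the result."""
    bins = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return -sum((count / n) * math.log2(count / n) for count in bins.values())

def autocorrelation(values, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(values)
    mean = sum(values) / n
    total_var = sum((v - mean) ** 2 for v in values)
    if total_var == 0:
        return 0.0        # a constant series has no variation to correlate
    cov = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return cov / total_var

# Hypothetical event timestamps (seconds) with an alternating short/long gap pattern.
timestamps = [0, 1, 3, 4, 6, 7, 9, 10, 12]
dts = inter_arrival_times(timestamps)
print("inter-arrival times:", dts)
print("entropy (bits):", shannon_entropy(dts, bin_width=1.0))
print("autocorrelation lag 1:", autocorrelation(dts, 1))
print("autocorrelation lag 2:", autocorrelation(dts, 2))
```

The alternating pattern shows up as a negative lag-1 and positive lag-2 autocorrelation, which the mean and variance alone would not reveal.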
  • 31. A DISTRIBUTION INDEPENDENT OF MEASUREMENT (TIME) WINDOW • Observe the distribution of the time between each pair of events • call it the event inter-arrival time • The distribution of this quantity does not change, as it is not dependent on a window of measurement • purely a function of the event arrival (generative) process • the process will depend on the particular quantity (orders, trades, etc.) we are observing • The underlying distribution however is fixed for a particular data set
  • 32. RELATING NUMBER OF EVENTS OBSERVED TO INTERVALS OF TIME
      [Figure: sequence of events E1, E2, E3, …, EN-1, EN (Event No) separated by inter-arrival times $\Delta t_1, \Delta t_2, \Delta t_3, \dots, \Delta t_N$]
      $Z = \Delta t_1 + \Delta t_2 + \dots + \Delta t_N$
      • Let Z be the sum of N IID random variables drawn from the distribution of the inter-arrival time $\Delta t$
      • Let the mean and variance of the distribution of the inter-arrival time be
          $E(\Delta t) = \mu_{\Delta t}$, $\mathrm{Var}(\Delta t) = \sigma_{\Delta t}^2$
      • Then
          $E(Z) = N\mu_{\Delta t}$, $\mathrm{Var}(Z) = N\sigma_{\Delta t}^2$
      • For large N
      • Z is a random variable and is the time taken to observe N events. Its expected value (average) is E(Z)
      • A measure of the uncertainty in Z (about its mean) is its standard deviation
  • 33. RELATING NUMBER OF EVENTS TO INTERVALS OF TIME
      [Figure: sequence of events E1, E2, E3, …, EN-1, EN (Event No) separated by inter-arrival times $\Delta t_1, \Delta t_2, \Delta t_3, \dots, \Delta t_N$]
      • The uncertainty in Z can be translated into an average number of events
      • The total time and the total number of IID events observed in that time are related probabilistically via the distribution of the inter-arrival time
      • So we may estimate an average number of events associated with this uncertainty
          $\sigma_z^2 = N\sigma_{\Delta t}^2$
          $\hat{N} = \dfrac{\sigma_z}{\mu_{\Delta t}} = \dfrac{\sqrt{N\sigma_{\Delta t}^2}}{\mu_{\Delta t}}$
      • Thus we may set a threshold “T” for the number of events observed in an interval of length $E(Z) = N\mu_{\Delta t}$ to detect outliers
          $T > N + (\text{a factor}) \times \hat{N}$
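The threshold rule can be written as a small helper. This is a sketch under the slide's IID assumption; the function names, the factor of 2 and the example statistics are hypothetical choices, not values from the deck.

```python
import math

def event_count_threshold(n_expected, mean_dt, var_dt, factor):
    """Threshold on the number of events in a window of length n_expected * mean_dt.

    sigma_z = sqrt(N * var_dt)   standard deviation of the window length Z
    n_hat   = sigma_z / mean_dt  that uncertainty expressed as a number of events
    """
    sigma_z = math.sqrt(n_expected * var_dt)
    n_hat = sigma_z / mean_dt
    return n_expected + factor * n_hat

def is_outlier(observed_count, n_expected, mean_dt, var_dt, factor=2.0):
    """Flag a window whose observed event count exceeds the threshold."""
    return observed_count > event_count_threshold(n_expected, mean_dt, var_dt, factor)

# Assumed inter-arrival statistics: mean 0.5 s, variance 0.25 s^2, N = 20 (a 10 s window).
threshold = event_count_threshold(20, 0.5, 0.25, factor=2.0)
print("threshold:", threshold)
print("25 events an outlier?", is_outlier(25, 20, 0.5, 0.25))
print("30 events an outlier?", is_outlier(30, 20, 0.5, 0.25))
```

A larger factor trades fewer false alarms for slower detection; in practice it would be tuned against historical load.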

Editor's Notes

  1. The element by element difference of the prices provides insights in to the underlying random processes …