Dictyogram: a Statistical Approach for the
Definition and Visualization of Network Flow
Categories
David Muelas, Miguel Gordo, José Luis García-Dorado,
Jorge E. López de Vergara
Email: {dav.muelas, jl.garcia, jorge.lopez vergara}@uam.es,
miguel.gordo@estudiante.uam.es
Universidad Autónoma de Madrid
CNSM 2015 – November 2015
Network Health Check
Network managers must monitor the network's vital signs to ensure
that it is healthy:
[Figure: (a) ECG; (b) Dictyogram (normalized version). Panel (b):
horizontal axis from 00:00:00 to 23:20:00, vertical axis from 0 to 1,
legend Cat1 to Cat10.]
But... what exactly is a Dictyogram?
Dictyogram (from δίκτυο, "network" in Greek): a method to
graphically trace network flow behavior over time. Its
graphical results resemble an electrogram of the network, showing
its vital signs.
Introduction
Method definition
Experimental results
Conclusions
Outline
1 Introduction
Context
Our Goals
2 Method definition
Probability integral transform
Modeling CDFs
3 Experimental results
Model evaluation
Dictyogram visualization
4 Conclusions
D. Muelas, M. Gordo, J.L. García-Dorado, J.E. López de Vergara Dictyogram 4
Context
Network flow-based monitoring has proven useful for detecting
network intrusions, malfunctions, and other types of
anomalies.
Unfortunately, network managers have to deal with huge volumes of
measurement data, and interpreting them has become a
challenge.
Data summaries: it is difficult to reach a good trade-off between
detail and simplification, and insufficient data can lead to
restricted or even erroneous conclusions.
Measurements are not the only thing that matters for network
management: applying suitable techniques improves the quality
and depth of the knowledge that can be extracted from
those measurements.
Our Goals
Our proposal is intended to ease network managers' work with a
novel approach to studying the behavior of network flow
characteristics. Our main goal is to define comprehensive
summaries of network flow data:
Our approach is based on the study of the ECDFs (empirical
cumulative distribution functions) of different flow
characteristics, e.g., flow size or duration.
Using those ECDFs, we define flow categories by means of the
probability integral transform, e.g., using decile-delimited
intervals.
As we will see, this approach improves the detection of network
anomalies and the visualization of network state.
Method description
Probability integral transform:
Let $X$ be a continuous random variable with cumulative
distribution function $F_X$. Then $F_X(X)$ follows a uniform
distribution on $[0, 1]$.
[Figure, panels (a) and (b): a probability level $P_i$ on the $[0, 1]$
axis is mapped to the category boundary $C_i = F_X^{-1}(P_i)$.]
Then, we define flow categories from a set of probability
levels, using the CDF of certain flow characteristics.
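As a rough illustration of the step above, the following sketch derives category boundaries from decile probability levels and assigns each flow to a category. The flow sizes are synthetic (log-normal) and every name in it (`sizes`, `bounds`, `categories`) is hypothetical; the boundaries are estimated with empirical quantiles, which is only one possible way to approximate $F_X^{-1}$.

```python
import bisect
import random
import statistics

random.seed(0)
# Synthetic, hypothetical flow sizes in bytes (log-normal); not the authors' data.
sizes = [random.lognormvariate(6.0, 2.0) for _ in range(10_000)]

# Boundaries C_i = F_X^{-1}(P_i) for the deciles P_i = 0.1, ..., 0.9,
# estimated from the sample with empirical quantiles (9 cut points).
bounds = statistics.quantiles(sizes, n=10)

# Category 0..9 of each flow = number of boundaries strictly below its size.
categories = [bisect.bisect_left(bounds, s) for s in sizes]

# Because F_X(X) is uniform, each decile category should hold about 10% of flows.
counts = [categories.count(k) for k in range(10)]
print(counts)
```

When the boundaries come from the same data they are applied to, the categories are balanced by construction; the interesting case is applying boundaries learned on typical traffic to new traffic, where an imbalance hints at a change in the distribution.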
Keep an eye on the hypotheses!
[Figure: the probability integral transform illustrated for (c) a
Gaussian and (d) a Poisson distribution, each with panels (a) and (b).]
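The hypothesis that matters here is continuity: the transform yields a uniform distribution only for a continuous variable. The sketch below contrasts a Gaussian with a Poisson variable by measuring which share of the transformed values falls into each tenth of [0, 1]. The parameters (mean 30 and standard deviation 2 for the Gaussian, rate 2 for the Poisson), the sample size and the helper names are assumptions chosen for illustration only.

```python
import math
import random
from statistics import NormalDist

random.seed(1)
n = 20_000

def poisson_sample(lam):
    # Knuth's multiplication method; adequate for small lam.
    limit, k, p = math.exp(-lam), 0, random.random()
    while p > limit:
        k += 1
        p *= random.random()
    return k

def poisson_cdf(k, lam):
    # P(X <= k) for a Poisson variable with rate lam.
    return sum(math.exp(-lam) * lam**j / math.factorial(j) for j in range(k + 1))

def decile_shares(u_values):
    # Share of transformed values falling in each interval [i/10, (i+1)/10).
    counts = [0] * 10
    for u in u_values:
        counts[min(int(u * 10), 9)] += 1
    return [c / len(u_values) for c in counts]

gauss = NormalDist(mu=30.0, sigma=2.0)
u_gauss = [gauss.cdf(random.gauss(30.0, 2.0)) for _ in range(n)]
u_pois = [poisson_cdf(poisson_sample(2.0), 2.0) for _ in range(n)]

gauss_shares = decile_shares(u_gauss)  # every share close to 0.1
pois_shares = decile_shares(u_pois)    # uneven: the CDF only takes a few values
print(gauss_shares)
print(pois_shares)
```

For the discrete variable the transformed values concentrate on the few levels that the step-shaped CDF can take, so some intervals stay empty while others are overloaded.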
How can we model a CDF?
Glivenko-Cantelli theorem: the ECDF converges to the CDF
as the number of observations increases.
Nonetheless, computational cost increases when we
accumulate all the values of the characteristic under analysis.
Alternative approach: Functional Data Analysis:
Mean Function: $F_X^{\mathrm{mean}} = \frac{1}{n} \sum_{i=1}^{n} F_{X_i}$
Problem: not robust
Functional Depth:
Maximum depth observation.
Median Function (it is the function that maximizes the
functional depth we use).
Problem: more computationally expensive
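The mean function can be sketched as a pointwise average of per-period ECDFs evaluated on a common grid. Everything below is an assumption made for illustration: the data are synthetic, one ECDF per day is used, and the log-spaced grid from 10 to 10^8 bytes is an arbitrary choice.

```python
import bisect
import random

random.seed(2)

def ecdf(sample):
    # Return a function x -> fraction of observations <= x.
    ordered = sorted(sample)
    return lambda x: bisect.bisect_right(ordered, x) / len(ordered)

# Hypothetical data: one sample of flow sizes per day (synthetic log-normal).
days = [[random.lognormvariate(6.0, 2.0) for _ in range(2_000)] for _ in range(7)]
daily_ecdfs = [ecdf(day) for day in days]

# Common evaluation grid (assumed): 50 log-spaced sizes from 10 to 10^8 bytes.
grid = [10 ** (1 + 7 * k / 49) for k in range(50)]

# Mean function: pointwise average of the daily ECDFs on the grid.
mean_cdf = [sum(f(x) for f in daily_ecdfs) / len(daily_ecdfs) for x in grid]
```

A single anomalous day shifts this average at every grid point, which is one way to read the "not robust" remark above.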
Dataset for the evaluation
To assess the advantages of our method, we used a real
dataset:
Flow records from the Spanish Academic Network: more than one
million users, more than 7 years of data.
Exporters: 5 NetFlow exporters at different geographical
locations (all of them in Spain).
Packet-level sampling: rate of one out of every 100 packets.
Period selected for the evaluation of the CDF estimation
methods: 30 days.
Analyzing ECDFs to get a model of the typical behavior
[Figure: CCDFs of flow size. Horizontal axis: flow size (bytes), from
10^1 to 10^8; vertical axis: P(X > x), from 0 to 1. Legend: Mean,
Deepest, Median. Marked points (x, P(X > x)): (40, 0.9), (44, 0.8),
(53, 0.7), (80, 0.6), (149, 0.5), (501, 0.4), (1452, 0.3), (1500, 0.2),
(3000, 0.1).]
Figure: Comparison between observed CCDFs (orange line, no marker)
for Exporter A, and models obtained using the mean (blue line, circles),
deepest (black line, diamonds) and median (red line, triangles) functions.
Empirical comparison (I)
[Figure: five panels, one per exporter (A, B, C, D, E). Horizontal axis
from 0 to 30; vertical axis scales of 10^5 (A), 10^6 (B), 10^7 (C),
10^6 (D) and 10^6 (E). Legend: Mean, Deepest, Median.]
Figure: Evolution of Pearson's test statistic for all exporters (lower
is better).
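For reference, Pearson's test statistic compares observed counts per category with the counts expected under the model. The sketch below assumes ten decile categories, so a well-fitting CDF model would place 10% of the flows in each one; the observed counts are invented for the example and the function name is hypothetical.

```python
def pearson_statistic(observed, expected):
    # Pearson's chi-squared statistic: sum over categories of (O - E)^2 / E.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical per-category flow counts for one day (10 decile categories).
observed = [980, 1010, 1005, 990, 1020, 995, 1000, 1015, 985, 1000]
total = sum(observed)

# If the CDF model fits, each decile category should hold 10% of the flows.
expected = [total / 10] * 10

stat = pearson_statistic(observed, expected)
print(stat)
```

The closer the observed counts are to the expected ones, the smaller the statistic, which matches the "lower is better" reading of the figure.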
Empirical comparison (II)
Table: Summary of the evaluation of the different methods to estimate
the CDF.

Exporter  Method           # Best
A         Mean function    0
          Deepest obs.     3
          Median function  25
B         Mean function    0
          Deepest obs.     6
          Median function  22
C         Mean function    20
          Deepest obs.     8
          Median function  0
D         Mean function    0
          Deepest obs.     23
          Median function  5
E         Mean function    0
          Deepest obs.     28
          Median function  0
Final visualization of Dictyogram
[Figure: three panels, (a) Mean, (b) Deepest Observation, (c) Median.
Horizontal axis: time of day (03:00:00 to 21:00:00); vertical axis:
concurrent flows for each category (0 to 4 × 10^4); annotations 1 and 2
in each panel.]
Figure: Dictyogram representation of $f_i(t)$, with the respective size
intervals delimited by the deciles given by (a) the mean, (b) the deepest
observed ECDF, and (c) the median.
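A minimal sketch of how such per-category series could be computed is shown below, assuming that $f_i(t)$ counts the flows of category $i$ that are active at time $t$. The flow records, the category boundaries, the sampling step and the function name `dictyogram` are all hypothetical; a normalized version is derived by dividing each count by the total at that instant.

```python
import bisect

# Hypothetical flow records: (start_s, end_s, size_bytes).
flows = [(0, 50, 40), (10, 130, 700), (60, 200, 2_000), (100, 110, 45), (150, 400, 90)]
# Hypothetical ascending category boundaries (9 boundaries -> 10 categories).
bounds = [40, 44, 53, 80, 149, 501, 1452, 1500, 3000]

def dictyogram(flows, bounds, t_start, t_end, step):
    # series[k][i] = number of flows of category i active at time t_start + k * step.
    times = list(range(t_start, t_end, step))
    series = [[0] * (len(bounds) + 1) for _ in times]
    for start, end, size in flows:
        cat = bisect.bisect_left(bounds, size)
        for k, t in enumerate(times):
            if start <= t < end:
                series[k][cat] += 1
    return times, series

times, series = dictyogram(flows, bounds, 0, 300, 60)

# Normalized version: share of each category among the flows active at each instant.
normalized = [[c / max(sum(row), 1) for c in row] for row in series]
```

Stacking the columns of `series` (or `normalized`) over time gives the kind of plot described in the caption; flows shorter than the sampling step can be missed, so the step is a resolution trade-off.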
Final visualization of Dictyogram
[Figure: zoomed Dictyogram. Horizontal axis: time of day (00:00:00 to
23:20:00); vertical axis from 0 to 4 × 10^4; annotations 1 and 2.]
Figure: Zoom on the Dictyogram obtained with the median function.
Key remarks
Our method:
Is manager friendly: it provides statistical summaries based
on certain probability levels, which eases the study of the
flows traversing the network.
Links statistical properties to time evolution: it eases the
detection of changes in the statistical properties of the
characteristics under analysis.
Improves network flow data visualization: it lets the manager
control the resolution at which the distribution of network
flow characteristics is visualized.
Future work
We plan to:
Study how to summarize several different network behaviors in
a multivariate uniform distribution, and use other well-known
distributions (and not only uniform) for signatures.
Study the distribution of Pearson's test statistic to detect
anomalous events.
Test the stability of the estimation of the CDF (to define
some criteria to recalibrate the model).
Explore other representations with higher dimensionality.
Thank you!
Questions?
Annex: Functional depth
We use the definition given by:
$MS_{n,H}(x) = \min\{SL_n(x), IL_n(x)\}$ (1)
where
$SL_n(x) = \frac{1}{n\lambda(I)} \sum_{i=1}^{n} \lambda\{t \in I : x(t) \le x_i(t)\}$
$IL_n(x) = \frac{1}{n\lambda(I)} \sum_{i=1}^{n} \lambda\{t \in I : x(t) \ge x_i(t)\}$ (2)
With it, we consider:
Maximum depth observation.
Median Function (it is the function that maximizes the
functional depth we use).
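The depth in Eq. (1) and (2) can be approximated on a discrete grid. The sketch below assumes that all curves are sampled on the same uniformly spaced grid, so that the ratio of Lebesgue measures reduces to a fraction of grid points; the three curves are invented ECDF values, and only the maximum depth observation is computed.

```python
def modified_depth(x, curves):
    # Discretized Eq. (1)-(2): lambda{...} / lambda(I) is approximated by the
    # fraction of grid points where the condition holds (uniform grid assumed).
    n, m = len(curves), len(x)
    sl = sum(sum(1 for a, b in zip(x, c) if a <= b) for c in curves) / (n * m)
    il = sum(sum(1 for a, b in zip(x, c) if a >= b) for c in curves) / (n * m)
    return min(sl, il)

# Hypothetical ECDFs evaluated on a common 4-point grid.
curves = [
    [0.1, 0.3, 0.6, 0.9],
    [0.2, 0.4, 0.7, 1.0],
    [0.3, 0.5, 0.8, 1.0],
]

depths = [modified_depth(c, curves) for c in curves]
# Maximum depth observation: the observed curve with the largest depth.
deepest = curves[depths.index(max(depths))]
print(depths, deepest)
```

In this toy sample the middle curve is the deepest one, since it lies between the other two almost everywhere. Each depth evaluation compares a curve against all the others at every grid point, which is consistent with the higher computational cost noted for the depth-based estimators.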
D. Muelas, M. Gordo, J.L. Garc´ıa-Dorado, J.E. L´opez de Vergara Dictyogram 19

More Related Content

Similar to Dictyogram: a Statistical Approach for the Definition and Visualization of Network Flow Categories

Fault detection and diagnosis for non-Gaussian stochastic distribution system...
Fault detection and diagnosis for non-Gaussian stochastic distribution system...Fault detection and diagnosis for non-Gaussian stochastic distribution system...
Fault detection and diagnosis for non-Gaussian stochastic distribution system...ISA Interchange
 
Multimodal Biometrics Recognition by Dimensionality Diminution Method
Multimodal Biometrics Recognition by Dimensionality Diminution MethodMultimodal Biometrics Recognition by Dimensionality Diminution Method
Multimodal Biometrics Recognition by Dimensionality Diminution MethodIJERA Editor
 
Healthcare deserts: How accessible is US healthcare?
Healthcare deserts: How accessible is US healthcare?Healthcare deserts: How accessible is US healthcare?
Healthcare deserts: How accessible is US healthcare?Data Con LA
 
NS-CUK Journal club: HELee, Review on "Graph embedding on biomedical networks...
NS-CUK Journal club: HELee, Review on "Graph embedding on biomedical networks...NS-CUK Journal club: HELee, Review on "Graph embedding on biomedical networks...
NS-CUK Journal club: HELee, Review on "Graph embedding on biomedical networks...ssuser4b1f48
 
High performance intrusion detection using modified k mean & naïve bayes
High performance intrusion detection using modified k mean & naïve bayesHigh performance intrusion detection using modified k mean & naïve bayes
High performance intrusion detection using modified k mean & naïve bayeseSAT Journals
 
High performance intrusion detection using modified k mean & naïve bayes
High performance intrusion detection using modified k mean & naïve bayesHigh performance intrusion detection using modified k mean & naïve bayes
High performance intrusion detection using modified k mean & naïve bayeseSAT Journals
 
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUES
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUESNEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUES
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUEScscpconf
 
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUES
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUESNEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUES
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUEScsitconf
 
On Tracking Behavior of Streaming Data: An Unsupervised Approach
On Tracking Behavior of Streaming Data: An Unsupervised ApproachOn Tracking Behavior of Streaming Data: An Unsupervised Approach
On Tracking Behavior of Streaming Data: An Unsupervised ApproachWaqas Tariq
 
Medical diagnosis classification
Medical diagnosis classificationMedical diagnosis classification
Medical diagnosis classificationcsandit
 
MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION...
MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION...MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION...
MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION...cscpconf
 
Graphical Structure Learning accelerated with POWER9
Graphical Structure Learning accelerated with POWER9Graphical Structure Learning accelerated with POWER9
Graphical Structure Learning accelerated with POWER9Ganesan Narayanasamy
 
Introduction to MARS (1999)
Introduction to MARS (1999)Introduction to MARS (1999)
Introduction to MARS (1999)Salford Systems
 
Credit Default Swap (CDS) Rate Construction by Machine Learning Techniques
Credit Default Swap (CDS) Rate Construction by Machine Learning TechniquesCredit Default Swap (CDS) Rate Construction by Machine Learning Techniques
Credit Default Swap (CDS) Rate Construction by Machine Learning TechniquesZhongmin Luo
 
Beginner's Guide to Diffusion Models..pptx
Beginner's Guide to Diffusion Models..pptxBeginner's Guide to Diffusion Models..pptx
Beginner's Guide to Diffusion Models..pptxIshaq Khan
 
Detecting Discontinuties in Large Scale Systems
Detecting  Discontinuties in Large Scale SystemsDetecting  Discontinuties in Large Scale Systems
Detecting Discontinuties in Large Scale Systemsharoonmalik786
 

Similar to Dictyogram: a Statistical Approach for the Definition and Visualization of Network Flow Categories (20)

Fault detection and diagnosis for non-Gaussian stochastic distribution system...
Fault detection and diagnosis for non-Gaussian stochastic distribution system...Fault detection and diagnosis for non-Gaussian stochastic distribution system...
Fault detection and diagnosis for non-Gaussian stochastic distribution system...
 
Multimodal Biometrics Recognition by Dimensionality Diminution Method
Multimodal Biometrics Recognition by Dimensionality Diminution MethodMultimodal Biometrics Recognition by Dimensionality Diminution Method
Multimodal Biometrics Recognition by Dimensionality Diminution Method
 
Healthcare deserts: How accessible is US healthcare?
Healthcare deserts: How accessible is US healthcare?Healthcare deserts: How accessible is US healthcare?
Healthcare deserts: How accessible is US healthcare?
 
Csmr10a.ppt
Csmr10a.pptCsmr10a.ppt
Csmr10a.ppt
 
NS-CUK Journal club: HELee, Review on "Graph embedding on biomedical networks...
NS-CUK Journal club: HELee, Review on "Graph embedding on biomedical networks...NS-CUK Journal club: HELee, Review on "Graph embedding on biomedical networks...
NS-CUK Journal club: HELee, Review on "Graph embedding on biomedical networks...
 
report
reportreport
report
 
2. visualization in data mining
2. visualization in data mining2. visualization in data mining
2. visualization in data mining
 
High performance intrusion detection using modified k mean & naïve bayes
High performance intrusion detection using modified k mean & naïve bayesHigh performance intrusion detection using modified k mean & naïve bayes
High performance intrusion detection using modified k mean & naïve bayes
 
High performance intrusion detection using modified k mean & naïve bayes
High performance intrusion detection using modified k mean & naïve bayesHigh performance intrusion detection using modified k mean & naïve bayes
High performance intrusion detection using modified k mean & naïve bayes
 
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUES
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUESNEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUES
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUES
 
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUES
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUESNEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUES
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUES
 
On Tracking Behavior of Streaming Data: An Unsupervised Approach
On Tracking Behavior of Streaming Data: An Unsupervised ApproachOn Tracking Behavior of Streaming Data: An Unsupervised Approach
On Tracking Behavior of Streaming Data: An Unsupervised Approach
 
Medical diagnosis classification
Medical diagnosis classificationMedical diagnosis classification
Medical diagnosis classification
 
MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION...
MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION...MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION...
MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION...
 
CSMR10a.ppt
CSMR10a.pptCSMR10a.ppt
CSMR10a.ppt
 
Graphical Structure Learning accelerated with POWER9
Graphical Structure Learning accelerated with POWER9Graphical Structure Learning accelerated with POWER9
Graphical Structure Learning accelerated with POWER9
 
Introduction to MARS (1999)
Introduction to MARS (1999)Introduction to MARS (1999)
Introduction to MARS (1999)
 
Credit Default Swap (CDS) Rate Construction by Machine Learning Techniques
Credit Default Swap (CDS) Rate Construction by Machine Learning TechniquesCredit Default Swap (CDS) Rate Construction by Machine Learning Techniques
Credit Default Swap (CDS) Rate Construction by Machine Learning Techniques
 
Beginner's Guide to Diffusion Models..pptx
Beginner's Guide to Diffusion Models..pptxBeginner's Guide to Diffusion Models..pptx
Beginner's Guide to Diffusion Models..pptx
 
Detecting Discontinuties in Large Scale Systems
Detecting  Discontinuties in Large Scale SystemsDetecting  Discontinuties in Large Scale Systems
Detecting Discontinuties in Large Scale Systems
 

More from Jorge E. López de Vergara Méndez

On the feasibility of 40 Gbps network data capture and retention with general...
On the feasibility of 40 Gbps network data capture and retention with general...On the feasibility of 40 Gbps network data capture and retention with general...
On the feasibility of 40 Gbps network data capture and retention with general...Jorge E. López de Vergara Méndez
 
Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...
Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...
Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...Jorge E. López de Vergara Méndez
 
Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...
Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...
Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...Jorge E. López de Vergara Méndez
 
MONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOP
MONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOPMONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOP
MONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOPJorge E. López de Vergara Méndez
 
Multimedia flow classification at 10 Gbps using acceleration techniques on co...
Multimedia flow classification at 10 Gbps using acceleration techniques on co...Multimedia flow classification at 10 Gbps using acceleration techniques on co...
Multimedia flow classification at 10 Gbps using acceleration techniques on co...Jorge E. López de Vergara Méndez
 
Evaluating Quality of Experience in IPTV Services Using MPEG Frame Loss Rate
Evaluating Quality of Experience in IPTV Services Using MPEG Frame Loss RateEvaluating Quality of Experience in IPTV Services Using MPEG Frame Loss Rate
Evaluating Quality of Experience in IPTV Services Using MPEG Frame Loss RateJorge E. López de Vergara Méndez
 
Integración semántica de información de distintos repositorios de medidas de red
Integración semántica de información de distintos repositorios de medidas de redIntegración semántica de información de distintos repositorios de medidas de red
Integración semántica de información de distintos repositorios de medidas de redJorge E. López de Vergara Méndez
 

More from Jorge E. López de Vergara Méndez (9)

On the feasibility of 40 Gbps network data capture and retention with general...
On the feasibility of 40 Gbps network data capture and retention with general...On the feasibility of 40 Gbps network data capture and retention with general...
On the feasibility of 40 Gbps network data capture and retention with general...
 
Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...
Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...
Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...
 
Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...
Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...
Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...
 
MONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOP
MONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOPMONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOP
MONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOP
 
Merging heterogeneous network measurement data
Merging heterogeneous network measurement dataMerging heterogeneous network measurement data
Merging heterogeneous network measurement data
 
Multimedia flow classification at 10 Gbps using acceleration techniques on co...
Multimedia flow classification at 10 Gbps using acceleration techniques on co...Multimedia flow classification at 10 Gbps using acceleration techniques on co...
Multimedia flow classification at 10 Gbps using acceleration techniques on co...
 
Evaluating Quality of Experience in IPTV Services Using MPEG Frame Loss Rate
Evaluating Quality of Experience in IPTV Services Using MPEG Frame Loss RateEvaluating Quality of Experience in IPTV Services Using MPEG Frame Loss Rate
Evaluating Quality of Experience in IPTV Services Using MPEG Frame Loss Rate
 
Defining ontologies for IP traffic measurements at MOI ISG
Defining ontologies for IP traffic measurements at MOI ISGDefining ontologies for IP traffic measurements at MOI ISG
Defining ontologies for IP traffic measurements at MOI ISG
 
Integración semántica de información de distintos repositorios de medidas de red
Integración semántica de información de distintos repositorios de medidas de redIntegración semántica de información de distintos repositorios de medidas de red
Integración semántica de información de distintos repositorios de medidas de red
 

Recently uploaded

Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 

Recently uploaded (20)

Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
Decoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in ActionDecoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in Action
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 

Dictyogram: a Statistical Approach for the Definition and Visualization of Network Flow Categories

  • 1. Dictyogram: a Statistical Approach for the Definition and Visualization of Network Flow Categories David Muelas, Miguel Gordo, Jos´e Luis Garc´ıa-Dorado, Jorge E. L´opez de Vergara Email: {dav.muelas, jl.garcia, jorge.lopez vergara}@uam.es, miguel.gordo@estudiante.uam.es Universidad Aut´onoma de Madrid CNSM 2015 – November 2015
  • 2. Network Health Check. Network managers must monitor the network's vital signs to ensure it is healthy. [Figure: (a) ECG; (b) Dictyogram (normalized version), with time of day on the x-axis, values from 0 to 1 on the y-axis, and a legend for categories Cat1 to Cat10.] But... what exactly is a Dictyogram?
  • 3. Dictyogram (from δίκτυο, "network" in Greek): a method to graphically trace network flow behavior over time. Its graphical output resembles an electrogram of the network, showing its vital signs.
  • 4. Outline. 1. Introduction: Context; Our Goals. 2. Method definition: Probability integral transform; Modeling CDFs. 3. Experimental results: Model evaluation; Dictyogram visualization. 4. Conclusions. D. Muelas, M. Gordo, J.L. García-Dorado, J.E. López de Vergara, Dictyogram.
  • 5. Context. Network flow-based monitoring has proven useful for detecting network intrusions, malfunctions, and other types of anomalies. Unfortunately, network managers have to deal with huge volumes of measurement data, and interpreting them has become a challenge. Data summaries: it is difficult to reach a good trade-off between detail and simplification, since insufficient data can lead to restricted or even erroneous conclusions. Measurements are not the only thing that matters for network management: applying suitable techniques improves the quality and depth of the knowledge that can be extracted from them.
  • 6. Our Goals. Our proposal is intended to ease network managers' work through a novel approach to studying the behavior of network flow characteristics. Our main goal is to define comprehensive summaries of network flow data. Our approach is based on the study of the ECDFs of different flow characteristics (e.g., flow size or duration distributions). Using those ECDFs, we define flow categories by means of the probability integral transform (e.g., using decile-delimited intervals). As we will see, this approach improves the detection of network anomalies and the visualization of the network state.
  • 7. Method description. Probability integral transform: let X be a continuous random variable with cumulative distribution function $F_X$. Then $F_X(X)$ follows a uniform distribution on [0, 1]. [Figure: panels (a) and (b), illustrating the category boundaries $C_i = F_X^{-1}(P_i)$ obtained from the probability levels $P_i$.] Then, we define flow categories from a set of probability levels, using the CDF of a given flow characteristic.
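The category definition on this slide can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the function names `category_boundaries` and `category_of`, the empirical-quantile convention (smallest sample whose ECDF value reaches $P_i$), and the synthetic `flow_sizes` are all assumptions made for the example.

```python
from bisect import bisect_left

def category_boundaries(values, num_categories=10):
    """Return C_i = F^{-1}(P_i) for P_i = i / num_categories, i = 1..num_categories - 1.

    F^{-1} is taken as the empirical quantile: the smallest observed value
    whose ECDF value is at least P_i (an assumed convention).
    """
    ordered = sorted(values)
    n = len(ordered)
    bounds = []
    for i in range(1, num_categories):
        # Integer form of ceil(i * n / num_categories) - 1, avoiding float rounding.
        k = (i * n + num_categories - 1) // num_categories - 1
        bounds.append(ordered[k])
    return bounds

def category_of(value, bounds):
    """1-based category index: values <= C_1 fall in category 1, and so on."""
    return bisect_left(bounds, value) + 1

# Hypothetical flow sizes in bytes, used only to exercise the functions.
flow_sizes = list(range(1, 101))
bounds = category_boundaries(flow_sizes)
print(bounds)
print(category_of(10, bounds), category_of(11, bounds), category_of(100, bounds))
```

With tie-free data such as this, every decile category receives the same share of flows. When many flows share the same value, the shares can become unequal, which is where the continuity hypothesis of the transform matters.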
  • 8. Keep an eye on the hypotheses! [Figure: the two-panel plot, (a) and (b), repeated for (c) a Gaussian distribution and (d) a Poisson distribution.]
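The probability integral transform requires a continuous variable, and a short simulation can show what happens when that hypothesis fails. The sketch below is an illustration under assumed parameters (a Gaussian with mean 30 and standard deviation 2, a Poisson with rate 2, and 20000 samples each); the helper names are hypothetical. It applies each distribution's own CDF to its samples and reports the share of transformed values in each decile interval.

```python
import math
import random
from statistics import NormalDist

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam ** j / math.factorial(j) for j in range(k + 1))

def poisson_sample(rng, lam):
    """Draw one Poisson(lam) value (Knuth's multiplication method, small lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def decile_shares(u_values):
    """Share of transformed values falling in each of the ten decile intervals."""
    counts = [0] * 10
    for u in u_values:
        counts[min(9, max(0, math.ceil(u * 10) - 1))] += 1
    return [c / len(u_values) for c in counts]

rng = random.Random(0)      # fixed seed, only for reproducibility
n = 20000                   # illustrative sample size
gauss = NormalDist(30, 2)   # illustrative parameters
gauss_shares = decile_shares([gauss.cdf(rng.gauss(30, 2)) for _ in range(n)])
poisson_shares = decile_shares(
    [poisson_cdf(poisson_sample(rng, 2.0), 2.0) for _ in range(n)]
)

print([round(s, 2) for s in gauss_shares])
print([round(s, 2) for s in poisson_shares])
```

For the Gaussian, every interval holds close to one tenth of the samples. For the Poisson, the transformed values can only take the few values of the CDF at the integers, so some intervals stay empty and others are overloaded.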
  • 9. How can we model a CDF? Glivenko-Cantelli theorem: the ECDF converges to the CDF as the number of observations increases. Nonetheless, the computational cost increases when we accumulate all the values of the characteristic under analysis. Alternative approach, Functional Data Analysis: (1) Mean function: $F_X^{\mathrm{mean}} = \frac{1}{n} \sum_{i=1}^{n} F_{X_i}$. Problem: not robust. (2) Functional depth: maximum depth observation, or median function (the function that maximizes the functional depth we use). Problem: more computationally expensive.
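The mean function can be sketched by evaluating each observation period's ECDF on a common grid and averaging pointwise, so that only one curve per period has to be kept instead of all raw values. The grid, the per-period samples, and the function names below are assumptions chosen for illustration.

```python
from bisect import bisect_right

def ecdf_on_grid(sample, grid):
    """Evaluate the ECDF of `sample` at each point of `grid`."""
    ordered = sorted(sample)
    return [bisect_right(ordered, x) / len(ordered) for x in grid]

def mean_function(curves):
    """Pointwise mean of n curves sampled on the same grid."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]

# Hypothetical samples, one list per observation period (e.g., per day).
periods = [[1, 2, 3, 4], [2, 3, 4, 5], [1, 1, 2, 2]]
grid = [1, 2, 3, 4, 5]

curves = [ecdf_on_grid(sample, grid) for sample in periods]
mean_curve = mean_function(curves)
print(mean_curve)
```

Because the result is a plain average, a single atypical period shifts the whole curve, which is the robustness problem noted on the slide.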
  • 10. Dataset for the evaluation. To assess the advantages of our method, we have used a real dataset. Flow records: Spanish Academic Network, with more than one million users and more than 7 years of data. Exporters: 5 NetFlow exporters at different geographical locations (all of them in Spain). Packet-level sampling: rate of one out of 100 packets. Period selected for the evaluation of the CDF estimation methods: 30 days.
  • 11. Analyzing ECDFs to get a model of the typical behavior. [Figure: P(X > x) against flow size in bytes (logarithmic x-axis, 10^1 to 10^8), with curves for the mean, deepest, and median models. Marked points (x, P(X > x)): (40, 0.9), (44, 0.8), (53, 0.7), (80, 0.6), (149, 0.5), (501, 0.4), (1452, 0.3), (1500, 0.2), (3000, 0.1).] Figure: Comparison between observed CCDFs (orange line, no marker) for Exporter A, and models obtained using the mean (blue line, circles), deepest (black line, diamonds) and median (red line, triangles) functions.
  • 12. Empirical comparison (I). [Figure: one panel per exporter (A to E), x-axis from 0 to 30, y-axis scales of 10^5 (A), 10^6 (B, D, E), and 10^7 (C), with curves for the mean, deepest, and median models.] Figure: Evolution of Pearson's test statistic for all exporters. (Lower is better.)
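Pearson's test statistic compares observed counts with the counts expected under a model. The slides do not spell out how it was computed here, so the sketch below assumes one plausible setup: the observed counts are the number of flows falling in each decile category during one period, and the model expects one tenth of the flows in each. The function name and the counts are hypothetical.

```python
def pearson_statistic(observed_counts, expected_probs):
    """Pearson's chi-squared statistic: sum over categories of (O - E)^2 / E."""
    total = sum(observed_counts)
    return sum(
        (observed - total * prob) ** 2 / (total * prob)
        for observed, prob in zip(observed_counts, expected_probs)
    )

# Hypothetical per-category flow counts for one period (ten decile categories).
observed = [120, 95, 100, 105, 90, 110, 100, 80, 100, 100]
uniform = [0.1] * 10  # each decile category is expected to hold 10% of the flows

statistic = pearson_statistic(observed, uniform)
print(round(statistic, 2))
```

A statistic of zero would mean the period matches the model exactly; larger values indicate that the category shares drift away from the expected ones.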
  • 13. Empirical comparison (II). Table: Summary of the evaluation of the different methods to estimate the CDF (number of times each method was the best, per exporter).
      Exporter | Mean function | Deepest obs. | Median function
      A        | 0             | 3            | 25
      B        | 0             | 6            | 22
      C        | 20            | 8            | 0
      D        | 0             | 23           | 5
      E        | 0             | 28           | 0
  • 14. Final visualization of Dictyogram. [Figure: three panels, (a) Mean, (b) Deepest Observation, (c) Median; x-axis: time of day; y-axis: concurrent flows for each category, up to about 4 x 10^4; each panel carries annotations 1 and 2.] Figure: Dictyogram representation of $f_i(t)$ with their respective size intervals delimited by the deciles given by (a) mean, (b) deepest observed ECDF, and (c) median.
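The series $f_i(t)$ plotted in a Dictyogram can be sketched as the number of flows of category i that are active in each time bin. The record layout `(start_s, end_s, size_bytes)`, the bin length, the three-category boundaries, and the function names are assumptions for this example, not details taken from the slides.

```python
from bisect import bisect_left

def dictyogram_series(flows, bounds, bin_seconds, num_bins):
    """Count concurrent flows per category and time bin.

    flows: iterable of (start_s, end_s, size_bytes) tuples (assumed layout).
    bounds: sorted category boundaries; sizes <= bounds[0] fall in the first category.
    Returns series[i][t], the number of category-i flows active in bin t.
    """
    series = [[0] * num_bins for _ in range(len(bounds) + 1)]
    for start, end, size in flows:
        category = bisect_left(bounds, size)
        first = max(0, int(start // bin_seconds))
        last = min(num_bins - 1, int(end // bin_seconds))
        for t in range(first, last + 1):
            series[category][t] += 1
    return series

def normalize(series):
    """Divide each bin by its total, giving per-category shares (normalized version)."""
    totals = [sum(column) for column in zip(*series)]
    return [[v / total if total else 0.0 for v, total in zip(row, totals)]
            for row in series]

# Hypothetical flows and boundaries, kept tiny so the result is easy to check.
flows = [(0, 50, 40), (30, 130, 500), (70, 80, 1500), (100, 110, 60)]
series = dictyogram_series(flows, bounds=[100, 1000], bin_seconds=60, num_bins=3)
shares = normalize(series)
print(series)
```

Stacking the rows of `series` (or of `shares`) over the time axis gives the layered picture described in the figure caption.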
  • 15. Final visualization of Dictyogram. [Figure: single panel, time of day from 00:00:00 to 23:20:00 on the x-axis, 0 to 4 x 10^4 on the y-axis, with annotations 1 and 2.] Figure: Zoom in on the median.
  • 16. Key remarks. Our method: (1) Is manager friendly: it provides statistical summaries based on certain probability levels, which eases the study of the flows traversing the network. (2) Links statistical properties to time evolution: it eases the detection of changes in the statistical properties of the characteristics under analysis. (3) Improves network flow data visualization: it lets managers control the resolution at which the distribution of network flow characteristics is visualized.
  • 17. Future work. We plan to: (1) Study how to summarize several different network behaviors in a multivariate uniform distribution, and use other well-known distributions (not only the uniform) for signatures. (2) Study the distribution of Pearson's test statistic to detect anomalous events. (3) Test the stability of the CDF estimation (to define criteria for recalibrating the model). (4) Explore other representations with higher dimensionality.
  • 18. Thank you! Questions?
  • 19. Annex: Functional depth. We use the definition given by: $MS_{n,H}(x) = \min\{SL_n(x), IL_n(x)\}$ (1), where $SL_n(x) = \frac{1}{n\lambda(I)} \sum_{i=1}^{n} \lambda\{t \in I : x(t) \le x_i(t)\}$ and $IL_n(x) = \frac{1}{n\lambda(I)} \sum_{i=1}^{n} \lambda\{t \in I : x(t) \ge x_i(t)\}$ (2). With it, we consider: the maximum depth observation; the median function (the function that maximizes the functional depth we use).
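Equations (1) and (2) can be sketched for curves sampled on a common, evenly spaced grid, where the normalized measure $\lambda\{\cdot\}/\lambda(I)$ is approximated by the fraction of grid points satisfying the condition. That discretization, the function names, and the example curves are assumptions; the sketch only covers the maximum depth observation.

```python
def half_region_depth(curve, curves):
    """Approximate MS_{n,H}(x) = min(SL_n(x), IL_n(x)) on a uniform grid."""
    n, m = len(curves), len(curve)
    # SL_n: average fraction of grid points where the curve lies at or below x_i.
    sl = sum(sum(x <= xi for x, xi in zip(curve, other)) for other in curves) / (n * m)
    # IL_n: average fraction of grid points where the curve lies at or above x_i.
    il = sum(sum(x >= xi for x, xi in zip(curve, other)) for other in curves) / (n * m)
    return min(sl, il)

def deepest_observation(curves):
    """Observed curve with maximum depth within the sample."""
    return max(curves, key=lambda c: half_region_depth(c, curves))

# Hypothetical ECDFs evaluated on the same four grid points.
ecdfs = [
    [0.1, 0.3, 0.6, 1.0],
    [0.2, 0.5, 0.8, 1.0],
    [0.4, 0.7, 0.9, 1.0],
]
depths = [half_region_depth(c, ecdfs) for c in ecdfs]
deepest = deepest_observation(ecdfs)
print(depths, deepest)
```

The curve lying between the other two obtains the largest depth, which matches the intuition of a functional median being the most central curve of the sample.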