SlideShare a Scribd company logo
1 of 65
Download to read offline
Walking on AI/ML for
Networking
Ph.D Oscar Mauricio Caicedo
Grupo de Ingeniería Telemática
Universidad del Cauca
Colombia
Outline
● Unsupervised Learning for Heavy-Hitters Threshold Estimation
● Supervised Learning for Multipath Routing in Software-Defined Data Center
Networks
● Reinforcement Learning for Dynamic and Intelligent Routing
● Research Direction
3
Unsupervised Learning for Heavy-Hitters Threshold
Estimation
4
Unsupervised Learning
Learning approach Training dataset Problems aimed
Unsupervised Unlabeled Clustering
Clustering
(Boutaba et al., 2018)
5
● Duration – time between the first and last packet of the flow
● Size – total number of bytes transmitted in the flow
● Rate – given by the size divided by the duration
● Burstiness – characterization of burst in a flow over time t
(Duque-Torres et al., 2019)
Heavy-Hitters Classification Schemes
6
● Outlier and anomaly detection
● Load balancing
● Alleviating network congestion
(Duque-Torres et al., 2019)
Need for Heavy-Hitter Flows Detection
● HH detections is far from being
obvious
● Threshold estimation is a major
challenge
7
(Duque-Torres et al., 2019)
Knowledge Discovery Process
Threshold Estimation based on Knowledge Discovery
Process
8
(Duque-Torres et al., 2019)
● Can the Knowledge Discovery Process (KDP) to shed new light
on threshold estimation in HH detection?
● Can we determine an optimal threshold or set of thresholds
that would optimally separate the flows into HHs and non-HHs
in a variety of network and traffic conditions?
Threshold Estimation based on Knowledge Discovery
Process
9
● Duration → tortoises and dragonflies
● Sizes → elephants and mice
● Rate → cheetahs and snails
● Burstiness → porcupines and stingrays
Understanding the Problem
10
(Duque-Torres et al., 2019)
Understanding the Problem
11
Work Citations Year Appeared In Threshold Value
Curtis et al. 187 2011 IEEE INFOCOM Size
128KB, 1MB and
100MB
Chiesa et al. 29 2016 IEEE/ACM Trans. Netw. Bandwidth 10%
Moshref et al. 47 2013 ACM HotSDN Bandwidth 10%
Liu et al. 19 2013 IEEE Commun. Lett. Rate 1-10 Mbps
Trestain et al. 14 2013 IFIP/IEEE IM Rate Not specified
Lin et al. 9 2014 IEEE GLOBECOM Size 100 MB
Bi et al. 8 2013 IEEE GLOBECOM Size Adaptive
Understanding the Problem
(Duque-Torres et al., 2019)
12
TCP
(Duque-Torres et al., 2019)
Understanding the Data
Dataset Count Average
UNIV1
UNIV2
CAIDA2016
CAIDA2018
6886055
6092
9017007
8225731
710.74
198.84
517.31
1021.20
UDP
Count Average
1453303
9869 978
920417
1618524
390.74
758.24
433.62
674.91
TOTAL
Count Average
9999617
9 99956
9994197
9982640
582.60
749.34
509.13
959
Datasets
13
Feature name Feature description Direction
min_ps Packet with minimum size in the flow F & B
max_ps Packet with maximum size in the flow F & B
avg_ps Average packet size F & B
std_ps Standard deviation of packet size F & B
min_piat Minimum packet inter-arrival time F & B
max_piat Maximum packet inter-arrival time F & B
avg_piat Average packet inter-arrival time F & B
std_piat Stand. dev. of packet inter-arrival time F & B
octTotCount Number of transmitted bytes in the flow F & B
pktTotCount Number of transmitted pkts in the flow F & B
flowDur Duration of the flow (in seconds)
Data Preparation
(Duque-Torres et al., 2019)
14
● K-Means was used due to its simplicity, speed, and accuracy
● The optimal number of clusters was defined by Silhouette method
Data Mining
Value Interpretation
Si = 1 The samples are very well clustered
0.5 < Si < 0.7 The samples have a reasonable structure for clustering
Si = 0.5 The samples are probably placed in the wrong cluster
0.26 < Si < 0.5 The samples do not have a substantial structure for clustering
Si = 0 The samples lies between two cluster
Si < 0 The samples are placed in the wrong cluster
(Duque-Torres et al., 2019)
15
Data Mining
Silhouette Analysis
(Duque-Torres et al., 2019)
16
Data Mining
● K-means along the
two first principal
components
● k = 2
(Duque-Torres et al., 2019)
Dataset Class I Class II Total
UNIV1
UNIV2
CAIDA2016
CAIDA2018
293 962
18 056
451 696
563 869
8
1
9
3
293 970
18 057
451 705
563 872
17
Data Mining
● K-means along the
two first principal
components
● k = 4
(Duque-Torres et al., 2019)
18
max min average
11 394
18 180
33 043
14 258
1
2
1
1
16.09
103.81
16.78
12.4
281 168
28 4 648
28 465
44 251
149 047
108 230
20775
14 784
219 806
194 204
25 357
23 489
85 848
118 060
66 743
14 658
14 987
32 590
47 373
2 513
29 761
61 128
57 058
5 162
79 661
40 486
24 140
190 733
1 418
8 256
904
98 909
4.42
13.10
3.52
191.33
Flow Size [MB]
K Dataset flows
I
UNIV1
UNIV2
CAIDA2016
CAIDA2018
293 709
17 979
1451 051
563 472
II
UNIV1
UNIV2
CAIDA2016
CAIDA2018
6
6
7
21
III
UNIV1
UNIV2
CAIDA2016
CAIDA2018
26
32
2
376
IV
UNIV1
UNIV2
CAIDA2016
CAIDA2018
229
36
644
3
max min average
1.88
6.50
1.34
3.73
0.000032
0.000064
0.000028
0.000029
0.00847
0.00212
0.00421
0.00948
194.11
250.30
42.65
66.22
145.39
107.610
31.14
21.45
167.37
178.96
37.60
34.08
86.80
95.63
100.05
19.81
18.15
32.53
70.93
3.76
30.35
57.94
85.49
7.46
17.64
28.87
22.66
277.62
1.93
7.081
1.35
143.45
4.42
13.10
3.52
191.33
Number of Packets Flow Duration [s]
max min average
2 643.73
455.32
20.202
22.40
0
0
0
0
6.76
33.87
3.23
2.39
2 643.13
455.29
20.19
22.39
151.97
393.25
15.8
7.04
1 061.68
428.88
15.86
19.45
2 641.51
454.52
20.20
22.40
2.27
48.34
10.19
0.26
182.62
402.615
20.19
17.25
2 643.89
454.52
20.202
22.29
1.39
48.34
0.028
21.09
507.30
402.61
15.21
21.89
Evaluating the Discovered Knowledge
19
Silhouette Analysis - Ambiguous Flows
(Class I, Ambiguous Flows) (Duque-Torres et al., 2019)
Evaluating the Discovered Knowledge
20
(Duque-Torres et al., 2019)
Ambiguous Flows (Class I)
Evaluating the Discovered Knowledge
max min average
11 394
18 180
7 601
14 258
1
2
1
1
13.05
108.81
14.7
13.53
281 168
28 4 648
28 465
44 251
149 047
108 230
20775
14 784
1 275
17 376
834
1 356.28
Flow Size [MB]
K Dataset flows
I
UNIV1
UNIV2
CAIDA2016
CAIDA2018
293 002
17 979
449 904
562 273
II
UNIV1
UNIV2
CAIDA2016
CAIDA2018
707
36
1 147
1 199
max min average
0.456
6.5
0.324
0.922
0.000032
0.000064
0.000028
0.000029
0.00629
0.0214
0.00660
0.00558
145.39
28.87
1.34
3.734
1.88
107.610
31.14
21.45
0.306
178.96
37.60
34.08
Number of Packets Flow Duration [s]
max min average
2 643.73
455.32
20.20
22.40
0
0
0
0
6.09
33.87
3.29
2.36
2 643.05
453.71
20.20
22.403
0.379
16.06
0.05
0.030
0.379
339.58
15.65
15.52
For TCP traffic
○ Flow size = 6 kB
○ Packet count = 14 21
● There is no threshold that can
unambiguously classify HH flows
● Significance of network and traffic
characteristics
○ Flow size = 6 kB
○ Packet count = 14
(Duque-Torres et al., 2019)
Evaluating the Discovered Knowledge
22
● Open challenges
○ The time required to collect enough flow details for
reliable classification requires further research
■ Investigate the usefulness of per-flow packet size
distributions for accurate classification
■ Develop a threshold-less approach for HH
classification
Evaluating the Discovered Knowledge
23
Supervised Learning for Multipath Routing in
Software-Defined Data Center Networks
24
Supervised Learning
Learning approach Training dataset Problems aimed
Supervised Labeled
Classification and regression
Semi-supervised Incomplete labels
Classification Regression
(Boutaba et al., 2018) 25
● Data Center Networks (DCNs)
○ Significant bandwidth capacity
○ Dynamic traffic
○ Coexistence of many kind of flows
■ Small, short-lived flows → mice
■ Few large, long-lived flows → elephants
(Dixit et al., 2013) (Jain et al., 2009)
Multipath Routing in Software-Defined Data Center
Networks
26
● Equal-Cost MultiPath (ECMP)
○ The most prevalent multipath routing mechanism in DCNs
○ Each switch splits flows over the output ports applying a
function on packet headers
■ E.g., a hash function on the 5-tuple header
(Dixit et al., 2013) (Jain et al., 2009)
Multipath Routing in Software-Defined Data Center
Networks
27
Input Port
Output 1
Output 2
ABCD
hash
AD
BC
Hot-spot
A + D << B + C
Flows
(Dixit et al., 2013) (Jain et al., 2009)
● ECMP causes hot-spots in DCNs ⟶ low throughput and high
delay
Multipath Routing in Software-Defined Data Center
Networks
28
● Software-Defined Networking (SDN) for facing ECMP
limitations
○ A global view of the network
○ A logically centralized controller to dynamically make and
install routing decisions
● A DCN using SDN → SDDCN
(Boutaba et al., 2018)(Tang et al., 2017) (Chao et al., 2016) (Poupart et al., 2016) (Curtis et al., 2011a-b) (Al-Fares et al., 2010)
Multipath Routing in Software-Defined Data Center
Networks
29
● Reactive SDN-based multipath routing
○ Dynamically rescheduling of elephant flows
○ Mouse flows are handle by default routing (e.g., ECMP)
○ Discriminate flows using static thresholds
○ Hot-spots may occur before elephants are detected
(Boutaba et al., 2018)(Tang et al., 2017) (Chao et al., 2016) (Poupart et al., 2016) (Curtis et al., 2011a-b) (Al-Fares et al., 2010)
Multipath Routing in Software-Defined Data Center
Networks
30
● Proactive SDN/ML-based multicast routing
○ ML-based methods at the controller for proactively identifying
elephants
○ Central collection
■ Per-flow data
■ Sampling-based data
(Boutaba et al., 2018)(Tang et al., 2017) (Chao et al., 2016) (Poupart et al., 2016) (Curtis et al., 2011a-b) (Al-Fares et al., 2010)
Multipath Routing in Software-Defined Data Center
Networks
31
● Proactive SDN/ML-based multicast routing - controller side
○ Central collection
■ Per-flow data
✗ Heavy traffic overhead and poor scalability
■ Sampling-based data
✗ Delayed and inaccurate flow information
(Boutaba et al., 2018)(Tang et al., 2017) (Chao et al., 2016) (Poupart et al., 2016) (Curtis et al., 2011a-b) (Al-Fares et al., 2010)
Multipath Routing in Software-Defined Data Center
Networks
32
NELLY (Network Elephant Learner and anaLYzer)
(Estrada-Solano et al., 2019)
● Incremental learning at
server-side
● Proactive, accurately and timely
elephant detection
● Adaptation to varying traffic
conditions
● Low control overhead
● Constantly updated model with
limited resources
NELLY
33
(Estrada-Solano et al., 2019)
NELLY (Network Elephant Learner and anaLYzer)
Architecture - High level view
34
NELLY (Network Elephant Learner and anaLYzer)
(Estrada-Solano et al., 2019)
Analyzer
35
NELLY (Network Elephant Learner and anaLYzer)
(Estrada-Solano et al., 2019)
Learner
36
NELLY (Network Elephant Learner and anaLYzer)
(Estrada-Solano et al., 2019)
Packet traces UNI1 UNI2
Duration 65 min 158 min
Packets 19.85 M 100 M
IPv4 % of total traffic 98.98% (mostly TCP) 99.9% (mostly UDP)
IPv4 flows 1.02 M (TCP and UDP) 1.04 M (mostly UDP)
Datasets
37
NELLY (Network Elephant Learner and anaLYzer)
(Estrada-Solano et al., 2019)
● 50 incremental learning algorithms
● Metrics
○ True positive rate (TPR)
○ False positive rate (FPR)
○ Matthews Correlation Coefficient (MCC)
38
NELLY (Network Elephant Learner and anaLYzer)
(Estrada-Solano et al., 2019)
Algorithms
UNI1 UNI2
TPR (%) FPR (%) MCC TC
(µs)
Header
type
TPR (%) FPR (%) MCC TC
(µs)
Header
type
AHOT 85.97 35.52 0.327 4.07 BinNom 60.16 28.58 0.304 10.17 BinNom
ARF 82.39 28.82 0.359 12.01 BinNom 68.65 21.33 0.460 17.39 BinNom
Hoeffding tree 86.79 36.38 0.326 3.18 BinNom 57.92 28.46 0.284 4.64 BinNom
NB 74.76 35.69 0.254 4.76 BinNom 49.74 23.18 0.267 4.82 BinNom
OAUE 86.79 33.63 0.347 25.58 BinNom 63.28 28.65 0.332 33.06 BinNom
OzaBag 87.78 37.11 0.327 23.98 BinNom 64.17 31.13 0.314 36.61 BinNom
OzaBoost 75.88 29.93 0.307 11.56 BinNom 64.62 32.22 0.307 16.82 BinNom
SGD-SVM 16.76 10.21 0.067 0.81 Num 38.69 30.99 0.076 0.8 Num
In bold the top five results of TPR and MCC, and the TC
results shorter than 17.5 µs, for both UNI1 and UNI2
Accuracy and classification time with various incremental learning algorithms
39
Accuracy varying the
elephants weight
NELLY (Network Elephant Learner and anaLYzer)
40
Comparison
NELLY (Network Elephant Learner and anaLYzer)
(Estrada-Solano et al., 2019)
Works
Learning
approach
Elephants
detection
False
elephants
Table
occupancy
Control
overhead
Detection
time
Network
modifications*
Poupart et al., 2016 Incremental Very high Very low Very high High Very short None
Tang et al., 2017 Batch High Very low Medium Medium Medium
Hardware of
switches
Chao et al., 2016
Batch and
incremental
Very high High Low Medium Medium None
Curtis et al., 2011 None Perfect None None Very low Long
Software in
servers
NELLY Incremental Very high Very low None Very low Very short
Software in
servers
* Assuming OpenFlow-based networks
41
● Open challenges
○ NELLY implementation as an in-kernel software
○ Evaluate impact cost in servers
○ Processing and memory requirements
○ Threshold definition
NELLY (Network Elephant Learner and anaLYzer)
(Estrada-Solano et al., 2019)
42
● Open challenges
○ ML-based elephant rescheduler at the Controller
■ Predict the rate and duration of elephants
■ Regression or multi-classification?
■ How to define the classes?
○ Reschedule elephants based on prediction
○ Algorithm for allocating elephants to links
NELLY (Network Elephant Learner and anaLYzer)
43
Reinforcement Learning for Dynamic and Intelligent
Routing
44
Reinforcement Learning
(Mammeri, 2019)
RL Concept
45
● Network traffic routing is fundamental in networking
○ Selecting a path for packet transmission
● Selection criteria
○ Cost minimization
○ Maximization of link utilization
○ Quality of Service (QoS) provisioning
Network Routing Problem
46
● The efficient selection of the best path through a network is still
an important issue to be addressed and improved
STATIC and DYNAMIC routing
● Dynamic routing
○ Distance vector protocols
○ Link state protocols
Network Routing Problem
47
● Distance vector protocols
○ Routing information is not covered by the router itself
○ Delay in routing information
○ Network overhead
● Link state protocols
○ More memory and processing power
○ Understanding its operation is critical
Network Routing Problem
48
● Dynamic routing
○ More complexity
○ Overhead on the network
○ Very rigid
○ Doesn’t learn from past decisions
Network Routing Problem
49
Network Routing Problem
Motivating scenario 50
Network Routing Problem
Motivating scenario 51
(Casas, 2019)
Routing based on Knowledge-Defined Networking
● Architecture based
on RL and SDN
52
Routing based on Knowledge-Defined Networking
(Casas, 2019)
Q-learning
Architecture Operation
53
Routing based on Knowledge-Defined Networking
● Open challenges
○ The convergence time
○ Testing in real large networks
○ Multi-controllers
○ Probing of statistics collection based on RL
○ Collaborative learning for considering sub-networks?
Demo
54
Research Direction
55
5G Scaling Problem
● Mobile traffic is expected to grow exponentially
● The complexity, heterogeneity and scale of networks have grown
far beyond the limits of manual administration
● NFV is a 5G foundation
● Scaling VNFs pertaining to Network Services and Network Slices
is a big challenge
56
● Scaling in NFV brings cost reduction compared to oversize
capacity for workload peak
● Horizontal scaling dynamically increases/reduces the number of
VNF instances
● Vertical scaling increases/reduces the number of resources
allocated to one or more VNFs
● Most existing solutions scale NFs assuming they operate
isolatedly
5G Scaling Problem
57
Work Title Technique
Carella et al.
(2016)
An extensible autoscaling engine for software-based
network functions
Threshold rules
Dutta et al. (2016) QoE-Aware elasticity support in cloud-native 5G systems Threshold rules
Bilal et al. (2016) Dynamic cloud resource scheduling in virtualized 5G
mobile systems
Time series
Alawe et al. (2018) Improving traffic forecasting for 5G core network
scalability: A machine learning approach
Neural networks
Tang et al. (2015) Efficient auto-scaling approach in the telco cloud using
self-learning algorithm
Q-Learning
Alawe et al. (2018) On the scalability of 5G core network Control theory
5G Scaling Problem
Solutions focused on scaling NFs individually -(Demo)
58
Solutions that consider scaling NFs as part of compound network
services
Work Title Observation
Wang et al. (2016) Online VNF scaling in data centers It assumes that all NFs must
be scaled
Zhang et al. (2016) Co-scaler: cooperative scaling of software-defined
VNF service function chain
It focuses on flow migration
and scales all NFs
Rankothge et al.
(2017)
Optimizing resource allocation for virtualized
network functions in a cloud center using genetic
algorithms
It scales only NF that is
chosen randomly
Prados-Garzon et al.
(2018)
A queuing based dynamic auto scaling algorithm for
the LTE EPC control plane
It is specific for LTE vEPC
5G Scaling Problem
59
Collaborative Learning for 5G Scaling
60
Acknowledgements
Ph.D. Armando Ordoñez
Ph.D. Student - Felipe Estrada
Ph.D. Student - Carlos Tobar
Master Student - Alejandra Duque
Master Student - Daniela Casas
Undergraduate Student - Juan Daza
Undergraduate Student - Andrés Maca
Questions?
Obrigado
Monitoring link
Commercia
l
office
Commercial
office
Data processing
center
Data link
Performance degradation
(e.g., delays, congestion, and loss)
Monitoring performance parameters of the
network as link usage, packet loss and delay
To assure the QoS
“Accurate and timely statistics”
Service provider
Up-to-date view of network
Probing Problem
Monitoring link
Commercia
l
office
Commercial
office
Data processing
center
Data link
Performance degradation
(e.g., delays, congestion, and loss)
● It is necessary the accurate and timely collection of the
statistics of flows
Service provider
Up-to-date view of network
Probing Problem
Probing based on Knowledge-Defined Networking
● Architecture based
on RL and KDN
65

More Related Content

What's hot

NNPDF3.0: parton distributions for the LHC Run II
NNPDF3.0: parton distributions for the LHC Run IINNPDF3.0: parton distributions for the LHC Run II
NNPDF3.0: parton distributions for the LHC Run IIjuanrojochacon
 
International Journal of Computational Engineering Research(IJCER)
 International Journal of Computational Engineering Research(IJCER)  International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) ijceronline
 
A Survey Paper on Cluster Head Selection Techniques for Mobile Ad-Hoc Network
A Survey Paper on Cluster Head Selection Techniques for Mobile Ad-Hoc NetworkA Survey Paper on Cluster Head Selection Techniques for Mobile Ad-Hoc Network
A Survey Paper on Cluster Head Selection Techniques for Mobile Ad-Hoc NetworkIOSR Journals
 
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLING
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLINGA SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLING
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLINGijseajournal
 
Multiple Downlink Fair Packet Scheduling Scheme in Wi-Max
Multiple Downlink Fair Packet Scheduling Scheme in Wi-MaxMultiple Downlink Fair Packet Scheduling Scheme in Wi-Max
Multiple Downlink Fair Packet Scheduling Scheme in Wi-MaxEditor IJCATR
 

What's hot (8)

Traffic Classification
Traffic ClassificationTraffic Classification
Traffic Classification
 
NNPDF3.0: parton distributions for the LHC Run II
NNPDF3.0: parton distributions for the LHC Run IINNPDF3.0: parton distributions for the LHC Run II
NNPDF3.0: parton distributions for the LHC Run II
 
C0312023
C0312023C0312023
C0312023
 
International Journal of Computational Engineering Research(IJCER)
 International Journal of Computational Engineering Research(IJCER)  International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
B0330811
B0330811B0330811
B0330811
 
A Survey Paper on Cluster Head Selection Techniques for Mobile Ad-Hoc Network
A Survey Paper on Cluster Head Selection Techniques for Mobile Ad-Hoc NetworkA Survey Paper on Cluster Head Selection Techniques for Mobile Ad-Hoc Network
A Survey Paper on Cluster Head Selection Techniques for Mobile Ad-Hoc Network
 
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLING
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLINGA SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLING
A SURVEY TO REAL-TIME MESSAGE-ROUTING NETWORK SYSTEM WITH KLA MODELLING
 
Multiple Downlink Fair Packet Scheduling Scheme in Wi-Max
Multiple Downlink Fair Packet Scheduling Scheme in Wi-MaxMultiple Downlink Fair Packet Scheduling Scheme in Wi-Max
Multiple Downlink Fair Packet Scheduling Scheme in Wi-Max
 

Similar to Walking on AI/ML for Networking

Personal Research Overview presented at the KU-NAIST Research Meeting
Personal Research Overview presented at the KU-NAIST Research MeetingPersonal Research Overview presented at the KU-NAIST Research Meeting
Personal Research Overview presented at the KU-NAIST Research MeetingChawanat Nakasan
 
Optimal configuration of network coding in ad hoc networks
Optimal configuration of network coding in ad hoc networksOptimal configuration of network coding in ad hoc networks
Optimal configuration of network coding in ad hoc networksPvrtechnologies Nellore
 
Optimal Configuration of Network Coding in Ad Hoc Networks
Optimal Configuration of Network Coding in Ad Hoc NetworksOptimal Configuration of Network Coding in Ad Hoc Networks
Optimal Configuration of Network Coding in Ad Hoc Networks1crore projects
 
Panel: NRP Science Impacts​
Panel: NRP Science Impacts​Panel: NRP Science Impacts​
Panel: NRP Science Impacts​Larry Smarr
 
Traffic Prediction from Street Network images.pptx
Traffic Prediction from  Street Network images.pptxTraffic Prediction from  Street Network images.pptx
Traffic Prediction from Street Network images.pptxchirantanGupta1
 
Pacific Wave and PRP Update Big News for Big Data
Pacific Wave and PRP Update Big News for Big DataPacific Wave and PRP Update Big News for Big Data
Pacific Wave and PRP Update Big News for Big DataLarry Smarr
 
Challenges in end-to-end performance
Challenges in end-to-end performanceChallenges in end-to-end performance
Challenges in end-to-end performanceJisc
 
A Study of Protocols for Grid Computing Environment
A Study of Protocols for Grid Computing EnvironmentA Study of Protocols for Grid Computing Environment
A Study of Protocols for Grid Computing EnvironmentCSCJournals
 
Network Traffic Trends Prediction Using Machine Learning Modelling of Packet ...
Network Traffic Trends Prediction Using Machine Learning Modelling of Packet ...Network Traffic Trends Prediction Using Machine Learning Modelling of Packet ...
Network Traffic Trends Prediction Using Machine Learning Modelling of Packet ...Rangaprasad Sampath
 
Iccci10 bd siit
Iccci10 bd siitIccci10 bd siit
Iccci10 bd siithanums1
 
Delivering Application-Layer​ Traffic Optimization​ (ALTO) Services based on ...
Delivering Application-Layer​ Traffic Optimization​ (ALTO) Services based on ...Delivering Application-Layer​ Traffic Optimization​ (ALTO) Services based on ...
Delivering Application-Layer​ Traffic Optimization​ (ALTO) Services based on ...Danny Alex Lachos Perez
 
IEEE Big data 2016 Title and Abstract
IEEE Big data  2016 Title and AbstractIEEE Big data  2016 Title and Abstract
IEEE Big data 2016 Title and Abstracttsysglobalsolutions
 
Data centric multiobjective qo s-aware routing protocol (dm-qos) for body are...
Data centric multiobjective qo s-aware routing protocol (dm-qos) for body are...Data centric multiobjective qo s-aware routing protocol (dm-qos) for body are...
Data centric multiobjective qo s-aware routing protocol (dm-qos) for body are...Iffat Anjum
 
StreamSight - Query-Driven Descriptive Analytics for IoT and Edge Computing
StreamSight - Query-Driven Descriptive Analytics for IoT and Edge ComputingStreamSight - Query-Driven Descriptive Analytics for IoT and Edge Computing
StreamSight - Query-Driven Descriptive Analytics for IoT and Edge ComputingDemetris Trihinas
 
An Improved Differential Evolution Algorithm for Data Stream Clustering
An Improved Differential Evolution Algorithm for Data Stream ClusteringAn Improved Differential Evolution Algorithm for Data Stream Clustering
An Improved Differential Evolution Algorithm for Data Stream ClusteringIJECEIAES
 
Survey on Synchronizing File Operations Along with Storage Scalable Mechanism
Survey on Synchronizing File Operations Along with Storage Scalable MechanismSurvey on Synchronizing File Operations Along with Storage Scalable Mechanism
Survey on Synchronizing File Operations Along with Storage Scalable MechanismIRJET Journal
 
Approximation of regression-based fault minimization for network traffic
Approximation of regression-based fault minimization for network trafficApproximation of regression-based fault minimization for network traffic
Approximation of regression-based fault minimization for network trafficTELKOMNIKA JOURNAL
 
In network aggregation techniques for wireless sensor networks - a survey
In network aggregation techniques for wireless sensor networks - a surveyIn network aggregation techniques for wireless sensor networks - a survey
In network aggregation techniques for wireless sensor networks - a surveyGungi Achi
 
Major Report on ADIAN
Major Report on ADIANMajor Report on ADIAN
Major Report on ADIANsmittal121
 

Similar to Walking on AI/ML for Networking (20)

Personal Research Overview presented at the KU-NAIST Research Meeting
Personal Research Overview presented at the KU-NAIST Research MeetingPersonal Research Overview presented at the KU-NAIST Research Meeting
Personal Research Overview presented at the KU-NAIST Research Meeting
 
Optimal configuration of network coding in ad hoc networks
Optimal configuration of network coding in ad hoc networksOptimal configuration of network coding in ad hoc networks
Optimal configuration of network coding in ad hoc networks
 
Optimal Configuration of Network Coding in Ad Hoc Networks
Optimal Configuration of Network Coding in Ad Hoc NetworksOptimal Configuration of Network Coding in Ad Hoc Networks
Optimal Configuration of Network Coding in Ad Hoc Networks
 
Panel: NRP Science Impacts​
Panel: NRP Science Impacts​Panel: NRP Science Impacts​
Panel: NRP Science Impacts​
 
GSU-RF-2013-Reddy-4
GSU-RF-2013-Reddy-4GSU-RF-2013-Reddy-4
GSU-RF-2013-Reddy-4
 
Traffic Prediction from Street Network images.pptx
Traffic Prediction from  Street Network images.pptxTraffic Prediction from  Street Network images.pptx
Traffic Prediction from Street Network images.pptx
 
Pacific Wave and PRP Update Big News for Big Data
Pacific Wave and PRP Update Big News for Big DataPacific Wave and PRP Update Big News for Big Data
Pacific Wave and PRP Update Big News for Big Data
 
Challenges in end-to-end performance
Challenges in end-to-end performanceChallenges in end-to-end performance
Challenges in end-to-end performance
 
A Study of Protocols for Grid Computing Environment
A Study of Protocols for Grid Computing EnvironmentA Study of Protocols for Grid Computing Environment
A Study of Protocols for Grid Computing Environment
 
Network Traffic Trends Prediction Using Machine Learning Modelling of Packet ...
Network Traffic Trends Prediction Using Machine Learning Modelling of Packet ...Network Traffic Trends Prediction Using Machine Learning Modelling of Packet ...
Network Traffic Trends Prediction Using Machine Learning Modelling of Packet ...
 
Iccci10 bd siit
Iccci10 bd siitIccci10 bd siit
Iccci10 bd siit
 
Delivering Application-Layer​ Traffic Optimization​ (ALTO) Services based on ...
Delivering Application-Layer​ Traffic Optimization​ (ALTO) Services based on ...Delivering Application-Layer​ Traffic Optimization​ (ALTO) Services based on ...
Delivering Application-Layer​ Traffic Optimization​ (ALTO) Services based on ...
 
IEEE Big data 2016 Title and Abstract
IEEE Big data  2016 Title and AbstractIEEE Big data  2016 Title and Abstract
IEEE Big data 2016 Title and Abstract
 
Data centric multiobjective qo s-aware routing protocol (dm-qos) for body are...
Data centric multiobjective qo s-aware routing protocol (dm-qos) for body are...Data centric multiobjective qo s-aware routing protocol (dm-qos) for body are...
Data centric multiobjective qo s-aware routing protocol (dm-qos) for body are...
 
StreamSight - Query-Driven Descriptive Analytics for IoT and Edge Computing
StreamSight - Query-Driven Descriptive Analytics for IoT and Edge ComputingStreamSight - Query-Driven Descriptive Analytics for IoT and Edge Computing
StreamSight - Query-Driven Descriptive Analytics for IoT and Edge Computing
 
An Improved Differential Evolution Algorithm for Data Stream Clustering
An Improved Differential Evolution Algorithm for Data Stream ClusteringAn Improved Differential Evolution Algorithm for Data Stream Clustering
An Improved Differential Evolution Algorithm for Data Stream Clustering
 
Survey on Synchronizing File Operations Along with Storage Scalable Mechanism
Survey on Synchronizing File Operations Along with Storage Scalable MechanismSurvey on Synchronizing File Operations Along with Storage Scalable Mechanism
Survey on Synchronizing File Operations Along with Storage Scalable Mechanism
 
Approximation of regression-based fault minimization for network traffic
Approximation of regression-based fault minimization for network trafficApproximation of regression-based fault minimization for network traffic
Approximation of regression-based fault minimization for network traffic
 
In network aggregation techniques for wireless sensor networks - a survey
In network aggregation techniques for wireless sensor networks - a surveyIn network aggregation techniques for wireless sensor networks - a survey
In network aggregation techniques for wireless sensor networks - a survey
 
Major Report on ADIAN
Major Report on ADIANMajor Report on ADIAN
Major Report on ADIAN
 

Recently uploaded

How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)Damian Radcliffe
 
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...tanu pandey
 
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Sheetaleventcompany
 
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607dollysharma2066
 
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$kojalkojal131
 
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls KolkataVIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkataanamikaraghav4
 
AlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with FlowsAlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with FlowsThierry TROUIN ☁
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...APNIC
 
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130  Available With RoomVIP Kolkata Call Girl Kestopur 👉 8250192130  Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Roomdivyansh0kumar0
 
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With RoomVIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Roomgirls4nights
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130  Available With RoomVIP Kolkata Call Girl Alambazar 👉 8250192130  Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Roomdivyansh0kumar0
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Delhi Call girls
 
Radiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girlsRadiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girlsstephieert
 
VIP Kolkata Call Girl Dum Dum 👉 8250192130 Available With Room
VIP Kolkata Call Girl Dum Dum 👉 8250192130  Available With RoomVIP Kolkata Call Girl Dum Dum 👉 8250192130  Available With Room
VIP Kolkata Call Girl Dum Dum 👉 8250192130 Available With Roomdivyansh0kumar0
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebJames Anderson
 

Recently uploaded (20)

How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)
 
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
 
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
 
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
 
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
 
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICECall Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
 
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls KolkataVIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
AlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with FlowsAlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with Flows
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
 
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130  Available With RoomVIP Kolkata Call Girl Kestopur 👉 8250192130  Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
 
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With RoomVIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
 
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130  Available With RoomVIP Kolkata Call Girl Alambazar 👉 8250192130  Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
 
Radiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girlsRadiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girls
 
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
VIP Kolkata Call Girl Dum Dum 👉 8250192130 Available With Room
VIP Kolkata Call Girl Dum Dum 👉 8250192130  Available With RoomVIP Kolkata Call Girl Dum Dum 👉 8250192130  Available With Room
VIP Kolkata Call Girl Dum Dum 👉 8250192130 Available With Room
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 

Walking on AI/ML for Networking

  • 1.
  • 2. Walking on AI/ML for Networking Ph.D Oscar Mauricio Caicedo Grupo de Ingeniería Telemática Universidad del Cauca Colombia
  • 3. Outline ● Unsupervised Learning for Heavy-Hitters Threshold Estimation ● Supervised Learning for Multipath Routing in Software-Defined Data Center Networks ● Reinforcement Learning for Dynamic and Intelligent Routing ● Research Direction 3
  • 4. Unsupervised Learning for Heavy-Hitters Threshold Estimation 4
  • 5. Unsupervised Learning Learning approach Training dataset Problems aimed Unsupervised Unlabeled Clustering Clustering (Boutaba et al., 2018) 5
  • 6. ● Duration – time between the first and last packet of the flow ● Size – total number of bytes transmitted in the flow ● Rate – given by the size divided by the duration ● Burstiness – characterization of burst in a flow over time t (Duque-Torres et al., 2019) Heavy-Hitters Classification Schemes 6
  • 7. ● Outlier and anomaly detection ● Load balancing ● Alleviating network congestion (Duque-Torres et al., 2019) Need for Heavy-Hitter Flows Detection ● HH detections is far from being obvious ● Threshold estimation is a major challenge 7
  • 8. (Duque-Torres et al., 2019) Knowledge Discovery Process Threshold Estimation based on Knowledge Discovery Process 8
  • 9. (Duque-Torres et al., 2019) ● Can the Knowledge Discovery Process (KDP) to shed new light on threshold estimation in HH detection? ● Can we determine an optimal threshold or set of thresholds that would optimally separate the flows into HHs and non-HHs in a variety of network and traffic conditions? Threshold Estimation based on Knowledge Discovery Process 9
  • 10. ● Duration → tortoises and dragonflies ● Sizes → elephants and mice ● Rate → cheetahs and snails ● Burstiness → porcupines and stingrays Understanding the Problem 10
  • 11. (Duque-Torres et al., 2019) Understanding the Problem 11
  • 12. Work Citations Year Appeared In Threshold Value Curtis et al. 187 2011 IEEE INFOCOM Size 128KB, 1MB and 100MB Chiesa et al. 29 2016 IEEE/ACM Trans. Netw. Bandwidth 10% Moshref et al. 47 2013 ACM HotSDN Bandwidth 10% Liu et al. 19 2013 IEEE Commun. Lett. Rate 1-10 Mbps Trestain et al. 14 2013 IFIP/IEEE IM Rate Not specified Lin et al. 9 2014 IEEE GLOBECOM Size 100 MB Bi et al. 8 2013 IEEE GLOBECOM Size Adaptive Understanding the Problem (Duque-Torres et al., 2019) 12
  • 13. TCP (Duque-Torres et al., 2019) Understanding the Data Dataset Count Average UNIV1 UNIV2 CAIDA2016 CAIDA2018 6886055 6092 9017007 8225731 710.74 198.84 517.31 1021.20 UDP Count Average 1453303 9869 978 920417 1618524 390.74 758.24 433.62 674.91 TOTAL Count Average 9999617 9 99956 9994197 9982640 582.60 749.34 509.13 959 Datasets 13
  • 14. Feature name Feature description Direction min_ps Packet with minimum size in the flow F & B max_ps Packet with maximum size in the flow F & B avg_ps Average packet size F & B std_ps Standard deviation of packet size F & B min_piat Minimum packet inter-arrival time F & B max_piat Maximum packet inter-arrival time F & B avg_piat Average packet inter-arrival time F & B std_piat Stand. dev. of packet inter-arrival time F & B octTotCount Number of transmitted bytes in the flow F & B pktTotCount Number of transmitted pkts in the flow F & B flowDur Duration of the flow (in seconds) Data Preparation (Duque-Torres et al., 2019) 14
  • 15. ● K-Means was used due to its simplicity, speed, and accuracy ● The optimal number of clusters was defined by Silhouette method Data Mining Value Interpretation Si = 1 The samples are very well clustered 0.5 < Si < 0.7 The samples have a reasonable structure for clustering Si = 0.5 The samples are probably placed in the wrong cluster 0.26 < Si < 0.5 The samples do not have a substantial structure for clustering Si = 0 The samples lies between two cluster Si < 0 The samples are placed in the wrong cluster (Duque-Torres et al., 2019) 15
  • 17. Data Mining ● K-means along the two first principal components ● k = 2 (Duque-Torres et al., 2019) Dataset Class I Class II Total UNIV1 UNIV2 CAIDA2016 CAIDA2018 293 962 18 056 451 696 563 869 8 1 9 3 293 970 18 057 451 705 563 872 17
  • 18. Data Mining ● K-means along the two first principal components ● k = 4 (Duque-Torres et al., 2019) 18
  • 19. max min average 11 394 18 180 33 043 14 258 1 2 1 1 16.09 103.81 16.78 12.4 281 168 28 4 648 28 465 44 251 149 047 108 230 20775 14 784 219 806 194 204 25 357 23 489 85 848 118 060 66 743 14 658 14 987 32 590 47 373 2 513 29 761 61 128 57 058 5 162 79 661 40 486 24 140 190 733 1 418 8 256 904 98 909 4.42 13.10 3.52 191.33 Flow Size [MB] K Dataset flows I UNIV1 UNIV2 CAIDA2016 CAIDA2018 293 709 17 979 1451 051 563 472 II UNIV1 UNIV2 CAIDA2016 CAIDA2018 6 6 7 21 III UNIV1 UNIV2 CAIDA2016 CAIDA2018 26 32 2 376 IV UNIV1 UNIV2 CAIDA2016 CAIDA2018 229 36 644 3 max min average 1.88 6.50 1.34 3.73 0.000032 0.000064 0.000028 0.000029 0.00847 0.00212 0.00421 0.00948 194.11 250.30 42.65 66.22 145.39 107.610 31.14 21.45 167.37 178.96 37.60 34.08 86.80 95.63 100.05 19.81 18.15 32.53 70.93 3.76 30.35 57.94 85.49 7.46 17.64 28.87 22.66 277.62 1.93 7.081 1.35 143.45 4.42 13.10 3.52 191.33 Number of Packets Flow Duration [s] max min average 2 643.73 455.32 20.202 22.40 0 0 0 0 6.76 33.87 3.23 2.39 2 643.13 455.29 20.19 22.39 151.97 393.25 15.8 7.04 1 061.68 428.88 15.86 19.45 2 641.51 454.52 20.20 22.40 2.27 48.34 10.19 0.26 182.62 402.615 20.19 17.25 2 643.89 454.52 20.202 22.29 1.39 48.34 0.028 21.09 507.30 402.61 15.21 21.89 Evaluating the Discovered Knowledge 19
  • 20. Silhouette Analysis - Ambiguous Flows (Class I, Ambiguous Flows) (Duque-Torres et al., 2019) Evaluating the Discovered Knowledge 20
  • 21. (Duque-Torres et al., 2019) Ambiguous Flows (Class I) Evaluating the Discovered Knowledge max min average 11 394 18 180 7 601 14 258 1 2 1 1 13.05 108.81 14.7 13.53 281 168 28 4 648 28 465 44 251 149 047 108 230 20775 14 784 1 275 17 376 834 1 356.28 Flow Size [MB] K Dataset flows I UNIV1 UNIV2 CAIDA2016 CAIDA2018 293 002 17 979 449 904 562 273 II UNIV1 UNIV2 CAIDA2016 CAIDA2018 707 36 1 147 1 199 max min average 0.456 6.5 0.324 0.922 0.000032 0.000064 0.000028 0.000029 0.00629 0.0214 0.00660 0.00558 145.39 28.87 1.34 3.734 1.88 107.610 31.14 21.45 0.306 178.96 37.60 34.08 Number of Packets Flow Duration [s] max min average 2 643.73 455.32 20.20 22.40 0 0 0 0 6.09 33.87 3.29 2.36 2 643.05 453.71 20.20 22.403 0.379 16.06 0.05 0.030 0.379 339.58 15.65 15.52 For TCP traffic ○ Flow size = 6 kB ○ Packet count = 14 21
  • 22. ● There is no threshold that can unambiguously classify HH flows ● Significance of network and traffic characteristics ○ Flow size = 6 kB ○ Packet count = 14 (Duque-Torres et al., 2019) Evaluating the Discovered Knowledge 22
  • 23. ● Open challenges ○ The time required to collect enough flow details for reliable classification requires further research ■ Investigate the usefulness of per-flow packet size distributions for accurate classification ■ Develop a threshold-less approach for HH classification Evaluating the Discovered Knowledge 23
  • 24. Supervised Learning for Multipath Routing in Software-Defined Data Center Networks 24
  • 25. Supervised Learning Learning approach Training dataset Problems aimed Supervised Labeled Classification and regression Semi-supervised Incomplete labels Classification Regression (Boutaba et al., 2018) 25
  • 26. ● Data Center Networks (DCNs) ○ Significant bandwidth capacity ○ Dynamic traffic ○ Coexistence of many kind of flows ■ Small, short-lived flows → mice ■ Few large, long-lived flows → elephants (Dixit et al., 2013) (Jain et al., 2009) Multipath Routing in Software-Defined Data Center Networks 26
  • 27. ● Equal-Cost MultiPath (ECMP) ○ The most prevalent multipath routing mechanism in DCNs ○ Each switch splits flows over the output ports applying a function on packet headers ■ E.g., a hash function on the 5-tuple header (Dixit et al., 2013) (Jain et al., 2009) Multipath Routing in Software-Defined Data Center Networks 27
  • 28. Input Port Output 1 Output 2 ABCD hash AD BC Hot-spot A + D << B + C Flows (Dixit et al., 2013) (Jain et al., 2009) ● ECMP causes hot-spots in DCNs ⟶ low throughput and high delay Multipath Routing in Software-Defined Data Center Networks 28
  • 29. ● Software-Defined Networking (SDN) for facing ECMP limitations ○ A global view of the network ○ A logically centralized controller to dynamically make and install routing decisions ● A DCN using SDN → SDDCN (Boutaba et al., 2018)(Tang et al., 2017) (Chao et al., 2016) (Poupart et al., 2016) (Curtis et al., 2011a-b) (Al-Fares et al., 2010) Multipath Routing in Software-Defined Data Center Networks 29
  • 30. ● Reactive SDN-based multipath routing ○ Dynamically rescheduling of elephant flows ○ Mouse flows are handle by default routing (e.g., ECMP) ○ Discriminate flows using static thresholds ○ Hot-spots may occur before elephants are detected (Boutaba et al., 2018)(Tang et al., 2017) (Chao et al., 2016) (Poupart et al., 2016) (Curtis et al., 2011a-b) (Al-Fares et al., 2010) Multipath Routing in Software-Defined Data Center Networks 30
  • 31. ● Proactive SDN/ML-based multicast routing ○ ML-based methods at the controller for proactively identifying elephants ○ Central collection ■ Per-flow data ■ Sampling-based data (Boutaba et al., 2018)(Tang et al., 2017) (Chao et al., 2016) (Poupart et al., 2016) (Curtis et al., 2011a-b) (Al-Fares et al., 2010) Multipath Routing in Software-Defined Data Center Networks 31
  • 32. ● Proactive SDN/ML-based multicast routing - controller side ○ Central collection ■ Per-flow data ✗ Heavy traffic overhead and poor scalability ■ Sampling-based data ✗ Delayed and inaccurate flow information (Boutaba et al., 2018)(Tang et al., 2017) (Chao et al., 2016) (Poupart et al., 2016) (Curtis et al., 2011a-b) (Al-Fares et al., 2010) Multipath Routing in Software-Defined Data Center Networks 32
  • 33. NELLY (Network Elephant Learner and anaLYzer) (Estrada-Solano et al., 2019) ● Incremental learning at server-side ● Proactive, accurately and timely elephant detection ● Adaptation to varying traffic conditions ● Low control overhead ● Constantly updated model with limited resources NELLY 33
  • 34. (Estrada-Solano et al., 2019) NELLY (Network Elephant Learner and anaLYzer) Architecture - High level view 34
  • 35. NELLY (Network Elephant Learner and anaLYzer) (Estrada-Solano et al., 2019) Analyzer 35
  • 36. NELLY (Network Elephant Learner and anaLYzer) (Estrada-Solano et al., 2019) Learner 36
  • 37. NELLY (Network Elephant Learner and anaLYzer) (Estrada-Solano et al., 2019) Packet traces UNI1 UNI2 Duration 65 min 158 min Packets 19.85 M 100 M IPv4 % of total traffic 98.98% (mostly TCP) 99.9% (mostly UDP) IPv4 flows 1.02 M (TCP and UDP) 1.04 M (mostly UDP) Datasets 37
  • 38. NELLY (Network Elephant Learner and anaLYzer) (Estrada-Solano et al., 2019) ● 50 incremental learning algorithms ● Metrics ○ True positive rate (TPR) ○ False positive rate (FPR) ○ Matthews Correlation Coefficient (MCC) 38
  • 39. NELLY (Network Elephant Learner and anaLYzer) (Estrada-Solano et al., 2019) Algorithms UNI1 UNI2 TPR (%) FPR (%) MCC TC (µs) Header type TPR (%) FPR (%) MCC TC (µs) Header type AHOT 85.97 35.52 0.327 4.07 BinNom 60.16 28.58 0.304 10.17 BinNom ARF 82.39 28.82 0.359 12.01 BinNom 68.65 21.33 0.460 17.39 BinNom Hoeffding tree 86.79 36.38 0.326 3.18 BinNom 57.92 28.46 0.284 4.64 BinNom NB 74.76 35.69 0.254 4.76 BinNom 49.74 23.18 0.267 4.82 BinNom OAUE 86.79 33.63 0.347 25.58 BinNom 63.28 28.65 0.332 33.06 BinNom OzaBag 87.78 37.11 0.327 23.98 BinNom 64.17 31.13 0.314 36.61 BinNom OzaBoost 75.88 29.93 0.307 11.56 BinNom 64.62 32.22 0.307 16.82 BinNom SGD-SVM 16.76 10.21 0.067 0.81 Num 38.69 30.99 0.076 0.8 Num In bold the top five results of TPR and MCC, and the TC results shorter than 17.5 µs, for both UNI1 and UNI2 Accuracy and classification time with various incremental learning algorithms 39
  • 40. Accuracy varying the elephants weight NELLY (Network Elephant Learner and anaLYzer) 40
  • 41. Comparison NELLY (Network Elephant Learner and anaLYzer) (Estrada-Solano et al., 2019) Works Learning approach Elephants detection False elephants Table occupancy Control overhead Detection time Network modifications* Poupart et al., 2016 Incremental Very high Very low Very high High Very short None Tang et al., 2017 Batch High Very low Medium Medium Medium Hardware of switches Chao et al., 2016 Batch and incremental Very high High Low Medium Medium None Curtis et al., 2011 None Perfect None None Very low Long Software in servers NELLY Incremental Very high Very low None Very low Very short Software in servers * Assuming OpenFlow-based networks 41
  • 42. ● Open challenges ○ NELLY implementation as an in-kernel software ○ Evaluate impact cost in servers ○ Processing and memory requirements ○ Threshold definition NELLY (Network Elephant Learner and anaLYzer) (Estrada-Solano et al., 2019) 42
  • 43. ● Open challenges ○ ML-based elephant rescheduler at the Controller ■ Predict the rate and duration of elephants ■ Regression or multi-classification? ■ How to define the classes? ○ Reschedule elephants based on prediction ○ Algorithm for allocating elephants to links NELLY (Network Elephant Learner and anaLYzer) 43
  • 44. Reinforcement Learning for Dynamic and Intelligent Routing 44
  • 46. ● Network traffic routing is fundamental in networking ○ Selecting a path for packet transmission ● Selection criteria ○ Cost minimization ○ Maximization of link utilization ○ Quality of Service (QoS) provisioning Network Routing Problem 46
  • 47. ● The efficient selection of the best path through a network is still an important issue to be addressed and improved STATIC and DYNAMIC routing ● Dynamic routing ○ Distance vector protocols ○ Link state protocols Network Routing Problem 47
  • 48. ● Distance vector protocols ○ Routing information is not covered by the router itself ○ Delay in routing information ○ Network overhead ● Link state protocols ○ More memory and processing power ○ Understanding its operation is critical Network Routing Problem 48
  • 49. ● Dynamic routing ○ More complexity ○ Overhead on the network ○ Very rigid ○ Doesn’t learn from past decisions Network Routing Problem 49
  • 52. (Casas, 2019) Routing based on Knowledge-Defined Networking ● Architecture based on RL and SDN 52
  • 53. Routing based on Knowledge-Defined Networking (Casas, 2019) Q-learning Architecture Operation 53
  • 54. Routing based on Knowledge-Defined Networking ● Open challenges ○ The convergence time ○ Testing in real large networks ○ Multi-controllers ○ Probing of statistics collection based on RL ○ Collaborative learning for considering sub-networks? Demo 54
  • 56. 5G Scaling Problem ● Mobile traffic is expected to grow exponentially ● The complexity, heterogeneity and scale of networks have grown far beyond the limits of manual administration ● NFV is a 5G foundation ● Scaling VNFs pertaining to Network Services and Network Slices is a big challenge 56
  • 57. ● Scaling in NFV brings cost reduction compared to oversize capacity for workload peak ● Horizontal scaling dynamically increases/reduces the number of VNF instances ● Vertical scaling increases/reduces the number of resources allocated to one or more VNFs ● Most existing solutions scale NFs assuming they operate isolatedly 5G Scaling Problem 57
  • 58. Work Title Technique Carella et al. (2016) An extensible autoscaling engine for software-based network functions Threshold rules Dutta et al. (2016) QoE-Aware elasticity support in cloud-native 5G systems Threshold rules Bilal et al. (2016) Dynamic cloud resource scheduling in virtualized 5G mobile systems Time series Alawe et al. (2018) Improving traffic forecasting for 5G core network scalability: A machine learning approach Neural networks Tang et al. (2015) Efficient auto-scaling approach in the telco cloud using self-learning algorithm Q-Learning Alawe et al. (2018) On the scalability of 5G core network Control theory 5G Scaling Problem Solutions focused on scaling NFs individually -(Demo) 58
  • 59. Solutions that consider scaling NFs as part of compound network services Work Title Observation Wang et al. (2016) Online VNF scaling in data centers It assumes that all NFs must be scaled Zhang et al. (2016) Co-scaler: cooperative scaling of software-defined VNF service function chain It focuses on flow migration and scales all NFs Rankothge et al. (2017) Optimizing resource allocation for virtualized network functions in a cloud center using genetic algorithms It scales only NF that is chosen randomly Prados-Garzon et al. (2018) A queuing based dynamic auto scaling algorithm for the LTE EPC control plane It is specific for LTE vEPC 5G Scaling Problem 59
  • 60. Collaborative Learning for 5G Scaling 60
  • 61. Acknowledgements Ph.D. Armando Ordoñez Ph.D. Student - Felipe Estrada Ph.D. Student - Carlos Tobar Master Student - Alejandra Duque Master Student - Daniela Casas Undergraduate Student - Juan Daza Undergraduate Student - Andrés Maca
  • 63. Monitoring link Commercia l office Commercial office Data processing center Data link Performance degradation (e.g., delays, congestion, and loss) Monitoring performance parameters of the network as link usage, packet loss and delay To assure the QoS “Accurate and timely statistics” Service provider Up-to-date view of network Probing Problem
  • 64. Monitoring link Commercia l office Commercial office Data processing center Data link Performance degradation (e.g., delays, congestion, and loss) ● It is necessary the accurate and timely collection of the statistics of flows Service provider Up-to-date view of network Probing Problem
  • 65. Probing based on Knowledge-Defined Networking ● Architecture based on RL and KDN 65