Traffic classification svm_im2015_10may2015

Yang Hong, Changcheng Huang
Department of Systems and Computer Engineering,
Carleton University, Ottawa, Canada
Biswajit Nandy, Nabil Seddigh
Solana Networks
Ottawa, Ontario, Canada
14th IFIP/IEEE Symposium on Integrated Network and Service Management
(IFIP/IEEE IM), Ottawa, Canada, May 2015, pp.458−466

Why Traffic Classification?
 Different types of network/cloud applications
impose inherently different QoS requirements
 low end-to-end delay for interactive applications
 high throughput for file transfer applications
 Network utilization needs to be optimized
 while ensuring performance for various applications
 Network/cloud operators need to treat
applications differently

Applications over Network Traffic
Network applications are classified
into different categories
 bulk data transfer
 file transfer protocol
 peer-to-peer downloads
 cloud service
 cloud computing
 database transactions
 real-time streaming
 voice
 video
[Figure: Different applications running over the network: Internet, computer group, database, streaming media, Web, FTP, e-mail, e-commerce]

Contributions of This Paper
 Proposal of an iterative-tuning scheme
 to increase the training speed of Support Vector Machine
(SVM) learning algorithms on multi-class
classification problems
 Theoretical analysis of the iterative-tuning scheme
 to derive the equations for obtaining the SVM parameters
 Application of iterative-tuning SVM
 to achieve the best trade-off between classification accuracy
and training speed

Outline
 Related Work (Traffic Classification Approaches)
 Support Vector Machine (SVM) Overview
 SVM Multi-Class Formulation
 Iterative-Tuning SVM
 Performance Evaluation of SVM Classification
 Conclusions

Traffic Classification Approaches (1)
 Port-based Classification
 Perform application mapping using Internet Assigned
Numbers Authority (IANA) standardized port numbers
 Payload-based Classification
 Inspect packet header and payload to match it against
application pattern signatures
 Host-behavior-based Classification
 Capture behavioral information of a host to match it
against host-behavior signatures of applications
 Flow-features-based Classification
 Capture flow features to map different applications
with different statistical features
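As a minimal sketch of the port-based approach listed above, the lookup below maps a flow to a class by port number. The table holds a handful of well-known IANA assignments and the function name is hypothetical; it is only meant to show why a flow on a dynamically assigned port falls through to "Unknown".

```python
# Illustrative subset of IANA well-known ports, grouped into the
# application classes used later in these slides.
WELL_KNOWN_PORTS = {
    20: "FTP", 21: "FTP",
    25: "Mail", 110: "Mail", 143: "Mail",
    80: "WWW", 443: "WWW",
}

def classify_by_port(src_port, dst_port):
    """Return the class of the first port found in the table, else 'Unknown'."""
    for port in (dst_port, src_port):
        label = WELL_KNOWN_PORTS.get(port)
        if label is not None:
            return label
    return "Unknown"

print(classify_by_port(51234, 80))    # destination port 80 maps to WWW
print(classify_by_port(51234, 6881))  # no table entry, so Unknown
```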

Traffic Classification Approaches (2)
 Port-based Classification
 insufficient for those applications which assign ports
dynamically or share popular ports
 Payload-based Classification
 can NOT accurately identify the traffic application if
the payload is encrypted
 Host-behavior-based Classification
 can NOT identify specific application sub-types
 Flow-features-based Classification
 requires a large-scale dataset
 SVM algorithm achieves the highest traffic
classification accuracy (this paper improves SVM)

Support Vector Machine (SVM) Overview
 Construct a separate hyper-plane
for each traffic class with multiple
flow-features
 Maximize the distance between
the closest training data samples of
different classes in n-dimensional
flow-feature space
 A red circle represents a training
sample of Class 1
 A blue square represents a
training sample of Class 2
 A new sample is classified into the
class it is closest to
[Figure: Hyper-planes constructed by SVM for two different classes; labels: A, B, Class 1, Class 2]

SVM Multi-Class Formulation (Primal Problem)
Notation: i is the index of a class; m is the number of classes;
w_i is the weight vector associated with class i;
C is a general regularization parameter, C > 0;
k is the index of a training sample; l is the number of training samples;
ξ_k is a non-negative slack variable associated with the k-th training sample;
x_k ∈ ℜ^n is the input vector (n is the number of flow-features) associated with
the k-th training sample;
y_k ∈ {1, ..., m} is the corresponding membership class for x_k.
The weight w_i can be regarded as the inverse of the mean distance
between a sample and all training samples of class i.

$$\min_{w,\,\xi} \; \frac{1}{2} \sum_{i=1}^{m} \|w_i\|^2 + C \sum_{k=1}^{l} \xi_k \qquad (1)$$

Constraints:

$$w_{y_k}^T x_k - w_i^T x_k \ge e_{i,k} - \xi_k, \quad \forall\, i, k \qquad (2)$$

$$D(x) = \max_i D_i = \max_i w_i^T x \qquad (3)$$

The maximum value D_i predicts the membership class of a testing sample.
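The decision rule (3) and the primal objective (1) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' implementation: the slides do not define e_{i,k}, so the code assumes the usual multi-class convention (0 for the true class, 1 otherwise), and each slack ξ_k is set to the smallest value that satisfies constraint (2). The function names and toy data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def predict(weights, x):
    """Equation (3): return the 1-based class i that maximizes D_i = w_i^T x."""
    scores = [dot(w, x) for w in weights]
    return scores.index(max(scores)) + 1

def primal_objective(weights, samples, labels, C):
    """Equation (1), with each xi_k chosen as the smallest slack allowed by (2)."""
    regularizer = 0.5 * sum(dot(w, w) for w in weights)
    total_slack = 0.0
    for x, y in zip(samples, labels):
        scores = [dot(w, x) for w in weights]
        # Assumption: e_{i,k} = 0 if i is the true class y_k, else 1.
        margins = [
            scores[i] + (0.0 if i == y - 1 else 1.0) - scores[y - 1]
            for i in range(len(weights))
        ]
        total_slack += max(margins)  # the i == y_k term is 0, so this is >= 0
    return regularizer + C * total_slack

weights = [[1.0, 0.0], [0.0, 1.0]]   # one weight vector per class (m = 2)
samples = [[2.0, 0.0], [0.0, 0.5]]   # l = 2 training samples, n = 2 features
labels = [1, 2]

print(predict(weights, [2.0, 0.0]))
print(primal_objective(weights, samples, labels, C=1.0))
```

In this toy case the second sample is on the correct side but inside the margin, so it contributes a slack of 0.5 to the objective.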

SVM Multi-Class Formulation (Dual Problem)
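Notation: see the previous slide (Primal Problem)

$$\min_{\alpha} \; \Omega(\alpha) = \frac{1}{2} \sum_{i=1}^{m} \|w_i(\alpha)\|^2 + \sum_{k=1}^{l} \sum_{i=1}^{m} e_{i,k}\, \alpha_{i,k} \qquad (4)$$

Constraints:

$$[1]\;\; \alpha_{i,k} \le C_{i,k}, \;\; \forall\, i, k; \qquad [2]\;\; \sum_{i=1}^{m} \alpha_{i,k} = 0, \;\; \forall\, k \qquad (5)$$

$$w_i(\alpha) = \sum_{k=1}^{l} w_{i,k}(\alpha) = \sum_{k=1}^{l} \alpha_{i,k}\, x_k \qquad (7)$$

Optimal solution:

$$\frac{\partial \Omega(\alpha)}{\partial \alpha_{i,k}} = 0 \qquad (8)$$

$$g_{i,k} = \frac{\partial \Omega(\alpha)}{\partial \alpha_{i,k}} = w_i(\alpha)^T x_k + e_{i,k}, \quad \forall\, i, k \qquad (9)$$

$$\upsilon_k = \max_i g_{i,k} - \min_{i:\, \alpha_{i,k} < C_{i,k}} g_{i,k}, \quad \forall\, k \qquad (10)$$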
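The dual quantities in (7), (9), and (10) can be computed directly from a dual parameter matrix. The sketch below is illustrative only. It assumes e_{i,k} is 0 for the true class and 1 otherwise, and that C_{i,k} equals C for the true class of sample k and 0 otherwise; neither definition appears on the slides. All function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def e(i, k, labels):
    # Assumption: e_{i,k} = 0 when class i (0-based here) is the true class, else 1.
    return 0.0 if labels[k] == i + 1 else 1.0

def weights_from_alpha(alpha, samples):
    """Equation (7): w_i(alpha) = sum_k alpha[i][k] * x_k."""
    n = len(samples[0])
    return [[sum(a_i[k] * x[d] for k, x in enumerate(samples)) for d in range(n)]
            for a_i in alpha]

def gradient(alpha, samples, labels):
    """Equation (9): g[i][k] = w_i(alpha)^T x_k + e_{i,k}."""
    w = weights_from_alpha(alpha, samples)
    return [[dot(w[i], x) + e(i, k, labels) for k, x in enumerate(samples)]
            for i in range(len(alpha))]

def violations(alpha, samples, labels, C):
    """Equation (10): per-sample gap between the largest and smallest free gradient."""
    g = gradient(alpha, samples, labels)
    m = len(alpha)
    result = []
    for k in range(len(samples)):
        column = [g[i][k] for i in range(m)]
        # Assumption: C_{i,k} = C for the true class of sample k, else 0.
        # A feasible alpha (constraint [2]) always leaves at least one free entry.
        free = [g[i][k] for i in range(m)
                if alpha[i][k] < (C if labels[k] == i + 1 else 0.0)]
        result.append(max(column) - min(free))
    return result

samples = [[2.0, 0.0], [0.0, 0.5]]
labels = [1, 2]
alpha = [[0.0, 0.0], [0.0, 0.0]]  # alpha[i][k], a feasible starting point

print(weights_from_alpha(alpha, samples))
print(violations(alpha, samples, labels, C=1.0))
```

A violation of zero for every sample would indicate that the optimality condition (8) holds within the constraints; at the all-zero starting point each sample still shows a gap of 1.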

Iterative-Tuning SVM
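[Figure: System diagram of iterative-tuning scheme; blocks: w(α), Iterative Tuning, α, Ω(α)]

$$\alpha'_{k,j+1} = \alpha'_{k,j} - \gamma_j\, \lambda_j^{-1}\, \frac{\partial \Omega(\alpha'_{k,j})}{\partial \alpha'_k} \qquad (12)$$

$$\alpha'_k = [\alpha_{1,k};\, \ldots;\, \alpha_{i,k};\, \ldots;\, \alpha_{m,k}] \qquad (13)$$

$$\alpha_{m,k} = -\sum_{i=1}^{m-1} \alpha_{i,k} \qquad (14)$$

$$\lambda_j = \sum_{i=1}^{m} \left[ \frac{\partial w_i(\alpha'_{k,j})}{\partial \alpha'_k} \right]^T \left[ \frac{\partial w_i(\alpha'_{k,j})}{\partial \alpha'_k} \right] \qquad (16)$$

$$\frac{\partial w_{i,k}(\alpha)}{\partial \alpha_{i,k}} = x_k \qquad (17)$$

Gauss-Newton algorithm provides faster convergence speed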
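As a worked step linking this slide to the dual problem (a sketch, assuming Ω(α) and w_i(α) are as given in (4) and (7)): differentiating (4) with the chain rule and substituting the derivative from (17) reproduces the gradient g_{i,k} in (9), which is the quantity the tuning step drives toward zero.

```latex
\frac{\partial \Omega(\alpha)}{\partial \alpha_{i,k}}
  = w_i(\alpha)^T \, \frac{\partial w_i(\alpha)}{\partial \alpha_{i,k}} + e_{i,k}
  = w_i(\alpha)^T x_k + e_{i,k}
  = g_{i,k}
```

Only the k-th term of the sum in (7) depends on α_{i,k}, so the derivative of w_i(α) with respect to α_{i,k} reduces to x_k.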

Experimental Setup
 Use NetFlow-V5 to collect the network traffic trace
 Collect data over a 24-hour period
 Utilize 12 flow-features obtained from NetFlow-V5
flow-records
 as the basis for input to the classification algorithms
 traffic classification achieves better accuracy when all
12 flow-features are selected
 NetFlow data trace consists of 241,223 TCP flows
 3 testing datasets consist of 130,527 flows, 55,531
flows, and 55,165 flows
 belonging to 3 different time periods, respectively

Flow-Features For Traffic Classification
Feature ID Feature Name
1 source port
2 destination port
3 average packet size
4 average bytes/sec (src→dst)
5 average bytes/sec (dst→src)
6 packet count (src→dst)
7 packet count (dst→src)
8 byte count (src→dst)
9 byte count (dst→src)
10 ratio of byte count (src→dst) / byte count (dst→src)
11 SYN flag count
12 flow duration
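To make the table concrete, the sketch below builds the 12-element feature vector from one flow summary. The `FlowRecord` type and its field names are hypothetical, and the slides do not say how the averages are computed; the code assumes average packet size is total bytes over total packets and that the byte rates are per second of flow duration.

```python
from dataclasses import dataclass

@dataclass
class FlowRecord:
    """Hypothetical bidirectional flow summary; field names are illustrative."""
    src_port: int
    dst_port: int
    packets_fwd: int   # src -> dst
    packets_rev: int   # dst -> src
    bytes_fwd: int
    bytes_rev: int
    syn_count: int
    duration_s: float

def feature_vector(f):
    """Return the 12 features in the order of the table (feature IDs 1 to 12)."""
    packets = f.packets_fwd + f.packets_rev
    total_bytes = f.bytes_fwd + f.bytes_rev
    return [
        float(f.src_port),
        float(f.dst_port),
        total_bytes / packets if packets else 0.0,               # average packet size
        f.bytes_fwd / f.duration_s if f.duration_s else 0.0,     # bytes/sec src -> dst
        f.bytes_rev / f.duration_s if f.duration_s else 0.0,     # bytes/sec dst -> src
        float(f.packets_fwd),
        float(f.packets_rev),
        float(f.bytes_fwd),
        float(f.bytes_rev),
        f.bytes_fwd / f.bytes_rev if f.bytes_rev else 0.0,       # byte-count ratio
        float(f.syn_count),
        f.duration_s,
    ]

fv = feature_vector(FlowRecord(51234, 80, 10, 8, 1200, 9600, 2, 2.0))
print(fv)
```

Zero denominators fall back to 0.0 here; that is a design choice for the sketch, not something the slides specify.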

Network Traffic Classes For Different Applications
Application Class   1st Testing Dataset   2nd Testing Dataset   3rd Testing Dataset
Database                            781                    36                    43
FTP                               4,422                   307                   386
Mail                             13,018                 2,771                 2,508
Multimedia                          488                    36                    33
P2P                                 797                   109                   283
Service                           1,037                   293                   220
WWW                             109,984                51,979                51,692
Total                           130,527                55,531                55,165
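A quick check of the table, using only the counts shown above: the per-class counts add up to the stated totals, the three datasets together account for the 241,223 flows of the trace, and WWW makes up roughly 84% of the 1st testing dataset, which illustrates how imbalanced the classes are.

```python
# Flow counts per class for the 1st, 2nd, and 3rd testing datasets.
counts = {
    "Database":   (781, 36, 43),
    "FTP":        (4422, 307, 386),
    "Mail":       (13018, 2771, 2508),
    "Multimedia": (488, 36, 33),
    "P2P":        (797, 109, 283),
    "Service":    (1037, 293, 220),
    "WWW":        (109984, 51979, 51692),
}

totals = [sum(c[d] for c in counts.values()) for d in range(3)]
print(totals, sum(totals))

# Share of each class within each dataset, in percent.
for name, c in counts.items():
    shares = ", ".join(f"{100 * c[d] / totals[d]:6.2f}%" for d in range(3))
    print(f"{name:<10} {shares}")
```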

Comparison of SVM Classification Algorithms
SVM Type   Training time (ms)   Overall Accuracy
SVM-IT                    187   98.66% (128776/130527)
SVM-0                   1,575   98.48% (128551/130527)
SVM-1                   2,698   98.68% (128804/130527)
SVM-2                     530   98.62% (128724/130527)
SVM-3                   1,388   98.2% (128172/130527)
SVM-4                   1,528   99.1% (129356/130527)
SVM-5                   7,534   98.56% (128644/130527)
SVM-6                   2,932   98.16% (128127/130527)
SVM-7                   5,911   98.5% (128571/130527)

Ratio of Classification Accuracy/Training time
[Figure: Ratio of Accuracy/Training time (in logarithmic scale) provided by 9 different SVM classification algorithms for the 1st testing dataset; x-axis: SVM Type (SVM-IT, SVM-0 to SVM-7); y-axis: log10[10(Accuracy/Time)]]
 SVM-5 exhibits the lowest performance/cost ratio
 Iterative-tuning SVM provides the highest
performance/cost ratio
 achieving a better trade-off between classification
accuracy and training speed than the other 8 SVMs
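The ranking stated above can be reproduced from the comparison table for the 1st testing dataset. The sketch below computes the plain ratio of accuracy to training time in milliseconds; it does not apply the logarithmic scaling of the figure, whose exact scale factor is not recoverable from the slides.

```python
# (training time in ms, correctly classified flows) per SVM type.
results = {
    "SVM-IT": (187, 128776),
    "SVM-0":  (1575, 128551),
    "SVM-1":  (2698, 128804),
    "SVM-2":  (530, 128724),
    "SVM-3":  (1388, 128172),
    "SVM-4":  (1528, 129356),
    "SVM-5":  (7534, 128644),
    "SVM-6":  (2932, 128127),
    "SVM-7":  (5911, 128571),
}
TOTAL_FLOWS = 130527

# Accuracy divided by training time: higher means better performance per cost.
ratio = {name: (correct / TOTAL_FLOWS) / ms for name, (ms, correct) in results.items()}
ranked = sorted(ratio, key=ratio.get, reverse=True)

for name in ranked:
    print(f"{name:<7} {ratio[name]:.6f}")
```

Because the accuracies differ by less than one percentage point, the ordering is driven almost entirely by training time.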

Classification Precision of Each Class (1)
 All 9 SVMs can identify more than 99% of WWW traffic
 SVM-4 has the highest precision for identifying
Database, FTP, and P2P traffic
 SVM-3 exhibits higher precision for classifying Mail
traffic than the other 8 SVMs
[Figure: Classification precision versus SVM type; series: Overall, FTP, Mail, WWW]
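The slides do not define how per-class precision is computed. As a hedged illustration, the sketch below assumes the standard definition (flows correctly assigned to a class divided by all flows assigned to that class) alongside overall accuracy; the function names and toy labels are hypothetical.

```python
from collections import Counter

def per_class_precision(true_labels, predicted_labels):
    """Assumed definition: correct predictions of a class / all predictions of it."""
    predicted = Counter(predicted_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {cls: correct[cls] / n for cls, n in predicted.items()}

def overall_accuracy(true_labels, predicted_labels):
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

true_labels = ["WWW", "WWW", "Mail", "FTP"]
predicted_labels = ["WWW", "Mail", "Mail", "FTP"]

print(per_class_precision(true_labels, predicted_labels))
print(overall_accuracy(true_labels, predicted_labels))
```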

Classification Precision of Each Class (2)
 Iterative-tuning SVM can identify 90% of Service traffic,
more precisely than the other 8 SVMs
 SVM-5 can identify Multimedia traffic with greater
precision than the other 8 SVMs
[Figure: Classification precision versus SVM type; series: Database, Multimedia, P2P, Service]

Other Experimental Findings
 Benefit of SVM Classification over Port-based
Classification
 Port-based classification achieves an overall classification
accuracy of only about 88%
 SVM classification achieves an overall classification accuracy
of about 98%
 Advantage and Disadvantage of Unbiased Training
Dataset
 an unbiased training dataset makes the classification precision
of the different classes more balanced
 there is no arbitrarily low precision for any particular class
 overall accuracy decreases by nearly 2%
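The slides do not say how the unbiased training dataset was built. One common way to obtain such a set, shown here purely as an assumption, is to draw the same number of training samples from every class so that a dominant class such as WWW cannot swamp the smaller ones. The function name and parameters are hypothetical.

```python
import random
from collections import defaultdict

def balanced_sample(samples, labels, per_class, seed=0):
    """Draw up to per_class samples from each class (assumed balancing strategy)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    picked = []
    for y, xs in sorted(by_class.items()):
        chosen = rng.sample(xs, min(per_class, len(xs)))
        picked.extend((x, y) for x in chosen)
    return picked

# Toy, heavily imbalanced input: 100 WWW flows and 5 P2P flows.
labels = ["WWW"] * 100 + ["P2P"] * 5
samples = list(range(105))

picked = balanced_sample(samples, labels, per_class=5)
print(picked)
```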

Conclusions
 Propose an iterative-tuning scheme to increase training
speed
 for the SVM multi-class classification dual problem
 Analyze the working mechanism of the iterative-tuning scheme
 to obtain the dual parameter vector for the SVM classification model
 Iterative-tuning SVM is computationally more efficient
than 8 typical SVMs
 while exhibiting almost the same accuracy as those 8 SVMs
 SVM classification based on flow-level information
 achieves accuracy higher than 98%
 allows network/cloud operators to apply traffic classification to a
range of issues including semi-real-time security monitoring and
traffic engineering