Yang Hong, Changcheng Huang
Department of Systems and Computer Engineering,
Carleton University, Ottawa, Canada
Biswajit Nandy, Nabil Seddigh
Solana Networks
Ottawa, Ontario, Canada
14th IFIP/IEEE Symposium on Integrated Network and Service Management
(IFIP/IEEE IM), Ottawa, Canada, May 2015, pp.458−466
Why Traffic Classification?
• Different types of network/cloud applications impose inherently different QoS requirements
  ◦ low end-to-end delay for interactive applications
  ◦ high throughput for file transfer applications
• Network utilization needs to be optimized
  ◦ while ensuring performance for various applications
• Network/cloud operators need to treat applications differently
Applications over Network Traffic
Network applications are classified into different categories
• bulk data transfer
  ◦ file transfer protocol
  ◦ peer-to-peer downloads
• cloud service
  ◦ cloud computing
  ◦ database transactions
• real-time streaming
  ◦ voice
  ◦ video
[Figure: Different applications running over the network. Labels: Internet, Computer Group, Database, Streaming Media, Web, FTP, E-mail, E-Commerce]
Contributions of This Paper
• Proposal of an iterative-tuning scheme to
  ◦ increase the training speed of Support Vector Machine (SVM) learning algorithms on multi-class classification problems
• Theoretical analysis of the iterative-tuning scheme to
  ◦ derive the equations for obtaining the SVM parameters
• Application of iterative-tuning SVM to
  ◦ achieve the best trade-off between classification accuracy and training speed
Outline
• Related Work (Traffic Classification Approaches)
• Support Vector Machine (SVM) Overview
• SVM Multi-Class Formulation
• Iterative-Tuning SVM
• Performance Evaluation of SVM Classification
• Conclusions
Traffic Classification Approaches (1)
• Port-based Classification
  ◦ Perform application mapping using Internet Assigned Numbers Authority (IANA) standardized port numbers
• Payload-based Classification
  ◦ Inspect the packet header and payload to match them against application pattern signatures
• Host-behavior-based Classification
  ◦ Capture behavioral information of a host to match it against host-behavior signatures of applications
• Flow-features-based Classification
  ◦ Capture flow features to map different applications to different statistical features
Traffic Classification Approaches (2)
• Port-based Classification
  ◦ insufficient for applications that assign ports dynamically or share popular ports
• Payload-based Classification
  ◦ cannot accurately identify the traffic application if the payload is encrypted
• Host-behavior-based Classification
  ◦ cannot identify specific application sub-types
• Flow-features-based Classification
  ◦ requires a large-scale dataset
• The SVM algorithm achieves the highest traffic classification accuracy (this paper improves SVM)
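The port-based approach and its limitation can be illustrated with a minimal sketch. The lookup table below is hypothetical: the port numbers are standard IANA assignments, while the class names simply reuse the traffic classes from this deck.

```python
# Hypothetical port-to-application table (illustration only).
PORT_TO_APP = {
    20: "FTP", 21: "FTP",
    25: "Mail", 110: "Mail", 143: "Mail",
    80: "WWW", 443: "WWW",
    3306: "Database",
}

def classify_by_port(src_port, dst_port):
    """Return the application class for a flow, or "Unknown" if no port matches."""
    for port in (dst_port, src_port):
        if port in PORT_TO_APP:
            return PORT_TO_APP[port]
    return "Unknown"

print(classify_by_port(51234, 80))    # a web flow is recognised as WWW
print(classify_by_port(51234, 6881))  # a dynamically chosen port gives Unknown
```

A flow that tunnels another application over port 80 would be labelled WWW by this lookup, which is the "share popular ports" weakness noted above.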
Support Vector Machine (SVM) Overview
• Construct a separate hyper-plane for each traffic class with multiple flow-features
• Maximize the distance between the closest training data samples of different classes in the n-dimensional flow-feature space
  ◦ A red circle represents a training sample of Class 1
  ◦ A blue square represents a training sample of Class 2
• A new sample is classified into the class to which it is closest
[Figure: Hyper-planes constructed by SVM for two different classes (Class 1 and Class 2), with points A and B marked]
SVM Multi-Class Formulation (Primal Problem)
Notation: $i$ is the index of a class; $m$ is the number of classes;
$w_i$ is the weight vector associated with class $i$;
$C$ is a general regularization parameter, $C > 0$;
$k$ is the index of a training sample; $l$ is the number of training samples;
$\xi_k$ is a non-negative slack variable associated with the $k$-th training sample;
$x_k \in \Re^n$ is an input vector ($n$ is the number of flow-features) associated with the $k$-th training sample;
$y_k \in \{1, \ldots, m\}$ is the corresponding membership class for $x_k$.
The weight $w_i$ can be regarded as the inverse of the mean distance between a sample and all training samples of class $i$.

$$\min_{w_i,\,\xi_k} \; \frac{1}{2}\sum_{i=1}^{m} \|w_i\|^2 + C\sum_{k=1}^{l} \xi_k \tag{1}$$

Constraints:

$$w_{y_k}^T x_k - w_i^T x_k \ge e_{i,k} - \xi_k, \quad \forall i, k \tag{2}$$

$$D(x) = \max_i D_i = \max_i w_i^T x \tag{3}$$

The maximum value $D_i$ predicts the membership class of a testing sample.
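As a minimal sketch of the decision rule (3), the function below scores a sample against every class weight vector and returns the class with the largest score. The weights and the sample are made-up values for illustration, not trained parameters.

```python
def predict_class(weights, x):
    """Return (class index starting at 1, score) maximizing D_i = w_i^T x."""
    scores = [sum(w_j * x_j for w_j, x_j in zip(w_i, x)) for w_i in weights]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best + 1, scores[best]

# Hypothetical weights for m = 3 classes and n = 2 flow-features.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
print(predict_class(weights, [0.2, 0.9]))  # (2, 0.9)
```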
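SVM Multi-Class Formulation (Dual Problem)
Notation: see the previous slide (Primal Problem)

$$\min_{\alpha} \; \Omega(\alpha) = \frac{1}{2}\sum_{i=1}^{m} \|w_i(\alpha)\|^2 + \sum_{k=1}^{l}\sum_{i=1}^{m} e_{i,k}\,\alpha_{i,k} \tag{4}$$

Constraints:

$$\alpha_{i,k} \le C_{i,k} \;\; \forall i \;\; [1], \qquad \sum_{i=1}^{m} \alpha_{i,k} = 0 \;\; [2], \qquad \forall k \tag{5}$$

$$w_i(\alpha) = \sum_{k=1}^{l} w_{i,k}(\alpha) = \sum_{k=1}^{l} \alpha_{i,k}\, x_k \tag{7}$$

Optimal solution:

$$\frac{\partial \Omega(\alpha)}{\partial \alpha_{i,k}} = 0 \tag{8}$$

$$g_{i,k} = \frac{\partial \Omega(\alpha)}{\partial \alpha_{i,k}} = w_i(\alpha)^T x_k + e_{i,k}, \quad \forall i, k \tag{9}$$

$$\upsilon_k = \max_{i} g_{i,k} - \min_{i:\,\alpha_{i,k} < C_{i,k}} g_{i,k}, \quad \forall k \tag{10}$$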
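The quantities in (7), (9) and (10) can be computed directly, as in the sketch below. The slide does not define $e_{i,k}$ or $C_{i,k}$, so the code assumes the usual Crammer-Singer convention: $e_{i,k}$ is 0 when $i = y_k$ and 1 otherwise, and $C_{i,k}$ is $C$ when $i = y_k$ and 0 otherwise. Classes are indexed from 0 here, and all data are toy values.

```python
def class_weights(alpha, X):
    """w_i(alpha) = sum_k alpha[i][k] * x_k, as in (7)."""
    n = len(X[0])
    return [[sum(a_i[k] * X[k][j] for k in range(len(X))) for j in range(n)]
            for a_i in alpha]

def gradient(alpha, X, y, k):
    """g_{i,k} = w_i(alpha)^T x_k + e_{i,k} for every class i, as in (9)."""
    g = []
    for i, w_i in enumerate(class_weights(alpha, X)):
        e_ik = 0.0 if i == y[k] else 1.0  # assumed definition of e_{i,k}
        g.append(sum(wj * xj for wj, xj in zip(w_i, X[k])) + e_ik)
    return g

def violation(alpha, X, y, k, C):
    """v_k = max_i g_{i,k} - min over {i: alpha_{i,k} < C_{i,k}} of g_{i,k}, as in (10)."""
    g = gradient(alpha, X, y, k)
    free = [g[i] for i in range(len(g))
            if alpha[i][k] < (C if i == y[k] else 0.0)]  # assumed C_{i,k}
    return max(g) - min(free) if free else 0.0

X = [[1.0, 0.0], [0.0, 1.0]]      # l = 2 toy samples, n = 2 features
y = [0, 1]                        # class labels (0-based)
alpha = [[0.0, 0.0], [0.0, 0.0]]  # alpha[i][k], m = 2 classes
print(gradient(alpha, X, y, 0), violation(alpha, X, y, 0, C=1.0))
```

At the all-zero starting point the violation for sample 0 is positive, so that sample still needs its dual variables updated.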
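Iterative-Tuning SVM
[Figure: System diagram of the iterative-tuning scheme, with blocks labelled $w(\alpha)$, Iterative Tuning, $\alpha$, and $\Omega(\alpha)$]

$$\alpha'_{k,j+1} = \alpha'_{k,j} - \gamma_j\, \lambda_j^{-1}\, \frac{\partial \Omega(\alpha'_{k,j})}{\partial \alpha'_{k}} \tag{12}$$

$$\alpha'_k = [\alpha_{1,k};\, \ldots;\, \alpha_{i,k};\, \ldots;\, \alpha_{m,k}] \tag{13}$$

$$\alpha_{m,k} = -\sum_{i=1}^{m-1} \alpha_{i,k} \tag{14}$$

$$\lambda_j = \sum_{i=1}^{m} \left[\frac{\partial w_i(\alpha'_{k,j})}{\partial \alpha'_k}\right]^T \left[\frac{\partial w_i(\alpha'_{k,j})}{\partial \alpha'_k}\right] \tag{16}$$

$$\frac{\partial w_{i,k}(\alpha)}{\partial \alpha_{i,k}} = x_k \tag{17}$$

The Gauss-Newton algorithm provides faster convergence speed.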
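The update (12) can be sketched as a scaled gradient step. This is a simplified reading, not the authors' full algorithm: it assumes the scaling factor reduces to the scalar $x_k^T x_k$ for each class component (which follows from (17) when every $\alpha_{i,k}$ is updated separately), and it does not enforce the dual constraints. The names `tuning_step`, `W`, `e_k` and `gamma` are hypothetical.

```python
def tuning_step(W, alpha, x_k, e_k, k, gamma=1.0):
    """One step of the form (12) for sample k, applied per class component.

    Assumption: lambda = x_k^T x_k, the curvature of Omega with respect to
    each alpha_{i,k} under (17). Constraints on alpha are NOT enforced.
    """
    lam = sum(v * v for v in x_k)
    for i in range(len(W)):
        g_ik = sum(w * v for w, v in zip(W[i], x_k)) + e_k[i]  # gradient (9)
        step = gamma * g_ik / lam
        alpha[i][k] -= step
        # Keep w_i(alpha) = sum_k alpha_{i,k} x_k in sync, as in (7).
        W[i] = [w - step * v for w, v in zip(W[i], x_k)]
    return W, alpha

W = [[0.0, 0.0], [0.0, 0.0]]  # m = 2 classes, n = 2 features
alpha = [[0.0], [0.0]]        # one training sample (k = 0)
x_k, e_k = [1.0, 2.0], [0.0, 1.0]
W, alpha = tuning_step(W, alpha, x_k, e_k, k=0)
new_g = [sum(w * v for w, v in zip(W[i], x_k)) + e_k[i] for i in range(2)]
print(alpha, new_g)
```

With `gamma=1.0` the gradient components for the updated sample drop to (numerically) zero after one step, because the objective is quadratic in each $\alpha_{i,k}$; a constrained solver would additionally have to keep the step feasible.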
Experimental Setup
• Use NetFlow-V5 to collect the network traffic trace
  ◦ Data collected over a 24-hour period
• Utilize 12 flow-features obtained from NetFlow-V5 flow-records
  ◦ as the basis for input to the classification algorithms
  ◦ traffic classification achieves better accuracy if all 12 flow-features are selected
• The NetFlow data trace consists of 241,223 TCP flows
• The 3 testing datasets consist of 130,527 flows, 55,531 flows, and 55,165 flows
  ◦ belonging to 3 different time periods, respectively
Flow-Features For Traffic Classification
Feature ID | Feature Name
1 | source port
2 | destination port
3 | average packet size
4 | average bytes/sec (src→dst)
5 | average bytes/sec (dst→src)
6 | packet count (src→dst)
7 | packet count (dst→src)
8 | byte count (src→dst)
9 | byte count (dst→src)
10 | ratio of byte count (src→dst) / byte count (dst→src)
11 | SYN flag count
12 | flow duration
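Several of these features are derived from the raw counters. The sketch below shows one way to compute features 3, 4, 5 and 10, assuming the two directions of a flow have already been paired and that "average packet size" means total bytes over total packets; the deck does not state either detail, and the function and field names are hypothetical.

```python
def derived_features(bytes_fwd, bytes_rev, pkts_fwd, pkts_rev, duration_s):
    """Compute features 3, 4, 5 and 10 from per-direction counters.

    Assumption: zero denominators yield 0.0 so the feature vector stays finite.
    """
    def safe_div(a, b):
        return a / b if b else 0.0

    return {
        "avg_packet_size": safe_div(bytes_fwd + bytes_rev, pkts_fwd + pkts_rev),
        "avg_bytes_per_sec_fwd": safe_div(bytes_fwd, duration_s),
        "avg_bytes_per_sec_rev": safe_div(bytes_rev, duration_s),
        "byte_ratio": safe_div(bytes_fwd, bytes_rev),
    }

features = derived_features(1200, 48000, 10, 40, 2.0)
print(features)
```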
Network Traffic Classes For Different Applications
Application Class | 1st Testing Dataset | 2nd Testing Dataset | 3rd Testing Dataset
Database | 781 | 36 | 43
FTP | 4,422 | 307 | 386
Mail | 13,018 | 2,771 | 2,508
Multimedia | 488 | 36 | 33
P2P | 797 | 109 | 283
Service | 1,037 | 293 | 220
WWW | 109,984 | 51,979 | 51,692
Total | 130,527 | 55,531 | 55,165
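The class counts are heavily skewed toward WWW. A short calculation on the 1st testing dataset, using only the numbers in the table, makes the imbalance explicit:

```python
counts = {"Database": 781, "FTP": 4422, "Mail": 13018, "Multimedia": 488,
          "P2P": 797, "Service": 1037, "WWW": 109984}
total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

# Print classes from most to least frequent with their share of all flows.
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:<10} {share:6.2%}")
```

WWW accounts for roughly 84% of the flows, so overall accuracy is dominated by that class and per-class precision is needed to judge the smaller classes.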
Comparison of SVM Classification Algorithms
SVM Type | Training time (ms) | Overall Accuracy
SVM-IT | 187 | 98.66% (128776/130527)
SVM-0 | 1,575 | 98.48% (128551/130527)
SVM-1 | 2,698 | 98.68% (128804/130527)
SVM-2 | 530 | 98.62% (128724/130527)
SVM-3 | 1,388 | 98.2% (128172/130527)
SVM-4 | 1,528 | 99.1% (129356/130527)
SVM-5 | 7,534 | 98.56% (128644/130527)
SVM-6 | 2,932 | 98.16% (128127/130527)
SVM-7 | 5,911 | 98.5% (128571/130527)
Ratio of Classification Accuracy/Training time
[Figure: Ratio of Accuracy/Training time (in logarithmic scale) provided by 9 different SVM classification algorithms for the 1st testing dataset. X-axis: SVM Type (SVM-IT, SVM-0 to SVM-7); Y-axis: log10[10(Accuracy/Time)]]
• SVM-5 exhibits the lowest performance/cost ratio
• Iterative-tuning SVM provides the highest performance/cost ratio
  ◦ achieving a better trade-off between classification accuracy and training speed than the other 8 SVMs
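The ranking can be reproduced from the training times and correct-classification counts reported for the 1st testing dataset. The sketch below computes a plain accuracy/time ratio; the units behind the plotted log10[10(Accuracy/Time)] scale are not stated in the deck, so only the ordering is compared.

```python
TOTAL_FLOWS = 130527
results = {  # SVM type: (training time in ms, correctly classified flows)
    "SVM-IT": (187, 128776), "SVM-0": (1575, 128551), "SVM-1": (2698, 128804),
    "SVM-2": (530, 128724), "SVM-3": (1388, 128172), "SVM-4": (1528, 129356),
    "SVM-5": (7534, 128644), "SVM-6": (2932, 128127), "SVM-7": (5911, 128571),
}

# Accuracy (as a fraction) divided by training time in milliseconds.
ratio = {name: (correct / TOTAL_FLOWS) / t_ms
         for name, (t_ms, correct) in results.items()}
ranked = sorted(ratio, key=ratio.get, reverse=True)
print(ranked[0], ranked[-1])  # SVM-IT SVM-5
```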
Classification Precision of Each Class (1)
• All 9 SVMs can identify more than 99% of WWW traffic
• SVM-4 has the highest precision for identifying
  ◦ Database, FTP, and P2P traffic
• SVM-3 exhibits higher precision for classifying Mail traffic than the other 8 SVMs
[Figure: Classification precision versus SVM type for Overall, FTP, Mail, and WWW. X-axis: SVM Type; Y-axis: Classification Precision]
Classification Precision of Each Class (2)
• Iterative-tuning SVM can identify 90% of Service traffic, more precisely than the other 8 SVMs
• SVM-5 can identify Multimedia traffic with greater precision than the other 8 SVMs
[Figure: Classification precision versus SVM type for Database, Multimedia, P2P, and Service. X-axis: SVM Type; Y-axis: Classification Precision]
Other Experimental Findings
• Benefit of SVM Classification over Port-based Classification
  ◦ Port-based classification obtains an overall classification accuracy of only about 88%
  ◦ SVM classification achieves an overall classification accuracy of about 98%
• Advantage and Disadvantage of Unbiased Training Dataset
  ◦ An unbiased training dataset makes the classification precision of each class more balanced
  ◦ there is no arbitrarily low precision for any particular class
  ◦ overall accuracy decreases by nearly 2%
Conclusions
• Proposed an iterative-tuning scheme to increase training speed
  ◦ for the SVM multi-class classification dual problem
• Analyzed the working mechanism of the iterative-tuning scheme
  ◦ to obtain the dual parameter vector for the SVM classification model
• Iterative-tuning SVM is computationally more efficient than 8 typical SVMs
  ◦ while exhibiting almost identical accuracy to those 8 SVMs
• SVM classification based on flow-level information
  ◦ achieves accuracy higher than 98%
  ◦ allows network/cloud operators to apply traffic classification to a range of issues, including semi-real-time security monitoring and traffic engineering
