Jun Liu, Feng Liu, and N. Ansari 
Beijing University of Posts and Telecommunications, Beijing, China 
IEEE Network • July/August 2014 
Advisor : Dr. Jenq-Shiou Leu 
Student : Chia-Yun Chan 
Date : 2014/12/09
Introduction 
System Architecture 
Traffic Analysis Algorithms 
Experimental Results 
Conclusions
Network traffic monitoring and analysis is significant 
for optimizing network resources and improving user 
experience 
Existing solutions usually rely on a high-performance 
server with large storage capacity and are not scalable for 
detailed analysis of big traffic data
The features of Hadoop 
Distributed parallel computing 
Low-cost scale-out capability 
High fault tolerance 
But when Hadoop is used to analyze network traffic data, 
some important issues in large-scale commercial 
telecommunication networks remain unsolved
Application-layer analysis 
Web service provider analysis 
User behavior analysis
[Figure: example weighted graph with nodes a, b, c, d, and e and numeric edge weights]
We develop a three-step algorithm: 
1. Measuring affinity 
2. Sparsifying a graph 
3. Identifying communities
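
The three steps can be sketched in Python as follows. This is only a minimal illustration, not the authors' implementation: the slides do not define the affinity measure, the sparsification rule, or the community-detection method. The sketch assumes affinity is the Jaccard overlap of the user sets of two sites, sparsification drops edges below a fixed threshold, and communities are the connected components of the remaining graph. The function names and the sample `visits` data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def measure_affinity(visits):
    """Step 1 (assumed metric): Jaccard overlap of the user sets of two sites.

    visits is an iterable of (user, site) pairs.
    """
    users = defaultdict(set)
    for user, site in visits:
        users[site].add(user)
    affinity = {}
    for a, b in combinations(sorted(users), 2):
        shared = len(users[a] & users[b])
        if shared:  # only sites with at least one common user get an edge
            affinity[(a, b)] = shared / len(users[a] | users[b])
    return affinity


def sparsify(affinity, threshold):
    """Step 2 (assumed rule): keep only edges whose weight reaches the threshold."""
    return {edge: w for edge, w in affinity.items() if w >= threshold}


def identify_communities(nodes, edges):
    """Step 3 (assumed method): connected components, found with union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical sample data: (user, site) visit records.
visits = [
    ("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"),
    ("u3", "c"), ("u3", "d"), ("u1", "c"),
]
sites = {site for _, site in visits}
edges = sparsify(measure_affinity(visits), threshold=0.5)
communities = identify_communities(sites, edges)
```

With a threshold of 0.5, the weak a–c and b–c edges (affinity 1/3) are dropped, so the sample data splits into the two groups `["a", "b"]` and `["c", "d"]`.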
Mobile operators want to know the user behaviors of 
cellular devices including models, prices, and features 
We design a novel Jaccard-based learning method to 
build a cellular device model database 
1. Extract all keywords of a device model 
2. Filter candidate keywords 
3. Calculate the Jaccard coefficient index using statistical 
information, and select the keyword with the highest 
Jaccard index to represent the device model
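
The keyword selection can be sketched as below. This is a hedged illustration under stated assumptions, not the authors' code: the record layout, the direction of the conditional probability (here P(model | keyword)), the 0.5 cutoff, and all names are hypothetical. The Jaccard index itself is the standard |A ∩ B| / |A ∪ B|, where A is the set of records containing the keyword and B is the set of records belonging to the device model.

```python
def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pick_keyword(records, model, min_cond_prob=0.5):
    """Return (keyword, jaccard_index) that best represents `model`, or None.

    records is a list of (device_model, set_of_keywords) pairs.
    min_cond_prob is an assumed cutoff for the candidate filter.
    """
    model_ids = {i for i, (m, _) in enumerate(records) if m == model}
    # Step 1: extract all keywords seen with this device model.
    keywords = set().union(*(kws for m, kws in records if m == model))
    best = None
    for kw in sorted(keywords):
        kw_ids = {i for i, (_, kws) in enumerate(records) if kw in kws}
        # Step 2: filter candidates by the assumed P(model | keyword).
        if len(kw_ids & model_ids) / len(kw_ids) < min_cond_prob:
            continue
        # Step 3: score by Jaccard index and keep the highest.
        score = jaccard(kw_ids, model_ids)
        if best is None or score > best[1]:
            best = (kw, score)
    return best


# Hypothetical sample records: (device model, keywords extracted from its traffic).
records = [
    ("ModelX", {"modelx", "android", "mozilla"}),
    ("ModelX", {"modelx", "android"}),
    ("ModelY", {"modely", "android", "mozilla"}),
    ("ModelY", {"modely", "mozilla"}),
]
best_for_x = pick_keyword(records, "ModelX")
```

In this sample, generic keywords such as "android" appear across both models and score lower than a keyword that occurs only with one model, which is the behavior the Jaccard index is chosen for here.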
A novel system for monitoring and analyzing large-scale 
network traffic data 
Designed algorithms and implemented MapReduce 
programs for network traffic analysis from different 
perspectives 
Revealed a number of network traffic and user 
behavior phenomena not shown before
Monitoring and Analyzing Big Traffic Data of a Large-Scale Cellular Network with Hadoop

Editor's Notes

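  • #2 Monitoring and analyzing big traffic data of a large-scale cellular network with Hadoop
  • #4 Network traffic monitoring and analysis is significant for optimizing network resources and improving user experience. Existing solutions usually rely on a high-performance server with large storage capacity and are not scalable for detailed analysis of big traffic data.
  • #5 Hadoop has several important features: efficient distributed parallel computing, low-cost scale-out capability, and high fault tolerance. When Hadoop is used to analyze network traffic data, some important issues in large-scale commercial telecommunication networks have not yet been solved.
  • #12 Mobile operators want to understand user behavior across cellular devices, including models, prices, and features. We design a novel Jaccard-based learning method to build a cellular device model database. 1. Extract all keywords from the descriptions of a device model. 2. Filter candidate keywords by evaluating the conditional probability between each keyword and the device model. 3. Using the statistical information, calculate the Jaccard coefficient index and select the keyword with the highest Jaccard index to represent the device model.
  • #18 A novel system for monitoring and analyzing large-scale network traffic data. Algorithms for network traffic analysis from different perspectives were designed and implemented as MapReduce programs, which enabled us to reveal a number of network traffic and user behavior phenomena not shown before.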