SlideShare a Scribd company logo
Use case:
Merging heterogeneous
network measurement data
Jorge E. López de Vergara and Javier Aracil
Jorge.lopez_vergara@uam.es
Credits to Rubén García-Valcárcel, Iván González, Rafael
Leira, Víctor Moreno, David Muelas, Javier Ramos, Paula
Roquero, Carlos Vega and the rest of the HPCN-UAM team
SMART Internet Monitoring Study 3rd Workshop,
Barcelona, Spain, 22nd April 2015
Contents
•  Introduction
•  Technologies
–  High-speed traffic measurements
–  Data integration alternatives
–  Hadoop for network measurements
–  Log processing
•  Conclusions
Merging heterogeneous network measurement data 2
Introduction
•  Network measurement data can come from
different sources
–  Network-oriented sources:
•  SNMP MIB instances
•  Netflow records
•  Pcap files
•  …
–  Application-oriented sources:
•  Logs
–  Some standard (Apache web log)
–  Some proprietary (application specific)
•  Important to deal with encrypted traffic
–  It is necessary to provide ways to merge them
Merging heterogeneous network measurement data 3
High-speed traffic
measurements
•  Requirements
–  Capture at core networks
–  +10 Gbps links, sometimes virtualized
–  No packet drops
•  Available off-the-shelf resources that help on this
tough task
–  Intel +10G network cards
•  Intel DPDK
•  Other developments: HPCAP at UAM
–  Mellanox +10G network cards
•  Mellanox Messaging Accelerator (VMA)
–  Multicore processors
•  CPU affinity and isolation for key tasks
–  Lots of RAM memory
•  Use of hugepages and mmap
Merging heterogeneous network measurement data 4
High-speed traffic
measurements
Merging heterogeneous network measurement data 5
HPCAP
M3
OMON
Traffic
Sniffer
NIC	
  
Packet
Ring
Packet
Buffer
Packet	
  
Dumper
Flow	
  
Manager
Flow	
  
Exporter
Packet-­‐level
Traces
Flow-­‐level
Traces
Agg.Stats
Traces
App.	
  1
App.	
  2
Flow
Table
READ
(on	
  demand)
WRITE
NON-­‐VOLATILE	
  
MEMORY
VOLATILE	
  
MEMORY
KERNEL-­‐LEVEL
PROCESSING	
  
MODULE
Packet
Arrival
USER-­‐LEVEL
PROCESSING	
  
MODULE
App.	
  N
M3
OMON
API
packet
access
MRTG
access
flow
access
READ
(real-­‐time)
Data integration alternatives
•  SQL databases
–  Pros: reliable, normalized schemas, consistent
with defined constraints
–  Cons: slow, need to use materialized views to
go faster, creating materialized views is costly
and sometimes it can’t be done concurrently
à Not valid to deal with high-speed network
measurements
•  Plain files, no SQL
–  Pros: much faster
–  Cons: inconsistencies, lack of normalization
à Necessary to deal with high-speed network
measurements, but keeping in mind its limitations
Merging heterogeneous network measurement data 6
Hadoop for network
measurements
•  General scenario
Merging heterogeneous network measurement data 7
Flow
Process
Flows
PCAP
PC PC
Users’ network
Internet
MapReduce
Jobs
HTTPSDNS
Hive
Hadoop
Time series
Popular
web pages
Malfunctioning
Equipment
Most
used PCs
Predictions
SerDe
HQL
HDFS
Admin Queries
Raw traffic
Processed
traffic
Deployed on
many nodes
to work
properly
Otherdata
sources(logs)
Hadoop for network
measurements
•  DNS analysis
Merging heterogeneous network measurement data 8
DNS
Traffic
Whois
database
Intelligent
Capture
Module
RADIUS
records
Hadoop
Long-Term
Storage
NoSQL
Database
Data
Visualization
Log processing
•  Requirements (real scenario):
–  Process 3M events per second (about 5 Gbps)
•  Put together application logs and network flows
–  Several disks in parallel are necessary to store the
events at appropriate rate
–  Fast access to time series and aggregated statistics
à What operators demand
–  Slower access to raw data
à What IT analysts demand
•  Elasticsearch and Kibana tools provide some
support, but it is necessary to tune them
Merging heterogeneous network measurement data 9
Kibana UI
Merging heterogeneous network measurement data 10
Conclusions
•  Need for different network measurement data
sources
–  Combine network and application data
–  Necessary to find sources of problems
•  Is the slowdown caused by the network, the server or the
application?
•  This question can only be answered if all information from
different layers is provided and analyzed
•  Huge amount of data à it is necessary to work
with fast processing systems
•  Availability of processing tools from the Cloud
Computing community
–  It is necessary to adapt them to the network
measurment processes
Merging heterogeneous network measurement data 11

More Related Content

What's hot

Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017
Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017
Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017
Big Data Spain
 
Grid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applicationsGrid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applications
Tal Lavian Ph.D.
 
Cloudgene - A MapReduce based Workflow Management System
Cloudgene - A MapReduce based Workflow Management SystemCloudgene - A MapReduce based Workflow Management System
Cloudgene - A MapReduce based Workflow Management System
Lukas Forer
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
DataWorks Summit/Hadoop Summit
 
Apache spot 系統架構
Apache spot 系統架構Apache spot 系統架構
Apache spot 系統架構
Hua Chu
 
Integrating Apache Phoenix with Distributed Query Engines
Integrating Apache Phoenix with Distributed Query EnginesIntegrating Apache Phoenix with Distributed Query Engines
Integrating Apache Phoenix with Distributed Query Engines
DataWorks Summit
 
Data streaming fundamentals
Data streaming fundamentalsData streaming fundamentals
Data streaming fundamentals
Mohammed Fazuluddin
 
Querying Druid in SQL with Superset
Querying Druid in SQL with SupersetQuerying Druid in SQL with Superset
Querying Druid in SQL with Superset
DataWorks Summit
 
Shaping a Digital Vision
Shaping a Digital VisionShaping a Digital Vision
Shaping a Digital Vision
DataWorks Summit/Hadoop Summit
 
RISELab:Enabling Intelligent Real-Time Decisions
RISELab:Enabling Intelligent Real-Time DecisionsRISELab:Enabling Intelligent Real-Time Decisions
RISELab:Enabling Intelligent Real-Time Decisions
Jen Aman
 
StreamHorizon and bigdata overview
StreamHorizon and bigdata overviewStreamHorizon and bigdata overview
StreamHorizon and bigdata overview
StreamHorizon
 
Integrating Hadoop & Solr
Integrating Hadoop & SolrIntegrating Hadoop & Solr
Integrating Hadoop & Solr
Lucidworks (Archived)
 
Jim Dowling – Interactive Flink analytics with HopsWorks and Zeppelin
Jim Dowling – Interactive Flink analytics with HopsWorks and ZeppelinJim Dowling – Interactive Flink analytics with HopsWorks and Zeppelin
Jim Dowling – Interactive Flink analytics with HopsWorks and Zeppelin
Flink Forward
 
Basic Hadoop Architecture V1 vs V2
Basic  Hadoop Architecture  V1 vs V2Basic  Hadoop Architecture  V1 vs V2
Basic Hadoop Architecture V1 vs V2
VIVEKVANAVAN
 
Spark in the Enterprise - 2 Years Later by Alan Saldich
Spark in the Enterprise - 2 Years Later by Alan SaldichSpark in the Enterprise - 2 Years Later by Alan Saldich
Spark in the Enterprise - 2 Years Later by Alan Saldich
Spark Summit
 
IoFMT – Internet of Fleet Management Things
IoFMT – Internet of Fleet Management ThingsIoFMT – Internet of Fleet Management Things
IoFMT – Internet of Fleet Management Things
DataWorks Summit
 
Spark Streaming the Industrial IoT
Spark Streaming the Industrial IoTSpark Streaming the Industrial IoT
Spark Streaming the Industrial IoT
Jim Haughwout
 
How Apache Spark Is Helping Tame the Wild West of Wi-Fi
How Apache Spark Is Helping Tame the Wild West of Wi-FiHow Apache Spark Is Helping Tame the Wild West of Wi-Fi
How Apache Spark Is Helping Tame the Wild West of Wi-Fi
Spark Summit
 
Data Science at Scale by Sarah Guido
Data Science at Scale by Sarah GuidoData Science at Scale by Sarah Guido
Data Science at Scale by Sarah Guido
Spark Summit
 
Using Spark and Riak for IoT Apps—Patterns and Anti-Patterns: Spark Summit Ea...
Using Spark and Riak for IoT Apps—Patterns and Anti-Patterns: Spark Summit Ea...Using Spark and Riak for IoT Apps—Patterns and Anti-Patterns: Spark Summit Ea...
Using Spark and Riak for IoT Apps—Patterns and Anti-Patterns: Spark Summit Ea...
Spark Summit
 

What's hot (20)

Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017
Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017
Tuning Java Driver for Apache Cassandra by Nenad Bozic at Big Data Spain 2017
 
Grid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applicationsGrid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applications
 
Cloudgene - A MapReduce based Workflow Management System
Cloudgene - A MapReduce based Workflow Management SystemCloudgene - A MapReduce based Workflow Management System
Cloudgene - A MapReduce based Workflow Management System
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
 
Apache spot 系統架構
Apache spot 系統架構Apache spot 系統架構
Apache spot 系統架構
 
Integrating Apache Phoenix with Distributed Query Engines
Integrating Apache Phoenix with Distributed Query EnginesIntegrating Apache Phoenix with Distributed Query Engines
Integrating Apache Phoenix with Distributed Query Engines
 
Data streaming fundamentals
Data streaming fundamentalsData streaming fundamentals
Data streaming fundamentals
 
Querying Druid in SQL with Superset
Querying Druid in SQL with SupersetQuerying Druid in SQL with Superset
Querying Druid in SQL with Superset
 
Shaping a Digital Vision
Shaping a Digital VisionShaping a Digital Vision
Shaping a Digital Vision
 
RISELab:Enabling Intelligent Real-Time Decisions
RISELab:Enabling Intelligent Real-Time DecisionsRISELab:Enabling Intelligent Real-Time Decisions
RISELab:Enabling Intelligent Real-Time Decisions
 
StreamHorizon and bigdata overview
StreamHorizon and bigdata overviewStreamHorizon and bigdata overview
StreamHorizon and bigdata overview
 
Integrating Hadoop & Solr
Integrating Hadoop & SolrIntegrating Hadoop & Solr
Integrating Hadoop & Solr
 
Jim Dowling – Interactive Flink analytics with HopsWorks and Zeppelin
Jim Dowling – Interactive Flink analytics with HopsWorks and ZeppelinJim Dowling – Interactive Flink analytics with HopsWorks and Zeppelin
Jim Dowling – Interactive Flink analytics with HopsWorks and Zeppelin
 
Basic Hadoop Architecture V1 vs V2
Basic  Hadoop Architecture  V1 vs V2Basic  Hadoop Architecture  V1 vs V2
Basic Hadoop Architecture V1 vs V2
 
Spark in the Enterprise - 2 Years Later by Alan Saldich
Spark in the Enterprise - 2 Years Later by Alan SaldichSpark in the Enterprise - 2 Years Later by Alan Saldich
Spark in the Enterprise - 2 Years Later by Alan Saldich
 
IoFMT – Internet of Fleet Management Things
IoFMT – Internet of Fleet Management ThingsIoFMT – Internet of Fleet Management Things
IoFMT – Internet of Fleet Management Things
 
Spark Streaming the Industrial IoT
Spark Streaming the Industrial IoTSpark Streaming the Industrial IoT
Spark Streaming the Industrial IoT
 
How Apache Spark Is Helping Tame the Wild West of Wi-Fi
How Apache Spark Is Helping Tame the Wild West of Wi-FiHow Apache Spark Is Helping Tame the Wild West of Wi-Fi
How Apache Spark Is Helping Tame the Wild West of Wi-Fi
 
Data Science at Scale by Sarah Guido
Data Science at Scale by Sarah GuidoData Science at Scale by Sarah Guido
Data Science at Scale by Sarah Guido
 
Using Spark and Riak for IoT Apps—Patterns and Anti-Patterns: Spark Summit Ea...
Using Spark and Riak for IoT Apps—Patterns and Anti-Patterns: Spark Summit Ea...Using Spark and Riak for IoT Apps—Patterns and Anti-Patterns: Spark Summit Ea...
Using Spark and Riak for IoT Apps—Patterns and Anti-Patterns: Spark Summit Ea...
 

Similar to Merging heterogeneous network measurement data

Hadoop ppt1
Hadoop ppt1Hadoop ppt1
Hadoop ppt1
chariorienit
 
Lightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and DruidLightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and Druid
DataWorks Summit
 
Hadoop Distributed File System
Hadoop Distributed File SystemHadoop Distributed File System
Hadoop Distributed File Systemelliando dias
 
Crossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Crossing Analytics Systems: Case for Integrated Provenance in Data LakesCrossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Crossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Isuru Suriarachchi
 
PNDA - Platform for Network Data Analytics
PNDA - Platform for Network Data AnalyticsPNDA - Platform for Network Data Analytics
PNDA - Platform for Network Data Analytics
John Evans
 
Hadoop/MapReduce/HDFS
Hadoop/MapReduce/HDFSHadoop/MapReduce/HDFS
Hadoop/MapReduce/HDFS
praveen bhat
 
Distributed Computing with Apache Hadoop: Technology Overview
Distributed Computing with Apache Hadoop: Technology OverviewDistributed Computing with Apache Hadoop: Technology Overview
Distributed Computing with Apache Hadoop: Technology Overview
Konstantin V. Shvachko
 
List of Engineering Colleges in Uttarakhand
List of Engineering Colleges in UttarakhandList of Engineering Colleges in Uttarakhand
List of Engineering Colleges in Uttarakhand
Roorkee College of Engineering, Roorkee
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
arslanhaneef
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
sonukumar379092
 
Architecting Big Data Ingest & Manipulation
Architecting Big Data Ingest & ManipulationArchitecting Big Data Ingest & Manipulation
Architecting Big Data Ingest & Manipulation
George Long
 
Big Data for QAs
Big Data for QAsBig Data for QAs
Big Data for QAs
Ahmed Misbah
 
Bigdata workshop february 2015
Bigdata workshop  february 2015 Bigdata workshop  february 2015
Bigdata workshop february 2015
clairvoyantllc
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
Apache Apex
 
VTU 6th Sem Elective CSE - Module 4 cloud computing
VTU 6th Sem Elective CSE - Module 4  cloud computingVTU 6th Sem Elective CSE - Module 4  cloud computing
VTU 6th Sem Elective CSE - Module 4 cloud computing
Sachin Gowda
 
module4-cloudcomputing-180131071200.pdf
module4-cloudcomputing-180131071200.pdfmodule4-cloudcomputing-180131071200.pdf
module4-cloudcomputing-180131071200.pdf
SumanthReddy540432
 
Big data analytics and machine intelligence v5.0
Big data analytics and machine intelligence   v5.0Big data analytics and machine intelligence   v5.0
Big data analytics and machine intelligence v5.0
Amr Kamel Deklel
 
Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL David Smelker
 
Introduction To Hadoop Ecosystem
Introduction To Hadoop EcosystemIntroduction To Hadoop Ecosystem
Introduction To Hadoop Ecosystem
InSemble
 
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksUsing Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
MapR Technologies
 

Similar to Merging heterogeneous network measurement data (20)

Hadoop ppt1
Hadoop ppt1Hadoop ppt1
Hadoop ppt1
 
Lightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and DruidLightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and Druid
 
Hadoop Distributed File System
Hadoop Distributed File SystemHadoop Distributed File System
Hadoop Distributed File System
 
Crossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Crossing Analytics Systems: Case for Integrated Provenance in Data LakesCrossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Crossing Analytics Systems: Case for Integrated Provenance in Data Lakes
 
PNDA - Platform for Network Data Analytics
PNDA - Platform for Network Data AnalyticsPNDA - Platform for Network Data Analytics
PNDA - Platform for Network Data Analytics
 
Hadoop/MapReduce/HDFS
Hadoop/MapReduce/HDFSHadoop/MapReduce/HDFS
Hadoop/MapReduce/HDFS
 
Distributed Computing with Apache Hadoop: Technology Overview
Distributed Computing with Apache Hadoop: Technology OverviewDistributed Computing with Apache Hadoop: Technology Overview
Distributed Computing with Apache Hadoop: Technology Overview
 
List of Engineering Colleges in Uttarakhand
List of Engineering Colleges in UttarakhandList of Engineering Colleges in Uttarakhand
List of Engineering Colleges in Uttarakhand
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
 
Architecting Big Data Ingest & Manipulation
Architecting Big Data Ingest & ManipulationArchitecting Big Data Ingest & Manipulation
Architecting Big Data Ingest & Manipulation
 
Big Data for QAs
Big Data for QAsBig Data for QAs
Big Data for QAs
 
Bigdata workshop february 2015
Bigdata workshop  february 2015 Bigdata workshop  february 2015
Bigdata workshop february 2015
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
 
VTU 6th Sem Elective CSE - Module 4 cloud computing
VTU 6th Sem Elective CSE - Module 4  cloud computingVTU 6th Sem Elective CSE - Module 4  cloud computing
VTU 6th Sem Elective CSE - Module 4 cloud computing
 
module4-cloudcomputing-180131071200.pdf
module4-cloudcomputing-180131071200.pdfmodule4-cloudcomputing-180131071200.pdf
module4-cloudcomputing-180131071200.pdf
 
Big data analytics and machine intelligence v5.0
Big data analytics and machine intelligence   v5.0Big data analytics and machine intelligence   v5.0
Big data analytics and machine intelligence v5.0
 
Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL
 
Introduction To Hadoop Ecosystem
Introduction To Hadoop EcosystemIntroduction To Hadoop Ecosystem
Introduction To Hadoop Ecosystem
 
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksUsing Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
 

More from Jorge E. López de Vergara Méndez

On the feasibility of 40 Gbps network data capture and retention with general...
On the feasibility of 40 Gbps network data capture and retention with general...On the feasibility of 40 Gbps network data capture and retention with general...
On the feasibility of 40 Gbps network data capture and retention with general...
Jorge E. López de Vergara Méndez
 
Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...
Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...
Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...
Jorge E. López de Vergara Méndez
 
Dictyogram: a Statistical Approach for the Definition and Visualization of Ne...
Dictyogram: a Statistical Approach for the Definition and Visualization of Ne...Dictyogram: a Statistical Approach for the Definition and Visualization of Ne...
Dictyogram: a Statistical Approach for the Definition and Visualization of Ne...
Jorge E. López de Vergara Méndez
 
Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...
Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...
Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...
Jorge E. López de Vergara Méndez
 
MONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOP
MONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOPMONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOP
MONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOP
Jorge E. López de Vergara Méndez
 
Multimedia flow classification at 10 Gbps using acceleration techniques on co...
Multimedia flow classification at 10 Gbps using acceleration techniques on co...Multimedia flow classification at 10 Gbps using acceleration techniques on co...
Multimedia flow classification at 10 Gbps using acceleration techniques on co...
Jorge E. López de Vergara Méndez
 
Evaluating Quality of Experience in IPTV Services Using MPEG Frame Loss Rate
Evaluating Quality of Experience in IPTV Services Using MPEG Frame Loss RateEvaluating Quality of Experience in IPTV Services Using MPEG Frame Loss Rate
Evaluating Quality of Experience in IPTV Services Using MPEG Frame Loss Rate
Jorge E. López de Vergara Méndez
 
Defining ontologies for IP traffic measurements at MOI ISG
Defining ontologies for IP traffic measurements at MOI ISGDefining ontologies for IP traffic measurements at MOI ISG
Defining ontologies for IP traffic measurements at MOI ISG
Jorge E. López de Vergara Méndez
 
Integración semántica de información de distintos repositorios de medidas de red
Integración semántica de información de distintos repositorios de medidas de redIntegración semántica de información de distintos repositorios de medidas de red
Integración semántica de información de distintos repositorios de medidas de red
Jorge E. López de Vergara Méndez
 

More from Jorge E. López de Vergara Méndez (9)

On the feasibility of 40 Gbps network data capture and retention with general...
On the feasibility of 40 Gbps network data capture and retention with general...On the feasibility of 40 Gbps network data capture and retention with general...
On the feasibility of 40 Gbps network data capture and retention with general...
 
Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...
Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...
Evaluación de equipamiento de bajo coste para realizar medidas de red en ento...
 
Dictyogram: a Statistical Approach for the Definition and Visualization of Ne...
Dictyogram: a Statistical Approach for the Definition and Visualization of Ne...Dictyogram: a Statistical Approach for the Definition and Visualization of Ne...
Dictyogram: a Statistical Approach for the Definition and Visualization of Ne...
 
Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...
Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...
Análisis de Datos Funcionales para Gestión de Red: Téecnicas, Retos y Oportun...
 
MONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOP
MONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOPMONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOP
MONITORIZACIÓN Y ANÁLISIS DE TRÁFICO DE RED CON APACHE HADOOP
 
Multimedia flow classification at 10 Gbps using acceleration techniques on co...
Multimedia flow classification at 10 Gbps using acceleration techniques on co...Multimedia flow classification at 10 Gbps using acceleration techniques on co...
Multimedia flow classification at 10 Gbps using acceleration techniques on co...
 
Evaluating Quality of Experience in IPTV Services Using MPEG Frame Loss Rate
Evaluating Quality of Experience in IPTV Services Using MPEG Frame Loss RateEvaluating Quality of Experience in IPTV Services Using MPEG Frame Loss Rate
Evaluating Quality of Experience in IPTV Services Using MPEG Frame Loss Rate
 
Defining ontologies for IP traffic measurements at MOI ISG
Defining ontologies for IP traffic measurements at MOI ISGDefining ontologies for IP traffic measurements at MOI ISG
Defining ontologies for IP traffic measurements at MOI ISG
 
Integración semántica de información de distintos repositorios de medidas de red
Integración semántica de información de distintos repositorios de medidas de redIntegración semántica de información de distintos repositorios de medidas de red
Integración semántica de información de distintos repositorios de medidas de red
 

Recently uploaded

The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
laozhuseo02
 
ER(Entity Relationship) Diagram for online shopping - TAE
ER(Entity Relationship) Diagram for online shopping - TAEER(Entity Relationship) Diagram for online shopping - TAE
ER(Entity Relationship) Diagram for online shopping - TAE
Himani415946
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
3ipehhoa
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
Gal Baras
 
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesMulti-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Sanjeev Rampal
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
3ipehhoa
 
Latest trends in computer networking.pptx
Latest trends in computer networking.pptxLatest trends in computer networking.pptx
Latest trends in computer networking.pptx
JungkooksNonexistent
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
Arif0071
 
1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
JeyaPerumal1
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
Rogerio Filho
 
Output determination SAP S4 HANA SAP SD CC
Output determination SAP S4 HANA SAP SD CCOutput determination SAP S4 HANA SAP SD CC
Output determination SAP S4 HANA SAP SD CC
ShahulHameed54211
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
nirahealhty
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
3ipehhoa
 
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptxLiving-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
TristanJasperRamos
 
BASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptxBASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptx
natyesu
 
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shopHistory+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
laozhuseo02
 

Recently uploaded (16)

The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
 
ER(Entity Relationship) Diagram for online shopping - TAE
ER(Entity Relationship) Diagram for online shopping - TAEER(Entity Relationship) Diagram for online shopping - TAE
ER(Entity Relationship) Diagram for online shopping - TAE
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
 
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesMulti-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
 
Latest trends in computer networking.pptx
Latest trends in computer networking.pptxLatest trends in computer networking.pptx
Latest trends in computer networking.pptx
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
 
1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
 
guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
 
Output determination SAP S4 HANA SAP SD CC
Output determination SAP S4 HANA SAP SD CCOutput determination SAP S4 HANA SAP SD CC
Output determination SAP S4 HANA SAP SD CC
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
 
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptxLiving-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
 
BASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptxBASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptx
 
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shopHistory+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
 

Merging heterogeneous network measurement data

  • 1. Use case: Merging heterogeneous network measurement data Jorge E. López de Vergara and Javier Aracil Jorge.lopez_vergara@uam.es Credits to Rubén García-Valcárcel, Iván González, Rafael Leira, Víctor Moreno, David Muelas, Javier Ramos, Paula Roquero, Carlos Vega and the rest of the HPCN-UAM team SMART Internet Monitoring Study 3rd Workshop, Barcelona, Spain, 22nd April 2015
  • 2. Contents •  Introduction •  Technologies –  High-speed traffic measurements –  Data integration alternatives –  Hadoop for network measurements –  Log processing •  Conclusions Merging heterogeneous network measurement data 2
  • 3. Introduction •  Network measurement data can come from different sources –  Network-oriented sources: •  SNMP MIB instances •  Netflow records •  Pcap files •  … –  Application-oriented sources: •  Logs –  Some standard (Apache web log) –  Some proprietary (application specific) •  Important to deal with encrypted traffic –  It is necessary to provide ways to merge them Merging heterogeneous network measurement data 3
  • 4. High-speed traffic measurements •  Requirements –  Capture at core networks –  +10 Gbps links, sometimes virtualized –  No packet drops •  Available off-the-shelf resources that help on this tough task –  Intel +10G network cards •  Intel DPDK •  Other developments: HPCAP at UAM –  Mellanox +10G network cards •  Mellanox Messaging Accelerator (VMA) –  Multicore processors •  CPU affinity and isolation for key tasks –  Lots of RAM memory •  Use of hugepages and mmap Merging heterogeneous network measurement data 4
  • 5. High-speed traffic measurements Merging heterogeneous network measurement data 5 HPCAP M3 OMON Traffic Sniffer NIC   Packet Ring Packet Buffer Packet   Dumper Flow   Manager Flow   Exporter Packet-­‐level Traces Flow-­‐level Traces Agg.Stats Traces App.  1 App.  2 Flow Table READ (on  demand) WRITE NON-­‐VOLATILE   MEMORY VOLATILE   MEMORY KERNEL-­‐LEVEL PROCESSING   MODULE Packet Arrival USER-­‐LEVEL PROCESSING   MODULE App.  N M3 OMON API packet access MRTG access flow access READ (real-­‐time)
  • 6. Data integration alternatives •  SQL databases –  Pros: reliable, normalized schemas, consistent with defined constraints –  Cons: slow, need to use materialized views to go faster, creating materialized views is costly and sometimes it can’t be done concurrently à Not valid to deal with high-speed network measurements •  Plain files, no SQL –  Pros: much faster –  Cons: inconsistencies, lack of normalization à Necessary to deal with high-speed network measurements, but keeping in mind its limitations Merging heterogeneous network measurement data 6
  • 7. Hadoop for network measurements •  General scenario Merging heterogeneous network measurement data 7 Flow Process Flows PCAP PC PC Users’ network Internet MapReduce Jobs HTTPSDNS Hive Hadoop Time series Popular web pages Malfunctioning Equipment Most used PCs Predictions SerDe HQL HDFS Admin Queries Raw traffic Processed traffic Deployed on many nodes to work properly Otherdata sources(logs)
  • 8. Hadoop for network measurements •  DNS analysis Merging heterogeneous network measurement data 8 DNS Traffic Whois database Intelligent Capture Module RADIUS records Hadoop Long-Term Storage NoSQL Database Data Visualization
  • 9. Log processing •  Requirements (real scenario): –  Process 3M events per second (about 5 Gbps) •  Put together application logs and network flows –  Several disks in parallel are necessary to store the events at appropriate rate –  Fast access to time series and aggregated statistics à What operators demand –  Slower access to raw data à What IT analysts demand •  Elasticsearch and Kibana tools provide some support, but it is necessary to tune them Merging heterogeneous network measurement data 9
  • 10. Kibana UI Merging heterogeneous network measurement data 10
  • 11. Conclusions •  Need for different network measurement data sources –  Combine network and application data –  Necessary to find sources of problems •  Is the slowdown caused by the network, the server or the application? •  This question can only be answered if all information from different layers is provided and analyzed •  Huge amount of data à it is necessary to work with fast processing systems •  Availability of processing tools from the Cloud Computing community –  It is necessary to adapt them to the network measurment processes Merging heterogeneous network measurement data 11