Stsg17 speaker yousunjeong

Big Telco  
Real-Time Network Analytics
Yousun Jeong

Who am I?
§ Senior Software Engineer at SK Telecom, South Korea’s largest wireless communications
provider
§ Worked on commercial products (~ ’17)
- Big data solutions
- IaaS (OpenStack)
- PaaS (Cloud Foundry)
§ Mail to: jerryjung@apache.org

Table of Contents
§ Big Data in SK Telecom
§ History of SKT's big data
§ Overall Architecture
§ Use case: Real-Time Network Analytics

Big Data in SKT in a Nutshell
§ Data Size
- Currently collecting 100 TB/day
§ Big Data Management Infrastructure
- Hadoop cluster (1400+ nodes); migrated from MPP RDBMS
§ Overall Architecture
- Spark
- Druid
§ Real-Time Network Analytics
- Real-Time Processing
- Hadoop DW
- Big Data Discovery

History of SKT’s Big Data
§ 2013
- Batch Processing (Daily)
- Map-Reduce Programming
- Hadoop HDFS
§ 2014
- Batch Processing (Hourly, Daily)
- SQL on Hadoop
- Hive (UDF, UDAF)
§ 2015
- Real-time Processing (Near real-time)
- Hadoop DW
- Spark (Streaming, SQL)
§ Now
- Big Data OLAP cube
- Self Data Discovery
- Druid

Overall Architecture
§ Designed to handle both real-time and batch data processing, as well as high-level analysis,
using Spark and Druid as the core technologies
[Figure: overall architecture]
- Layers: Interface Layer, Batch Processing Layer (Hadoop EDW), Real-Time Layer (Real-Time analysis), Analytics Layer, Data Service Layer
- Components: Flume, Kafka, HDFS, Oozie (workflow), Spark (ETL), Spark Streaming, Spark SQL, Spark MLlib, Jupyter (R, Python), NoSQL, Elasticsearch, Druid (Mart), Metatron (BI), Legacy App
- Infrastructure: YARN (Unified Resource Manager), Kubernetes, H/W Accelerator (SSD, FPGA), Provisioning (PXEBoot/Chef)

Benefits of Spark
§ Spark helps us gain processing speed and implement various big data applications
quickly and easily
§ Why SKT uses Spark
- Support for Event Stream Processing
- Fast Data Queries in Real Time
- Improved Programmer Productivity
- Fast Batch Processing of Large Data Sets

Benchmark - SQL on Hadoop
§ Spark vs Hive
Table 1
Query ID   Spark   Hive (Tez)
Q01        47s     68s
Q02        16s     62s
Q03        47s     190s
Q04        61s     122s
Q05        62s     115s
Q06        50s     61s
Q07        72s     207s
Q08        107s    133s
Q09        133s    390s
Q10        57s     110s
Q11        191s    47s
Q12        59s     70s
Q13        25s     54s
Q14        50s     54s
Q15        56s     69s
Q16        40s     81s
Q17        143s    139s
Q18        147s    195s
Q19        60s     85s
Q20        81s     114s
Q21        228s    232s
Q22        21s     91s
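
As a rough reading of Table 1, the per-query ratio of Hive (Tez) time to Spark time can be computed directly from the listed values. The sketch below is illustrative only: the variable names are not from the original benchmark, and the totals are plain sums of the table rather than an official result.

```python
# Run times in seconds, copied from Table 1 (Q01..Q22).
spark = [47, 16, 47, 61, 62, 50, 72, 107, 133, 57, 191,
         59, 25, 50, 56, 40, 143, 147, 60, 81, 228, 21]
hive_tez = [68, 62, 190, 122, 115, 61, 207, 133, 390, 110, 47,
            70, 54, 54, 69, 81, 139, 195, 85, 114, 232, 91]

query_ids = [f"Q{i:02d}" for i in range(1, len(spark) + 1)]

# A ratio above 1 means Spark finished faster than Hive (Tez) on that query.
ratios = {q: h / s for q, s, h in zip(query_ids, spark, hive_tez)}

# Queries where Hive (Tez) was the faster engine.
hive_faster = [q for q, r in ratios.items() if r < 1]

total_spark = sum(spark)
total_hive = sum(hive_tez)

if __name__ == "__main__":
    for q in query_ids:
        print(f"{q}: Hive/Spark = {ratios[q]:.2f}")
    print("Hive (Tez) faster on:", hive_faster)
    print("Total:", total_spark, "s (Spark) vs", total_hive, "s (Hive)")
```

In this table Spark is faster on 20 of the 22 queries; Hive (Tez) is faster on Q11 and Q17.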

Benefits of Druid
§ Druid is a distributed in-memory OLAP data store. It features timestamp-based
sharding, columnar indexing and compression, and pre-aggregation of metrics
§ Why SKT uses Druid
- Sub-second processing capability
- Stores aggregated summary data for time-series data
- Separate processing engines (real-time and historical) support analytics at the same time
[Figure: Druid architecture. Streaming data → Realtime Nodes → (hand off data) → Deep Storage (HDFS/S3); batch data → indexing → data segments; Historical Nodes; Broker (queries); Coordinator; MetaData]
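
The idea of pre-aggregation combined with time-based sharding can be sketched in plain Python. This is not Druid's implementation; it only illustrates rolling raw events up to a coarser time granularity per dimension value, so that queries scan fewer rows. The event fields (`cell`, `bytes`) and the hourly granularity are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

def rollup(events, fmt="%Y-%m-%dT%H:00"):
    """Pre-aggregate events by (time bucket, dimension value).

    events: iterable of (timestamp: datetime, cell: str, nbytes: int).
    Returns {time_bucket: {cell: {"count": n, "bytes_sum": total}}},
    where each time bucket plays the role of one time-based shard.
    """
    shards = defaultdict(lambda: defaultdict(lambda: {"count": 0, "bytes_sum": 0}))
    for ts, cell, nbytes in events:
        bucket = ts.strftime(fmt)  # truncate the timestamp to the hour
        row = shards[bucket][cell]
        row["count"] += 1
        row["bytes_sum"] += nbytes
    # Convert nested defaultdicts to plain dicts for easy inspection.
    return {b: {c: dict(m) for c, m in cells.items()} for b, cells in shards.items()}

# Hypothetical raw events: four rows collapse into three summary rows.
raw_events = [
    (datetime(2017, 1, 3, 10, 5), "cell-A", 100),
    (datetime(2017, 1, 3, 10, 40), "cell-A", 250),
    (datetime(2017, 1, 3, 10, 59), "cell-B", 70),
    (datetime(2017, 1, 3, 11, 1), "cell-A", 30),
]

summary = rollup(raw_events)
```

A query such as "total bytes per cell per hour" can then be answered from `summary` alone, without touching the raw events.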

Druid vs Spark Performance Comparison
§ Druid and Spark produce different results because the two engines differ in nature.
§ Druid vs Spark
- Druid converts data into OLAP-optimized, pre-aggregated, indexed, columnar structures
- Druid has a separate ingestion overhead
- Druid is excellent in terms of memory and disk I/O compared to Spark
- Spark is able to process all TPC-H queries
https://github.com/jaehc/tpch-spark/tree/feature-run-multiple-queries 
http://druid.io/blog/2014/03/17/benchmarking-druid.html

Druid vs Spark Performance Comparison
§ SUM_ALL_YEAR
    SELECT YEAR(L_SHIPDATE),
           SUM(L_EXTENDEDPRICE),
           SUM(L_DISCOUNT),
           SUM(L_TAX),
           SUM(L_QUANTITY)
    FROM LINEITEM
    GROUP BY YEAR(L_SHIPDATE)
§ TOP_100_PARTS_DETAILS
    SELECT L_PARTKEY,
           SUM(L_QUANTITY),
           SUM(L_EXTENDEDPRICE),
           MIN(L_DISCOUNT),
           MAX(L_DISCOUNT)
    FROM LINEITEM
    GROUP BY L_PARTKEY
    ORDER BY SUM(L_QUANTITY) DESC
    LIMIT 100
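
For reference, here is a hedged sketch of how the two SQL statements might be expressed as Druid native queries (`timeseries` and `topN`), built as Python dictionaries and serialized to JSON. The data source name, the interval, and the lower-case column and output names are assumptions for illustration; the benchmark's actual query definitions may differ.

```python
import json

DATASOURCE = "tpch_lineitem"        # assumed data source name
INTERVAL = "1992-01-01/1999-01-01"  # assumed interval covering the data

def double_agg(kind, column):
    """Build a Druid aggregator spec: doubleSum, doubleMin, or doubleMax."""
    return {
        "type": f"double{kind}",
        "name": f"{kind.lower()}_{column}",
        "fieldName": column,
    }

# SUM_ALL_YEAR: yearly granularity stands in for GROUP BY YEAR(L_SHIPDATE),
# assuming L_SHIPDATE was ingested as the data source's timestamp column.
sum_all_year = {
    "queryType": "timeseries",
    "dataSource": DATASOURCE,
    "granularity": "year",
    "intervals": [INTERVAL],
    "aggregations": [
        double_agg("Sum", column)
        for column in ("l_extendedprice", "l_discount", "l_tax", "l_quantity")
    ],
}

# TOP_100_PARTS_DETAILS: topN over l_partkey, ranked by summed quantity.
top_100_parts_details = {
    "queryType": "topN",
    "dataSource": DATASOURCE,
    "dimension": "l_partkey",
    "metric": "sum_l_quantity",
    "threshold": 100,
    "granularity": "all",
    "intervals": [INTERVAL],
    "aggregations": [
        double_agg("Sum", "l_quantity"),
        double_agg("Sum", "l_extendedprice"),
        double_agg("Min", "l_discount"),
        double_agg("Max", "l_discount"),
    ],
}

if __name__ == "__main__":
    print(json.dumps(sum_all_year, indent=2))
    print(json.dumps(top_100_parts_details, indent=2))
```

Note that Druid's `topN` is an approximate algorithm, so its ranking can differ slightly from an exact `GROUP BY ... ORDER BY ... LIMIT` in Spark.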

Use cases : Summary
§ Real-Time Network Analytics
- TANGO-D: TANGO (T Advanced Next Generation OSS) - D (Data warehouse). End-to-end network quality assurance and timely fault analysis
- APOLLO (Analytics PlatfOrm for inteLLigent Operation). Real-time analysis of the radio access network to improve operation efficiency
§ Metatron Discovery
- Metatron: big data discovery and analytics solution developed by SKT
- Interactive analysis for network engineers, operators, and data scientists

Use Case 1: Apollo Real-Time Analytics
§ APOLLO aims to improve mobile user experience, reduce operation cost, and improve
operation efficiency by analyzing radio access networks
[Figure: APOLLO overview. Labels: Call Data, RF Signal, Customer/Service, Device Data, A/F/S; Real-Time Analytics Platform (Data Collecting, Analytics-based Control, OAM); Operator; Analytics Output (Root Cause Finding, Anomaly Detection, Optimization, Resource Monitoring, KPI Detection, Predictive Analysis, Service Analysis, Real-time Monitoring & Optimization, Engineering Optimization, Network Intelligence)]
* APOLLO : Analytics PlatfOrm for inteLLigent Operation

Use Case 1: Apollo Real-Time Analytics
§ APOLLO collects and analyzes raw data from base stations in real time to optimize the
service performance
§ Spark Streaming
- Processes raw data to obtain statistics  
every 10 seconds
- Automatically detects abnormality
§ Real-Time User/Service Level Optimization
- Predict traffic variation and base  
station performance
- Minimize degradation in base 
station and user performance
[Figure: Real-Time Analytics. Components: Base Station, Kafka, Spark Streaming (Data Parsing, Data Converting, Real-time Processing, RDD), Spark, Storage, Elasticsearch, Dashboard]
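
The 10-second statistics and automatic abnormality detection described on this slide can be illustrated without Spark. The following is a minimal pure-Python sketch, not APOLLO's actual logic: it groups samples into 10-second tumbling windows per base station and flags windows whose mean deviates from that station's earlier windows by more than a z-score threshold. The record layout, the metric, the station name, and the threshold of 3 are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev

WINDOW_SECONDS = 10  # matches the 10-second statistics interval

def window_means(samples):
    """samples: iterable of (epoch_seconds, station_id, value).
    Returns {station_id: [(window_start, mean_value), ...]} in time order."""
    buckets = defaultdict(list)
    for ts, station, value in samples:
        start = int(ts // WINDOW_SECONDS) * WINDOW_SECONDS
        buckets[(station, start)].append(value)
    result = defaultdict(list)
    for (station, start), values in sorted(buckets.items()):
        result[station].append((start, mean(values)))
    return dict(result)

def detect_anomalies(series, threshold=3.0, min_history=3):
    """Flag windows whose mean is more than `threshold` standard deviations
    away from the mean of all earlier windows in the same series."""
    anomalies = []
    for i, (start, value) in enumerate(series):
        history = [v for _, v in series[:i]]
        if len(history) < min_history:
            continue  # not enough earlier windows to judge
        mu, sigma = mean(history), pstdev(history)
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            anomalies.append(start)
    return anomalies

# Hypothetical samples for one base station; the last window spikes.
samples = [
    (0, "bs-1", 100), (5, "bs-1", 102),
    (10, "bs-1", 98), (15, "bs-1", 100),
    (20, "bs-1", 101), (25, "bs-1", 103),
    (30, "bs-1", 99), (35, "bs-1", 101),
    (40, "bs-1", 300), (45, "bs-1", 310),
]

means = window_means(samples)
anomalies = detect_anomalies(means["bs-1"])
```

In a Spark Streaming job the same per-window aggregation would typically be expressed with a 10-second batch or window interval, with the detection step applied to each batch's output.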

Use case 2: TANGO-D
§ TANGO-D is a Hadoop DW that can handle big telco data with scalability & cost efficiency
【 Legacy IT Infrastructure 】 “High-price, High-performance Proprietary IT Infrastructure System”
- Data: Structured Data; Scale-up Structure (Terabyte)
- H/W: High Performance H/W (MPP, Fabric Switch, etc.)
- S/W: Proprietary S/W (RDBMS, etc.)
【 SKT DW Infrastructure 】 “Hadoop S/W and Commodity H/W Based Cost-effective IT Infrastructure System”
- Data: Structured/Unstructured Data; Scale-out Structure (Petabyte, Exabyte)
- H/W: Commodity H/W (x86 Server)
- S/W: Hadoop Architecture; SQL on Hadoop
Other labels in the figure: Transaction/Batch Processing (SQL), Hadoop File System
※ MPP: Massively Parallel Processing, SAN: Storage Area Network, NAS: Network Attached Storage, RDBMS: Relational DB Management System

Use case 2: TANGO-D
§ Data scientists need a unified platform that collects data from all network equipment for
management and analysis purposes
§ Expected advantages
- Unification of 130+ legacy DBMSs, each of which managed a separate network monitoring system,
enabling thorough analysis over the entire network
- Quick and accurate identification of root causes of network failures
[Figure: AS-WAS vs AS-IS]
- [ AS-WAS ] Siloed Data & IT Management: a separate DBMS per NMS (NMS#1 … NMS#N-1) across Access NW, Core NW, and Transport
- [ AS-IS ] Network Enterprise DW. Labels: Legacy NMS #1 … #N, NEW NMS#1 … NEW NMS#N, Hadoop DW, Legacy DW, BI & Analytic…

Use case 2: TANGO-D
§ TANGO-D is a Hadoop-based data warehouse built on Spark for various network statistics
or raw data
§ User Benefits
- End-to-End quality assurance, 
Fault analysis
- Reduces analysis lead time 
(days → minutes)
- Saves TCO (1/5 less than legacy DW)
§ Hadoop DW
- Spark-SQL functions and query optimizer
- Bulk loading and timely processing
of large data
(processing 2,500 tables per hour)
[Figure: TANGO-D data flow. Labels: Access, Core, Transport, IP, EMS, T-Pani, ETL, ODS, Hadoop DW (DW Data, Data Mart), SQL on Hadoop (Spark SQL), MQE (Meta Query Engine), Analytics, SQL, BI]

Use case 3: Metatron Discovery
§ We developed the Metatron Discovery solution for quick and easy data analysis and
applied it to our in-house big data system
§ Challenges
- Complex to analyze: separate software is needed for each step of data discovery
- Too slow for big data: real-time analysis is not supported
- Lack of analytics functions and visualization charts for telecom analysis
§ Key Features
- Intuitive Analysis: makes it easy to analyze big data with end-to-end functionality, from data preparation to analysis charts
- Big OLAP Cube: minimizes ETL cost, speeds up analysis, and supports schema changes by combining various large-capacity dimension data into a single Big Mart
- Sub-second Processing: by moving data across in-memory, local storage, and deep storage over time, it can respond quickly even to data volumes over a terabyte
- Advanced Analytics: provides analysis functions in conjunction with Jupyter, plus fast time-series forecasting and clustering with embedded analytics
§ Architecture
- Application (Visualization, Data Preparation, Workbench)
- Analysis & Analytics tools (Jupyter, Prediction, Clustering)
- Data Processing Engine (OLAP Engine)
- Big Data Storage, File system

§ Metatron Discovery enables E2E analysis to be performed on a unified analytics platform
§ User Benefits
- Operational BI for network engineers and operators
- Works with Jupyter to perform advanced analysis
- Easy drill-down search through a drag-and-drop interface
[Figure: Metatron Discovery on TANGO-D. Users: Executive Officer, Network Operator, Field Engineer, Biz. Partner. Functions: Operational BI, Advanced Analytics, Data Discovery. TANGO-D: N/W Data Repository, Analytics Platform, E2E Inventory; Access, Transport, Core/ICT. Work areas: Planning and Investment Strategy, Engineering, Construction, Operation, Work & TT Management, Network Monitoring]

§ Metatron's core engine is Druid, which answers queries quickly by caching results per time granularity
[Figure: Druid query process]
- Druid Cluster: Broker, Historical Nodes (segments in memory and on disk), Coordinator Nodes, Zookeeper, segment metadata, cache entries
- Hadoop Cluster (DW) / TANGO-D (Hadoop DW): HDFS (Deep Storage), metastore, Oozie
- Example: querying 2017-01-03 ~ 2017-01-08
- Cache (Broker Nodes): result segment 2017-01-03/2017-01-04, result segment 2017-01-07/2017-01-08
- Querying (not in cache), Historical Node: segment 2017-01-04/2017-01-05, segment 2017-01-07/2017-01-08
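
The caching behavior in the figure can be sketched as follows. This is a simplified illustration under stated assumptions, not Druid's code: the query interval is split into daily segments, results already in the broker cache are reused, and only the missing segments are requested from historical nodes. The function names and the `fetch_from_historical` callback are hypothetical, and the end date is treated as exclusive.

```python
from datetime import date, timedelta

def daily_segments(start, end):
    """Split [start, end) into day-long (seg_start, seg_end) segments."""
    days = (end - start).days
    return [(start + timedelta(d), start + timedelta(d + 1)) for d in range(days)]

def run_query(start, end, cache, fetch_from_historical):
    """Answer a query segment by segment, using cached results where available.

    cache: dict mapping (seg_start, seg_end) -> result; updated in place.
    fetch_from_historical: hypothetical callback computing one segment's result.
    Returns (results_in_time_order, segments_fetched_from_historical).
    """
    results, fetched = [], []
    for seg in daily_segments(start, end):
        if seg not in cache:
            # Cache miss: ask a historical node, then remember the result.
            cache[seg] = fetch_from_historical(seg)
            fetched.append(seg)
        results.append(cache[seg])
    return results, fetched

# Cache state taken from the figure: two daily results are already cached.
cache = {
    (date(2017, 1, 3), date(2017, 1, 4)): "cached 01-03",
    (date(2017, 1, 7), date(2017, 1, 8)): "cached 01-07",
}

results, fetched = run_query(
    date(2017, 1, 3), date(2017, 1, 8), cache,
    fetch_from_historical=lambda seg: f"computed {seg[0]:%m-%d}",
)
```

Here only the three uncached days are computed; a repeat of the same query would be served entirely from the cache.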

§ Metatron Discovery consists of three parts (Workspace, Workbench, Jupyter), giving users a
choice of analysis environments.
§ Workspace
- General network engineers and operators
§ Workbench
- Advanced analysts
§ Jupyter
- Statistical analysts
[Figure: Metatron Discovery deployment]
- Metatron Discovery: Workspace (Fixed Report, Dynamic Report), Workbench (Data Analytics, SQL), Jupyter (R/Python, ad-hoc data analytics), Direct Query
- TANGO-D (Hadoop DW Cluster): Oozie, Spark SQL Thrift Server, SparkSQL, YARN, HDFS, DW/Mart data (batch)
- Druid Cluster: Broker Nodes, Historical Nodes, Real-Time Nodes, Coordinator Nodes, Zookeeper, Deep Storage
- Special region synchronization (Sqoop)

Containerized Environment of Analytics(Ongoing)
§ The analysis environment can be deployed as Docker containers, configured per individual
analysis environment, with container resources managed as needed using Kubernetes and
GlusterFS
[Figure: container environment. Labels: K8S Master (x2), K8S Node#1 … K8S Node#N, Nginx, GlusterFS (private, shared), Docker Registry, [Container], [Provisioning], Admin, User]

Self-Data Preparation(Ongoing)
§ Data preparation makes it easy for anyone to perform the tedious and repetitive ETL tasks of
preprocessing data for visualization and analysis

Self-Data Analytics(Ongoing)
§ Data analysts can interact with Metatron Discovery to run analytics and create REST APIs
directly from Jupyter

Metatron
§ If you have any questions, please visit https://metatron.sktelecom.com/
