Big Telco Real-Time Network Analytics
Yousun Jeong
Who am I
• Senior Software Engineer at SK Telecom, South Korea’s largest wireless communications provider
• Worked on commercial products (~ ’15)
- Hadoop DW
- IaaS (OpenStack)
- PaaS (Cloud Foundry)
• Mail to: jerryjung@sk.com
Table of Contents
1. Big Data in SK Telecom
2. Benefit of Spark
3. Spark Real Workload: Real-Time Network Analytics
4. Ongoing R&D
Big Data in SKT in a Nutshell
✓ Data Size
- Currently collecting 250 TB/day
✓ Big Data Management Infrastructure
- Hadoop cluster (1400+ nodes); migrated from MPP RDBMS
✓ Use cases
- Real-Time Analytics of Base Stations
- Network Enterprise DW
✓ Ongoing R&D
- SKT Hadoop DW Appliance with H/W acceleration
Operating a Hadoop cluster of over 1400 nodes (30 PB+)
SKT Hadoop Infrastructure
• Optimized configuration
• Fault-tolerant and effective resource management system
[ Diagram: SKT Hadoop Infra ]
• Data Collector: data collection & pre-processing
• Clusters: Main Cluster, Service Cluster, R&D Cluster
• Labels on the clusters: Analysis, Service Logic, Repository
• Sizes shown: ~250 TB/day (500+ node), 400+ node, 100+ node, Service Cluster (400+ node)
• Workloads: Marketing, NW Analytics, VoC
• Flows: Data Feeding (twice), Develop., Commercialize
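To put the two headline figures (250 TB/day collected, 30 PB+ of cluster capacity) in perspective, here is a rough back-of-the-envelope sketch. It assumes decimal units and ignores replication, compression, and derived data, none of which the slides specify, so the results are only orders of magnitude.

```python
# Back-of-the-envelope figures derived from two numbers on the slides.
# Assumption: decimal units (1 TB = 10**12 bytes); the slides do not say.
TB = 10**12
PB = 10**15
SECONDS_PER_DAY = 24 * 60 * 60

daily_ingest_bytes = 250 * TB      # "Currently collecting 250 TB/day"
cluster_capacity_bytes = 30 * PB   # "1400 nodes (30 PB+)"

# Average sustained ingest rate, in GB/s.
ingest_gb_per_s = daily_ingest_bytes / SECONDS_PER_DAY / 10**9

# Days of raw input the stated capacity could hold, ignoring replication,
# compression, and derived data marts (all unstated in the slides).
retention_days = cluster_capacity_bytes / daily_ingest_bytes

print(f"average ingest: {ingest_gb_per_s:.2f} GB/s")
print(f"raw retention:  {retention_days:.0f} days")
```

With HDFS-style replication the usable retention would be correspondingly shorter; the replication factor in use is not stated in the deck.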
[ Diagram: SKT Big Data Reference Architecture ]
• Interface Layer: Flume, Kafka
• Batch Layer: HDFS (Data Mart), Oozie (workflow), Hive (ETL), Spark (ETL)
• Analytics Layer: Spark SQL, Spark MLlib, Spark GraphX, SparkR
• Real-Time Layer: NoSQL, Elasticsearch, HDFS
• Data Service Layer: BI, Legacy App
• YARN (Unified Resource Manager)
• Other labels: Spark Streaming, H/W Accelerator (SSD, FPGA), Cluster Manager, Ambari
【 Components 】
1. Batch Processing Layer: Hadoop EDW
2. Real-Time Processing Layer: Real-Time Analysis
3. Analytics Layer
SKT Big Data Reference Architecture
Designed to handle both real-time and batch data processing, as well as high-level
analysis, using Spark as the core technology
Benefit of Spark
Why SKT uses Spark …
Spark gives us gains in processing speed and lets us implement a variety of big
data applications quickly and easily
▪ Support for Event Stream Processing
▪ Fast Data Queries in Real Time
▪ Improved Programmer Productivity
▪ Fast Batch Processing of Large Data Sets
Use cases: Summary
Network analytics
• Network Enterprise DW: end-to-end network quality assurance and fault analysis in a timely manner
• APOLLO: real-time analysis of the radio access network to improve operational efficiency
Use case 1: Requirements & Challenges
Requirements
• Timely Processing
• Quick Response
Parallelism
• Executors
• Partitions
• Using Akka
[ Diagram: real-time analytics pipeline, stages numbered 1 to 6 ]
• Ingest: DC Parser, Data Collector (Flume), Kafka Producer, Kafka Broker, Kafka Topic
• Processing: Spark Streaming with Kafka Direct Stream, 1-minute window, 10 s
• Storage: HDFS, ES
• Serving: Real-Time Dashboard (10 s); Spark SQL for BI analysis over JDBC/ODBC; Spark MLlib
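The pipeline diagram labels the streaming stage with a 1-minute window and a 10 s interval. The following is a minimal plain-Python sketch of what such a sliding-window aggregation computes. It is not Spark code and not the authors' implementation; it assumes the "10 s" label is the slide interval, and the event layout `(timestamp_s, key, value)` and the `cell-A`/`cell-B` keys are hypothetical.

```python
from collections import defaultdict

WINDOW_S = 60  # "1 minute window" from the slide
SLIDE_S = 10   # assumption: the "10 s" label is the slide interval


def sliding_window_avg(events, window_s=WINDOW_S, slide_s=SLIDE_S):
    """Yield (window_end, {key: average}) once per slide tick.

    events: iterable of (timestamp_s, key, value), in any order.
    A window ending at t covers timestamps in [t - window_s, t).
    """
    events = sorted(events)
    if not events:
        return
    first_ts, last_ts = events[0][0], events[-1][0]
    # First tick is the first slide boundary strictly after the first event.
    tick = (first_ts // slide_s + 1) * slide_s
    # Keep ticking while the window can still contain the last event.
    while tick - window_s <= last_ts:
        sums, counts = defaultdict(float), defaultdict(int)
        for ts, key, value in events:
            if tick - window_s <= ts < tick:
                sums[key] += value
                counts[key] += 1
        if counts:
            yield tick, {k: sums[k] / counts[k] for k in counts}
        tick += slide_s


# Hypothetical per-cell measurements: (timestamp_s, cell id, metric value).
sample = [(0, "cell-A", 10.0), (5, "cell-A", 20.0),
          (12, "cell-B", 5.0), (65, "cell-A", 40.0)]
results = list(sliding_window_avg(sample))
for window_end, averages in results:
    print(window_end, averages)
```

Each event contributes to six consecutive windows (60 s / 10 s), which is why the dashboard can refresh every 10 s while still showing a one-minute aggregate. A real streaming engine maintains the window incrementally rather than rescanning all events as this sketch does.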
Use case 2: Hadoop based Enterprise DW
Hadoop DW can handle telco big data with scalability & cost efficiency
【 Legacy IT Infrastructure 】
“High-price, High-performance Proprietary IT Infrastructure System”
• Data: Structured Data; Scale-up Structure (Terabyte)
• H/W: High Performance H/W (MPP, Fabric Switch, etc.)
• S/W: Proprietary S/W (RDBMS, etc.)
【 SKT DW Infrastructure 】
“Hadoop S/W and Commodity H/W Based Cost-effective IT Infrastructure System”
• Data: Structured/Un-structured Data; Scale-out Structure (Petabyte, Exabyte)
• H/W: Commodity H/W (x86 Server)
• S/W: Hadoop Architecture, SQL on Hadoop
Also shown: Transaction/Batch Processing (SQL); Hadoop File System
※ MPP: Massively Parallel Processing, SAN: Storage Area Network, NAS: Network Attached Storage, RDBMS: Relational DB Management System
Use case 2: Network Enterprise DW
[ Current ] Siloed Data & IT Management
• Separate NMSs (NMS #1 … NMS #N-1), each with its own DBMS
• Domains: Access NW, Core NW, Transport
Expected advantages
• Unification of 130+ legacy DBMSs, each of which managed a separate network
monitoring system, enabling thorough analysis across the entire network
• Quick and accurate identification of root causes of network failure
Data scientists need a unified platform that collects data from all network equipment
for management and analysis
[ Goal (4Q, 2015) ] Network Enterprise DW
• Sources: legacy NMS #1 … #N and new NMS #1 … #N
• Stores: Hadoop DW, legacy DW
• Consumers: BI & Analytics
Network EDW is a Hadoop-based data warehouse built on Spark for various
network statistics or raw data
User Benefits
• End-to-end quality assurance, fault analysis
• Reduces analysis lead time (days → minutes)
• Saves TCO (1/5 less than legacy DW)
Hadoop DW
• Spark SQL functions and query optimizer
• Bulk loading and timely processing of large data
• SSD caching applied for performance enhancement
[ Diagram: Network Enterprise DW ]
• Sources: EMSs for the Access, Core, Transport, and IP domains; T-Pani
• Loading: ETL stages and an ODS
• Hadoop DW: DW Data, Data Mart, SQL on Hadoop (Spark SQL), H/W Accelerator with SSD Caching
• Access: Analytics SQL and BI through MQE* (Meta Query Engine)
* MQE (Meta Query Engine): integrated querying across heterogeneous databases, including Hadoop.
Use case 2: Network Enterprise DW
https://github.com/bitnine-oss/octopus
Use case 2: Meta Query Engine
Features
1. Subset of ANSI SQL
2. Queries on multiple databases, including Spark SQL and Oracle
3. SQL-based authorization
4. User authentication
5. Unified schema view
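To illustrate what "queries on multiple databases" under a single schema view means in practice, here is a small analogy using only Python's standard `sqlite3` module. It does not use Octopus or its API: two attached SQLite databases are hypothetical stand-ins for heterogeneous backends such as Spark SQL and Oracle, and the table and column names are invented for the example.

```python
import sqlite3

# Two independent SQLite databases stand in for two different backends.
# Assumption: this is only an analogy; a real meta query engine talks to
# different database products, not to two SQLite files.
conn = sqlite3.connect(":memory:")                    # backend 1, schema "main"
conn.execute("ATTACH DATABASE ':memory:' AS legacy")  # backend 2, schema "legacy"

# Hypothetical tables: per-cell statistics in one backend, cell metadata in the other.
conn.execute("CREATE TABLE main.cell_stats (cell_id TEXT, drop_rate REAL)")
conn.execute("CREATE TABLE legacy.cell_info (cell_id TEXT, region TEXT)")
conn.executemany("INSERT INTO main.cell_stats VALUES (?, ?)",
                 [("c1", 0.02), ("c2", 0.10)])
conn.executemany("INSERT INTO legacy.cell_info VALUES (?, ?)",
                 [("c1", "Seoul"), ("c2", "Busan")])

# One SQL statement joins across both backends through schema-qualified names,
# which is the user-facing idea behind a unified schema view.
rows = conn.execute(
    """
    SELECT s.cell_id, i.region, s.drop_rate
    FROM main.cell_stats AS s
    JOIN legacy.cell_info AS i ON i.cell_id = s.cell_id
    WHERE s.drop_rate > ?
    ORDER BY s.cell_id
    """,
    (0.05,),
).fetchall()
print(rows)
```

The point of the sketch is the single entry point: the client issues one query and does not need to know which backend holds which table.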
Use case 2: Requirements & Challenges
Requirements
• Timely Processing (ETL)
• Integrated BI Tools
• Quick Response
[ Diagram: NW EDW query path, steps numbered 1 to 4 ]
• Clients: WEB, BI
• Octopus: MQE #1, MDS #1, Meta Store
• HA Proxy in front of Thrift Server #1 and Thrift Server #2
• Spark SQL on YARN and HDFS; ETL with Spark
• NW EDW # 96
Use case 2: YARN(Dynamic Resource Allocation)
Configuration
spark.dynamicAllocation.enabled true
spark.shuffle.service.enabled true
spark.dynamicAllocation.minExecutors 50
spark.dynamicAllocation.maxExecutors 150
spark.dynamicAllocation.initialExecutors 50
spark.dynamicAllocation.cacheExecutorIdleTimeout 600
spark.dynamicAllocation.executorIdleTimeout 5
spark.dynamicAllocation.schedulerBacklogTimeout 5
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout 5
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
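The effect of these timeouts is easier to see with a small simulation. Spark's documentation describes the request policy as follows: executors are requested once tasks have been pending for `schedulerBacklogTimeout`, and again every `sustainedSchedulerBacklogTimeout` while the backlog persists, with the number requested per round growing exponentially (1, 2, 4, ...). The sketch below models only that ramp-up, assuming a backlog that never clears; it ignores executor removal and the cap Spark applies based on how many executors the pending tasks actually need. The function name is hypothetical.

```python
def ramp_up_schedule(initial=50, maximum=150,
                     backlog_timeout=5, sustained_timeout=5):
    """Return [(seconds_since_backlog_began, target_executors), ...].

    Simplified model of the documented request policy under a backlog that
    never clears: the first request fires after backlog_timeout seconds,
    later ones every sustained_timeout seconds, and the number of executors
    added doubles each round until maxExecutors is reached.
    """
    schedule = []
    target = initial
    elapsed = backlog_timeout
    to_add = 1
    while target < maximum:
        target = min(target + to_add, maximum)
        schedule.append((elapsed, target))
        to_add *= 2
        elapsed += sustained_timeout
    return schedule


# Values taken from the configuration on the slide.
schedule = ramp_up_schedule(initial=50, maximum=150,
                            backlog_timeout=5, sustained_timeout=5)
for elapsed, target in schedule:
    print(f"t={elapsed:>2}s  target executors={target}")
```

Under this model the application grows from 50 to 150 executors in about 35 seconds of sustained backlog. In the other direction, `executorIdleTimeout 5` releases idle executors after 5 seconds, while executors holding cached data are kept for 600 seconds.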
Use case 2: BI Integration
Configuration
spark.sql.thriftServer.incrementalCollect true
spark.driver.maxResultSize 10g
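These two settings matter when BI tools pull large result sets through the Thrift server. With `incrementalCollect` enabled, results are fetched partition by partition rather than collected to the driver in one piece, and `spark.driver.maxResultSize` bounds the total size of results the driver will accept. The plain-Python sketch below is a rough analogy of the memory difference, not Spark code; the partition layout and function names are hypothetical.

```python
def make_partitions(num_partitions=4, rows_per_partition=3):
    """Hypothetical result set: one loader callable per partition."""
    return [
        (lambda p=p: [(p, i) for i in range(rows_per_partition)])
        for p in range(num_partitions)
    ]


def serve_collect_all(partitions):
    """Materialize every partition first; peak buffer is the full result."""
    buffered = [row for load in partitions for row in load()]
    return len(buffered), len(buffered)  # (rows sent, peak rows buffered)


def serve_incrementally(partitions):
    """Hand rows to the client one partition at a time."""
    sent, peak = 0, 0
    for load in partitions:
        buffered = load()            # only this partition is held in memory
        peak = max(peak, len(buffered))
        sent += len(buffered)
    return sent, peak                # (rows sent, peak rows buffered)


all_at_once = serve_collect_all(make_partitions())
incremental = serve_incrementally(make_partitions())
print("collect all: ", all_at_once)
print("incremental: ", incremental)
```

Both paths deliver the same rows; the incremental path trades some latency for a peak buffer of one partition instead of the whole result.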
Use case 2: Patches
SPARK-7792 - HiveContext registerTempTable not thread safe
SPARK-7936 - Add configuration for initial size and limit of hash for aggregation
SPARK-8153 - Add configuration for disabling partial aggregation in runtime
SPARK-8285 - CombineSum should be calculated as unlimited decimal first
SPARK-8312 - Populate statistics info of hive tables if it's needed to be
SPARK-8333 - Spark failed to delete temp directory created by HiveContext
SPARK-8334 - Binary logical plan should provide more realistic statistics
SPARK-8357 - Memory leakage on unsafe aggregation path with empty input
SPARK-8420 - Inconsistent behavior with Dataframe Timestamp between 1.3.1 and 1.4.0
SPARK-8552 - Using incorrect database in multiple sessions
SPARK-8707 - RDD#toDebugString fails if any cached RDD has invalid partitions
SPARK-8826 - Fix ClassCastException in GeneratedAggregate
SPARK-9685 - Unspported dataType: char(X) in Hive
SPARK-10151 - Support invocation of hive macro
SPARK-10152 - Support Init script for hive-thriftserver
SPARK-10679 - javax.jdo.JDOFatalUserException in executor
SPARK-10684 - StructType.interpretedOrdering need not to be serialised
SPARK-10216 - Avoid creating empty files during overwrite into Hive table with group by query
Open Issues
Use case 2: Performance
TPC-H
Use case 2: Performance
Job Server
Hadoop DW Appliance (ongoing)
Develop a Hadoop DW appliance combining an optimized S/W layer and H/W
acceleration
【 SKT Hadoop DW Appliance 】
1. Core Software Solution
▪ Develop Interactive Spark SQL
▪ Develop Meta Query Engine
2. Hardware Acceleration
▪ Develop Flash Storage-based I/O Acceleration
▪ Develop FPGA-based CPU Acceleration
3. Management & Automation
▪ Develop Data & System Security
▪ Workload Optimization & Automation
4. Industry Oriented Solution
▪ Fault Detection & Classification in Manufacturing
▪ Mobile Network Data Analytic Solution
▪ Unstructured Data Collection/Processing Solution
[ Diagram: SKT Hadoop DW Appliance layers, numbered 1 to 4 ]
• Industry Oriented Solution: FDC, Telco, others
• DW Management Layer: Monitoring, DB Migration, Security, Optimization, Packaging
• Data Processing Layer: SQL Engine/Storage (Spark, Hive, Meta Query Engine, Legacy RDBMS); Hadoop Storage, DB Storage
• H/W Acceleration Layer: Flash-based I/O Accelerator, FPGA Accelerator
• Items marked * on the slide: Meta Query Engine, Spark, Flash-based I/O Accelerator, FPGA Accelerator
Thank You!

More Related Content

What's hot

Telco analytics at scale
Telco analytics at scaleTelco analytics at scale
Telco analytics at scaledatamantra
 
From Batch to Streaming ET(L) with Apache Apex
From Batch to Streaming ET(L) with Apache ApexFrom Batch to Streaming ET(L) with Apache Apex
From Batch to Streaming ET(L) with Apache ApexDataWorks Summit
 
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...Spark Summit
 
Real-Time Data Pipelines with Kafka, Spark, and Operational Databases
Real-Time Data Pipelines with Kafka, Spark, and Operational DatabasesReal-Time Data Pipelines with Kafka, Spark, and Operational Databases
Real-Time Data Pipelines with Kafka, Spark, and Operational DatabasesSingleStore
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks
 
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...Spark Summit
 
Realtime Analytical Query Processing and Predictive Model Building on High Di...
Realtime Analytical Query Processing and Predictive Model Building on High Di...Realtime Analytical Query Processing and Predictive Model Building on High Di...
Realtime Analytical Query Processing and Predictive Model Building on High Di...Spark Summit
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)Spark Summit
 
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...Data Con LA
 
End-to-End Data Pipelines with Apache Spark
End-to-End Data Pipelines with Apache SparkEnd-to-End Data Pipelines with Apache Spark
End-to-End Data Pipelines with Apache SparkBurak Yavuz
 
Powering Interactive BI Analytics with Presto and Delta Lake
Powering Interactive BI Analytics with Presto and Delta LakePowering Interactive BI Analytics with Presto and Delta Lake
Powering Interactive BI Analytics with Presto and Delta LakeDatabricks
 
Using Visualization to Succeed with Big Data
Using Visualization to Succeed with Big Data Using Visualization to Succeed with Big Data
Using Visualization to Succeed with Big Data Pactera_US
 
Large Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured StreamingLarge Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured StreamingDatabricks
 
Lambda architecture with Spark
Lambda architecture with SparkLambda architecture with Spark
Lambda architecture with SparkVincent GALOPIN
 
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksThe Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksData Con LA
 
Membase Meetup 2010
Membase Meetup 2010Membase Meetup 2010
Membase Meetup 2010Membase
 
ETL Made Easy with Azure Data Factory and Azure Databricks
ETL Made Easy with Azure Data Factory and Azure DatabricksETL Made Easy with Azure Data Factory and Azure Databricks
ETL Made Easy with Azure Data Factory and Azure DatabricksDatabricks
 
Building Data Pipelines with Spark and StreamSets
Building Data Pipelines with Spark and StreamSetsBuilding Data Pipelines with Spark and StreamSets
Building Data Pipelines with Spark and StreamSetsPat Patterson
 

What's hot (20)

Telco analytics at scale
Telco analytics at scaleTelco analytics at scale
Telco analytics at scale
 
From Batch to Streaming ET(L) with Apache Apex
From Batch to Streaming ET(L) with Apache ApexFrom Batch to Streaming ET(L) with Apache Apex
From Batch to Streaming ET(L) with Apache Apex
 
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
 
Real-Time Data Pipelines with Kafka, Spark, and Operational Databases
Real-Time Data Pipelines with Kafka, Spark, and Operational DatabasesReal-Time Data Pipelines with Kafka, Spark, and Operational Databases
Real-Time Data Pipelines with Kafka, Spark, and Operational Databases
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
Powering Predictive Mapping at Scale with Spark, Kafka, and Elastic Search: S...
 
Realtime Analytical Query Processing and Predictive Model Building on High Di...
Realtime Analytical Query Processing and Predictive Model Building on High Di...Realtime Analytical Query Processing and Predictive Model Building on High Di...
Realtime Analytical Query Processing and Predictive Model Building on High Di...
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
 
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
 
End-to-End Data Pipelines with Apache Spark
End-to-End Data Pipelines with Apache SparkEnd-to-End Data Pipelines with Apache Spark
End-to-End Data Pipelines with Apache Spark
 
Powering Interactive BI Analytics with Presto and Delta Lake
Powering Interactive BI Analytics with Presto and Delta LakePowering Interactive BI Analytics with Presto and Delta Lake
Powering Interactive BI Analytics with Presto and Delta Lake
 
What's new in SQL on Hadoop and Beyond
What's new in SQL on Hadoop and BeyondWhat's new in SQL on Hadoop and Beyond
What's new in SQL on Hadoop and Beyond
 
Using Visualization to Succeed with Big Data
Using Visualization to Succeed with Big Data Using Visualization to Succeed with Big Data
Using Visualization to Succeed with Big Data
 
Large Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured StreamingLarge Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured Streaming
 
Lambda architecture with Spark
Lambda architecture with SparkLambda architecture with Spark
Lambda architecture with Spark
 
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksThe Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
 
Membase Meetup 2010
Membase Meetup 2010Membase Meetup 2010
Membase Meetup 2010
 
Lambda-less Stream Processing @Scale in LinkedIn
Lambda-less Stream Processing @Scale in LinkedIn Lambda-less Stream Processing @Scale in LinkedIn
Lambda-less Stream Processing @Scale in LinkedIn
 
ETL Made Easy with Azure Data Factory and Azure Databricks
ETL Made Easy with Azure Data Factory and Azure DatabricksETL Made Easy with Azure Data Factory and Azure Databricks
ETL Made Easy with Azure Data Factory and Azure Databricks
 
Building Data Pipelines with Spark and StreamSets
Building Data Pipelines with Spark and StreamSetsBuilding Data Pipelines with Spark and StreamSets
Building Data Pipelines with Spark and StreamSets
 

Viewers also liked

Gruter TECHDAY 2014 Realtime Processing in Telco
Gruter TECHDAY 2014 Realtime Processing in TelcoGruter TECHDAY 2014 Realtime Processing in Telco
Gruter TECHDAY 2014 Realtime Processing in TelcoGruter
 
Telco Cloud - An evolution approach 2016
Telco Cloud - An evolution approach 2016Telco Cloud - An evolution approach 2016
Telco Cloud - An evolution approach 2016Fernando Herrera
 
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...DataStax Academy
 
The Modern Telco Network: Defining The Telco Cloud
The Modern Telco Network: Defining The Telco CloudThe Modern Telco Network: Defining The Telco Cloud
The Modern Telco Network: Defining The Telco CloudMarco Rodrigues
 
Predictive Analytics for IoT Network Capacity Planning: Spark Summit East tal...
Predictive Analytics for IoT Network Capacity Planning: Spark Summit East tal...Predictive Analytics for IoT Network Capacity Planning: Spark Summit East tal...
Predictive Analytics for IoT Network Capacity Planning: Spark Summit East tal...Spark Summit
 
Netflix-Using analytics to predict hits
Netflix-Using analytics to predict hitsNetflix-Using analytics to predict hits
Netflix-Using analytics to predict hitsGaurav Dutta
 
AWS re:Invent 2016: Introduction to Amazon CloudFront (CTD205)
AWS re:Invent 2016: Introduction to Amazon CloudFront (CTD205)AWS re:Invent 2016: Introduction to Amazon CloudFront (CTD205)
AWS re:Invent 2016: Introduction to Amazon CloudFront (CTD205)Amazon Web Services
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkDatabricks
 
Netflix - Enabling a Culture of Analytics
Netflix - Enabling a Culture of AnalyticsNetflix - Enabling a Culture of Analytics
Netflix - Enabling a Culture of AnalyticsBlake Irvine
 
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...Spark Summit
 
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan SharmaSparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan SharmaSpark Summit
 
IoT and the Autonomous Vehicle in the Clouds: Simultaneous Localization and M...
IoT and the Autonomous Vehicle in the Clouds: Simultaneous Localization and M...IoT and the Autonomous Vehicle in the Clouds: Simultaneous Localization and M...
IoT and the Autonomous Vehicle in the Clouds: Simultaneous Localization and M...Spark Summit
 
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectiveJustin Basilico
 
Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...
Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...
Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...Dawen Liang
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Spark Summit
 
AWS re:Invent 2016: Netflix: Using Amazon S3 as the fabric of our big data ec...
AWS re:Invent 2016: Netflix: Using Amazon S3 as the fabric of our big data ec...AWS re:Invent 2016: Netflix: Using Amazon S3 as the fabric of our big data ec...
AWS re:Invent 2016: Netflix: Using Amazon S3 as the fabric of our big data ec...Amazon Web Services
 
Graph Analytics in Spark
Graph Analytics in SparkGraph Analytics in Spark
Graph Analytics in SparkPaco Nathan
 
(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learning(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learningYves Raimond
 
Balancing Discovery and Continuation in Recommendations
Balancing Discovery and Continuation in RecommendationsBalancing Discovery and Continuation in Recommendations
Balancing Discovery and Continuation in RecommendationsMohammad Hossein Taghavi
 

Viewers also liked (20)

RCA-Mobile-Networks-Summary-20150421
RCA-Mobile-Networks-Summary-20150421RCA-Mobile-Networks-Summary-20150421
RCA-Mobile-Networks-Summary-20150421
 
Gruter TECHDAY 2014 Realtime Processing in Telco
Gruter TECHDAY 2014 Realtime Processing in TelcoGruter TECHDAY 2014 Realtime Processing in Telco
Gruter TECHDAY 2014 Realtime Processing in Telco
 
Telco Cloud - An evolution approach 2016
Telco Cloud - An evolution approach 2016Telco Cloud - An evolution approach 2016
Telco Cloud - An evolution approach 2016
 
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
 
The Modern Telco Network: Defining The Telco Cloud
The Modern Telco Network: Defining The Telco CloudThe Modern Telco Network: Defining The Telco Cloud
The Modern Telco Network: Defining The Telco Cloud
 
Predictive Analytics for IoT Network Capacity Planning: Spark Summit East tal...
Predictive Analytics for IoT Network Capacity Planning: Spark Summit East tal...Predictive Analytics for IoT Network Capacity Planning: Spark Summit East tal...
Predictive Analytics for IoT Network Capacity Planning: Spark Summit East tal...
 
Netflix-Using analytics to predict hits
Netflix-Using analytics to predict hitsNetflix-Using analytics to predict hits
Netflix-Using analytics to predict hits
 
AWS re:Invent 2016: Introduction to Amazon CloudFront (CTD205)
AWS re:Invent 2016: Introduction to Amazon CloudFront (CTD205)AWS re:Invent 2016: Introduction to Amazon CloudFront (CTD205)
AWS re:Invent 2016: Introduction to Amazon CloudFront (CTD205)
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache Spark
 
Netflix - Enabling a Culture of Analytics
Netflix - Enabling a Culture of AnalyticsNetflix - Enabling a Culture of Analytics
Netflix - Enabling a Culture of Analytics
 
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
 
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan SharmaSparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
 
IoT and the Autonomous Vehicle in the Clouds: Simultaneous Localization and M...
IoT and the Autonomous Vehicle in the Clouds: Simultaneous Localization and M...IoT and the Autonomous Vehicle in the Clouds: Simultaneous Localization and M...
IoT and the Autonomous Vehicle in the Clouds: Simultaneous Localization and M...
 
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry Perspective
 
Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...
Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...
Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
 
AWS re:Invent 2016: Netflix: Using Amazon S3 as the fabric of our big data ec...
AWS re:Invent 2016: Netflix: Using Amazon S3 as the fabric of our big data ec...AWS re:Invent 2016: Netflix: Using Amazon S3 as the fabric of our big data ec...
AWS re:Invent 2016: Netflix: Using Amazon S3 as the fabric of our big data ec...
 
Graph Analytics in Spark
Graph Analytics in SparkGraph Analytics in Spark
Graph Analytics in Spark
 
(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learning(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learning
 
Balancing Discovery and Continuation in Recommendations
Balancing Discovery and Continuation in RecommendationsBalancing Discovery and Continuation in Recommendations
Balancing Discovery and Continuation in Recommendations
 

Similar to Big Telco Real-Time Network Analytics

Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3Databricks
 
Sa introduction to big data pipelining with cassandra &amp; spark west mins...
Sa introduction to big data pipelining with cassandra &amp; spark   west mins...Sa introduction to big data pipelining with cassandra &amp; spark   west mins...
Sa introduction to big data pipelining with cassandra &amp; spark west mins...Simon Ambridge
 
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Databricks
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Djamel Zouaoui
 
In Memory Data Pipeline And Warehouse At Scale - BerlinBuzzwords 2015
In Memory Data Pipeline And Warehouse At Scale - BerlinBuzzwords 2015In Memory Data Pipeline And Warehouse At Scale - BerlinBuzzwords 2015
In Memory Data Pipeline And Warehouse At Scale - BerlinBuzzwords 2015Iulia Emanuela Iancuta
 
Spark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark Summit
 
The Future of Hadoop: A deeper look at Apache Spark
The Future of Hadoop: A deeper look at Apache SparkThe Future of Hadoop: A deeper look at Apache Spark
The Future of Hadoop: A deeper look at Apache SparkCloudera, Inc.
 
Apache Spark for RDBMS Practitioners: How I Learned to Stop Worrying and Lov...
 Apache Spark for RDBMS Practitioners: How I Learned to Stop Worrying and Lov... Apache Spark for RDBMS Practitioners: How I Learned to Stop Worrying and Lov...
Apache Spark for RDBMS Practitioners: How I Learned to Stop Worrying and Lov...Databricks
 
AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09Chris Purrington
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Big Telco Real-Time Network Analytics

  • 1. Big Telco Real-Time Network Analytics
 Yousun Jeong
  • 2. Who am I
 • Senior Software Engineer at SK Telecom, South Korea’s largest wireless communications provider
 • Worked on commercial products (~ ’15)
 - Hadoop DW
 - IaaS (OpenStack)
 - PaaS (Cloud Foundry)
 • Mail to: jerryjung@sk.com
  • 3. Table of Contents
 1. Big Data in SK Telecom
 2. Benefit of Spark
 3. Spark Real Workload: Real-Time Network Analytics
 4. Ongoing R&D
  • 4. Big Data in SKT in a Nutshell
 ✓ Data Size
 - Currently collecting 250 TB/day
 ✓ Big Data Management Infrastructure
 - Hadoop cluster (1400+ nodes); migrated from MPP RDBMS
 ✓ Use cases
 - Real-Time Analytics of Base Stations
 - Network Enterprise DW
 ✓ Ongoing R&D
 - SKT Hadoop DW Appliance with H/W acceleration
  • 5. SKT Hadoop Infrastructure
 Operating a Hadoop cluster of over 1400 nodes (30 PB+)
 • Optimized configuration
 • Fault-tolerant and effective resource management system
 [Diagram: SKT Hadoop Infra. Labels: Data Collector (data collection & pre-processing); ~250 TB/day; Main Cluster; Analysis; R&D Cluster; Service Cluster; Service Logic; Repository; node counts shown: 500+, 400+, 100+, 400+; Marketing; NW Analytics; VoC; Data Feeding; Commercialize; Develop.]
  • 6. SKT Big Data Reference Architecture
 Designed to handle both real-time and batch data processing and high-level analysis, using Spark as a core technology.
 Components
 • Batch Processing Layer: Hadoop EDW
 • Real-Time Processing Layer: Real-Time Analysis
 • Analytics Layer
 [Diagram labels: Interface Layer; Batch Layer; Real-Time Layer; Analytics Layer; Data Service Layer; Flume; Kafka; HDFS (Data Mart); Oozie (workflow); Hive (ETL); Spark (ETL); Spark Streaming; Spark SQL; Spark MLlib; Spark GraphX; SparkR; YARN (Unified Resource Manager); NoSQL; Elasticsearch; HDFS; BI; Legacy App; H/W Accelerator (SSD, FPGA); Cluster Manager; Ambari.]
  • 7. Benefit of Spark
 Spark helps us gain processing speed and implement various big data applications easily and quickly.
 Why SKT uses Spark:
 ▪ Support for Event Stream Processing
 ▪ Fast Data Queries in Real Time
 ▪ Improved Programmer Productivity
 ▪ Fast Batch Processing of Large Data Sets
  • 8. Use cases: Summary (Network Enterprise DW; APOLLO; Network analytics)
 • End-to-end network quality assurance and fault analysis in a timely manner
 • Real-time analysis of the radio access network to improve operational efficiency
  • 9. Use case 1: Requirements & Challenges
 Requirements: Timely Processing, Quick Response
 Parallelism
 • Executors
 • Partitions
 • Using Akka
 [Diagram labels, numbered 1-6 on the slide: DC Parser; Kafka Producer; Kafka Broker; Kafka Topic; Data Collector (Flume); Spark Streaming (Kafka Direct Stream, 1 minute window, 10 s); HDFS; ES (10 s); Real-Time Dashboard; Spark SQL; BI Analysis (JDBC/ODBC); Spark MLlib.]
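The use case 1 slide mentions a 1 minute window together with a 10 s figure. As a rough illustration of what such a sliding window computes, here is a minimal pure-Python sketch; it is not the Spark Streaming job from the talk. It assumes the 10 s figure is the slide interval, and the `(timestamp, key)` event shape, the `cell-*` keys, and the function name are all hypothetical.

```python
from collections import Counter

WINDOW_S = 60  # window length from the slide (1 minute)
SLIDE_S = 10   # assumed slide interval (the slide only says "10 s")


def sliding_counts(events, window_s=WINDOW_S, slide_s=SLIDE_S):
    """Count events per key in each sliding window.

    events: iterable of (integer_timestamp_seconds, key) pairs (hypothetical shape).
    Returns {window_end: Counter}; a window covers [window_end - window_s, window_end).
    """
    events = sorted(events)
    if not events:
        return {}
    # Window ends are multiples of the slide interval, from the first window
    # that contains the earliest event to the first one that contains the latest.
    first_end = (events[0][0] // slide_s + 1) * slide_s
    last_end = (events[-1][0] // slide_s + 1) * slide_s
    result = {}
    for end in range(first_end, last_end + slide_s, slide_s):
        start = end - window_s
        result[end] = Counter(key for ts, key in events if start <= ts < end)
    return result


if __name__ == "__main__":
    sample = [(0, "cell-a"), (5, "cell-a"), (12, "cell-b"), (65, "cell-a")]
    for end, counts in sliding_counts(sample).items():
        print(end, dict(counts))
```

Each event is counted in up to six consecutive results, which is why a window much longer than the slide interval multiplies the work per event; this sketch simply rescans the events for every window instead of reusing earlier results.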
  • 10. Use case 2: Hadoop based Enterprise DW
 Hadoop DW can handle telco big data with scalability & cost efficiency.
 【 Legacy IT Infrastructure 】 “High-price, High-performance Proprietary IT Infrastructure System”
 - Data: Structured Data; Scale-up Structure (Terabyte)
 - H/W: High Performance H/W (MPP, Fabric Switch, etc.)
 - S/W: Proprietary S/W (RDBMS, etc.)
 【 SKT DW Infrastructure 】 “Hadoop S/W and Commodity H/W Based Cost-effective IT Infrastructure System”
 - Data: Structured/Un-structured Data; Scale-out Structure (Petabyte, Exabyte)
 - H/W: Commodity H/W (x86 Server)
 - S/W: Hadoop Architecture; SQL on Hadoop
 [Other diagram labels: Transaction/Batch Processing (SQL); Hadoop File System.]
 ※ MPP: Massively Parallel Processing; SAN: Storage Area Network; NAS: Network Attached Storage; RDBMS: Relational DB Management System
  • 11. Use case 2: Network Enterprise DW
 Data scientists need a unified platform to collect data from all network equipment for management and analysis purposes.
 Expected advantages
 • Unification of 130+ legacy DBMSs, each of which managed a separate network monitoring system, enabling thorough analysis over the entire network
 • Quick and accurate identification of root causes of network failure
 [Diagram, Current: Siloed Data & IT Management. NMS #1 … NMS #N, each with its own DBMS; Access NW; Core NW; Transport.]
 [Diagram, Goal (4Q, 2015): Network Enterprise DW. Legacy NMS #1 … #N; new NMS #1 … #N; Hadoop DW; Legacy DW; BI & Analytics.]
  • 12. Use case 2: Network Enterprise DW
 Network EDW is a Hadoop-based data warehouse built on Spark for various network statistics and raw data.
 User Benefits
 • End-to-end quality assurance, fault analysis
 • Reduces analysis lead time (days → minutes)
 • Saves TCO (1/5 less than legacy DW)
 Hadoop DW
 • Spark SQL functions and query optimizer
 • Bulk-loading and timely processing of large data
 • SSD caching applied for performance enhancement
 [Diagram labels: Access; Core; Transport; IP; EMS; T-Pani; ETL; ODS; Hadoop DW; DW Data; Data Mart; SQL on Hadoop (Spark SQL); Analytics SQL; MQE*; H/W Accelerator (SSD Caching); BI.]
 * MQE (Meta Query Engine): integrated queries across heterogeneous databases, including Hadoop.
  • 13. Use case 2: Meta Query Engine
 https://github.com/bitnine-oss/octopus
 Features
 1. Subset of ANSI SQL
 2. Queries on multiple databases, including Spark SQL and Oracle
 3. SQL-based authorization
 4. User authentication
 5. Unified schema view
  • 14. Use case 2: Requirements & Challenges
 Requirements: Timely Processing (ETL), Integrated BI Tools, Quick Response
 [Diagram labels, numbered 1-4 on the slide: WEB; BI; MDS #1; MQE #1; HA Proxy; Thrift Server #1; Thrift Server #2; Spark SQL; HDFS; YARN; MDS; MQE; Meta Store; Octopus; NW EDW # 96; ETL; Spark.]
  • 15. Use case 2: YARN (Dynamic Resource Allocation)
 Configuration (Spark properties)
 spark.dynamicAllocation.enabled true
 spark.shuffle.service.enabled true
 spark.dynamicAllocation.minExecutors 50
 spark.dynamicAllocation.maxExecutors 150
 spark.dynamicAllocation.initialExecutors 50
 spark.dynamicAllocation.cachedExecutorIdleTimeout 600
 spark.dynamicAllocation.executorIdleTimeout 5
 spark.dynamicAllocation.schedulerBacklogTimeout 5
 spark.dynamicAllocation.sustainedSchedulerBacklogTimeout 5
 Configuration (yarn-site.xml)
 <property>
 <name>yarn.nodemanager.aux-services</name>
 <value>mapreduce_shuffle,spark_shuffle</value>
 </property>
 <property>
 <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
 <value>org.apache.spark.network.yarn.YarnShuffleService</value>
 </property>
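One way to carry the dynamic allocation settings from the slide into a job launch is to pass them as `--conf key=value` arguments to `spark-submit`. The sketch below is only an illustration under that assumption: the helper names are hypothetical, the values are copied from the slide, and the idle-timeout key is spelled `cachedExecutorIdleTimeout` as in the Spark configuration reference. The bounds check mirrors the expectation that the initial executor count lies between the minimum and the maximum.

```python
# Values copied from the slide; timeouts are plain numbers, read by Spark as seconds.
DYNAMIC_ALLOCATION = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.shuffle.service.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "50",
    "spark.dynamicAllocation.maxExecutors": "150",
    "spark.dynamicAllocation.initialExecutors": "50",
    "spark.dynamicAllocation.cachedExecutorIdleTimeout": "600",
    "spark.dynamicAllocation.executorIdleTimeout": "5",
    "spark.dynamicAllocation.schedulerBacklogTimeout": "5",
    "spark.dynamicAllocation.sustainedSchedulerBacklogTimeout": "5",
}


def check_executor_bounds(conf):
    """Raise ValueError unless minExecutors <= initialExecutors <= maxExecutors."""
    prefix = "spark.dynamicAllocation."
    lowest = int(conf[prefix + "minExecutors"])
    initial = int(conf[prefix + "initialExecutors"])
    highest = int(conf[prefix + "maxExecutors"])
    if not lowest <= initial <= highest:
        raise ValueError(
            f"expected min <= initial <= max, got {lowest}, {initial}, {highest}"
        )


def to_submit_args(conf):
    """Render a config dict as a flat list of spark-submit '--conf' arguments."""
    check_executor_bounds(conf)
    args = []
    for key, value in conf.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(DYNAMIC_ALLOCATION)))
```

The same pairs can equally live in `spark-defaults.conf`; rendering them per job just makes it easier to vary the executor bounds between workloads.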
  • 16. Use case 2: BI Integration
 Configuration
 spark.sql.thriftServer.incrementalCollect true
 spark.driver.maxResultSize 10g
  • 17. Use case 2: Patches
 SPARK-7792 - HiveContext registerTempTable not thread safe
 SPARK-7936 - Add configuration for initial size and limit of hash for aggregation
 SPARK-8153 - Add configuration for disabling partial aggregation in runtime
 SPARK-8285 - CombineSum should be calculated as unlimited decimal first
 SPARK-8312 - Populate statistics info of hive tables if it's needed to be
 SPARK-8333 - Spark failed to delete temp directory created by HiveContext
 SPARK-8334 - Binary logical plan should provide more realistic statistics
 SPARK-8357 - Memory leakage on unsafe aggregation path with empty input
 SPARK-8420 - Inconsistent behavior with Dataframe Timestamp between 1.3.1 and 1.4.0
 SPARK-8552 - Using incorrect database in multiple sessions
 SPARK-8707 - RDD#toDebugString fails if any cached RDD has invalid partitions
 SPARK-8826 - Fix ClassCastException in GeneratedAggregate
 SPARK-9685 - Unspported dataType: char(X) in Hive
 SPARK-10151 - Support invocation of hive macro
 SPARK-10152 - Support Init script for hive-thriftserver
 SPARK-10679 - javax.jdo.JDOFatalUserException in executor
 SPARK-10684 - StructType.interpretedOrdering need not to be serialised
 SPARK-10216 - Avoid creating empty files during overwrite into Hive table with group by query
 Open Issues
  • 18. Use case 2: Performance (TPC-H)
  • 19. Use case 2: Performance (Job Server)
  • 20. Hadoop DW Appliance (ongoing)
 Develop a Hadoop DW appliance combining an optimized S/W layer and H/W acceleration.
 【 SKT Hadoop DW Appliance 】
 1. Core Software Solution
 ▪ Develop Interactive Spark SQL
 ▪ Develop Meta Query Engine
 2. Hardware Acceleration
 ▪ Develop Flash Storage-based I/O Acceleration
 ▪ Develop FPGA-based CPU Acceleration
 3. Management & Automation
 ▪ Develop Data & System Security
 ▪ Workload Optimization & Automation
 4. Industry Oriented Solution
 ▪ Fault Detection & Classification in Manufacturing
 ▪ Mobile Network Data Analytic Solution
 ▪ Unstructured Data Collection/Processing Solution
 [Diagram labels: Industry Oriented Solution (FDC, Telco, others); DW Management Layer (Monitoring, DB Migration, Security, Optimization, Packaging); Data Processing Layer (Meta Query Engine; SQL Engine/Storage: SPARK, HIVE, Legacy RDBMS; Hadoop Storage; DB Storage); H/W Acceleration Layer (Flash based I/O Accelerator, FPGA Accelerator).]