Stream processing on mobile networks

Apache Flink in action: stream processing of mobile networks
Future of Data: Real Time Stream Processing with Apache Flink

Who we are
We are a company that processes, stores, distributes and analyses data.
We combine advanced technology with expert services to deliver value
for our customers.
Our main focus is on big data technologies such as Hadoop, Kafka, NiFi
and Flink.
Web: http://triviadata.com/

What we're going to talk about
• Why mobile network operators need stream processing
• Architecture
• Business Challenges
• Operating Flink in a Hadoop environment
• Stream processing challenges in our use case

Network architecture
Credits: https://www.gl.com/images/gsm-gprs-umts-sigtran-protocol-analyzer-over-tdm-ip-ps-web.gif
data sources (probes, devices, ...)

xDR streaming conversion
Radio access: 2G (BTS), 3G (NodeB), 4G (eNodeB)
[Figure: binary input stream converted into event records]
Events - VOICE, SMS, DATA
• Date; Time; Event Type; MSISDN; VPN; IMEI; Duration; Locality; Performance; Closing Time; Relation; NULL; ...
• Date; Time; App; PortApp; IPCust; IPDest; SrcPort; DstPort; Start; Stop; Duration; ByteUp; ByteDn; nPacketUp; ...
• Date; Time; Event Type; MSISDN; VPN; Duration; Locality; Performance; Closing Time; Relation; NULL; ...
• Date; Time; Event Type; Customer APN; Network; Locality; Performance; Closing Time; Relation; Delay_Ans; ServiceProvider; CDNProvider; Domain/Host; nBigPacket; VLAN; SessionID
• Date; Time; MCC; MSISDN; Network; Locality; IMSI; IMEI; Performance; Closing Time; Relation
• Date; Time; Event Type; MSISDN; Length; Locality; Performance; Closing Time; Relation; NULL; ...
Data conversion
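
The record layouts above are plain semicolon-delimited lines, so the first step after conversion is usually mapping them onto a typed record. A minimal sketch, assuming the field order of the first layout on the slide; the `VoiceEvent` type, the `XdrParser` object and the idea that Duration is a whole number are illustrative assumptions, not taken from the talk:

```scala
import scala.util.Try

// Hypothetical record type: only the leading fields named on the slide are modelled.
// Date and time stay as strings because the slide does not state their format.
final case class VoiceEvent(
    date: String,
    time: String,
    eventType: String,
    msisdn: String,
    vpn: String,
    imei: String,
    duration: Long) // assumed to be a whole number

object XdrParser {
  // Splits one semicolon-delimited line and maps the first seven fields.
  // Returns Left with a reason instead of throwing on malformed input.
  def parseVoiceEvent(line: String): Either[String, VoiceEvent] = {
    // limit -1 keeps trailing empty fields instead of dropping them
    val fields = line.split(";", -1).map(_.trim)
    if (fields.length < 7)
      Left(s"expected at least 7 fields, got ${fields.length}")
    else
      Try(fields(6).toLong).toOption match {
        case Some(duration) =>
          Right(VoiceEvent(fields(0), fields(1), fields(2), fields(3), fields(4), fields(5), duration))
        case None =>
          Left(s"duration is not a number: '${fields(6)}'")
      }
  }
}
```

Returning `Either` rather than throwing keeps a single bad record from failing the whole job; rejected lines can be routed to a separate output for inspection.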

Mobile operator’s data
Client’s transactions:
• SMS – simplest transaction (mostly a few records)
• Data – length of session = number of records
• Calls – most complex joining of records
Operator's data:
• Network usage
• Billing events

Typical use cases in telco
Customer oriented
• fraud & security
• Customer Experience Management
• triggers alarms based on customer-related quality indicators
• CEM KPI
• Fast issue diagnosis & Customer support
• reduce the Average Handling Time and improve the First Call Resolution rate
• Data source for analysis:
• Community analysis
• Household detection
• Segmentation
• Churn prediction
• Behavioural analysis
Operation oriented
• network performance overview
• service management support
• precise problem geolocation
• end-to-end in-depth troubleshooting
• real-time fault detection
• automated troubleshooting (diagnosis, recovery)
• QoS KPI trend analysis
Constant monitoring of network, service and customer KPIs.

Use cases in action
• Network Analytics (web application)
• Cell
• User
• Device
• Getting raw data into HDFS for analysts – SQL queries via Impala

They already do it
• DWH style
• Batch processing

Challenges
• Conversion from binary format (e.g. ASN.1)
• Tightening the feedback loop
• Have a solution ready for future use cases
• Anomaly detection
• Predictive maintenance
• Still allow people to run analytical queries on data

Apache Kafka
• De facto standard for stream processing
• Fault tolerant
• Highly scalable
• We use it with
• Avro (schema evolution)
• Schema registry

Apache Flink
• Very flexible window definitions
• Event time semantics
• Many deployment options
• Can handle large state

Challenges
• Running Flink on YARN
• Secured Hadoop & Kafka cluster
• Data onboarding
• Side inputs/data enrichment
• Storing data in Hadoop

Flink on YARN
• Big, fat, long-running YARN session
• Or Flink cluster per job
${FLINK_HOME}/bin/flink run \
  -m yarn-cluster \
  -d \
  -ynm ${APPLICATION_NAME} \
  -yn 2 \
  -ys 2 \
  -yjm 2048 \
  -ytm 4096 \
  -c com.triviadata.streaming.job.SipVoiceStream ${JAR_PATH} \
  --kafkaServer ${KAFKA_SERVER} \
  --schemaRegistryUrl ${SCHEMA_REGISTRY_URL} \
  --sipVoiceTopic raw.SipVoice \
  --correlatedSipVoiceTopic result.SipVoiceCorrelated \
  --stateLocation ${FLINK_STATE_LOCATION} \
  --security-protocol SASL_PLAINTEXT \
  --sasl-kerberos-service-name kafka

Kerberized Hadoop & Kafka
• Easy & Straightforward Flink setup
• HBase/Phoenix privileges
• Hassle with Kafka ACLs
• ACL to read from the topic
• ACL to write to the topic
• ACL to join consumer group
security.kerberos.login.use-ticket-cache: false
security.kerberos.login.keytab: /home/appuser/appuser.keytab
security.kerberos.login.principal: appuser
security.kerberos.login.contexts: Client,KafkaClient

Side inputs/Data enrichment
• Read code lists from HDFS
• Store them in RocksDB on the local filesystem of the Data Node
• Ask RocksDB to translate code -> value

Side inputs/Data enrichment
• Code list files on HDFS updated once a day
• Command topic to notify jobs about new files
• Refresh code lists stored in RocksDB
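
The lookup-and-refresh cycle described above can be sketched without any storage engine. This is a minimal illustration in which an in-memory map stands in for RocksDB; the `CodeListStore` class and the `code;value` line layout are assumptions, and the actual jobs read the files from HDFS when the command topic announces them:

```scala
import java.util.concurrent.atomic.AtomicReference

// In-memory stand-in for the RocksDB-backed code list (illustrative only).
final class CodeListStore {
  private val current = new AtomicReference[Map[String, String]](Map.empty)

  // Parses "code;value" lines (an assumed file layout) and swaps the whole
  // mapping atomically, so readers never see a half-loaded list.
  def refresh(lines: Iterable[String]): Unit = {
    val parsed = lines.flatMap { line =>
      line.split(";", 2) match {
        case Array(code, value) => Some(code.trim -> value.trim)
        case _                  => None // skip malformed lines
      }
    }.toMap
    current.set(parsed)
  }

  // Translates code -> value; unknown codes fall back to the raw code.
  def translate(code: String): String =
    current.get().getOrElse(code, code)
}
```

A job would call `refresh` whenever a message about new files arrives and `translate` for every record it enriches.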

Apache Phoenix
• OLTP DB on top of HBase
• JDBC API
• ACID transactions
• Secondary indexes
• Joins

Cloudera Impala
• Analytic database for Hadoop

Correlation
• Merge together related messages coming from one stream
• Key stream by calling/called number
• Merge messages with the same key where start time difference is less than X.
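
The merge rule above can be tried out on a plain collection before moving it into a streaming job. A minimal batch sketch, assuming a hypothetical `Msg` type with a key, a start time in milliseconds and a payload; each group is anchored on the start time of its first message, and "less than X" is treated as an inclusive bound here:

```scala
// Hypothetical message type for illustration.
final case class Msg(key: String, startTime: Long, payload: String)

object Correlator {
  // Groups messages that share a key and whose start time lies within
  // maxDiffMillis of the first message of an existing group.
  def correlate(messages: Seq[Msg], maxDiffMillis: Long): Seq[List[Msg]] =
    messages
      .groupBy(_.key)
      .values
      .flatMap { sameKey =>
        sameKey
          .foldLeft(Vector.empty[(Long, List[Msg])]) { (groups, msg) =>
            // Look for a group whose anchor start time is close enough.
            val idx = groups.indexWhere { case (anchor, _) =>
              math.abs(anchor - msg.startTime) <= maxDiffMillis
            }
            if (idx >= 0) groups.updated(idx, (groups(idx)._1, msg :: groups(idx)._2))
            else groups :+ ((msg.startTime, List(msg))) // start a new group
          }
          .map { case (_, msgs) => msgs.sortBy(_.startTime) }
      }
      .toSeq
}
```

The order of the returned groups is not defined, since it follows the iteration order of `groupBy`.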

Correlation
import java.util.UUID
import scala.collection.JavaConverters._
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.util.Collector

// As used here:
//   sipVoiceState  maps a start time to the messages collected for it
//   sipVoiceTimers maps a timer timestamp to the start time it will flush
override def processElement(
    value: SipVoice,
    ctx: KeyedProcessFunction[String, SipVoice, SipVoices]#Context,
    out: Collector[SipVoices]): Unit = {
  val startTime = parseTime(value.startTime)
  val (key, values) =
    sipVoiceState
      .keys
      .asScala
      // Reuse an existing group whose start time is close enough.
      .find(s => math.abs(s - startTime) <= waitingTime)
      .map(k => (k, value :: sipVoiceState.get(k)))
      .getOrElse {
        // First message of a new group: schedule its emission.
        val triggerTimeStamp =
          ctx.timerService().currentProcessingTime() + delayPeriod
        ctx
          .timerService()
          .registerProcessingTimeTimer(triggerTimeStamp)
        sipVoiceTimers
          .put(triggerTimeStamp, startTime)
        (startTime, List(value))
      }
  sipVoiceState.put(key, values)
}

override def onTimer(
    timestamp: Long,
    ctx: KeyedProcessFunction[String, SipVoice, SipVoices]#OnTimerContext,
    out: Collector[SipVoices]): Unit = {
  if (sipVoiceTimers.contains(timestamp)) {
    val sipVoiceKey = sipVoiceTimers.get(timestamp)
    // All messages of the group share one correlation id.
    val correlationId = UUID.randomUUID().toString
    val correlatedSipVoices =
      sipVoiceState
        .get(sipVoiceKey)
        .map(_.toCorrelated(correlationId))
        .sortBy(_.startTime)
    out.collect(SipVoices(correlatedSipVoices))
    // Update metrics, then clear the state of the emitted group.
    correlatedSipVoice.inc()
    inStateSipVoice.dec(correlatedSipVoices.size)
    sipVoiceTimers.remove(timestamp)
    sipVoiceState.remove(sipVoiceKey)
  }
}

Correlation
• Correlate messages among multiple streams
• Switching between networks during the call
• Call failure and reestablishment
• Event time semantics
• Lateness
• Out of order messages
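
To make the difference between an out-of-order and a late message concrete, here is a small simulation of a bounded out-of-orderness watermark. It is a plain Scala sketch, not Flink's watermark API; the `Watermarks` object and its parameter names are made up for illustration. The watermark trails the highest event time seen so far by a fixed bound, and a message counts as late only if its timestamp is already behind that watermark when it arrives:

```scala
object Watermarks {
  // Classifies each event timestamp (given in arrival order) as late or not.
  // Returns pairs of (timestamp, isLate).
  def classify(arrivals: Seq[Long], maxOutOfOrderness: Long): Seq[(Long, Boolean)] = {
    val zero = (Option.empty[Long], Vector.empty[(Long, Boolean)])
    val (_, result) = arrivals.foldLeft(zero) { case ((maxSeen, acc), ts) =>
      // Watermark before this element: highest timestamp so far minus the bound.
      val late = maxSeen.exists(max => ts < max - maxOutOfOrderness)
      val newMax = Some(maxSeen.fold(ts)(m => math.max(m, ts)))
      (newMax, acc :+ (ts -> late))
    }
    result
  }
}
```

With a bound of 1000 ms, a message that arrives 500 ms behind the newest one is merely out of order and still lands in its window, while one that arrives 2000 ms behind is late and needs separate handling.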

Aggregations
• As an example for a cell we want to see:
• Number of errors
• Number of calls
• Number of intercell handovers
• …

Defined window
table.window(
  Tumble over windowLengthInMinutes.minutes
    on 'timestamp as 'timeWindow)

Table API
table
  .window(Tumble over windowLengthInMinutes.minutes on 'timestamp as 'timeWindow)
  .groupBy(
    'lastCell,
    'cellName,
    'cellType,
    'cellBand,
    'cellBandwidthDownload4g,
    'cellBandwidthUpload4g,
    'cellSiteName,
    'cellSiteAddress,
    'timeWindow
  )
  .select(
    'lastCell,
    'cellName,
    'cellType,
    'cellBand,
    'cellBandwidthDownload4g,
    'cellBandwidthUpload4g,
    'cellSiteName,
    'cellSiteAddress,
    'voiceConnectAttempt.sum as 'voiceConnectAttempt,
    'voiceConnectSuccess.sum as 'voiceConnectSuccess,
    'interCellHandovers.sum as 'interCellHandovers,
    'srvccHandovers.sum as 'srvccHandovers,
    'timeWindow.start.cast(Types.LONG) as 'timeWindow
  )
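
What the tumbling window computes can be reproduced on a plain collection, which is handy for checking expected results in a unit test. A minimal sketch under stated assumptions: `CellEvent` and `CellKpi` are hypothetical types carrying only two of the summed columns, and timestamps are non-negative epoch milliseconds:

```scala
// Hypothetical input and output types for illustration.
final case class CellEvent(
    cell: String,
    timestampMillis: Long,
    voiceConnectAttempt: Int,
    voiceConnectSuccess: Int)

final case class CellKpi(
    cell: String,
    windowStart: Long,
    voiceConnectAttempt: Int,
    voiceConnectSuccess: Int)

object TumblingWindows {
  // Assigns every event to the fixed-size window containing its timestamp
  // and sums the counters per (cell, window).
  def aggregate(events: Seq[CellEvent], windowLengthMillis: Long): Seq[CellKpi] =
    events
      .groupBy(e => (e.cell, e.timestampMillis - e.timestampMillis % windowLengthMillis))
      .map { case ((cell, windowStart), es) =>
        CellKpi(
          cell,
          windowStart,
          es.map(_.voiceConnectAttempt).sum,
          es.map(_.voiceConnectSuccess).sum)
      }
      .toSeq
      .sortBy(k => (k.windowStart, k.cell))
}
```

Unlike the streaming job, this version sees all events at once, so it sidesteps lateness entirely; it only illustrates the window assignment and the sums.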
