SlideShare a Scribd company logo
Mobility Insights at Swisscom :
Understanding Collective Mobility
in Switzerland
Spark Summit, October 2016
francois.garillot@swisscom.com @huitseeker
mohamed.kafsi@swisscom.com @mou7
Agenda
Intro
Smart-Data
Big Data Architecture
Trajectory Classification
Streaming
Data challenges
Introduction : Positioning
Positioning users in a modern
network
no triangulation at scale
positioning based on cell attachement history, prec ~200m
cell-to-cell handover, prec ~50m around limit
Timing Advance (roundtrip) : better results on good data sources
Trajectory data
mining
time series reconstruction
trajectory segmentation
map matching, clustering
mode of transport detection
...
How to create value with
positioning at Swisscom ?
with competitive analytics & data sources,
and by making sure it embodies the right values.
Smart Data
On (not) tracking (any users)
"Swisscom strictly complies with all applicable legislations, in
particular with the telecommunications law and the data
protection initiative."
Jürg Studerus, Swisscom Senior Manager, Corporate Responsibility
Smart Data : Big Data without Big Brother
Privacy preservation is an asset
It makes sense to care as much about your customer as they do about you.
We technically enforce this
answering only synoptic questions, no individual ones,
with data flow control : we neutralize quasi-identifiers at every stage
Swisscom mobile subscribers
source: xavierstuder.com, MD&A reports
Our choices
public good applications: making Switzerland run better,
understanding places, not individuals,
anonymized aggregations
A first product : City
"It's a dream for civil engineers" -- Alexandre Machu, Urban
systems engineer, Pully
Demo time
Usages
New roads to divert transit traffic out of downtown (informs a 50M$
project)
Parking lot expansion and transformation (informs a 10M$ project)
Electric car charging station deployment
Big Data architecture
In the backend
Spark configuration essentials for enterprise
jobs
spark.executor.memory="not the default 1g"
spark.kryo.registrator="something custom" // among others
spark.shuffle.service.enabled="true"
spark.dynamicAllocation.enabled="true"
spark.deploy.recoveryMode="ZOOKEEPER"
spark.deploy.recoveryDirectory="/path/to/state"
spark.deploy.zookeeper.url="quorumMachine1:2181, ..."
NOT the only valuable settings, see
https://techsuppdiva.github.io
Scala (1/2)
type ChronoHistory = List[UEupdate] @@ Chronological
type AnteChronoHistory = List[UEupdate] @@ AnteChronological
implicit class Chrono(l: List[UEupdate]) {
def asChrono: ChronoHistory = {
chronoCheck(l)
l.asInstanceOf[ChronoHistory]
}
def asAnteChrono: AnteChronoHistory = {
anteChronoCheck(l)
l.asInstanceOf[AnteChronoHistory]
}
}
Scala (2/2)
implicit def reverseChrono(l: ChronoHistory): AnteChronoHistory = l.reve
implicit def reverseAnteChrono(l: AnteChronoHistory): ChronoHistory = l.
>
Trajectory Classification
What is the proportion of trips
associated with trains?
Mode of Transport Detection
Input: Sequence of network events
Output: Mode of transport (train vs. other)
Network events associated with cells
Create fingerprints of cells
Intuition: cells with intermittent increases in the number of connections are
associated with collective mode of transports
Bursty Cell
Number of devices vs. minute of day
Burstiness
Random process with mean and variance , the relative variance isμ σ
2
.D =
σ
2
μ
Machine Learning with Spark
Periodic Spark job to compute cell features
Supervised training on labeled data (train vs. others)
Training and test with Spark ML
Spark (1/2)
val labeledPoints: RDD[LabeledPoint] = data.map {
case (transportMode, tripFeatures) =>
LabeledPoint(
labelOf(transportMode).toDouble,
featuresToFeatureVector(tripFeatures)
)
} // generate labeled data
labeledPoints.cache()
def trainNewModel = // Fix the used model
new LogisticRegressionWithLBFGS()
.setIntercept(true)
.setNumClasses(numberOfClasses)
.run(_: RDD[LabeledPoint])
Spark (2/2)
// train a model for performance evaluation
val model = trainNewModel(trainingData)
// Evaluate model on test instances and compute test error
val labelAndPreds = testData.map { point =>
val prediction = model.predict(point.features)
(point.label, prediction)
}
val testErr =
(labelAndPreds
.filter(r => r._1 != r._2)
.count().toDouble) / testData.count()
// train final model on the whole dataset
val finalModel = trainNewModel(labeledPoints)
Streaming Analytics
Road conditions on highways
Selecting users on a path of Interest
Graph matching
Locality-sensitive hashing :
A family H of hashing functions is
-sensitive if:
if then
if then
More:
LocalitySensitiveHashingBySpark, Uber, Spark Summit
2016
AGentleIntroductiontoLocality-SensitiveHashingwith
ApacheSpark, Scala By The Bay 2015
(r, cr, , )p1 p2
p– q ≤ r
P [h(q) = h(p)] ≥rH p1
p– q ≥ cr
P [h(q) = h(p)] ≤rH p2
Computing speeds: Solving graph
constraints
given a history of cells, where was the user, exactly ? (twice)
what's the path between 2 positions ?
linear query per user
Checkpointing: Set the checkpoint
interval
are you checkpointing too often ?
every batches, you'll need batches to recover from checkpointing time
loss
make sure
k p
k ≥ p
Data Challenges
Crucial elements
Quality, reliability of data sources
Automated ground truth checking
sensors
TEMS fleet
What's the ground truth for mode of transport, domicile, etc ?
Colleagues and friends volunteers
Questions ?

More Related Content

What's hot

Image Caption Generation: Intro to Distributed Tensorflow and Distributed Sco...
Image Caption Generation: Intro to Distributed Tensorflow and Distributed Sco...Image Caption Generation: Intro to Distributed Tensorflow and Distributed Sco...
Image Caption Generation: Intro to Distributed Tensorflow and Distributed Sco...
ICTeam S.p.A.
 
Landset 8 的雲層去除技巧實作
Landset 8 的雲層去除技巧實作Landset 8 的雲層去除技巧實作
Landset 8 的雲層去除技巧實作
鈵斯 倪
 
MATLAB Based Research Projects List Assistance
MATLAB Based Research Projects List AssistanceMATLAB Based Research Projects List Assistance
MATLAB Based Research Projects List Assistance
Matlab Simulation
 
Exploring Modeling - Doing More with Lists
Exploring Modeling - Doing More with ListsExploring Modeling - Doing More with Lists
Exploring Modeling - Doing More with Lists
Ronen Botzer
 
Rachel Leuthold: Shape Optimization for Rigid Airfoils in Multiple-Kite AWE S...
Rachel Leuthold: Shape Optimization for Rigid Airfoils in Multiple-Kite AWE S...Rachel Leuthold: Shape Optimization for Rigid Airfoils in Multiple-Kite AWE S...
Rachel Leuthold: Shape Optimization for Rigid Airfoils in Multiple-Kite AWE S...
Roland Schmehl
 
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
NECST Lab @ Politecnico di Milano
 
Low Energy Task Scheduling based on Work Stealing
Low Energy Task Scheduling based on Work StealingLow Energy Task Scheduling based on Work Stealing
Low Energy Task Scheduling based on Work Stealing
LEGATO project
 
S1170143 2
S1170143 2S1170143 2
S1170143 2s1170143
 
Composable Energy Modeling for ML-Driven Drone Applications
Composable Energy Modeling for ML-Driven Drone ApplicationsComposable Energy Modeling for ML-Driven Drone Applications
Composable Energy Modeling for ML-Driven Drone Applications
Demetris Trihinas
 
Big Data Analytics in R using sparklyr
Big Data Analytics in R using sparklyrBig Data Analytics in R using sparklyr
Big Data Analytics in R using sparklyr
Nicola Lambiase
 
StreamSight - Query-Driven Descriptive Analytics for IoT and Edge Computing
StreamSight - Query-Driven Descriptive Analytics for IoT and Edge ComputingStreamSight - Query-Driven Descriptive Analytics for IoT and Edge Computing
StreamSight - Query-Driven Descriptive Analytics for IoT and Edge Computing
Demetris Trihinas
 
Device Data Directory and Asynchronous execution: A path to heterogeneous com...
Device Data Directory and Asynchronous execution: A path to heterogeneous com...Device Data Directory and Asynchronous execution: A path to heterogeneous com...
Device Data Directory and Asynchronous execution: A path to heterogeneous com...
LEGATO project
 
FabSim: Facilitating computational research through automation on large-scale...
FabSim: Facilitating computational research through automation on large-scale...FabSim: Facilitating computational research through automation on large-scale...
FabSim: Facilitating computational research through automation on large-scale...
Derek Groen
 

What's hot (13)

Image Caption Generation: Intro to Distributed Tensorflow and Distributed Sco...
Image Caption Generation: Intro to Distributed Tensorflow and Distributed Sco...Image Caption Generation: Intro to Distributed Tensorflow and Distributed Sco...
Image Caption Generation: Intro to Distributed Tensorflow and Distributed Sco...
 
Landset 8 的雲層去除技巧實作
Landset 8 的雲層去除技巧實作Landset 8 的雲層去除技巧實作
Landset 8 的雲層去除技巧實作
 
MATLAB Based Research Projects List Assistance
MATLAB Based Research Projects List AssistanceMATLAB Based Research Projects List Assistance
MATLAB Based Research Projects List Assistance
 
Exploring Modeling - Doing More with Lists
Exploring Modeling - Doing More with ListsExploring Modeling - Doing More with Lists
Exploring Modeling - Doing More with Lists
 
Rachel Leuthold: Shape Optimization for Rigid Airfoils in Multiple-Kite AWE S...
Rachel Leuthold: Shape Optimization for Rigid Airfoils in Multiple-Kite AWE S...Rachel Leuthold: Shape Optimization for Rigid Airfoils in Multiple-Kite AWE S...
Rachel Leuthold: Shape Optimization for Rigid Airfoils in Multiple-Kite AWE S...
 
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
 
Low Energy Task Scheduling based on Work Stealing
Low Energy Task Scheduling based on Work StealingLow Energy Task Scheduling based on Work Stealing
Low Energy Task Scheduling based on Work Stealing
 
S1170143 2
S1170143 2S1170143 2
S1170143 2
 
Composable Energy Modeling for ML-Driven Drone Applications
Composable Energy Modeling for ML-Driven Drone ApplicationsComposable Energy Modeling for ML-Driven Drone Applications
Composable Energy Modeling for ML-Driven Drone Applications
 
Big Data Analytics in R using sparklyr
Big Data Analytics in R using sparklyrBig Data Analytics in R using sparklyr
Big Data Analytics in R using sparklyr
 
StreamSight - Query-Driven Descriptive Analytics for IoT and Edge Computing
StreamSight - Query-Driven Descriptive Analytics for IoT and Edge ComputingStreamSight - Query-Driven Descriptive Analytics for IoT and Edge Computing
StreamSight - Query-Driven Descriptive Analytics for IoT and Edge Computing
 
Device Data Directory and Asynchronous execution: A path to heterogeneous com...
Device Data Directory and Asynchronous execution: A path to heterogeneous com...Device Data Directory and Asynchronous execution: A path to heterogeneous com...
Device Data Directory and Asynchronous execution: A path to heterogeneous com...
 
FabSim: Facilitating computational research through automation on large-scale...
FabSim: Facilitating computational research through automation on large-scale...FabSim: Facilitating computational research through automation on large-scale...
FabSim: Facilitating computational research through automation on large-scale...
 

Similar to Mobility insights at Swisscom - Understanding collective mobility in Switzerland

Analytics with Spark
Analytics with SparkAnalytics with Spark
Analytics with Spark
Probst Ludwine
 
Spark streaming , Spark SQL
Spark streaming , Spark SQLSpark streaming , Spark SQL
Spark streaming , Spark SQL
Yousun Jeong
 
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaDeep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
GoDataDriven
 
scalable machine learning
scalable machine learningscalable machine learning
scalable machine learning
Samir Bessalah
 
Mechatronics engineer
Mechatronics engineerMechatronics engineer
Mechatronics engineer
Samuel Narcisse
 
The CAOS framework: democratize the acceleration of compute intensive applica...
The CAOS framework: democratize the acceleration of compute intensive applica...The CAOS framework: democratize the acceleration of compute intensive applica...
The CAOS framework: democratize the acceleration of compute intensive applica...
NECST Lab @ Politecnico di Milano
 
Flux - Open Machine Learning Stack / Pipeline
Flux - Open Machine Learning Stack / PipelineFlux - Open Machine Learning Stack / Pipeline
Flux - Open Machine Learning Stack / Pipeline
Jan Wiegelmann
 
Large scale data capture and experimentation platform at Grab
Large scale data capture and experimentation platform at GrabLarge scale data capture and experimentation platform at Grab
Large scale data capture and experimentation platform at Grab
Roman
 
112 portfpres.pdf
112 portfpres.pdf112 portfpres.pdf
112 portfpres.pdf
sash236
 
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
Chester Chen
 
Just Another QSAR Project under OpenTox
Just Another QSAR Project under OpenToxJust Another QSAR Project under OpenTox
Just Another QSAR Project under OpenTox
Pantelis Sopasakis
 
Real-Time Spark: From Interactive Queries to Streaming
Real-Time Spark: From Interactive Queries to StreamingReal-Time Spark: From Interactive Queries to Streaming
Real-Time Spark: From Interactive Queries to Streaming
Databricks
 
Shifu plugin-trainer and pmml-adapter
Shifu plugin-trainer and pmml-adapterShifu plugin-trainer and pmml-adapter
Shifu plugin-trainer and pmml-adapter
Lisa Hua
 
Spark what's new what's coming
Spark what's new what's comingSpark what's new what's coming
Spark what's new what's coming
Databricks
 
Intro to Spark and Spark SQL
Intro to Spark and Spark SQLIntro to Spark and Spark SQL
Intro to Spark and Spark SQL
jeykottalam
 
Continuous Application with Structured Streaming 2.0
Continuous Application with Structured Streaming 2.0Continuous Application with Structured Streaming 2.0
Continuous Application with Structured Streaming 2.0
Anyscale
 
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
Chetan Khatri
 
Spark training-in-bangalore
Spark training-in-bangaloreSpark training-in-bangalore
Spark training-in-bangalore
Kelly Technologies
 
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scalaAutomate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Chetan Khatri
 
No more struggles with Apache Spark workloads in production
No more struggles with Apache Spark workloads in productionNo more struggles with Apache Spark workloads in production
No more struggles with Apache Spark workloads in production
Chetan Khatri
 

Similar to Mobility insights at Swisscom - Understanding collective mobility in Switzerland (20)

Analytics with Spark
Analytics with SparkAnalytics with Spark
Analytics with Spark
 
Spark streaming , Spark SQL
Spark streaming , Spark SQLSpark streaming , Spark SQL
Spark streaming , Spark SQL
 
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaDeep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
 
scalable machine learning
scalable machine learningscalable machine learning
scalable machine learning
 
Mechatronics engineer
Mechatronics engineerMechatronics engineer
Mechatronics engineer
 
The CAOS framework: democratize the acceleration of compute intensive applica...
The CAOS framework: democratize the acceleration of compute intensive applica...The CAOS framework: democratize the acceleration of compute intensive applica...
The CAOS framework: democratize the acceleration of compute intensive applica...
 
Flux - Open Machine Learning Stack / Pipeline
Flux - Open Machine Learning Stack / PipelineFlux - Open Machine Learning Stack / Pipeline
Flux - Open Machine Learning Stack / Pipeline
 
Large scale data capture and experimentation platform at Grab
Large scale data capture and experimentation platform at GrabLarge scale data capture and experimentation platform at Grab
Large scale data capture and experimentation platform at Grab
 
112 portfpres.pdf
112 portfpres.pdf112 portfpres.pdf
112 portfpres.pdf
 
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
 
Just Another QSAR Project under OpenTox
Just Another QSAR Project under OpenToxJust Another QSAR Project under OpenTox
Just Another QSAR Project under OpenTox
 
Real-Time Spark: From Interactive Queries to Streaming
Real-Time Spark: From Interactive Queries to StreamingReal-Time Spark: From Interactive Queries to Streaming
Real-Time Spark: From Interactive Queries to Streaming
 
Shifu plugin-trainer and pmml-adapter
Shifu plugin-trainer and pmml-adapterShifu plugin-trainer and pmml-adapter
Shifu plugin-trainer and pmml-adapter
 
Spark what's new what's coming
Spark what's new what's comingSpark what's new what's coming
Spark what's new what's coming
 
Intro to Spark and Spark SQL
Intro to Spark and Spark SQLIntro to Spark and Spark SQL
Intro to Spark and Spark SQL
 
Continuous Application with Structured Streaming 2.0
Continuous Application with Structured Streaming 2.0Continuous Application with Structured Streaming 2.0
Continuous Application with Structured Streaming 2.0
 
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
 
Spark training-in-bangalore
Spark training-in-bangaloreSpark training-in-bangalore
Spark training-in-bangalore
 
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scalaAutomate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
 
No more struggles with Apache Spark workloads in production
No more struggles with Apache Spark workloads in productionNo more struggles with Apache Spark workloads in production
No more struggles with Apache Spark workloads in production
 

More from François Garillot

Growing Your Types Without Growing Your Workload
Growing Your Types Without Growing Your WorkloadGrowing Your Types Without Growing Your Workload
Growing Your Types Without Growing Your Workload
François Garillot
 
Deep learning on a mixed cluster with deeplearning4j and spark
Deep learning on a mixed cluster with deeplearning4j and sparkDeep learning on a mixed cluster with deeplearning4j and spark
Deep learning on a mixed cluster with deeplearning4j and spark
François Garillot
 
Spark Streaming : Dealing with State
Spark Streaming : Dealing with StateSpark Streaming : Dealing with State
Spark Streaming : Dealing with State
François Garillot
 
A Gentle Introduction to Locality Sensitive Hashing with Apache Spark
A Gentle Introduction to Locality Sensitive Hashing with Apache SparkA Gentle Introduction to Locality Sensitive Hashing with Apache Spark
A Gentle Introduction to Locality Sensitive Hashing with Apache Spark
François Garillot
 
Ramping up your Devops Fu for Big Data developers
Ramping up your Devops Fu for Big Data developersRamping up your Devops Fu for Big Data developers
Ramping up your Devops Fu for Big Data developers
François Garillot
 
Diving In The Deep End Of The Big Data Pool
Diving In The Deep End Of The Big Data PoolDiving In The Deep End Of The Big Data Pool
Diving In The Deep End Of The Big Data Pool
François Garillot
 
Scala Collections : Java 8 on Steroids
Scala Collections : Java 8 on SteroidsScala Collections : Java 8 on Steroids
Scala Collections : Java 8 on Steroids
François Garillot
 

More from François Garillot (7)

Growing Your Types Without Growing Your Workload
Growing Your Types Without Growing Your WorkloadGrowing Your Types Without Growing Your Workload
Growing Your Types Without Growing Your Workload
 
Deep learning on a mixed cluster with deeplearning4j and spark
Deep learning on a mixed cluster with deeplearning4j and sparkDeep learning on a mixed cluster with deeplearning4j and spark
Deep learning on a mixed cluster with deeplearning4j and spark
 
Spark Streaming : Dealing with State
Spark Streaming : Dealing with StateSpark Streaming : Dealing with State
Spark Streaming : Dealing with State
 
A Gentle Introduction to Locality Sensitive Hashing with Apache Spark
A Gentle Introduction to Locality Sensitive Hashing with Apache SparkA Gentle Introduction to Locality Sensitive Hashing with Apache Spark
A Gentle Introduction to Locality Sensitive Hashing with Apache Spark
 
Ramping up your Devops Fu for Big Data developers
Ramping up your Devops Fu for Big Data developersRamping up your Devops Fu for Big Data developers
Ramping up your Devops Fu for Big Data developers
 
Diving In The Deep End Of The Big Data Pool
Diving In The Deep End Of The Big Data PoolDiving In The Deep End Of The Big Data Pool
Diving In The Deep End Of The Big Data Pool
 
Scala Collections : Java 8 on Steroids
Scala Collections : Java 8 on SteroidsScala Collections : Java 8 on Steroids
Scala Collections : Java 8 on Steroids
 

Recently uploaded

Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 

Recently uploaded (20)

Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 

Mobility insights at Swisscom - Understanding collective mobility in Switzerland

  • 1. Mobility Insights at Swisscom : Understanding Collective Mobility in Switzerland Spark Summit, October 2016 francois.garillot@swisscom.com @huitseeker mohamed.kafsi@swisscom.com @mou7
  • 2. Agenda Intro Smart-Data Big Data Architecture Trajectory Classification Streaming Data challenges
  • 4. Positioning users in a modern network no triangulation at scale positioning based on cell attachement history, prec ~200m cell-to-cell handover, prec ~50m around limit Timing Advance (roundtrip) : better results on good data sources
  • 5. Trajectory data mining time series reconstruction trajectory segmentation map matching, clustering mode of transport detection ...
  • 6. How to create value with positioning at Swisscom ? with competitive analytics & data sources, and by making sure it embodies the right values.
  • 8. On (not) tracking (any users) "Swisscom strictly complies with all applicable legislations, in particular with the telecommunications law and the data protection initiative." Jürg Studerus, Swisscom Senior Manager, Corporate Responsibility
  • 9. Smart Data : Big Data without Big Brother Privacy preservation is an asset It makes sense to care as much about your customer as they do about you. We technically enforce this answering only synoptic questions, no individual ones, with data flow control : we neutralize quasi-identifiers at every stage
  • 10. Swisscom mobile subscribers source: xavierstuder.com, MD&A reports
  • 11. Our choices public good applications: making Switzerland run better, understanding places, not individuals, anonymized aggregations
  • 12. A first product : City "It's a dream for civil engineers" -- Alexandre Machu, Urban systems engineer, Pully
  • 14. Usages New roads to divert transit traffic out of downtown (informs a 50M$ project) Parking lot expansion and transformation (informs a 10M$ project) Electric car charging station deployment
  • 17. Spark configuration essentials for enterprise jobs spark.executor.memory="not the default 1g" spark.kryo.registrator="something custom" // among others spark.shuffle.service.enabled="true" spark.dynamicAllocation.enabled="true" spark.deploy.recoveryMode="ZOOKEEPER" spark.deploy.recoveryDirectory="/path/to/state" spark.deploy.zookeeper.url="quorumMachine1:2181, ..." NOT the only valuable settings, see https://techsuppdiva.github.io
  • 18. Scala (1/2) type ChronoHistory = List[UEupdate] @@ Chronological type AnteChronoHistory = List[UEupdate] @@ AnteChronological implicit class Chrono(l: List[UEupdate]) { def asChrono: ChronoHistory = { chronoCheck(l) l.asInstanceOf[ChronoHistory] } def asAnteChrono: AnteChronoHistory = { anteChronoCheck(l) l.asInstanceOf[AnteChronoHistory] } }
  • 19. Scala (2/2) implicit def reverseChrono(l: ChronoHistory): AnteChronoHistory = l.reve implicit def reverseAnteChrono(l: AnteChronoHistory): ChronoHistory = l.
  • 21. What is the proportion of trips associated with trains?
  • 22. Mode of Transport Detection Input: Sequence of network events Output: Mode of transport (train vs. other) Network events associated with cells Create fingerprints of cells Intuition: cells with intermittent increases in the number of connections are associated with collective mode of transports
  • 23. Bursty Cell Number of devices vs. minute of day
  • 24. Burstiness Random process with mean and variance , the relative variance isμ σ 2 .D = σ 2 μ
  • 25. Machine Learning with Spark Periodic Spark job to compute cell features Supervised training on labeled data (train vs. others) Training and test with Spark ML
  • 26. Spark (1/2) val labeledPoints: RDD[LabeledPoint] = data.map { case (transportMode, tripFeatures) => LabeledPoint( labelOf(transportMode).toDouble, featuresToFeatureVector(tripFeatures) ) } // generate labeled data labeledPoints.cache() def trainNewModel = // Fix the used model new LogisticRegressionWithLBFGS() .setIntercept(true) .setNumClasses(numberOfClasses) .run(_: RDD[LabeledPoint])
  • 27. Spark (2/2) // train a model for performance evaluation val model = trainNewModel(trainingData) // Evaluate model on test instances and compute test error val labelAndPreds = testData.map { point => val prediction = model.predict(point.features) (point.label, prediction) } val testErr = (labelAndPreds .filter(r => r._1 != r._2) .count().toDouble) / testData.count() // train final model on the whole dataset val finalModel = trainNewModel(labeledPoints)
  • 29. Road conditions on highways
  • 30. Selecting users on a path of Interest
  • 31. Graph matching Locality-sensitive hashing : A family H of hashing functions is -sensitive if: if then if then More: LocalitySensitiveHashingBySpark, Uber, Spark Summit 2016 AGentleIntroductiontoLocality-SensitiveHashingwith ApacheSpark, Scala By The Bay 2015 (r, cr, , )p1 p2 p– q ≤ r P [h(q) = h(p)] ≥rH p1 p– q ≥ cr P [h(q) = h(p)] ≤rH p2
  • 32. Computing speeds: Solving graph constraints given a history of cells, where was the user, exactly ? (twice) what's the path between 2 positions ? linear query per user
  • 33. Checkpointing: Set the checkpoint interval are you checkpointing too often ? every batches, you'll need batches to recover from checkpointing time loss make sure k p k ≥ p
  • 35. Crucial elements Quality, reliability of data sources Automated ground truth checking sensors TEMS fleet What's the ground truth for mode of transport, domicile, etc ? Colleagues and friends volunteers