SlideShare a Scribd company logo
1 of 31
Download to read offline
Oleg Vasilyev, Henry Milner, Wensi Fu, Oleg White
Assigning Responsibility
for Deteriorations in
Video Quality
…is a video experience management
platform that maximizes viewer
engagement.
We provide quality metrics that give a
comprehensive view into online video
businesses.
Devices & OVPs
Internet Video Streaming is Hard
Many parties, many paths, but no E2E owner
ISPs and Networks CDNs Publishers
CDN
Cable/DSL ISP
Backbone Network
CDN
Wireless ISP
Encoder
Video Origin &
Operations
Home WiFi
Devices & OVPs
Internet Video Streaming is Hard
Many parties, many paths, but no E2E owner
ISPs and Networks CDNs Publishers
CDN
Cable/DSL ISP
Backbone Network
CDN
Wireless ISP
Encoder
Video Origin &
Operations
Home WiFi
Publisher
control
ISP
control
Devices & OVPs ISPs and Networks CDNs Publishers
CDN
Cable/DSL ISP
Backbone Network
CDN
Wireless ISP
Encoder
Video Origin &
Operations
Home WiFi
100
publishers
1,000,000
unique
videos
(“assets”)
100 CDNs100,000 nodes in internal ISP
network
1,000,000 IP
addresses
1,000 unique
devices
The ecosystem is big.
…in 10,000,000 video sessions from one week, at one US ISP
Viewers are expecting TV-like quality (or better)
QoE is Critical to Engagement
For both video and advertisement businesses
HOW LIKELY ARE YOU TO WATCH
FROM THAT SAME PROVIDER AGAIN?
33.6%
VERY
UNLIKELY
24.8%
UNLIKELY
24.6%
UNCHANGED
58.4%CHURN RISK
6.6%
VERY LIKELY
8%
LIKELY
-3
-8
-11
-14
-16
2011 2012 2013 2014 2015
ENGAGEMENTREDUCTION
(IN MINUTES)
WITH 1% INCREASE IN BUFFERING
MINS
MINS
MINS
MINS
MINS
Source: Conviva, “2015Consumer Survey Report”
Devices & OVPs ISPs and Networks CDNs Publishers
CDN
Cable/DSL ISP
Backbone Network
CDN
Wireless ISP
Encoder
Video Origin &
Operations
Home WiFi
Quality impact
Replace internal
ISP fiber node
No change
0.1% of session
spent buffering
2% of session
spent buffering
Quality impact of
this fiber node on
session 3
Quality impact of
this fiber node on
session 2
Quality impact
0.1% of session
spent buffering
2% of session
spent buffering
-
Quality impact of
this fiber node on
this session
=
Quality impact of this
fiber node
Quality impact of
this fiber node on
session 1=
𝚺
Quality impact
Replace internal
ISP fiber node
No change
0.1% of session
spent buffering
2% of session
spent buffering
???
Typically better
home WiFi
Confounding factors
0.1% of session
spent buffering
2% of session
spent buffering
Different fiber node
Actual fiber node
Actual effect
Effect of confounder
Devices & OVPs ISPs and Networks CDNs Publishers
CDN
Cable/DSL ISP
Backbone Network
CDN
Wireless ISP
Encoder
Video Origin &
Operations
Home WiFi
100
publishers
1,000,000
unique
videos
(“assets”)
100 CDNs100,000 nodes in internal ISP
network
1,000,000 IP
addresses
1,000 unique
devices
Many potentially-important factors
…and confounding factors!
Typically better
home WiFi
Fixing confounding
0.1% of session
spent buffering
2% of session
spent buffering
Different fiber node
Actual fiber node
Typically better
home WiFi
Fixing confounding
0.1% of session
spent buffering
2% of session
spent buffering
Different fiber node
Actual fiber node
Estimate both
effects
Estimating quality impact
Fiber Node = #1
(a good FN)
Fiber Node = #567
(actual FN)
1.7% of session
spent buffering
2.5% of session
spent buffering
Model
- IP: 1.2.3.4
- Fiber Node: #567
- Asset: “Glee”
1.7% of session
spent buffering
2.5% of session
spent buffering
-
Estimated quality
impact of this fiber
node on this session
=
Devices produce short buffering
Devices produce short buffering
Device-level issues are responsible
for most short buffering episodes
Good and bad IP addresses
From worst PCD, typical week:
From worst PCD, typical week:
Worst assets, typical week:
Only 6 really bad assets responsible for strong
deterioration of their sessions regardless of other factors.
Of these, 4 are sport programs and 2 are obscure regularly
scheduled foreign programs.
Modeling video quality
random forests
boosted trees
neural networks
GLMs
Today’s results
Model for rebuffering ratio
Response: Rebuffering ratio (R)
(buffering time / (buffering time + playing time))
Categorical features:
IP address (I)
Fiber Node (N)
Service Group (S)
Publisher (P)
Asset (A)
CDN (C)
Device (D)
Live / VoD
More features:
Time, day of week, asset length.
Time or no time?
Time is strongest or one of strongest features.
Big game, popular show – many sessions suffer, regardless of IP and
device or even CDN.
Time makes model more precise.
But: time steals effect from big nodes such as Asset and Publisher.
Similar question: Bitrate or no bitrate?
Model versions, practical issues
Preference: Spark cluster of 10-30 nodes, each node 4-8 cores and
30GB memory. Hopefully 1-3 hours to process 1-3 weeks of data of big
ISP, whole US.
Nodes as embedded features. Boosted trees on Spark. Whole US. Learn
from 1-3 weeks, apply to last week.
Nodes as one-hot encoding. Random forest on Spark. Each
geographical area is processed separately.
Model versions, practical issues
Nodes as a trainable embedding first layer. Neural network on single
Spark node with Tensorflow. Each geographical area separately.
Embedding size is same for all kinds of nodes.
Embedding size is ~ log of node's dictionary size.
One or two hidden layers above the embedding layer.
Adagrad or any other Ada-like optimizer performs better than no-Ada.
Slow improvements beyond fast mediocre accuracy.
Finding a “good” node
Effect of a node = Model( real features ) - Model( features with good node )
In case of trainable embedding – iterative finding of good nodes:
good = nodes with lowest avg label
while set of good nodes is changing:
for each session:
Effect = Model(real) - Model(good)
good = nodes with lowest avg effect
Overlap of set of good nodes with
the set from previous iteration -
typically stabilizes in 2-3 iterations.
Sessions are affected by ...
If effect is linear, this would be like ART in 3D tomography
Thanks to Spark and Databricks
for making it easy for us
Thank You.
Contact information:
ovasilyev@conviva.com
hmilner@conviva.com
We Are Hiring!
www.conviva.com/our-team/careers

More Related Content

What's hot

Hive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmarkHive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmarkDongwon Kim
 
A Spark Framework For < $100, < 1 Hour, Accurate Personalized DNA Analy...
A Spark Framework For < $100, < 1 Hour, Accurate Personalized DNA Analy...A Spark Framework For < $100, < 1 Hour, Accurate Personalized DNA Analy...
A Spark Framework For < $100, < 1 Hour, Accurate Personalized DNA Analy...Spark Summit
 
Deep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesDeep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesJen Aman
 
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...Josef A. Habdank
 
Analyzing IOT Data in Apache Spark Across Data Centers and Cloud with NetApp ...
Analyzing IOT Data in Apache Spark Across Data Centers and Cloud with NetApp ...Analyzing IOT Data in Apache Spark Across Data Centers and Cloud with NetApp ...
Analyzing IOT Data in Apache Spark Across Data Centers and Cloud with NetApp ...Databricks
 
Pedal to the Metal: Accelerating Spark with Silicon Innovation
Pedal to the Metal: Accelerating Spark with Silicon InnovationPedal to the Metal: Accelerating Spark with Silicon Innovation
Pedal to the Metal: Accelerating Spark with Silicon InnovationJen Aman
 
Transactional writes to cloud storage with Eric Liang
Transactional writes to cloud storage with Eric LiangTransactional writes to cloud storage with Eric Liang
Transactional writes to cloud storage with Eric LiangDatabricks
 
Reactive Streams, Linking Reactive Application To Spark Streaming
Reactive Streams, Linking Reactive Application To Spark StreamingReactive Streams, Linking Reactive Application To Spark Streaming
Reactive Streams, Linking Reactive Application To Spark StreamingSpark Summit
 
Spark Summit EU talk by Jorg Schad
Spark Summit EU talk by Jorg SchadSpark Summit EU talk by Jorg Schad
Spark Summit EU talk by Jorg SchadSpark Summit
 
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...Databricks
 
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark ClustersApril 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark ClustersYahoo Developer Network
 
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Databricks
 
The Hidden Life of Spark Jobs
The Hidden Life of Spark JobsThe Hidden Life of Spark Jobs
The Hidden Life of Spark JobsDataWorks Summit
 
Top 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsTop 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsSpark Summit
 
Deep Dive Into Apache Spark Multi-User Performance Michael Feiman, Mikhail Ge...
Deep Dive Into Apache Spark Multi-User Performance Michael Feiman, Mikhail Ge...Deep Dive Into Apache Spark Multi-User Performance Michael Feiman, Mikhail Ge...
Deep Dive Into Apache Spark Multi-User Performance Michael Feiman, Mikhail Ge...Databricks
 
Unleashing Data Intelligence with Intel and Apache Spark with Michael Greene
Unleashing Data Intelligence with Intel and Apache Spark with Michael GreeneUnleashing Data Intelligence with Intel and Apache Spark with Michael Greene
Unleashing Data Intelligence with Intel and Apache Spark with Michael GreeneDatabricks
 
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkRunning Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkDatabricks
 
Re-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance UnderstandabilityRe-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance UnderstandabilityJen Aman
 

What's hot (20)

Hive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmarkHive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmark
 
A Spark Framework For < $100, < 1 Hour, Accurate Personalized DNA Analy...
A Spark Framework For < $100, < 1 Hour, Accurate Personalized DNA Analy...A Spark Framework For < $100, < 1 Hour, Accurate Personalized DNA Analy...
A Spark Framework For < $100, < 1 Hour, Accurate Personalized DNA Analy...
 
Deep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesDeep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best Practices
 
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
 
Analyzing IOT Data in Apache Spark Across Data Centers and Cloud with NetApp ...
Analyzing IOT Data in Apache Spark Across Data Centers and Cloud with NetApp ...Analyzing IOT Data in Apache Spark Across Data Centers and Cloud with NetApp ...
Analyzing IOT Data in Apache Spark Across Data Centers and Cloud with NetApp ...
 
Pedal to the Metal: Accelerating Spark with Silicon Innovation
Pedal to the Metal: Accelerating Spark with Silicon InnovationPedal to the Metal: Accelerating Spark with Silicon Innovation
Pedal to the Metal: Accelerating Spark with Silicon Innovation
 
Transactional writes to cloud storage with Eric Liang
Transactional writes to cloud storage with Eric LiangTransactional writes to cloud storage with Eric Liang
Transactional writes to cloud storage with Eric Liang
 
Reactive Streams, Linking Reactive Application To Spark Streaming
Reactive Streams, Linking Reactive Application To Spark StreamingReactive Streams, Linking Reactive Application To Spark Streaming
Reactive Streams, Linking Reactive Application To Spark Streaming
 
Spark Summit EU talk by Jorg Schad
Spark Summit EU talk by Jorg SchadSpark Summit EU talk by Jorg Schad
Spark Summit EU talk by Jorg Schad
 
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
 
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark ClustersApril 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters
 
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
 
Big Data Tools in AWS
Big Data Tools in AWSBig Data Tools in AWS
Big Data Tools in AWS
 
The Hidden Life of Spark Jobs
The Hidden Life of Spark JobsThe Hidden Life of Spark Jobs
The Hidden Life of Spark Jobs
 
Top 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsTop 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark Applications
 
Deep Dive Into Apache Spark Multi-User Performance Michael Feiman, Mikhail Ge...
Deep Dive Into Apache Spark Multi-User Performance Michael Feiman, Mikhail Ge...Deep Dive Into Apache Spark Multi-User Performance Michael Feiman, Mikhail Ge...
Deep Dive Into Apache Spark Multi-User Performance Michael Feiman, Mikhail Ge...
 
Unleashing Data Intelligence with Intel and Apache Spark with Michael Greene
Unleashing Data Intelligence with Intel and Apache Spark with Michael GreeneUnleashing Data Intelligence with Intel and Apache Spark with Michael Greene
Unleashing Data Intelligence with Intel and Apache Spark with Michael Greene
 
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkRunning Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
 
Re-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance UnderstandabilityRe-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance Understandability
 
Enterprise Grade Streaming under 2ms on Hadoop
Enterprise Grade Streaming under 2ms on HadoopEnterprise Grade Streaming under 2ms on Hadoop
Enterprise Grade Streaming under 2ms on Hadoop
 

Similar to Assigning Responsibility for Deteriorations in Video Quality with Henry Milner and Oleg Vasilyev

Optimizing your client's wi fi experience
Optimizing your client's wi fi experience Optimizing your client's wi fi experience
Optimizing your client's wi fi experience Cisco Canada
 
(Slides) P2P video broadcast based on per-peer transcoding and its evaluatio...
(Slides) P2P video broadcast based on per-peer transcoding and its evaluatio...(Slides) P2P video broadcast based on per-peer transcoding and its evaluatio...
(Slides) P2P video broadcast based on per-peer transcoding and its evaluatio...Naoki Shibata
 
Optimizing your client's wi fi experience
Optimizing your client's wi fi experienceOptimizing your client's wi fi experience
Optimizing your client's wi fi experienceCisco Canada
 
Managing the Mobile Device Wave
Managing the Mobile Device WaveManaging the Mobile Device Wave
Managing the Mobile Device WaveCisco Canada
 
High Performance Communication for Oracle using InfiniBand
High Performance Communication for Oracle using InfiniBandHigh Performance Communication for Oracle using InfiniBand
High Performance Communication for Oracle using InfiniBandwebhostingguy
 
Challenges and experiences with IPTV from a network point of view
Challenges and experiences with IPTV from a network point of viewChallenges and experiences with IPTV from a network point of view
Challenges and experiences with IPTV from a network point of viewbrouer
 
Spectrum management best practices in a Gigabit wireless world
Spectrum management best practices in a Gigabit wireless worldSpectrum management best practices in a Gigabit wireless world
Spectrum management best practices in a Gigabit wireless worldCisco Canada
 
Network hardware
Network hardwareNetwork hardware
Network hardwaresnoonan
 
MIS Chapter 3
MIS Chapter 3MIS Chapter 3
MIS Chapter 3Lee Gomez
 
Audio/Video Streaming over 802.11
Audio/Video Streaming over 802.11Audio/Video Streaming over 802.11
Audio/Video Streaming over 802.11Videoguy
 
Getting the best out of WebRTC
Getting the best out of WebRTCGetting the best out of WebRTC
Getting the best out of WebRTCDigium
 
Getting the Best Out Of WebRTC - Astricon 2014
Getting the Best Out Of WebRTC - Astricon 2014Getting the Best Out Of WebRTC - Astricon 2014
Getting the Best Out Of WebRTC - Astricon 2014Dan Jenkins
 
Aeroscout Random
Aeroscout RandomAeroscout Random
Aeroscout RandomMarc
 
My profile
My profileMy profile
My profiledhruv_63
 
Age of Language Models in NLP
Age of Language Models in NLPAge of Language Models in NLP
Age of Language Models in NLPTyrone Systems
 
ARM server, The Cy7 Introduction by Aaron Joue, Ambedded Technology
ARM server, The Cy7 Introduction by Aaron Joue, Ambedded TechnologyARM server, The Cy7 Introduction by Aaron Joue, Ambedded Technology
ARM server, The Cy7 Introduction by Aaron Joue, Ambedded TechnologyAaron Joue
 
NTTドコモ様 導入事例 OpenStack Summit 2016 Barcelona 講演「Expanding and Deepening NTT D...
NTTドコモ様 導入事例 OpenStack Summit 2016 Barcelona 講演「Expanding and Deepening NTT D...NTTドコモ様 導入事例 OpenStack Summit 2016 Barcelona 講演「Expanding and Deepening NTT D...
NTTドコモ様 導入事例 OpenStack Summit 2016 Barcelona 講演「Expanding and Deepening NTT D...VirtualTech Japan Inc.
 

Similar to Assigning Responsibility for Deteriorations in Video Quality with Henry Milner and Oleg Vasilyev (20)

types of network cables
types of network cablestypes of network cables
types of network cables
 
2019 types of network cables
2019 types of network cables2019 types of network cables
2019 types of network cables
 
Optimizing your client's wi fi experience
Optimizing your client's wi fi experience Optimizing your client's wi fi experience
Optimizing your client's wi fi experience
 
(Slides) P2P video broadcast based on per-peer transcoding and its evaluatio...
(Slides) P2P video broadcast based on per-peer transcoding and its evaluatio...(Slides) P2P video broadcast based on per-peer transcoding and its evaluatio...
(Slides) P2P video broadcast based on per-peer transcoding and its evaluatio...
 
Optimizing your client's wi fi experience
Optimizing your client's wi fi experienceOptimizing your client's wi fi experience
Optimizing your client's wi fi experience
 
Managing the Mobile Device Wave
Managing the Mobile Device WaveManaging the Mobile Device Wave
Managing the Mobile Device Wave
 
High Performance Communication for Oracle using InfiniBand
High Performance Communication for Oracle using InfiniBandHigh Performance Communication for Oracle using InfiniBand
High Performance Communication for Oracle using InfiniBand
 
Challenges and experiences with IPTV from a network point of view
Challenges and experiences with IPTV from a network point of viewChallenges and experiences with IPTV from a network point of view
Challenges and experiences with IPTV from a network point of view
 
Spectrum management best practices in a Gigabit wireless world
Spectrum management best practices in a Gigabit wireless worldSpectrum management best practices in a Gigabit wireless world
Spectrum management best practices in a Gigabit wireless world
 
Network hardware
Network hardwareNetwork hardware
Network hardware
 
MIS Chapter 3
MIS Chapter 3MIS Chapter 3
MIS Chapter 3
 
Smart Antennas
Smart AntennasSmart Antennas
Smart Antennas
 
Audio/Video Streaming over 802.11
Audio/Video Streaming over 802.11Audio/Video Streaming over 802.11
Audio/Video Streaming over 802.11
 
Getting the best out of WebRTC
Getting the best out of WebRTCGetting the best out of WebRTC
Getting the best out of WebRTC
 
Getting the Best Out Of WebRTC - Astricon 2014
Getting the Best Out Of WebRTC - Astricon 2014Getting the Best Out Of WebRTC - Astricon 2014
Getting the Best Out Of WebRTC - Astricon 2014
 
Aeroscout Random
Aeroscout RandomAeroscout Random
Aeroscout Random
 
My profile
My profileMy profile
My profile
 
Age of Language Models in NLP
Age of Language Models in NLPAge of Language Models in NLP
Age of Language Models in NLP
 
ARM server, The Cy7 Introduction by Aaron Joue, Ambedded Technology
ARM server, The Cy7 Introduction by Aaron Joue, Ambedded TechnologyARM server, The Cy7 Introduction by Aaron Joue, Ambedded Technology
ARM server, The Cy7 Introduction by Aaron Joue, Ambedded Technology
 
NTTドコモ様 導入事例 OpenStack Summit 2016 Barcelona 講演「Expanding and Deepening NTT D...
NTTドコモ様 導入事例 OpenStack Summit 2016 Barcelona 講演「Expanding and Deepening NTT D...NTTドコモ様 導入事例 OpenStack Summit 2016 Barcelona 講演「Expanding and Deepening NTT D...
NTTドコモ様 導入事例 OpenStack Summit 2016 Barcelona 講演「Expanding and Deepening NTT D...
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...ThinkInnovation
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 

Recently uploaded (20)

Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 

Assigning Responsibility for Deteriorations in Video Quality with Henry Milner and Oleg Vasilyev

  • 1. Oleg Vasilyev, Henry Milner, Wensi Fu, Oleg White Assigning Responsibility for Deteriorations in Video Quality
  • 2. …is a video experience management platform that maximizes viewer engagement. We provide quality metrics that give a comprehensive view into online video businesses.
  • 3. Devices & OVPs Internet Video Streaming is Hard Many parties, many paths, but no E2E owner ISPs and Networks CDNs Publishers CDN Cable/DSL ISP Backbone Network CDN Wireless ISP Encoder Video Origin & Operations Home WiFi
  • 4. Devices & OVPs Internet Video Streaming is Hard Many parties, many paths, but no E2E owner ISPs and Networks CDNs Publishers CDN Cable/DSL ISP Backbone Network CDN Wireless ISP Encoder Video Origin & Operations Home WiFi Publisher control ISP control
  • 5. Devices & OVPs ISPs and Networks CDNs Publishers CDN Cable/DSL ISP Backbone Network CDN Wireless ISP Encoder Video Origin & Operations Home WiFi 100 publishers 1,000,000 unique videos (“assets”) 100 CDNs100,000 nodes in internal ISP network 1,000,000 IP addresses 1,000 unique devices The ecosystem is big. …in 10,000,000 video sessions from one week, at one US ISP
  • 6. Viewers are expecting TV-like quality (or better) QoE is Critical to Engagement For both video and advertisement businesses HOW LIKELY ARE YOU TO WATCH FROM THAT SAME PROVIDER AGAIN? 33.6% VERY UNLIKELY 24.8% UNLIKELY 24.6% UNCHANGED 58.4%CHURN RISK 6.6% VERY LIKELY 8% LIKELY -3 -8 -11 -14 -16 2011 2012 2013 2014 2015 ENGAGEMENTREDUCTION (IN MINUTES) WITH 1% INCREASE IN BUFFERING MINS MINS MINS MINS MINS Source: Conviva, “2015Consumer Survey Report”
  • 7. Devices & OVPs ISPs and Networks CDNs Publishers CDN Cable/DSL ISP Backbone Network CDN Wireless ISP Encoder Video Origin & Operations Home WiFi
  • 8. Quality impact Replace internal ISP fiber node No change 0.1% of session spent buffering 2% of session spent buffering
  • 9. Quality impact of this fiber node on session 3 Quality impact of this fiber node on session 2 Quality impact 0.1% of session spent buffering 2% of session spent buffering - Quality impact of this fiber node on this session = Quality impact of this fiber node Quality impact of this fiber node on session 1= 𝚺
  • 10. Quality impact Replace internal ISP fiber node No change 0.1% of session spent buffering 2% of session spent buffering ???
  • 11. Typically better home WiFi Confounding factors 0.1% of session spent buffering 2% of session spent buffering Different fiber node Actual fiber node Actual effect Effect of confounder
  • 12. Devices & OVPs ISPs and Networks CDNs Publishers CDN Cable/DSL ISP Backbone Network CDN Wireless ISP Encoder Video Origin & Operations Home WiFi 100 publishers 1,000,000 unique videos (“assets”) 100 CDNs100,000 nodes in internal ISP network 1,000,000 IP addresses 1,000 unique devices Many potentially-important factors …and confounding factors!
  • 13. Typically better home WiFi Fixing confounding 0.1% of session spent buffering 2% of session spent buffering Different fiber node Actual fiber node
  • 14. Typically better home WiFi Fixing confounding 0.1% of session spent buffering 2% of session spent buffering Different fiber node Actual fiber node Estimate both effects
  • 15. Estimating quality impact Fiber Node = #1 (a good FN) Fiber Node = #567 (actual FN) 1.7% of session spent buffering 2.5% of session spent buffering Model - IP: 1.2.3.4 - Fiber Node: #567 - Asset: “Glee” 1.7% of session spent buffering 2.5% of session spent buffering - Estimated quality impact of this fiber node on this session =
  • 18. Device-level issues are responsible for most short buffering episodes
  • 19. Good and bad IP addresses
  • 20. From worst PCD, typical week:
  • 21. From worst PCD, typical week:
  • 22. Worst assets, typical week: Only 6 really bad assets responsible for strong deterioration of their sessions regardless of other factors. Of these, 4 are sport programs and 2 are obscure regularly scheduled foreign programs.
  • 23. Modeling video quality random forests boosted trees neural networks GLMs Today’s results
  • 24. Model for rebuffering ratio Response: Rebuffering ratio (R) (buffering time / (buffering time + playing time)) Categorical features: IP address (I) Fiber Node (N) Service Group (S) Publisher (P) Asset (A) CDN (C) Device (D) Live / VoD More features: Time, day of week, asset length.
  • 25. Time or no time? Time is strongest or one of strongest features. Big game, popular show – many sessions suffer, regardless of IP and device or even CDN. Time makes model more precise. But: time steals effect from big nodes such as Asset and Publisher. Similar question: Bitrate or no bitrate?
  • 26. Model versions, practical issues Preference: Spark cluster of 10-30 nodes, each node 4-8 cores and 30GB memory. Hopefully 1-3 hours to process 1-3 weeks of data of big ISP, whole US. Nodes as embedded features. Boosted trees on Spark. Whole US. Learn from 1-3 weeks, apply to last week. Nodes as one-hot encoding. Random forest on Spark. Each geographical area is processed separately.
  • 27. Model versions, practical issues Nodes as a trainable embedding first layer. Neural network on single Spark node with Tensorflow. Each geographical area separately. Embedding size is same for all kinds of nodes. Embedding size is ~ log of node's dictionary size. One or two hidden layers above the embedding layer. Adagrad or any other Ada-like optimizer performs better than no-Ada. Slow improvements beyond fast mediocre accuracy.
  • 28. Finding a “good” node Effect of a node = Model( real features ) - Model( features with good node ) In case of trainable embedding – iterative finding of good nodes: good = nodes with lowest avg label while set of good nodes is changing: for each session: Effect = Model(real) - Model(good) good = nodes with lowest avg effect Overlap of set of good nodes with the set from previous iteration - typically stabilizes in 2-3 iterations.
  • 29. Sessions are affected by ... If effect is linear, this would be like ART in 3D tomography
  • 30. Thanks to Spark and Databricks for making it easy for us