Nishanth Sastry
King’s College London
Attack of the Content Clones: Saving the Internet from On-demand Video Streaming
Plan for the talk
• Content Delivery: a brief introduction
• Analysis of drawbacks of current systems
• The CD-GAIN take on content delivery
http://bit.ly/cd-gain
Some statistics
• Cisco Visual Networking Index 2013-18:
– By 2018, videos will be 79% of all traffic
*Currently they comprise 66% of all traffic
• Netflix alone is 33% of peak time US traffic
• 44% of UK households using BBC iPlayer
Introducing BBC iPlayer
• 9 months: May 2013 – Jan 2014
• 1.9 Billion sessions
• In one representative month:
– 32 Million users/devices
– 20 Million IP addresses
• In London alone, in one month:
– 1.26 Million IPs
– 2.15 Million users/devices
How do Netflix/YouTube/iPlayer scale?
If the Internet connection is the bottleneck, bypass it by replicating!
Multihoming
Content Delivery Network
If the Internet connection is the bottleneck, bypass it by replicating!
Replication has fundamentally changed the Internet’s structure
2007: Traditional Internet Model
Replication has fundamentally changed the Internet’s structure
C. Labovitz et al., Internet Inter-domain Traffic. Proc. SIGCOMM 2010
2009: A New Internet Model (source: p. 15, Labovitz et al., SIGCOMM 2010)
• Flatter and much more densely interconnected Internet
• Disintermediation between content and “eyeball” networks
• New commercial models between content, consumer and transit: settlement-free peering, pay for BW, pay for access BW
Problem solved! But is the solution right?
1. No longer an “Internet” of connected nets
– Have hyper-giants become “too big to fail”?
Problem solved! But is the solution right?
2. Distributed systems are hard to engineer:
– Consistency
– Failover
– …
Global-scale distributed systems are extremely hard!
3. Global replica infrastructure is expensive
– Content providers need to pay the hyper-giants
– Or… be hyper-giants themselves
Problem solved! But is the solution right?
4. Even CDNs don’t cover the whole globe: performance and cost diverge by region
HH Liu et al., Optimizing cost and performance for content multihoming. SIGCOMM 2012
Problem solved! But is the solution right?
5. Misses opportunities for local sharing!
Taking stock with TV content
How did we consume content before?
How do we consume content now?
What can we learn from what we see?
How did we watch TV before?
http://www.watfordobserver.co.uk/nostalgia/memories/10099510.Coronation_treat_as_community_gathers_around_the_only_TV/
Today, TV is just another “app”
What changed: Push → Pull
• Superficially: the audience-to-TV-set ratio has decreased
• At a fundamental level:
– Audience per “broadcast” is lower
– “Broadcast” time is chosen by the consumer
• Traditional mass media pushed content to the consumer
• The current dominant model has changed to pull
But people have not changed!
New Directions for Content Delivery
1. A select few items become globally popular
Can we exploit redundancy using P2P?
2. …but individual users may have favourites
Can we predict user quirks/favourites and
personalise content delivery?
3. What if we could in fact change users?
Can we “nudge” user behaviour and make content
delivery cheaper for all?
1. Can we exploit redundancy with peer-assistance?
P2P works at scale for long-duration content such as TV under an “online while you watch” model.
P2P-assisted content delivery: looks good, but the details are important!
Simple model – augmenting traditional delivery:
• Server-based content delivery as the mainstay
• Shift seamlessly to P2P as more users join
• Peer availability offloads traffic from the provider!
? Will there be enough peers in swarms?
– Peer arrivals may be asynchronous
– Peers may not participate in uploads
? Can P2P swarms be ISP-friendly and local…
– …and still work well?
Swarm fragmentation factors
• ISP friendliness
• Bitrate stratification
• Partial participation
• Limited upload bandwidth
Traffic offloading gain as a function of peer availability (swarm capacity, c)
Model swarms as an infinite-server queue (extending Menasche et al., CoNEXT 2009)
• Server load increases with the number of users…
• …until the swarm has one user on average
• Subsequent increases in load decrease server traffic as the swarm takes over! (a sketch of the intuition follows below)
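To make that intuition concrete, here is a minimal sketch of such a model. The Poisson assumptions and notation below are mine, not necessarily the exact formulation from the talk or from Menasche et al. Suppose requests arrive as a Poisson process at rate \lambda and each viewer stays in the swarm for a mean time 1/\mu, so the swarm holds c = \lambda/\mu concurrent users on average and is empty with probability e^{-c}. If the server must step in only when the swarm is empty, the expected server rate is roughly

    R(\lambda) \approx \lambda\, e^{-\lambda/\mu}, \qquad \frac{dR}{d\lambda} = e^{-c}\,(1 - c), \qquad c = \frac{\lambda}{\mu}

R peaks exactly at c = 1, i.e., when the swarm has one user on average; beyond that, each extra arrival makes the swarm more self-sufficient and server traffic falls, as the bullets above describe.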
Let’s test on real data from London
Gains in swarms fragmented by ISP locality & bitrate stratification
Why does fragmentation do no harm?
Top 8 ISPs = 70% of traffic; top 2 bitrates = 70% of sessions
“Online while you watch” model critical for ensuring availability
ISP-friendly P2P is also greener because of fewer hops to the replica!
Carbon savings of P2P over CDN for one ISP’s topology (an illustrative toy model follows below)
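As an illustrative toy model of why fewer hops is greener (my own back-of-the-envelope, not the closed-form solution referred to in the talk): if a bit served from a CDN data centre traverses h_{\mathrm{CDN}} router hops while a local peer sits h_{\mathrm{P2P}} < h_{\mathrm{CDN}} hops away, and a swarm of capacity c serves a fraction (1 - e^{-c}) of requests, the per-bit network-energy saving is roughly

    S \approx \left(1 - e^{-c}\right) \left(1 - \frac{h_{\mathrm{P2P}}}{h_{\mathrm{CDN}}}\right)

Data-centre overheads avoided by serving from peers would typically push the savings higher still.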
2. Can we personalise content delivery?
Users are highly predictable.
Simple analytics can offload traffic and
decrease carbon footprint
Why iPlayer, not DVRs?
• DVRs have >50% penetration in US, UK
• Many (e.g. YouView) don’t need cable
• Could also use TV tuner and record on laptop
Because people don’t remember to record!
(Understanding and Decreasing the Network Footprint of Catch-up TV, WWW 2013)
Can we help users record what they want to watch?
Speculative Content Offloading and Recording Engine (SCORE)
Caching at the very edge completely offloads traffic!
Which features to use? (I)
• BBC proposes, consumer disposes!
• Serials: ~50% of the content corpus, but 80% of watched content!
Which features to use? (II)
Which features to use? (III)
SCORE = predictor + optimiser
• Predict using the user’s affinity for (see the sketch below):
– Serials: episodes of the same programme
– Favourite genres
• Optimise for decreasing traffic or carbon footprint
– Decreasing carbon decreases traffic, but not vice versa
– It turns out we only take a 5–15% hit by focusing on carbon
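To give a flavour of the predictor-plus-optimiser split, here is a minimal sketch; the function names, weights, and greedy strategy are my assumptions for illustration, not SCORE’s actual implementation. It scores each upcoming programme by the user’s affinity for its serial and genre, then greedily fills a storage budget:

    from collections import Counter

    def train_affinities(history):
        """history: list of (serial_id, genre) pairs the user has watched."""
        total = max(len(history), 1)
        # Normalise watch counts into [0, 1] affinities.
        serial_aff = {s: n / total for s, n in Counter(s for s, _ in history).items()}
        genre_aff = {g: n / total for g, n in Counter(g for _, g in history).items()}
        return serial_aff, genre_aff

    def score(prog, serial_aff, genre_aff, w_serial=0.7, w_genre=0.3):
        """prog: (serial_id, genre, size_gb). The weights are illustrative guesses."""
        serial_id, genre, _ = prog
        return w_serial * serial_aff.get(serial_id, 0.0) + w_genre * genre_aff.get(genre, 0.0)

    def plan_recordings(upcoming, history, budget_gb=50.0):
        """Greedily record the highest-affinity programmes that fit on disk."""
        serial_aff, genre_aff = train_affinities(history)
        ranked = sorted(upcoming, key=lambda p: score(p, serial_aff, genre_aff), reverse=True)
        plan, used = [], 0.0
        for prog in ranked:
            if score(prog, serial_aff, genre_aff) > 0 and used + prog[2] <= budget_gb:
                plan.append(prog)
                used += prog[2]
        return plan

Swapping the greedy objective from bytes saved to estimated energy saved is what a carbon-focused optimiser would do; per the slide above, that costs only a 5–15% hit relative to optimising traffic directly.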
Performance evaluation
Compare SCORE against an Oracle that knows future requests.
The Oracle saves:
• Up to 97% of traffic
• Up to 74% of energy
• Savings are relatively insensitive to the choice of energy model parameters
• SCORE: ~40–60% of the Oracle’s savings (more for energy than for traffic optimisation)
Not all of these savings come from predicting popular content
• Indiscriminately recording the top n shows can lead to negative energy savings!
• A personalised approach is necessary, despite the popularity of “prime time” content
3. ‘Nudging’ user behaviour
Decrease content delivery costs by asking users to “go easy” on the infrastructure
What is ‘nudging’?
Current mindset: the user is king – operators/providers attempt to satisfy all user accesses.
Idea: ‘nudge’ users towards behaviours better suited to the network!
Passive nudging
Give users the flexibility to choose: on-demand!
Active nudging
• Time-shift users’ access patterns, e.g., lower prices for off-peak access
• Space-shift users’ accesses to a different ISP, e.g., move smartphones from 3G → WiFi (applying SCORE to smartphones)
• Content-shifting: suggest alternate items for users to watch, based on cache contents! (a sketch follows below)
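A minimal sketch of the content-shifting idea (the similarity function and the threshold are my placeholders, not a specification): on a local cache miss, offer the most similar cached item before falling back to a backhaul fetch.

    def suggest_alternative(request, cache, similarity, threshold=0.8):
        """request: metadata of the item the user asked for.
        cache: iterable of locally cached items' metadata.
        similarity: function scoring substitute acceptability in [0, 1].
        Returns (item, source), where source is 'cache' (a nudge) or 'origin'."""
        best = max(cache, key=lambda item: similarity(request, item), default=None)
        if best is not None and similarity(request, best) >= threshold:
            return best, "cache"    # nudge: serve a similar item locally
        return request, "origin"    # no good substitute: fetch from the provider

The same serial and genre affinities used by SCORE above would be a natural similarity signal.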
Digital Media Convergence:
Remember the hype?
Good News: it has happened
CD-GAIN: New directions for a content-centric Internet
1. Can we exploit redundancy using P2P?
– YES, but “online while you watch” is critical
2. Can we predict user quirks/favourites and
personalise content delivery?
– YES, Speculative Content Offloading and
Recording Engine (SCORE)
3. Can we “nudge” user behaviour and make
content delivery cheaper for all?
Guiding principles
1. Cache as close to user as possible
2. Increase cache reuse by any means!
3. Decrease peak usage: infrastructure can be provisioned for a smaller load
• Average use can increase (speculative traffic is fine!) – a numeric illustration follows below
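As a hedged numeric illustration of principle 3 (the numbers are invented): if the evening peak is 10 Gbps against a 4 Gbps daily average, capacity must be provisioned for 10 Gbps at 40% utilisation. Pre-recording speculative content overnight might raise the average to 6 Gbps while capping the peak at 7 Gbps:

    \text{utilisation} = \frac{\text{average}}{\text{peak}}: \quad \frac{4}{10} = 40\% \;\rightarrow\; \frac{6}{7} \approx 86\%

so less capacity is provisioned even though total traffic grew.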
Attack of the Content Clones: Saving the Internet from On-demand Video Streaming
http://bit.ly/cd-gain
Nishanth Sastry
King’s College London
Joint work with:
Mustafa Al-Bassam, King’s College London
Jigna Chandaria, BBC R&D
Jon Crowcroft, U. Cambridge
Nick Feamster, Georgia Tech
Dmytro Karamshuk, King’s College London
Richard Mortier, Uni. Nottingham
Gianfranco Nencioni, Uni. Pisa
Andy Secker, BBC R&D
Gareth Tyson, Queen Mary London


Editor's Notes

  • #2 On-demand video streaming is undoubtedly one of the biggest successes of the past few years. What used to be a text- or HTML-oriented Internet about 10 years ago has now become overrun by rich media content clones from the likes of YouTube. Despite this success, delivering rich media content is still a complex and expensive affair. Today I want to give you an overview of our work with the BBC, one of the largest content providers in the UK, on how to decrease their network footprint.
  • #7 At this point, one ISP is not sufficient for your data needs. You also want to make sure you are not reliant on one cable to reach all your customers.
  • #8 What do you do beyond that? Replicate globally. CDNs do this for you if you hire them. This recipe is so successful that all major content providers I know of follow it; in fact, it has changed the very structure of the Internet.
  • #10 What is really surprising is that this happened in just a couple of years. In 2007 we had the traditional Internet; then Google deployed its dark-fibre infrastructure and the Google CDN, and 2009 saw Google as the biggest ISP in the world.
  • #11 So we have a recipe so successful that it has undone 30 years of Internet history in two years – we have cracked the problem, right? But what we have created are entities that, like banks, have become too big to fail.
  • #13 If you are distributing in China, ChinaCache is better. South America is only covered by CloudFront. Everywhere in the world is not the same as Europe and North America.
  • #14 Architecturally there are issues as well:
  • #15 Let us see if we can do a better job with the TV content we have. We’ll take a historical perspective.
  • #18 Content delivery was easy and cheap before because it was broadcast-based: the cost of a broadcast amortises across millions of viewers. Now, if you have a viral hit on YouTube, each user creates her own stream – load increases linearly.
  • #28 Here we take the same popular content item we saw before, and ask what the gains would be if users left after downloading. Obviously, peers who can download faster can leave faster – so something like BitTorrent would not work if you have selfish hosts with high broadband speeds. P2P was killed by the success of P2P streaming. But think also of the roles of the “selfish” downloads model and increasing broadband speeds!
  • #29 Given a particular topology, we can calculate savings; we have a closed-form solution for the energy savings S in terms of the capacity of the swarm and m. And for one ISP, who gave us their exact topology, we can calculate the carbon-footprint reduction for the typical swarm sizes we observe.
  • #30 Savings depend on the energy efficiency of the data centres used by the traditional CDN. As the number of peers in a swarm increases, the average number of hops decreases and the savings are higher.
  • #34 But what to record? Our MO is simple: look at the content corpus and see which parts are heavily used. This tells us what users prefer to watch.