Enabling near-VoD via P2P Networks

       Siddhartha Annapureddy
     Saikat Guha, Dinan Gunawardena
    Christos Gkantsidis, Pablo Rodriguez

          World Wide Web, 2007

                                           1
Motivation
 Growing demand for distribution of videos
   Distribution of home-grown videos
   BBC wants to make its documentaries public
   People like sharing movies (in spite of RIAA)
 Video distribution on Internet very popular
   About 35% of the traffic


 What do we want?
   (Near) Video-on-Demand (VoD)
   User waits a little, and starts watching the video
                                                    2
First Generation Approaches
 Multicast support in network
   Hasn’t taken off yet
   Research yielded good ideas
     Pyramid schemes


 Dedicated Infrastructure
   Akamai will do
   Costs big bucks

                                 3
Recent Approaches
BitTorrent Model
  P2P mesh-based system
  Video divided into blocks
  Blocks fetched at random
    Preference for rare blocks


+ High throughput through cooperation
- Cannot watch until entire video fetched
                                            4
Current Approaches

    Distribute from a server farm
      YouTube, Google Video


-   Scalability problems
      Bandwidth costs very high
-   Poor quality to support high demand
-   Censorship constraints
      Not all videos welcome at Google Video
      Costs to run server farm prohibitive
                                               5
Live Streaming Approaches
- Provides a different functionality
    User arriving in middle of a movie cannot
    watch from the beginning

  Live Streaming
    All users interested in same part of video
    Narada, Splitstream, PPLive, CoolStreaming
    P2P system caching windows of relevant
    chunks
                                            6
Challenges for VoD
 Given this motivation for VoD, what are the challenges?

Near VoD
  Small setup time to play videos
High sustainable goodput
  Largest slope osculating the block arrival curve
  Highest video encoding rate system can support

      Goal: Do block scheduling to achieve
      low setup time and high goodput                      7
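One way to read "largest slope osculating the block arrival curve" is as the fastest constant playback rate that never overtakes the in-order arrival of blocks once playback starts after the setup time. The sketch below computes that rate under this reading. The function name, the `max_rate` cap of 1 block/round, and the convention that a block must be present when its playback begins are assumptions for illustration, not details from the talk.

```python
def sustainable_goodput(arrival_round, setup_time, max_rate=1.0):
    """Highest constant playback rate (blocks/round) that never stalls.

    arrival_round[i] is the round in which block i (0-based, in playback
    order) arrived. The list is assumed to be non-empty.
    """
    # A block is usable only once it and all earlier blocks have arrived.
    in_order_ready = []
    latest = arrival_round[0]
    for t in arrival_round:
        latest = max(latest, t)
        in_order_ready.append(latest)

    # The first block must be present when playback starts.
    if in_order_ready[0] > setup_time:
        return 0.0

    rate = max_rate
    for blocks_before, ready in enumerate(in_order_ready):
        if ready > setup_time:
            # Playing `blocks_before` blocks must take at least
            # (ready - setup_time) rounds.
            rate = min(rate, blocks_before / (ready - setup_time))
    return rate
```

With blocks arriving in order every second round, `sustainable_goodput([2, 4, 6, 8], setup_time=2)` gives 0.5 blocks/round. With the first block arriving last, no rate is sustainable until the setup time covers that arrival.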
System Design – Principles
Keep bw load on server to a minimum
  Server connects and gives data only to few users
    Leverage participants to distribute to other users
  Maintenance overhead with users to be kept low


Keep bw load on users to minimum
  Each node knows only a small subset of user
  population (called neighbourhood)
    Bitmaps of blocks


                                                         8
System Design – Outline
Components based on BitTorrent
  Central server (tracker + seed)
  Users interested in the data
Central server
  Maintains a list of nodes

                 Each node finds neighbours
                    Contacts the server for this
                    Another option – Use gossip protocols
                 Exchange data with neighbours
                    Connections are bi-directional      9
Overlay-based Solutions
Tree-based approach
  Use existing research for IP Multicast
  Complexity in maintaining structure


Mesh-based approach
  Simple structure of random graphs
  Robust to churn


                                           10
Simulator – Outline
 All nodes have unit bandwidth capacity
    500 nodes in most of our experiments
    Heterogeneous nodes not described in talk
 Each node has 4-6 neighbours
 Simulator is round-based
    Block exchanges done at each round
    System moves as a whole to the next round
 File divided into 250 blocks
    Segment is a set of consecutive blocks
    Segment size = 10


 Maximum possible goodput – 1 block/round
                                                11
Why 95th percentile?
 [Figure: number of nodes (0 to 500) versus goodput (in blocks/round, 0 to 0.8)]
 (x, y)
    y nodes have a goodput which is at least x.
 Red line at top shows 95th percentile
 95th percentile a good indicator of goodput
    Most nodes have at least that goodput
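The "95th percentile" on this slide can be read as the goodput that at least 95% of the nodes achieve. A minimal sketch of that reading follows; the function name and the ceiling-based rank are assumptions, and the talk does not say exactly how the percentile was computed.

```python
import math

def goodput_met_by_fraction(goodputs, fraction=0.95):
    """Largest goodput x such that at least `fraction` of nodes reach x."""
    ranked = sorted(goodputs, reverse=True)   # best node first
    k = math.ceil(fraction * len(ranked))     # number of nodes that must reach x
    return ranked[k - 1]
```

For 100 nodes with goodputs 0.01, 0.02, ..., 1.00, the function returns 0.06, because exactly 95 nodes have a goodput of at least 0.06.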
                                                                                           12
Feasibility of VoD –
What does block/round mean?
0.6 blocks/round with 35 rounds setup time

Two independent parameters
  1 round ≈ 10 seconds
  1 block/round ≈ 512 kbps

Consider 35 rounds a good setup time
  35 rounds   ≈ 6 min
Goodput = 0.6 blocks/round
  Length of video = 250 blocks / 0.6 blocks/round ≈ 417 rounds ≈ 70 min
  Max encoding rate = 0.6 * 512 > 300 kbps
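The conversions on this slide can be checked with a few lines of arithmetic. The constants come from the slides (10 seconds per round, 512 kbps per block/round, 250 blocks); the function names are only illustrative.

```python
ROUND_SECONDS = 10       # 1 round ≈ 10 seconds
BLOCK_RATE_KBPS = 512    # 1 block/round ≈ 512 kbps
NUM_BLOCKS = 250         # file divided into 250 blocks

def setup_minutes(rounds):
    """Setup time in minutes for a given number of rounds."""
    return rounds * ROUND_SECONDS / 60

def video_minutes(goodput):
    """Playback length in minutes when blocks are consumed at `goodput` blocks/round."""
    return NUM_BLOCKS / goodput * ROUND_SECONDS / 60

def max_encoding_kbps(goodput):
    """Highest encoding rate that a given goodput can sustain."""
    return goodput * BLOCK_RATE_KBPS

print(round(setup_minutes(35), 1))       # about 6 minutes
print(round(video_minutes(0.6), 1))      # about 70 minutes
print(round(max_encoding_kbps(0.6), 1))  # a little over 300 kbps
```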

                                             13
Naïve Scheduling Policies
Random                3   6    8    1   4    2   7   5
Sequential            1   2    3    4   5    6   7   8
Segment-Random        4   1    3    2   6    5   8   7


Segment-Random Policy
  Divide file into segments
  Fetch blocks in random order inside same segment
  Download segments in order.
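The three naïve policies differ only in the order in which a node requests blocks. The sketch below generates one such order per policy; the function names and the use of a seeded `random.Random` are assumptions for illustration, and the 8-block, segment-size-4 setting mirrors the example rows above.

```python
import random

def random_order(num_blocks, rng):
    """Random policy: any permutation of the blocks."""
    order = list(range(1, num_blocks + 1))
    rng.shuffle(order)
    return order

def sequential_order(num_blocks):
    """Sequential policy: blocks in playback order."""
    return list(range(1, num_blocks + 1))

def segment_random_order(num_blocks, segment_size, rng):
    """Segment-Random policy: segments in order, blocks shuffled inside each segment."""
    order = []
    for start in range(1, num_blocks + 1, segment_size):
        segment = list(range(start, min(start + segment_size, num_blocks + 1)))
        rng.shuffle(segment)
        order.extend(segment)
    return order

rng = random.Random(7)
print(random_order(8, rng))
print(sequential_order(8))
print(segment_random_order(8, 4, rng))
```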
                                                     14
Naïve Approaches – Random
 [Figure: goodput (blocks/round, 0 to 1) versus setup time (in rounds, 0 to 100)]
 Each node fetches a block at random
 Throughput – High as nodes fetch disjoint blocks
    More opportunities for block exchanges
 Goodput – Low as nodes don’t get blocks in order
    0 blocks/round

                                                                                                                   15
Naïve Approaches – Sequential
[Figure: goodput (blocks/round, 0 to 1) versus setup time (in rounds, 0 to 100); curves: Sequential, Random]
   Annotation on plot: < 0.25 blocks/round
Each node fetches blocks in order of playback
Throughput – Low as fewer opportunities for exchange
              Increase this                                                                             16
Naïve Approaches – Segment-Random Policy
[Figure: goodput (blocks/round, 0 to 1) versus setup time (in rounds, 0 to 100); curves: Segment-Random, Sequential, Random]
   Annotation on plot: < 0.50 blocks/round
Segment-random
   Fetch random block from the segment the earliest block falls in
Increases the number of blocks in progress
                                                                                                             17
Peaks and Valleys
[Figure: throughput (in blocks, 0 to 400) versus time (in rounds, 50 to 150)]
Throughput is number of block exchanges system-wide
Period between two valleys corresponds to a segment

  Throughput could be improved at two places – Reduce time:
     To obtain last blocks of current segment
     For initial blocks of segment to propagate
                                                                                                                     18
Mountains and Valleys
[Figure: progress (in blocks, 0 to 400) versus time (in rounds, 50 to 150); curves: Segment-Random, Local-rarest]

Local rarest
   Fetch the rarest block in the current segment
   System progress improves 7% (195.47 to 208.61)
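A local-rarest choice can be sketched as follows. Here "rarest" is measured over the block bitmaps of a node's neighbours, which is an assumption based on the neighbourhood bitmaps mentioned earlier in the talk. The function name and the tie-break on the lowest block number are also assumptions.

```python
from collections import Counter

def local_rarest(missing, neighbour_bitmaps, segment_size):
    """Pick the rarest available block in the segment of the earliest missing block.

    missing: set of 1-based block numbers the node still needs (non-empty).
    neighbour_bitmaps: list of sets, one per neighbour, of blocks each holds.
    Returns None when no neighbour holds a block of the current segment.
    """
    current_segment = (min(missing) - 1) // segment_size
    candidates = [b for b in sorted(missing)
                  if (b - 1) // segment_size == current_segment]
    # How many neighbours hold each block.
    counts = Counter(b for bitmap in neighbour_bitmaps for b in bitmap)
    available = [b for b in candidates if counts[b] > 0]
    if not available:
        return None
    return min(available, key=lambda b: (counts[b], b))
```

For example, with blocks 1 to 5 missing, a segment size of 4, and neighbours holding {1, 2, 3}, {1, 2} and {1, 5}, the function returns block 3: it is the only block of the current segment held by a single neighbour.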
                                                                                                            19
Naïve Approaches – Rarest-client
[Figure: goodput (blocks/round, 0 to 1) versus setup time (in rounds, 0 to 100); curves: Segment-Random, Local-rarest]

 How can we improve this further?
 Pre-fetching
   Improve availability of blocks from next segment
                                                                                                             20
Pre-fetching
Random                     3    6     8      1    4    2      7   5
Sequential                 1    2     3      4    5    6      7   8
Segment-Random             4    1     3      2    6    5      8   7
Pre-fetching               1    3     2      5    4    7      6   8
  Pre-fetching
    Fetch blocks from the next segment with a small probability
    Creates seeds for blocks ahead of time
    Trade-off: block is not immediately useful for playback
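The pre-fetching rule can be sketched as a small change to segment-random: with a small probability the node requests a block from the next segment instead of the current one. The function names and the uniform random choice inside a segment are assumptions. The `prefetch_prob` value of 0.2 in the example matches the "80-20" setting used in the evaluation.

```python
import random

def next_block(missing, segment_size, prefetch_prob, rng):
    """Choose the next block to request from the set of missing blocks (1-based)."""
    ordered = sorted(missing)
    current_segment = (ordered[0] - 1) // segment_size
    current = [b for b in ordered
               if (b - 1) // segment_size == current_segment]
    ahead = [b for b in ordered
             if (b - 1) // segment_size == current_segment + 1]
    # With probability prefetch_prob, fetch from the next segment if it has missing blocks.
    if ahead and rng.random() < prefetch_prob:
        return rng.choice(ahead)
    return rng.choice(current)

def fetch_order(num_blocks, segment_size, prefetch_prob, rng):
    """Full request order for one node that always gets the block it asks for."""
    missing = set(range(1, num_blocks + 1))
    order = []
    while missing:
        block = next_block(missing, segment_size, prefetch_prob, rng)
        missing.remove(block)
        order.append(block)
    return order

print(fetch_order(8, 4, 0.2, random.Random(3)))
```

Setting `prefetch_prob` to 0 reduces this to the segment-random policy.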
                                                                      21
Pre-fetching
[Figure: goodput (blocks/round, 0 to 1) versus setup time (in rounds, 0 to 100); curves: 80-20, Local-rarest, Segment-Random]
Pre-fetching with probability 0.2
Throughput:
  9% improvement
  195.47 to 212.21
Get better with Network Coding!!!
                                                                                                            22
NetCoding – How it Helps?
                        A has blocks 1 and 2

                        B gets 1 or 2 with
                        equal prob. from A
                        C gets 1 in parallel

                         If B downloaded 1
                         Link B-C becomes
                         useless
 Network coding routinely sends 1⊕2
                                           23
Network Coding – Mechanics
Coding over blocks $B_1, B_2, \dots, B_n$
Choose coefficients $c_1, c_2, \dots, c_n$ from finite field
Generate encoded block: $E_1 = \sum_{i=1}^{n} c_i B_i$

Without n coded blocks, decoding cannot be done
  Setup time at least the time to fetch a segment
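The encoding step, and the reason decoding needs n coded blocks, can be illustrated with a toy implementation. This sketch uses the prime field of integers modulo 257 purely because it keeps the arithmetic short. The field choice, the function names, and the representation of a block as a list of small integers are assumptions, not the prototype's design.

```python
import random

P = 257  # prime modulus for the toy finite field (an assumption for illustration)

def encode(blocks, rng):
    """Return (coefficients, payload), where payload = sum of c_i * B_i modulo P."""
    coeffs = [rng.randrange(P) for _ in blocks]
    payload = [sum(c * b[j] for c, b in zip(coeffs, blocks)) % P
               for j in range(len(blocks[0]))]
    return coeffs, payload

def decode(coded, n):
    """Recover the n original blocks by Gaussian elimination modulo P.

    Returns None when fewer than n linearly independent coded blocks are given.
    """
    rows = [list(c) + list(p) for c, p in coded]
    for col in range(n):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], P - 2, P)  # modular inverse via Fermat's little theorem
        rows[col] = [(x * inv) % P for x in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(x - factor * y) % P
                           for x, y in zip(rows[r], rows[col])]
    return [rows[i][n:] for i in range(n)]
```

With three original blocks, `decode` returns `None` for any two coded blocks and recovers the originals once three independent coded blocks are available. This is why the setup time cannot be shorter than the time to fetch a segment.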
                                                            24
Benefits of NetCoding
[Figure: goodput (blocks/round, 0 to 1) versus setup time (in rounds, 0 to 100); curves: Netcoding 80-20, Netcoding-segment-random, 80-20]
Throughput:
  39% improvement
  271.20 with NC
  212.21 with pre-fetching and no NC
  195.47 initial
Pre-fetching segments moderately beneficial
                                                                                                              25
Implementation
 With the above insights, built a C# prototype
   25K lines of code


 Rest of talk
   Identify problems when nodes arrive at different
   times, nodes with heterogeneous capacities
   Address these issues with improvements to the
   basic Netcoding scheme
   Note – All evaluation done with the prototype

                                                 26
Segment Scheduling
NetCoding good for fetching blocks in a segment,
but cannot be used across segments
  Because decoding requires all encoded blocks


Problem: How to schedule fetching of segments?
  Until now, sequential segment scheduling
     Segment pre-fetching beneficial
  Results in rare segments when nodes arrive at different
  times

Segment scheduling algorithm to avoid rare
segments
                                                            27
Segment Scheduling
[Diagram: Server, A, and Bs]
A has 75% of the file
Flash crowd of 20 Bs join with empty caches
Server shared between A and Bs
  Throughput to A decreases, as server is busy serving Bs
Initial segments served repeatedly
  Throughput of Bs low because the same segments are fetched from server
                                                   28
Segment Scheduling – Problem

Segment #         1     2     3     4        5   6   7   8
   Server
      A
      Bs

Popularities      2     2     2     2        2   2   1   1

8 segments in the video; A has 75% = 6 segments, Bs have none
Green is popularity 2, Red is popularity 1
    Popularity = # full copies in the system
                                                             29
Segment Scheduling – Problem

Segment #      1   2   3   4   5   6   7   8
  Server
     A
     Bs
Popularities   2   2   2   2   2   2   2
                                       1   1


    Instead of serving A …
                                               30
Segment Scheduling – Problem

Segment #      1   2   3   4   5   6   7   8
  Server
     A
     Bs
Popularities   3   2   2   2   2   2   1   1


    The server gives segments to Bs
                                               31
Segment Scheduling – Problem


[Figure: curves for A and Bs]




A’s goodput plummets
Get the best of both worlds – Improve A and Bs
  Idea: Each node serves only rarest segments in system   32
Segment Scheduling – Algorithm

Segment #      1    2     3    4   5   6   7   8
  Server
     A
     Bs

Popularities   2    2     2    2   2   2   1   1

When dst node connects to the src node…
   Here, dst is B, src is Server
                                                   33
Segment Scheduling – Algorithm

Segment #      1     2    3     4       5   6   7   8
  Server
     A
     Bs
Popularities   2     2    2     2       2   2   1   1

  src node sorts segments in order of popularity
    Segments 7, 8 least popular at 1
    Segments 1-6 equally popular at 2
                                                        34
Segment Scheduling – Algorithm

Segment #       1     2     3     4     5   6   7   8
  Server
     A
     Bs
Popularities    2     2     2     2     2   2   1   1

 src considers segments in sorted order one-by-one,
 and serves dst
   Either completely available at src, or
   First segment required by dst                        35
Segment Scheduling – Algorithm

Segment #       1     2     3    4     5     6     7    8
  Server
     A
     Bs
Popularities    2     2     2    2     2     2     1    1

  Server injects segment needed by A into system
     Avoids wasting bandwidth in serving initial segments multiple
     times
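The selection rule walked through on the preceding slides can be sketched as below. This is one reading of the slides: src walks the segments from least to most popular and serves the first one that it holds completely or that is the first segment dst still needs. Skipping segments dst already has, the tie-break on segment number, and all names are assumptions.

```python
def choose_segment(popularity, src_complete, dst_has, dst_first_needed):
    """Return the segment src should serve to dst, or None if nothing qualifies.

    popularity: dict segment -> number of full copies in the system.
    src_complete: set of segments fully available at src.
    dst_has: set of segments dst already holds.
    dst_first_needed: earliest segment dst still needs for playback.
    """
    for seg in sorted(popularity, key=lambda s: (popularity[s], s)):
        if seg in dst_has:
            continue  # nothing to gain from re-sending it
        if seg in src_complete or seg == dst_first_needed:
            return seg
    return None

# The slides' example: 8 segments, A holds 1-6, Bs hold nothing.
popularity = {1: 2, 2: 2, 3: 2, 4: 2, 5: 2, 6: 2, 7: 1, 8: 1}
server_choice = choose_segment(popularity, set(range(1, 9)), set(), 1)
print(server_choice)  # segment 7, the one A still needs
```

When A is the source instead, segments 7 and 8 are skipped because A does not hold them, and a B that needs segment 1 is served segment 1.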
                                                                36
Segment Scheduling – Popularities
 How does src figure out popularities?
   Centrally available at the server
     Our implementation uses this technique


   Each node maintains popularities for
   segments
     Could use a gossiping protocol for aggregation



                                                37
Segment Scheduling




Note that goodput of both A and Bs improves
                                              38
Topology Management
[Diagram: Server, A, and Bs]
A has 75% of the file
Flash crowd of 20 Bs join with empty caches
Limitations of segment scheduling
   Bs do not benefit from server, as they get blocks for A
   A delayed too because of re-routing of blocks from Bs

    Present an algorithm that avoids these problems
                                                       39
Topology Management – Algorithm
[Diagram: Server, A, and Bs]
Cluster nodes interested in the same part of the video
   Retain connections which have high goodput
When dst approaches src, src serves dst only if
   src has spare capacity, or
   dst is interested in the worst-seeded segment per src’s estimate
      If so, kick a neighbour with low goodput

                                                              40
Topology Management – Algorithm
[Diagram: Server, A, and Bs]
When Bs approach server…
  Server does NOT have spare capacity
  B is NOT interested in the worst-seeded segment per server’s estimate (segment 7)
When dst approaches src, src serves dst only if
  src has spare capacity, or
  dst is interested in the worst-seeded segment per src’s estimate
     If so, kick a neighbour with low goodput
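The admission rule above can be written as a short decision function. The names, the return shape, and the tie-break that picks the lowest-numbered segment among equally rare ones are assumptions. The talk specifies only the two conditions and the eviction of a low-goodput neighbour.

```python
def admit(src_has_spare_capacity, dst_wanted_segment, popularity, neighbour_goodput):
    """Decide whether src accepts dst as a neighbour.

    popularity: dict segment -> src's estimate of how well seeded it is.
    neighbour_goodput: dict current neighbour -> goodput observed on that link.
    Returns (accepted, neighbour_to_evict); the second item is None when
    nobody needs to be dropped.
    """
    if src_has_spare_capacity:
        return True, None
    worst_seeded = min(popularity, key=lambda s: (popularity[s], s))
    if dst_wanted_segment == worst_seeded:
        # Make room by dropping the neighbour with the lowest goodput.
        evicted = min(neighbour_goodput, key=neighbour_goodput.get, default=None)
        return True, evicted
    return False, None
```

In the slide's scenario the server has no spare capacity and the worst-seeded segment is 7. A node B asking for segment 1 is refused, while A asking for segment 7 is accepted and the slowest current neighbour is dropped.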

                                                             41
Topology Management




Note that both A and the Bs see increased goodput
                                                    42
Heterogeneous Capacities
[Diagram: Server, A, and Bs]
Above algorithms do not work well with heterogeneous nodes
A is a slow node
Flash crowd of 20 fast Bs
Bs get choked because A is slow



                                             43
Heterogeneous Capacities
[Diagram: fast and slow nodes]
Server refuses to give initial segments to Bs
   Popularity calculation takes A to be a source
Adjust segment popularity with weight proportional to the capacity of the node
   Initial segments have < 2 popularity, rounded to 1
   Server then gives to Bs
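A capacity-weighted popularity could look like the sketch below. Each full copy counts in proportion to its holder's capacity relative to a reference capacity, and the total is rounded down. Rounding down is an assumption chosen to match "< 2 popularity, rounded to 1". The names and the example capacities are hypothetical.

```python
import math

def weighted_popularity(holders, capacity, reference_capacity):
    """Popularity per segment, weighting each full copy by node capacity.

    holders: dict segment -> list of nodes holding a full copy.
    capacity: dict node -> upload capacity (any consistent unit).
    reference_capacity: capacity that counts as one full source.
    """
    return {
        seg: math.floor(sum(capacity[n] / reference_capacity for n in nodes))
        for seg, nodes in holders.items()
    }

# Hypothetical numbers: A uploads at a quarter of the server's rate.
capacity = {"server": 1.0, "A": 0.25}
holders = {1: ["server", "A"], 7: ["server"]}
print(weighted_popularity(holders, capacity, reference_capacity=1.0))
```

With these numbers segment 1 has weighted popularity 1 instead of 2, so it is no longer treated as better seeded than segment 7 and the server is willing to give it to the Bs.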

                                                   44
Heterogeneous Capacities


[Figure: curves for Fast and Slow nodes]




Note the improvement in the fast/slow nodes
                                              45
Conclusions
 System designed to support near-VoD
 using peer-to-peer networks
   Hitherto, only large file distribution possible

 Three techniques to achieve low setup
 time and sustainable goodput
   NetCoding – Improves throughput
   Segment scheduling – Prevent rare segments
   Topology management – Cluster “similar” nodes

                                               46
Questions?


             47

More Related Content

Viewers also liked

Front Page
Front PageFront Page
Front Pageagelso
 
26º domingo tob 2015 bene pagola
26º domingo tob 2015 bene pagola26º domingo tob 2015 bene pagola
26º domingo tob 2015 bene pagolanuria04
 
80133823 backdor-nectcat-through-smb
80133823 backdor-nectcat-through-smb80133823 backdor-nectcat-through-smb
80133823 backdor-nectcat-through-smbjeweh
 
La riforma scolastica
La riforma scolasticaLa riforma scolastica
La riforma scolasticaandymene
 
SyncNI Magazine Spring 2012
SyncNI Magazine Spring 2012SyncNI Magazine Spring 2012
SyncNI Magazine Spring 2012Mark W. Bennett
 
Fall2010 quinnedu261syllabus
Fall2010 quinnedu261syllabusFall2010 quinnedu261syllabus
Fall2010 quinnedu261syllabusPeggy Quinn
 
Game theory
Game theoryGame theory
Game theorypoundza
 
ScalaDays Amsterdam - Don't block yourself
ScalaDays Amsterdam - Don't block yourselfScalaDays Amsterdam - Don't block yourself
ScalaDays Amsterdam - Don't block yourselfFlavio W. Brasil
 
Aplikasi Skype dalam P&P Teknologi Maklumat tingkatan 4
Aplikasi Skype dalam P&P Teknologi Maklumat tingkatan 4Aplikasi Skype dalam P&P Teknologi Maklumat tingkatan 4
Aplikasi Skype dalam P&P Teknologi Maklumat tingkatan 4Yuyu Wahida
 

Viewers also liked (14)

Front Page
Front PageFront Page
Front Page
 
26º domingo tob 2015 bene pagola
26º domingo tob 2015 bene pagola26º domingo tob 2015 bene pagola
26º domingo tob 2015 bene pagola
 
80133823 backdor-nectcat-through-smb
80133823 backdor-nectcat-through-smb80133823 backdor-nectcat-through-smb
80133823 backdor-nectcat-through-smb
 
La riforma scolastica
La riforma scolasticaLa riforma scolastica
La riforma scolastica
 
Turst
TurstTurst
Turst
 
SyncNI Magazine Spring 2012
SyncNI Magazine Spring 2012SyncNI Magazine Spring 2012
SyncNI Magazine Spring 2012
 
Fall2010 quinnedu261syllabus
Fall2010 quinnedu261syllabusFall2010 quinnedu261syllabus
Fall2010 quinnedu261syllabus
 
The Space Corps Vision
The Space Corps VisionThe Space Corps Vision
The Space Corps Vision
 
Bennett_Resume-2016
Bennett_Resume-2016Bennett_Resume-2016
Bennett_Resume-2016
 
Portafolio 4to corte admon
Portafolio 4to corte admonPortafolio 4to corte admon
Portafolio 4to corte admon
 
Final report mst 5
Final report mst 5Final report mst 5
Final report mst 5
 
Game theory
Game theoryGame theory
Game theory
 
ScalaDays Amsterdam - Don't block yourself
ScalaDays Amsterdam - Don't block yourselfScalaDays Amsterdam - Don't block yourself
ScalaDays Amsterdam - Don't block yourself
 
Aplikasi Skype dalam P&P Teknologi Maklumat tingkatan 4
Aplikasi Skype dalam P&P Teknologi Maklumat tingkatan 4Aplikasi Skype dalam P&P Teknologi Maklumat tingkatan 4
Aplikasi Skype dalam P&P Teknologi Maklumat tingkatan 4
 

Enabling near-VoD via P2P Networks

  • 1. Enabling near-VoD via P2P Networks Siddhartha Annapureddy Saikat Guha, Dinan Gunawardena Christos Gkantsidis, Pablo Rodriguez World Wide Web, 2007 1
  • 2. Motivation Growing demand for distribution of videos Distribution of home-grown videos BBC wants to make its documentaries public People like sharing movies (in spite of RIAA) Video distribution on Internet very popular About 35% of the traffic What we want? (Near) Video-on-Demand (VoD) User waits a little, and starts watching the video 2
  • 3. First Generation Approaches Multicast support in network Hasn’t taken off yet Research yielded good ideas Pyramid schemes Dedicated Infrastructure Akamai will do Costs big bucks 3
  • 4. Recent Approaches BitTorrent Model P2P mesh-based system Video divided into blocks Blocks fetched at random Preference for rare blocks + High throughput through cooperation - Cannot watch until entire video fetched 4
  • 5. Current Approaches Distribute from a server farm YouTube, Google Video - Scalability problems Bandwidth costs very high - Poor quality to support high demand - Censor-ship constraints Not all videos welcome at Google video Costs to run server farm prohibitive 5
  • 6. Live Streaming Approaches - Provides a different functionality User arriving in middle of a movie cannot watch from the beginning Live Streaming All users interested in same part of video Narada, Splitstream, PPLive, CoolStreaming P2P system caching windows of relevant chunks 6
  • 7. Challenges for VoD Given this motivation for VoD, what are the challenges? Near VoD Small setup time to play videos High sustainable goodput Largest slope osculating the block arrival curve Highest video encoding rate system can support Goal: Do block scheduling to achieve low setup time and high goodput 7
  • 8. System Design – Principles Keep bw load on server to a minimum Server connects and gives data only to few users Leverage participants to distribute to other users Maintenance overhead with users to be kept low Keep bw load on users to minimum Each node knows only a small subset of user population (called neighbourhood) Bitmaps of blocks 8
  • 9. System Design – Outline Components based on BitTorrent Central server (tracker + seed) Users interested in the data Central server Maintains a list of nodes Each node finds neighbours Contacts the server for this Another option – Use gossip protocols Exchange data with neighbours Connections are bi-directional 9
  • 10. Overlay-based Solutions Tree-based approach Use existing research for IP Multicast Complexity in maintaining structure Mesh-based approach Simple structure of random graphs Robust to churn 10
  • 11. Simulator – Outline All nodes have unit bandwidth capacity 500 nodes in most of our experiments Heterogeneous nodes not described in talk Each node has 4-6 neighbours Simulator is round-based Block exchanges done at each round System moves as a whole to the next round File divided into 250 blocks Segment is a set of consecutive blocks Segment size = 10 Maximum possible goodput – 1 block/round 11
  • 12. Why 95th percentile? 500 (x, y) 450 y nodes have a 400 goodput which is at 350 least x. 300 Red line at top shows 95th percentile Nodes 250 200 150 95th percentile a good 100 indicator of goodput 50 Most nodes have at 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 least that goodput Goodput (in blocks/round) 12
  • 13. Feasibility of VoD – What does block/round mean? 0.6 blocks/round with 35 rounds setup time Two independent parameters 1 round ≈ 10 seconds 1 block/round ≈ 512 kbps Consider 35 rounds a good setup time 35 rounds ≈ 6 min Goodput = 0.6 blocks/round Size of video = 250 / 0.6 ≈ 70 min Max encoding rate = 0.6 * 512 > 300 kbps 13
  • 14. Naïve Scheduling Policies Random 3 6 8 1 4 2 7 5 Sequential 1 2 3 4 5 6 7 8 Segment-Random 4 1 3 2 6 5 8 7 Segment-Random Policy Divide file into segments Fetch blocks in random order inside same segment Download segments in order. 14
  • 15. Naïve Approaches – Random 1 Each node fetches a 0.9 block at random Goodput (blocks/round) 0.8 0.7 Throughput – High as 0.6 nodes fetch disjoint 0.5 blocks 0.4 More opportunities for 0.3 block exchanges 0.2 0.1 Goodput – Low as nodes 0 0 10 20 30 40 50 60 70 80 90 100 don’t get blocks in order Setup Time (in rounds) 0 blocks/round 15
  • 16. Naïve Approaches – Sequential 1 0.9 Goodput (blocks/round) 0.8 0.7 0.6 < 0.25 blocks/round 0.5 0.4 0.3 Sequential 0.2 Random 0.1 0 0 10 20 30 40 50 60 70 80 90 100 Setup Time (in rounds) Each node fetches blocks in order of playback Throughput – Low as fewer opportunities for exchange Increase this 16
  • 17. Naïve Approaches – Segment-Random Policy Goodput (blocks/round) 1 0.9 0.8 0.7 0.6 < 0.50 blocks/round 0.5 0.4 0.3 Segment-Random Sequential 0.2 Random 0.1 0 0 10 20 30 40 50 60 70 80 90 100 Setup Time (in rounds) Segment-random Fetch random block from the segment the earliest block falls in Increases the number of blocks in progress 17
  • 18. Peaks and Valleys: Throughput is the number of block exchanges system-wide. The period between two valleys corresponds to a segment. Throughput could be improved at two places, by reducing the time (1) to obtain the last blocks of the current segment, and (2) for the initial blocks of a segment to propagate. [Figure: throughput (in blocks) vs. time (in rounds)]
  • 19. Mountains and Valleys: Local rarest: fetch the rarest block in the current segment. System progress improves 7% (195.47 to 208.61). [Figure: progress (in blocks) vs. time (in rounds); curves: Segment-Random, Local-rarest]
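A local-rarest choice could look like the sketch below. It assumes that "rarest" is measured over the blocks held by the node's direct neighbours and that ties are broken by playback order; both are assumptions of this sketch, as are the function names. Blocks are numbered from 0 here.

```python
from collections import Counter


def current_segment(my_blocks, num_blocks, segment_size):
    """Range of block indices in the segment holding the earliest missing block."""
    for block in range(num_blocks):
        if block not in my_blocks:
            start = (block // segment_size) * segment_size
            return range(start, min(start + segment_size, num_blocks))
    return range(0)  # nothing is missing


def local_rarest_block(my_blocks, neighbour_blocks, num_blocks, segment_size):
    """Pick the missing block of the current segment held by the fewest neighbours."""
    counts = Counter()
    for held in neighbour_blocks:
        counts.update(held)
    candidates = [
        b for b in current_segment(my_blocks, num_blocks, segment_size)
        if b not in my_blocks and counts[b] > 0   # must be obtainable from someone
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda b: (counts[b], b))


# Hypothetical neighbourhood: block 2 is held by three neighbours, block 3 by two.
mine = {0, 1}
neighbours = [{2, 3}, {2, 4}, {2, 3, 5}]
choice = local_rarest_block(mine, neighbours, num_blocks=8, segment_size=4)
```

In this example the node requests block 3, the less replicated of the two blocks it still needs from its current segment.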
  • 20. Naïve Approaches – Rarest-client: How can we improve this further? Pre-fetching: improve availability of blocks from the next segment. [Figure: goodput (blocks/round) vs. setup time (in rounds); curves: Segment-Random, Local-rarest]
  • 21. Pre-fetching: Example fetch orders for 8 blocks. Random: 3 6 8 1 4 2 7 5. Sequential: 1 2 3 4 5 6 7 8. Segment-Random: 4 1 3 2 6 5 8 7. Pre-fetching: 1 3 2 5 4 7 6 8. Pre-fetching: fetch blocks from the next segment with a small probability. This creates seeds for blocks ahead of time. Trade-off: the block is not immediately useful for playback.
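The pre-fetching decision can be sketched as a single biased coin flip per request. The probability 0.2 is the value quoted for the 80-20 experiments in this talk; the function name and the rule that the last segment never pre-fetches are assumptions of this sketch.

```python
import random


def choose_segment(current_segment, num_segments, prefetch_prob, rng):
    """Return the index of the segment to fetch a block from in this request."""
    next_segment = current_segment + 1
    # With a small probability, help seed the next segment ahead of time.
    if next_segment < num_segments and rng.random() < prefetch_prob:
        return next_segment
    return current_segment


rng = random.Random(0)
picks = [choose_segment(3, 25, 0.2, rng) for _ in range(10_000)]
prefetch_fraction = picks.count(4) / len(picks)
print(f"fraction of requests spent on the next segment: {prefetch_fraction:.3f}")
```

Roughly one request in five goes to the next segment, which is the price paid in blocks that are not immediately useful for playback.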
  • 22. Pre-fetching: Pre-fetching with probability 0.2 (the 80-20 policy). Throughput: 9% improvement (195.47 to 212.21). Gets better with network coding! [Figure: goodput (blocks/round) vs. setup time (in rounds); curves: 80-20, Local-rarest, Segment-Random]
  • 23. NetCoding – How it Helps? A has blocks 1 and 2. B gets 1 or 2 with equal probability from A, while C gets 1 in parallel. If B downloaded 1, the link B-C becomes useless. Network coding routinely sends 1⊕2.
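The A, B, C example can be replayed with byte strings. This sketch uses plain XOR, the two-block special case shown on the slide; the block contents are made up.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


block1 = b"AAAA"   # hypothetical contents of block 1
block2 = b"BBBB"   # hypothetical contents of block 2

# Without coding, B may end up with block 1, which C already holds:
# the B-C link then carries nothing new for C.
b_plain = block1
useful_without_coding = b_plain != block1            # False

# With coding, A sends B the combination 1 XOR 2 instead.
b_coded = xor_blocks(block1, block2)
# C already holds block 1, so it can recover block 2 from what B forwards.
c_recovered = xor_blocks(b_coded, block1)
```

Whichever single block C already holds, the coded block from B lets it obtain the other one.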
  • 24. Network Coding – Mechanics: Coding over blocks $B_1, B_2, \dots, B_n$. Choose coefficients $c_1, c_2, \dots, c_n$ from a finite field. Generate encoded block: $E_1 = \sum_{i=1}^{n} c_i B_i$. Without n coded blocks, decoding cannot be done, so the setup time is at least the time to fetch a segment.
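The encode and decode steps can be sketched as below. The slide does not say which finite field is used, so this sketch picks the small prime field of integers modulo 257 purely for illustration; the function names and the sample data are hypothetical. Each block is a list of symbols in the range 0 to 255, and decoding is Gauss-Jordan elimination on the coefficient matrix.

```python
import random

P = 257  # prime modulus; an illustrative choice, not taken from the talk


def encode(blocks, rng):
    """Return (coefficients, coded_block) for one random linear combination."""
    coeffs = [rng.randrange(P) for _ in blocks]
    length = len(blocks[0])
    coded = [sum(c * b[j] for c, b in zip(coeffs, blocks)) % P for j in range(length)]
    return coeffs, coded


def decode(coded_blocks, n):
    """Recover the n original blocks, or None without n independent combinations."""
    rows = [list(c) + list(e) for c, e in coded_blocks]   # augmented matrix [C | E]
    for col in range(n):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            return None                                    # rank is below n
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, P)                   # modular inverse, Python 3.8+
        rows[col] = [(v * inv) % P for v in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [(v - factor * w) % P for v, w in zip(rows[r], rows[col])]
    return [rows[i][n:] for i in range(n)]


rng = random.Random(1)
originals = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # a 3-block "segment"
received = []
while decode(received, 3) is None:               # keep fetching coded blocks
    received.append(encode(originals, rng))
decoded = decode(received, 3)
```

The loop makes the constraint on the slide concrete: with fewer than n coded blocks `decode` always returns `None`, so playback of a segment cannot start before the whole segment's worth of coded blocks has arrived.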
  • 25. Benefits of NetCoding: Throughput: 39% improvement (271.20 with NC; 212.21 with pre-fetching and no NC; 195.47 initial). Pre-fetching segments is moderately beneficial. [Figure: goodput (blocks/round) vs. setup time (in rounds); curves: Netcoding 80-20, Netcoding-segment-random, 80-20]
  • 26. Implementation: With the above insights, we built a C# prototype (25K lines of code). Rest of talk: identify problems when nodes arrive at different times and when nodes have heterogeneous capacities; address these issues with improvements to the basic Netcoding scheme. Note: all evaluation is done with the prototype.
  • 27. Segment Scheduling: NetCoding is good for fetching blocks in a segment, but cannot be used across segments, because decoding requires all encoded blocks. Problem: how to schedule the fetching of segments? Until now, segments were scheduled sequentially, and segment pre-fetching was beneficial. Sequential scheduling results in rare segments when nodes arrive at different times. We need a segment scheduling algorithm that avoids rare segments.
  • 28. Segment Scheduling: A has 75% of the file. A flash crowd of 20 Bs joins with empty caches. The server is shared between A and the Bs. Throughput to A decreases, as the server is busy serving the Bs. Initial segments are served repeatedly. Throughput of the Bs is low because the same segments are fetched from the server. [Figure: topology with Server, A and Bs]
  • 29. Segment Scheduling – Problem: Popularities of segments 1-8: 2 2 2 2 2 2 1 1. There are 8 segments in the video; A has 75% = 6 segments, the Bs have none. Green is popularity 2, red is popularity 1. Popularity = number of full copies in the system. [Figure: per-segment holdings of Server, A and Bs]
  • 30. Segment Scheduling – Problem: Popularities: 2 2 2 2 2 2 2 1 1. Instead of serving A … [Figure: per-segment holdings of Server, A and Bs]
  • 31. Segment Scheduling – Problem: Popularities of segments 1-8: 3 2 2 2 2 2 1 1. The server gives segments to the Bs. [Figure: per-segment holdings of Server, A and Bs]
  • 32. Segment Scheduling – Problem: A's goodput plummets. Get the best of both worlds: improve both A and the Bs. Idea: each node serves only the rarest segments in the system. [Figure: goodput of A and Bs]
  • 33. Segment Scheduling – Algorithm: Popularities of segments 1-8: 2 2 2 2 2 2 1 1. When the dst node connects to the src node… Here, dst is B and src is the server.
  • 34. Segment Scheduling – Algorithm: Popularities of segments 1-8: 2 2 2 2 2 2 1 1. The src node sorts segments in order of popularity. Segments 7 and 8 are least popular at 1; segments 1-6 are equally popular at 2.
  • 35. Segment Scheduling – Algorithm: Popularities of segments 1-8: 2 2 2 2 2 2 1 1. src considers segments in sorted order one-by-one, and serves dst a segment that is either completely available at src, or is the first segment required by dst.
  • 36. Segment Scheduling – Algorithm: Popularities of segments 1-8: 2 2 2 2 2 2 1 1. The server injects the segment needed by A into the system. This avoids wasting bandwidth on serving the initial segments multiple times.
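The selection rule from the algorithm slides can be sketched as follows. This is one reading of the slides, not the prototype's code: the function name, the data layout, the tie-break by playback order, and the requirement that src hold at least part of dst's first required segment are all assumptions of this sketch.

```python
def pick_segment(src_complete, src_partial, dst_complete, popularity):
    """Choose the segment that src serves to dst, or None if there is none.

    src_complete: segments src holds in full
    src_partial:  segments src holds only some blocks of
    dst_complete: segments dst already holds in full
    popularity:   segment -> number of full copies in the system
    """
    needed = [s for s in sorted(popularity) if s not in dst_complete]
    if not needed:
        return None
    first_needed = needed[0]
    # Least popular first; ties are broken by playback order (assumption).
    for seg in sorted(needed, key=lambda s: (popularity[s], s)):
        if seg in src_complete:
            return seg
        if seg == first_needed and seg in src_partial:
            return seg
    return None


# Scenario from the slides: server has all 8 segments, A has 1-6, Bs have none.
popularity = {1: 2, 2: 2, 3: 2, 4: 2, 5: 2, 6: 2, 7: 1, 8: 1}
server = set(range(1, 9))
to_b = pick_segment(server, set(), set(), popularity)            # dst is a B
to_a = pick_segment(server, set(), set(range(1, 7)), popularity) # dst is A
```

In both cases the server picks segment 7, so the copy it injects for a B is exactly the segment A is waiting for, rather than yet another copy of segment 1.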
  • 37. Segment Scheduling – Popularities: How does src figure out popularities? Option 1: centrally available at the server (our implementation uses this technique). Option 2: each node maintains popularities for segments, and could use a gossiping protocol for aggregation.
  • 38. Segment Scheduling: Note that the goodput of both A and the Bs improves.
  • 39. Topology Management: A has 75% of the file. A flash crowd of 20 Bs joins with empty caches. Limitations of segment scheduling: the Bs do not benefit from the server, as they get blocks for A; A is delayed too, because of the re-routing of blocks from the Bs. We present an algorithm that avoids these problems. [Figure: topology with Server, A and Bs]
  • 40. Topology Management – Algorithm: Cluster nodes interested in the same part of the video. Retain connections which have high goodput. When dst approaches src, src serves dst only if src has spare capacity, or dst is interested in the worst-seeded segment per src's estimate. If so, kick a neighbour with low goodput. [Figure: topology with Server, A and Bs]
  • 41. Topology Management – Algorithm: When the Bs approach the server, the server does NOT have spare capacity, and B is NOT interested in the worst-seeded segment per the server's estimate (segment 7). Rule: when dst approaches src, src serves dst only if src has spare capacity, or dst is interested in the worst-seeded segment per src's estimate. If so, kick a neighbour with low goodput. [Figure: topology with Server, A and Bs]
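The admission rule can be sketched as a small function. It assumes that a neighbour is dropped only when src accepts dst without having spare capacity, and that the neighbour with the lowest goodput is the one dropped; the slides leave both points open. All names are hypothetical.

```python
def admit(src_spare_slots, dst_wanted_segment, popularity, neighbour_goodput):
    """Decide whether src serves dst; return (accept, neighbour_to_drop)."""
    if src_spare_slots > 0:
        return True, None                         # room for one more connection
    # Worst-seeded segment according to src's own popularity estimate.
    worst_seeded = min(popularity, key=lambda s: (popularity[s], s))
    if dst_wanted_segment == worst_seeded:
        victim = None
        if neighbour_goodput:
            victim = min(neighbour_goodput, key=neighbour_goodput.get)
        return True, victim                       # make room by dropping victim
    return False, None


# Scenario from the slides: the server is full, and segment 7 is worst seeded.
popularity = {1: 2, 2: 2, 3: 2, 4: 2, 5: 2, 6: 2, 7: 1, 8: 1}
server_neighbours = {"n1": 0.9, "n2": 0.2, "n3": 0.6}   # made-up goodputs
b_result = admit(0, 1, popularity, server_neighbours)   # a B wants segment 1
a_result = admit(0, 7, popularity, server_neighbours)   # A wants segment 7
```

Here a B is turned away by the server and has to find other peers interested in the same early segments, while A is accepted at the cost of the server's slowest connection.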
  • 42. Topology Management: Note that both A and the Bs see increased goodput.
  • 43. Heterogeneous Capacities: The above algorithms do not work well with heterogeneous nodes. A is a slow node. A flash crowd of 20 fast Bs joins. The Bs get choked because A is slow. [Figure: topology with Server, A and Bs]
  • 44. Heterogeneous Capacities: The server refuses to give the initial segments to the Bs, because the popularity calculation takes A to be a source. Fix: adjust segment popularity with a weight proportional to the capacity of the node. The initial segments then have popularity < 2, which is rounded to 1, and the server gives them to the Bs. [Figure: goodput of fast and slow nodes]
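A capacity-weighted popularity count might look like the sketch below. The slide only says that the weight is proportional to node capacity and that a value below 2 is rounded to 1, so this sketch assumes capacities normalised so that the fastest node counts as 1.0 and rounds down; the node names and the capacity of 0.25 for A are made up.

```python
import math


def weighted_popularity(holders, capacity, num_segments):
    """Popularity of each segment, counting each full copy by its holder's capacity.

    holders:  node -> set of segments the node holds in full
    capacity: node -> relative upload capacity (fastest node = 1.0)
    """
    popularity = {}
    for seg in range(1, num_segments + 1):
        weight = sum(capacity[node] for node, segs in holders.items() if seg in segs)
        popularity[seg] = math.floor(weight)      # rounding down is an assumption
    return popularity


holders = {"server": set(range(1, 9)), "A": set(range(1, 7))}
unweighted = weighted_popularity(holders, {"server": 1.0, "A": 1.0}, 8)
weighted = weighted_popularity(holders, {"server": 1.0, "A": 0.25}, 8)
```

With equal weights, segments 1-6 have popularity 2 and the server avoids them; once A counts as only a quarter of a source, every segment has popularity 1 and the initial segments become eligible to be served to the Bs again.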
  • 45. Heterogeneous Capacities: Note the improvement in the fast/slow nodes. [Figure: goodput of fast and slow nodes]
  • 46. Conclusions: System designed to support near-VoD using peer-to-peer networks; hitherto, only large file distribution was possible. Three techniques to achieve low setup time and sustainable goodput: NetCoding (improves throughput), segment scheduling (prevents rare blocks), and topology management (clusters “similar” nodes).