Published in: Technology, Education
  • The very large number of simultaneous online users demonstrates the scalability of the data-driven protocol.
  • A previous empirical study has shown that “rarest-first” is one of the most efficient strategies in data dissemination.
  • A block that is in danger of being delayed beyond its deadline should have higher priority than one that is just entering the exchanging window.
  • Transcript

    • 1. On the Optimal Scheduling for Media Streaming in Data-driven Overlay Networks. Meng ZHANG, with Yongqiang XIONG, Qian ZHANG, Shiqiang YANG. Globecom 2006
    • 2. Outline
      • Background
      • Related Work
      • Problem Statement and Formulation
      • Global Optimal Solution
      • Distributed Algorithm
      • Performance Evaluation
      • Conclusion & Future Work
    • 3. Background
      • The Internet has witnessed a rapid growth in deployment of data-driven (swarming based) overlay/peer-to-peer network based IPTV systems during recent years.
      • These products are based on the data-driven protocol
      • Reported numbers of concurrent online users
        • GridMedia (developed by our lab): over 230,000 users at 310 kbps, achieved with one server
        • PPLive: 500,000 users at 300-500 kbps
        • QQLive: 1,460,000 users at 300-500 kbps (more than one server)
    • 4. Background - Data-Driven Protocol Review
      • Aiming to enable large-scale live broadcasting in the Internet environment
      • Very simple and very similar to the BitTorrent protocol
      • Two steps in data-driven protocol
        • The overlay construction
        • The block scheduling
    • 5. Background - Data-Driven Protocol Review
      • The first step – overlay construction
        • All the nodes self-organize into a random graph
      [Figure: neighboring nodes announce the blocks they hold (e.g. "I have block 1,2,4"), request the blocks they are missing (e.g. "Request block 3"), and send the requested blocks (e.g. "Send block 3").]
      • The second step – block scheduling
        • The streaming is divided into blocks
        • Each node has a sliding window containing all the blocks it is interested in currently
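The sliding window and the "I have block ..." announcements described on this slide can be sketched as follows. This is a minimal illustration, not the protocol's actual data structures: the `Node` class, its field names, and the use of integer block ids are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical peer: holds some blocks and tracks a sliding window."""
    playout_pos: int   # id of the block currently being played out
    window_size: int   # number of blocks covered by the sliding window
    held: set = field(default_factory=set)

    def window(self):
        # Blocks the node is currently interested in.
        return range(self.playout_pos, self.playout_pos + self.window_size)

    def absent_blocks(self):
        # Blocks inside the window that the node does not hold yet.
        return [b for b in self.window() if b not in self.held]

    def buffer_map(self):
        # Availability announcement sent to neighbors ("I have block ...").
        return sorted(b for b in self.held if b in self.window())


node = Node(playout_pos=1, window_size=4, held={1, 2, 4})
print(node.buffer_map())     # blocks announced to neighbors
print(node.absent_blocks())  # blocks the node still has to request
```

A node in this sketch would send `buffer_map()` to its neighbors and pick a sender for each entry of `absent_blocks()`, which is the scheduling decision the rest of the talk is about.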
    • 6. Related Work
      • To improve the data-driven protocol, most recent efforts focus on optimizing overlay construction (i.e., the first step):
        • Vishnumurthy & Francis (INFOCOM2006): random graph building under heterogeneous overlay
        • Liang & Nahrstedt (INFOCOM2006): propose RandPeer, a peer-to-peer QoS-sensitive membership management protocol
    • 7. Related Work
      • A problem that has not been well addressed is how to optimize the second step, that is,
        • how to do optimal block scheduling and maximize the throughput of the data-driven protocol under a constructed overlay
      • Most existing methods are straightforward and ad hoc
        • Chainsaw: pure random way
        • DONet: greedy local rarest-first
        • PALS: round-robin method
    • 8. Problem Statement and Formulation
      • How to do optimal scheduling to maximize the throughput of the whole overlay?
      • The real situation is more complicated because different blocks may have different importance and the bottlenecks are not only at the last mile.
      • Our basic approach:
        • Assign priorities to blocks according to their importance
        • Maximize the sum of priorities of all requested blocks
      [Figure: with the Local Rarest First (LRF) strategy, some requests congest at node 1 and the throughput is 4; with optimal scheduling, the throughput gain is 25%.]
    • 9. Problem Statement and Formulation - Priority Definition
      • We use two factors to represent the significance of a block:
        • rarity factor
        • emergency factor
      • We define the priority of block $j \in A_i$ for node $i \in R$ as follows:
        • $P_j^i = \beta P_R\left(\sum_{k \in NBR_i} h_{kj}\right) + (1-\beta)\, P_E\left(C_i + W_T - d_j^i\right)$,
        • where $0 \le \beta \le 1$, and the functions $P_R(\cdot)$ (rarity factor) and $P_E(\cdot)$ (emergency factor) are both monotonically non-increasing
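The priority formula on this slide can be evaluated as in the sketch below. The slide only requires the rarity and emergency functions to be monotonically non-increasing, so the two concrete functions used here (`p_r` and `p_e`) are placeholder assumptions chosen for illustration, as are the argument names.

```python
def block_priority(holders, c_i, w_t, d_ji, beta, p_r, p_e):
    """Priority of block j at node i, following the formula on the slide.

    holders: number of neighbors holding block j (the sum of h_kj)
    c_i:     current play-out time of node i
    w_t:     exchanging window size
    d_ji:    play-out time of block j at node i
    p_r/p_e: caller-supplied rarity and emergency functions
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return beta * p_r(holders) + (1.0 - beta) * p_e(c_i + w_t - d_ji)


# Placeholder functions (assumptions): both are non-increasing.
def p_r(count):
    return 1.0 / (1 + count)


def p_e(t):
    return 1.0 / (1 + max(t, 0.0))


# Same timing, different rarity: the block with fewer holders ranks higher.
rare = block_priority(1, c_i=0, w_t=10, d_ji=10, beta=0.5, p_r=p_r, p_e=p_e)
common = block_priority(4, c_i=0, w_t=10, d_ji=10, beta=0.5, p_r=p_r, p_e=p_e)
print(rare, common)
```

Setting `beta` to 1 makes the priority depend on rarity alone, and setting it to 0 makes it depend on the emergency term alone.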
    • 10. Problem Statement and Formulation - Formulation
      • Decision variable: $x_{kj}^i \in \{0,1\}$, indicating whether node $i$ should request block $j$ from neighbor $k$
      • Global block scheduling problem:
      • s.t.
      • Notation and definitions:
        • $N$: $N+1$ is the number of overlay nodes, where node 0 is the source node
        • $I_i$: the inbound bandwidth of node $i$
        • $O_i$: the outbound bandwidth of node $i$
        • $E_{ik}$: the end-to-end available bandwidth between node $i$ and node $k$
        • $h_{kj} \in \{0,1\}$: block availability; $h_{kj} = 1$ denotes that node $k$ holds block $j$, otherwise $h_{kj} = 0$
        • $NBR_i$: set of neighbors of node $i$
        • $\tau$: period of requesting new blocks
        • $W_T$: the exchanging window size
        • $C_i$: the current play-out time of node $i$
        • $d_j^i$: play-out time of block $j$ at node $i$
        • $D_i$: set of all absent blocks in the current exchanging window of node $i$
    • 11. Global Optimal Solution
      • Convert the global block scheduling formulation into an equivalent Min-Cost Flow Problem
    • 12. Global Optimal Algorithm
      • Proposition:
        • The optimal objective value of the global block scheduling problem has the same absolute value as the minimum flow amount of its corresponding min-cost network flow problem. The flow on arc $(v^n_{ki}, v^b_{ij})$, which takes a value in $\{0,1\}$, is exactly the value of $x_{kj}^i$, the solution to the optimal block scheduling problem.
      • Algorithm complexity:
        • $O(nm(\log\log U)\log(nC))$, where $n$ and $m$ are the numbers of vertices and arcs, and $U$ and $C$ are the largest magnitudes of arc capacity and arc cost
    • 13. Distributed Algorithm
      • We first use a simple way to estimate the bandwidth that is available from each neighbor with historical information.
      • $q_{ki}(m)$: the total number of blocks that arrived at node $i$ from neighbor $k$ in the $m$-th period.
      • $W_{ki}(m+1)$: the estimated bandwidth from node $k$ to node $i$
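The estimation formula itself is not reproduced in this transcript, so the sketch below substitutes a plain moving average over the most recent periods. That choice, the class name `BandwidthEstimator`, and the unit (blocks per period) are assumptions made for illustration and may differ from the authors' estimator.

```python
from collections import deque


class BandwidthEstimator:
    """Estimate W_ki(m+1) from recent q_ki values (assumed: plain mean)."""

    def __init__(self, history_len):
        # Only the most recent `history_len` periods are kept.
        self.history = deque(maxlen=history_len)

    def record(self, blocks_received):
        # q_ki(m): blocks that arrived from neighbor k in period m.
        self.history.append(blocks_received)

    def estimate(self):
        # Estimated blocks per period; 0.0 before any observation.
        if not self.history:
            return 0.0
        return sum(self.history) / len(self.history)


est = BandwidthEstimator(history_len=3)
for q in [4, 6, 8, 10]:
    est.record(q)
print(est.estimate())  # mean of the last three periods
```

One estimator instance per neighbor would be kept in this sketch, with `record` called at the end of every request period.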
    • 14. Distributed Algorithm
      • With the estimated available bandwidth, a local block scheduling is performed on each node
      • It can also be transformed into an equivalent min-cost network flow problem that yields the locally optimal requests
    • 15. Distributed Algorithm
      • Heuristic distributed algorithm:
        • Node $i$ estimates the bandwidth $W_{ki}(m+1)$ that its neighbor $k$ can allocate to it in the $(m+1)$-th period from the traffic received from that neighbor in the previous $M$ periods, as shown in equation (3);
        • Based on $W_{ki}(m+1)$, node $i$ performs the local block scheduling (2) using the min-cost network flow model. The results $x_{kj}^i \in \{0,1\}$ indicate whether node $i$ should request block $j$ from neighbor $k$;
        • Send requests to every neighbor.
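The local scheduling step can be illustrated with a small min-cost flow solver. The sketch below is an assumption-laden toy, not the authors' construction: the graph layout (source, neighbor, block, sink), the successive-shortest-path method with Bellman-Ford, the function name `local_schedule`, and the rounding of estimated bandwidth to whole blocks are all choices made for this example.

```python
def local_schedule(priorities, holders, capacity):
    """Choose which neighbor to request each absent block from.

    priorities: {block: priority > 0} for this node's absent blocks
    holders:    {neighbor: set of blocks that neighbor holds}
    capacity:   {neighbor: whole blocks it can deliver this period}
    Returns {block: neighbor} maximizing the sum of requested priorities.
    """
    index = {"s": 0, "t": 1}
    for k in holders:
        index[("nbr", k)] = len(index)
    for j in priorities:
        index[("blk", j)] = len(index)
    edges, adj, request_edges = [], [[] for _ in index], []

    def add_edge(u, v, cap, cost):
        # Forward edge at an even position, its residual twin right after.
        adj[u].append(len(edges)); edges.append([v, cap, cost])
        adj[v].append(len(edges)); edges.append([u, 0, -cost])

    for k, blocks in holders.items():
        add_edge(0, index[("nbr", k)], int(capacity.get(k, 0)), 0.0)
        for j in blocks:
            if j in priorities:
                request_edges.append((len(edges), k, j))
                add_edge(index[("nbr", k)], index[("blk", j)], 1, 0.0)
    for j, p in priorities.items():
        add_edge(index[("blk", j)], 1, 1, -float(p))  # cost = -priority

    n = len(index)
    while True:
        # Bellman-Ford copes with negative costs in the residual graph.
        dist, prev = [float("inf")] * n, [None] * n
        dist[0] = 0.0
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == float("inf"):
                    continue
                for e in adj[u]:
                    v, cap, cost = edges[e]
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v], prev[v], changed = dist[u] + cost, e, True
            if not changed:
                break
        if prev[1] is None or dist[1] >= 0:
            break  # no remaining path improves the total priority
        v = 1
        while v != 0:  # push one unit of flow along the path found
            e = prev[v]
            edges[e][1] -= 1
            edges[e ^ 1][1] += 1
            v = edges[e ^ 1][0]
    # A saturated neighbor-to-block edge means "request j from k".
    return {j: k for e, k, j in request_edges if edges[e][1] == 0}


schedule = local_schedule(
    priorities={3: 0.9, 4: 0.5, 5: 0.7},
    holders={"A": {3, 4}, "B": {3, 5}},
    capacity={"A": 1, "B": 1},
)
print(schedule)
```

In this toy instance each neighbor can deliver one block, so only two of the three absent blocks can be requested, and the solver picks the pair with the largest total priority. A production implementation would more likely use a dedicated min-cost flow library than this hand-written solver.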
    • 16. Performance Evaluation - Compared Scheduling Methods
      • Random Strategy: each node will assign each desired block randomly to a neighbor which holds that block. Chainsaw uses this simple strategy.
      • Local Rarest First (LRF) Strategy: the block that has the fewest owners among the neighbors is requested first. DONet adopts this strategy.
      • Round Robin (RR) Strategy: all the desired blocks are assigned to neighbors in a prescribed order, in a round-robin way. If there are multiple available senders, the block is assigned to the sender that has the maximum surplus available bandwidth.
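The three baseline strategies compared on this slide can be sketched as below. These are simplified readings of the one-line descriptions, not the Chainsaw, DONet, or PALS implementations: the function names, the `holders` and `surplus` dictionaries, and the tie-breaking rules are assumptions made for the example.

```python
import random


def holders_of(block, holders):
    # Neighbors that hold the given block.
    return [k for k, blocks in holders.items() if block in blocks]


def random_strategy(wanted, holders, rng=random):
    # Random: pick any neighbor that holds the block.
    plan = {}
    for b in wanted:
        owners = holders_of(b, holders)
        if owners:
            plan[b] = rng.choice(owners)
    return plan


def lrf_order(wanted, holders):
    # LRF: blocks with the fewest owners among the neighbors come first.
    available = [b for b in wanted if holders_of(b, holders)]
    return sorted(available, key=lambda b: (len(holders_of(b, holders)), b))


def round_robin_strategy(wanted, holders, surplus):
    # RR: walk the blocks in a fixed order and give each one to the
    # holder with the most surplus bandwidth (counted in blocks).
    surplus = dict(surplus)  # work on a copy
    plan = {}
    for b in sorted(wanted):
        owners = [k for k in holders_of(b, holders) if surplus.get(k, 0) > 0]
        if owners:
            k = max(owners, key=lambda name: surplus[name])
            plan[b] = k
            surplus[k] -= 1
    return plan


holders = {"A": {1, 2}, "B": {2, 3}, "C": {2}}
print(lrf_order([1, 2, 3], holders))
print(round_robin_strategy([1, 2, 3], holders, {"A": 2, "B": 1, "C": 1}))
print(random_strategy([1, 2, 3], holders, random.Random(0)))
```

None of these baselines looks at all blocks and all neighbor bandwidths jointly, which is the gap the min-cost flow formulation is meant to close.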
    • 17. Simulation Configuration
      • For a fair comparison, all the experiments use the same simple algorithm for overlay construction
      • Delivery ratio: the number of blocks that arrive at each node before the playback deadline, divided by the total number of blocks encoded.
      • DSL nodes:
        • Download bandwidth: 40% 512K, 30% 1M, 30% 2M
        • Upload bandwidth: half of download bandwidth
      • 500 nodes
      • Each node has 15 neighbors
      • Request period: 2 seconds
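The delivery ratio metric defined on this slide can be computed per node as in the following sketch. The function name, the dictionary inputs, and the choice to count a block that arrives exactly at its deadline as on time are assumptions made for illustration.

```python
def delivery_ratio(arrival_times, deadlines, total_blocks):
    """Fraction of encoded blocks that reached a node in time.

    arrival_times: {block: arrival time at the node}
    deadlines:     {block: playback deadline at the node}
    total_blocks:  total number of blocks encoded
    """
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    on_time = sum(
        1
        for block, arrived in arrival_times.items()
        if block in deadlines and arrived <= deadlines[block]
    )
    return on_time / total_blocks


ratio = delivery_ratio(
    arrival_times={1: 0.5, 2: 2.5, 3: 2.9},   # block 4 never arrived
    deadlines={1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0},
    total_blocks=4,
)
print(ratio)
```

Blocks that never arrive are simply missing from `arrival_times`, so they lower the ratio through the `total_blocks` denominator.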
    • 18. Simulation Results
      • All are DSL nodes with exchanging window of 10 sec and bottlenecks only at the last mile. Group size is 500
    • 19. Simulation Results
      • All are DSL users with exchanging window of 10 sec and end-to-end available bandwidth 10~150Kbps. Group size is 500
    • 20. Conclusion & Future Work
      • The contributions of this paper are twofold.
        • First, to the best of our knowledge, we are the first to theoretically address the streaming scheduling problem in data-driven (swarming based) streaming protocols.
        • Second, we give the optimal scheduling algorithm under different bandwidth constraints, as well as a distributed asynchronous algorithm that can be applied in real systems and outperforms existing methods by about 10% to 80%.
      • Future work
        • How to do optimization over a horizon of several periods, taking into account the inter-dependence between the periods.
        • How to do optimal scheduling with scalable video coding (such as layered video coding) or multiple description coding
    • 21.
      • Thanks
      • Q&A