The remarkable number of simultaneous online users demonstrates the scalability of the data-driven protocol.
Previous empirical studies have shown that “rarest-first” is one of the most efficient strategies for data dissemination. A block that is in danger of being delayed beyond its deadline should have higher priority than one that has just entered the exchanging window.
Transcript
1.
On the Optimal Scheduling for Media Streaming in Data-driven Overlay Networks
Meng ZHANG, with Yongqiang XIONG, Qian ZHANG, Shiqiang YANG
Globecom 2006
In recent years, the Internet has seen rapid growth in the deployment of IPTV systems built on data-driven (swarming-based) overlay/peer-to-peer networks.
These products are based on a data-driven protocol.
Reported numbers of concurrent online users:
GridMedia: over 230,000 users at 310 kbps, achieved with one server (developed by our lab)
PPLive: 500,000 users at 300-500 kbps
QQLive: 1,460,000 users at 300-500 kbps (not one server)
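[Figure: the data-driven exchange among neighboring nodes. Each node announces the blocks it holds ("I have block 1,2,4", "I have block 1,2,3", "I have block 1,2", "I have block 2,3"), requests the blocks it is missing ("Request block 4", "Request block 3", "Request block 1", "Request block 2"), and the neighbors send the requested blocks ("Send block 4", "Send block 3", "Send block 1", "Send block 2").]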
The second step – block scheduling
The stream is divided into blocks
Each node has a sliding window containing all the blocks it is currently interested in
How to do optimal scheduling to maximize the throughput of the whole overlay?
The real situation is more complicated because different blocks may have different importance and the bottlenecks are not only at the last mile.
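To make the scheduling input concrete, here is a minimal Python sketch of what a node knows before it schedules requests: the blocks missing from its sliding window and, for each of them, the neighbors that announced holding it. The function names, the neighbor labels "A", "B", "C", and the integer block numbering are assumptions made for illustration; they are not taken from the talk.

```python
def absent_blocks(held, play_pos, window):
    """Blocks in [play_pos, play_pos + window) that the node does not hold yet."""
    return [j for j in range(play_pos, play_pos + window) if j not in held]


def suppliers(block, neighbor_maps):
    """Neighbors whose announced buffer map contains the block."""
    return sorted(k for k, blocks in neighbor_maps.items() if block in blocks)


# Buffer maps announced by three hypothetical neighbors ("I have block ...").
neighbor_maps = {"A": {1, 2, 4}, "B": {1, 2, 3}, "C": {2, 3}}
held = {1, 2}  # blocks this node already holds

missing = absent_blocks(held, play_pos=1, window=4)
candidates = {j: suppliers(j, neighbor_maps) for j in missing}
print(missing)     # blocks the node still needs
print(candidates)  # possible senders for each missing block
```

Block scheduling is then the question of which candidate to ask for which block, given that each sender has limited bandwidth.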
Our basic approach:
Assign each block a priority according to its importance
Maximize the sum of priorities of all requested blocks
[Figure: scheduling example. Under the Local Rarest First (LRF) strategy, some requests congest at node 1 and the throughput is 4; under optimal scheduling, the throughput gain is 25%.]
9.
Problem Statement and Formulation - Priority Definition
We use two factors to represent the significance of a block:
rarity factor
emergency factor
We define the priority of block $j \in A_i$ for node $i \in R$ as follows:
$$P_j^i = \beta \, P_R\Big(\sum_{k \in \mathrm{NBR}_i} h_{kj}\Big) + (1-\beta) \, P_E\big(C_i + W_T - d_j^i\big),$$
where $0 \le \beta \le 1$, and the functions $P_R(\cdot)$ (rarity factor) and $P_E(\cdot)$ (emergency factor) are both monotonically non-increasing.
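The priority definition can be evaluated directly once the two factor functions are chosen. The Python sketch below follows the formula as written on the slide; the concrete shapes of `p_r` and `p_e` are hypothetical placeholders, since the slide only requires them to be monotonically non-increasing, and the numeric inputs are invented for illustration.

```python
def block_priority(owner_count, c_i, w_t, d_ji, beta, p_r, p_e):
    """P_j^i = beta * P_R(sum_k h_kj) + (1 - beta) * P_E(C_i + W_T - d_j^i)."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return beta * p_r(owner_count) + (1.0 - beta) * p_e(c_i + w_t - d_ji)


# Hypothetical factor shapes (assumptions, not the authors' choices).
def p_r(owners):
    return 1.0 / (1 + owners)


def p_e(x):
    return 1.0 / (1 + max(x, 0.0))


# Two blocks with the same play out time; one is held by fewer neighbors.
rare = block_priority(1, c_i=100, w_t=20, d_ji=110, beta=0.5, p_r=p_r, p_e=p_e)
common = block_priority(3, c_i=100, w_t=20, d_ji=110, beta=0.5, p_r=p_r, p_e=p_e)
print(rare > common)  # the rarer block receives the higher priority
```

Setting `beta` to 1 makes the priority depend on rarity alone, and setting it to 0 makes it depend on the emergency factor alone.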
10.
Problem Statement and Formulation - Formulation
Decision variable
Global block scheduling problem:
s.t.
Notation: Definition
$N$: $N+1$ is the number of overlay nodes, where node 0 is the source node
$I_i$: the inbound bandwidth of node $i$
$O_i$: the outbound bandwidth of node $i$
$E_{ik}$: the end-to-end available bandwidth between node $i$ and node $k$
$h_{kj} \in \{0,1\}$: block availability; $h_{kj} = 1$ denotes that node $k$ holds block $j$, otherwise $h_{kj} = 0$
$\mathrm{NBR}_i$: set of neighbors of node $i$
$\tau$: period of requesting new blocks
$W_T$: the exchanging window size
$C_i$: the current play out time of node $i$
$d_j^i$: play out time of block $j$ at node $i$
$D_i$: set of all absent blocks in the current exchanging window of node $i$
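The formulas on this slide did not survive in the transcript. Going only by the stated approach (maximize the sum of priorities of all requested blocks) and the notation listed here, one plausible way to write the global problem is sketched below, with $x_{kj}^i = 1$ meaning that node $i$ requests block $j$ from neighbor $k$. The exact form of each constraint, and the assumption that bandwidths are measured in blocks per unit time, are guesses rather than the authors' formulation.

```latex
\begin{align*}
\max_{x} \quad & \sum_{i=1}^{N} \sum_{j \in D_i} \sum_{k \in \mathrm{NBR}_i} P_j^i \, x_{kj}^i \\
\text{s.t.} \quad
& \sum_{k \in \mathrm{NBR}_i} x_{kj}^i \le 1
  && \text{(each block is requested from at most one neighbor)} \\
& x_{kj}^i \le h_{kj}
  && \text{(only a holder can be asked)} \\
& \sum_{j \in D_i} \sum_{k \in \mathrm{NBR}_i} x_{kj}^i \le \tau I_i
  && \text{(inbound bandwidth of node } i\text{)} \\
& \sum_{i \,:\, k \in \mathrm{NBR}_i} \sum_{j \in D_i} x_{kj}^i \le \tau O_k
  && \text{(outbound bandwidth of node } k\text{)} \\
& \sum_{j \in D_i} x_{kj}^i \le \tau E_{ik}
  && \text{(end-to-end bandwidth between } i \text{ and } k\text{)} \\
& x_{kj}^i \in \{0, 1\}
\end{align*}
```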
The optimal objective value of the global block scheduling problem has the same absolute value as the minimum flow amount of its corresponding min-cost network flow problem. The flow on arc $(v^n_{ki}, v^b_{ij})$, which takes a value in $\{0, 1\}$, is exactly the value of $x^i_{kj}$, the solution to the optimal block scheduling.
Algorithm complexity:
$O(nm(\log\log U)\log(nC))$, where $n$ and $m$ are the numbers of vertices and arcs, and $U$ and $C$ are the largest magnitudes of arc capacity and arc cost, respectively
Node $i$ estimates the bandwidth $W_{ki}(m+1)$ that its neighbor $k$ can allocate to it in the $(m+1)$-th period from the traffic received from that neighbor in the previous $M$ periods, as shown in equation (3);
Based on $W_{ki}(m+1)$, node $i$ performs the local block scheduling (2) using the min-cost network flow model. The results $x^i_{kj} \in \{0,1\}$ indicate whether node $i$ should request block $j$ from neighbor $k$.
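The two steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: `estimate_capacity` uses a plain average over the previous periods because equation (3) is not reproduced in this transcript, and `schedule_requests` solves the local problem with a textbook successive-shortest-path min-cost flow rather than the algorithm whose complexity is quoted earlier. The graph has a source, one vertex per neighbor, one vertex per block, and a sink; arc costs are the negated priorities. All names and numbers are hypothetical.

```python
def estimate_capacity(received_per_period):
    """Hypothetical estimator: mean number of blocks received per period."""
    if not received_per_period:
        return 0
    return sum(received_per_period) // len(received_per_period)


def schedule_requests(capacity, holders, priority):
    """Choose at most one supplier per block, maximizing the summed priority.

    capacity: {neighbor: blocks it can deliver this period}
    holders:  {block: set of neighbors holding it}
    priority: {block: positive priority}
    Returns {block: neighbor}.
    """
    nbrs, blocks = sorted(capacity), sorted(holders)
    src, sink, first_block = 0, 1, 2 + len(nbrs)
    nbr_id = {k: 2 + a for a, k in enumerate(nbrs)}
    blk_id = {j: first_block + b for b, j in enumerate(blocks)}
    n = first_block + len(blocks)
    edges, adj = [], [[] for _ in range(n)]  # edge e and e ^ 1 form a pair

    def add_edge(u, v, cap, cost):
        adj[u].append(len(edges))
        edges.append([v, cap, cost])   # forward arc (even index)
        adj[v].append(len(edges))
        edges.append([u, 0, -cost])    # residual arc (odd index)

    for k in nbrs:
        add_edge(src, nbr_id[k], capacity[k], 0.0)
    for j in blocks:
        add_edge(blk_id[j], sink, 1, 0.0)
        for k in sorted(holders[j]):
            if k in nbr_id:
                add_edge(nbr_id[k], blk_id[j], 1, -priority[j])

    inf = float("inf")
    while True:
        # Bellman-Ford on the residual graph (arc costs may be negative).
        dist, prev = [inf] * n, [-1] * n
        dist[src] = 0.0
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == inf:
                    continue
                for e in adj[u]:
                    v, cap, cost = edges[e]
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v], prev[v], changed = dist[u] + cost, e, True
            if not changed:
                break
        if prev[sink] == -1 or dist[sink] >= -1e-12:
            break  # no remaining path improves the objective
        v = sink
        while v != src:  # push one unit of flow along the path
            e = prev[v]
            edges[e][1] -= 1
            edges[e ^ 1][1] += 1
            v = edges[e ^ 1][0]

    plan = {}
    for k in nbrs:
        for e in adj[nbr_id[k]]:
            v, cap, _ = edges[e]
            if e % 2 == 0 and cap == 0:  # saturated neighbor-to-block arc
                plan[blocks[v - first_block]] = k
    return plan


capacity = {"A": estimate_capacity([1, 1, 1]), "B": estimate_capacity([2, 1, 1])}
plan = schedule_requests(capacity, {3: {"A"}, 4: {"A", "B"}}, {3: 0.5, 4: 0.9})
print(plan)
```

In this example both neighbors can deliver one block. The first augmentation sends block 4 through A, and the second reroutes it through B so that A, the only holder of block 3, can serve block 3 as well.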
Random Strategy: each node will assign each desired block randomly to a neighbor which holds that block. Chainsaw uses this simple strategy.
Local Rarest First (LRF) Strategy: A block that has the minimum owners among the neighbors will be requested first. DONet adopts this strategy.
Round Robin (RR) Strategy: All the desired blocks are assigned to neighbors one at a time, in a prescribed order, in a round-robin way. If there are multiple available senders, the block is assigned to the sender that has the maximum surplus available bandwidth.
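For comparison with the flow-based scheduler, two of the baseline strategies can be sketched in a few lines of Python. The function names and inputs are hypothetical. The random strategy follows the description above and ignores bandwidth. For LRF, the description only fixes the order in which blocks are requested; picking the holder with the most remaining capacity is an assumption made here to complete the example.

```python
import random


def random_strategy(holders, rng):
    """Assign each desired block to a random neighbor that holds it."""
    return {j: rng.choice(sorted(ks)) for j, ks in holders.items() if ks}


def local_rarest_first(holders, capacity):
    """Request blocks in increasing order of owner count.

    Supplier choice (most remaining capacity) is an assumption of this sketch.
    """
    remaining = dict(capacity)
    plan = {}
    for j in sorted(holders, key=lambda b: (len(holders[b]), b)):
        available = [k for k in sorted(holders[j]) if remaining.get(k, 0) > 0]
        if available:
            k = max(available, key=lambda s: remaining[s])
            plan[j] = k
            remaining[k] -= 1
    return plan


holders = {5: {"A", "B", "C"}, 3: {"A"}, 4: {"A", "B"}}
capacity = {"A": 1, "B": 1, "C": 1}

lrf_plan = local_rarest_first(holders, capacity)
rnd_plan = random_strategy(holders, random.Random(0))
print(lrf_plan)
print(rnd_plan)
```

Because the random strategy does not look at capacity, several blocks can land on the same sender, which is the kind of request congestion the optimal scheduler is meant to avoid.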
First, to the best of our knowledge, we are the first to address theoretically the streaming scheduling problem in data-driven (swarming-based) streaming protocols.
Second, we give the optimal scheduling algorithm under different bandwidth constraints, as well as a distributed asynchronous algorithm that can be applied in real systems and outperforms existing methods by about 10% to 80%.
Future work
How to do optimization over a horizon of several periods, taking into account the inter-dependence between the periods.
How to do optimal scheduling with scalable video coding (such as layered video coding) or multiple description coding.