Reverse-Engineering BitTorrent: A Markov Approximation Perspective

Ziyu Shao*, Hao Zhang+, Minghua Chen*, and Kannan Ramchandran+
* Department of Information Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. Email: {zyshao, minghua}@ie.cuhk.edu.hk
+ Department of Electrical Engineering and Computer Sciences, UC Berkeley, USA. Email: {zhanghao, kannanr}@eecs.berkeley.edu

Abstract—BitTorrent has been the most popular P2P (Peer-to-Peer) paradigm in recent years. Built upon great intuition, the piece-selection and neighbor-selection modules rooted in BitTorrent are critical for the efficiency and scalability of many P2P systems, such as file sharing and video-on-demand. Yet the theoretical underpinnings of these two modules remain largely undiscovered. In this paper we reverse-engineer the BitTorrent protocol from a Markov approximation perspective. We show that, together with the underlying rate control algorithm, the rarest-first and choking algorithms in the BitTorrent protocol implicitly solve a cooperative combinatorial network utility maximization problem by implementing a Markov chain in a distributed manner. This understanding allows us to assess properties of BitTorrent from a fresh perspective, including performance optimality, convergence, and the impact of design parameters. Our numerical evaluations validate the analytical results. The insights obtained by studying BitTorrent not only help design better P2P systems, but also provide useful ingredients for synthesizing distributed algorithms for combinatorial problems in other domains.

I. INTRODUCTION

In recent years BitTorrent has attracted the attention of both industry and academia, eager to understand the remarkable success of this simple yet powerful protocol. A central element of the design philosophy that shaped BitTorrent is the cooperative swarming mechanism, and a key illustration of the mechanism is provided by the rarest-first and choking algorithms. Why does BitTorrent work so well? How might, or should, it evolve in the future? Answers to these questions will deepen the fundamental understanding of BitTorrent, which facilitates not only the improved design of P2P systems in a systematic way rather than with ad-hoc heuristics, but also distributed algorithms for other combinatorial network optimization problems.

Many efforts have been made to answer these questions, including real data measurements, game-theoretic analysis, and differential-equation based macroscopic analysis. Through these efforts, it is now clear that piece and neighbor selection strategies are the two keys to efficient and scalable P2P systems. For each peer, piece and neighbor selection strategies decide which peers to upload to and which pieces to download from which serving peers. It is observed that the rarest-first algorithm, a piece selection scheme, guarantees close-to-ideal diversity of the pieces among peers, and the choking algorithm, a neighbor selection scheme, provides an effective sharing incentive by implementing a tit-for-tat mechanism. Overall, BitTorrent is shown to be remarkably robust and scalable at ensuring high uplink bandwidth utilization [1]–[4]. These observations and insights made by existing efforts enrich the fundamental understanding of BitTorrent and greatly sharpen the design skills of P2P systems. We discuss more details in the Related Work section (Section II).

Encouraged by the results from these exciting works, we further explore answers to these questions and reverse-engineer BitTorrent from a new perspective, with the hope of revealing new hidden facts and providing new insights for future design. Our new perspective is based on the Markov approximation framework [5]. The main results are summarized in Table I. Detailed statements of results and contributions are as follows:

• We find that with underlying rate-control algorithms such as TCP, BitTorrent actually implements a Markov chain solving the following global optimization problem: maximize the aggregate downloading rates of all peers given the underlying physical edge capacity constraints and the concurrent uploading connection limit. BitTorrent solves this problem by combining three components (rate control, piece selection and neighbor selection) operating on three separate time scales.

• More precisely, we consider all peer neighboring configurations satisfying the concurrent uploading connection limit. First, the rate control algorithm assigns overlay edge capacity given the peer neighboring configuration and arbitrary underlying physical edge capacity constraints. By doing so, BitTorrent actually goes beyond a common assumption made in almost all P2P algorithm designs (except [6], [7]) that the uplink and/or downlink of peers is the only rate-limiting bottleneck. Second, the rarest-first algorithm, the piece selection component of BitTorrent, implicitly maximizes the aggregate downloading rates of all peers given the peer neighboring configuration and overlay edge capacity. Third, the choking algorithm, the neighbor selection component of BitTorrent, implicitly finds the best peer neighboring configuration by implementing a Markov chain over all configurations and
statistically hopping toward the best configuration.

• We characterize the following properties of the corresponding Markov chain: approximation gap, perturbation error bound, insensitivity of count-down time, mixing time, and the trade-off between approximation gap and mixing time. These studies enable us to analyze properties of the BitTorrent protocol including performance optimality, convergence and the impact of design parameters. Insights obtained from these studies further improve not only P2P system design, but also distributed algorithms for combinatorial network optimization problems in other domains.

The remainder of this paper is organized as follows. In Section II, we discuss the related work on BitTorrent. In Section III, we introduce the system model for BitTorrent and our perspective. In Section IV, we discuss the neighbor selection component of BitTorrent and the corresponding Markov chain design in detail. Numerical results are provided in Section V, and conclusions are drawn in Section VI.

II. RELATED WORK

Bram Cohen, the BitTorrent protocol's creator, described BitTorrent's main mechanisms and the design rationale in [17]. Since then, the widespread popularity of BitTorrent has attracted attention from the research community to conduct various performance studies in order to understand the behavior of the BitTorrent protocol, its mechanisms and the overall application performance. Here we do not intend to enumerate all approaches in the literature, but focus on three main approaches related to our work.

The first approach is based on real measurements [1], [3], [4], [18]–[20]. These measurements usually lasted for several months, either collecting tracker logs obtained from the trackers or collecting event logs by joining an ongoing torrent with a modified client. The tracker logs give a global view of BitTorrent performance, whereas event logs allow observing the individual behavior of peers. Observations based on real measurements indeed give us some ad-hoc design heuristics. In contrast, our work enables a systematic design.

The second approach is based on game-theoretic analysis [14], [21]–[25]. This line of work has an economic flavor, including Tit-for-Tat (TFT) strategy analysis, feasibility of selfish behavior (free-riding), incentive compatibility, and auction analysis. Major parts of these studies characterize the existence, uniqueness, stability and other key properties of Nash equilibria of gaming BitTorrent. These studies provide a clear picture of the economic aspects of BitTorrent. In contrast, our work is orthogonal to these studies: we do not consider the economic incentive issue and focus on the cooperative behavior of BitTorrent.

The third approach is differential-equation based macroscopic analysis [14], [26]–[29]. In [14], a refined fluid model of BitTorrent is proposed and the high efficiency of BitTorrent is shown, i.e., the service time does not scale with the number of peers. However, this model assumes peer selection based on global knowledge of all peers in the torrent, as well as a uniform distribution of pieces. In contrast, our work does not assume global knowledge and each peer has only a limited local view of the network. This is a more realistic modeling of the BitTorrent protocol. In [26], a coupon replication model of BitTorrent-like systems is proposed by considering peers with only limited upload. It is argued that overall system performance does not depend critically on either altruistic peer behavior or the rarest-first piece selection strategy. In [27], an extended coupon replication model is proposed by considering peers with limited upload and download capacity. With the same access-link bottleneck assumption, in [28], a model is proposed to capture the trade-off between performance and fairness. In contrast, our model assumes that the bottleneck link can be anywhere, which is more realistic. In [29], an improved piece selection strategy is proposed and analyzed. In contrast, in our work, we can characterize system trade-offs (such as performance vs. convergence) by both analytical modeling and simulation. Further, our results capture properties that were hard to analyze before. For example, we are able to quantify the impact of each component (rate control, piece selection and neighbor selection) individually in BitTorrent. We believe that our results provide a fresh perspective on reverse-engineering BitTorrent protocols. Together with the existing works, our results form a comprehensive basis for a fundamental understanding of BitTorrent protocols, including performance optimality, convergence and the impact of design parameters.

Note that we previously applied the Markov approximation framework to design P2P systems [11], [30]. Though our current work indeed borrows techniques from these works, including the extended Markov chain model and the insensitivity of count-down time, it differs from the previous designs [11], [30] in the following aspects:

• Our current work reverse-engineers the BitTorrent protocol, while the previous designs focus on forward engineering of P2P systems.
• Our current work provides new and non-trivial results, including new bounds on the mixing time of the Markov chain, a new analysis of the trade-off between approximation gap and mixing time, and a tighter bound on the system utility gap due to perturbation.

III. MODELING BITTORRENT

A. BitTorrent

Here we give a concise introduction to the BitTorrent protocol. Files transferred using BitTorrent are split into pieces of typically 256 kB, and each piece is split into blocks of 16 kB. There are two key components in BitTorrent: the torrent and the tracker. A torrent defines a session of transfer of a single content to a set of peers. The tracker of this torrent keeps track of the peers currently involved in the torrent and collects statistics on the torrent. When a peer joins a torrent, the tracker gives it a list of neighboring peers involved in this torrent, called the neighbor set. Each peer knows the distribution of the pieces for each peer in its neighbor set. Further, a peer can only upload data to a subset of its neighbor set simultaneously, and we call this subset the active neighbor set.
TABLE I: SUMMARY OF RESULTS FOR REVERSE-ENGINEERING THE BITTORRENT (BT) PROTOCOL

BT Component | Functionality | Time Scale | Critical Issues and Proof Techniques | Sections and References
Rate Control | Assigning overlay edge rates | Fast | Optimality: NUM; Convergence: Lyapunov function | Sec. III-C-1) & [8]–[10]
Piece Selection* | Maximizing the aggregate downloading rates of all peers | Normal | Optimality: linear programming; Convergence: Lyapunov function | Sec. III-C-2) & [8]–[10]
Neighbor Selection | Choosing the best peer neighboring configuration | Slow | Optimality: Markov approximation and perturbation analysis; Convergence: Markov approximation and mixing time | Sec. III-C-3), IV, IV-G & [5], [11]–[13]

* Note that our formulation is complementary to existing formulations of the piece selection component [14]–[16].

The difference between the neighbor set and the active neighbor set is called the potential neighbor set. For piece selection and neighbor selection strategies, we focus here on two core algorithms of BitTorrent:

1) Rarest First Algorithm for Piece Selection: In general, a peer has a choice of several blocks that it could download from its neighbors. It employs a local rarest first (LRF) policy in picking which block to download: it tries to download the block that is least replicated among its neighbors.

2) Choking Algorithm for Neighbor Selection: Each peer limits the number of concurrent uploads to a small number, typically 4 ∼ 7. The mechanism used to limit the number of concurrent uploads is called choking. Choking is a temporary refusal to upload; it stops uploading, but downloading can still happen.
• Rate-based Tit-for-Tat Choking Policy: Every 10 seconds, a peer reevaluates the download rates provided by its neighbors, then chokes the active neighbor providing the worst downloading rate and unchokes a new neighbor.
• Optimistic Unchoking Policy: Every 30 seconds, a peer unchokes a randomly chosen neighbor regardless of the download rate achieved from that neighbor.

B. Notations

Consider an underlying physical network modeled as a directed graph G = (N, L), where N is the set of all physical nodes, including peering nodes and other intermediate nodes such as routers, and L is the set of all physical links. Each link l ∈ L has a nonnegative capacity C_l.

Consider a P2P file sharing system over G. Note that in this paper, we do not consider dynamic scenarios, i.e., peers coming and going. We use V ⊆ N to denote the set of all peering nodes. We use E to denote the set of directed overlay links between these peers. Note that an overlay link (u, v) means u can send data to v by setting up a connection. For all e ∈ E and l ∈ L, we define

a_{l,e} = 1 if overlay link e passes physical link l; 0 otherwise.  (1)

For any edge e ∈ E, we denote by z_e the associated edge rate. We denote by E_f the edge set of the overlay graph under f. We also denote z^f = [z_e, e ∈ E_f]^T and a_l = [a_{l,e}, e ∈ E_f]^T. Each peer v has a neighbor set denoted by N_v, and the size of N_v is denoted by Δ. However, each peer v can upload to at most δ neighbors simultaneously. We denote by f a peer neighboring configuration, i.e., a specific peer neighboring relationship satisfying the limit on the number of concurrent uploading connections. Let N_v^f denote the active neighbor set of peer v under configuration f. We also denote by F the set of all possible peer neighboring configurations. For any peer u ∈ V, let x^f_uv denote peer u's downloading rate from peer v under f, and x^f = {x^f_uv} the vector of peers' downloading rates given configuration f.

C. Our Perspective

We adopt the deterministic fluid model. Overall, together with TCP rate control algorithms, BitTorrent solves a global optimization problem: given the underlying physical capacity constraints and the concurrent uploading connection limit, maximize the aggregate downloading rates of all peers. This global optimization problem is formulated as follows:

max_{f, x^f > 0}  Σ_{u∈V} Σ_{v∈N_u^f} x^f_uv  (2)
s.t.  x^f ∈ Γ(z̃^f, f),  (3)
      f ∈ F,  (4)

where Γ(z̃^f, f) is the feasible region for downloading rates x^f given configuration f and overlay edge rates z̃^f, and z̃^f is the optimal solution of the following rate-control problem:

RC:  max_{z^f ≥ 0}  U(z^f)  (5)
s.t.  a_l^T · z^f ≤ C_l,  ∀l ∈ L.  (6)

Here U(·) is the TCP utility function, and the constraint in (6) is the physical link capacity constraint.

The above global optimization problem is challenging to solve due to its combinatorial nature. BitTorrent solves it by combining three components (rate control, piece selection and neighbor selection) with three time scales (T_rs, T_ps, T_ns) respectively. The smaller the time scale, the faster the corresponding component converges. We make the time-scale separation assumption, represented by T_rs ≪ T_ps ≪ T_ns.

1) Rate Control Component. The purpose of this component is to determine the overlay link rates given the overlay configuration f. This part is formulated as problem RC. By [8]–[10], we know that the TCP protocol is reverse-engineered to solve problem RC. More specifically, TCP allocates overlay edge rates subject to underlay edge capacity according to TCP utility maximization. For example, when the TCP Reno protocol is adopted, U(z^f) is shown to be Σ_{e∈E_f} −1/(d_e z_e) [9], where d_e is some delay metric on link e. The proofs for both optimality and convergence to optimality are a standard routine adopting the Lyapunov technique [8]–[10].
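To make the RC formulation concrete, consider the simplest special case (a sketch of ours, not from the paper): all overlay edges share one bottleneck link of capacity C, and we use the Reno-style utility Σ_e −1/(d_e z_e) as read off above (this utility form is our reading of the garbled text). Setting the derivative 1/(d_e z_e²) equal to a common Lagrange multiplier gives z_e ∝ 1/√d_e, so this special case has a closed form:

```python
import math

def reno_rc_single_link(delays, C):
    """Closed-form RC solution for one shared bottleneck link (our sketch):
    maximize sum_e -1/(d_e * z_e)  s.t.  sum_e z_e <= C.
    Stationarity gives 1/(d_e * z_e**2) = lambda for all e, i.e. z_e is
    proportional to 1/sqrt(d_e); the capacity constraint fixes the scale."""
    inv = [1.0 / math.sqrt(d) for d in delays]
    s = sum(inv)
    return [C * i / s for i in inv]

# Two hypothetical overlay edges sharing a link of capacity C = 3.
z = reno_rc_single_link([1.0, 4.0], C=3.0)
print(z)  # the edge with 4x the delay gets half the rate: [2.0, 1.0]
```

For a general topology, the same stationarity conditions couple across links, and distributed subgradient/Lyapunov arguments as in [8]–[10] take the place of this closed form.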
Remarks: In practice, some ISPs (Internet service providers) such as Comcast have started to throttle BitTorrent traffic, creating bottlenecks at ISP peering points. Moreover, the capacity bottleneck can be anywhere in the network, not necessarily at the edge [31], [32]. Our model and formulation allow us to assess the performance of BitTorrent over arbitrary network topologies where bottlenecks can be anywhere in the network. This is different from most other P2P models, which assume the peer uplinks/downlinks are the only rate-limiting bottleneck.

2) Piece Selection Component. The purpose of this component is to maximize the aggregate downloading rates of all peers given the overlay configuration f and overlay edge rates z̃^f. It is formulated as follows:

PS:  max_{x^f > 0}  Σ_{u∈V} Σ_{v∈N_u^f} x^f_uv  (7)
s.t.  x^f ∈ Γ(z̃^f, f).  (8)

This is a linear programming problem, and we can adopt the standard Lyapunov technique [8]–[10] to show optimality and convergence of the corresponding subgradient algorithms.

How do we relate this fluid flow model to the scheduling of discrete file pieces? In our opinion, for every downloading rate computed by the PS component, piece selection algorithms in BitTorrent such as the rarest-first strategy and others [3], [14]–[16] aim at achieving this downloading rate by carefully exchanging file pieces so that every received bit is ensured to be useful to the receivers. It remains open to verify the correctness of this conjecture. In this paper, we focus on a fluid-model based study to reveal the big picture and in particular to understand neighbor selection in BitTorrent. The investigation of the above conjecture is left for future work. Note that there are several other formulations of piece selection problems [14]–[16]; our formulation is complementary to existing models.

3) Neighbor Selection Component. Suppose the optimal solution to problem PS given configuration f is denoted by g_f = Σ_{u∈V} Σ_{v∈N_u^f} x*^f_uv, which represents the system utility under f. The purpose of this component is to find the best configuration, i.e., the one whose corresponding system utility g_f is maximum. It is formulated as follows:

NS:  max_{f∈F}  g_f.  (9)

This is a combinatorial optimization problem. In general, the size of F, i.e., the number of all possible peer neighboring configurations, can be exponential in the number of peer nodes. We have the following result:

Proposition 1. In general, for any δ ≥ 2, the problem NS is NP-complete and APX-hard (no effective polynomial-time approximate solution).

The proof is based on a polynomial-time reduction from the degree-bounded-subgraph problem, a problem known to be NP-complete and APX-hard [33], [34]. The proof adopts the method in [35] and we omit the details here.

The choking algorithm, a neighbor selection component, approximately solves the problem NS in a distributed way. Details are provided in the next section.

IV. NEIGHBOR SELECTION OF BITTORRENT

Based on the Markov approximation framework [5], the problem NS is solved approximately by designing a Markov chain in a distributed way. We first design a specific Markov chain that gives a distributed algorithm solving the problem NS approximately. We then show how the resulting algorithm closely corresponds to the BitTorrent protocol.

A. Markov Approximation Framework

The original paper [5] explains the framework from the optimization perspective. Here we present the framework from the sampling perspective. Without loss of generality, we assume that the optimal solution to problem NS is unique and denote it by

f^o = arg max_{f∈F} g_f.  (10)

We associate with each configuration f ∈ F a probability p_f. Then we can see that solving problem NS is equivalent to sampling the configuration space F from the following Dirac distribution:

p_f = 1 if f = f^o; 0 otherwise.  (11)

However, the Dirac distribution is hard to obtain since f^o is unknown to us. Therefore, we need to sample the configuration space F from a new target distribution, which needs to satisfy two conditions:
• C1: it can be obtained without knowing the exact value of f^o.
• C2: f^o is the configuration with the largest probability (not necessarily 1).

It turns out that a product-form distribution parameterized by β > 0 is a choice of this target distribution:

p*_f(g) = exp(βg_f) / Σ_{f'∈F} exp(βg_{f'}),  ∀f ∈ F.  (12)

As we will see later, this product-form distribution can be obtained by designing a time-reversible Markov chain without knowing the value of f^o [36]. On the other hand, given a positive constant β, it is not hard to see that f^o = arg max_{f∈F} p*_f(g). Thus both conditions C1 and C2 are satisfied.

When we sample the configuration space F from the distribution p*_f(g) in (12), i.e., time-share among different configurations according to p*_f(g), we actually solve the problem NS approximately and obtain a close-to-optimal system utility. By the Markov approximation framework [5], we know that by doing so, we actually approximate the maximum system utility by a log-sum-exp function:

max_{f∈F} g_f ≈ (1/β) log [ Σ_{f∈F} exp(βg_f) ].  (13)
The approximation accuracy is known as follows [5]:

0 ≤ (1/β) log [ Σ_{f∈F} exp(βg_f) ] − max_{f∈F} g_f ≤ (1/β) log|F|,  (14)

where |F| denotes the size of the set F. The whole relationship is illustrated in Fig. 1.

[Figure omitted: two panels, "Dirac Distribution" and "Product-form Distribution".]
Fig. 1. Illustration of log-sum-exp approximation and sampling. We show the sampling distribution p_f, ∀f ∈ F, and the corresponding performance metric Σ_{f∈F} p_f · g_f − (1/β) Σ_{f∈F} p_f log p_f, i.e., the sampling average of the system utility Σ_{f∈F} p_f · g_f off by an entropy term (1/β) Σ_{f∈F} p_f log p_f. On the left-hand side, we sample from the Dirac distribution and the corresponding performance metric is max_{f∈F} g_f. On the right-hand side, we sample from the product-form distribution and the corresponding performance metric is (1/β) log Σ_{f∈F} exp(βg_f).

After the log-sum-exp approximation, we need to design a Markov chain such that its state space is F (all feasible peer neighboring configurations) and its stationary distribution is p*_f(g) in (12). In the following, we design a Markov chain satisfying this requirement; its construction requires global information of the P2P system. We call it the "perfect Markov chain" and use it as a benchmark for further performance comparison. Then we show that neighbor selection in BitTorrent actually implements a local perturbation of the perfect Markov chain. We call this Markov chain the "BitTorrent Markov chain". A performance guarantee for the BitTorrent Markov chain is also provided.

B. Perfect Markov Chain Design

First, we design the topology structure of the state space. Direct transitions between two configurations f, f' ∈ F can happen if and only if:
• there exists f̃ such that f̃ = f ∪ f', f̃ ⊇ f, f̃ ⊇ f', |f̃ \ f| = |f̃ \ f'| = 1;
• link f̃ \ f and link f̃ \ f' originate from the same peer, denoted v(f, f').

We call the above conditions the direct transition condition. In other words, we only allow direct transitions that correspond to a single peer unchoking a new inactive neighbor and choking an active neighbor. It can be shown that in this way, the topology of the state space is connected and irreducibility is satisfied. Since the rates of indirect transitions are all zero, we focus on the design of direct transition rates.

For convenience, we state some notation. For any two configurations f, f' ∈ F satisfying the direct transition condition, and any node w in f̃ = f ∪ f', we define

A_{w,f̃} = { f'' ∈ F | f'' = f̃ \ {(w, u)}, u ∈ N_w^f̃ }.  (15)

We can see that any configuration f'' ∈ A_{w,f̃} can be reached from f̃ by node w choking one active neighbor in N_w^f̃. Then for any two configurations f, f' ∈ F satisfying the direct transition condition, we set the transition rates as:

q_{f,f'} = τ · exp(β(g_{f'} − g_f̃)) / Σ_{f''∈A_{v(f,f'),f̃}} exp(β(g_{f''} − g_f̃)),  (16)

q_{f',f} = τ · exp(β(g_f − g_f̃)) / Σ_{f''∈A_{v(f,f'),f̃}} exp(β(g_{f''} − g_f̃)),  (17)

where τ > 0 is a constant and f̃ = f ∪ f'. We have the following result:

Proposition 2. The designed perfect Markov chain is a time-reversible Markov chain with the desired stationary distribution p*_f(g) in (12).

The proof is relegated to Appendix A. One possible implementation is as follows:

Perfect Markov Chain Implementation
• The following procedure runs on each individual peer independently. We focus on a particular peer v ∈ V.
• Initialization: Peer v randomly selects δ neighbors from its neighbor list N_v and builds connections with these selected neighbors.
• Step 1: Denote by f the current configuration. Peer v independently generates an exponentially distributed random number with mean 1/(τ(Δ − δ)) and counts down according to this number.
• Step 2: When the count-down expires, peer v uniformly at random unchokes a new inactive neighbor w from its potential neighbor set. Each inactive neighbor is picked with probability 1/(Δ − δ), and the system transits to a temporary configuration f̃.
• Step 3: Peer v chokes an active neighbor u with probability

exp(β(g_{f'} − g_f̃)) / Σ_{f''∈A_{v,f̃}} exp(β(g_{f''} − g_f̃)),  (18)

where f' = f̃ \ {(v, u)}. Peer v then repeats Step 1.
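The per-peer loop above (count-down, optimistic unchoke, softmax choke) can be sketched in Python. This is an illustrative sketch of ours, not the authors' implementation: the peer names, utilities g_{f'}, and parameter values are hypothetical. The softmax weights implement the choking probability (18), since the common factor exp(−β g_f̃) cancels from numerator and denominator:

```python
import math
import random

def sample_countdown(tau, Delta, delta):
    """Step 1: exponential count-down with mean 1 / (tau * (Delta - delta))."""
    return random.expovariate(tau * (Delta - delta))

def softmax_choke(utilities_after_choke, beta):
    """Step 3: choke active neighbor u with probability proportional to
    exp(beta * g_{f'}), where f' is the configuration left after choking u.
    This equals Eq. (18): the common exp(-beta * g_ftilde) factor cancels."""
    m = max(utilities_after_choke.values())  # shift for numerical stability
    weights = {u: math.exp(beta * (g - m))
               for u, g in utilities_after_choke.items()}
    total = sum(weights.values())
    r, acc = random.random() * total, 0.0
    for u, w in weights.items():
        acc += w
        if r <= acc:
            return u
    return u  # numerical fallback

random.seed(1)
t = sample_countdown(tau=0.5, Delta=40, delta=5)  # hypothetical parameters
# Hypothetical system utilities g_{f'} if peer v chokes each active neighbor.
g_after = {"peer_a": 3.0, "peer_b": 5.0, "peer_c": 4.0}
counts = {u: 0 for u in g_after}
for _ in range(10000):
    counts[softmax_choke(g_after, beta=1.0)] += 1
print(counts)  # the choice whose removal leaves the highest utility dominates
```

Note the sketch only simulates one peer's decision; the chain itself arises from all peers running this loop concurrently.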
Proposition 3. By the perfect Markov chain design, for any two configurations f, f' ∈ F satisfying the direct transition condition, the transition rates are exactly the desired ones in (16).

The proof is relegated to Appendix B.

However, one drawback of the design above is that it is hard to implement the perfect Markov chain in a distributed manner, since it requires every peer v ∈ V to know the global information g_{f'} − g_f̃ for every adjacent configuration f' ∈ A_{v,f̃}. In the following, we propose to use local estimation in place of the required global information. This leads to a fully distributed implementation named the "soft choking algorithm". However, we no longer implement the perfect Markov chain but a perturbed one, which we call the "BitTorrent Markov chain". We present the soft choking algorithm first, map it to the neighbor selection schemes of BitTorrent, and then characterize its performance.

C. BitTorrent Markov Chain Design

We keep the same topology construction of the state space as for the perfect Markov chain. However, we modify the direct transition rates as follows: for any two configurations f, f' ∈ F satisfying the direct transition condition,

q_{f,f'} = τ · exp(−β x^f̃_{f̃\f'}) / Σ_{f''∈A_{v(f,f'),f̃}} exp(−β x^f̃_{f̃\f''}),  (19)

q_{f',f} = τ · exp(−β x^f̃_{f̃\f}) / Σ_{f''∈A_{v(f,f'),f̃}} exp(−β x^f̃_{f̃\f''}),  (20)

where τ > 0 is a constant and f̃ = f ∪ f'. Note that the transition rates in (19) are a perturbation of the transition rates in (16), using the local measurement quantity −x^f̃_{f̃\f'} in place of the global information quantity g_{f'} − g_f̃. Consequently, the perturbed Markov chain can be implemented in a fully distributed manner, shown in Algorithm 1.

Algorithm 1: Soft Choking Algorithm
1: The following procedure runs on each individual peer independently. We focus on a particular peer v ∈ V.
2: procedure INITIALIZATION
3:   N_v^f ← δ neighbors randomly picked from N_v
4:   Build connections with these neighbors.
5: end procedure
6: procedure STEP 1: COUNT-DOWN PROCESS(v)
7:   Generate a timer T_v ∼ exp(τ(Δ − δ)) and begin counting down.
8: end procedure
9: procedure STEP 2: OPTIMISTIC UNCHOKING(v)
10:   Count-down expires.
11:   f̃ ← unchoke a new inactive neighbor w picked uniformly at random from peer v's potential neighbor set.
12: end procedure
13: procedure STEP 3: SOFT-WORST-NEIGHBOR-CHOKING(v)
14:   Measure the downloading rate x^f̃_vu from each active neighbor u ∈ N_v^f̃.
15:   Choke an active neighbor u with probability

        exp(−β x^f̃_vu) / Σ_{u'∈N_v^f̃} exp(−β x^f̃_vu'),  (21)

      where f' = f̃ \ {(v, u)}.
16: end procedure
17: Repeat Step 1.

We have the following result:

Proposition 4. By running the soft choking algorithm (Algorithm 1), for any two configurations f, f' ∈ F satisfying the direct transition condition, the transition rates are exactly the desired ones in (19).

The proof is similar to the proof of Proposition 3 and we omit the details.

D. Mapping Algorithm 1 to the BitTorrent Protocol

We now map Algorithm 1 (the implementation of the BitTorrent Markov chain) to the BitTorrent protocol.

First, Step 1 of Algorithm 1 is exactly the same as the initialization of the BitTorrent protocol.

Second, Step 2 of Algorithm 1 is nearly the same as the optimistic unchoking algorithm of the BitTorrent protocol. The difference lies in the distribution of the count-down time. The BitTorrent protocol used in practice adopts a constant count-down time, while Algorithm 1 adopts an exponential distribution of count-down time. As we will see later (Subsection IV-E), the stationary distribution of the Markov chain is insensitive to the distribution of the count-down time. In this sense, we can say Step 2 of Algorithm 1 performs the optimistic unchoking scheme.

Third, Step 3 of Algorithm 1 (Soft-Worst-Neighbor-Choking) is a generalization of the choking algorithm of the BitTorrent protocol, including the choking algorithm as a special case. In the soft-worst-neighbor-choking scheme, the lower the downloading rate, the more likely the corresponding active neighbor will be choked, and vice versa. As β → ∞, with probability 1, peer v chokes the neighbor with the worst downloading rate. Thus, the soft-worst-neighbor-choking scheme degenerates to the worst-neighbor-choking scheme in BitTorrent as β → ∞. Therefore, the choking algorithm of the BitTorrent protocol is an asymptotic version of our soft-worst-neighbor-choking algorithm (as β → ∞).

After building the mapping from Algorithm 1 to the BitTorrent protocol, we can investigate properties of the BitTorrent protocol by studying the properties of the corresponding Markov chain, including performance optimality, convergence and the impact of design parameters.
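The β → ∞ degeneration described above can be illustrated numerically. Here is a minimal sketch of the soft-worst-neighbor-choking probability in (21); the neighbor names and download rates are hypothetical (ours, not the paper's):

```python
import math

def soft_choking_probs(rates, beta):
    """Soft-worst-neighbor choking, Eq. (21): choke active neighbor u with
    probability exp(-beta * x_vu) / sum_{u'} exp(-beta * x_vu')."""
    m = min(rates.values())  # shift for numerical stability
    w = {u: math.exp(-beta * (x - m)) for u, x in rates.items()}
    z = sum(w.values())
    return {u: wu / z for u, wu in w.items()}

# Hypothetical measured download rates (kB/s) from peer v's active neighbors.
rates = {"peer_a": 120.0, "peer_b": 35.0, "peer_c": 80.0}
for beta in (0.0, 0.05, 1.0):
    print(beta, soft_choking_probs(rates, beta))
# beta = 0 chokes uniformly at random; as beta grows, the probability mass
# concentrates on "peer_b", the slowest neighbor, recovering BitTorrent's
# worst-neighbor choking in the limit beta -> infinity.
```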
E. Insensitivity of Count-down Time Distribution

For implementations of the perfect Markov chain and the BitTorrent Markov chain, the probability distributions of the count-down time are both exponential. However, in practice the count-down time can be constant. For example, the count-down time before optimistic unchoking is 30 seconds in real BitTorrent implementations [17]. In general, the distribution of the count-down time is not exponential, and the transitions of peer neighboring configurations no longer form a Markov chain. To analyze the stationary distribution of this non-Markov process, we adopt the supplementary variable method [37]–[39]. First, we extend the state to include both the configuration and the residual count-down times. The transitions of the extended states form a continuous Markov process. Second, we analyze its equilibrium distribution. Then we obtain the stationary distribution of configurations by averaging over the distribution of residual count-down times. We observe the following insensitivity result:

Theorem 1. In implementations of the perfect Markov chain, if we change the distribution of the count-down time from the exponential distribution to a general distribution and keep the same mean count-down time 1/(τ(Δ−δ)), then the stationary distribution of any configuration f ∈ F is still p*_f(g) in (12).

The proof is relegated to Appendix C. In the same way, similar insensitivity results can be obtained for implementations of the BitTorrent Markov chain.

We make the following remarks.
• The insensitivity property is very important because the count-down times in practical BitTorrent and other P2P system implementations are not exponential in general.
• As a corollary, the optimality gaps shown in (24) and (25) are also insensitive to the distribution of the count-down time, provided the mean count-down time remains 1/(τ(Δ−δ)).

F. Impacts of Local Perturbation

For any direct transition from f ∈ F to f′ ∈ F, the perturbation error is defined as

    ω_{f,f′} = [g_{f′} − g_{f̃}] − (−x_{f̃f′})    (22)

where f̃ = f ∪ f′. Recall that g(·) is the summation of all peers' aggregate downloading rates given a configuration, and f′ is obtained from f̃ by dropping one link. Therefore 0 ≤ g_{f̃} − g_{f′} ≤ x_{f̃f′}, and ω_{f,f′} ≥ 0. Without loss of generality, we assume ω_{f,f′} is bounded and takes values between 0 and Λ_max.

Perturbation errors are incurred by replacing the global knowledge [g_{f′} − g_{f̃}] with the local estimation −x_{f̃f′}; thus the stationary distribution of the BitTorrent Markov chain differs from that of the perfect Markov chain. Based on a recently developed analysis of two-dimensional perturbation errors [40], we have the following result:

Theorem 2. (a) The stationary distribution of the BitTorrent Markov chain is

    p̄_f(g) = σ_f exp(βg_f) / Σ_{f′∈F} σ_{f′} exp(βg_{f′}),  ∀f ∈ F    (23)

where σ_f = Σ_{k=0}^{n_f} ρ_{f,k} exp(β k Λ_max / n_f), ∀f ∈ F; here n_f is the level of quantization errors and ρ_{f,k} (0 ≤ k ≤ n_f) is the distribution of the quantized perturbation errors.

(b) Let g_max = max_{f∈F} g_f denote the optimal system utility, g*_ave = Σ_{f∈F} p*_f · g_f the expected system utility under the perfect Markov chain, and ḡ_ave = Σ_{f∈F} p̄_f · g_f the expected system utility under the BitTorrent Markov chain. Then the optimality gaps satisfy:

    0 ≤ g_max − g*_ave ≤ log|F| / β    (24)

    0 ≤ g_max − ḡ_ave ≤ log|F| / β + Λ_max    (25)

We omit the details here since the proof is very similar to that in [40]. We observe the following properties:
• When Λ_max = 0, i.e., all perturbation bounds are zero, the BitTorrent Markov chain degenerates into the perfect Markov chain.
• The upper bound on the optimality gap of the BitTorrent Markov chain shown in (25) is quite general, as it is independent of the values of n_f, f ∈ F, and of the distributions of the perturbation errors ρ_{f,k} (0 ≤ k ≤ n_f, f ∈ F).
• The upper bound on the optimality gap of the BitTorrent Markov chain grows linearly with the maximum perturbation error Λ_max.
• When β increases, the optimality gaps of both the perfect Markov chain and the BitTorrent Markov chain decrease.
• The upper bound on the optimality gap of the BitTorrent Markov chain is looser than its counterpart for the perfect Markov chain because of the perturbation errors. The difference is Λ_max, and we call it "the price of local perturbation".

Remarks: Existing perturbation theory for Markov chains [41], [42] is based on a general matrix-analysis approach and applies to arbitrary Markov chains. The obtained bounds may not be tight for time-reversible Markov chains. In contrast, we exploit the structure of time-reversible Markov chains and develop the extended Markov chain approach, which is dedicated to the perturbation analysis of time-reversible Markov chains.

G. Mixing Time of The Designed Markov Chain

The convergence time of BitTorrent mainly depends on the convergence time of the neighbor selection component, which can be characterized by the mixing time of a Markov random field. This is open in general. However, by the insensitivity results above, we can study the implementations with exponentially distributed count-down times, i.e., the perfect Markov chain and the BitTorrent Markov chain. We focus on the mixing time of the perfect Markov chain. By studying the mixing time, we can quantify the convergence time of BitTorrent protocols and explore the design space.
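The gap bound for the perfect Markov chain in (24) can be checked numerically on a toy state space. The sketch below is illustrative only: the per-configuration utilities g are made up, and the code simply forms the Gibbs-like stationary distribution p*_f ∝ exp(βg_f) from (12), computes the expected utility, and verifies that g_max − g*_ave ≤ log|F|/β.

```python
import math

def stationary(g, beta):
    """p*_f proportional to exp(beta * g_f), as in (12)."""
    m = max(g)  # log-sum-exp stabilization
    w = [math.exp(beta * (x - m)) for x in g]
    z = sum(w)
    return [x / z for x in w]

def optimality_gap(g, beta):
    """g_max minus the expected utility under the stationary distribution."""
    p = stationary(g, beta)
    g_ave = sum(pf * gf for pf, gf in zip(p, g))
    return max(g) - g_ave

# hypothetical per-configuration utilities (five toy configurations)
g = [3.0, 2.5, 2.5, 1.0, 0.5]
for beta in (0.5, 1.0, 5.0, 20.0):
    gap = optimality_gap(g, beta)
    # bound (24): 0 <= g_max - g*_ave <= log|F| / beta
    assert 0.0 <= gap <= math.log(len(g)) / beta + 1e-12
```

As β grows the distribution concentrates on the best configuration and the gap shrinks, matching the first remark after Theorem 3 below that large β trades convergence speed for optimality.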
Recall that p* in (12) is the stationary distribution of the perfect Markov chain. Let Hᵗ(f) denote the probability distribution over all states in F at time t given that the initial state is f. We define the mixing time of the perfect Markov chain as follows:

    t_mix(ε) ≜ inf{t ≥ 0 : max_{f∈F} d_TV(Hᵗ(f), p*) ≤ ε}    (26)

where the total variation distance between any two probability distributions p, p′ is defined as [13]:

    d_TV(p, p′) ≜ (1/2) Σ_{f∈F} |p_f − p′_f|    (27)

We have the following results.

Theorem 3. The mixing time t_mix(ε) of the perfect Markov chain is bounded as follows:
(a) For general β ∈ (0, ∞),

    t_mix(ε) ≥ [exp(−β(g_max − g_min)) / (2τ · δ^{n−1}(Δ−δ)ⁿ)] · ln(1/(2ε))    (28)

    t_mix(ε) ≤ [2δ^{n+1}(Δ−δ)ⁿ/τ] · C(Δ,δ)^{2n} · exp(5β(g_max − g_min)) · [ln(1/(2ε)) + (n/2)·ln C(Δ,δ) + (β/2)(g_max − g_min)]    (29)

where g_max = max_{f∈F} g_f, g_min = min_{f∈F} g_f, n = |V| denotes the number of peers in the system, and C(Δ,δ) denotes the binomial coefficient "Δ choose δ".
(b) When

    0 < β < [1/(g_max − g_min)] · ln[(1 + 1/δ)(1 + 1/(Δ−δ−1))],    (30)

we have a tighter upper bound:

    t_mix(ε) ≤ [1/(τ(Δ−δ))] · ln(nδ/ε) / [1 − (1 − 1/(Δ−δ)) · (δ/(δ+1)) · exp(β(g_max − g_min))]    (31)

The proof is relegated to Appendix D.

Remarks: We discuss the trade-off between the optimality gap (Theorem 2) and the mixing time (Theorem 3). We consider the two ends of this spectrum.
• As β → ∞, the optimality gap approaches zero while the upper bound on the mixing time scales as exp(Ω(n)) and approaches infinity (slow mixing).
• As β → 0, the optimality gap bound approaches infinity while the upper bound on the mixing time scales as O(log n) and remains limited (fast mixing).

This resembles the phase transition phenomenon in statistical physics: when β ≤ β_th the whole system is fast mixing, while when β > β_th the whole system is slow mixing. Here β_th is the threshold value for the phase transition. In our case, β_th = [1/(g_max − g_min)] · ln[(1 + 1/δ)(1 + 1/(Δ−δ−1))], a small value. As we will see later, β_th can be larger in experiments.

V. EXPERIMENTAL RESULTS

A. Setup and Purpose

TABLE II
DISTRIBUTION OF PEERS' UPLOAD BANDWIDTH

    Upload (kbps) | 256 | 378 | 512 | 768 | 1024 | 2048
    Fraction (%)  |  40 |   5 |   5 |   5 |    5 |   40

The system parameters are chosen as follows: number of peers |V| = 1000, peers' upload neighbor size Δ = 20, and peers' upload degree bound δ = 4. Two types of bottleneck exist in the network: peers' upload capacity and peers' download capacity. We assume the upload capacities follow the distribution shown in Table II; this distribution is chosen based on measured data from commercial P2P systems [43]. For simplicity, we assume all peers have the same download capacity of 512 kbps. TCP is run to facilitate rate allocation given any neighboring topology configuration, and the proposed neighbor selection algorithm runs on top of TCP. It is worth mentioning that any rate allocation algorithm can be used; our proposed solution is independent of the underlying rate control scheme.

With the above setup, we aim to demonstrate our algorithm's performance and compare the results with random neighbor choking (essentially a special case of our algorithm at β = 0), BitTorrent's worst neighbor choking (another special case when β = ∞), and no choking. We show that our proposed solution is able to achieve the theoretical upper bound of system utility. We then compare the convergence time at different β's and numbers of nodes |V|.

B. Effectiveness of Neighbor Choking

In Fig. 2(a), we show the system's total utility versus simulation time for the different algorithms. When β = 0, the scheme essentially applies random neighbor choking, in which peers randomly select a neighbor to choke. In the no-choking case, peers stick to their initial randomly-chosen neighbor connections without changing neighbors. In all the above cases, τ is chosen such that each peer has a constant count-down time of 30 seconds. In Fig. 2(b), our algorithm is shown for various β's. BitTorrent's worst-neighbor-choking algorithm can be thought of as a special case with β = ∞. From the above figures, we can see that the proposed solution achieves much better performance than the other algorithms, and achieves better performance at larger β's. All algorithms converge reasonably fast, in about 200 to 400 seconds. The scheme can also achieve close to the optimal solution when β is large.

C. Performance-Convergence Trade-off

In this section, we take a closer look at the convergence time of the algorithm for different β's and |V|'s while keeping the other parameters the same, where the convergence time is defined as the time after which the system's average utility stays within 0.01% of its current value. Fig. 2(c) shows that the convergence time increases almost linearly with β, and sub-linearly with the network size |V|. From Fig. 2(b), however, we see that the algorithm approaches very close to the optimal value when β = 10, and even in this case the convergence time is small (less than 500 seconds).
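The convergence behavior measured above can be mimicked numerically: definitions (26) and (27) translate directly into code. The sketch below uses a made-up 3-state transition matrix, not the BitTorrent chain, and reports the first step t at which the worst-case total variation distance to the stationary vector drops below ε.

```python
def tv_distance(p, q):
    """Total variation distance, (1/2) * sum |p_f - q_f|, as in (27)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def step(dist, P):
    """One step of the chain: row vector `dist` times matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def mixing_time(P, pi, eps, t_max=10_000):
    """Smallest t with max_f d_TV(H^t(f), pi) <= eps, as in (26)."""
    n = len(P)
    # start one point mass per initial state, to take the max over f
    rows = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    for t in range(t_max + 1):
        if max(tv_distance(r, pi) for r in rows) <= eps:
            return t
        rows = [step(r, P) for r in rows]
    raise RuntimeError("did not mix within t_max steps")

# toy symmetric, doubly stochastic chain: uniform is stationary
P = [[0.5, 0.3, 0.2],
     [0.3, 0.4, 0.3],
     [0.2, 0.3, 0.5]]
pi = [1 / 3, 1 / 3, 1 / 3]
t = mixing_time(P, pi, eps=1e-3)
assert t >= 1  # a point mass is far from uniform at t = 0
```

On the real neighbor-selection chain the state space F is exponentially large, which is exactly why Theorem 3 bounds t_mix analytically instead of enumerating states as this toy does.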
While the theoretical bounds provide guidance on the mixing time, in practice the algorithms perform much better, and the system will stabilize in less than 10 minutes for a reasonably large swarm size.

Fig. 2. (a) compares no choking, random choking, and soft-worst-neighbor choking; (b) compares different β's; (c) shows the convergence time for various β's and |V|'s (|V| = 100, 1000, 2000).

VI. CONCLUSION

In this paper, based on the Markov approximation framework, we reverse-engineered the BitTorrent protocol from a fresh perspective. We find that BitTorrent actually implements a Markov chain solving a challenging combinatorial optimization problem: maximize the aggregate downloading rates of all peers given the underlying physical edge capacity constraints and the limit on concurrent uploading connections. BitTorrent solves this problem by combining three components (rate control, piece selection and neighbor selection) operating at three separate time scales. After briefly discussing the rate-control component and the piece-selection component, we turn our focus to the neighbor-selection component.

By applying the Markov approximation framework, we design a perfect Markov chain that requires global information to approximately solve the above challenging combinatorial optimization problem in a distributed manner. We also design a BitTorrent Markov chain that requires only local information, obtained by a local perturbation of the perfect Markov chain. We map the implementation of the BitTorrent Markov chain to the BitTorrent protocol. Thus properties of the BitTorrent protocol can be analyzed by studying properties of the Markov chain.

To characterize the impacts of the Markov approximation and the local perturbation, we show bounds on the system utility gap, which depend on the approximation factor and the maximum perturbation error bound. We then show the insensitivity of the count-down time distribution, which covers the real BitTorrent scenario (constant count-down time). Further, we study the convergence time of the Markov chain through the mixing-time metric. By the conductance method and the coupling method, we obtain both upper and lower bounds on the mixing time. We also characterize the trade-off between the performance and the convergence of BitTorrent via the approximation gap and the mixing time of the designed Markov chain. This characterization provides insights for the future design of P2P systems. We present numerical results that validate our findings. A possible future step is to incorporate session-level stochastic dynamics of P2P systems, where each peer stays in the system for a finite time.

In summary, our work brings fresh perspectives into understanding and improving P2P system designs and other combinatorial network optimization problems.

REFERENCES

[1] J. Pouwelse, P. Garbacki, D. Epema, and H. Sips, "The bittorrent p2p file-sharing system: Measurements and analysis," in IPTPS, 2005, pp. 205–216.
[2] A. Bharambe, C. Herley, and V. Padmanabhan, "Analyzing and improving a bittorrent network's performance mechanisms," in Proc. of IEEE INFOCOM'06, 2006.
[3] A. Legout, G. Urvoy-Keller, and P. Michiardi, "Rarest first and choke algorithms are enough," in Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement, 2006, pp. 203–216.
[4] A. Legout, N. Liogkas, E. Kohler, and L. Zhang, "Clustering and sharing incentives in bittorrent systems," in Proceedings of the 2007 ACM SIGMETRICS, 2007, pp. 301–312.
[5] M. Chen, S. Liew, Z. Shao, and C. Kai, "Markov approximation for combinatorial network optimization," in Proc. of IEEE INFOCOM 2010, 2010, pp. 1–9.
[6] N. Laoutaris, D. Carra, and P. Michiardi, "Uplink allocation beyond choke/unchoke," in Proc. of ACM CoNEXT'08, 2008.
[7] X. Chen, M. Chen, B. Li, Y. Zhao, Y. Wu, and J. Li, "Celerity: A low delay multiparty conferencing solution," in Proc. of ACM Multimedia'11, 2011.
[8] F. Kelly, A. Maulloo, and D. Tan, "Rate control for communication networks: shadow prices, proportional fairness and stability," Journal of the Operational Research Society, vol. 49, no. 3, pp. 237–252, 1998.
[9] S. H. Low, "A duality model of tcp and queue management algorithms," IEEE/ACM Transactions on Networking, vol. 11, no. 4, pp. 525–536, 2003.
[10] M. Chiang, S. H. Low, A. Calderbank, and J. Doyle, "Layering as optimization decomposition: A mathematical theory of network architectures," Proceedings of the IEEE, vol. 95, no. 1, pp. 255–312, 2007.
[11] S. Zhang, Z. Shao, and M. Chen, "Optimal distributed p2p streaming under node degree bounds," in Proc. of IEEE ICNP 2010, 2010, pp. 253–262.
[12] D. Levin, Y. Peres, and E. Wilmer, Markov Chains and Mixing Times. American Mathematical Society, 2009.
[13] P. Diaconis and D. Stroock, "Geometric bounds for eigenvalues of Markov chains," The Annals of Applied Probability, vol. 1, no. 1, pp. 36–61, 1991.
[14] D. Qiu and R. Srikant, "Modeling and performance analysis of bittorrent-like peer-to-peer networks," in Proc. of ACM SIGCOMM 04, 2004, pp. 367–378.
[15] T. Bonald, L. Massoulié, F. Mathieu, D. Perino, and A. Twigg, "Epidemic live streaming: optimal performance trade-offs," in Proc. of ACM Sigmetrics'08, 2008, pp. 325–336.
[16] R. Xia and J. Muppala, "A survey of bittorrent performance," IEEE Communications Surveys & Tutorials, vol. 12, no. 2, pp. 140–158, 2010.
[17] B. Cohen, "Incentives build robustness in BitTorrent," in Workshop on Economics of Peer-to-Peer Systems, vol. 6, 2003, pp. 68–72.
[18] M. Izal, G. Urvoy-Keller, E. Biersack, P. Felber, A. Al Hamra, and L. Garces-Erice, "Dissecting bittorrent: Five months in a torrent's lifetime," Passive and Active Network Measurement, pp. 1–11, 2004.
[19] A. Bellissimo, B. Levine, and P. Shenoy, "Exploring the use of bittorrent as the basis for a large trace repository," University of Massachusetts Technical Report, pp. 04–41, 2004.
[20] N. Andrade, M. Mowbray, A. Lima, G. Wagner, and M. Ripeanu, "Influences on cooperation in bittorrent communities," in Proceedings of the 2005 ACM SIGCOMM Workshop on Economics of Peer-to-Peer Systems, 2005, pp. 111–115.
[21] M. Feldman, K. Lai, I. Stoica, and J. Chuang, "Robust incentive techniques for peer-to-peer networks," in Proceedings of the 5th ACM Conference on Electronic Commerce, 2004, pp. 102–111.
[22] S. Jun and M. Ahamad, "Incentives in bittorrent induce free riding," in Proceedings of the 2005 ACM SIGCOMM Workshop on Economics of Peer-to-Peer Systems. ACM, 2005, pp. 116–121.
[23] M. Piatek, T. Isdal, T. Anderson, A. Krishnamurthy, and A. Venkataramani, "Do incentives build robustness in bittorrent," in Proc. of NSDI, 2007.
[24] G. Neglia, G. Presti, H. Zhang, and D. Towsley, "A network formation game approach to study bittorrent tit-for-tat," Network Control and Optimization, pp. 13–22, 2007.
[25] D. Levin, K. LaCurts, N. Spring, and B. Bhattacharjee, "Bittorrent is an auction: analyzing and improving bittorrent's incentives," in Proc. of ACM SIGCOMM 2008, 2008, pp. 243–254.
[26] L. Massoulie and M. Vojnovic, "Coupon replication systems," IEEE/ACM Transactions on Networking (TON), vol. 16, no. 3, pp. 603–616, 2008.
[27] M. Lin, B. Fan, J. Lui, and D. Chiu, "Stochastic analysis of file-swarming systems," Performance Evaluation, vol. 64, no. 9-12, pp. 856–875, 2007.
[28] B. Fan, D. Chiu, and J. Lui, "The delicate tradeoffs in bittorrent-like file sharing protocol design," in Proc. of IEEE ICNP'06, 2006, pp. 239–248.
[29] P. Michiardi, K. Ramachandran, and B. Sikdar, "Modeling seed scheduling strategies in bittorrent," NETWORKING 2007. Ad Hoc and Sensor Networks, Wireless Networks, Next Generation Internet, pp. 606–616, 2007.
[30] H. Zhang, Z. Shao, M. Chen, and K. Ramchandran, "Optimal Neighbor Selection in BitTorrent-like Peer-to-Peer Networks," in Proceedings of ACM SIGMETRICS, 2011.
[31] A. Akella, S. Seshan, and A. Shaikh, "An empirical evaluation of wide-area internet bottlenecks," in Proc. of the 3rd ACM SIGCOMM Conference on Internet Measurement, 2003, pp. 101–114.
[32] N. Hu, L. Li, Z. Mao, P. Steenkiste, and J. Wang, "Locating internet bottlenecks: Algorithms, measurements, and implications," in ACM SIGCOMM, 2004.
[33] M. Garey and D. Johnson, Computers and Intractability: A Guide to the Theory of NP-completeness. WH Freeman & Co., New York, NY, USA, 1979.
[34] G. Ausiello, Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties. Springer Verlag, 1999.
[35] S. Liu, M. Chen, S. Sengupta, M. Chiang, J. Li, and P. Chou, "P2p streaming capacity under node degree bound," in Proc. of IEEE ICDCS 2010, 2010, pp. 587–598.
[36] F. Kelly, Reversibility and Stochastic Networks. Wiley, Chichester, 1979.
[37] B. Sevastyanov, "An ergodic theorem for Markov processes and its application to telephone systems with refusals," Theory of Probability and its Applications, vol. 2, p. 104, 1957.
[38] K. Chandy, J. Howard Jr, and D. Towsley, "Product form and local balance in queueing networks," Journal of the ACM (JACM), vol. 24, no. 2, pp. 250–263, 1977.
[39] S. Liew, C. Kai, H. Leung, and P. Wong, "Back-of-the-envelope computation of throughput distributions in CSMA wireless networks," IEEE Transactions on Mobile Computing, vol. 9, pp. 1319–1331, 2010.
[40] H. Zhang, Z. Shao, M. Chen, and K. Ramchandran, "An Optimized Video-on-Demand System Based on Distributed Caching," Technical Report, 2011.
[41] G. Cho and C. Meyer, "Comparison of perturbation bounds for the stationary distribution of a markov chain," Linear Algebra and its Applications, vol. 335, no. 1-3, pp. 137–150, 2001.
[42] A. Mitrophanov, "The spectral gap and perturbation bounds for reversible continuous-time markov chains," Journal of Applied Probability, vol. 41, no. 4, pp. 1219–1222, 2004.
[43] PPStream. http://www.ppstream.com/.
[44] R. Bubley and M. Dyer, "Path coupling: A technique for proving rapid mixing in markov chains," in FOCS, 1997, pp. 223–231.

APPENDIX

A. Proof for Proposition 2

By the direct transition condition, we know that all configurations can reach each other within a finite number of transitions; thus the constructed Markov chain is irreducible. Further, it is a finite-state ergodic Markov chain with a unique stationary distribution. We now show that the stationary distribution is indeed (12). Based on the transition rates specified in (16), we see that p*_f(g) q_{f,f′} = p*_{f′}(g) q_{f′,f}, ∀f, f′ ∈ F, i.e., the detailed balance equations hold. Thus the constructed Markov chain is time-reversible, and its stationary distribution is indeed (12) according to Theorem 1.3 and Theorem 1.14 in [36].

B. Proof for Proposition 3

We know that for any direct transition from f to f′, there exists a temporary state f̃ = f ∪ f′, in which a peer v unchokes a peer w and chokes a peer u. Since the count-down rate for peer v in f is τ(Δ−δ), the probability for peer v to unchoke w is 1/(Δ−δ), and the probability for peer v to choke u is

    exp(β(g_{f′} − g_{f̃})) / Σ_{f″∈Av(f,f̃)} exp(β(g_{f″} − g_{f̃}))    (32)

It follows that the transition rate from f to f′ is

    q_{f,f′} = τ(Δ−δ) · [1/(Δ−δ)] · exp(β(g_{f′} − g_{f̃})) / Σ_{f″∈Av(f,f̃)} exp(β(g_{f″} − g_{f̃}))    (33)
             = τ · exp(β(g_{f′} − g_{f̃})) / Σ_{f″∈Av(f,f̃)} exp(β(g_{f″} − g_{f̃}))    (34)

This concludes the proof.

C. Proof for Theorem 1

We suppose that within any configuration f ∈ F, the count-down times of the peers v ∈ V are i.i.d. with probability density function l_{v,f} and mean 1/(τ(Δ−δ)). Previously we modeled the peer neighboring state by the peer configurations f ∈ F. This model is complete under the exponential count-down time assumption because of the memoryless property of the exponential distribution. In general, for each configuration f ∈ F, we define an extended state Y_f = (f, {R_v(f), v ∈ V}), where R_v(f) ∈ [0, +∞), ∀v ∈ V, is the residual count-down time. R_v(f) decreases continuously at rate dR_v(f)/dt = −1. Since there are infinitely many possible values of R_v(f), the state space is infinite. Therefore Y = {Y_f, f ∈ F} is a continuous-state Markov process, and we denote by y = (f, {r_v(f), v ∈ V}) a realization.

Let p_Y(t, y) be the state probability density at time t. Its derivative with respect to t is

    dp_Y(t, y)/dt = lim_{Δt→0} [p_Y(t+Δt, y) − p_Y(t, y)] / Δt    (35)
During the time interval from t to t + Δt, the state changes as a result of peers finishing their count-down and peers continuing their count-down. There is only one type of jump event that causes a discontinuity in the evolution of y: a peer finishing its count-down. For small values of Δt, multiple jump events occur with probability of order o(Δt) and can be disregarded. Between jump events, peers continue counting down, in which case y changes continuously without f changing. For a particular realization y, we have

    p_Y(t + Δt, y) = A + B + o(Δt),    (36)

where A is the contribution due to count-down-to-zero jump events, B is the contribution due to ordinary counting down without any jump events, and lim_{Δt→0} o(Δt)/Δt = 0.

A. Let y = (f, {r_v(f)}_{v∈V}) be the state at t + Δt. Then we have

    A = Σ_{v∈V} p_Y(t, RC_v^{0+}(y)) · l_{v,f}(r_v(f)) · Δt    (37)

where RC_v^{0+}(y) is the operation that sets r_v(f) in y to 0+ (i.e., just before the count-down of peer v completes), and l_{v,f}(r_v(f)) is the probability density of a newly generated count-down time for peer v.

B. Let y = (f, {r_v(f)}_{v∈V}) be the state at t + Δt and suppose that no count-down-to-zero events occur during the interval from t to t + Δt. Then the state at time t must have been (f, {r_v(f) + Δt}_{v∈V}), for the r_v(f) decrease at rate −1. Therefore, the contribution is B = p_Y(t, (f, {r_v(f) + Δt}_{v∈V})). By expanding in a Taylor series about each r_v(f), we have

    B = p_Y(t, (f, {r_v(f) + Δt}_{v∈V}))    (38)
      = p_Y(t, y) + Σ_{v∈V} [∂p_Y(t, y)/∂r_v(f)] · Δt + o(Δt)    (39)

Putting (37) and (39) into (36), and applying the definition of the derivative in (35), we have

    dp_Y(t, y)/dt = Σ_{v∈V} p_Y(t, RC_v^{0+}(y)) l_{v,f}(r_v(f)) + Σ_{v∈V} ∂p_Y(t, y)/∂r_v(f).    (40)

In stationarity, the derivative with respect to time t vanishes, so dp_Y(t, y)/dt = 0, and we have the following balance equation:

    Σ_{v∈V} p_Y(RC_v^{0+}(y)) l_{v,f}(r_v(f)) + Σ_{v∈V} ∂p_Y(y)/∂r_v(f) = 0    (41)

Next, we will show that the stationary probability density of Y is

    p_Y(y) = p*_f · Π_{v∈V} [1 − ∫₀^{r_v(f)} l_{v,f}(t)dt] / [1/(τ(Δ−δ))]    (42)

where p*_f is given by (12).

In other words, we will show that the stationary probability density in (42) satisfies the balance equation (41). In fact, we will show that

    p_Y(RC_v^{0+}(y)) l_{v,f}(r_v(f)) + ∂p_Y(y)/∂r_v(f) = 0, ∀v ∈ V.    (43)

Under (42), ∀v ∈ V,

    −∂p_Y(y)/∂r_v(f)
    = −d/dr_v(f) { [1 − ∫₀^{r_v(f)} l_{v,f}(t)dt] / [1/(τ(Δ−δ))] · p*_f · Π_{v′∈V∖{v}} [1 − ∫₀^{r_{v′}(f)} l_{v′,f}(t)dt] / [1/(τ(Δ−δ))] }    (44)
    = l_{v,f}(r_v(f)) / [1/(τ(Δ−δ))] · p*_f · Π_{v′∈V∖{v}} [1 − ∫₀^{r_{v′}(f)} l_{v′,f}(t)dt] / [1/(τ(Δ−δ))]    (45)

On the other hand, in state y, ∀v ∈ V, it is not hard to see that the probability density function of the residual count-down time r_v(f) is [1 − ∫₀^{r_v(f)} l_{v,f}(t)dt] / [1/(τ(Δ−δ))] [37], [39]. Then we have

    p_Y(RC_v^{0+}(y)) l_{v,f}(r_v(f))
    = l_{v,f}(r_v(f)) · [1 − ∫₀^{0+} l_{v,f}(t)dt] / [1/(τ(Δ−δ))] · p*_f · Π_{v′∈V∖{v}} [1 − ∫₀^{r_{v′}(f)} l_{v′,f}(t)dt] / [1/(τ(Δ−δ))]    (46)
    = l_{v,f}(r_v(f)) / [1/(τ(Δ−δ))] · p*_f · Π_{v′∈V∖{v}} [1 − ∫₀^{r_{v′}(f)} l_{v′,f}(t)dt] / [1/(τ(Δ−δ))]    (47)

Thus, by comparing (45) and (47), we know that

    −∂p_Y(y)/∂r_v(f) = p_Y(RC_v^{0+}(y)) l_{v,f}(r_v(f)), ∀v ∈ V    (48)

Therefore, the stationary probability density in (42) satisfies the balance equation (41).

By integrating p_Y(y) in (42) over all possible values of r_v(f), ∀v ∈ V, we see that the stationary distribution of any configuration f ∈ F is p*_f in (12). This means the stationary distribution of configuration f is insensitive to the distribution of the count-down times. This concludes the proof.

D. Proof for Theorem 3

(a) The proof of part (a) is based on the spectral analysis method [12], [13].

The perfect Markov chain is a continuous-time Markov chain and its stationary distribution is

    p*_f(g) = exp(βg_f) / Σ_{f′∈F} exp(βg_{f′}), ∀f ∈ F.

Since Σ_{f∈F} exp(βg_f) ≤ |F| exp(βg_max) and |F| ≤ C(Δ,δ)ⁿ, where C(Δ,δ) denotes the binomial coefficient "Δ choose δ", the minimal probability in the stationary distribution satisfies

    p_min ≜ min_{f∈F} p*_f(g) ≥ exp(βg_min) / (|F| · exp(βg_max))    (49)
          ≥ C(Δ,δ)^{−n} · exp(−β(g_max − g_min))    (50)

To utilize existing bounds on the convergence to the stationary distribution of a discrete-time Markov chain, we uniformize
the perfect Markov chain. Uniformization plays the role of a bridge between discrete-time and continuous-time Markov chains.

Denote by Q = {q_{f,f′}} the transition rate matrix of the perfect Markov chain. Construct a discrete-time Markov chain Z(n) with probability transition matrix P = I + Q/θ, where I is the identity matrix. Then consider a system in which the successive states visited form the Markov chain Z(n), and the times at which the system changes its state form a Poisson process N(t). Here N(t) is an independent Poisson process with rate θ. The state of this system at time t is denoted by Z(N(t)), which is called a subordinated Markov chain.

Let

    θ = (τ/δ) · δⁿ(Δ−δ)ⁿ · exp(β(g_max − g_min))    (51)

where g_min = min_{f∈F} g_f and g_max = max_{f∈F} g_f.

Since, ∀f, f′ ∈ F,

    q_{f,f′} ≤ τ · exp(β(g_{f′} − g_{f̃})) / Σ_{f″∈Av(f,f̃)} exp(β(g_{f″} − g_{f̃}))    (52)
             = τ · exp(βg_{f′}) / Σ_{f″∈Av(f,f̃)} exp(βg_{f″})    (53)
             ≤ (τ/δ) · exp(βg_max)/exp(βg_min) = (τ/δ) · exp(β(g_max − g_min))    (54)

and f can transit to at most [δ(Δ−δ)]ⁿ other states, we have Σ_{f′≠f} q_{f,f′} ≤ δⁿ(Δ−δ)ⁿ · (τ/δ) · exp(β(g_max − g_min)) = θ. Then, by the uniformization theorem [12], the perfect Markov chain and its discrete-time counterpart Z(N(t)) have the same distribution. Further, p* is also the stationary distribution of Z(n).

Let ρ₂ denote the second largest eigenvalue of the transition matrix P of the Markov chain Z(n). By the spectral gap inequality [12], [13], we have

    exp(−θ(1−ρ₂)t)/2 ≤ max_{f∈F} d_TV(Hᵗ(f), p*) ≤ exp(−θ(1−ρ₂)t)/(2·√p_min)    (55)

Therefore,

    [1/(θ(1−ρ₂))] · ln(1/(2ε)) ≤ t_mix(ε) ≤ [1/(θ(1−ρ₂))] · [ln(1/(2ε)) + (1/2)·ln(1/p_min)]    (56)

Now we bound ρ₂ by Cheeger's inequality [12], [13]:

    1 − 2Φ ≤ ρ₂ ≤ 1 − Φ²/2    (57)

where Φ is the "conductance" of P, defined as

    Φ ≜ min_{N⊂F, π_N(g)∈(0,1/2]} F(N, N^c)/π_N(g)    (58)

Here π_N(g) = Σ_{f∈N} p*_f(g) and F(N, N^c) = Σ_{f∈N, f′∈N^c} p*_f(g) P(f, f′).

Combining (56) and (57), we have

    [1/(2θΦ)] · ln(1/(2ε)) ≤ t_mix(ε) ≤ [2/(θΦ²)] · [ln(1/(2ε)) + (1/2)·ln(1/p_min)]    (59)

First, we give an upper bound on Φ. For any N ⊂ F with π(N) ∈ (0, 1/2],

    Φ = min_{N⊂F, π(N)∈(0,1/2]} F(N, N^c)/π_N(g)    (60)
      ≤ [1/π_N(g)] · Σ_{f∈N, f′∈N^c} p*_f(g) P(f, f′)    (61)
      = [1/π_N(g)] · Σ_{f∈N} p*_f(g) · (Σ_{f′∈N^c} P(f, f′))    (62)
      ≤ [1/π_N(g)] · Σ_{f∈N} p*_f(g)    (63)
      = 1    (64)

Combining (51), (59) and (64), we have

    t_mix(ε) ≥ [1/(2θ)] · ln(1/(2ε))    (65)
             = [exp(−β(g_max − g_min)) / (2τ·δ^{n−1}(Δ−δ)ⁿ)] · ln(1/(2ε))    (66)

Now we give a lower bound on Φ. When q_{f,f′} ≠ 0, f′ ∈ F, by (16),

    q_{f,f′} = τ · exp(β(g_{f′} − g_{f̃})) / Σ_{f″∈Av(f,f̃)} exp(β(g_{f″} − g_{f̃}))    (67)
             = τ · exp(βg_{f′}) / Σ_{f″∈Av(f,f̃)} exp(βg_{f″})    (68)
             ≥ (τ/δ) · exp(βg_min)/exp(βg_max) = (τ/δ) · exp(−β(g_max − g_min))    (69)

Combining (58) and (69), we have

    Φ ≥ min_{N⊂F, π(N)∈(0,1/2]} F(N, N^c)    (70)
      ≥ min_{f≠f′, P(f,f′)>0} F(f, f′)    (71)
      = min_{f≠f′, P(f,f′)>0} p*_f(g) P(f, f′)    (72)
      = min_{f≠f′, P(f,f′)>0} p*_f(g) · q_{f,f′}/θ    (73)
      ≥ (p_min · τ)/(θδ) · exp(−β(g_max − g_min))    (74)

Combining (59), (50) and (51), we have

    t_mix(ε) ≤ [2/(θΦ²)] · [ln(1/(2ε)) + (1/2)·ln(1/p_min)]    (75)
             ≤ [2θδ² exp(2β(g_max − g_min))/(τ² p_min²)] · [ln(1/(2ε)) + (1/2)·ln(1/p_min)]    (76)
             ≤ [2θδ²/τ²] · C(Δ,δ)^{2n} · exp(4β(g_max − g_min)) · [ln(1/(2ε)) + (n/2)·ln C(Δ,δ) + (β/2)(g_max − g_min)]    (77)
             ≤ [2δ^{n+1}(Δ−δ)ⁿ/τ] · C(Δ,δ)^{2n} · exp(5β(g_max − g_min)) · [ln(1/(2ε)) + (n/2)·ln C(Δ,δ) + (β/2)(g_max − g_min)]    (78)
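Cheeger's inequality (57) and the conductance (58) can be sanity-checked numerically on a small reversible chain. The sketch below uses a toy symmetric 3-state transition matrix with hand-picked entries, not the neighbor-selection chain; it enumerates all subsets to compute Φ, obtains the second eigenvalue by power iteration after deflating the top (all-ones) eigenvector, and checks 1 − 2Φ ≤ ρ₂ ≤ 1 − Φ²/2.

```python
import itertools

def conductance(P, pi):
    """Conductance (58): min over N with 0 < pi(N) <= 1/2 of F(N, N^c)/pi(N)."""
    n = len(pi)
    best = float("inf")
    for r in range(1, n):
        for S in itertools.combinations(range(n), r):
            pi_S = sum(pi[i] for i in S)
            if not 0.0 < pi_S <= 0.5:
                continue
            flow = sum(pi[i] * P[i][j] for i in S for j in range(n) if j not in S)
            best = min(best, flow / pi_S)
    return best

def second_eigenvalue(P, iters=1000):
    """Power iteration for a symmetric stochastic matrix: project out the
    all-ones top eigenvector, then return the Rayleigh quotient."""
    n = len(P)
    v = [float(i) - (n - 1) / 2.0 for i in range(n)]  # generic start vector
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]                      # deflate ones-vector
        w = [sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    Pv = [sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(v, Pv))

# toy symmetric (hence reversible, uniform-stationary) 3-state chain
P = [[0.6, 0.2, 0.2],
     [0.2, 0.6, 0.2],
     [0.2, 0.2, 0.6]]
pi = [1 / 3, 1 / 3, 1 / 3]
phi = conductance(P, pi)
rho2 = second_eigenvalue(P)
# Cheeger's inequality (57)
assert 1 - 2 * phi - 1e-6 <= rho2 <= 1 - phi ** 2 / 2 + 1e-6
```

The brute-force subset enumeration is exponential in the number of states, which again motivates the analytic bounds (60)–(74) rather than direct computation on F.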
This concludes the proof of part (a). Next, we show the proof of part (b).

(b) The proof of part (b) is based on the coupling method [12].

First, we obtain a discrete-time Markov chain by uniformization of the continuous-time perfect Markov chain. Denote this discrete-time Markov chain by M. M is designed to sample from the given probability distribution p* in (12) on the state space F. At each step, it selects a peer uniformly at random and modifies that peer's active neighbor set. More precisely, in a feasible configuration f ∈ F, it does the following:
1) pick w ∈ V uniformly at random (with probability 1/n = 1/|V|);
2) unchoke a new neighbor v from peer w's inactive neighbor set uniformly at random (with probability 1/(Δ−δ)); the system transits to a temporary configuration f̃;
3) choke an active neighbor u of peer w with probability

    exp(β(g_{f′} − g_{f̃})) / Σ_{f″∈Av(f,f̃)} exp(β(g_{f″} − g_{f̃})),    (79)

where f̃ = f ∪ {(w, v)} and f′ = f̃ ∖ {(w, u)}.

It can be shown that M has transition matrix P = I + Q/θ, where I is the identity matrix and θ = nτ(Δ−δ).

Now we apply the coupling method to bound the mixing time of M. By a "coupling" for this chain we mean a joint stochastic process (X_t, Y_t) on F × F such that each of the processes (X_t) and (Y_t) is a Markov chain on F with transition matrix P. Typically, after defining a distance metric d : F × F → {0, 1, ..., d_max}, we try to construct a one-step distance-decreasing coupling (X₀, Y₀) → (X₁, Y₁) such that

    E(d(X₁, Y₁) | X₀, Y₀) ≤ α · d(X₀, Y₀)    (80)

for all (X₀, Y₀) ∈ F × F, where 0 ≤ α < 1. Applying this coupling iteratively results in a t-step coupling and a mixing time analysis.

In general, defining and analyzing a coupling for all pairs X_t, Y_t ∈ F is difficult. The path coupling technique [45] simplifies the approach by restricting attention to pairs in a connected subset S ⊆ F × F. It then suffices to define a one-step coupling such that (80) holds for all (X₀, Y₀) ∈ S. The path coupling theorem [45] then constructs, via simple compositions, a one-step coupling satisfying (80) for all X₀, Y₀ ∈ F.

Given any two configurations X, Y ∈ F, let d(X, Y) denote the Hamming distance between X and Y, which equals the number of differing node pairs. We denote by S the set of configuration pairs X, Y ∈ F that differ at exactly one peer-neighbor pair:

    S = {(X, Y) ∈ F × F : d(X, Y) = 1}.    (81)

For any peer v ∈ V, denote by v^X the set of pairs of active neighbors under configuration X. For example, if peer v has 3 active neighbors j, k, l under X, then v^X = {(v, j), (v, k), (v, l)}. Now we design a one-step coupling.

More precisely, consider a configuration pair (X₀, Y₀) ∈ S. Without loss of generality, we have

    X₀ = (v₁^{X₀}, ..., v_n^{X₀})    (82)
    Y₀ = (v₁^{Y₀}, ..., v_n^{Y₀})    (83)

where v_j^{X₀} = v_j^{Y₀}, ∀j = 2, ..., n, and

    v₁^{X₀} = {(v₁, a), (v₁, z₁), ..., (v₁, z_{δ−1})}    (84)
    v₁^{Y₀} = {(v₁, b), (v₁, z₁), ..., (v₁, z_{δ−1})}    (85)

A peer w ∈ V is chosen uniformly at random; at every step, both chains update the same peer w. The coupling for the update at time 1 is a grand coupling (X₀, Y₀) → (X̂, Ŷ) → (X₁, Y₁), where (X₀, Y₀) → (X̂, Ŷ) denotes the unchoking operation and (X̂, Ŷ) → (X₁, Y₁) denotes the choking operation. Let w^{X₀}(+) (w^{Y₀}(+)) denote the peer unchoked by w under X₀ (Y₀), and let w^{X̂}(−) (w^{Ŷ}(−)) denote the peer choked by w under X̂ (Ŷ). The coupling is as follows:

(1) If w ≠ v₁, then we can make identical updates at peer w for both chains. This leads to w^{X₀}(+) = w^{Y₀}(+), w^{X̂}(−) = w^{Ŷ}(−), and d(X₁, Y₁) = 1.

(2) Otherwise, w = v₁.

(2)-1. If v₁^{X₀}(+) = b, then we set v₁^{Y₀}(+) = a. This leads to X̂ = Ŷ. So next we can make v₁^{X̂}(−) = v₁^{Ŷ}(−), and we have d(X₁, Y₁) = 0.

(2)-2. Else, if v₁^{X₀}(+) = c ≠ b for some inactive neighbor c of v₁ under X₀, then we set v₁^{Y₀}(+) = c. In this case, we have

    v₁^{X̂} = {(v₁, a), (v₁, z₁), ..., (v₁, z_{δ−1}), (v₁, c)}    (86)
    v₁^{Ŷ} = {(v₁, b), (v₁, z₁), ..., (v₁, z_{δ−1}), (v₁, c)}.    (87)

For convenience, we denote z_δ = c. Let p_k, q_k be the probabilities of peer v₁ choking neighbor z_k under X̂ and Ŷ respectively, 1 ≤ k ≤ δ. We also denote by p₀ (q₀) the probability of peer v₁ choking neighbor a (b) under X̂ (Ŷ). We then take r_k = min{p_k, q_k} for every k ∈ {0, ..., δ}. Now we define a random variable H satisfying

    Pr(H = i) = r_i,                    if 0 ≤ i ≤ δ,
                1 − Σ_{j=0}^δ r_j,      if i = δ + 1,
                0,                      otherwise.    (88)

We update X̂, Ŷ according to the following rules:
1) If H = 0, then v₁^{X̂}(−) = a and v₁^{Ŷ}(−) = b.
2) If H = i, where 1 ≤ i ≤ δ, then v₁^{X̂}(−) = v₁^{Ŷ}(−) = z_i.
3) If H = δ + 1, then update X̂, Ŷ independently:
   a) Pr(v₁^{X̂}(−) = a | H = δ+1) = (p₀ − r₀)/(1 − Σ_{j=0}^δ r_j);
   b) Pr(v₁^{Ŷ}(−) = b | H = δ+1) = (q₀ − r₀)/(1 − Σ_{j=0}^δ r_j);
   c) Pr(v₁^{X̂}(−) = z_i | H = δ+1) = (p_i − r_i)/(1 − Σ_{j=0}^δ r_j), ∀i ∈ {1, ..., δ};
14.
14 ˆ qi −ri d) P r(v1 (−) = zi |H = δ + 1) = Y 1− δ rj , ∀i ∈ Therefore, j=0 {1, . . . , δ} E(d(X1 , Y1 ) − 1|w = v1 , X0 , Y0 , (2) − 2, c) (94) ˆ ˆ It is not hard to show the one-step coupling designed = P (A)(−1) + P ((v1 (−) = v1 (−)) ∩ D) X Y (95) ˆ ˆabove is a valid coupling. Now we analyze distance metric = −r0 + (P (D) − P ((v1 (−) = v1 (−)) ∩ D)) X Y (96)E(d(X1 , Y1 ) − 1|X0 , Y0 ) = 0. δ =1− rj − p0 − q0 + r0 (97) We know that E(d(X1 , Y1 ) − 1|X0 , Y0 , w = v1 ) = 0. We j=0also can see that scenario (2) − 1 happens with probability δ 1Δ−δ and corresponding distance d(X 1 , Y1 ) = 0. On the other = (1 − p0 ) + (1 − q0 ) − rj − 1 (98)hand, given any c = a, b, scenario (2) − 2 also happens with j=1 1probability Δ−δ . Under this scenario, we know that δ = (pj + qj − rj ) − 1 (99) j=1 ˆ ˆ • Case A. v1 (−) = a and v1 (−) = b, then d(X1 , Y1 ) = 0. X Y δ ˆ ˆ • Case B. v1 (−) = a and v1 (−) = b , then d(X1 , Y1 ) = X Y = max(pj , qj ) − 1 (100) 1. j=1 ˆ ˆ • Case C. v1 (−) = a and v1 (−) = b , then d(X1 , Y1 ) = X Y δ 1 1. ≤ exp(β(gmax − gmin )) − 1 (101) ˆ ˆ δ+1 • Case D. v1 (−) = a and v1 (−) = b , then d(X1 , Y1 ) = 1 X Y j=1 ˆ ˆ ˆ if v1 (−) = v1 (−) and d(X1 , Y1 ) = 2 if v1 (−) = X Y X δ ˆ = exp(β(gmax − gmin )) − 1 (102) v1 (−). 
Y δ+1 It follows that Since probability of case A is P (A) = r 0 and probability E[d(X1 , Y1 ) − 1|X0 , Y0 ] (103)of case D is = P (w = v1 ) · 0 + P (w = v1 ) · E[d(X1 , Y1 ) − 1|X0 , Y0 , w = v1 ] (104) 1 = · E[d(X1 , Y1 ) − 1|X0 , Y0 , w = v1 ] (105) n 1 1 E(d(X1 , Y1 ) − 1|w = v1 , X0 , Y0 , (2) − 2, c)] = · [− + ]P (D) (89) n Δ−δ Δ−δ c=a δ δ (106) p0 − r0 q0 − r0= rj + (1 − rj )(1 − δ )(1 − δ ) 1 1 1 δ j=1 j=0 1− j=0 rj 1− j=0 rj ≤ · [− + (1 − )·( exp(β(gmax − gmin)) − 1)] n Δ−δ Δ−δ δ+1 (90) (107) δ δ p0 − r0 q0 − r0 1 1 δ= rj + (1 − rj )(1 − − ) = · [−1 + (1 − )·( exp(β(gmax − gmin)))] 1− δ 1− δ n Δ−δ δ+1 j=1 j=0 j=0 rj j=0 rj (108) (91) δ δ For convenience, let= rj + (1 − rj − p0 − q0 + 2r0 ) (92) 1 δ K = 1 − (1 − )·( exp(β(gmax − gmin ))) j=1 j=0 Δ−δ δ+1 (109) Then when , 1 1 1 0<β< ln[(1 + )(1 + )], (110) gmax − gmin δ Δ−δ−1where we utilize the fact that (p i − ri )(qi − ri ) = 0, ∀i ∈{0, . . . , δ}. we have K > 0. It follows that for any (X 0 , Y0 ) ∈ S K K On the other hand, E[d(X1 , Y1 )|X0 , Y0 ] < 1 − = (1 − ) · d(X0 , Y0 ). n n (111) By path coupling theorem [45] we know that for any (X0 , Y0 ) ∈ F × F , δ K ˆ ˆ P ((v1 (−) = v1 (−)) ∩ D) = X Y rj (93) E[d(X1 , Y1 )|X0 , Y0 ] < (1 − ) · d(X0 , Y0 ) = λ · d(X0 , Y0 ). n j=1 (112)
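The coupling given by (88) and rules 1)–3) can be checked numerically. The sketch below uses a simplified, hypothetical setting: two explicit choking distributions `p` and `q` over $\delta + 1 = 4$ alternatives (placeholder values, not derived from the protocol). It verifies that the coupled draw preserves each chain's marginal choking distribution, and that the two chains disagree only with the leftover probability $1 - \sum_k r_k$ — indeed, since $(p_i - r_i)(q_i - r_i) = 0$, they can never agree on the residual branch, so this is exactly the total variation distance between $p$ and $q$.

```python
import random

def coupled_choke(p, q, rng):
    """One coupled draw of the neighbor choked by v1 under X and under Y,
    following the coupling of (88) and update rules 1)-3).  Index 0 plays
    the role of neighbor a (for X) / b (for Y); indices 1..delta are the
    common neighbors z_1, ..., z_delta."""
    r = [min(pk, qk) for pk, qk in zip(p, q)]          # r_k = min{p_k, q_k}
    # Draw H according to (88): values 0..delta with prob r_k, else delta+1.
    h = rng.choices(range(len(p) + 1), weights=r + [1.0 - sum(r)])[0]
    if h < len(p):                                      # rules 1) and 2): chains agree
        return h, h
    # Rule 3): with the leftover probability, the chains move independently,
    # each according to its residual distribution (p - r) resp. (q - r).
    x = rng.choices(range(len(p)), weights=[pk - rk for pk, rk in zip(p, r)])[0]
    y = rng.choices(range(len(q)), weights=[qk - rk for qk, rk in zip(q, r)])[0]
    return x, y

rng = random.Random(0)
p = [0.4, 0.3, 0.2, 0.1]   # hypothetical choking probabilities under X
q = [0.1, 0.2, 0.3, 0.4]   # hypothetical choking probabilities under Y
n_draws = 200_000
draws = [coupled_choke(p, q, rng) for _ in range(n_draws)]
marg_x = [sum(x == k for x, _ in draws) / n_draws for k in range(4)]
marg_y = [sum(y == k for _, y in draws) / n_draws for k in range(4)]
mismatch = sum(x != y for x, y in draws) / n_draws     # should be ~ 1 - sum(r) = 0.4
```

The validity of the coupling is visible here: each marginal chain still chokes according to its own transition probabilities, while the joint construction maximizes the chance that both chains choke the same neighbor.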
where $\lambda = 1 - \frac{K}{n}$.

Applying this one-step coupling iteratively results in a $t$-step coupling, and we have, for any $t$ and $(X_t, Y_t) \in \mathcal{F} \times \mathcal{F}$,

$$P[X_t \neq Y_t] = P[d(X_t, Y_t) \ge 1] \tag{113}$$
$$\le E[d(X_t, Y_t)] \tag{114}$$
$$\le \lambda^t \cdot \mathrm{diam}(\mathcal{F}) \tag{115}$$
$$\le n\delta \cdot \lambda^t. \tag{116}$$

Thus, for the discrete-time Markov chain $\mathcal{M}$,

$$d_{TV}\big(P^t(X_0, \cdot), P^t(Y_0, \cdot)\big) \le n\delta \cdot \lambda^t. \tag{117}$$

Then by the uniformization theorem [12], we know that for any $f \in \mathcal{F}$,

$$d_{TV}\big(H^t(f), p^*\big) = d_{TV}\Big[\sum_{j=0}^{\infty} \frac{(\theta t)^j}{j!} \exp(-\theta t)\, P^j(f, \cdot),\; p^*\Big] \tag{118}$$
$$\le \sum_{j=0}^{\infty} \frac{(\theta t)^j}{j!} \exp(-\theta t)\, d_{TV}\big(P^j(f, \cdot), p^*\big) \tag{119}$$
$$\le n\delta \cdot \exp(-\theta t) \sum_{j=0}^{\infty} \frac{(\theta t \lambda)^j}{j!} \tag{120}$$
$$= n\delta \cdot \exp(-\theta(1 - \lambda)t) \tag{121}$$
$$= n\delta \cdot \exp\Big(-\theta \cdot \frac{Kt}{n}\Big) \tag{122}$$
$$= n\delta \cdot \exp(-\tau(\Delta - \delta) \cdot Kt). \tag{123}$$

Thus we have

$$t_{\mathrm{mix}}(\epsilon) \le \frac{\ln \frac{n\delta}{\epsilon}}{\tau(\Delta - \delta)\Big(1 - \big(1 - \frac{1}{\Delta - \delta}\big)\big(\frac{\delta}{\delta + 1} \exp(\beta(g_{\max} - g_{\min}))\big)\Big)}. \tag{124}$$

This concludes the proof of part (b).
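Both the mixing-time bound (124) and condition (110) on $\beta$ are closed-form, so they are straightforward to evaluate. A minimal sketch follows, with illustrative placeholder parameters ($n$, $\delta$, $\Delta$, $\tau$, $\beta$, $g_{\max}$, $g_{\min}$ are assumptions chosen for demonstration, not values from the paper):

```python
import math

def beta_threshold(g_max, g_min, delta, Delta):
    """Right-hand side of condition (110): beta values below this keep K > 0."""
    return math.log((1 + 1/delta) * (1 + 1/(Delta - delta - 1))) / (g_max - g_min)

def t_mix_bound(eps, n, delta, Delta, tau, beta, g_max, g_min):
    """Upper bound on t_mix(eps) from (124), with K as defined in (109)."""
    K = 1 - (1 - 1/(Delta - delta)) * (delta / (delta + 1)) \
          * math.exp(beta * (g_max - g_min))
    if K <= 0:
        raise ValueError("beta violates condition (110); the bound does not apply")
    return math.log(n * delta / eps) / (tau * (Delta - delta) * K)

# Illustrative (hypothetical) parameters: n peers, delta unchoke slots,
# Delta neighbors per peer, transition rate tau, utility spread g_max - g_min.
n, delta, Delta, tau = 1000, 4, 40, 1.0
g_max, g_min = 1.0, 0.0
beta = 0.1   # below the (110) threshold, which is about 0.251 here
bound = t_mix_bound(0.01, n, delta, Delta, tau, beta, g_max, g_min)
```

Consistent with (109) and (124), pushing $\beta$ toward the threshold in (110) drives $K$ toward $0$ and the mixing-time bound grows without limit.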