Cross layer design in multihop wireless networks


Published on

Published in: Technology, Business
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Cross layer design in multihop wireless networks

  1. 1. Cross-Layer Design in Multihop Wireless Networks Lijun Chen, Steven H. Low, John C. Doyle Engineering and Applied Science Division, California Institute of Technology, Pasadena, CA 91125, USAAbstractIn this paper, we take a holistic approach to the protocol architecture design in multihop wireless networks. Our goal is to integratevarious protocol layers into a rigorous framework, by regarding them as distributed computations over the network to solve someoptimization problem. Different layers carry out distributed computation on different subsets of the decision variables using localinformation to achieve individual optimality. Taken together, these local algorithms (with respect to different layers) achieve aglobal optimality. Our current theory integrates three functions—congestion control, routing and scheduling—in transport, networkand link layers into a coherent framework. These three functions interact through and are regulated by congestion price so as toachieve a global optimality, even in a time-varying environment. Within this context, this model allows us to systematically derivethe layering structure of the various mechanisms of different protocol layers, their interfaces, and the control information that mustcross these interfaces to achieve a certain performance and robustness.Keywords: Theory-based approach, Cross-layer design, Dual decomposition, Multihop wireless networks1. Introduction optimization problem. This approach and the associated utility maximization problem were originally proposed as an analyti- The success of communication network has largely been a cal tool for reverse engineering TCP congestion control where aresult of adopting a layered architecture. With this architecture, network with fixed link capacities and prespecified routes is im-its design and implementation is divided into simpler modules plicitly assumed. A natural framework for cross-layer design isthat are separately designed and implemented and then inter- then to extend the basic utility maximization problem to includeconnected. A protocol stack typically has five layers, applica- decision variables of other layers, and seek a decompositiontion, transport (TCP), network (IP), data link (include MAC) such that different layers carry out distributed computation onand physical layer. Each layer controls a subset of the deci- different subsets of decision variables using local informationsion variables, hides the complexity of the layer below and pro- to achieve individual optimality, and taken together, these lo-vides well-defined services to the layer above. Together, they cal algorithms achieve the global optimality. This approach hasallocate networked resources to provide a reliable and usually come to be known as layering-as-optimization-decomposition;best-effort communication service to a large pool of competing see [6] for an extensive survey.users. However, the layered structure addresses only abstract and We apply this approach to design an overall frameworkhigh-level aspects of the whole network protocol design. Var- for the protocol architecture in multihop wireless networks,ious layers of the network are put together often in an ad hoc with the goal of achieving efficient resource allocation throughmanner, and might not be optimal as a whole. In order to im- cross-layer design. We first consider the network with fixedprove the performance and achieve efficient resource allocation, channel or single-rate devices, and formulate network resourcewe need to understand interactions across layers and carry out allocation as a utility maximization problem with rate con-cross-layer design. Moreover, in wireless networks there does straints at the network layer and schedulability constraints atnot exist a good interface between the physical and network the link layer. We then apply duality theory to decompose thelayers. Wireless links are unreliable and wireless nodes usually system problem vertically into congestion control, routing andrely on random access mechanism to access wireless channel. scheduling subproblems that interact through congestion prices.Thus, the performance of link layer is not guaranteed, which Based on this decomposition, a distributed subgradient algo-will result in performance problems for the whole network such rithm for joint congestion control, routing and scheduling isas degraded TCP performance. So, we need cross-layer design, obtained, and proved to approach arbitrarily close to the opti-i.e., to exchange information between physical/link layer with mum of the system problem. We next extend the dual subgradi-higher layers in order to achieve better performance. ent algorithm to wireless multihop networks with time-varying Motivated by the duality model of TCP congestion control channels and adaptive multi-rate devices. The stability of the[16] [22] [18] [23], one approach to understand interactions resulting system is proved, and its performance is characterizedacross layers is to view the network as an optimization solver with respect to an ideal reference system. We finally apply theand various protocol layers as distributed algorithms solving an general algorithm to the joint congestion control and mediumPreprint submitted to Computer Networks May 12, 2010
  2. 2. access control design over the network with single-path routing less networks; see, e.g., [10], [37], [45], [46], [47]. The modeland to the cross-layer congestion control, routing and schedul- used in section 7 (see also [4]) is motivated by Neely et al. [26]ing design in the network without prespecified paths. that studies dynamic power control and routing for time-varying Our current theory integrates three functions— congestion wireless networks and by Hajeck et al. [11] and Kodialam atcontrol, routing and scheduling—in transport, network and link al. [17] that study joint routing and scheduling to determine thelayers into a coherent framework. While the integration of all achievable rates in multi-hop wireless networks; similar decom-protocol components remain a big challenge, this framework is position for the network with deterministic wireless channel haspromising to be extended to provide a mathematical theory for also been revealed in the journal version of [26] and in [20].network architecture, and allow us to systematically derive the The utility maximization in time-varying wireless networkslayering structure of the various mechanisms of different pro- is first studied in the context of fair scheduling. It has beentocol layers, their interfaces, and the control information that shown that a family of primal scheduling algorithms maximizemust cross these interfaces to achieve a certain performance and the sum of the utilities of the long-run average data rates pro-robustness. We also present a general technique and results re- vided to the users; see, e.g., [39] [19] [35]. In contrast, thegarding the stability and optimality of dual algorithm in face result presented in section 5 (see also [4]) is for the dual algo-of time-varying parameters. As the flow contention graph that rithms. An earlier result for the dual scheduling algorithm iswill be used to characterize feasible rate regions of the networks by Eryilmaz et al. [8] that studies fair resource allocation us-is a rather general construction and can be used to capture the ing queue-length based scheduling and congestion control. An-interdependence or contention among parallel servers of any other similar result is by Neely et al. [27] that studies fairnessqueueing networks, these results are applicable to any systems and optimal stochastic control for heterogeneous networks. Allthat can be modelled by a general model of queueing network these three works use stochastic Lyapunov method to establishthat is served by a set of interdependent parallel servers with stability, but the technical details are somewhat different. Espe-time-varying service capabilities. cially, the stability and optimality result presented in section 5 The remainder of this paper is organized as follows. The is based only on general properties of convexity and the defini-next section briefly discusses related work. Section 3 presents tion of subgradients, and can be directly applied to a variety ofdetails of the system model for the network with fixed chan- time-varying systems that can be solved or modelled by the dualnel or single-rate devices, and section 4 presents a distributed algorithms. Another comparable result is by Stolyar [36] thatalgorithm for joint congestion control, routing and scheduling proposes greedy primal-dual algorithm to maximize networkvia dual decomposition. Section 5 extends the dual algorithm utility. It uses a very different technique to establish handle the network with time-varying channel and adaptive Here, we focus on dual decomposition, but there are manymulti-rate devices. As specific cases of the general model and different ways to decompose a given problem, each of whichalgorithm developed in sections 4 and 5, section 6 discusses corresponds to a different layering scheme. See the survey ar-joint congestion control and medium access control design in ticle [6] and the references therein for various recent work onmultihop wireless networks with single-path routing, and sec- cross-layer design.tion 7 discusses cross-layer congestion control, routing andscheduling design in the network without prespecified paths. 3. System ModelWe conclude in section 8. Consider a multihop wireless network with a set N of nodes2. Related Work and a set L of directed logical links. We assume a static topol- ogy and each link l ∈ L has a fixed finite capacity cl bits per The utility maximization framework [16] [22] [18] on TCP second when active, i.e., we implicitly assume that the wirelesscongestion control has been extensively applied and extended to channel is fixed or some underlying mechanism is used to maskstudy protocol design, especially congestion control (see, e.g., the channel variation so that the wireless channel appears to[48] [49]), fair channel access (see, e.g., [25] [39] [19] [9] [35] have a fixed rate. This assumption will be relaxed in section 5.[8]), and cross-layer design (see, e.g., [44] [5] [20] [3] [4]), Wireless channel is interference limited, where links contendin wireless networks. Xue et al. [48] and Yi et al. [49] are with each other for exclusive access to the channel. We willamong the first to formulate schedulability constraints at link use the flow contention graph to capture the contention rela-layer for congestion control over multihop wireless networks. tions among links. The feasible rate region at link layer is thenXiao et al. [44] study joint routing and resource allocation, and a convex hull of the corresponding rate vectors of independentare among the first to apply dual decomposition to cross-layer sets of the flow contention graph. We will further describe ratedesign in wireless networks. Chiang [5] is among the first to constraints at the network layer by linear inequalities in termsstudy joint congestion and power control. Lin et al. [20] and of user service requirements and allocated link capacities. TheChen et al. [3] are among the first to study joint congestion resource allocation of the network is then formulated as a utilitycontrol and scheduling. Chen et al. [3] and Wang at al. [41] are maximization problem with schedulability and rate constraints.among the first to study cross-layer design in the network withcontention-based medium access. 3.1. Flow Contention Graph and Schedulability Constraint The work presented in section 6 (see also [3]) is originally The interference among wireless links is usually specified bymotivated to solve TCP unfairness problem over multihop wire- some interference model that describes physical constraints re- 2
  3. 3. garding wireless transmissions and successful receptions. For 3.2. Rate Constraintexample, in a network with primary interference, links that Let fl ≥ 0 denote the amount of capacity allocated to link l.share a common node cannot transmit or received simultane- From the schedulability constraint, f should satisfyously but links that do not share nodes can do so. It modelsa wireless network with multiple channels where simultaneous f ∈ Π. (2)communications in a neighborhood are enabled by using or-thogonal CDMA or FDMA channels. In a network with sec- Assume that the network is shared by a set S of sources, withondary interference, links mutually interfere with each other each source s ∈ S transmitting at rate x s bits per second. In thewhenever either the sender or the receiver of one is within the following, we will formulate rate constraints for networks withinterference range of the sender or receiver of the other. Given different kinds of interference model, we can construct a flow contention graphthat captures the contention relations among the links; see, e.g., The Network with Single-Path Routing[25]. In the contention graph, each vertex represents a link, and Each source s uses a path consisting of a set L s ⊂ L of edge between two vertices denotes the contention between The sets L s define an |L| × |S | routing matrixthe corresponding links: two links interfere with each other andcannot transmit at the same time. Figure 1 shows an example of 1 if l ∈ L s , Rls =a simple multihop wireless network with primary interference 0 otherwise.and the corresponding flow contention graph. Thus, the aggregate rate over link l is s∈S Rls x s . The rate con- 1 3 5 straint is written as A B C D Rx ≤ f, (3) 2 4 6 i.e., the aggregate link rate should not exceed the link capacity. 1 3 5 The Network with Multipath Routing Each source s can send traffic along a set T s of given paths. s Each path r ∈ T s contains a set of Lr ⊂ L of links, which defines s 2 4 6 a |L| × |T s | routing matrix H whose (l, r)th entry is given by sFigure 1: Example of a multihop wireless network with 4 nodes and 6 logical s 1 if l ∈ Lr , Hlr =links and the corresponding flow contention graph. 0 otherwise. Given a flow contention graph, we can identify all its inde- Denote by xr the rate at which source s sends along path r. spendent sets of vertices. An independent set is a set of vertices Thus, the source rate x s = r∈T s xr . The rate constraint is writ- sthat have no edges between each other [7]. The links in an inde- ten aspendent set can transmit simultaneously. Let E denote the set Hlr xr ≤ fl , l ∈ L. s (4) sof all independent sets with each independent set indexed by s, r∈T se. We represent an independent set e as a |L|-dimensional ratevector re , where the lth entry is The Network without Prespecified Paths Since no end-to-end path is given, we will use multicommod- cl if l ∈ e, rle := ity flow model for routing. Let D denote the set of destination 0 otherwise. k nodes of network layer flows. Let fi, j ≥ 0 denote the amount of capacity of link (i, j) allocated to the flows to destination k.The feasible rate region Π at the link layer is then defined as the k Then the aggregate capacity on link (i, j) is fi, j := k∈D fi, j .convex hull of these rate vectors   Let xik ≥ 0 denote the flow generated at node i towards desti-       nation k. Then the aggregate capacity for its incoming flows e Π := r : r =  ae r , ae ≥ 0, ae = 1 .  (1) and generated flow to the destination k should not exceed the   e e summation of the capacities for its outgoing flows to kThus, a link flow vector y satisfies schedulability constraint if xik ≤ k fi, j − k f j,i , i ∈ N, k ∈ D, i k. (5)y ∈ Π. j:(i, j)∈L j:( j,i)∈L The contention graph is a general construct, and can cap-ture the interdependence or contention among parallel servers Equation (5) is the rate constraint for resource allocation. Forof any queueing networks. For example, it can characterize the simplicity of presentation, we assume that there is at most onecontention relations in the network where wireless nodes are flow between any node and destination pair [i, k] ∈ S ×D. Thus,equipped with multiple radios or communicate through multi- xik = x s if i is the source node of flow s = [i, k], and xik = 0ple channels. otherwise. 3
  4. 4. 3.3. Problem Formulation is given, and solved in (11) if no path is prespecified for the We see from the last subsection that all three kinds of rate source. We see that, by dual decomposition, the flow opti-constraints are expressed as linear inequalities. If we repre- mization problem decomposes into separate “local” optimiza-sent the “routing” of the user (source) service requirement by tion problems of transport, network and link layers respectively,a linear function H(x) of the source rates x, and represent the and they interact through congestion prices.“allocation” of the service capacity by a linear function A( f ) of Defining dual function D(p) = max x, f ∈Π L(p, x, f ), by dualitythe link capacities f , since the service requirement should not we have (see, e.g., Chapter 5 in [2])exceed the allocated service capacity, we have the following in- max U s (x s ) = min D(p) = min max L(p, x, f ).equality constraint x, f s p≥0 p≥0 x, f ∈Π H(x) ≤ A( f ). (6) The dual problem min p D(p) can be solved by using the sub- gradient method [33] [2], where the Lagrangian multipliers areThe linear constraint (6) is a very general relation. The rate adjusted in the opposite direction to the subgradient of the dualconstraints (3), (4) and (5) are just its different concrete repre- functionsentations. Following [16] [22] [18], assume each source s attains a util- g(p) = A( f (p)) − H(x(p)). (12)ity U s (x s ) when it transmits at a rate x s . We assume U s (·) is con- Congestion price update: The network (links or nodes) up-tinuously differentiable, increasing, and strictly concave. Our dates the congestion price, according toobjective is to choose source rates x and allocated capacities fso as to solve the following global problem p(t + 1) = [p(t) + γt (H(x(p(t))) − A( f (p(t))))]+ , (13) where γt is a positive scalar stepsize, and “+” denotes the pro- max U s (x s ) (7) jection onto the set + of nonnegative real numbers. The algo- x, f s rithm has a nice interpretation in terms of law of supply and de- subject to H(x) ≤ A( f ), (8) mand and their regulation through pricing. Equation (13) says f ∈ Π. (9) that, if the demand H(x(p(t))) for service capacity exceeds the The system problem (7)–(9) is a convex optimization prob- supply A( f (p(t))), the price p will rise, which will in turn de-lem, but it is impractical to solve it centrally in real networks. crease the demand (see equation (10)) and increase the supplyDistributed algorithm can be derived by solving its Lagrange (see equation (11)).dual problem, as we will show in the next section. Before proceeding, we explain the notation used in this pa- per. We denote a link either by a single index l or by the directed pair (i, j) of nodes it connects. We use s or alternatively node4. Distributed Algorithm via Dual Decomposition pair [i, k] to denote a network layer flow. We overload the use of the source rate x, the link capacity f and the congestion price4.1. Distributed Algorithm p throughout this paper, depending on different kinds of rout- Consider the Lagrangian of the problem (7)–(9) with respect ing involved. For example, x refers to the source rate x s or theto the rate constraint source rate xik at node i towards destination k, depending on the specific context. Similarly, f refers to both the link capacity L(p, x, f ) = U s (x s ) − pT (H(x) − A( f )). k s { fi, j } and the capacity { fi, j } over link (i, j) that is allocated to destination k.Given p, the above Lagrangian has a nice decomposition struc-ture: it is the summation of two independent terms, of source 4.2. Convergence Analysisrates and link capacities respectively. Interpreting p as the “con- Using results on the convergence of the subgradient methodgestion price” and maximizing the Lagrangian over x and f for [33] [2], we show that, for constant stepsize, the algorithm isfixed p, we obtain the following joint congestion control and guaranteed to converge to within a neighborhood of the opti-scheduling algorithm: mal value. For diminishing stepsize, the algorithm is guaran- Congestion control: At time t, given congestion price p(t), teed to converge to the optimal value. We would like a dis-the sources adjust flow rates x according to the congestion price tributed implementation of the subgradient algorithm, and thus a constant stepsize γt = γ is more convenient. Note that the x(t) = x(p(t)) = arg max U s (x s ) − pT (t)H(x). (10) x dual cost usually will not monotonically approach the optimal s value, but wander around it under the subgradient algorithm. Scheduling: Over link l, send an amount of data for each flow The usual criterion for stability and convergence is not applica-according to the rates f such that ble. Here we define convergence in a statistical sense [3]. Let p(t) := 1 tτ=1 p(τ) be the average price by time t. f (t) = f (p(t)) ∈ arg max pT (t)A( f ). (11) t f ∈Π Definition 1. Let p∗ denote an optimal value of the dual vari-Note that there does not exist an explicit routing component able. Algorithm (10)–(13) with constant stepsize is said to con-in the dual decomposition. Instead, the routing is implicitly verge statistically to p∗ , if for any δ > 0 there exists a stepsize γsolved in (10) if the set of paths from which a source can choose such that lim supt→∞ D(p(t)) − D(p∗ ) ≤ δ. 4
  5. 5. Clearly, an optimal value p∗ exists. The following theo- Theorem 3. Let x∗ be the optimal source rates. Under the samerem guarantees the statistical convergence of the subgradient assumption of Theorem 2, the algorithm (10)–(13) convergesmethod. statistically to within a small neighborhood of the optimal val- ues P(x∗ ), i.e.,Theorem 2. Let p∗ be an optimal price. If the norm of thesubgradients is uniformly bounded, i.e., there exists G such that γG2||g(p)||2 ≤ G for all p, then P(x∗ ) ≥ lim inf P(x(t)) ≥ P(x∗ ) − . (15) t→∞ 2 γG2 Proof. The first inequality P(x∗ ) ≥ lim inf t→∞ P(x(t)) holds, D(p∗ ) ≤ lim sup D(p(t)) ≤ D(p∗ ) + , (14) t→∞ 2 since x(t) is in the feasible rate region as t goes to infinity. Nowi.e., the algorithm (10)–(13) converges statistically to p∗ . we prove the second inequality. By equation (13), we haveProof. The first inequality D(p∗ ) ≤ lim supt→∞ D(p(t)) always ||p(t + 1)||2 2holds, since D(p∗ ) is the minimum of the dual function D(p). ≤ ||p(t) − γg(p(t))||2 2Now we prove the second inequality. By equation (13), we have = ||p(t)||2 − 2γg(p(t))T p(t) + γ2 ||g(p(t))||2 2 2 ||p(t + 1) − p∗ ||2 2 = ||p(t)||2 + 2γ U s (x s (t)) − 2γ( U s (x s (t)) 2 = ||[p(t) − γg(p(t))]+ − p∗ ||2 2 s s ≤ ||p(t) − γg(p(t)) − p∗ ||2 −pT (t)H(x(t))) − 2γpT (t)A( f (t)) + γ2 ||g(p(t))||2 2 2 = ||p(t) − p∗ ||2 − 2γg(p(t))T (p(t) − p∗ ) + γ2 ||g(p(t))||2 2 2 ≤ ||p(t)||2 + 2γ 2 U s (x s (t)) − 2γ( U s (x∗ ) s s s ≤ ||p(t) − p∗ ||2 − 2γ(D(p(t)) − D(p∗ )) + γ2 ||g(p(t))||2 , 2 2 −pT (t)H(x∗ )) − 2γpT (t)A( f (t)) + γ2 ||g(p(t))||2 2where the last inequality follows from the definition of subgra- = ||p(t)||2 + 2γP(x(t)) − 2γP(x∗ ) + γ2 ||g(p(t))||2dient. Applying the inequalities recursively, we obtain 2 2 t −2γpT (t)(A( f (t)) − H(x∗ ))||p(t + 1) − p∗ ||2 2 ≤ ||p(1) − p∗ ||2 − 2γ 2 (D(p(τ)) − D(p∗ )) ≤ ||p(t)||2 + 2γP(x(t)) − 2γP(x∗ ) + γ2 ||g(p(t))||2 , 2 2 τ=1 t where the second inequality follows from the fact that x(t) is +γ2 ||g(p(τ))||2 . 2 the maximizer in the problem (10), and the third inequality fol- τ=1 lows from the fact that f (t) is the maximizer in problem (11). Applying the inequalities recursively, we obtainSince ||p(t + 1) − p∗ ||2 2 ≥ 0, we have t t t t2γ ∗ (D(p(τ)) − D(p )) ≤ ||p(1) − p∗ ||2 +γ 2 ||g(p(τ))||2 ||p(t + 1)||2 ≤ ||p(1)||2 + 2γ 2 2 (P(x(τ)) − P(x∗ )) + γ2 ||g(p(τ))||2 . 2 2 2 τ=1 τ=1 τ=1 τ=1 ≤ ||p(1) − p∗ ||2 2 2 + tγ G . 2 Since ||p(t + 1)||2 ≥ 0, we have 2From this inequality we obtain t t 1 t ||p(1) − p∗ ||2 γG2 2 2γ (P(x(τ)) − P(x∗ )) ≥ −||p(1)||2 − γ2 2 ||g(p(τ))||2 2 D(p(τ)) − D(p∗ ) ≤ + . τ=1 τ=1 t τ=1 2tγ 2 ≥ −||p(1)||2 − tγ2G2 . 2Since D is a convex function, by Jensen’s inequality, From this inequality we obtain ||p(1) − p∗ ||2 γG2 2 D(p(t)) − D(p∗ ) ≤ + . t 2tγ 2 1 −||p(1)||2 − tγ2G2 2 P(x(τ)) − P(x∗ ) ≥ . 2Thus, lim supt→∞ D(p(t)) ≤ D(p∗ )+ γG , i.e., the algorithm con- t τ=1 2tγ 2verges converges statistically to p∗ . Since P is a concave function, by Jensen’s inequality, The assumption of bounded norm for subgradient g(p) is rea-sonable, since f is finite and we always have an upper bound on −||p(1)||2 − tγ2G2 2 P(x(t)) − P(x∗ ) ≥ .x in practice. Theorem 2 implies that the congestion price p ap- 2tγproaches p∗ statistically when the stepsize γ is small enough. 2 Let the primal function be P(x) := s U s (x s ) and achieve its Thus, lim inf t→∞ P(x(t)) ≥ P(x∗ ) − γG , i.e., the algorithm (10)– 2optimum at x∗ . Define x(t) := 1 tτ=1 x(τ), the average data rate t (13) converges statistically to within a small neighborhood ofup to time t. As time goes to infinity, x(t) must be in the feasible the optimal values P(x∗ ).rate region (determined by equations (8)–(9)), otherwise p(t)will go unbounded as time goes to infinity, which contradicts Since P(x) is continuous, Theorem 3 implies that the averageTheorem 2. source rate approaches the optimal x∗ when γ is small enough. 5
  6. 6. 5. Extension to Networks with Time-Varying Channels Here “ ” denotes the function floor with respect to γ2 . We choose such a discrete congestion price to facilitate the stability In the last section, we consider wireless multihop networks analysis in the next subsection.with fixed channels or single-rate devices, i.e., the capacity cl The above algorithm cannot be derived from the dual decom-is a constant when link l is active. However, recent years have position of the problem (17)–(19). However, we will use theseen the growing popularity and demand of multi-rate wireless problem (17)–(19) as a reference system, and characterize thenetwork devices (e.g., 802.11a cards) that can adjust transmis- performance of the above algorithm with respect to it.sion rate according to the time-varying channel state and im-prove overall network utilization. Here, we consider the net- 5.2. Stochastic Stabilityworks with time-varying channels and adaptive multi-rate de- Note that congestion price p(t) takes discrete values. Thus,vices. congestion price p(t) evolves according to a discrete-time, discrete-space Markov chain. We need to show that this markov5.1. Distributed Algorithm chain is stable, i.e., the congestion price process reaches a We assume that time is slotted, and the channel is fixed within steady state and does not become unbounded. It is easy toa time slot but independently changes between different slots.1 check that the Markov chain has a countable state space, butLet h(t) denote the channel state in time slot t. Corresponding is not necessarily irreducible. In such a general case, the stateto the channel state h, the capacity of link l is cl (h) when active space is partitioned in transient set T and different recurrentand the feasible rate region at the link layer is Π(h), which is classes Ri . We define the system to be stable if all recurrentdefined in a similar way as in (1). We further assume that the states are positive recurrent and the Markov process hits the re-channel state is a finite state process with identical distribution current states with probability one [38]. This will guarantee thatq(h) in each time slot,2 and define the mean feasible rate region the Markov chain will be absorbed/reduced into some recurrentas class, and the positive recurrence ensures the ergodicity of the Π := {r : r = q(h)r(h), r(h) ∈ Π(h)}. (16) Markov chain over this class. We have the following result. h Theorem 4. The Markov chain described by equation (22) isIdeally, we would like to have a distributed algorithm that stable.solves the following utility maximization problem Proof. Denote the dual function of the problem (17)–(19) by max U s (x s ) (17) D(p) with an optimal price p∗ and subgradient g(p), i.e., g(p) = x, f s A( f (p)) − H(x(p)) with f (p) ∈ arg max f ∈Π pT A( f ). Consider subject to H(x) ≤ A( f ), (18) the the Lyapunov function V(p) = p − p∗ 2 , we have 2 f ∈ Π. (19) E[∆Vt (p)|p]However, if we solve the above problem via dual decomposi- = E[V(p(t + 1)) − V(p(t)) | p(t) = p]tion, we may get a link rate assignment which is infeasible for = E[V( [p(t) − γg(p(t))]+ ) − V(p(t)) | p(t) = p]the channel state at a given time slot. Instead we directly extend = E[V([p(t) − γg(p(t))]+ − ) − V(p(t)) | p(t) = p]the algorithm (10)–(13) to handle time-varying channel. Congestion control: At time t, given congestion price p(t), ≤ E[V(p(t) − γg(p(t))) − V(p(t)) | p(t) = p]the sources adjust flow rates x according to the congestion price +E[−2 · ([p(t) − γg(p(t))]+ − p∗ ) + || ||2 | p(t) = p] 2 ≤ E[V(p(t) − γg(p(t))) − V(p(t)) | p(t) = p] x(t) = x(p(t)) = arg max U s (x s ) − pT (t)H(x). (20) x s +2 · p∗ + || ||2 , 2 Scheduling: In the beginning of period t, each node monitors where = [p(t) − γg(p(t))]+ − [p(t) − γg(p(t))]+ . Note thatthe channel state h(t), and over link l send an amount of data for 0 ≤ < γ2 1, with 0 denoting the zero vector and 1 the vectoreach flow according to the rates f such that with every component being 1. Thus, 2 · p∗ +|| ||2 < 2γ2 ||p∗ ||1 + 2 γ4 ||1||2 . Let ∆ = 2||p∗ ||1 + γ2 ||1||2 , we have 2 2 f (t) = f (p(t)) ∈ arg max pT (t)A( f ). (21) f ∈Π(h(t)) E[∆Vt (p)|p] Congestion price update: The network (links or nodes) up- ≤ E[V(p(t) − γg(p(t))) − V(p(t)) | p(t) = p] + γ2 ∆dates the congestion price, according to = E[−γg(p(t))T (2(p(t) − p∗ ) − γg(p(t))) | p(t) = p] + γ2 ∆ p(t + 1) = [p(t) + γ(H(x(p(t))) − A( f (p(t))))]+ . (22) = 2γg(p)T (p∗ − p) + γ2 E[ g(p(t)) 2 | p(t) = p] + γ2 ∆ 2 ≤ 2γg(p)T (p∗ − p) + γ2 (G2 + ∆), 1 It is straightforward to extend our results to a network where the channel where we again use the assumption that the norm of g(p(t))state process is modulated by a hidden Markov chain. is bounded above by G. By the definition of subgradient, we 2 Even if the channel state is a continuous process, we only have finite further getchoices of modulation schemes. The corresponding capacities take discretevalues. E[∆Vt (p)|p] ≤ 2γ(D(p∗ ) − D(p)) + γ2 (G2 + ∆). 6
  7. 7. Let Note that p(t) is stationary and ergodic in some steady state by Theorem 4, and so is D(p(t)). Thus, δ= max p − p∗ 2 D(p)−D(p∗ )≤γ(G2 +∆) t−1 1 lim E[D(p(τ))] = E[D(p(∞))].and define A = {p : p − p∗ 2 ≤ δ}. We obtain t→∞ t τ=0 E[∆Vt (p)|p] ≤ −γ2 (G2 + ∆)I p∈Ac + γ2 (G2 + ∆)I p∈A , So,where I is the index function. Thus, by Theorem 3.1 in [38], γ(G2 + ∆) E[D(p(∞))] − D(p∗ ) ≤ .which is an extension of Foster’s criterion [1], the Markov chain 2p(t) is stable. Since D(p) is a convex function, by Jensen’s inequality, The above proof shows that the distance to the optimal p∗ γ(G2 + ∆) D(E[p(∞)]) − D(p∗ ) ≤ ,has negative conditional mean drift for all prices that have suf- 2ficiently large distance to p∗ , and implies that the congestion i.e., the algorithm converges statistically to within γ(G2 + ∆)/2price will stay near p∗ when γ is small enough. of the optimal value D(p∗ ).5.3. Performance Evaluation Since D(p) is a continuous function, Theorem 5 implies that the congestion price p approaches p∗ statistically when γ is We now characterize the performance of the algorithm (20)– small enough.(22) in terms of the dual and primal objective functions of thereference system problem (17)–(19). Corollary 6. x(t) is a stable Markov chain. Moreover, the av- erage arrival rates E[x(∞)] ∈ Π, where x(∞) denotes the stateTheorem 5. The algorithm (20)–(22) converges statistically to of the process x(t) in the steady state.within a small neighborhood of the optimal value D(p∗ ), i.e., Proof. x(t) is a deterministic, finite-value function of p(t). x(t) ∗ γ(G2 + ∆) ∗ is a stable Markov chain, since p(t) is. E[x(∞)] ∈ Π, other- D(p ) ≤ D(E[p(∞)]) ≤ D(p ) + , (23) 2 wise the average congetsion price E[p(∞)] will go unbounded,where p(∞) denotes the state of the Markov chain p(t) in the which contradicts Theorem 4.steady state, and ∆ = 2||p∗ ||1 + γ2 ||1||2 . 2 Theorem 7. Let P(x) be the primal function and x∗ be the opti- mal source rates of the reference system problem (17)–(19). TheProof. The first inequality D(p∗ ) ≤ D(E[p(∞)]) always holds, algorithm (20)–(22) converges statistically to within a smallsince D(p∗ ) is the minimum of the dual function D(p). Now we neighborhood of the optimal value P(x∗ ), i.e.,prove the second inequality. From the proof of Theorem 4, wehave γ(G2 + ∆) P(x∗ ) ≥ P(E[x(∞)]) ≥ P(x∗ ) − . (24) E[∆Vt (p)|p] = E[V(p(t + 1)) − V(p(t)) | p(t) = p] 2 ≤ 2γ(D(p∗ ) − D(p)) + γ2 (G2 + ∆). Proof. The proof for the theorem is a straightforward extension of the proof of Theorem 3, following similar procedure as in theTaking expectation over p, we get proof of Theorem 5. We skip the detail here. E[∆Vt (p)] = E[V(p(t + 1)) − V(p(t))] Since P(x) is a continuous function and has a unique optimal, ≤ 2γ(D(p∗ ) − E[D(p)]) + γ2 (G2 + ∆). Theorem 7 implies that the average source rate approaches the optimal of the ideal reference system (17)–(19) when stepsizeTaking summation from τ = 0 to τ = t − 1, we obtain γ is small enough. Theorems 5 and 7 show that, surprisingly, t−1 the algorithm (20)–(22) can be seen as a distributed algorithm E[V(p(t))] ≤ E[V(p(0))] − 2γ E[D(p(τ))] to approximately solve the ideal reference system problem that τ=0 is not readily solvable due to stochastic channel variations. Our proofs for stability and performance bounds are rather +2γtD(p∗ ) + tγ (G + ∆). 2 2 general. They only use general properties of convexity andSince E[V(p(t))] ≥ 0, we have Markovity and the definition of subgradients. We thus have presented a general technique and results regarding the stabil- t−1 ity and optimality of dual algorithm for convex optimization in 2γ E[D(p(τ))] − 2γtD(p∗ ) ≤ E[V(p(0))] + tγ2 (G2 + ∆). face of time-varying parameters. As the flow contention graph τ=0 is a rather general construct and can be used to capture the inter-From this inequality we obtain dependence or contention among parallel servers of any queue- t−1 ing networks, the aforementioned results are applicable to any 1 E[V(p(0))] + tγ2 (G2 + ∆) systems that can be modelled by a general model of queueing E[D(p(τ))] − D(p∗ ) ≤ . t τ=0 2tγ network that is served by a set of interdependent parallel servers 7
  8. 8. with time-varying service capabilities. In the next two sections, [47]. Consider again the example in Figure 2, and assume therewe will discuss two such applications. Other examples include are two network-layer flows A→F and G→H. Link-layer flowsfair scheduling in a generalized switch [34], and TCP [22] with BC, CD, DE and GH contend with each other, and congestiontime-varying capacity as in last-hop wireless networks. It can is located in the spatial contention region denoted by the rect-include power control as well [5], as power does not change angle. So, unlike wireline networks where link capacities pro-convexity of the feasible rate region. vide constraints for resource allocation, in multihop wireless As specific cases of the general model and algorithm pre- networks the contention relations between link-layer flows pro-sented in sections 4 and 5, we will discuss joint congestion vide fundamental constraints for resource allocation. We needcontrol and medium access control design in multihop wireless to exploit the interaction between transport and link (MAC) lay-networks with single-path routing and cross-layer congestion ers, in order to improve the performance.control, routing and scheduling design in the network without The equations (2)–(3) capture the constraints that arise fromprespecified paths in the next two sections, respectively. channel contention among wireless links. We model the re- source allocation for multihop wireless networks as a utility maximization problem with these constraints,6. Joint Congestion Control and Media Access Control De- sign max U s (x s ) (25) x, f s TCP was originally designed for wireline networks, where subject to Rx ≤ f, (26)links are assumed to have fixed capacities. However, as wire- f ∈ Π, (27)less channel is a shared medium and interference-limited, wire-less links are “elastic” and the capacities they obtain depend on which is a special case of the system problem (7)–(9). With thisthe bandwidth sharing mechanism used at the link layer. This formulation, we can explicitly exploit the interaction betweenmay result in various TCP performance problems in wireless transport and MAC layers, and systematically carry out jointnetworks. design of congestion and media access control. In the next sub- One such problem is TCP unfairness over multihop wireless section, a dual algorithm solving the system problem (25)–(27)networks. Many existing wireless MAC protocols, such as DCF is derived by applying the algorithm (10)–(13). The algorithmspecified in IEEE 802.11 standard [13], are traffic indepen- motivates a scheme for media access control in which link-layerdent and do not consider the actual requirements of the flows flows are scheduled according to congestion prices.competing for the channel. These MAC protocols suffer fromthe unfairness problem, caused by the location dependency of 6.1. Distributed Algorithmthe contentions, and exacerbated by the contention resolution Consider the Lagrangian of the problem (25)–(27) with re-mechanisms such as the binary exponential backoff algorithm spect to the rate constraintadopted in DCF. When they interact with TCP, TCP will furtherpenalize these flows with more contention. This will result in L(p, x, f ) = U s (x s ) − pT (Rx − f ) . (28) ssignificant TCP unfairness in multihop wireless networks [10][37] [45] [46] [47]. To illustrate this, consider the example in Interpret pl as the congestion price at link l, we can use theFigure 2, and assume there are four network-layer flows A→B, algorithm (10)–(13) to solve the problem (25)–(27) and its dual.C→D, E→F and G→H. The flow C→D experiences more Rate control: At time t, given congestion price p(t), source scontention and will build up a queue faster than the other three adjusts its sending rate x s according to the aggregate congestionflows. TCP will further penalize it by reducing the congestion price l Rls pl along its pathwindow more aggressively, and the resulting throughput of flowC→D will be much less than that of the other flows. x s (t) = U s−1 ( Rls pl (t)). (29) l G H Scheduling: Over link l, send an amount of data for each flow according to the rate f such that f (t) = f (p(t)) ∈ arg max pT f. A B C D E F (30) f ∈Π Figure 2: An example of a simple multihop wireless network. If the network with time-varying channel is considered, each node monitors channel state h(t) and over link l sends an amount In addition to the location dependency of contentions, corre- of data for each flow according to the rate f such thatlation among links is also the key to understand the interactionbetween transport and MAC layers. In wireline networks, link f (t) = f (p(t)) ∈ arg max pT f. (31) f ∈Π(h(t))bandwidth is well defined and links are disjoint resources. Butin wireless networks, as we mentioned above, links are corre- Congestion price update: Each link l updates its price, ac-lated due to interference, and network-layer flows that do not cording totransverse a common link may still compete with each other. pl (t + 1) = [pl (t) + γt ( Rlsx s (p(t)) − fl (p(t)))]+ . (32)Thus, congestion is located at some spatial contention region s 8
  9. 9. The above algorithm motivates a joint design scheme where As for the overall performance of our cross-layer design withthe link layer flows are scheduled according to congestion approximate scheduling, we can extend the result in [21] toprices of the links. Also, note that equations (29) and (32) are show that the performance is no worse than that achieved bycompletely distributed and can be implemented at individual an exact design with a feasible rate region 1 Π at the link layer. 2sources and links using only local information. We will discuss Moreover, in [4] we also see that this distributed schedulingthe distributed solution to scheduling problem (30) in the next algorithm only results in a very small degradation in the per-subsection. formance of the cross-layer design for the network with time- varying channel, since in this situation the exact solution of the6.2. Scheduling over multihop Networks scheduling is not as important and reasonable approximations We now come to the scheduling problem (30), which will work well.also show out in the next section. Scheduling over multihopnetworks is a difficult problem and in general NP-hard. To see 6.3. Numerical Examplesthis, note that problem (30) is equivalent to a maximum weightindependent set problem over the flow contention graph, which In this subsection, we provide numerical examples to com-is NP-hard for general graphs. It is easy to design some heuris- plement the analysis in previous subsections. We consider atic algorithm but is hard to bound its performance. simple network with secondary interference as shown in Figure With the primary interference model, the scheduling problem 2, and assume that there are three network layer flows G→H,(30) is equivalent to the maximum weighted matching prob- A→F and D→F with the same utility function U s (x s ) = log x s .lem3 over the connectivity graph {N, L} of the network. Max- We have chosen a simple topology to facilitate discussion.imum weighted matching problem can be computed in poly-nomial time (see, e.g., [29]), but this requires centralized im- The Network with Fixed Channel and Single-Rate Devicesplementation. If implemented over a multihop network, each We first consider a network with fixed link capacities. Fornode needs to notify the central node of its weight and local simplicity, we assume that all the links have one unit of ca-connectivity information such that the central node can recon- pacity when active. Figure 3 shows the evolution of sourcestruct the network topology as a weighted graph. This will lead rates and their averages with the joint algorithm (29), (30) andto an immense communication overhead which is expensive in (32) with stepsize γ = 0.2. We see that the source rates con-time and resources. There also exist simpler greedy sequential verge quickly to a neighborhood of the optimal and oscillatealgorithms to compute a weighted matching at most a factor around the optimal. This oscillating behavior mathematicallyof 2 away from the maximum; see, e.g., [30]. But they also results from the non-differentiability of the dual function andrequire centralized implementation. We seek a distributed al- physically can be interpreted as due to the scheduling process.gorithm where each node participates in the computation itself However, the average source rates are smooth and approach theusing only local information. optimum monotonically. Figure 4 shows the evolution of the A few distributed approximation algorithms exist for maxi- corresponding end-to-end congestion prices and the averagesmum weighted matching problem; see, e.g., [40] [42] [12]. In of the three flows. Similarly, the congestion prices approach[12], the author presents a simple distributed algorithm to com- the optimum quickly. We also note that the performance of thepute a weighted matching at most a factor of 2 away from the algorithm is much better than the bound of γG2 /2 specified inmaximum in linear running time O(|L|). This algorithm is a Theorems 2 and 3.distributed variant of the sequential greedy algorithm presentedin [30]. We have utilized this algorithm to solve the schedul- 1 Flow GH Normalized Source Rates Flow AF 0.8ing problem (30) distributedly, see [4] for details. The result- 0.6 Flow DFing scheduling algorithm for multihop wireless networks is one 0.4of the best distributed algorithms in terms of computational 0.2complexity and performance bound. It has a linear complexity 0 0 10 20 30 40 50 60 70 80 90 100O(|L|). Such a low complexity is important for the scalability Normalized Timeand efficiency of multihop wireless networks. It achieves a per- 1 Flow GH Flow AFformance of 1/2 of the maximum weight in the worst case, and Average Source Rates 0.8 Flow DFin practice, numerical simulations show it typically achieves a 0.6performance within about 4/5 of the maximum weight. There 0.4 0.2also exist few other distributed approximation algorithms; see, 0e.g., [21] [43] [24] [14] [32] [28]. Especially, in [14], [32] and 0 10 20 30 40 50 Normalized Time 60 70 80 90 100[28] the authors present distributed random access algorithmsthat achieve nearly 100% throughput. Figure 3: The evolution of source rates in the network with fixed link capacities. 3 A matching in a graph is a subset of links, no two of which share a common The choice of the stepsize γ is important. It characterizes thenode. The weight of a matching is the total weight of all its links. A maximum “optimality” of the algorithm, as shown in Theorems 2 and 3weighted matching in a graph is a matching whose weight is maximized over (and also in Theorems 5 and 7). It also affects the convergenceall matchings of the graph. speed. In oder to study the impact of different choices of the 9
  10. 10. 10 1 Normalized Congestion Prices Flow GH Normalized Source Rates Flow AF 8 0.8 Flow DF Flow GH Flow AF 6 Flow DF 0.6 4 0.4 2 0.2 0 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 Normalized Time Normalized Time 10 1 Flow GH Average Congestion Prices Flow AF Average Source Rates 8 0.8 Flow DF Flow GH Flow AF 6 Flow DF 0.6 4 0.4 2 0.2 0 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 Normalized Time Normalized TimeFigure 4: The evolution of congestion prices in the network with fixed link Figure 5: The evolution of source rates in the network with time-varying linkcapacities. capacities. 7 Normalized Congestion Pricesstepsize on the performance of the algorithm, we have run sim- 6 5 Flow GHulations with different stepsizes. We found that the smaller the 4 Flow AF Flow DFstepsize, the slower the convergence and the closer to the opti- 3 2mal, which is a general characteristic of any gradient based al- 1gorithm. So, there is a tradeoff between convergence speed and 0 0 10 20 30 40 50 Normalized Time 60 70 80 90 100optimality. In practice, the end user can first choose large step- 6sizes to ensure fast convergence, and subsequently, the stepsizes Average Congestion Prices 5 Flow GHcan be reduced once the source rate starts oscillating around 4 Flow AF Flow DFsome mean value. 3 2 1The Network with Time-Varying Channel and Multirate Devices 0 0 10 20 30 40 50 60 70 80 90 100 We now consider a network with time-varying link capaci- Normalized Timeties. For simplicity, we assume that the capacities of all links Figure 6: The evolution of congestion prices in the network with time-varyingare identically, uniformly distributed over 0.5, 1 and 1.5 units. link capacities.Thus, the average capacity for each link when active is the sameas that in the example with fixed link capacities. Figures 5 and 6 show the evolution of source rates, conges- There exist other ways to solve the resource allocation prob-tion prices and their averages with the same stepsize γ = 0.2. lem (25)–(27). In [3], we also derive a primal algorithm byThe source rates and congestion prices have much larger vari- solving the relaxation of the system problem (25)–(27). Basedations than those with fixed channel, due to the channel vari- on the algorithm, we propose a traffic-dependent scheme forations. But the average source rates and congestion prices are contention-based medium access control and generate conges-still smooth, and converge quickly and monotonically to opti- tion price directly from the MAC layer. As scheduling inmal values. Our simulation results have confirmed the conclu- multihop wireless networks is an intrinsically hard problem,sions from Theorems 5 and 7, which say that the average source contention-based medium access seems a must. To further in-rates and congestion prices approach the optimum of an ideal tegrate congestion control and contention-based medium ac-system with the best feasible rate region at the link layer, and cess in utility maximization framework will be a future researchthat algorithm (29), (31) and (32) can been seen as a distributed step.algorithm to solve this ideal system problem. Also note that, al-though the average link capacities when active are the same as 7. Joint Congestion Control, Routing and Scheduling De-those in fixed channel, each flow achieves larger sending rate. signThis is due to multi-user diversity: our “optimal” scheduling(31) has implicitly considered multi-user diversity. In the last section, we have discussed the resource alloca- tion in multihop wireless networks where the path for each net-6.4. Summary work layer flow is given. However, as wireless spectrum is a We have presented a model for the joint design of conges- scarce resource, it may be costly to maintain end-to-end paths,tion control and media access control for multihop wireless and congestion control based on end-to-end feedback may con-networks, where the resulting dual algorithm is to solve a util- sume too much bandwidth in signalling. Moreover, most rout-ity maximization problem with constraints that arise from con- ing schemes for multihop networks select paths that minimizetention for the wireless channel. This algorithm motivates a hop count; see, e.g., [15] [31]. This implicitly predefines a pathjoint design where link-layer flows are scheduled according to for any source-destination pair, independent of the pattern ofthe congestion prices of the links. traffic demand and interference/contention among links. This 10