International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013     Finite Horizon Adaptive ...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013   Motivated by this idea, ma...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 20132    Background2.1 Enhanced C...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013more uncertain elements (e.g....
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013capacity. In Case 2, proposed...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013                  PjPU 1 − Pj...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 20133.2 Value function setup for ...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013  Next, according to [14], th...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 20133.3 Model-free online tuning ...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013      FTBE           SU      ...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013where tuning parameter 0 < αW...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 201312:to line 2.13: else do14: S...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013                             ...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013                             ...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 201312:            Update finite ...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013     In this simulation, the ...
International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013    In Figures 4 through 9, t...
Finite horizon adaptive optimal distributed Power Allocation for Enhanced Cognitive Radio Network in the presence of Chann...
Finite horizon adaptive optimal distributed Power Allocation for Enhanced Cognitive Radio Network in the presence of Chann...
Finite horizon adaptive optimal distributed Power Allocation for Enhanced Cognitive Radio Network in the presence of Chann...
Upcoming SlideShare
Loading in …5
×

Finite horizon adaptive optimal distributed Power Allocation for Enhanced Cognitive Radio Network in the presence of Channel Uncertainties

306 views
211 views

Published on

Finite Horizon Adaptive Optimal Distributed
Power Allocation for Enhanced Cognitive Radio
Network in the Presence of Channel
Uncertainties

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
306
On SlideShare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
1
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Finite horizon adaptive optimal distributed Power Allocation for Enhanced Cognitive Radio Network in the presence of Channel Uncertainties

  1. 1. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013 Finite Horizon Adaptive Optimal Distributed Power Allocation for Enhanced Cognitive Radio Network in the Presence of Channel Uncertainties Hao Xu and S. Jagannathan Department of Electrical and Computer Engineering Missouri University of Science and Technology Rolla, MO, USA hx6h7@mst.edu, sarangap@mst.eduABSTRACTIn this paper, novel enhanced Cognitive Radio Network (CRN) is considered by using power controlwhere secondary users (SUs) are allowed to use wireless resources of the primary users (PUs) when PUsare deactivated, but also allow SUs to coexist with PUs while PUs are activated by managinginterference caused from SUs to PUs. Therefore, a novel finite horizon adaptive optimal distributedpower allocation (FH-AODPA) scheme is proposed by incorporating the effect of channel uncertaintiesfor enhanced CRN in the presence of wireless channel uncertainties under two cases. In Case 1,proposed scheme can force the Signal-to-interference (SIR)of the SUs to converge to a higher targetvalue for increasing network throughput when PU’s are not communicating within finite horizon. OncePUs are activated as in the Case 2, proposed scheme cannot only force the SIR’s of PUs to converge to ahigher target SIR, but also force the SIR’s of SUs to converge to a lower value for regulating theirinterference to Pus during finite time period. In order to mitigate the attenuation of SIR’s due to channeluncertainties the proposed novel FH-AODPA allows the SIR’s of both PUs’ and SUs’ to converge to adesired target SIR while minimizing the energy consumption within finite horizon. Simulation resultsillustrate that this novel FH-AODPA scheme can converge much faster and cost less energy than othersby adapting to the channel variations optimally.KEYWORDSAdaptive optimal distributed power allocation (AODPA), Signal-to-interference ratio (SIR), Channeluncertainties, Cognitive Radio Network1 Introduction Cognitive Radio Network (CRN) [1] is a promising wireless network since CRN canimprove the wireless resource (e.g. spectrum, power etc.) usage efficiency significantly byimplementing more flexible wireless resource allocation policy [2]. In [3], a novel secondaryspectrum usage scheme (i.e. opportunistic spectrum access) is introduced. The SUs in SRN canaccess the spectrum allocated to PUs originally while the spectrum is not used by any PU. Moreover, the transmission power allocation plays a key role in cognitive radio networkprotocol designs. The efficient power allocation cannot only improves the network performance(e.g. spectrum efficient, network throughput etc.), but also guarantees the Quality-of-Service(QoS) of PUs. A traditional scheme to protect transmission of PUs is introduced in [4] byimposing a power constraint less than a prescribed threshold referred to as interferencetemperature constraint [5] in order to contain the interference caused by SUs to each PU.Research Supported by NSF ECCS#1128281 and Intelligent System Center.DOI : 10.5121/ijcnc.2013.5101 1
  2. 2. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013 Motivated by this idea, many researchers have utilized transmission power allocation forenhanced CRN subject to interference constraint. Authors in [6] proposed a centralized powerallocation (CPA) to improve the enhanced cognitive radio network performance by balancingthe signal-to-interference ratios (SIR) of all PUs and SUs. However, since centralized powerallocation scheme requires the information from every PU and SU which might not be possible,distributed power allocation (DPA) is preferred for enhanced CRN since information fromother PUs and SUs is not needed. In [7-8], authors developed distributed SIR-balancingschemes to maintain Quality-of-Service requirement for each PU and SU. However, wirelesschannel uncertainties which are critical are not considered in these works [1-8]. For incorporating channel uncertainties into DPA, Jagannathan and Zawodniok [9]proposed a novel DPA algorithm to maintain a target SIR for each wireless receiver in cellularnetwork under channel uncertainties. In [9], an adaptive estimator (AE) is derived to estimateslowly time varying SIR model which can be changed with varying power and channeluncertainties, and then adaptive DPA is proposed to force actual SIR of each wireless userconverge to target SIR. Motivated from this idea, channel uncertainties have also been includedinto the developed novel finite horizon adaptive optimal DPA scheme. In this paper, a novel finite horizon adaptive optimal distributed power allocation (FH-AODPA) for PUs and SUs in enhanced CRN with channel uncertainties is proposed. Based onthe special property of enhanced CRN (i.e. introduced SUs can use PU’s wireless resourcewhen PUs are deactivated, also SUs are allowed to coexist with PUs while PUs are activated bymanaging interference caused from SUs to PUs properly), FH-AODPA can be developed undertwo cases: Case 1 PUs are deactivated while in Case 2 PUs are activated. In Case 1, since PUsare deactivated and SUs would dominant CRN, proposed FH-AODPA has to force the SIRs ofSUs to converge to a higher target value in order to increase the CRN utility within finitehorizon. However, in Case 2, since PUs are activated, proposed FH-AODPA has to not onlyforce the SIRs of the SUs to converge to a low target value to guarantee QoS for PUs, but alsoincrease network utility by allocating the transmission power properly for both PUs and SUsduring finite time period. Therefore, according to the target SIRs, the novel SIR error dynamics with channeluncertainties are derived first for PUs and SUs under two cases. Second, by using idea ofadaptive dynamic programming (ADP), the novel adaptive value function estimator and finitehorizon optimal DPA are proposed without known channel uncertainties for both PUs and SUsunder two cases. It is important to note that proposed FH-AODPA scheme cannot only forceseach PU’s and SU’s SIR converge to target SIRs respectively in two cases, but also optimizesthe power allocation during finite convergence period which is more challenging compared dueto terminal state constraint. The finite horizon optimal DPA case has not been addressed so farin the literature. Compared with infinite horizon, finite horizon optimal DPA design shouldoptimize the network utility while satisfying the terminal constraint [14]. Meanwhile, proposedFH-AODPA algorithm being highly distributive in nature does not require any inter-linkcommunication, centralized computation, and reciprocity assumption as required in a centrallycontrol wireless environment.This paper is organized as follows. Section II introduces the background included cognitiveradio network and wireless channel with uncertainties. Next, a novel adaptive optimaldistributed power allocation (AODPA) scheme is proposed along with convergence proof forboth PUs and SUs in enhanced CRN under two cases in Section III. Section IV illustrates theeffectiveness of proposed adaptive optimal distributed power allocation in enhanced CRN vianumerical simulations, and Section V provides concluding remarks. 2
  3. 3. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 20132 Background2.1 Enhanced Cognitive Radio Network (CRN) As shown in Figure 1, the general enhanced Cognitive Radio Network (CRN) can beclassified into two types of sub-networks: Primary Radio Network (PRN) and Secondary RadioNetwork (SRN) which include all PUs and SUs in enhanced CRN respectively. In order toimproving network utilities (e.g. spectrum efficiency etc.), SUs are introduced in enhancedCRN to share the wireless resource (e.g. spectrum etc.) with PUs which usually exclusivenetwork resource in other existing wireless networks (e.g. WLAN, WiMAX, etc.). On the otherhand, similar to traditional wireless networks, QoS of PUs have to be guaranteed in enhancedCRN even though SUs coexist. Therefore, SUs in enhanced CRN need to learn the wirelesscommunication environment and decide their communication specifications (e.g. transmissionpower, target SIR, etc.) to not only maintain the QoS of PUs, but also increase the networkutility such as spectrum efficiency and so on. Figure 1.EnhancedCognitive Radio Network Due to special property of enhanced CRN, traditional wireless network protocol (e.g.resource allocation, scheduling etc.) might not be suitable for CRN. Therefore, novel protocolis extremely needed to be developed for enhanced CRN. Using the enhanced CRN property,novel protocol has to be separated into two cases: Case 1 PUs are deactivated; Case 2 PUs areactivated. In Case 1, since SUs dominant the enhanced CRN, enhanced CRN network protocolhas to improve the SRN performance as much as possible. However, in Case 2, enhanced CRNnetwork protocol has to not only guarantee QoS of PUs, but also increase CRN network utilityby allocating resource to both PUs and SUs properly.2.2 Channel uncertainties It is important to note that wireless channel imposes limitations on the wireless networkwhich also includes the cognitive radio network. The wireless link between the transmitter andthe receiver can be simple line-of-sight (LOS) or non LOS, or combining LOS and non LOS. Incontrast to a wired channel, wireless channels are unpredictable, and harder to analyze since 3
  4. 4. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013more uncertain elements (e.g. fading, shadowing etc.) are involved. In this paper, three mainchannel uncertainties (i.e. path loss, the shadowing, and Rayleigh fading) are considered. Sincethese main wireless channel uncertainties factors can attenuate the power of the received signaland cause variations in the SIR at the receiver significantly, it is very important to understandthese before proposing novel adaptive optimal DPA scheme for cognitive radio network. Thedetails are given as follows. In [10], assuming path loss, received power attenuation can be expressed as the followinginverse n-th power law h hij = n (1) d ijwhere h is a constant gain which usually is equal to 1 and d ij is the distance between thetransmitter of j th user to the receiver of i th user and n is the path loss exponent. Note that thevalue of path loss exponent (i.e. n ) is depended on the characteristic of wireless communicationmedium. Therefore, path loss exponent n have different values for different wirelesspropagation environments. In this paper, n is set to 4 which is normally used to model path lossin the urban environment. Moreover, the channel gain hij is a constant when mobility of multiplewireless users is not considered. Further in large urban areas, high buildings, mountains and other objects can block thewireless signals and blind area can be formed behind a high rise building or between twobuildings. For modeling the attenuation of the shadowing to the received power, the term 10 0.1ζis used to model as [11-12], where ζ is defined to be a Gaussian random variable. In nextgeneration wireless communication system included cognitive radio network, the Rayleighdistribution is commonly used to describe the statistical time varying nature of the receivedenvelope of a flat fading signal, or the envelope of an individual multipath component. In [10],the Rayleigh distribution has a probability density function (pdf), p(x) , given as:  x  x2   2 exp −   2σ 2  0≤ x≤∞ p( x ) = σ (2)    0 x<0 where x is a random variable, and σ 2 is known as the fading envelope of the Rayleighdistribution. Since all of these factors (i.e. pass loss, shadowing and Rayleigh fading) can impact thepower of received signals and SIR of multiple users, a channel gain factor is used andmultiplied with transmitted power to present the effect from these wireless channeluncertainties. The channel gain h can be derived d as [10-11]: h = f (d , n, X , ζ ) = d − n ⋅ 10 0.1ζ ⋅ X 2 (3)where d −n represents the effect from path loss, 10 0.1ζ corresponds to the effect from shadowing.For presenting Rayleigh fading, it is usually to model the power attenuation as X 2 , where X is arandom variable with Rayleigh distribution. Obviously, the channel gain h is a function of time.3 Proposed finite horizon adaptive optimal distributed powerallocation (FH-AODPA) scheme In this section, a novel finite horizon adaptive optimal distributed power allocation (FH-AODPA) scheme is proposed to optimize the power consumption by forcing the SIRs of thePUs and SUs in enhanced CRN to converge to desired target SIRs within finite time under twocases (i.e. Case 1: PUs are deactivated, Case 2: PUs are activated) even with unknown wirelesschannel uncertainties respectively. In Case 1, since PUs are deactivated, proposed FH-AODPAscheme can force SUs’ SIRs converge to a high target SIR in order to increasing the network 4
  5. 5. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013capacity. In Case 2, proposed FH-AODPA scheme cannot only guarantee the PUs’communication quality by forcing PUs’ SIRs converge to a desired target SIR, but also allowSUs to coexist with PUs in enhanced CRN by forcing their SIRs converge to a low target SIRwhich maintain the interference temperature constraints. Next, the SIR time-varying model with unknown wireless channel uncertainties is introducedfor PUs and SUs. Subsequently, we setup the value function for PUs and SUs in enhanced CRNunder two cases respectively. Then, a model-free online tuning scheme is proposed to learn thevalue function of PUs and SUs adaptively for two cases within finite time, and then based ondifferent cases we develop the finite horizon adaptive optimal distributed power allocation forPUs and SUs by minimizing the corresponded value function that is learned. Eventually, theconvergence proof is given. Meanwhile, without loss of generality, lth PU and m th SU areselected to derive finite horizon adaptive optimal distributed power allocation for conveniencerespectively.3.1 Dynamic SIR Representation for PUs and SUs with Unknown Uncertainties In previous power allocation schemes [5-8], only path loss uncertainty is considered. Inaddition, without considering the mobility of PUs and SUs, the mutual interference I (t ) is heldconstant which is actually inaccurate in practical cognitive radio network. Therefore, in thispaper, more uncertainties factors included path loss, shadowing and Rayleigh fading areconsidered together and both channel gain h and the mutual interference I (t ) are assumed to beslowly time-varying. According to [9], the SIRs, RlPU (t ) Rm (t ) , at the receiver of lth PU and SUm th SUat the time instant t can be calculated respectively as: hll (t ) Pl PU (t ) hll (t ) Pl PU (t ) RlPU (t ) = = I lPU (t ) ∑ hlj (t ) PjPU (t ) + ∑ hlj (t ) Pi SU (t ) l ≠ j∈{ PUs} i∈{SUs} SU SU (4) SU hmm (t ) P (t ) m h P (t ) mm m Rm (t ) = = ∑ hmj (t ) P (t ) + ∑ h (t ) Pi (t ) SU PUSU I m (t ) j mi j∈{ PUs} m≠i∈{SUs}where I (t ), I (t ) is the mutual interference for lth PU and m th SU, Pl PU (t ), PmSU are the l PU SU mtransmitter power of lth PU and m th SU, and {PUs},{SUs} are the sets of PUs and SUsrespectively. Differentiating (4) on both sides, (4) can be expressed as dRlPU (t ) ( hll (t ) Pl PU (t )) I lPU (t ) − (hll (t ) Pl PU (t ))[ I lPU (t )] [ RlPU (t )] = = dt [ I lPU (t )]2 (5) dRm (t ) (hmm (t ) Pm (t )) I m (t ) − ( hmm (t ) Pm (t ))[ I m (t )] SU SU SU SU SU [ Rm (t )] = SU = SU 2 dt [ I m (t )]where [ RlPU (t )] , [ Rm (t )] are the derivatives of lth PU’s and m th SU’s SIR (i.e. RlPU (t ), Rm (t ) ) and SU SU[ I lPU (t )] , [ I m (t )] are the derivative of I lPU (t ) , I m (t ) . According to Euler’s formula, differential SU SUequation can be transformed to discrete-time domain. Therefore, equation (5) can be expressedin discrete time by using Euler’s formula as (hll ,k Pl ,PU ) I lPU − (hll ,k Pl ,PU )( I lPU ) k ,k k ,k ( RlPU ) = ,k ( I lPU ) 2 ,k (6) SU (hmm,k Pm,k ) I m,k − (hmm,k Pm,k )( I m,k ) SU SU SU SU (R m ,k) = ( I m ,k ) 2 SUIn other words,RlPU+1 − RlPU Pl ,PU1 (hll ,k +1 − hll ,k ) hll ,k ( Pl ,PU1 − Pl ,PU ) hll ,k Pl ,PU   hlj ,k +1 − hlj ,k ∑  ,k ,k k+ k+ k k PU 2  = + − T I lPU ,k T I lPU ,k T ( I l ,k )  j ≠l∈{PUs}    T 5
  6. 6. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013 PjPU 1 − PjPU  h −h P SU − P SU  hlj ,k  + ∑  li ,k +1 li ,k Pl ,PU + i ,k +1 i ,k hlj ,k  ,k + ,k × PjPU + (7) ,k T  i∈{SUs}  T k T     SU SURm,k +1 − Rm,k SU Pm,k +1 (hmm,k +1 − hmm,k ) SU SU SU hmm,k (Pm,k +1 − Pm,k ) hmm,k Pm,k   hmi,k +1 − hmi,k = SU + SU − 2  ∑   T I m,k T I m,k T I m,k i≠l∈{SUs}   T Pi ,SU1 − Pi ,SU   hmj ,k +1 − hmj ,k PU PjPU 1 − PjPU  hmi ,k  + ∑  hmj ,k  . k+ k ,k + ,k× Pi ,SU + Pj ,k + k T  j∈{PUs}  T T    Multiplying T on both sides, (7) can be derived as hll ,k +1 − hll ,k 1RlPU+1 = ( ,k − [ ∑ [( hlj ,k +1 − hlj ,k ) PjPU + ( PjPU+1 − PjPU ) hlj ,k ] ,k ,k ,k hll ,k I lPU j ≠l∈{ PUs } ,k (8) SU SU SU PU Pl ,PU1 k+ + ∑ [( hli ,k +1 − hli ,k ) P i ,k + (Pi ,k +1 −P i ,k ) hli ,k ]]) R l ,k + hll ,k i∈{ SUs } I lPU ,k SU hmm ,k +1 − hmm ,k 1Rm ,k +1 = ( − SU [ ∑ [( hmi ,k +1 − hmi ,k ) Pi ,SU + ( Pi ,SU1 − Pi ,SU ) hmi ,k ] k k+ k hmm ,k I m ,k i≠ m∈{SUs} SU . PU PU PU SU Pm,k +1 + ∑ [(hmj ,k +1 − hmj ,k ) P j ,k + (P j ,k +1 −P j ,k ) hmj ,k ]]) R m ,k + hmm ,k SU j∈{ PUs } I m,kNext, by defining the variables {φlPU , ρ lPU ,υlPU } and {φm,k , ρ m,k ,υ m,k } for lth PU ,k ,k ,k SU SU SU and m th SUrespectively as  PU hll ,k +1 − hll ,k 1 φl ,k = − PU [ ∑ [(hlj ,k +1 − hlj ,k ) PjPU + ( PjPU 1 − PjPU )hlj ,k ] ,k ,k + ,k  hll ,k I l ,k j ≠l∈{PUs}  + ∑ [(hli ,k +1 − hli ,k ) Pi ,SU + ( Pi ,SU1 − Pi ,SU )hli ,k ]]  i∈{SUs} k k+ k  PU (9) ρ l ,k = hll ,k  PU  PU Pl ,k +1 υl ,k = PU  I l ,kand SU hmm,k +1 − hmm,k 1φm,k = − SU [ ∑ [(hmi,k +1 − hmi,k ) Pi ,SU + ( Pi ,SU1 − Pi ,SU )hmi,k ] k k+ k hmm,k I m,k i≠m∈{SUs} + ∑ [(hmj,k +1 − hmj,k ) PjPU + ( PjPU 1 − PjPU )hmj ,k ]] j∈{PUs} ,k ,k + ,k SU (10)ρ m,k = hmm,k SU SU Pm,k +1υ m,k = I SU m, kThen, equation (8) can be represented for lth PU and m th SU respectively as RlPU+1 = φlPU RlPU + ρ lPU υ lPU ,k ,k ,k ,k ,k SU SU SU SU SU (11) Rm ,k +1 = φ m ,k Rm ,k + ρ m ,kυ m ,k Using equation (11), SIR dynamics of each PU (i.e. l ∈ {PUs} ) and SU (i.e. m ∈ {SUs} ) can beobtained without loss of generality. Moreover, it is observed that the SIR dynamics for PUs andSUs is a function of wireless channel variation from time instant k to k + 1 . However, due touncertainties, wireless channel variation cannot be known beforehand which causes the DPAscheme development for PUs and SUs more different and challenging, especially for finitehorizon optimal designing. For solving this challenging issue, a novel finite horizon adaptiveoptimal DPA method is proposed as next. 6
  7. 7. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 20133.2 Value function setup for finite horizon adaptive optimal DPA in enhancedCRN As introduced above, two cases in enhanced CRN need to be considered in proposedAODPA scheme (i.e. Case 1: PUs are deactivated; Case 2: PUs are activated). The valuefunction for PUs and SUs will be setup differently for the two cases. The details are given asfollows.Case 1: PUs are deactivated In Case 1, since PUs are deactivated, wireless resource (e.g. spectrum etc.) allocated to PUswill be free. Therefore, proposed FH-AODPA scheme would force SUs’ SIRs converge to a SUhigh target SIR, γ H ,in order to improving the performance of enhanced CRN (e.g. spectrumefficiency, network capacity, etc.) within finite horizon. Therefore, the SIR error dynamics forSUs in enhanced CRN can be represented by using equation (11) as: em,k,+1 = φm,k ,1em,k,1 + (φm,k ,1 − 1)γ H + ρ m,k ,1υ m ,k ,1 SU 1 SU SU SU SU SU SU (12a)In the other words, em,k,+1  φm,k ,1 φm,k ,1 − 1 em,k,1   ρ m,k ,1  SU ,1 SU 1 SU SU SU SU  SU  =    SU  +  υ m,k (12b)  γH   0 1  γ H   0 Moreover, Em,k ,+1 = Am ,k ,1 Em ,k ,1 + Bm ,k ,1υ m ,k ,1 SU 1 SU SU SU SU (12c) SU , 1 SU , 1 SU SUwhere SIR error in Case 1 as e m ,k =R m ,k −γ H ,γ H is the high target SIR for SUs under Case 1, SU , 1 SU , 1 SU Tand augmented state E = [e γ ] . Then, according to [14] and equation (12c), the cost m,k m,k Hfunction for m th SU in Case 1 can be defined within finite time as ( E SU ,1 )T G SU ,1 E SU ,1  ∀k = 0,1,..., N − 1 J m,k,1 =  m,k,1 T m,k,1 m,k,1 SU SU SU SU (13) ( Em, N ) Pm, N Em, N where G m ,k ,1 ≥ 0 is the solution of the Riccati equation [14], and NT s is the final time constraint. SUThe optimal action dependent value function V (•) of m th SU in Case 1 is defined as:V ( Em,k,1 ,υm,k,1 ) = r ( Em,k, 1 ,υm,k, 1 ) + J m,k,+1 = [(Em,k, 1 )T (υm,k, 1 )T ]ΘSUk, 1[(Em,k, 1 )T (υm,k,1 )T ]T SU SU SU SU SU 1 SU SU m, SU SU (14) SU , 1 SU , 1 SU , 1 T SU ,1 SU , 1 SU ,1 T SU ,1 SU ,1 SU , 1with r (E ,υ ) = (E ) Q E m,k m,k + (υ ) S υ , ∀k = 0,1,..., N − 1 , Q and S SU , 1 are m,k m,k m,k m,kpositive definite matrices. Using Bellman equation and cost function definition (13), we can formulate the followingequation by substituting value-function into Bellman equation as T Em,k ,1  SU , 1  Em,k,1  SU SU SU , 1 SU , 1 SU , 1 SU ,1  Θ m,k  SU ,1  = r ( Em ,k ,υ m,k ) + J m,k +1 ∀k = 0,1,..., N − 1 (15)υ m,k   υm ,k    T  E SU ,1  Q SU ,1 + ( Am,k ,1 )T Gm,k ,+1 Am,k ,1 SU SU 1 SU ( Am,k ,1 )T Gm,k ,+1 Bm,k ,1 SU SU 1 SU   Em,k ,1  SU=  m,k,1   SU ,1 T SU ,1 SU ,1   SU ,1   SU υm,k    ( Bm,k ) Gm,k +1 Am,k  S SU ,1 + ( Bm,k ,1 )T Gm,k ,+1 Bm,k ,1  υm,k  SU SU 1 SU  Next after incorporating the terminal constraint in the value function and (15), slowly timevarying Θ m ,k,1 matrix can be expressed as SU Θm, ,SU ,1 Θmυk, SU ,1  QSU ,1 + ( Am,k,1 )T Gm,k,+1 Am,k,1 EE E , SU SU 1 SU ( Am,k,1 )T Gm,k ,+1Bm,k,1  SU SU 1 SUΘm,k,1 =  υEk,SU ,1 SU = SU 1 SU  Θm,k  Θυυk, SU ,1   (Bm,k,1 )T Gm,k ,+1 Am,k,1 m,   SU SU 1 SU S SU ,1 + (Bm,k,1 )T Gm,k,+1Bm,k,1  SU and ∀k = 0,1,..., N − 1 P SU ,1 0Θm,k,1 =  m, N SU .  0 0 7
  8. 8. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013 Next, according to [14], the gain of the optimal power allocation for m th SU under Case 1can be represented in term of value function parameters, Θ SUk,1 ∀k = 0,1,..., N − 1 , as m,K m,k,1 = [S SU ,1 + ( Bm,k,1 )T Gm,k,+1 Bm,k,1 ]−1 ( Bm,k,1 )T Gm,k,+1 Am,k,1 = (Θυυ,k,SU ,1 ) −1 ΘυE,k,SU ,1 SU SU SU 1 SU SU SU 1 SU m m (16) It is important to note that even Riccati equation solution, Gm,k ,1 , is known, solving the SUoptimal design gain K m,k,1 for m th SU under Case 1 still requires its SIR error dynamics (i.e. SUAm,k ,1 , Bm,k ,1 ) which cannot be known due to channel uncertainties. However, if the parameter SU SUvector Θ m ,k,1 , k = 0,1,..., N − 1can be estimated online, then m th SU’s SIR error dynamic is not SUneeded to calculate finite horizon optimal DPA gain. Meanwhile, SIR of PUs in enhanced CRNwill not be considered since they are deactivated in Case 1.Case 2: PUs are activated In Case 2, proposed FH-AODPA scheme should not only force PUs’ SIRs converge to adesired target SIR (i.e. γ PU ) for maintaining their QoS, but also force SU’ SIRs converge to a SUlow target SIR (i.e. γ L ) in order to coexist with PUs. Therefore, the SIR error dynamics for lthPU and m th SU can be expressed as elPU+,12  φlPU , 2 φlPU , 2 − 1 elPU , 2   ρ lPU , 2  PU , 2 ,k ,k ,k ,k ,klth PU :  PU  =    PU  +  υl ,k   γ   0  1  γ   0    (17a) em,k,+21  φm,k , 2 φm,k , 2 − 1 em,k, 2   ρ m,k, 2  SU , 2 SU SU SU SU SUmth SU :  SU  =    SU  +  υ m,k  γL   0 1  γ L   0 In the other words,lth PU : ElPU+12 = AlPU , 2 ElPU , 2 + BlPU , 2υlPU , 2 ,k , ,k ,k ,k ,k (17b)mth SU : Em,k ,+2 = Am,k , 2 Em,k , 2 + Bm,k , 2υ m,k , 2 SU 1 SU SU SU SUwhere SIR error in Case 2 for PU and SU as elPU , 2 = RlPU , 2 − γ PU , em ,k, 2 = Rm ,k , 2 − γ L , γ PU , γ L is the ,k ,k SU SU SU SUdesired target SIR for PUs and high target SIR for SUs under Case 2, and augmented stateElPU , 2 = [elPU , 2 γ PU ]T , Em,k , 2 = [em,k, 2 γ L ]T . Then, according to the same theory and derivation ,k ,k SU SU SUas Case 1, slowly time varying Θ lPU , 2 , Θ SUk, 2 matrices for PU and SU in Case 2 can be expressed ,k m,respectively as: ΘlEE,PU,2 ΘlE,k , PU,2  Q PU,2 + ( AlPU ,2 )T GlPU12 AlPU ,2 υ ,k ,k + , ,k ( AlPU ,2 )T GlPU12 BlPU ,2 ,k ,k + , ,k ΘlPU,2 =  υ,E,PU ,2 ,k k υυ, PU ,2  = PU , 2 T SU , 2 SU, 2 PU , 2 PU ,2 T PU , 2 PU ,2  Θl ,k  Θl ,k     (Bl ,k ) Gl ,k +1 Al ,k S + (Bl ,k ) Gl ,k +1 Bl ,k   Θm, ,SU, 2 Θmυk, SU, 2  QSU, 2 + ( Am,k, 2 )T Gm,k,+2 Am,k, 2 EE E , SU SU 1 SU ( Am,k, 2 )T Gm,k,+2 Bm,k, 2 SU SU 1 SU Θm,k,2 =  υEk,SU, 2 SU υυ, SU , 2  = SU , 2 T SU , 2 SU , 2 SU , 1 SU ,2 T SU , 2 SU , 2  Θm,k  Θm,k     (Bm,k ) Gm,k +1 Am,k S + ( Bm,k ) Gm,k +1Bm,k  Next, the gain of and ∀k = 0,1,...,N − 1 P PU , 2 0 SU,2 P 0 SU , 2ΘlPU,2 =  ,N l ,N , Θm,N =  . m, N  0 0 0  0the optimal power allocation for lth PU and m th SU under Case 2 can be represented in term ofvalue function parameters, Θ lPU , 2 , Θ m ,k, 2 , ∀k = 0,1,..., N − 1 , respectively as ,k SUKlPU , 2 = [S PU , 2 + ( BlPU , 2 )T GlPU 12 BlPU , 2 ]−1 ( BlPU , 2 )T GlPU 12 AlPU , 2 = (Θυυ ,PU , 2 ) −1 Θυ,E,PU , 2 ,k ,k ,k + , ,k ,k ,k + , ,k l ,k lk (18)Km,k, 2 = [S SU , 2 + ( Bm,k, 2 )T Gm,k,+2 Bm,k, 2 ]−1 ( Bm,k, 2 )T Gm,k,+2 Am,k, 2 = (Θυυ,k,SU , 2 ) −1 ΘυE,k,SU , 2 SU SU SU 1 SU SU SU 1 SU m mSimilar to Case 1, once value function parameters Θ lPU , 2 , Θ m ,k, 2 for lth PU and m th SU under Case ,k SU2 have been tuned, the finite horizon optimal power allocation can be obtained for PU and SUin Case 2 by using (18). 8
  9. 9. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 20133.3 Model-free online tuning adaptive estimator for value function In this section, proposed finite horizon adaptive optimal approach derives a novel estimatorwhich is used to estimate the value function for PUs and SUs in CRN under two casesrespectively. After that, finite horizon optimal power allocation design will be derived by usinglearned value function parameters for PUs and SUs under two cases. The details are given asfollows. 3.3.1 Adaptive estimator of value function and Θ matrix under two casesCase 1: PUs are deactivatedBefore deriving the adaptive estimator (AE), the following assumption is asserted.Assumption 1 [15]: The value function, V ( Em,k,1 ,υm,k,1 ) , can be represented as the linear in the SU SUunknown parameter (LIP). Then, using adaptive control theory [19] and (14), the value-function for mth SU in enhancedCRN under Case 1can be represented in vector form asV (Em,k,1,υm,k,1 ) = (zm,k,1 )T Θm,k,1zm,k,1 = (θm,k,1 )T zm,k,1 = (Wm ,1 )T σ (N − k)zm,k,1 ∀k = 0,...,N SU SU SU SU SU SU SU SU SU (19)where θ m,k ,1 = vec(Θ SUk,1 ) = (WmSU ,1 )T σ ( N − k ), zm,k,1 = [( Em,k ,1 )T υ m ( Em,k ,1 )]T and z m, k,1 = SU m, SU SU T SU SU[( zm,k,11 ) 2 ,.., zm,kn1−1 zm,kn1 , ( zm,kn1 ) 2 ] is the Kronecker product quadratic polynomial basis vector [20] SU SU , SU , SU ,for m th SU in enhanced CRN under Case 1, vec(•) function is constructed by stacking thecolumns of matrix into one column vector with off-diagonal elements [16]. Moreover, σ (•) isthe time-dependent regression function for the value function parameter estimation θ m, k ,1 . It is SUimportant to note that θ m, N,1 = vec(Θ m, N,1 ) is considered as the known terminal constraint in finite SU SUhorizon optimal DPA problem. Therefore, it is obvious that target parameter WmSU ,1 for mth SUand regression function σ (•) should satisfy θ m,N,1 = (WmSU ,1 )T σ (0) . SU Based on relationship between value function and cost function [16], cost function of m th SUunder Case 1can also be represented in term of Θ m ,k,1 as SU J m,k,1 ( E ) = V ( Em,k ,1 ,υ m,k,1 ) = ( zm,k,1 )T ΘSUk,1 zm,k,1 = (θ m,k ,1 )T zm,k ,1 SU SU SU SU m, SU SU SU (20) = (Wm ,1 )T σ ( N − k ) zm,k ,1 ∀k = 0,..., N SU SU Next, the value function of mth SU in Case 1, V ( Em,k,1 ,υm,k,1 ) , can be approximated by using SU SU ˆ SUadaptive estimator in terms of estimated parameter Θ m,k,1 as ˆ SU SU ˆ SU SU ˆ SU V ( Em,k ,1 ,υ m,k,1 ) = (θ m,k ,1 )T zm,k ,1 = (Wm,k ,1 )T σ ( N − k ) zm,k ,1 ∀k = 0,..., N SU (21) ˆwhere WmSU ,1 is the estimated value of mth SU’s target parameter vector θ m ,k ,1 at time kTs under SU ,kCase 1.It is observed that mth SU’s Bellman equation in Case 1 can be rewritten as J m,k ,+1 ( E ) − J m,k ,1 ( E ) + r ( Em,k ,1 ,υm,k,1 ) = 0 . However, this relationship does not hold when we apply SU 1 SU SU SU ˆthe estimated matrix ΘSUk,1 . Hence, using delayed values for convenience; the residual error m, ˆ SU ˆ SU 1associate with (21) can be expressed as J m,k ,1 ( E ) − J m,k ,−1 ( E ) + r ( Em,k ,−1 , υm,k,−1 ) = em,Wk , i.e. SU 1 SU 1 FTBE FTBE ˆ SU ˆ SU 1em ,Wk = J m ,k ,1 ( E ) − J m ,k ,−1 ( E ) + r ( Em ,k ,−1 ,υ m ,k ,−1 ) SU 1 SU 1 (22a) SU 1 ˆ SU = r ( Em ,k ,−1 ,υ m ,k ,−1 ) + (Wm ,k ,1 )T ∆Z m ,k ,−1 SU 1 SU 1 k = 0,1,..., N FTBEwhere ∆Z m ,k ,−11 = σ ( N − k ) z m ,k ,1 − σ ( N − k + 1) z m ,k ,−11 , and em,Wk is Bellman equation residual error for SU SU SUthe finite horizon scenario of mth SU under Case 1. Next, the dynamics of (22a) can berepresented as 9
  10. 10. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013 FTBE SU SU ˆ SU , em,Wk +1 = r ( Em,k ,1 ,υ m,k ,1 ) + (Wm,k +1 )T ∆Z m,k ,1 SU ∀k = 0,1,..., N (22b) 1 FTBE FCBesides considering em,Wk , the estimation error em,Wk due to the terminal constraint needs to beconsidered and therefore given FC SU ˆ SUem,Wk = θ m, N,1 − (Wm,k ,1 )T σ (0) (23)Next, we define the auxiliary residual error vector and terminal constraint estimation errorvector can be defined as FTBE SU , ˆ SU Ξ m ,k , SU ,1 = Γm ,k −11 + (Wm ,k ,1 )T ∆Z m ,k,−1 SU 1 (24) FC SU ˆ SU Ξ m ,k,SU ,1 = θ m , N,1Λ − (Wm ,k ,1 )T σ (0)Λwhere Λ is a known dimension matching matrix Γm,k −11 = [ r ( Em,k ,−1 ,υ m,k ,−1 ) .... r ( Em,k ,−1−i , υm,k,−1−i )] and SU , SU 1 SU 1 SU 1 SU 1∆Zm,k,−1 = [∆Zm,k,−1 .... ∆Z m ,k ,−1−i ] , 0 < i < k − 1 and ∀m∈{SUs} . Then the dynamics of auxiliary SU 1 SU 1 SU 1residual error vector (24) are generated similar as (23) and revealed to beΞ FTBE , SU ,1 =Γ SU ,1 ˆ SU , SU FC SU SU ˆ SU , + (θ m ,k +11 )T ∆Z m ,k,1 , Ξ m ,k,+1 ,1 = θ m , N,1Λ − (Wm ,k +1 )T σ (0)Λ . m ,k +1 m ,k 1 To force both the Bellman equation and terminal constraint estimation error converge to ˆ SUzero, the update law of the mth SU’s time varying matrix Θm,k,1 in Case 1 can be derived as ˆ SU,Wm,k +1 = Ψm,k ,1[(Ψm,k ,1 )T Ψm,k ,1 ]−1[αW ,1 (Ξm,k ,SU,1 )T + αW ,1 (Ξm,k,SU,1 )T − (Γm,k ,1 )T − θm,N,1Λ] SU SU SU SU FTBE SU FC SU SU (25) 1 SU ,1 SU ,1 SU ,1where Ψ m ,k = ∆Z m ,k − σ (0)Λ and 0 < α W < 1 . Substituting (25) into (24) results FTBE , SU ,1 FC , SU ,1 SU ,1 FTBE , SU ,1 FC , SU ,1 Ξ m , k +1 +Ξ m , k +1 =α W (Ξ m ,k +Ξ m ,k ) (26) ~ ˆ Then defining the parameter estimation error of mth SU under Case 1 as WmSU ,1 = WmSU ,1 − WmSU ,1 , ,k ,kdynamics of estimation errors of mth SU’s adaptive estimator parameter in Case 1 can beexpressed as ~ SU , ~ SU (Wm,k +1 )T [∆Z m,k,1 − σ (0)] = αW ,1 (Wm,k ,1 )T [∆Z SUk,−1 − σ (0)] 1 SU SU m, 1 (27) Next, the estimation of mth SU’s optimal design under Case 1 will be derived based on tuned ˆparameter ΘSUk,1 as m, ˆ SU SU ˆ ˆ υm,k,1 = − K m,k,1 zm,k,1 = −(Θυυ,k,SU ,1 ) −1 ΘυE,k,SU ,1 zm,k,1 ˆ SU SU (28a) m mThen, using (10), the mth SU’s adaptive optimal DPA design under Case 1 can be expressed as ˆ SU SU ˆ ˆ PmSU+1 = υ m ,k ,1 I m ,k ,1 = −(Θυυ,k, SU ,1 ) −1 ΘυE,k, SU ,1 z m ,k,1 I m ,k ,1 , SU SU (28b) ,k 1 m mCase 2: PUs are activated Similar to Case 1, the value-function for mth PU and mth SU in CRN under Case 2 can beestimated in vector form respectively as ˆ ˆ ˆlth PU : V ( ElPU ,2 ,υlPU , 2 ) = (θlPU , 2 )T zlPU , 2 = (Wl ,PU , 2 )T σ ( N − k ) zlPU , 2 ,k ,k ,k ,k k ,k (29) ˆ SU ˆ SU ˆ SUmth SU : V ( Em,k , 2 ,υm,k , 2 ) = (θ m,k , 2 )T zm,k , 2 = (Wm,k , 2 )T σ ( N − k ) zm,k , 2 SU SU SUwhere zlPU , 2 , z m,k , 2 are Kronecker product quadratic polynomial basis vector for lth PU and mth SU ,k SUin enhanced CRN under Case 2. Next,the update law of lth PU’s and mth SU’s time varying matrices, ΘlPU , 2 , Θm,k, 2 , in Case 2 ˆ ˆ SU ,kcan be derived respectively as ˆ lth PU : Wl ,PU1, 2 = ΨlPU , 2 [(ΨlPU ,2 )T ΨlPU , 2 ]−1[αW , 2 (ΞlFTBE, PU , 2 )T + αW , 2 (ΞlFC ,PU , 2 )T PU PU k+ ,k ,k ,k ,k ,k − (ΓlPU , 2 )T − θ m, N, 2 Λ] ,k PU (30) ˆ SU ,mth SU : Wm,k +2 = Ψm,k ,2 [(Ψm,k , 2 )T Ψm,k , 2 ]−1[αW , 2 (Ξ m,k ,SU , 2 )T + αW , 2 (Ξ m,k,SU , 2 )T SU SU SU SU FTBE SU FC 1 − (Γm,k , 2 )T − θ m, N,2 Λ] SU SU 10
  11. 11. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013where tuning parameter 0 < αW ,2 < 1 and 0 < αW , 2 < 1 . Then, dynamics of estimation errors for lth PU SUPU’s and mth SU’s adaptive estimator parameter in Case 2 can be represented respectively as ~ ~lth PU : (Wl ,PU1, 2 )T ∆ZlPU , 2 = αθPU , 2 (Wl ,PU , 2 )T ∆ZlPU ,2 k+ ,k k ,k ~ SU , 2 T SU , 2 SU , 2 ~ SU , 2 T SU , 2 (31)mth SU : (Wm,k +1 ) ∆Z m,k = αθ (Wm,k ) ∆Z m,k Then, we can derive the finite horizon adaptive optimal DPA design for lth PU and mth SU inenhanced CRN under Case 2 based on tuned parameter ΘlPU , 2 , ΘSUk, 2 , ∀k = 0,1,..., N − 1 , ˆ ˆ ,k m,respectively as:lth PU : ˆ ˆ Pl ,PU1, 2 = υlPU , 2 I lPU ,2 = −(Θυυ, PU , 2 ) −1 Θυ,E , PU ,2 zlPU , 2 I lPU , 2 ˆ ,k ,k k+ l ,k lk ,k ,k (32) mth SU : SU , 2 ˆ SU , 2 SU , 2 Pm,k +1 = υm,k I m,k = −(Θm,k ˆ υυ , SU ,2 ) −1 ΘυE, SU , 2 z SU , 2 I SU ,2 ˆ m,k m,k m,k Eventually, the stability of value function estimation, adaptive DPA estimation, and adaptiveestimation error dynamics for PUs and SUs in two cases are considered in next section. 3.3.2 Closed-loop finite horizon adaptive optimal DPA system stability for PUs and SUsin enhanced CRN Since proposed finite horizon adaptive optimal DPA is designed for enhanced CRN PUs andSUs in two cases, the closed-loop stability will be analyzed under two cases respectively.Case 1: PUs are deactivated In this case, it will be shown that mth SU’s time-varying matrix, Θ l , k , ∀k = 0,1,..., N − 1 , andrelated value function estimation errors dynamic are Uniformly Ultimately Boundednes (UUB)when PUs in enhanced CRN are deactivated. Further, the estimated finite horizon adaptiveoptimal distributed power allocation will approach the optimal power allocation within a smallultimate bound. Next the initial system states (i.e. SIR errors of SUs) are considered to reside inthe compact set which in turn is stabilized by using the initial stabilizing input υ 0SU ,,k1 . Further msufficient condition for the adaptive estimator tuning gain αW ,1 is derived to ensure the all future SUSUs’ SIR errors will converge close to zero. Then it can be shown that the actual finite horizonadaptive DPA approaches the optimal power allocation for SUs in Case 1 within ultimatebound during finite time period. Before introducing the convergence proof, the algorithm represented the proposed finitehorizon adaptive optimal distributed power allocation is given as follows.Algorithm 1:Finite Horizon Adaptive Optimal Distributed Power Allocation for mthSU in enhanced CRN under Case 1 (i.e. PUs are deactivated) ˆ1: Initialize: WmSU ,1 = 0 and implementing admissible policy υ m , 0,1 . SU ,k2: while { kTs < t < (k + 1)Ts }do3: Calculate the value function estimation errors Ξ m,k ,SU ,1 and Ξ m ,k,SU ,1 . FTBE FC4: Update the parameters of the value function estimator5: ˆ SU, Wm,k +1 = Ψm,k ,1[(Ψm,k ,1 )T Ψm,k ,1 ]−1[αW ,1 (Ξm,k ,SU,1 )T + αW ,1 (Ξm,k,SU,1 )T − (Γm,k ,1 )T − θm, N,1Λ] SU SU SU SU FTBE SU FC SU SU 16: Update finite horizon adaptive optimal DPA based on estimated Θ m,k,1 matrix. SU7: ˆ SU SU ˆ ˆ υm,k,1 = − K m,k,1 zm,k,1 = −(Θυυ,k,SU ,1 ) −1 ΘυE,k,SU ,1 zm,k,1 ˆ SU SU m m , ˆ SU SU ˆ ˆ8: PmSU+1 = υ m,k ,1 I m,k ,1 = −(Θυυ,k, SU ,1 ) −1 ΘυE,k, SU ,1 zm,k,1 I m,k ,1 SU SU ,k 1 m m9: end while10:If {t < NTs } do11:Go to next time interval [(k + 1)Ts , (k + 2)Ts ) (i.e. k = k + 1 ), and then go back 11
  12. 12. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 201312:to line 2.13: else do14: Stop the algorithm.Theorem 1.(Convergence of the Adaptive Optimal Distributed Power Allocation for SUs inenhanced CRN under Case 1). Let υ m , 0,1 be any initial admissible policy for the mth SU’s finite SUhorizon adaptive optimal DPA scheme in Case 1 with 0 < lo < 1 / 2 . Within the time horizon (i.e.t ∈ [0, NTs ] ), let the mth SU’s parameters be tuned and estimation finite horizon optimal powerallocation be provided by (25) and (28b) respectively. Then, there exists positive constant αW ,1 SUgiven as 0 < αW ,1 < 1 such that the mth SU’s SIR error em ,k ,1 and value function parameter SU SU ~estimation errors Wm,k ,1 are all uniformly ultimately bounded (UUB) in Case 1 (i.e. PUs are SUdeactivated) within the finite time horizon. Moreover, the ultimate bounds are depend on finaltime (i.e. NTs ), bounded initial value function estimation error Bm ,,0 ,1 and bounded initial SIR V SUerror state B m ,,0SU ,1 . EProof: Consider the following positive definite Lyapunov function candidate ( SU ) ~ SU ( Lk = LD Em ,k ,1 + LJ Wm ,k ,1 ) (33) SU ,1 SU ,1 2 S [1 − (α ) ]where LD (E m ,k ,1 ) is defined as LD (Em,k ,1 ) = ( Em ,k ,1 )T Π Em,k ,1 with Π = SU SU SU SU W SU ,1 2 I is positive 2( B ) m, M ~definite matrix and I is identity matrix, Bm ,k ,1 ≤ Bm,M,1 ,and LJ (WmSU ,1 ) is defined as SU SU ,k ~ SU ~ SU ~ SULJ (Wm,k ,1 ) = [(Wm,k ,1 )T σ ( N − k ) zm,k ,1 − (Wm,k ,1 )T σ ( N − k + 1) zm,k ,−1 ) 2 ] SU SU 1 ~ SU (34) = [(Wm,k ,1 )T ∆Z m,k ,−1 ]2 SU 1 ~ The first difference of (34) can be expressed as ∆L = ∆LD (Em ,k ,1 ) + ∆LJ (WmSU ,1 ) , and considering SU ,k (~ ) ,k ,~ ,k 1 ,k ~that ∆LJ WmSU ,1 = [(WmSU+1 )T ∆Z l ,k ]2 − [(WmSU ,1 )T ∆Z l ,k −1 ]2 with value function estimator, we have ~ SU ~ SU , ~ SU ∆LJ (Wm,k ,1 ) = [(Wm,k +1 )T ∆Z l ,k ]2 − [(Wm,k ,1 )T ∆Z l ,k −1 ]2 1 ~ SU = −[1 − (αW ,1 ) 2 ][(Wm ,k ,1 )T ∆Z m,k,−1 ]2 SU SU 1 (35) 2 ~ 2 ≤ −[1 − (αW ,1 ) 2 ] ∆Z m,k,−1 Wm,k ,1 SU SU 1 SUNext considering the first part of Lyapunov candidate function∆LD ( Em ,k ,1 ) = ( Em ,k ,+1 )T ΠEm ,k ,+1 − ( Em ,k ,1 )T ΠEm ,k ,1 and applying the proposed FH-AODPA scheme and SU SU 1 SU 1 SU SUCauchy-Schwartz inequality reveals 2 SU ~ SU ∆LD (Em,k,1 ) ≤ Π Am,k,1Em,k,1 + Bm,k,1υm,k,1 − Bm,k,1υm,k ,1 − (Em,k,1 )T ΠEm,k,1 SU SU SU SU ˆ SU SU SU 2 2 SU ~ SU ≤ 2 Π Am,k,1Em,k,1 + Bm,k,1υm,k,1 + 2 Π Bm,k,1υm,k ,1 − (Em,k,1 )T ΠEm,k,1 SU SU SU ˆ SU SU SU (36) 2 2 SU ~ SU ≤ −(1 − 2lo ) Π Em,k,1 + 2 Π Bm,k,1υm,k ,1 SU At final step, combining the equation (34) and (36), we have 12
  13. 13. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013 2 2 2 ~ 2 SU ~ SU ∆Lk ≤ −(1 − 2k * ) Π Em,k ,1 + 2 Π Bm ,k ,1υm ,k ,1 + (αW ,1 ) 2 ∆Z m ,k ,−1 Wm,k ,1 SU SU SU 1 SU 2 ~ 2 − ∆Z m ,k ,−1 Wm ,k ,1 SU 1 SU 2 1 2 ~ 2 ≤ −(1 − 2k * ) Π Em ,k ,1 + [1 − (αW ,1 ) 2 ] ∆Z m,k ,−1 Wm ,k ,1 SU SU SU 1 SU (37) 2 2 ~ 2 2 ~ 2 + (αW ,1 ) 2 ∆Z m,k ,−1 Wm,k ,1 − ∆Z m ,k ,−1 Wm ,k ,1 SU SU 1 SU SU 1 SU 2 1 2 ~ 2 ≤ −(1 − 2k * ) Π Em ,k ,1 − [1 − (αW ,1 ) 2 ] ∆Z m,k ,−1 Wm,k ,1 SU SU SU 1 SU 2 Since 0 < k * < 1 / 2 and 0 < αW ,1 < 1 , ∆Lk is negative definite and Lk is positive definite. Using SUstandard Lyapunov theory [15], during finite horizon, all the signals can be proven UUB withultimate bounds are dependent on initial conditions and final time NTs . The details aredemonstrated as following. Assume SIR error state and value function estimation error are initiated as a bound Bm ,,0 ,1 , E SU 2 2 ~ SU 2Bm ,,0 ,1 respectively (i.e. Em , 0,1 V SU SU = Bm ,,0 ,1 , ∆Z m ,k ,−1 E SU SU 1 Wm ,k ,1 = Bm,,0 ,1 ). Using standard Lyapunov E SU 2 ~ 2Theory [15], Em, 0,1 and ∆Z m,k ,−11 Wm,k ,1 for k = 1,2,..., N can be represented as SU SU SU 2 2 ~ SU 2 2 2 ~ SU 2 Π Em,k,1 SU + ∆Z m,k ,−1 SU 1 Wm,k ,1 = Π E m,0,1 + ∆Z m, −,1 SU SU 1 Wm ,0 ,1 + Π  Em,1 ,1 − Em,0,1  +  ∆Z m, 0,1 Wm,1 ,1 − ∆Z m, −,1 Wm ,0 ,1  + L 2 2 2 ~ 2 2 ~ 2  SU SU   SU SU SU 1 SU  (38)  444 444   4444444 4444444  1 2 3 1 2 3 ~ SU ∆LD ( Em , 0,1 ) SU ∆LJ (Wm , 0 ,1 ) + Π  Em,k,1 − Em,k,−1  +  ∆Z m,k,−1 Wm,k ,1 − ∆Z m,k ,−2 Wm,k −1  2 2 2 ~ 2 2 ~ 2 SU SU 1 SU 1 SU SU 1 SU ,    1   444 444   4444444244444443 1 2 3 1  ~ SU ,1 ∆LD ( Em ,k,−1 ) SU 1 ∆LJ (Wm ,k −1 ) ~ SU ~ SU , ≤ Π ( Bm,,0 ,1 ) 2 + ( Bm,,0 ,1 ) 2 + ∆LD ( Em,0,1 ) + ∆LJ (Wm,0 ,1 ) + L + ∆LD ( Em,k,−1 ) + ∆LJ (Wm,k −1 ) E SU V SU SU SU 1 14444 4444 2 3 14444 4444 1 2 3 ∆L0 ∆Lk −1 k −1≤ Π ( Bm,,0 ,1 ) 2 + ( Bm,,0 ,1 ) 2 + E SU V SU ∑ ∆L i =0 i ∀k = 1,2,..., N Using (37) and property of geometric sequence [17], equation (38) can be represented as k −1 ∑∆L 2 2 ~ 2 Π Em,k,1 + ∆Z m,k,−1 Wm,k ,1 ≤ Π ( Bm,,0 ,1 ) 2 + ( Bm,,0 ,1 ) 2 + SU SU 1 SU E SU V SU i i=0 k −1 k −1 ∑ ∑[1 − (α 2 1 2 ~ 2≤ Π (Bm,,0 ,1 ) 2 + ( Bm,,0 ,1 ) 2 − E SU V SU (1 − 2lo ) Π Em,k,1 − SU SU ,1 2 W ) ] ∆Z m,k,−1 Wm,k ,1 SU 1 SU i=0 2 i =0 k −1≤ Π ( Bm ,,0 ,1 ) 2 + ( Bm ,,0 ,1 ) 2 − (1 − 2lo ) Π ( Bm ,,0 ,1 ) 2 E SU V SU E SU ∑ ( 2k ) i =0 * i k −1 [1 − (αW ,1 ) 2 ]( Bm ,,0 ,1 ) 2 SU V SU [1 + (αW ,1 ) 2 ]i SU− 2 ∑ 2i i =0 (39)  [1 + (αW ,1 ) 2 ]k  V ,SU ,1 2 SU≤ Π ( Bm ,,0 ,1 ) 2 + ( Bm ,,0 ,1 ) 2 − [1 − ( 2lo ) k ] Π ( Bm ,,0 ,1 ) 2 − 1 − E SU V SU E SU  ( Bm , 0 )  2  k 1 + (α W ,1 ) 2  SU≤ (2lo ) k Π ( Bm ,,0 ,1 ) 2 +  E SU V ,SU ,1 2  ( Bm , 0 ) ∀k = 1,2,..., N  2 Therefore, the ultimate bounds for SIR error value and value function estimation error can berepresented as 13
  14. 14. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013 k 1 + (αW ,1 ) 2  ( Bm,,0 ,1 ) 2 SU V SU Em,k ,1 ≤ (2lo ) k ( Bm ,,0 ,1 ) 2 +  SU E SU  ≡ Bm.,k E CL (40)  2  Πand k 2 ~ 2 1 + (αW ,1 ) 2  V ,SU ,1 2 SU ∆Z m,k ,−1 Wm,k ,1 ≤ ( 2lo ) k Π ( Bm,,0 ,1 ) 2 +  SU 1 SU E SU  ( Bm,0 ) ≡ Bm,k V ,CL  2  (41) ∀k = 1,2,..., Nwhere Bm.,kCL and Bm,,k are the values of upper bounds for k = 1,2,..., N . According to the E V CLrepresentation of (40), (41) and values of lo , αW ,1 , it is important to note that since tuning SU 1 1 + (α W ,1 ) 2 SUparameter 0 < lo < and 0 < αW ,1 < 1 given in Theorem 1, then 0 < 2lo < 1 , 0 < SU < 1 and 2 2 k 1 + (αW ,1 ) 2  SUterms (2lo ) k ,   will decrease when k increase. Moreover, since initial value function  2 estimation errors Bm,,0 ,1 and initial SIR error value Bm,,0SU ,1 are bounded values, the closed-loop V SU Eultimate bounds Bm.,kCL (40) and Bm.,kCL (41) decrease while k increase. Further, when final time E VNTs increases, the SIR error values and value function estimation errors will not only be UUB,but also these ultimate bounds will decrease with time. 1 + (αW ,1 ) 2 SURemark 1: It is important to note since 0 < 2lo < 1,0 < < 1 and initial value function 2 k V , SU ,1 E , SU ,1 1 + (α W ,1 ) 2  k SUestimation error B m,0 and SIR error B m,0 are bounded values, both term (2lo ) ,    2 and the closed-loop ultimate bounds Bm.,kCL (40) and Bm.,kCL (41) will converge to zeros when time E Vgoes to infinite (i.e. Bm.,kCL → 0 , Bm.,kCL → 0 as k → ∞ ) and proposed FH-AODPA will achieve E Vinfinite optimal DPA solution.Case 2: PUs are activated Compared with Case 1, closed-loop stability analysis need to be done for SUs and for PUssince the PUs are activated. Similar to Case 1, before introducing the convergence proof, theproposed finite horizon adaptive optimal distributed power allocation algorithm for both PUsand SUs in Case 2 is given as follows.Algorithm 2:Finite Horizon Adaptive Optimal Distributed Power Allocation for lthPU and mth SU in enhanced CRN under Case 2 (i.e. PUs are activated) ˆ ˆ1: Initialize: Wl ,PU , 2 = 0,WmSU , 2 = 0 and implementing admissible policies υ m , 0 , 2 ,υ m ,0, 2 2: PU SU k ,kfor lth PU and mth SU.3: while { kTs < t < (k + 1)Ts } do4: Calculate the value function estimation errors Ξ lFTBE ,PU , 2 , Ξ m ,k ,SU , 2 and ,k FTBE5: Ξ lFC ,PU , 2 , Ξ m ,k,SU , 2 for lth PU and mth SU. ,k FC6: Update the parameters of the value function estimator for lth PUand7: mth SU as ˆ PU , 2 = Ψ PU , 2 [(Ψ PU , 2 )T Ψ PU ,2 ]−1[α PU , 2 (Ξ FTBE, PU , 2 )T8: lth PU : Wl ,k +1 l ,k l ,k l ,k W l ,k9: + α W , 2 (Ξ lFC , PU , 2 )T − (ΓlPU , 2 )T − θ m, N, 2 Λ ] PU ,k ,k PU ˆ 210: mth SU : WmSU+,1 = Ψm,k , 2 [(Ψm,k , 2 )T Ψm,k , 2 ]−1[αW , 2 (Ξ m,k ,SU , 2 )T SU SU SU SU FTBE ,k11: + αW , 2 (Ξ m,k,SU , 2 )T − (Γm,k , 2 )T − θ m,N, 2 Λ ] SU FC SU SU 14
  15. 15. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 201312: Update finite horizon adaptive optimal DPA based on estimated Θ lPU , 2 , ,k13: Θ m ,k, 2 matrices for lth PU and mth SU. SU ˆ ˆ14: lth PU : Pl ,PU1, 2 = υlPU , 2 I lPU , 2 = −(Θυυ , PU , 2 ) −1 Θυ,E , PU , 2 zlPU , 2 I lPU , 2 ˆ ,k k+ ,k l ,k lk ,k ,k SU , ˆ SU SU ˆ ˆ15: mth SU : Pm,k +2 = υm,k, 2 I m,k, 2 = −(Θυυ,k, SU ,2 ) −1 ΘυE,k, SU , 2 zm,k, 2 I m,k,2 SU SU 1 m m16: end while17: If {t < NTs } do18: Go to next time interval [(k + 1)Ts , (k + 2)Ts ) (i.e. k = k + 1 ), and then go back19: to line 2.20: else do21: Stop algorithm.Theorem 2.(Convergence of the Finite Horizon Adaptive Optimal Distributed PowerAllocation for PUs and SUs in enhanced CRN under Case 2). Let υ lPU , 2 ,υ m , 0, 2 be any initial ,0 SUadmissible policy for lth PU’s and mth SU’s finite horizon adaptive optimal DPA scheme inCase 2 with 0 < k * < 1 / 2 . Let the lth PU’s and mth SU’s parameters be tuned and estimationoptimal power allocation be provided by (30) and (32) respectively. Then, there exists positiveconstant αW , 2 , αW , 2 given as 0 < αW , 2 < 1,0 < αW , 2 < 1 such that the lth PU’s and mth SU’s SIR PU SU PU SU ~ ~error elPU , 2 , em ,k , 2 and value function parameter estimation errors Wl ,PU , 2 , WmSU , 2 are all UUB in Case ,k SU k ,k2 (i.e. PUs are activated) within the finite time horizon. Moreover, the ultimate bounds aredepend on finite time (i.e. NTs ), bounded initial value function estimation error BlV,0,PU ,2 , Bm,,0 , 2 V SUand bounded initial SIR error BlE0,PU ,2 , Bm,,0 , 2 . , E SUProof: Similar to the proofs in Theorem 1.4 Numerical Simulations 9 Primary users 8 Secondary users 7 6 5 km 4 3 2 1 0 0 1 2 3 4 5 6 7 8 9 km Figure 2.Placement of PUs and SUs in enhanced CRN "1": Primary users are activated; "0" Primary users are deactivated 1 0.5 0 0 200 400 600 800 1000 1200 1400 1600 1800 2000 Time (sec) Figure 3. Activity performance of PUs in enhanced CRN 15
  16. 16. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013 In this simulation, the enhanced Cognitive Radio Network (CRN) is considered to bedivided into 2 sub-networks: Primary Radio Network (PRN) and Secondary Radio Network(SRN) which included 8 PUs and 20 SUs. These PUs and SUs are placed randomly within anarea of 9km x 9km by using a Gaussian distribution which is shown as Figure 2. Moreover,power of each PU and SU in enhanced CRN is assumed to be updated asynchronously.Therefore, while lth PU or m th SU updates its power, the powers of all other PUs and SUs do notchange. Wireless channel bandwidth Wc is considered to be 100 kHz. Furthermore, it is wellknown that PUs in enhanced CRN will not always be activated. In this simulation, the activityperformance of PUs is shown in Figure 3. According to Figure 3, there exists two cases (i.e.Case 1: PUs are deactivated during [500 sec,800 sec), [1200 sec,1800 sec) ; Case 2: PUs areactivated during [0 sec,500 sec), [800 sec,1200 sec), [1800 sec,2000 sec) ) in enhanced CRN. In Case 1, SUsince PUs are deactivated, a high threshold SIR, γ H , which each SU tries to achieve is 0.1 (-10dB). In Case 2, to guarantee the QoS of activated PUs, a threshold, γ PU ,which each PU can SUachieve is selected as 0.1995 (-7dB) and another lower threshold SIR, γ L , which each SU triesto achieve is set at 0.01 (-20dB). Next, proposed finite horizon adaptive optimal distributed power allocation (FH-AODPA) isimplemented for PUs and SUs in enhanced Cognitive Radio Network with channel uncertainties.Since wireless channel attenuations of users in enhanced CRN are different, initial PUs’ andSUs’ SIRs are different values (i.e. PUs: [-18.6788dB, -7.8337dB,…, -21.1189dB], SUs: [-8.8116dB, -35.0345dB, -12.4717dB , …, -29.5028dB]).Moreover, the augment SIR error systemstate is generated as z k = [ Ek υ k ]T ∈ Թଷൈଵ and the regression function for value function isgenerated as {z12 , z1 z2 ,..., z 2 ,..., z3 } as per (19). The design parameter for the value function 2 2estimation is selected as αW = 0.0001 while initial parameters for the adaptive estimator are set tozeros at the beginning of simulation. 0 PUs average SIR SUs average SIR -5 -10 SIR (dB) -15 -20 -25 -30 0 500 1000 1500 2000 Time (sec) Figure 4.Average SIRs of PUs and SUs in enhanced CRN 250 PUs power average SUs power average 200 Average power (mW) 150 100 50 0 0 500 1000 1500 2000 Time (sec) Figure 5. Average power allocation of PUs and SUs in enhanced CRN 16
  17. 17. International Journal of Computer Networks & Communications (IJCNC) Vol.5, No.1, January 2013 In Figures 4 through 9, the performance of proposed finite horizon adaptive optimaldistributed power allocation scheme is evaluated. In Figure 4, the averages of all PUs’ SIRs andSUs’ SIRs are shown. It is important to note that proposed FH-AODPA cannot only force SUs SUconverge to low target SIR (i.e. γ H = −10dB ) when PUs are deactivated (i.e. Case 1:γ lPU = −∞ dB ∀l ∈ {PUs} during [500 sec,800 sec),[1200 sec,1800 sec) ), but also force PUs and SUsconverge to target SIRs (i.e. γ PU = −7dB, γ L = −20dB ) respectively while CRN is at Case 2 (i.e. SUPUs are activated during [0 sec,500sec),[800sec,1200sec),[1800sec,2000sec) ). Also, the powerconsumptions averages of PUs and SUs are shown in Figure 5. Obviously, in Case 1, since PUsare deactivated, SUs increase transmission powers to improve network utility (e.g. spectrumefficiency). For Case 2, due to activated PUs, SUs decrease transmission power to reduce theinference to PUs for guarantying their QoS. 20 17.5 15 12.5 FTBE 10 e 7.5 5 2.5 0 0 500 1000 1500 2000 Time (sec) Figure 6.Average of Bellman equation error. Moreover, the average of Bellman equation errors and terminal constraint errors for both PUsand SUs are considered. As shown in Figure 6 and 7, both average of Bellman equation errorsand terminal constraint errors converge close to zeros during the finite horizon (i.e. t ∈ [0, NTs ]with NTs = 2000 sec ) which indicates that proposed scheme converges close to optimal powerallocation while satisfying the terminal constraint for both PUs and SUs. It is important to notethat the convergence performances are dependent upon tuning rate based on Theorem 1 and 2.Further, according to Theorem 1and 2, when final time NTs increases, the upper bound of averageof Bellman equation errors and terminal constraint errors will decrease. 50 45 40 35 30 FC 25 e 20 15 10 5 0 0 500 1000 1500 2000 Time (sec) Figure 7. Average of terminal constraint error 17

×