This presentation discusses online opportunistic routing for cognitive radio ad-hoc networks using reinforcement learning. The objectives are to design and implement a distributed opportunistic routing algorithm, compute channel availability using Hidden Markov Model prediction, model strategic interaction among nodes to select the best forwarder, and maximize the average per packet reward. The literature survey covers previous work on reinforcement learning based routing schemes and machine learning techniques for cognitive radios. The implementation will use Java libraries for simulation and reinforcement learning to achieve the objectives.
2. Problem Statement
• To design and implement spectrum-aware online opportunistic routing for a dynamic cognitive radio environment using Reinforcement Learning (RL).
3. Objectives
• To design and implement a distributed opportunistic routing algorithm.
• To compute channel availability using Hidden Markov Model (HMM) prediction.
• To model strategic interaction among multiple cognitive nodes for selecting the best candidate forwarder.
• To maximize the average per-packet reward between source and destination.
4. Literature survey
• Title: A reinforcement learning-based routing scheme for cognitive radio ad hoc networks
– Author: Al-Rawi, Hasan A. A., et al.
– Publication: Wireless and Mobile Networking Conference (WMNC), 2014 7th IFIP. IEEE, 2014.
– Findings: Presents a simple and pragmatic reinforcement learning (RL)-based routing scheme called Cognitive Radio Q-routing (CRQ-routing).
• Title: A Survey on Machine-Learning Techniques in Cognitive Radios
– Author: Mario Bkassiny, Yang Li, and Sudharman K. Jayaweera
– Publication: Communications Surveys & Tutorials, IEEE 15.3 (2013): 1136-1159.
– Findings: Shows the impact of PU activities on the operation of OCR in channel sensing, relay selection and data transmission.
• Title: Open research issues in multihop cognitive radio networks
– Author: Sengupta S, Subbalakshmi KP (2013)
– Publication: Communications Magazine, IEEE 51.4 (2013): 168-176.
– Findings: Mapping of spectrum selection metrics and local PU interference observations to a packet forwarding delay over the control channel.
5. Literature survey
• Title: Adaptive Opportunistic Routing for Wireless Ad Hoc Networks
– Author: Abhijeet A. Bhorkar, Mohammad Naghshvar, Tara Javidi
– Publication: IEEE/ACM Transactions on Networking, Vol. 20, No. 1, February 2012
– Findings: Shows how RL can be used to route packets opportunistically even in the absence of reliable knowledge about channel statistics and the network model.
• Title: Spectrum-Aware Opportunistic Routing in Multi-Hop Cognitive Radio Networks
– Author: Yongkang Liu, Lin X. Cai
– Publication: IEEE Journal on Selected Areas in Communications, Vol. 30, No. 10, November 2012
– Findings: Shows the impact of PU activities on the operation of OCR in channel sensing, relay selection and data transmission.
• Title: CRP: A Routing Protocol for Cognitive Radio Ad Hoc Networks
– Author: Kaushik R. Chowdhury and Ian F. Akyildiz
– Publication: IEEE Journal on Selected Areas in Communications, Vol. 29, No. 4, April 2011
– Findings: Mapping of spectrum selection metrics and local PU interference observations to a packet forwarding delay over the control channel.
6. Literature survey
• Title: IPSAG: An IP Spectrum Aware Geographic Routing Algorithm Proposal for Multi-hop Cognitive Radio Networks
– Author: Cornelia-Ionela Badoi and Ramjee Prasad
– Publication: 2010 8th International Conference on , vol., no., pp.491-496, 10-12 June 2010
– Findings: Real-time information exchange inside the neighborhood and adaptation to the highly dynamic CR spectrum opportunities.
• Title: Gymkhana: a Connectivity-Based Routing Scheme for Cognitive Radio Ad Hoc Networks
– Author: Anna Abbagnale, Francesca Cuomo
– Publication: INFOCOM IEEE Conference on Computer Communications Workshops, 2010. IEEE, 2010
– Findings: Uses a distributed protocol to collect key parameters related to paths from source to destination.
• Title: Ant-based spectrum aware routing for CRN
– Author: Bowen Li, Dabai Li, Qi-hui Wu, Haiyuan Li
– Publication: International Conference on , vol., no., pp.1-5, 13-15 Nov. 2009.
– Findings: An artificial ant colony system can be used for discovering, observing and learning routing strategies through guided ants that communicate indirectly.
7. Literature survey
• Title: Channel Modeling Based on Interference Temperature in Underlay Cognitive Wireless Networks
– Author: Manuj Sharma, Anirudha Sahoo, K. D. Nayak
– Publication: IEEE International Symposium on Wireless Communication Systems (ISWCS'08). IEEE, (2008) 720-734.
– Findings: Application of a trained HMM for channel selection in multi-channel wireless networks.
• Title: Routing in Cognitive Radio Networks: Challenges and Solutions
– Author: Matteo Cesana, Francesca Cuomo, Eylem Ekici
– Publication: ELSEVIER Ad Hoc Networks (2008) vol. 24, (56-69)
– Findings: Classifies cognitive routing schemes on the basis of full spectrum knowledge and local spectrum knowledge.
• Title: NeXt generation/dynamic spectrum access/cognitive radio wireless networks: A survey
– Author: Ian F. Akyildiz, Won-Yeol Lee, Mehmet C. Vuran
– Publication: ScienceDirect, Computer Networks 50 (2006) 2127-2159
– Findings: Main functions of cognitive radios in xG networks and how they can be used to achieve dynamic spectrum access.
8. Platform of implementation
Software Requirement
• JDK (NetBeans/Eclipse)
• JiST and SWANS simulation libraries
– JiST is a high-performance discrete event simulation engine that runs over a standard Java virtual machine.
– SWANS is a scalable wireless network simulator built atop the JiST platform.
Hardware Requirement
• Processor: dual/quad-core CPU (minimum Pentium Dual Core)
• RAM: minimum 1 GB
• Disk: greater than 1 GB
10. Result Analysis
[Figure: 2-State HMM Prediction Model for Channel 1 (frequency 2.412 GHz). x-axis: Test Sequence Number (Channel 1), 1 to 6; y-axis: Mean CAM Value; series: Actual, Predicted.]
11. Result Analysis
[Figure: 2-State HMM Prediction Model for Channel 6 (frequency 2.437 GHz). x-axis: Test Sequence Number (Channel 6), 1 to 6; y-axis: Mean CAM Value; series: Actual, Predicted.]
12. Result Analysis
[Figure: 2-State HMM Prediction Model for Channel 11 (frequency 2.462 GHz). x-axis: Test Sequence Number (Channel 11), 1 to 6; y-axis: Mean CAM Value; series: Actual, Predicted.]
16. Paper Published/Submitted
• International Journal of Computer Applications (IJCA)
– Solao Harshal, R. M. Goudar, and Sunita Barve. "Routing Approaches for Cognitive Radio Ad-hoc Networks and Challenges." International Journal of Computer Applications 108.17 (2014): 17-22. DOI: 10.5120/19003-0499
• International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS)
– Online Opportunistic Routing in Cognitive Radio Ad-Hoc Network and Spectrum Management (Accepted)
17. Conferences
• cPGCON-2015
– "Online Opportunistic Routing For Cognitive Radio Ad-Hoc Network", Fourth Post Graduate Symposium For Computer Engineering (cPGCON-2015), Board of Studies, Savitribai Phule Pune University, MET Bhujbal Knowledge City, Adgaon, Nashik, 13th-14th March 2015.
18. Conclusion
• Using the channel availability sequence from the prior time period increases the opportunities to use the best available channel.
• By using online opportunistic routing we can maximize the average per-packet reward and decrease channel switching cost by selecting the best relay with the help of softmax action selection.
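The softmax action selection named in the conclusion can be sketched as follows. This is a minimal illustration and not the deck's implementation: the function names `softmax_probabilities` and `select_relay`, the temperature parameter `tau`, and the example relay values are all assumptions introduced here.

```python
import math
import random


def softmax_probabilities(values, tau=1.0):
    """Boltzmann (softmax) probabilities for a dict {relay_id: estimated value}."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    v_max = max(values.values())
    weights = {k: math.exp((v - v_max) / tau) for k, v in values.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def select_relay(values, tau=1.0, rng=random):
    """Sample one relay; relays with higher values are chosen more often."""
    probs = softmax_probabilities(values, tau)
    relays = list(probs)
    return rng.choices(relays, weights=[probs[r] for r in relays], k=1)[0]


# Hypothetical value estimates V(s) for four candidate relays.
candidate_values = {2: 0.0, 3: 10.0, 4: 0.0, 5: 0.0}
probs = softmax_probabilities(candidate_values, tau=5.0)
chosen = select_relay(candidate_values, tau=5.0, rng=random.Random(0))
```

A small `tau` makes the choice nearly greedy (almost always the highest-valued relay), while a large `tau` makes it nearly uniform, which is the usual way to trade exploration against exploitation.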
19. References
1. Abhijeet Bhorkar, Mohammad Naghshvar, "Adaptive Opportunistic Routing for Wireless Ad Hoc Networks", IEEE/ACM Transactions on Networking, Vol. 20, No. 1, 2012, DOI: 10.1109/TNET.2011.2159844
2. Richard S. Sutton, Andrew G. Barto, "Reinforcement Learning: An Introduction", MIT Press, Cambridge, MA, 1998
3. Kok-Lim Alvin Yau, Peter Komisarczuk, Paul D. Teal, "Reinforcement learning for context awareness and intelligence in wireless networks: Review, new features and open issues", Journal of Network and Computer Applications, Vol. 35, 2012, DOI: 10.1016/j.jnca.2011.08.007
4. Cheng Wu, Kaushik Chowdhury, Marco Di Felice, "Spectrum Management of Cognitive Radio Using Multi-agent Reinforcement Learning", Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2010), Vol. 35, 2010.
5. Manuj Sharma, Anirudha Sahoo, K. D. Nayak, "Channel modeling based on interference temperature in underlay cognitive wireless networks", IEEE International Symposium on Wireless Communication Systems (ISWCS'08), IEEE, 2008.
24. HMM and CAM
Module 1: CAM Calculation
• HMM prediction √
• CAM calculation √
Example channel availability sequences:
0 1 0 0 1 0 1 1 0 0
0 1 1 1 0 0 0 0 1 1
$$CAM_c = Avg_1 + \frac{1}{\left( \frac{No_1}{len} \right)}$$
where
$Avg_1$ is the average gap between any two 1s in the sequence,
$No_1$ is the number of 1s in the given sequence,
$len$ is the total length of the sequence,
$CAM_c$ is the channel availability matrix (CAM) value for channel c.
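The quantities defined on this slide can be computed as in the sketch below. It rests on two assumptions that the slide does not confirm: that the formula reads CAM_c = Avg1 + 1 / (No1 / len), and that "gap" means the difference in position between consecutive 1s. The function names `cam_components` and `cam_value` are hypothetical.

```python
def cam_components(sequence):
    """Return (avg1, no1, length) for a 0/1 availability sequence.

    `sequence` may be a string such as "0 1 0 0 1"; other characters are ignored.
    """
    bits = [int(ch) for ch in sequence if ch in "01"]
    one_positions = [i for i, b in enumerate(bits) if b == 1]
    # Assumption: a "gap" is the distance between positions of consecutive 1s.
    gaps = [b - a for a, b in zip(one_positions, one_positions[1:])]
    avg1 = sum(gaps) / len(gaps) if gaps else 0.0
    return avg1, len(one_positions), len(bits)


def cam_value(sequence):
    """CAM_c = Avg1 + 1 / (No1 / len), under the assumed reading of the slide."""
    avg1, no1, length = cam_components(sequence)
    if no1 == 0:
        raise ValueError("sequence contains no 1s, so No1 / len is zero")
    return avg1 + 1.0 / (no1 / length)


# The two example sequences shown on the slide.
cam_a = cam_value("0 1 0 0 1 0 1 1 0 0")
cam_b = cam_value("0 1 1 1 0 0 0 0 1 1")
```

Keeping `cam_components` separate makes it easy to swap in a different combination of the three ingredients if the intended formula differs from the reading assumed here.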
25. Initialization of CC & DC
Module 2
• Sharing beacon message √
• Select control channel (CC) √
• Select data channel (DC) √
28. Routing and Value Update
Module 4
Before DP sent by node 1:
Node | V(s)
2 | 0
3 | 0
4 | 0
5 | 0
After DP sent by node 1:
Node | V(s)
2 | 0
3 | 10
4 | 0
5 | 0
$$V(s) \leftarrow V(s) + \alpha \left[ r + \gamma V(s') - V(s) \right]$$
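A single application of the update rule V(s) ← V(s) + α[r + γV(s′) − V(s)] can be traced with a small sketch. The slide gives only the value tables, not the parameters, so the reward, learning rate, discount factor and next state used here are hypothetical numbers chosen so that V(3) moves from 0 to 10 as in the tables.

```python
def td_update(values, s, s_next, reward, alpha, gamma):
    """Apply V(s) <- V(s) + alpha * [r + gamma * V(s') - V(s)] in place."""
    td_error = reward + gamma * values[s_next] - values[s]
    values[s] += alpha * td_error
    return values[s]


# Value table before the data packet (DP) is sent by node 1, as on the slide.
V = {2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0}

# Hypothetical parameters (not from the slide): r = 20, alpha = 0.5,
# gamma = 0.9, and node 4 as the next state s'.
new_v3 = td_update(V, s=3, s_next=4, reward=20.0, alpha=0.5, gamma=0.9)
```

With these assumed numbers the update is 0 + 0.5 × (20 + 0.9 × 0 − 0) = 10, so only the entry for node 3 changes.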
29. Mathematical Modeling
• Markov Decision Process (MDP)
A task that satisfies the Markov property, i.e. all decisions and values are functions of the current state only, is called a Markov Decision Process (MDP).
An MDP is represented by the tuple <S, A, f, ρ>.
Set of states S: the set of possible states of the environment. The states are the set of neighbors of each cognitive node s, represented as $N_s$: {s1, s2, s3, …, sN}
Set of actions A: the set of actions available to the agent at a specific time, each allowing it to move from one state to another: {a1, a2, a3, …, aN}
30. Mathematical Modeling
State transition probability f: the state transition probability function. As a result of the action $a_t \in A$, the environment changes its state from $S_t$ to $S_{t+1} \in N_s$.
Reward function ρ: the reinforcement function, used to evaluate the immediate effect of action $a_t$, i.e. the transition from $S_t$ to $S_{t+1}$.
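One way to hold the tuple <S, A, f, ρ> in code is sketched below. Everything concrete in it is an assumption for illustration: the `MDP` container, the three-node state set, the `fwd->` action names, the 0.8 forwarding success probability and the reward values do not come from the slides.

```python
from typing import Callable, Dict, FrozenSet, NamedTuple


class MDP(NamedTuple):
    states: FrozenSet[str]                               # S
    actions: FrozenSet[str]                              # A
    transition: Callable[[str, str], Dict[str, float]]   # f(s, a) -> {s': prob}
    reward: Callable[[str, str, str], float]             # rho(s, a, s')


def toy_transition(state, action):
    # Hypothetical model: forwarding to a neighbour succeeds with
    # probability 0.8; otherwise the packet stays at the current node.
    target = action.split("->")[1]
    if target == state:
        return {state: 1.0}
    return {target: 0.8, state: 0.2}


def toy_reward(state, action, next_state):
    # Hypothetical reward: +10 for reaching s3, -1 for any other transition.
    return 10.0 if next_state == "s3" else -1.0


def is_valid_distribution(dist, states, tol=1e-9):
    """Check that f(s, a) only names known states and sums to one."""
    return set(dist) <= states and abs(sum(dist.values()) - 1.0) < tol


mdp = MDP(
    states=frozenset({"s1", "s2", "s3"}),
    actions=frozenset({"fwd->s2", "fwd->s3"}),
    transition=toy_transition,
    reward=toy_reward,
)
```

Representing f as a function that returns a distribution over next states keeps the Markov property explicit: the result depends only on the current state and the chosen action.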
34. Results
• CAM value with and without HMM
Given Sequence (T time sequence) | CAM Value | Predicted Sequence using HMM (2T time sequence) | CAM Value for Predicted Sequence
010001010001100 | 0.334 | 000101110111000 | 0.398
000101110111000 | 0.447 | 100110010001000 | 0.487
10100011010111 | 0.571 | 110100010110010 | 0.624
010011100000011 | 0.400 | 100101110111000 | 0.406
100000010001000 | 0.200 | 010001110101101 | 0.197
35. Results
• Average per-packet reward without softmax action selection
No. of Packets | Average Per Packet Reward | Avg. Per Packet Reward with Softmax
10 | 78 | -------
20 | 74 | -------
50 | 79 | -------
100 | 82 | -------
36. Methodology
• Temporal Difference: TD(0) procedural form
Initialize V(s) arbitrarily, π to the policy to be evaluated
Repeat (for each episode):
    Initialize s
    Repeat (for each step of episode):
        a ← action given by π for s
        Take action a; observe reward r and next state s′
        V(s) ← V(s) + α[r + γV(s′) − V(s)]
        s ← s′
    Until s is terminal
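The TD(0) procedure can be turned into a short runnable sketch. The environment here is an assumption made only for illustration: a deterministic 4-node chain 0 → 1 → 2 → 3 in which the policy always forwards to the next node, each hop costs −1, and reaching the destination pays +10. The names `td0_policy_evaluation`, `policy` and `step` are hypothetical.

```python
def td0_policy_evaluation(policy, step, start_state, is_terminal,
                          episodes=500, alpha=0.1, gamma=0.9):
    """Estimate V(s) for a fixed policy with tabular TD(0)."""
    V = {}  # V(s) initialised to 0 on first use
    for _ in range(episodes):
        s = start_state
        while not is_terminal(s):
            a = policy(s)                      # action given by the policy
            r, s_next = step(s, a)             # observe reward and next state
            v_s = V.get(s, 0.0)
            # The value of a terminal state is taken to be zero.
            v_next = 0.0 if is_terminal(s_next) else V.get(s_next, 0.0)
            V[s] = v_s + alpha * (r + gamma * v_next - v_s)
            s = s_next
    return V


DEST = 3  # hypothetical destination node


def policy(s):
    return "forward"


def step(s, a):
    s_next = s + 1
    reward = 10.0 if s_next == DEST else -1.0
    return reward, s_next


V = td0_policy_evaluation(policy, step, start_state=0,
                          is_terminal=lambda s: s == DEST)
```

For this chain the exact values are V(2) = 10, V(1) = −1 + 0.9 × 10 = 8 and V(0) = −1 + 0.9 × 8 = 6.2, and the estimates approach them as the number of episodes grows.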
39. Proposed Work
• At each node:
0 1 1 0 1 0 0 0 0 1
1 0 0 0 0 0 1 0 1 1
1. Channel availability sequence (for T)
2. For the next time step 2T (future prediction):
Training data set → Hidden Markov Model → Predicted sequence for 2T → Calculate CAM value using formula
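The prediction step of this pipeline can be sketched with a 2-state HMM. The sketch skips training entirely: the initial distribution `pi`, transition matrix `A` and emission matrix `B` are hypothetical numbers, whereas in the proposed work they would come from the training data set. The interpretation of hidden state 0 as "mostly busy" and state 1 as "mostly available", and the function names, are also assumptions.

```python
def forward_filter(obs, pi, A, B):
    """Return P(hidden state | all observations) after the last observation."""
    n = len(pi)
    belief = [pi[i] * B[i][obs[0]] for i in range(n)]
    norm = sum(belief)
    belief = [b / norm for b in belief]
    for o in obs[1:]:
        # Propagate one step through the transition matrix, then
        # weight by the likelihood of the new observation.
        predicted = [sum(belief[i] * A[i][j] for i in range(n)) for j in range(n)]
        belief = [predicted[j] * B[j][o] for j in range(n)]
        norm = sum(belief)
        belief = [b / norm for b in belief]
    return belief


def predict_availability(obs, pi, A, B, horizon):
    """Return P(observation = 1) for each of the next `horizon` time steps."""
    n = len(pi)
    belief = forward_filter(obs, pi, A, B)
    probs = []
    for _ in range(horizon):
        belief = [sum(belief[i] * A[i][j] for i in range(n)) for j in range(n)]
        probs.append(sum(belief[j] * B[j][1] for j in range(n)))
    return probs


# Hypothetical parameters; a trained model would supply these.
pi = [0.5, 0.5]
A = [[0.8, 0.2], [0.3, 0.7]]   # A[i][j] = P(next state j | state i)
B = [[0.9, 0.1], [0.2, 0.8]]   # B[i][o] = P(observation o | state i)

observed_T = [0, 1, 1, 0, 1, 0, 0, 0, 0, 1]   # first sequence on the slide
p_available = predict_availability(observed_T, pi, A, B, horizon=10)
predicted_2T = [1 if p >= 0.5 else 0 for p in p_available]
```

Thresholding at 0.5 is only one way to turn the probabilities into a 0/1 sequence; far from the last observation the probabilities settle at the chain's stationary value, so a sampled sequence may be preferable when the predicted sequence is fed into the CAM calculation.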