SlideShare a Scribd company logo
Machine Learning and Stochastic Geometry:
Statistical Frameworks Against Uncertainty in
Wireless LANs
Koji Yamamoto and Takayuki Nishio
Graduate School of Informatics, Kyoto University
2019-05-24 — IEEE ICC 2019 Tutorial
Contents of this tutorial
Part A — Deep Reinforcement Learning and Stochastic Geometry for
microwave WLANs
Koji Yamamoto — 90 min
I. Wireless LANs — IEEE 802.11ax and 11be
II. Deep Reinforcement Learning — Channel Allocation in WLANs
III. Stochastic Geometry — Modeling and Analysis of WLANs
Break
Part B — Deep Learning for/in mmWave WLANs
Takayuki Nishio — 90 min
2 / 88
Radio parameters
Transmission power
Frequency channel
and states
topology
fading
−−−−−−−−→ Communication quality
Throughput
Delay
x
Model f
−−−−−−−−→ y = f(x)
We would like to
▶ maximize the communication quality f(x)
▶ find the optimal parameter arg max
x
f(x)
3 / 88
Model f
is given?
Optimization/
Game theory
yes
Data (xi, yi)
is given?
yi = f(xi) + noise
no
Supervised
learning
yes
Reinforcement
learning
no
Deep learning
(CNN)
xi is image
Deep RL
xi is image
Part A-III
Part A-IIPart B
4 / 88
Model f
is given?
Optimization/
Game theory
yes
Data (xi, yi)
is given?
yi = f(xi) + noise
no
Supervised
learning
yes
Reinforcement
learning
no
Deep learning
(CNN)
xi is image
Deep RL
xi is image
Part A-III
Part A-IIPart B
4 / 88
Model f
is given?
Optimization/
Game theory
yes
Data (xi, yi)
is given?
yi = f(xi) + noise
no
Supervised
learning
yes
Reinforcement
learning
no
Deep learning
(CNN)
xi is image
Deep RL
xi is image
Part A-III
Part A-IIPart B
4 / 88
Model f
is given?
Optimization/
Game theory
yes
Data (xi, yi)
is given?
yi = f(xi) + noise
no
Supervised
learning
yes
Reinforcement
learning
no
Deep learning
(CNN)
xi is image
Deep RL
xi is image
Part A-III
Part A-IIPart B
4 / 88
Model f
is given?
Optimization/
Game theory
yes
Data (xi, yi)
is given?
yi = f(xi) + noise
no
Supervised
learning
yes
Reinforcement
learning
no
Deep learning
(CNN)
xi is image
Deep RL
xi is image
Part A-III
Part A-IIPart B
4 / 88
Goal of this tutorial
Deep understanding of important ideas
▶ Each of ML, RL, SG, and WLANs can be one tutorial
▶ A deep understanding of important techniques helps you to
understand the relevant techniques yourself
Machine learning and stochastic geometry analysis
▶ Mainly for resource management and system design
5 / 88
Part I
Wireless LANs — IEEE 802.11ax and
11be
6 / 88
Maximum data rate in IEEE 802.11
11
11a
11b
11g
11n
11ac
11ax
11be(bit/s)
1995 2000 2005 2010 2015 2020
1M
10M
100M
1G
10G
7 / 88
PHY (cited from [Khorov+2019, Table II])
Legacy 802.11ax
OFDM Constellation Order 256-QAM (11ac) 1024-QAM
OFDM Symbol Duration 3.2 µs 12.8µs
OFDM Guard Interval 0.4 or 0.8 µs 0.8, 1.6 or 3.2 µs
Maximum Data Rate ≈ 7 Gbit/s ≈ 9.6 Gbit/s
8 / 88
The 802.11ax Challenge is
Dense Networks
[Khorov+2019]
9 / 88
Channel
Received power (RSSI)
0
0.2
0.4
0.6
0.8
1
1 10 100 1000 10000
Probabilityofcollisions
Number of transmitters
Bianchi model (saturated traffic) [Bianchi2000]
IEEE 802.11a, Frame size: 1500 B, Data rate: 6 Mbit/s
11 / 88
Channel Access and OBSS Management
(cited from [Khorov+2019, Table II])
Legacy (802.11ac and earlier) 802.11ax
Basic Channel
Access
CSMA/CA OFDMA on top of CSMA/CA
Random Channel
Access
DCF, EDCA UL OFDMA Random Access
on top of CSMA/CA
Contention-free
Access
PCF, HCCA (not
implemented in real devices),
RAW (11ah)
Trigger-based UL OFDMA
MU Technology MU-MIMO (11ac) MU-MIMO, OFDMA
MU Transmission
Direction
DL (11ac) DL and UL
Fragmentation Static Flexible
Interference
Mitigation
NAV, RTS/CTS,
HCCA TXOP Negotiation
Two NAVs, Quiet Period
Spatial Reuse Sectorization (11ah) Adaptive Power and Sensitiv-
ity Thresholds, Color
12 / 88
Spatial Reuse Using BSS “Color”
Start
Channel
Transmit
Idle
End
Busy
(11ac)
“Color” < OBSS PD
Busy
(11ax)
Same BSS
OBSS
Yes
No
▶ “Color” header
▶ Parameter: Carrier sense threshold for OBSS (OBSS PD)
▶ As a result, channel is spatially reused (spatial reuse).
OBSS: Overlapped Basic Service Set, i.e., the other adjacent cells 13 / 88
IEEE 802.11ax IEEE 802.11be
Will be finalized in 2020 (in schedule) Starts in 2018
High Efficiency WLAN (HEW) Study
Group
Extremely High Throughput (EHT)
Study Group
Wi-Fi 6 Wi-Fi 7 (?)
http://www.ieee802.org/11/
Reports/tgax_update.htm
http://www.ieee802.org/11/
Reports/ehtsg_update.htm
[Khorov+2019] E. Khorov, A. Kiryanov, A.
Lyakhov, and G. Bianchi, “A tutorial on
IEEE 802.11ax high efficiency WLANs,”
IEEE Commun. Surveys Tuts., vol. 21,
no. 1, 2019
▶ At least 30 Gbit/s
▶ 320 MHz bandwidth and
non-contiguous spectrum
▶ Multi-band/multi-channel
aggregation
▶ 16 spatial streams MIMO
▶ Multi-Access Point Coordination
▶ Enhanced link adaptation and
retransmission protocol
14 / 88
Throughput Starvation
BA C D E
C
D
E t
Busy
First order starvation
[Mhatre+2007]
Flow-in-the middle starvation
[Garetto+2008]
Second order starvation [Mhatre+2007]
Under saturated traffic, A and D always detect signals thus they cannot
start transmissions
15 / 88
Goal of performance modeling
▶ Points represent TXs or
TX-RX pairs.
▶ Circle represents carrier sense
area (contention domain).
We would like to evaluate the following performances without Monte
Carlo simulations to ensure reproducibility
1. Individual throughput in a specific topology — Graph theory Part I
2. Average throughput over possible topologies — SG Part III
3. Distributions of interference and SIR — SG Part III
16 / 88
Contention graph
TX-RX pair Vertex
Pairs detecting signals each other Vertices are connected with edge
Implicit assumption: TX and RX forming a given link have the same carrier sense
relationship (The situation where one of them detects the carrier but the other does
not detect it is not allowed)
Carrier
detection
Carrier sensing relationship Contention graph
17 / 88
Back-of-Envelope (BoE) throughput [Liew+2010]
Contention graph
0.50.5
0
1
BoE throughput
Normalized throughput
Medium access probability
18 / 88
Model of simultaneous transmissions according to CSMA
Maximum independent sets (MISs)
▶ Induced subgraph without edges with the maximum number of
vertices
▶ Possible patterns of simultaneous TXs according to CSMA
(Vertex-)induced subgraph
▶ A subset of the vertex set of the original graph and related edge set
Induced subgraph o o o o
MIS x x o o
Wolfram—Alpha: Subgraph, Independent Set
19 / 88
n1 = 2
n1/n = 2/2 = 1
n2 = 0
n2/n = 0/2 = 0
n3 = 1
n3/n = 1/2 = 0.5
1. Find all MISs of the contention graph
(possible simultaneous transmission patterns)
2. The normalized throughput of link (vertex) i is determined as ni/n,
where
▶ n is the number of MISs and
▶ ni is the number of MISs containing i.
We assume that all possible patterns occur with equal probability
20 / 88
Example of BoE throughput
0.6
0.0
1.0
0.4
0.2
1.0
0.3
0.2
0.8
0.3
1.0
0.6
1.0
0.4
0.3
0.8
21 / 88
1 require("igraph")
2 require("spatstat")
3 require("plotrix")
4 set.seed(9)
5
6 LAMBDA <- 2e-5
7 RADIUS <- 200
8 W.LENGTH <- 500
9 W <- owin(xrange = c(-W.LENGTH, W.LENGTH), yrange = c(-W.LENGTH, W.LENGTH))
10
11 # Poisson point process
12 P <- rpoispp(lambda = LAMBDA, win = W)
13
14 # Adjacency matrix
15 D <- pairdist(P)
16 adjacency.matrix <- (D<RADIUS)
17 diag(adjacency.matrix) <- 0
18
19 # Contention graph
20 adjacency.graph <- graph.adjacency(adjacency.matrix, mode="undirected")
21
22 # Maximum independent sets
23 MISs <- largest_ivs(adjacency.graph)
24
25 # Evaluate throughput
26 ni <- 0
27 for ( i in 1:(P$n) ) {
28 N.MIS <- 0
29 for ( j in 1:length(MISs) ) {
30 N.MIS <- N.MIS + sum(MISs[[j]]==i)
31 }
32 ni[i] <- N.MIS
33 }
34
22 / 88
35 # Plotting figures
36 quartz(type="pdf", file="160408_BoE_simple_position.pdf")
37 par(pty="s",mar=c(0, 0, 0, 0)+0.1)
38 plot(P$x, P$y, xlim=c(-W.LENGTH,W.LENGTH), ylim=c(-W.LENGTH,W.LENGTH))
39 for (i in 1:(P$n) ) {
40 draw.circle(P[i]$x,P[i]$y,200,border=NA,col=adjustcolor( "blue", alpha.f = 0.1))
41 }
42 dev.off()
43 quartz(type="pdf", file="160408_BoE_simple.pdf")
44 par(pty="s",mar=c(0, 0, 0, 0)+0.1)
45 plot(P$x, P$y, xlim=c(-W.LENGTH,W.LENGTH), ylim=c(-W.LENGTH,W.LENGTH))
46 for ( i in 1:(P$n-1) ) {
47 for ( j in (i+1):(P$n) ) {
48 if (adjacency.matrix[i,j]==1) {
49 segments(P[i]$x, P[i]$y, P[j]$x, P[j]$y, col="blue", lwd = 2)
50 }
51 }
52 }
53 text(P$x, P$y, sprintf("%.1f",ni/length(MISs)), pos=1)
54 dev.off()
23 / 88
Potential game-based channel allocation [Yin+2017]
0
200
400
600
800
1000
1200
0 200 400 600 800 1000 1200
Location(m)
Location (m)
Proposed scheme
0
200
400
600
800
1000
1200
0 200 400 600 800 1000 1200
Location(m)
Location (m)
Compared scheme
▶ Cost function taking into account the number of three-node chains
which result in flow-in-the-middle starvation
▶ Cost function is designed to be a potential game [Yamamoto2015],
i.e., channel selection dynamics is proved to converge to a Nash
equilibrium
B. Yin, S. Kamiya, K. Yamamoto, T. Nishio, M. Morikura, and H. Abeysekera, “Mitigating throughput
starvation in dense WLANs through potential game-based channel selection,” IEICE Trans. Fundamentals,
vol. E100-A, no. 11, Nov. 2017 24 / 88
Summary (Part I)
▶ Spatial reuse using “color” is a promising technology in IEEE
802.11ax
▶ The next standard — IEEE 802.11be
▶ Throughput starvation problem
▶ Back-of-envelope throughput evaluation
25 / 88
Part II
Deep Reinforcement Learning —
Channel Allocation in WLANs
26 / 88
BoE throughput evaluation takes into account
▶ Adjacency (carrier sensing relationship) among stations
▶ Saturated traffic
But there are many other factors
▶ Unsaturated traffic and its fluctuations
▶ Random backoff algorithm in CSMA/CA
▶ Fading and its impact on adjacency
▶ Other factors
Optimization based on observations
↓
Machine learning
27 / 88
Machine Learn-
ing (ML)
Unsupervised
Learning
Supervised
Learning
Deep
Learning
Reinforcement
Learning
(RL)
Multi-
armed
Bandit
Q-learning
Deep RL
Multi-
agent RL
Adversarial
RL
28 / 88
Which framework can we utilize
for performance improvement in WLANs?
From a viewpoint of radio resource
management
Reinforcement
Learning (RL)
Multi-armed
Bandit
Q-learning
Deep RL
Multi-
agent RL
Adversarial
RL
“Deep RL with graph convolution”
— My opinion at this point
Multi-agent RL is also attractive because WLANs are decentralized
networks. However, at this point,
▶ Multi-agent RL is mainly developed for cooperative control systems
▶ It is hard to apply deep learning with recent progress
29 / 88
Deep RL
with graph convolution
RL
Q-learning
Deep learning
Function approximation
with deep neural networks
Graph convolution
30 / 88
Deep RL
with graph convolution
RL
Q-learning
31 / 88
Motivation — Optimal Channel Allocation
Optimal channel allocation?
Criterion: Aggregated throughput
Number of channels: 2
1
2
3
4
5
1
2
3
4
5
1
1
1
1
1
1
2
3
4
5
1
1
1
1
1← BoE throughput
Framework: Combinatorial
optimization (NP-hard)
BoE throughput is shown for the ease of explanation.
32 / 88
Motivation — Optimal Channel Allocation
Optimal channel allocation?
Criterion: Aggregated throughput
Number of channels: 2
1
2
3
4
5
1
2
3
4
5
1
1
1
1
1
1
2
3
4
5
1
1
1
1
1← BoE throughput
Framework: Combinatorial
optimization (NP-hard)
BoE throughput is shown for the ease of explanation.
32 / 88
Motivation — Optimal Sequence
Different problem setting:
▶ Finding the minimal sequence
▶ Only one AP can change its channel at a given time
▶ Throughput can be observed only after channel allocation
Initial state Optimal states
1
2
3
4
5
1
0
1
0
1
We would like to
find the minimum
sequence.
1
2
3
4
5
1
1
1
1
1
1
2
3
4
5
1
1
1
1
1
This part is simplified version of [Nakashima+2019]
33 / 88
Channel Allocation Sequence
1
2
3
4
5
1
0
1
0
1
1
2
3
4
5
1
1
1
0
1
1
2
3
4
5
1
1
1
1
1
←
We want to ac-
quire this sequence
based on observed
throughput
Criterion?
1
2
3
4
5
1
0
1
0
1
1
2
3
4
5
0.5
0.5
1
0.5
0.5
1
2
3
4
5
1
1
1
0.5
0.5
1
2
3
4
5
1
1
1
1
1
t = 0 t = 1 t = 2 t = 3 34 / 88
Automated driving system — Self-driving cars
https://www.metacar-project.com/
Markov decision process
(MDP)
Agent Environment
Action
State=Position
Reward
Metacar: “A reinforcement learning environment for self-driving cars in
the browser”
35 / 88
Problem
At time t, the agent has a sequence of observations (
state
S0, A0
action
,
reward
R1, S1),
(S1, A1, R2, S2), . . . , (St, At, Rt+1, St+1)
Based on the observations, what action At+1 should the agent take to
get higher reward?
The agent (decision maker) should have the criterion
36 / 88
Total reward and optimal action-value function
Criterion of agent: Expected total discounted reward
with discount factor 0 ≤ γ ≤ 1
E
[ ∞∑
t=0
γt
Rt+1
]
By using the optimal action-value function
Q∗
(s, a)
.
= max
π
Eπ
[ ∞∑
t=0
γt
Rt+1 S0 = s, A0 = a
]
, s ∈ S, a ∈ A
The next action should be determined as
At+1 = arg max
a∈A
Q∗
(St+1, a)
.
= denotes “equality relationship that is true by definition” [Sutton+2018]
Eπ
[·] denotes the expected value of a random variable given that the agent follows policy π. [Sutton+2018]
37 / 88
For the following initial state S0, the following sequence starting with
A0 = “AP 4 uses CH 2” is optimal because the sequence after t = 1 is optimal
t = 0 t = 1 t = 2 · · ·
State St
1
2
3
4
5
1
1
1
0
1
1
2
3
4
5
1
1
1
1
1
1
2
3
4
5
1
1
1
1
1
· · ·
Action At AP 4 uses CH 2
Reward Rt R1 = 5 R2 = 5 · · ·
Q∗
(
S0 =
1
2
3
4
5
1
1
1
0
1
, A0 = AP 4 uses CH 2
)
= R1 + γR2 + γ2
R3 + · · ·
= 5 + γ5 + γ2
5 + · · · =
5
1 − γ
= 50 (γ = 0.9)
38 / 88
Part of Optimal Q-table
For the initial state
1
2
3
4
5
1
1
1
0
1
CH
1 2
AP
1 4 + γ5 + γ25 + · · · = 49 3 + γ4 + γ25 + · · · = 47.1
2 3 + γ4 + γ25 + · · · = 47.1 4 + γ5 + γ25 + · · · = 49
3 4 + γ5 + γ25 + · · · = 49 3 + γ4 + γ25 + · · · = 47.1
4 4 + γ5 + γ25 + · · · = 49 5 + γ5 + γ25 + · · · = 50
5 4 + γ5 + γ25 + · · · = 49 4 + γ4 + γ25 + · · · = 48.1
γ = 0.9
▶ The maximum total reward is achieved starting with “AP 4 uses CH
2”.
▶ If we have the optimal Q-table, we can take the optimal action.
39 / 88
▶ The optimal action-value function Q∗ is unknown
▶ We need to estimate Q∗ from observations
New problem
Based on observations, how can we estimate Q∗?
▶ Qt(s, a): Estimate of Q∗(s, a) at time t
40 / 88
(Tabular) Q-learning [Watkins+1992]
Updated estimate
from observations
↓
Qt+1(St, At)
.
=
Estimate at t
↓
Qt(St, At) + αt δt+1(Qt)
δt+1(Qt)
.
= Rt+1 + γ max
a′∈A
Qt(St+1, a′
) − Qt(St, At)
▶ (St, At, Rt+1, St+1): Observation
▶ 0 ≤ αt ≤ 1: (Small) non-negative number
Before observation After observation
Qt(St, At)
Rt+1 + γ max
a′∈A
Qt(St+1, a′
)
αt = 0 αt = 1
δt+1(Qt)
Time-difference error
(If Qt = Q∗, δt+1 = 0 from the Bellman equation)
41 / 88
Feature of Q-learning [Szepesv´ari2010, 3.3.1]
“Under the assumption that every state-action pair is visited infinitely
often, the sequence (Qt; t ≥ 0) converges to Q∗.”
42 / 88
Markov decision process
▶ Agent: Centralized controller of all APs
▶ State: Channels and contention graph of APs
▶ Action: Channel selection of one AP
▶ Reward: Throughput
CH 1
i
Current state
CH 1 CH 2
i
Next state
Action: AP i changes its channel to 2
[Nakashima+2019]
43 / 88
State Space
To estimate Q∗(s, a), every state-action pair should be visited
When the number of states is huge, training is infeasible — State
explosion
Number of APs Number of channels Number of states
4 2 1024
10 3 2 · 1018
44 / 88
Tabular Q-learning (p. 46)
Qt+1(St, At)
.
= Qt(St, At) + αt δt+1(Qt)
δt+1(Qt)
.
= Rt+1 + γ max
a′∈A
Qt(St+1, a′
) − Qt(St, At)
Q-learning with function approximation
Extension to function approximation with a parameterized function Qθ,
θ ∈ Rd
θt+1
.
= θt + αt δt+1(Qθt ) ∇θ Qθt (St, At)
δt+1(Qθt )
.
= Rt+1 + γ max
a′∈A
Qθt (St+1, a′
) − Qθt (St, At)
Deep reinforcement learning
Q-learning with function approximation using deep neural networks
45 / 88
Regression in supervised learning
▶ Model f is not given
▶ Samples {xi, yi}i∈{1,...,n} are available,
where yi = f(xi) + noise
▶ Estimate the output for test input
x ̸∈ {xi}i∈{1,...,n}
x −−−−→ y
∈
∈
Rd f
−−−−→ R
Approach
▶ Consider a form of parameterized function fθ to approximate f
e.g., linear basis function model: fθ(x)
.
=
b∑
j=1
θj
basis
function
ϕj(x) = θ⊤
ϕ(x)
▶ θ is trained (estimated) using observed samples {xi, yi}i∈{1,...,n}
e.g., estimation by minimizing a sum-of-squares error
θ
.
= arg min
θ
n∑
i=1
(
fθ(xi) − yi
)2
.= J(θ) loss function
▶ Estimation for test input x: y
.
= fθ
(x)
46 / 88
Training set
n observations
(xi, yi)i=1,...,n
Learning algorithm
θ
.
= arg min
θ
J(θ)
New input
x
Estimated output
fθ
(x)
Model fθ(·)
Training Regression
47 / 88
Models (parameterized functions)
fθ(x)
Learning algorithms
Linear basis function model
fθ(x)
.
=
b∑
j=1
θj ϕj(x) = θ⊤
ϕ(x)
Stochastic gradient descent
Neural networks
fθ(x)
.
=
b∑
j=1
αj ϕ(x; βj)
ϕ(x; β)
.
=
1
1 + exp(−x⊤w − γ)
θ
.
= (αj, βj)j, β
.
= (w⊤, γ)⊤
Deep neural networks
(CNN, LSTM)
Deep neural networks offer better
function approximation capabilities
compared to shallow networks
Backpropagation
48 / 88
State Design and Feature Extraction [Nakashima+2019]
State: Adjacency of APs and selected channel
Physical expression
CH 2
CH 1
2
1
3
4
5
Graph expression
2
1
3
4
5
Tabular expression
5x7 matrix






0 1 1 1 0
1 0 0 0 0
1 0 0 1 0
1 0 1 0 0
0 0 0 0 0






[
1 0 0 1 1
0 1 1 0 0
]
Ch 1
Ch 2
Feature extraction (pre-processing):
Graph convolutional networks Convolutional neural network (CNN)
Feature extraction for graphs Feature extraction for images
49 / 88
State: 5x7 matrix
input_1: InputLayer
input:
output:
(None, 1, 5, 7)
(None, 1, 5, 7)
lambda_2: Lambda
input:
output:
(None, 1, 5, 7)
(None, 1, 5, 2)
lambda_1: Lambda
input:
output:
(None, 1, 5, 7)
(None, 1, 5, 5)
spectral_graph_convolution_1: SpectralGraphConvolution
input:
output:
[(None, 1, 5, 2), (None, 1, 5, 5)]
(None, 1, 5, 32)
spectral_graph_convolution_2: SpectralGraphConvolution
input:
output:
[(None, 1, 5, 32), (None, 1, 5, 5)]
(None, 1, 5, 16)
batch_normalization_1: BatchNormalization
input:
output:
(None, 1, 5, 32)
(None, 1, 5, 32)
batch_normalization_2: BatchNormalization
input:
output:
(None, 1, 5, 16)
(None, 1, 5, 16)
flatten_1: Flatten
input:
output:
(None, 1, 5, 16)
(None, 80)
dense_1: Dense
input:
output:
(None, 80)
(None, 10)
Q-values for 10 actions (5 APs x 2 CHs)
e
50 / 88
Libraries and model
1 import numpy as np
2 import tensorflow as tf
3 from keras.models import Model
4 from keras.layers import Dense, Activation, Flatten, Input, Reshape, Lambda, Concatenate,
BatchNormalization→
5 from rl.policy import EpsGreedyQPolicy # keras-rl
6
7 def get_model_functional(env, nb_actions, ap_number, ch_number):
8 inputs = Input(shape=(1,) + env.observation_space.shape)
9 adj = Lambda(lambda x: x[:, :, :, :ap_number],
10 output_shape=(1, ap_number, ap_number))(inputs) # adjacency matrix
11 ch = Lambda(lambda x: x[:, :, :, ap_number:],
12 output_shape=(1, ap_number, ch_number))(inputs) # channel matrix
13 # GraphConvolution is not provided in libraries
14 H = GraphConvolution(units=32, activation='relu',
15 kernel_initializer='he_normal')([ch, adj])
16 H = BatchNormalization()(H)
17 H = GraphConvolution(units=16, activation='relu',
18 kernel_initializer='he_normal')([H, adj])
19 H = BatchNormalization()(H)
20 H = Flatten()(H)
21 predictions = Dense(nb_actions, activation='linear',
22 kernel_initializer='he_normal')(H)
23 model = Model(inputs=inputs, outputs=predictions)
24 return model
51 / 88
0 200000 400000 600000 800000
Step
0.55
0.60
0.65
0.70
0.75
0.80
Reward
▶ Five APs are located uniformly and randomly
▶ Reward: The minimum individual BoE throughput of five APs
Simplified version of [Nakashima+2019]
52 / 88
End-to-End Learning of Proactive Handover Policy
for Camera-Assisted mmWave Networks Using
Deep Reinforcement Learning
Yusuke Koda Kota Nakashima Koji Yamamoto
Takayuki Nishio Masahiro Morikura
Graduate School of Informatics, Kyoto University
arXiv:1904.04585
Image-to-decision proactive handover
▶ Determine one BS from two candidate BSs
▶ The output of NNs is Q-value for input images
NN [16]
(Combination of
CNN and
LSTM)
Consequtive input
images
: Index of associated BS
: Remaining time step until handover process is completed : Action
Output layer
Output of NN
prior to output layer
54 / 88
12.2 12.4 12.6 12.8 13.0 13.2 13.4 13.6 13.8 14.0
050150250
Time (s)
DatarateinBS1(Mbit/s)
Data rate degradation
Data rate of BS 1
12.2 12.4 12.6 12.8 13.0 13.2 13.4 13.6 13.8 14.0
40506070
Time (s)
Actionvalue
Action value for selecting BS 1
Action value for selecting BS 2
Handover
to BS 2
Handover
to BS 1
Q-value
for selecting BSs 1 and
2
without camera images
▶ Because the received power degrades sharply, handover decision is
conducted after degradation in general
55 / 88
12.2 12.4 12.6 12.8 13.0 13.2 13.4 13.6 13.8 14.0
40506070
Time (s)
Actionvalue
Action value for selecting BS 1
Action value for selecting BS 2
Handover
to BS 2
Handover
to BS 1
Q-value for
selecting BSs 1 and 2
without camera images
12.2 12.4 12.6 12.8 13.0 13.2 13.4 13.6 13.8 14.0
1015202530
Time (s)
Action−value
Action value for selecting BS 1
Action value for selecting BS 2
Handover
to BS 2
Handover
to BS 1
Action value degradation
ahead of blockage
Q-value for
selecting BSs 1 and 2
with camera images
▶ By extending the state space using camera images, proactive
handover is enabled
56 / 88
Our related works
Deep reinforcement learning with graph convolution
▶ Channel allocation for WLANs [Nakashima+2019]
We utilize
▶ Deep Q network (DQN) [Mnih+2015]
▶ Double Deep Q-network [vHasselt+2016]
▶ Dueling network [Wang+2015]
▶ Prioritized experience replay [Schaul+2015]
Deep reinforcement learning with CNN
▶ Image-based proactive handover for mmWave WLANs
[Koda+2018], [Koda+2019]
57 / 88
References
▶ R. S. Sutton and A. G. Barto, Reinforcement Learning, 2nd ed.
MIT Pr., 2018
▶ C. Szepesv´ari, Algorithms for Reinforcement Learning. Morgan and
Claypool Pub., 2010
▶ K. Arulkumaran, M. P. Deisenroth, M. Brundage, and
A. A. Bharath, “Deep reinforcement learning: A brief survey,” IEEE
Signal Process. Mag., vol. 34, no. 6, Nov. 2017
58 / 88
Summary (Part II)
▶ Reinforcement learning is used to acquire the optimal sequence to
maximize the total reward.
▶ Q-learning can estimate the optimal action-value function Q∗.
▶ Deep neural networks are used for function approximation of Q
function (deep RL).
59 / 88
Part III
Stochastic Geometry — Modeling
and Analysis of WLANs
60 / 88
Goal of performance modeling
▶ Points represent TXs or
TX-RX pairs.
▶ Circle represents carrier sense
area (contention domain).
We would like to evaluate the following performances without Monte
Carlo simulations to ensure reproducibility
1. Individual throughput in a specific topology — Graph theory Part 1
2. Average throughput over possible topologies — SG
3. Distributions of interference and SIR — SG
61 / 88
0.51
0.31
0.43
0.69
0.09
0.23
0.27
0.27
0.62
0.43
0.65
0.57
0.11
0.60
0.36
0.43
All points Potential TXs Poisson point process (PPP)
Φ
Numbers Backoff counter Uniform distribution
Circles Contention domain
Points with disks Transmitting TXs Mat´ern hardcore point pro-
cess (MHCPP) ΦM
62 / 88
Poisson point process (PPP) Φ
Φ
.
= {x} A set of locations of random points (point process)
Φ(B) A r.v. representing the number of points of Φ in region B,
i.e., Φ(B)
.
= #{Φ ∩ B}
Φ is a set of independent random points with homogeneous density λ
⇐⇒ Φ is a PPP with intensity (density) λ
⇐⇒ Φ(B) ∼ Pois(λ|B|), i.e., Poisson distribution with intensity λ|B|
⇐⇒ P
(
Φ(B) = n
)
=
e−λ|B|(λ|B|)n
n!
, n = 0, 1, . . .
λ = 1, area of square grid is 1, total area is 40
63 / 88
Probabilities derived assuming PPP
▶ Probability that Φ has no point in B
P
(
Φ(B) = 0
)
= e−λ|B|
▶ Probability that Φ has at least one point in B
P
(
Φ(B) ≥ 1
)
= 1 − e−λ|B|
Φ is a PPP with intensity λ
⇐⇒ Φ(B) ∼ Pois(λ|B|)
⇐⇒ P
(
Φ(B) = n
)
=
e−λ|B|(λ|B|)n
n!
, n = 0, 1, . . .
64 / 88
Generating Realizations of PPP
Both codes generates the figure on p. 68
B
.
= { (x, y) | 0 ≤ x ≤ 10, 0 ≤ y ≤ 4 }, i.e., |B| = 40, and λ = 1
General way in R or other languages
n <- rpois(1, lambda=40)
x <- runif(n, min=0,max=10)
y <- runif(n, min=0,max=4)
plot(x,y)
1. For region B, generating a realization n of Poisson random variable
with expectation λ|B| (= lambda in the code)
2. Generating n uniformly distributed points in region B
Package “spatstat” [Baddeley+2015] in R
P <- rpoispp(lambda=1, win=owin(xrange=c(0,10), yrange=c(0,4)))
plot(P)
For more complicated point processes, this package is useful.
65 / 88
CSMA
0.51
0.31
0.43
0.69
0.09
0.23
0.27
0.27
0.62
0.43
0.65
0.57
0.11
0.60
0.36
0.43
Points Potential TXs Poisson point process Φ
Numbers Backoff counter m(x) Uniform distributed r.v.
〇 Contention domain b(x, r) Disc with radius r
● Contended TXs Φ ∩ b(x, r)
● Simultaneous TXs Mat´ern hardcore point process ΦM
→ Interference
RXs are omitted in figures. 66 / 88
Contention domain
〇 b(x, r)
▶ Disc with radius r centered at x
▶ Model of contention domain of x ∈ Φ, i.e., area where the received
power is grater than the carrier sense threshold
67 / 88
Contended TXs
● Locations of contended TXs with TX x
Φ ∩ b(x, r)
Number of Contended TXs with TX x
Φ(b(x, r)) ∼ Pois(λ|b(x, r)|) = Pois(λπr2)
68 / 88
Backoff and simultaneous TXs
0.51
0.31
0.43
0.69
0.09
0.23
0.27
0.27
0.62
0.43
0.65
0.57
0.11
0.60
0.36
0.43
m(x)
▶ “Mark”
▶ Uniformly distributed r.v. on [0, 1] for each point x ∈ Φ
▶ Model of backoff counter
● MHCPP ΦM
.
= { x ∈ Φ : m(x) < m(y) for all y ∈ Φ ∩ b(x, r)  {x} }
▶ TXs with smallest backoff counter in their contention domains
▶ Model of a set of simultaneous TXs
69 / 88
Expected normalized throughput (medium access probability)
P(x ∈ Φ : x ∈ ΦM) = 1−e−λπr2
λπr2
▶ Probability that a point in PPP retains in MHCPP
▶ Probability that a potential TX has the smallest backoff counter, in
other words, no neighboring TXs have smaller backoff counter, in its
contention domain, and thus can transmit
P(x ∈ Φ : x ∈ ΦM) = P
(
#{ z ∈ Φ ∩ b(x, r) | m(z) < m(x) } = 0
)
Conditioning on m(x) = t and considering ˜Φ
.
= { z ∈ Φ | m(z) < t }
▶ TXs with backoff counter less than t
▶ Intensity of ˜Φ is tλ
▶ ˜Φ(b(x, r)) ∼ Pois(tλ|b(x, r)|) = Pois(tλπr2)
P
(
#{ z ∈ Φ ∩ b(x, r) | m(z) < t } = 0
)
= P
(
˜Φ(b(x, r)) = 0
)
= e−tλπr2
P
(
#{ z ∈ Φ ∩ b(x, r) | m(z) < m(x) } = 0
)
=
∫ 1
0
e−tλπr2
dt =
1 − e−λπr2
λπr2
[Baccelli+2009, 18] derived medium access probability taking into account fading
70 / 88
Intensity of MHCPP
▶ Intensity of MHCPP λM
.
=
Intensity of
potential TXs
λ ·
1 − e−λπr2
λπr2
=
1 − e−λπr2
πr2
▶ In ultra dense networks,
lim
λ→∞
λM =
1
πr2
,
The intensity of simultaneous transmissions only depends on the
area of contention domain.
71 / 88
Analysis of Inversely Proportional Carrier Sense
Threshold and Transmission Power Setting
Koji Yamamoto1 Xuedan Yang1 Takayuki Nishio1
Masahiro Morikura1 Hirantha Abeysekera2
1
Graduate School of Informatics, Kyoto University
2
NTT Access Network Service Systems Laboratories, NTT Corporation
IEEE CCNC 2017, Jan. 2017
Carrier Sense Threshold Setting for Spatial Reuse
▶ 〇 Contention domain:
RxPower > CST
▶ ● Contended TXs
▶ ● Simultaneous TXs
▶ → Interference
Increasing Carrier Sense Threshold (CST)
▶ ● reduces
▶ → increases
In densely deployed WLANs, CST adjustment is crucial
However, CST setting can cause throughput starvation
RXs are omitted in figures.
73 / 88
-82
0
-1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
-90
-80
-70
-60
-50
-40
-30
-20
-10
0
10
20
AP1 AP2
Receivedpower(dBm)
Position (m)
p1=13 dBm p2=13 dBm
θ1=-82 dBm θ2=-82 dBm
▶ Received signal power > threshold, θx,
i.e., two APs detect signals from each other.
▶ Two APs time-share the channel.
74 / 88
-82
0
-1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
-90
-80
-70
-60
-50
-40
-30
-20
-10
0
10
20
AP1 AP2
Receivedpower(dBm)
Position (m)
p1=13 dBm p2=13 dBm
θ1=-82 dBm
θ2=-82 dBm
+34 dB
▶ AP2 increases θ2 not to detect signals from AP1,
then AP2 would transmit independently of AP1
▶ AP1 should defer its transmission during transmission of AP2
→ Starvation due to asymmetric carrier sensing relationship
[Mhatre+2007]
75 / 88
-82
0
-1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
-90
-80
-70
-60
-50
-40
-30
-20
-10
0
10
20
AP1 AP2
Receivedpower(dBm)
Position (m)
p1=13 dBm
p2=13 dBm
-34 dB
θ1=-82 dBm
θ2=-82 dBm
+34 dB
▶ Reduction in transmission power along with increase in CST
[Fuemmeler+2006; Mhatre+2007]
Inversely proportional setting (IPS) of CST and transmission power
▶ Symmetric carrier sensing relationship is maintained and APs
spatially reuse the channel
76 / 88
Numerical example: Expected throughput
▶ Carrier sense range is a function of CST and transmission power
0
1
2
3
0 2 4 6 8 10 12 14 16
n=4
n=10
Throughput(bit/s/Hz)
ax= θx /Θ (dB)
Throughput
Asymptotic throughput
Approximation
SIR = 30 dB
α = 3.5
▶ The optimal value of CST ax assuming approximation is close to
that without approximation
▶ [Iwata+2019] extend the work under nearest AP association
M. Iwata, K. Yamamoto, B. Yin, T. Nishio, M. Morikura, and H. Abeysekera, “Analysis of inversely
proportional carrier sense threshold and transmission power setting based on received power for IEEE
802.11ax,” in Proc. IEEE Consum. Commun. Netw. Conf. (CCNC), Las Vegas, NV, USA, Jan. 2019
77 / 88
Goal of performance modeling
▶ Points represent TXs or
TX-RX pairs.
▶ Circle represents carrier sense
area (contention domain).
We would like to evaluate the following performances without Monte
Carlo simulations to ensure reproducibility
1. Individual throughput in a specific topology — Graph theory Part 1
2. Average throughput over possible topologies — SG
3. Distributions of interference and SIR — SG
78 / 88
SG Analysis of CSMA
▶ [Nguyen+2007; Baccelli+2009] MHCPP type II for CSMA modeling
▶ [Alfano+2011; Alfano+2014] Distribution of throughput
▶ [Busson+2009] Intensity underestimation of MHCPP
▶ [Kaynia+2011] Optimization of carrier sensing threshold
▶ [ElSawy+2012; ElSawy+2013] Modified hard-core point process
▶ [Venkataraman+2006] Interference distribution from PPP outside
the contention domain
Approximating MHCPP by PPP with intensity λMHCPP
[ElSawy+2012]
79 / 88
SIR distribution in Poisson cellular networks [Andrews+2011]
▶ Positions of base stations (BSs) follows a
Poisson point process (PPP)
▶ Nearest BS association (Poisson-Voronoi cells)
▶ Rayleigh fading
▶ Exponential-decaying path loss with α > 2
Spatial distribution of SIR in downlink can be derived w/o fading
P(SIR > θ) = · · ·
=
1
1 + 2θ
α−2 2F1
(
1, 1 − 2
α ; 2 − 2
α ; −θ
)
=
√
θ arctan
√
θ (α=4)
Closed form when α = 4
0
1
−20 −10 0 10 20 30
P(SIR≤θ)
θ (dB)
−20
−10
0
10
20
30
with fading
80 / 88
Expected sum — Campbell’s theorem for sums [Haenggi2012, §4.2]
Let Φ be a PPP on R2 with intensity λ, and f : R2 → R be a
measurable function. Then,
EΦ
[
∑
x∈Φ
ex. interference from x to origin
↓
f(x)
ex. sum of interference from Φ
]
= λ
∫
R2
f(x) dx
▶ Expected sum can be evaluated as an area integral
▶ Only applicable for PPPs. Thus, PPPs are often assumed
81 / 88
Stochastic Geometry Analysis of Normalized SNR
Scheduling in Downlink Cellular Networks
Takuya Ohto† Koji Yamamoto† Seong-Lyun Kim‡
Takayuki Nishio† Masahiro Morikura†
†Kyoto University ‡Yonsei University
IEEE Wireless Communications Letters, Aug. 2017
1) Round-robin scheduler
A user is selected in rotation independent of the channel state.
2) Proportional fair (PF) scheduler
A user with the largest instantaneous
rate normalized by the short-term aver-
age value,
arg max
i
ratei
E[ratei | ri]
0
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
2
Normalizeddatarate
3) Normalized SNR scheduler
A user with the largest instantaneous
SNR normalized by the short-term aver-
age value,
arg max
i
SNRi
E[SNRi | ri]
= arg max
i
hi
0
1
2
3
4
5
6
NormalizedSNR
SNRi
.
= hiri
−α
/σ2
Short-term average SNR: E[SNRi | ri] = ri
−α
/σ2
83 / 88
SIR distributions (success prob. / coverage prob.)
[Andrews+2011]
SIR ccdf under Rayleigh fading channel in Poisson cellular networks
P(SIR > θ) =
1
1 + 2θ
α−2 2F1
(
1, 1 − 2
α
; 2 − 2
α
; −θ
)
√
θ arctan
√
θ (when α = 4)
α: path loss exponent
User is selected randomly independent of channel state.
This performance can be achieved under round-robin scheduling.
Normalized SNR scheduling [Ohto+2017]
P(SIR > θ) ≈
∞∑
n=0
(λu/cλb)n
c B(n + 1, c) (λu/cλb + 1)n+c+1
×
n+1∑
k=1
(n+1
k
)
(−1)k+1
1 + 2kθ
α−2 2F1
(
1, 1 − 2
α
; 2 − 2
α
; −kθ
)
√
kθ arctan
√
kθ (when α = 4)
λb: intensity of BSs, λu: intensity of users, c = 3.5
84 / 88
SIR ccdf (Interference-limited, Rayleigh, and α = 4) [Ohto+2017]
0.0
0.2
0.4
0.6
0.8
1.0
−10 −5 0 5 10 15 20
λu/λb = 4 λu/λb = 8P(SIR>θ)
θ (dB)
Normalized SNR (Analysis)
(Simulation)
Round-robin (Analysis)
Round-robin [Andrews+2011] P(SIR > θ) =
1
1 +
√
θ arctan
√
θ
Normalized SNR P(SIR⋆
> θ) ≈
∞∑
n=0
fN (n)
n+1∑
k=1
(n+1
k
)
(−1)k+1
1 +
√
kθ arctan
√
kθ
85 / 88
Related works
Stochastic geometry analysis of channel-adaptive user scheduling in
cellular networks
Downlink
▶ [Ohto+2017] T. Ohto, K. Yamamoto, S.-L. Kim, T. Nishio, and M. Morikura,
“Stochastic geometry analysis of normalized SNR-based scheduling in downlink
cellular networks,” IEEE Wireless Commun. Lett., vol. 6, no. 4, Aug. 2017
▶ [Yamamoto2018] K. Yamamoto, “SIR distribution and scheduling gain of
normalized SNR scheduling in Poisson networks,” in Proc. Int. Symp. Model.
Optim. Mobile Ad Hoc Wireless Netw. (WiOpt), Shanghai, China, May 2018
Uplink
▶ [Kamiya+2018] S. Kamiya, K. Yamamoto, S.-L. Kim, T. Nishio, and
M. Morikura, “Asymptotic analysis of normalized SNR-based scheduling in
uplink cellular networks with truncated channel inversion power control,” in
Proc. IEEE Int. Conf. Commun. (ICC), Kansas, MO, USA, May 2018
86 / 88
References
▶ M. Haenggi, Stochastic Geometry for Wireless Networks.
Cambridge, U.K.: Cambridge Univ. Press, 2012
▶ F. Baccelli and B. Blaszczyszyn, “Stochastic geometry and wireless
networks: Volume II applications,” Found. Trends Netw., vol. 4,
no. 1-2, 2009
87 / 88
Summary (Part III)
Stochastic geometry modeling
▶ Poisson point process — Potential TXs
▶ Mat´ern CSMA point process — Simultaneous TXs
▶ Distribution of interference and SIR
Applications
▶ Joint carrier sense threshold and transmission power setting
▶ User scheduling
88 / 88
References I
[Alfano+2011] G. Alfano, M. Garetto, and E. Leonardi, “New insights
into the stochastic geometry analysis of dense CSMA
networks,” in Proc. IEEE Int. Conf. Comput. Commun.
(INFOCOM), Shanghai, China, Apr. 2011.
[Alfano+2014] ——,“New directions into the stochastic geometry
analysis of dense CSMA networks,” IEEE Trans. Mobile
Comput., vol. 13, no. 2, Feb. 2014.
[Andrews+2011] J. G. Andrews, F. Baccelli, and R. K. Ganti, “A tractable
approach to coverage and rate in cellular networks,”
IEEE Trans. Commun., vol. 59, no. 11, Nov. 2011.
[Arulkumaran+2017] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and
A. A. Bharath, “Deep reinforcement learning: A brief
survey,” IEEE Signal Process. Mag., vol. 34, no. 6, Nov.
2017.
[Baccelli+2009] F. Baccelli and B. Blaszczyszyn, “Stochastic geometry
and wireless networks: Volume II applications,” Found.
Trends Netw., vol. 4, no. 1-2, 2009.
89 / 88
References II
[Baddeley+2015] A. Baddeley, E. Rubak, and R. Turner, Spatial Point
Patterns: Methodology and Applications with R. CRC
Pr., 2015.
[Bianchi2000] G. Bianchi, “Performance analysis of the IEEE 802.11
distributed coordination function,” IEEE J. Sel. Areas
Commun., vol. 18, no. 3, Mar. 2000.
[Busson+2009] A. Busson and G. Chelius, “Point processes for
interference modeling in CSMA/CA ad-hoc networks,” in
Proc. the 6th ACM Symp. Perform. Eval. Wireless Ad
Hoc, Sensor, Ubiquitous Netw. (PE-WASUN), Canary
Islands, Spain, Oct. 2009.
[ElSawy+2012] H. ElSawy, E. Hossain, and S. Camorlinga,
“Characterizing random CSMA wireless networks: A
stochastic geometry approach,” in Proc. IEEE Int. Conf.
Commun. (ICC), Ottawa, ON, Jun. 2012.
90 / 88
References III
[ElSawy+2013] H. ElSawy and E. Hossain, “A modified hard core point
process for analysis of random CSMA wireless networks
in general fading environments,” IEEE Trans. Commun.,
vol. 61, no. 4, Apr. 2013.
[Fuemmeler+2006] J. A. Fuemmeler, N. H. Vaidya, and V. V. Veeravalli,
“Selecting transmit powers and carrier sense thresholds
in csma protocols for wireless ad hoc networks,” in
Proceedings of the 2nd annual international workshop on
Wireless internet - WICON ’06, New York, New York,
USA, Aug. 2006.
[Garetto+2008] M. Garetto, T. Salonidis, and E. W. Knightly,
“Modeling per-flow throughput and capturing starvation
in CSMA multi-hop wireless networks,” IEEE/ACM
Trans. Netw., vol. 16, no. 4, Aug. 2008.
[Haenggi2012] M. Haenggi, Stochastic Geometry for Wireless Networks.
Cambridge, U.K.: Cambridge Univ. Press, 2012.
91 / 88
References IV
[Iwata+2019] M. Iwata, K. Yamamoto, B. Yin, T. Nishio,
M. Morikura, and H. Abeysekera, “Analysis of inversely
proportional carrier sense threshold and transmission
power setting based on received power for IEEE
802.11ax,” in Proc. IEEE Consum. Commun. Netw.
Conf. (CCNC), Las Vegas, NV, USA, Jan. 2019.
[Kamiya+2018] S. Kamiya, K. Yamamoto, S.-L. Kim, T. Nishio, and
M. Morikura, “Asymptotic analysis of normalized
SNR-based scheduling in uplink cellular networks with
truncated channel inversion power control,” in Proc.
IEEE Int. Conf. Commun. (ICC), Kansas, MO, USA,
May 2018.
[Kaynia+2011] M. Kaynia, N. Jindal, and G. E. Oien, “Improving the
performance of wireless ad hoc networks through MAC
layer design,” IEEE Trans. Wireless Commun., vol. 10,
no. 1, Jan. 2011.
[Khorov+2019] E. Khorov, A. Kiryanov, A. Lyakhov, and G. Bianchi, “A
tutorial on IEEE 802.11ax high efficiency WLANs,” IEEE
Commun. Surveys Tuts., vol. 21, no. 1, 2019.
92 / 88
References V
[Koda+2018] Y. Koda, K. Yamamoto, T. Nishio, and M. Morikura,
“Reinforcement learning based predictive handover for
pedestrian-aware mmWave networks,” in Proc. IEEE Int.
Conf. Comput. Commun. Workshops (INFOCOM
Workshops), Honolulu, HI, USA, Apr. 2018.
[Koda+2019] Y. Koda, K. Nakashima, K. Yamamoto, T. Nishio, and
M. Morikura, “End-to-end learning of proactive handover
policy for camera-assisted mmWave networks using deep
reinforcement learning,” arXiv preprint arXiv:1904.04585,
Apr. 2019.
[Liew+2010] S. C. Liew, C. H. Kai, H. C. Leung, and P. Wong,
“Back-of-the-envelope computation of throughput
distributions in CSMA wireless networks,” IEEE Trans.
Mobile Comput., vol. 9, no. 9, Sep. 2010.
[Mhatre+2007] V. P. Mhatre, K. Papagiannaki, and F. Baccelli,
“Interference mitigation through power control in high
density 802.11 WLANs,” in Proc. IEEE Int. Conf.
Comput. Commun. (INFOCOM), Anchorage, AK, May
2007.
93 / 88
References VI
[Mnih+2015] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu,
J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller,
A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie,
A. Sadik, I. Antonoglou, H. King, D. Kumaran,
D. Wierstra, S. Legg, and D. Hassabis, “Human-level
control through deep reinforcement learning,” Nature,
vol. 518, no. 7540, Feb. 2015.
[Nakashima+2019] K. Nakashima, S. Kamiya, K. Ohtsu, K. Yamamoto,
T. Nishio, and Masahiro Morikura, “Deep reinforcement
learning-based channel allocation for wireless LANs with
graph convolutional networks,” arXiv preprint
arXiv:1905.07144, May 2019.
[Nguyen+2007] H. Q. Nguyen, F. Baccelli, and D. Kofman, “A
stochastic geometry analysis of dense IEEE 802.11
networks,” in Proc. IEEE Int. Conf. Comput. Commun.
(INFOCOM), Anchorage, AK, May 2007.
94 / 88
References VII
[Ohto+2017] T. Ohto, K. Yamamoto, S.-L. Kim, T. Nishio, and
M. Morikura, “Stochastic geometry analysis of
normalized SNR-based scheduling in downlink cellular
networks,” IEEE Wireless Commun. Lett., vol. 6, no. 4,
Aug. 2017.
[Schaul+2015] T. Schaul, J. Quan, I. Antonoglou, and D. Silver,
“Prioritized experience replay,” in ICLR, Sna Juan,
Puerto Rico, May 2015.
[Sutton+2018] R. S. Sutton and A. G. Barto, Reinforcement Learning,
2nd ed. MIT Pr., 2018.
[Szepesv´ari2010] C. Szepesv´ari, Algorithms for Reinforcement Learning.
Morgan and Claypool Pub., 2010.
[Venkataraman+2006] J. Venkataraman, M. Haenggi, and O. Collins, “Shot
noise models for outage and throughput analyses in
wireless ad hoc networks,” in Proc. IEEE Military
Commun. Conf. (MILCOM), Oct. 2006.
95 / 88
References VIII
[vHasselt+2016] H. van Hasselt, A. Guez, and D. Silver, “Deep
reinforcement learning with double q-learning,” in 30th
AAAI Conf. Artificial Intelligence, Mar. 2016.
[Wang+2015] Z. Wang, T. Schaul, M. Hessel, H. van Hasselt,
M. Lanctot, and N. de Freitas, “Dueling network
architectures for deep reinforcement learning,” arXiv
preprint arXiv:1511.06581, Nov. 2015.
[Watkins+1992] C. J. C. H. Watkins and P. Dayan, “Q-learning,” Mach.
Learn., vol. 8, no. 3-4, May 1992.
[Yamamoto2015] K. Yamamoto, “A comprehensive survey of potential
game approaches to wireless networks,” IEICE Trans.
Commun., vol. E98-B, no. 9, Sep. 2015.
[Yamamoto2018] ——,“SIR distribution and scheduling gain of normalized
SNR scheduling in Poisson networks,” in Proc. Int.
Symp. Model. Optim. Mobile Ad Hoc Wireless Netw.
(WiOpt), Shanghai, China, May 2018.
96 / 88
References IX
[Yin+2017] B. Yin, S. Kamiya, K. Yamamoto, T. Nishio,
M. Morikura, and H. Abeysekera, “Mitigating
throughput starvation in dense WLANs through potential
game-based channel selection,” IEICE Trans.
Fundamentals, vol. E100-A, no. 11, Nov. 2017.
Some figures in this slide are licensed to speakers only.
97 / 88

More Related Content

What's hot

Designing 5G NR (New Radio)
Designing 5G NR (New Radio)Designing 5G NR (New Radio)
Designing 5G NR (New Radio)
Qualcomm Research
 
Info vista planet-5g nr
Info vista planet-5g nrInfo vista planet-5g nr
Info vista planet-5g nr
Denmark Wilson
 
FT of Gaussian Pulse etc.ppt
FT of Gaussian Pulse etc.pptFT of Gaussian Pulse etc.ppt
FT of Gaussian Pulse etc.ppt
SandyPurba4
 
LTE introduction part1
LTE introduction part1LTE introduction part1
LTE introduction part1
Pei-Che Chang
 
Gautier techniques mimo
Gautier techniques mimoGautier techniques mimo
Gautier techniques mimo
mhamedhoumani
 
Digital Mobile Network Evolution - from GSM to 5G
Digital Mobile Network Evolution - from GSM to 5GDigital Mobile Network Evolution - from GSM to 5G
Digital Mobile Network Evolution - from GSM to 5G
3G4G
 
Lte power control
Lte power controlLte power control
Lte power control
Pranay Akul
 
LTE-Advanced Physical Layer
LTE-Advanced Physical LayerLTE-Advanced Physical Layer
LTE-Advanced Physical Layer
Praveen Kumar
 
signal space analysis.ppt
signal space analysis.pptsignal space analysis.ppt
signal space analysis.ppt
PatrickMumba7
 
Aide à la Planification Cellulaire dans un Réseau LTE (4G)
Aide à la Planification Cellulaire dans un Réseau LTE (4G)Aide à la Planification Cellulaire dans un Réseau LTE (4G)
Aide à la Planification Cellulaire dans un Réseau LTE (4G)
Fatiha Merazka
 
Parallel Wireless Urban Solution - Media
Parallel Wireless Urban Solution - MediaParallel Wireless Urban Solution - Media
Parallel Wireless Urban Solution - Media
Parallel Wireless, Inc.
 
RF Module Design - [Chapter 2] Noises
RF Module Design - [Chapter 2] NoisesRF Module Design - [Chapter 2] Noises
RF Module Design - [Chapter 2] Noises
Simen Li
 
bask, bfsk, bpsk
bask, bfsk, bpskbask, bfsk, bpsk
bask, bfsk, bpsk
blzz2net
 
NEI_LTE1235_ready_02.pdf
NEI_LTE1235_ready_02.pdfNEI_LTE1235_ready_02.pdf
NEI_LTE1235_ready_02.pdf
PLAGBEKomlanDaniel
 
LTE air interface overview
LTE air interface overviewLTE air interface overview
LTE air interface overview
Nasir Ahmad
 
Equalization in digital communication
Equalization in digital communicationEqualization in digital communication
Equalization in digital communication
Pei-Che Chang
 
4g lte matlab
4g lte matlab4g lte matlab
4g lte matlab
Hakim Zentani
 
friis formula
friis formulafriis formula
friis formula
Harshavardhan K
 

What's hot (20)

Designing 5G NR (New Radio)
Designing 5G NR (New Radio)Designing 5G NR (New Radio)
Designing 5G NR (New Radio)
 
Info vista planet-5g nr
Info vista planet-5g nrInfo vista planet-5g nr
Info vista planet-5g nr
 
FT of Gaussian Pulse etc.ppt
FT of Gaussian Pulse etc.pptFT of Gaussian Pulse etc.ppt
FT of Gaussian Pulse etc.ppt
 
LTE introduction part1
LTE introduction part1LTE introduction part1
LTE introduction part1
 
Gautier techniques mimo
Gautier techniques mimoGautier techniques mimo
Gautier techniques mimo
 
Matched filter
Matched filterMatched filter
Matched filter
 
Digital Mobile Network Evolution - from GSM to 5G
Digital Mobile Network Evolution - from GSM to 5GDigital Mobile Network Evolution - from GSM to 5G
Digital Mobile Network Evolution - from GSM to 5G
 
Lte power control
Lte power controlLte power control
Lte power control
 
LTE-Advanced Physical Layer
LTE-Advanced Physical LayerLTE-Advanced Physical Layer
LTE-Advanced Physical Layer
 
signal space analysis.ppt
signal space analysis.pptsignal space analysis.ppt
signal space analysis.ppt
 
Aide à la Planification Cellulaire dans un Réseau LTE (4G)
Aide à la Planification Cellulaire dans un Réseau LTE (4G)Aide à la Planification Cellulaire dans un Réseau LTE (4G)
Aide à la Planification Cellulaire dans un Réseau LTE (4G)
 
Parallel Wireless Urban Solution - Media
Parallel Wireless Urban Solution - MediaParallel Wireless Urban Solution - Media
Parallel Wireless Urban Solution - Media
 
RF Module Design - [Chapter 2] Noises
RF Module Design - [Chapter 2] NoisesRF Module Design - [Chapter 2] Noises
RF Module Design - [Chapter 2] Noises
 
bask, bfsk, bpsk
bask, bfsk, bpskbask, bfsk, bpsk
bask, bfsk, bpsk
 
NEI_LTE1235_ready_02.pdf
NEI_LTE1235_ready_02.pdfNEI_LTE1235_ready_02.pdf
NEI_LTE1235_ready_02.pdf
 
LTE air interface overview
LTE air interface overviewLTE air interface overview
LTE air interface overview
 
Equalization in digital communication
Equalization in digital communicationEqualization in digital communication
Equalization in digital communication
 
Radio Link budget
Radio Link budgetRadio Link budget
Radio Link budget
 
4g lte matlab
4g lte matlab4g lte matlab
4g lte matlab
 
friis formula
friis formulafriis formula
friis formula
 

Similar to Machine Learning and Stochastic Geometry: Statistical Frameworks Against Uncertainty in Wireless LANs --- Part A

Design of Rectangular Shaped Slotted Micro Strip Antenna for Triple Frequency...
Design of Rectangular Shaped Slotted Micro Strip Antenna for Triple Frequency...Design of Rectangular Shaped Slotted Micro Strip Antenna for Triple Frequency...
Design of Rectangular Shaped Slotted Micro Strip Antenna for Triple Frequency...
IRJET Journal
 
MIMO Communications
MIMO CommunicationsMIMO Communications
MIMO Communications
IRJET Journal
 
A flexible method to create wave file features
A flexible method to create wave file features A flexible method to create wave file features
A flexible method to create wave file features
IJECEIAES
 
Cycle’s topological optimizations and the iterative decoding problem on gener...
Cycle’s topological optimizations and the iterative decoding problem on gener...Cycle’s topological optimizations and the iterative decoding problem on gener...
Cycle’s topological optimizations and the iterative decoding problem on gener...
Usatyuk Vasiliy
 
IEEE 802.11a Physical Layer Simulation
IEEE 802.11a Physical Layer SimulationIEEE 802.11a Physical Layer Simulation
IEEE 802.11a Physical Layer Simulation
Michail Grigoropoulos
 
Performance Enhancement in SU and MU MIMO-OFDM Technique for Wireless Communi...
Performance Enhancement in SU and MU MIMO-OFDM Technique for Wireless Communi...Performance Enhancement in SU and MU MIMO-OFDM Technique for Wireless Communi...
Performance Enhancement in SU and MU MIMO-OFDM Technique for Wireless Communi...
IJECEIAES
 
The impact of noise on detecting the arrival angle using the root-WSF algorithm
The impact of noise on detecting the arrival angle using the root-WSF algorithmThe impact of noise on detecting the arrival angle using the root-WSF algorithm
The impact of noise on detecting the arrival angle using the root-WSF algorithm
TELKOMNIKA JOURNAL
 
Channel Models for Massive MIMO
Channel Models for Massive MIMOChannel Models for Massive MIMO
Channel Models for Massive MIMO
CPqD
 
D010512126
D010512126D010512126
D010512126
IOSR Journals
 
(Slides) A Method for Distributed Computaion of Semi-Optimal Multicast Tree i...
(Slides) A Method for Distributed Computaion of Semi-Optimal Multicast Tree i...(Slides) A Method for Distributed Computaion of Semi-Optimal Multicast Tree i...
(Slides) A Method for Distributed Computaion of Semi-Optimal Multicast Tree i...
Naoki Shibata
 
BER Performance of MU-MIMO System using Dirty Paper Coding
BER Performance of MU-MIMO System using Dirty Paper CodingBER Performance of MU-MIMO System using Dirty Paper Coding
BER Performance of MU-MIMO System using Dirty Paper Coding
IJEEE
 
"An adaptive modular approach to the mining of sensor network ...
"An adaptive modular approach to the mining of sensor network ..."An adaptive modular approach to the mining of sensor network ...
"An adaptive modular approach to the mining of sensor network ...butest
 
An ultra wideband antenna for Ku band applications
An ultra wideband antenna for Ku band applicationsAn ultra wideband antenna for Ku band applications
An ultra wideband antenna for Ku band applications
IJECEIAES
 
EE402B Radio Systems and Personal Communication Networks notes
EE402B Radio Systems and Personal Communication Networks notesEE402B Radio Systems and Personal Communication Networks notes
EE402B Radio Systems and Personal Communication Networks notes
Haris Hassan
 
2021 itu challenge_reinforcement_learning
2021 itu challenge_reinforcement_learning2021 itu challenge_reinforcement_learning
2021 itu challenge_reinforcement_learning
LASSEMedia
 
MISSILE TELEMETRY DATALINK CALCULATION (A MATLAB PROGRAM)
MISSILE TELEMETRY DATALINK CALCULATION  (A MATLAB PROGRAM)MISSILE TELEMETRY DATALINK CALCULATION  (A MATLAB PROGRAM)
MISSILE TELEMETRY DATALINK CALCULATION (A MATLAB PROGRAM)
AM Publications
 
Modified e-slotted patch antenna for WLAN/Wi-Max satellite applications
Modified e-slotted patch antenna for WLAN/Wi-Max satellite applicationsModified e-slotted patch antenna for WLAN/Wi-Max satellite applications
Modified e-slotted patch antenna for WLAN/Wi-Max satellite applications
TELKOMNIKA JOURNAL
 
tutorial 3.pptx
tutorial 3.pptxtutorial 3.pptx
tutorial 3.pptx
ntabdelnaiem
 
An Efficient Wireless Backhaul Utilizing MIMO Transmission and IPT Forwarding
An Efficient Wireless Backhaul Utilizing MIMO Transmission and IPT ForwardingAn Efficient Wireless Backhaul Utilizing MIMO Transmission and IPT Forwarding
An Efficient Wireless Backhaul Utilizing MIMO Transmission and IPT Forwarding
CSCJournals
 

Similar to Machine Learning and Stochastic Geometry: Statistical Frameworks Against Uncertainty in Wireless LANs --- Part A (20)

Design of Rectangular Shaped Slotted Micro Strip Antenna for Triple Frequency...
Design of Rectangular Shaped Slotted Micro Strip Antenna for Triple Frequency...Design of Rectangular Shaped Slotted Micro Strip Antenna for Triple Frequency...
Design of Rectangular Shaped Slotted Micro Strip Antenna for Triple Frequency...
 
MIMO Communications
MIMO CommunicationsMIMO Communications
MIMO Communications
 
A flexible method to create wave file features
A flexible method to create wave file features A flexible method to create wave file features
A flexible method to create wave file features
 
Cycle’s topological optimizations and the iterative decoding problem on gener...
Cycle’s topological optimizations and the iterative decoding problem on gener...Cycle’s topological optimizations and the iterative decoding problem on gener...
Cycle’s topological optimizations and the iterative decoding problem on gener...
 
IEEE 802.11a Physical Layer Simulation
IEEE 802.11a Physical Layer SimulationIEEE 802.11a Physical Layer Simulation
IEEE 802.11a Physical Layer Simulation
 
Performance Enhancement in SU and MU MIMO-OFDM Technique for Wireless Communi...
Performance Enhancement in SU and MU MIMO-OFDM Technique for Wireless Communi...Performance Enhancement in SU and MU MIMO-OFDM Technique for Wireless Communi...
Performance Enhancement in SU and MU MIMO-OFDM Technique for Wireless Communi...
 
The impact of noise on detecting the arrival angle using the root-WSF algorithm
The impact of noise on detecting the arrival angle using the root-WSF algorithmThe impact of noise on detecting the arrival angle using the root-WSF algorithm
The impact of noise on detecting the arrival angle using the root-WSF algorithm
 
Channel Models for Massive MIMO
Channel Models for Massive MIMOChannel Models for Massive MIMO
Channel Models for Massive MIMO
 
D010512126
D010512126D010512126
D010512126
 
Cm co
Cm coCm co
Cm co
 
(Slides) A Method for Distributed Computaion of Semi-Optimal Multicast Tree i...
(Slides) A Method for Distributed Computaion of Semi-Optimal Multicast Tree i...(Slides) A Method for Distributed Computaion of Semi-Optimal Multicast Tree i...
(Slides) A Method for Distributed Computaion of Semi-Optimal Multicast Tree i...
 
BER Performance of MU-MIMO System using Dirty Paper Coding
BER Performance of MU-MIMO System using Dirty Paper CodingBER Performance of MU-MIMO System using Dirty Paper Coding
BER Performance of MU-MIMO System using Dirty Paper Coding
 
"An adaptive modular approach to the mining of sensor network ...
"An adaptive modular approach to the mining of sensor network ..."An adaptive modular approach to the mining of sensor network ...
"An adaptive modular approach to the mining of sensor network ...
 
An ultra wideband antenna for Ku band applications
An ultra wideband antenna for Ku band applicationsAn ultra wideband antenna for Ku band applications
An ultra wideband antenna for Ku band applications
 
EE402B Radio Systems and Personal Communication Networks notes
EE402B Radio Systems and Personal Communication Networks notesEE402B Radio Systems and Personal Communication Networks notes
EE402B Radio Systems and Personal Communication Networks notes
 
2021 itu challenge_reinforcement_learning
2021 itu challenge_reinforcement_learning2021 itu challenge_reinforcement_learning
2021 itu challenge_reinforcement_learning
 
MISSILE TELEMETRY DATALINK CALCULATION (A MATLAB PROGRAM)
MISSILE TELEMETRY DATALINK CALCULATION  (A MATLAB PROGRAM)MISSILE TELEMETRY DATALINK CALCULATION  (A MATLAB PROGRAM)
MISSILE TELEMETRY DATALINK CALCULATION (A MATLAB PROGRAM)
 
Modified e-slotted patch antenna for WLAN/Wi-Max satellite applications
Modified e-slotted patch antenna for WLAN/Wi-Max satellite applicationsModified e-slotted patch antenna for WLAN/Wi-Max satellite applications
Modified e-slotted patch antenna for WLAN/Wi-Max satellite applications
 
tutorial 3.pptx
tutorial 3.pptxtutorial 3.pptx
tutorial 3.pptx
 
An Efficient Wireless Backhaul Utilizing MIMO Transmission and IPT Forwarding
An Efficient Wireless Backhaul Utilizing MIMO Transmission and IPT ForwardingAn Efficient Wireless Backhaul Utilizing MIMO Transmission and IPT Forwarding
An Efficient Wireless Backhaul Utilizing MIMO Transmission and IPT Forwarding
 

Recently uploaded

Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
AJAYKUMARPUND1
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
Pratik Pawar
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
bakpo1
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
fxintegritypublishin
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
R&R Consult
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.
PrashantGoswami42
 
ethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.pptethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.ppt
Jayaprasanna4
 
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSETECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
DuvanRamosGarzon1
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
Kamal Acharya
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation & Control
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
VENKATESHvenky89705
 
block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
Divya Somashekar
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
AhmedHussein950959
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
Osamah Alsalih
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
seandesed
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
SamSarthak3
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdf
Kamal Acharya
 

Recently uploaded (20)

Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.
 
ethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.pptethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.ppt
 
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSETECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
 
block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdf
 

Machine Learning and Stochastic Geometry: Statistical Frameworks Against Uncertainty in Wireless LANs --- Part A

  • 1. Machine Learning and Stochastic Geometry: Statistical Frameworks Against Uncertainty in Wireless LANs Koji Yamamoto and Takayuki Nishio Graduate School of Informatics, Kyoto University 2019-05-24 — IEEE ICC 2019 Tutorial
  • 2. Contents of this tutorial Part A — Deep Reinforcement Learning and Stochastic Geometry for microwave WLANs Koji Yamamoto — 90 min I. Wireless LANs — IEEE 802.11ax and 11be II. Deep Reinforcement Learning — Channel Allocation in WLANs III. Stochastic Geometry — Modeling and Analysis of WLANs Break Part B — Deep Learning for/in mmWave WLANs Takayuki Nishio — 90 min 2 / 88
  • 3. Radio parameters Transmission power Frequency channel and states topology fading −−−−−−−−→ Communication quality Throughput Delay x Model f −−−−−−−−→ y = f(x) We would like to ▶ maximize the communication quality f(x) ▶ find the optimal parameter arg max x f(x) 3 / 88
  • 4. Model f is given? Optimization/ Game theory yes Data (xi, yi) is given? yi = f(xi) + noise no Supervised learning yes Reinforcement learning no Deep learning (CNN) xi is image Deep RL xi is image Part A-III Part A-IIPart B 4 / 88
  • 5. Model f is given? Optimization/ Game theory yes Data (xi, yi) is given? yi = f(xi) + noise no Supervised learning yes Reinforcement learning no Deep learning (CNN) xi is image Deep RL xi is image Part A-III Part A-IIPart B 4 / 88
  • 6. Model f is given? Optimization/ Game theory yes Data (xi, yi) is given? yi = f(xi) + noise no Supervised learning yes Reinforcement learning no Deep learning (CNN) xi is image Deep RL xi is image Part A-III Part A-IIPart B 4 / 88
  • 7. Model f is given? Optimization/ Game theory yes Data (xi, yi) is given? yi = f(xi) + noise no Supervised learning yes Reinforcement learning no Deep learning (CNN) xi is image Deep RL xi is image Part A-III Part A-IIPart B 4 / 88
  • 8. Model f is given? Optimization/ Game theory yes Data (xi, yi) is given? yi = f(xi) + noise no Supervised learning yes Reinforcement learning no Deep learning (CNN) xi is image Deep RL xi is image Part A-III Part A-IIPart B 4 / 88
  • 9. Goal of this tutorial Deep understanding of important ideas ▶ Each of ML, RL, SG, and WLANs can be one tutorial ▶ A deep understanding of important techniques helps you to understand the relevant techniques yourself Machine learning and stochastic geometry analysis ▶ Mainly for resource management and system design 5 / 88
  • 10. Part I Wireless LANs — IEEE 802.11ax and 11be 6 / 88
  • 11. Maximum data rate in IEEE 802.11 11 11a 11b 11g 11n 11ac 11ax 11be(bit/s) 1995 2000 2005 2010 2015 2020 1M 10M 100M 1G 10G 7 / 88
  • 12. PHY (cited from [Khorov+2019, Table II]) Legacy 802.11ax OFDM Constellation Order 256-QAM (11ac) 1024-QAM OFDM Symbol Duration 3.2 µs 12.8µs OFDM Guard Interval 0.4 or 0.8 µs 0.8, 1.6 or 3.2 µs Maximum Data Rate ≈ 7 Gbit/s ≈ 9.6 Gbit/s 8 / 88
  • 13. The 802.11ax Challenge is Dense Networks [Khorov+2019] 9 / 88
  • 15. 0 0.2 0.4 0.6 0.8 1 1 10 100 1000 10000 Probabilityofcollisions Number of transmitters Bianchi model (saturated traffic) [Bianchi2000] IEEE 802.11a, Frame size: 1500 B, Data rate: 6 Mbit/s 11 / 88
  • 16. Channel Access and OBSS Management (cited from [Khorov+2019, Table II]) Legacy (802.11ac and earlier) 802.11ax Basic Channel Access CSMA/CA OFDMA on top of CSMA/CA Random Channel Access DCF, EDCA UL OFDMA Random Access on top of CSMA/CA Contention-free Access PCF, HCCA (not implemented in real devices), RAW (11ah) Trigger-based UL OFDMA MU Technology MU-MIMO (11ac) MU-MIMO, OFDMA MU Transmission Direction DL (11ac) DL and UL Fragmentation Static Flexible Interference Mitigation NAV, RTS/CTS, HCCA TXOP Negotiation Two NAVs, Quiet Period Spatial Reuse Sectorization (11ah) Adaptive Power and Sensitiv- ity Thresholds, Color 12 / 88
  • 17. Spatial Reuse Using BSS “Color” Start Channel Transmit Idle End Busy (11ac) “Color” < OBSS PD Busy (11ax) Same BSS OBSS Yes No ▶ “Color” header ▶ Parameter: Carrier sense threshold for OBSS (OBSS PD) ▶ As a result, channel is spatially reused (spatial reuse). OBSS: Overlapped Basic Service Set, i.e., the other adjacent cells 13 / 88
  • 18. IEEE 802.11ax IEEE 802.11be Will be finalized in 2020 (in schedule) Starts in 2018 High Efficiency WLAN (HEW) Study Group Extremely High Throughput (EHT) Study Group Wi-Fi 6 Wi-Fi 7 (?) http://www.ieee802.org/11/ Reports/tgax_update.htm http://www.ieee802.org/11/ Reports/ehtsg_update.htm [Khorov+2019] E. Khorov, A. Kiryanov, A. Lyakhov, and G. Bianchi, “A tutorial on IEEE 802.11ax high efficiency WLANs,” IEEE Commun. Surveys Tuts., vol. 21, no. 1, 2019 ▶ At least 30 Gbit/s ▶ 320 MHz bandwidth and non-contiguous spectrum ▶ Multi-band/multi-channel aggregation ▶ 16 spatial streams MIMO ▶ Multi-Access Point Coordination ▶ Enhanced link adaptation and retransmission protocol 14 / 88
  • 19. Throughput Starvation BA C D E C D E t Busy First order starvation [Mhatre+2007] Flow-in-the middle starvation [Garetto+2008] Second order starvation [Mhatre+2007] Under saturated traffic, A and D always detect signals thus they cannot start transmissions 15 / 88
  • 20. Goal of performance modeling ▶ Points represent TXs or TX-RX pairs. ▶ Circle represents carrier sense area (contention domain). We would like to evaluate the following performances without Monte Carlo simulations to ensure reproducibility 1. Individual throughput in a specific topology — Graph theory Part I 2. Average throughput over possible topologies — SG Part III 3. Distributions of interference and SIR — SG Part III 16 / 88
  • 21. Contention graph TX-RX pair Vertex Pairs detecting signals each other Vertices are connected with edge Implicit assumption: TX and RX forming a given link have the same carrier sense relationship (The situation where one of them detects the carrier but the other does not detect it is not allowed) Carrier detection Carrier sensing relationship Contention graph 17 / 88
  • 22. Back-of-Envelope (BoE) throughput [Liew+2010] Contention graph 0.50.5 0 1 BoE throughput Normalized throughput Medium access probability 18 / 88
  • 23. Model of simultaneous transmissions according to CSMA Maximum independent sets (MISs) ▶ Induced subgraph without edges with the maximum number of vertices ▶ Possible patterns of simultaneous TXs according to CSMA (Vertex-)induced subgraph ▶ A subset of the vertex set of the original graph and related edge set Induced subgraph o o o o MIS x x o o Wolfram—Alpha: Subgraph, Independent Set 19 / 88
  • 24. n1 = 2 n1/n = 2/2 = 1 n2 = 0 n2/n = 0/2 = 0 n3 = 1 n3/n = 1/2 = 0.5 1. Find all MISs of the contention graph (possible simultaneous transmission patterns) 2. The normalized throughput of link (vertex) i is determined as ni/n, where ▶ n is the number of MISs and ▶ ni is the number of MISs containing i. We assume that all possible patterns occur with equal probability 20 / 88
  • 25. Example of BoE throughput 0.6 0.0 1.0 0.4 0.2 1.0 0.3 0.2 0.8 0.3 1.0 0.6 1.0 0.4 0.3 0.8 21 / 88
  • 26. 1 require("igraph") 2 require("spatstat") 3 require("plotrix") 4 set.seed(9) 5 6 LAMBDA <- 2e-5 7 RADIUS <- 200 8 W.LENGTH <- 500 9 W <- owin(xrange = c(-W.LENGTH, W.LENGTH), yrange = c(-W.LENGTH, W.LENGTH)) 10 11 # Poisson point process 12 P <- rpoispp(lambda = LAMBDA, win = W) 13 14 # Adjacency matrix 15 D <- pairdist(P) 16 adjacency.matrix <- (D<RADIUS) 17 diag(adjacency.matrix) <- 0 18 19 # Contention graph 20 adjacency.graph <- graph.adjacency(adjacency.matrix, mode="undirected") 21 22 # Maximum independent sets 23 MISs <- largest_ivs(adjacency.graph) 24 25 # Evaluate throughput 26 ni <- 0 27 for ( i in 1:(P$n) ) { 28 N.MIS <- 0 29 for ( j in 1:length(MISs) ) { 30 N.MIS <- N.MIS + sum(MISs[[j]]==i) 31 } 32 ni[i] <- N.MIS 33 } 34 22 / 88
  • 27. 35 # Plotting figures 36 quartz(type="pdf", file="160408_BoE_simple_position.pdf") 37 par(pty="s",mar=c(0, 0, 0, 0)+0.1) 38 plot(P$x, P$y, xlim=c(-W.LENGTH,W.LENGTH), ylim=c(-W.LENGTH,W.LENGTH)) 39 for (i in 1:(P$n) ) { 40 draw.circle(P[i]$x,P[i]$y,200,border=NA,col=adjustcolor( "blue", alpha.f = 0.1)) 41 } 42 dev.off() 43 quartz(type="pdf", file="160408_BoE_simple.pdf") 44 par(pty="s",mar=c(0, 0, 0, 0)+0.1) 45 plot(P$x, P$y, xlim=c(-W.LENGTH,W.LENGTH), ylim=c(-W.LENGTH,W.LENGTH)) 46 for ( i in 1:(P$n-1) ) { 47 for ( j in (i+1):(P$n) ) { 48 if (adjacency.matrix[i,j]==1) { 49 segments(P[i]$x, P[i]$y, P[j]$x, P[j]$y, col="blue", lwd = 2) 50 } 51 } 52 } 53 text(P$x, P$y, sprintf("%.1f",ni/length(MISs)), pos=1) 54 dev.off() 23 / 88
  • 28. Potential game-based channel allocation [Yin+2017] 0 200 400 600 800 1000 1200 0 200 400 600 800 1000 1200 Location(m) Location (m) Proposed scheme 0 200 400 600 800 1000 1200 0 200 400 600 800 1000 1200 Location(m) Location (m) Compared scheme ▶ Cost function taking into account the number of three-node chains which result in flow-in-the-middle starvation ▶ Cost function is designed to be a potential game [Yamamoto2015], i.e., channel selection dynamics is proved to converge to a Nash equilibrium B. Yin, S. Kamiya, K. Yamamoto, T. Nishio, M. Morikura, and H. Abeysekera, “Mitigating throughput starvation in dense WLANs through potential game-based channel selection,” IEICE Trans. Fundamentals, vol. E100-A, no. 11, Nov. 2017 24 / 88
  • 29. Summary (Part I) ▶ Spatial reuse using “color” is a promising technology in IEEE 802.11ax ▶ The next standard — IEEE 802.11be ▶ Throughput starvation problem ▶ Back-of-envelope throughput evaluation 25 / 88
  • 30. Part II Deep Reinforcement Learning — Channel Allocation in WLANs 26 / 88
  • 31. BoE throughput evaluation takes into account ▶ Adjacency (carrier sensing relationship) among stations ▶ Saturated traffic But there are many other factors ▶ Unsaturated traffic and its fluctuations ▶ Random backoff algorithm in CSMA/CA ▶ Fading and its impact on adjacency ▶ Other factors Optimization based on observations ↓ Machine learning 27 / 88
  • 32. Machine Learn- ing (ML) Unsupervised Learning Supervised Learning Deep Learning Reinforcement Learning (RL) Multi- armed Bandit Q-learning Deep RL Multi- agent RL Adversarial RL 28 / 88 Which framework can we utilize for performance improvement in WLANs? From a viewpoint of radio resource management
  • 33. Reinforcement Learning (RL) Multi-armed Bandit Q-learning Deep RL Multi- agent RL Adversarial RL “Deep RL with graph convolution” — My opinion at this point Multi-agent RL is also attractive because WLANs are decentralized networks. However, at this point, ▶ Multi-agent RL is mainly developed for cooperative control systems ▶ It is hard to apply deep learning with recent progress 29 / 88
  • 34. Deep RL with graph convolution RL Q-learning Deep learning Function approximation with deep neural networks Graph convolution 30 / 88
  • 35. Deep RL with graph convolution RL Q-learning 31 / 88
  • 36. Motivation — Optimal Channel Allocation Optimal channel allocation? Criterion: Aggregated throughput Number of channels: 2 1 2 3 4 5 1 2 3 4 5 1 1 1 1 1 1 2 3 4 5 1 1 1 1 1← BoE throughput Framework: Combinatorial optimization (NP-hard) BoE throughput is shown for the ease of explanation. 32 / 88
  • 37. Motivation — Optimal Channel Allocation Optimal channel allocation? Criterion: Aggregated throughput Number of channels: 2 1 2 3 4 5 1 2 3 4 5 1 1 1 1 1 1 2 3 4 5 1 1 1 1 1← BoE throughput Framework: Combinatorial optimization (NP-hard) BoE throughput is shown for the ease of explanation. 32 / 88
  • 38. Motivation — Optimal Sequence Different problem setting: ▶ Finding the minimal sequence ▶ Only one AP can change its channel at a given time ▶ Throughput can be observed only after channel allocation Initial state Optimal states 1 2 3 4 5 1 0 1 0 1 We would like to find the minimum sequence. 1 2 3 4 5 1 1 1 1 1 1 2 3 4 5 1 1 1 1 1 This part is simplified version of [Nakashima+2019] 33 / 88
  • 39. Channel Allocation Sequence 1 2 3 4 5 1 0 1 0 1 1 2 3 4 5 1 1 1 0 1 1 2 3 4 5 1 1 1 1 1 ← We want to ac- quire this sequence based on observed throughput Criterion? 1 2 3 4 5 1 0 1 0 1 1 2 3 4 5 0.5 0.5 1 0.5 0.5 1 2 3 4 5 1 1 1 0.5 0.5 1 2 3 4 5 1 1 1 1 1 t = 0 t = 1 t = 2 t = 3 34 / 88
  • 40. Automated driving system — Self-driving cars https://www.metacar-project.com/ Markov decision process (MDP) Agent Environment Action State=Position Reward Metacar: “A reinforcement learning environment for self-driving cars in the browser” 35 / 88
  • 41. Problem At time t, the agent has a sequence of observations ( state S0, A0 action , reward R1, S1), (S1, A1, R2, S2), . . . , (St, At, Rt+1, St+1) Based on the observations, what action At+1 should the agent take to get higher reward? The agent (decision maker) should have the criterion 36 / 88
  • 42. Total reward and optimal action-value function Criterion of agent: Expected total discounted reward with discount factor 0 ≤ γ ≤ 1 E [ ∞∑ t=0 γt Rt+1 ] By using the optimal action-value function Q∗ (s, a) . = max π Eπ [ ∞∑ t=0 γt Rt+1 S0 = s, A0 = a ] , s ∈ S, a ∈ A The next action should be determined as At+1 = arg max a∈A Q∗ (St+1, a) . = denotes “equality relationship that is true by definition” [Sutton+2018] Eπ [·] denotes the expected value of a random variable given that the agent follows policy π. [Sutton+2018] 37 / 88
  • 43. For the following initial state S0, the following sequence starting with A0 = “AP 4 uses CH 2” is optimal because the sequence after t = 1 is optimal t = 0 t = 1 t = 2 · · · State St 1 2 3 4 5 1 1 1 0 1 1 2 3 4 5 1 1 1 1 1 1 2 3 4 5 1 1 1 1 1 · · · Action At AP 4 uses CH 2 Reward Rt R1 = 5 R2 = 5 · · · Q∗ ( S0 = 1 2 3 4 5 1 1 1 0 1 , A0 = AP 4 uses CH 2 ) = R1 + γR2 + γ2 R3 + · · · = 5 + γ5 + γ2 5 + · · · = 5 1 − γ = 50 (γ = 0.9) 38 / 88
  • 44. Part of Optimal Q-table For the initial state 1 2 3 4 5 1 1 1 0 1 CH 1 2 AP 1 4 + γ5 + γ25 + · · · = 49 3 + γ4 + γ25 + · · · = 47.1 2 3 + γ4 + γ25 + · · · = 47.1 4 + γ5 + γ25 + · · · = 49 3 4 + γ5 + γ25 + · · · = 49 3 + γ4 + γ25 + · · · = 47.1 4 4 + γ5 + γ25 + · · · = 49 5 + γ5 + γ25 + · · · = 50 5 4 + γ5 + γ25 + · · · = 49 4 + γ4 + γ25 + · · · = 48.1 γ = 0.9 ▶ The maximum total reward is achieved starting with “AP 4 uses CH 2”. ▶ If we have the optimal Q-table, we can take the optimal action. 39 / 88
  • 45. ▶ The optimal action-value function Q∗ is unknown ▶ We need to estimate Q∗ from observations New problem Based on observations, how can we estimate Q∗? ▶ Qt(s, a): Estimate of Q∗(s, a) at time t 40 / 88
  • 46. (Tabular) Q-learning [Watkins+1992] Updated estimate from observations ↓ Qt+1(St, At) . = Estimate at t ↓ Qt(St, At) + αt δt+1(Qt) δt+1(Qt) . = Rt+1 + γ max a′∈A Qt(St+1, a′ ) − Qt(St, At) ▶ (St, At, Rt+1, St+1): Observation ▶ 0 ≤ αt ≤ 1: (Small) non-negative number Before observation After observation Qt(St, At) Rt+1 + γ max a′∈A Qt(St+1, a′ ) αt = 0 αt = 1 δt+1(Qt) Time-difference error (If Qt = Q∗, δt+1 = 0 from the Bellman equation) 41 / 88
  • 47. Feature of Q-learning [Szepesv´ari2010, 3.3.1] “Under the assumption that every state-action pair is visited infinitely often, the sequence (Qt; t ≥ 0) converges to Q∗.” 42 / 88
  • 48. Markov decision process ▶ Agent: Centralized controller of all APs ▶ State: Channels and contention graph of APs ▶ Action: Channel selection of one AP ▶ Reward: Throughput CH 1 i Current state CH 1 CH 2 i Next state Action: AP i changes its channel to 2 [Nakashima+2019] 43 / 88
  • 49. State Space To estimate Q∗(s, a), every state-action pair should be visited When the number of states is huge, training is infeasible — State explosion Number of APs Number of channels Number of states 4 2 1024 10 3 2 · 1018 44 / 88
  • 50. Tabular Q-learning (p. 46) Qt+1(St, At) . = Qt(St, At) + αt δt+1(Qt) δt+1(Qt) . = Rt+1 + γ max a′∈A Qt(St+1, a′ ) − Qt(St, At) Q-learning with function approximation Extension to function approximation with a parameterized function Qθ, θ ∈ Rd θt+1 . = θt + αt δt+1(Qθt ) ∇θ Qθt (St, At) δt+1(Qθt ) . = Rt+1 + γ max a′∈A Qθt (St+1, a′ ) − Qθt (St, At) Deep reinforcement learning Q-learning with function approximation using deep neural networks 45 / 88
  • 51. Regression in supervised learning ▶ Model f is not given ▶ Samples {xi, yi}i∈{1,...,n} are available, where yi = f(xi) + noise ▶ Estimate the output for test input x ̸∈ {xi}i∈{1,...,n} x −−−−→ y ∈ ∈ Rd f −−−−→ R Approach ▶ Consider a form of parameterized function fθ to approximate f e.g., linear basis function model: fθ(x) . = b∑ j=1 θj basis function ϕj(x) = θ⊤ ϕ(x) ▶ θ is trained (estimated) using observed samples {xi, yi}i∈{1,...,n} e.g., estimation by minimizing a sum-of-squares error θ . = arg min θ n∑ i=1 ( fθ(xi) − yi )2 .= J(θ) loss function ▶ Estimation for test input x: y . = fθ (x) 46 / 88
  • 52. Training set n observations (xi, yi)i=1,...,n Learning algorithm θ . = arg min θ J(θ) New input x Estimated output fθ (x) Model fθ(·) Training Regression 47 / 88
  • 53. Models (parameterized functions) fθ(x) Learning algorithms Linear basis function model fθ(x) . = b∑ j=1 θj ϕj(x) = θ⊤ ϕ(x) Stochastic gradient descent Neural networks fθ(x) . = b∑ j=1 αj ϕ(x; βj) ϕ(x; β) . = 1 1 + exp(−x⊤w − γ) θ . = (αj, βj)j, β . = (w⊤, γ)⊤ Deep neural networks (CNN, LSTM) Deep neural networks offer better function approximation capabilities compared to shallow networks Backpropagation 48 / 88
  • 54. State Design and Feature Extraction [Nakashima+2019] State: Adjacency of APs and selected channel Physical expression CH 2 CH 1 2 1 3 4 5 Graph expression 2 1 3 4 5 Tabular expression 5x7 matrix       0 1 1 1 0 1 0 0 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0       [ 1 0 0 1 1 0 1 1 0 0 ] Ch 1 Ch 2 Feature extraction (pre-processing): Graph convolutional networks Convolutional neural network (CNN) Feature extraction for graphs Feature extraction for images 49 / 88
  • 55. State: 5x7 matrix input_1: InputLayer input: output: (None, 1, 5, 7) (None, 1, 5, 7) lambda_2: Lambda input: output: (None, 1, 5, 7) (None, 1, 5, 2) lambda_1: Lambda input: output: (None, 1, 5, 7) (None, 1, 5, 5) spectral_graph_convolution_1: SpectralGraphConvolution input: output: [(None, 1, 5, 2), (None, 1, 5, 5)] (None, 1, 5, 32) spectral_graph_convolution_2: SpectralGraphConvolution input: output: [(None, 1, 5, 32), (None, 1, 5, 5)] (None, 1, 5, 16) batch_normalization_1: BatchNormalization input: output: (None, 1, 5, 32) (None, 1, 5, 32) batch_normalization_2: BatchNormalization input: output: (None, 1, 5, 16) (None, 1, 5, 16) flatten_1: Flatten input: output: (None, 1, 5, 16) (None, 80) dense_1: Dense input: output: (None, 80) (None, 10) Q-values for 10 actions (5 APs x 2 CHs) e 50 / 88
  • 56. Libraries and model 1 import numpy as np 2 import tensorflow as tf 3 from keras.models import Model 4 from keras.layers import Dense, Activation, Flatten, Input, Reshape, Lambda, Concatenate, BatchNormalization→ 5 from rl.policy import EpsGreedyQPolicy # keras-rl 6 7 def get_model_functional(env, nb_actions, ap_number, ch_number): 8 inputs = Input(shape=(1,) + env.observation_space.shape) 9 adj = Lambda(lambda x: x[:, :, :, :ap_number], 10 output_shape=(1, ap_number, ap_number))(inputs) # adjacency matrix 11 ch = Lambda(lambda x: x[:, :, :, ap_number:], 12 output_shape=(1, ap_number, ch_number))(inputs) # channel matrix 13 # GraphConvolution is not provided in libraries 14 H = GraphConvolution(units=32, activation='relu', 15 kernel_initializer='he_normal')([ch, adj]) 16 H = BatchNormalization()(H) 17 H = GraphConvolution(units=16, activation='relu', 18 kernel_initializer='he_normal')([H, adj]) 19 H = BatchNormalization()(H) 20 H = Flatten()(H) 21 predictions = Dense(nb_actions, activation='linear', 22 kernel_initializer='he_normal')(H) 23 model = Model(inputs=inputs, outputs=predictions) 24 return model 51 / 88
  • 57. 0 200000 400000 600000 800000 Step 0.55 0.60 0.65 0.70 0.75 0.80 Reward ▶ Five APs are located uniformly and randomly ▶ Reward: The minimum individual BoE throughput of five APs Simplified version of [Nakashima+2019] 52 / 88
  • 58. End-to-End Learning of Proactive Handover Policy for Camera-Assisted mmWave Networks Using Deep Reinforcement Learning Yusuke Koda Kota Nakashima Koji Yamamoto Takayuki Nishio Masahiro Morikura Graduate School of Informatics, Kyoto University arXiv:1904.04585
  • 59. Image-to-decision proactive handover ▶ Determine one BS from two candidate BSs ▶ The output of NNs is Q-value for input images NN [16] (Combination of CNN and LSTM) Consequtive input images : Index of associated BS : Remaining time step until handover process is completed : Action Output layer Output of NN prior to output layer 54 / 88
  • 60. 12.2 12.4 12.6 12.8 13.0 13.2 13.4 13.6 13.8 14.0 050150250 Time (s) DatarateinBS1(Mbit/s) Data rate degradation Data rate of BS 1 12.2 12.4 12.6 12.8 13.0 13.2 13.4 13.6 13.8 14.0 40506070 Time (s) Actionvalue Action value for selecting BS 1 Action value for selecting BS 2 Handover to BS 2 Handover to BS 1 Q-value for selecting BSs 1 and 2 without camera images ▶ Because the received power degrades sharply, handover decision is conducted after degradation in general 55 / 88
  • 61. 12.2 12.4 12.6 12.8 13.0 13.2 13.4 13.6 13.8 14.0 40506070 Time (s) Actionvalue Action value for selecting BS 1 Action value for selecting BS 2 Handover to BS 2 Handover to BS 1 Q-value for selecting BSs 1 and 2 without camera images 12.2 12.4 12.6 12.8 13.0 13.2 13.4 13.6 13.8 14.0 1015202530 Time (s) Action−value Action value for selecting BS 1 Action value for selecting BS 2 Handover to BS 2 Handover to BS 1 Action value degradation ahead of blockage Q-value for selecting BSs 1 and 2 with camera images ▶ By extending the state space using camera images, proactive handover is enabled 56 / 88
  • 62. Our related works Deep reinforcement learning with graph convolution ▶ Channel allocation for WLANs [Nakashima+2019] We utilize ▶ Deep Q network (DQN) [Mnih+2015] ▶ Double Deep Q-network [vHasselt+2016] ▶ Dueling network [Wang+2015] ▶ Prioritized experience replay [Schaul+2015] Deep reinforcement learning with CNN ▶ Image-based proactive handover for mmWave WLANs [Koda+2018], [Koda+2019] 57 / 88
  • 63. References ▶ R. S. Sutton and A. G. Barto, Reinforcement Learning, 2nd ed. MIT Pr., 2018 ▶ C. Szepesv´ari, Algorithms for Reinforcement Learning. Morgan and Claypool Pub., 2010 ▶ K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath, “Deep reinforcement learning: A brief survey,” IEEE Signal Process. Mag., vol. 34, no. 6, Nov. 2017 58 / 88
  • 64. Summary (Part II) ▶ Reinforcement learning is used to acquire the optimal sequence to maximize the total reward. ▶ Q-learning can estimate the optimal action-value function Q∗. ▶ Deep neural networks are used for function approximation of Q function (deep RL). 59 / 88
  • 65. Part III Stochastic Geometry — Modeling and Analysis of WLANs 60 / 88
  • 66. Goal of performance modeling ▶ Points represent TXs or TX-RX pairs. ▶ Circle represents carrier sense area (contention domain). We would like to evaluate the following performances without Monte Carlo simulations to ensure reproducibility 1. Individual throughput in a specific topology — Graph theory Part 1 2. Average throughput over possible topologies — SG 3. Distributions of interference and SIR — SG 61 / 88
  • 67. 0.51 0.31 0.43 0.69 0.09 0.23 0.27 0.27 0.62 0.43 0.65 0.57 0.11 0.60 0.36 0.43 All points Potential TXs Poisson point process (PPP) Φ Numbers Backoff counter Uniform distribution Circles Contention domain Points with disks Transmitting TXs Mat´ern hardcore point pro- cess (MHCPP) ΦM 62 / 88
  • 68. Poisson point process (PPP) Φ Φ . = {x} A set of locations of random points (point process) Φ(B) A r.v. representing the number of points of Φ in region B, i.e., Φ(B) . = #{Φ ∩ B} Φ is a set of independent random points with homogeneous density λ ⇐⇒ Φ is a PPP with intensity (density) λ ⇐⇒ Φ(B) ∼ Pois(λ|B|), i.e., Poisson distribution with intensity λ|B| ⇐⇒ P ( Φ(B) = n ) = e−λ|B|(λ|B|)n n! , n = 0, 1, . . . λ = 1, area of square grid is 1, total area is 40 63 / 88
  • 69. Probabilities derived assuming PPP ▶ Probability that Φ has no point in B P ( Φ(B) = 0 ) = e−λ|B| ▶ Probability that Φ has at least one point in B P ( Φ(B) ≥ 1 ) = 1 − e−λ|B| Φ is a PPP with intensity λ ⇐⇒ Φ(B) ∼ Pois(λ|B|) ⇐⇒ P ( Φ(B) = n ) = e−λ|B|(λ|B|)n n! , n = 0, 1, . . . 64 / 88
  • 70. Generating Realizations of PPP Both codes generates the figure on p. 68 B . = { (x, y) | 0 ≤ x ≤ 10, 0 ≤ y ≤ 4 }, i.e., |B| = 40, and λ = 1 General way in R or other languages n <- rpois(1, lambda=40) x <- runif(n, min=0,max=10) y <- runif(n, min=0,max=4) plot(x,y) 1. For region B, generating a realization n of Poisson random variable with expectation λ|B| (= lambda in the code) 2. Generating n uniformly distributed points in region B Package “spatstat” [Baddeley+2015] in R P <- rpoispp(lambda=1, win=owin(xrange=c(0,10), yrange=c(0,4))) plot(P) For more complicated point processes, this package is useful. 65 / 88
  • 71. CSMA 0.51 0.31 0.43 0.69 0.09 0.23 0.27 0.27 0.62 0.43 0.65 0.57 0.11 0.60 0.36 0.43 Points Potential TXs Poisson point process Φ Numbers Backoff counter m(x) Uniform distributed r.v. 〇 Contention domain b(x, r) Disc with radius r ● Contended TXs Φ ∩ b(x, r) ● Simultaneous TXs Mat´ern hardcore point process ΦM → Interference RXs are omitted in figures. 66 / 88
  • 72. Contention domain 〇 b(x, r) ▶ Disc with radius r centered at x ▶ Model of contention domain of x ∈ Φ, i.e., area where the received power is grater than the carrier sense threshold 67 / 88
  • 73. Contended TXs ● Locations of contended TXs with TX x Φ ∩ b(x, r) Number of Contended TXs with TX x Φ(b(x, r)) ∼ Pois(λ|b(x, r)|) = Pois(λπr2) 68 / 88
  • 74. Backoff and simultaneous TXs 0.51 0.31 0.43 0.69 0.09 0.23 0.27 0.27 0.62 0.43 0.65 0.57 0.11 0.60 0.36 0.43 m(x) ▶ “Mark” ▶ Uniformly distributed r.v. on [0, 1] for each point x ∈ Φ ▶ Model of backoff counter ● MHCPP ΦM . = { x ∈ Φ : m(x) < m(y) for all y ∈ Φ ∩ b(x, r) {x} } ▶ TXs with smallest backoff counter in their contention domains ▶ Model of a set of simultaneous TXs 69 / 88
  • 75. Expected normalized throughput (medium access probability) P(x ∈ Φ : x ∈ ΦM) = 1−e−λπr2 λπr2 ▶ Probability that a point in PPP retains in MHCPP ▶ Probability that a potential TX has the smallest backoff counter, in other words, no neighboring TXs have smaller backoff counter, in its contention domain, and thus can transmit P(x ∈ Φ : x ∈ ΦM) = P ( #{ z ∈ Φ ∩ b(x, r) | m(z) < m(x) } = 0 ) Conditioning on m(x) = t and considering ˜Φ . = { z ∈ Φ | m(z) < t } ▶ TXs with backoff counter less than t ▶ Intensity of ˜Φ is tλ ▶ ˜Φ(b(x, r)) ∼ Pois(tλ|b(x, r)|) = Pois(tλπr2) P ( #{ z ∈ Φ ∩ b(x, r) | m(z) < t } = 0 ) = P ( ˜Φ(b(x, r)) = 0 ) = e−tλπr2 P ( #{ z ∈ Φ ∩ b(x, r) | m(z) < m(x) } = 0 ) = ∫ 1 0 e−tλπr2 dt = 1 − e−λπr2 λπr2 [Baccelli+2009, 18] derived medium access probability taking into account fading 70 / 88
  • 76. Intensity of MHCPP ▶ Intensity of MHCPP λM . = Intensity of potential TXs λ · 1 − e−λπr2 λπr2 = 1 − e−λπr2 πr2 ▶ In ultra dense networks, lim λ→∞ λM = 1 πr2 , The intensity of simultaneous transmissions only depends on the area of contention domain. 71 / 88
  • 77. Analysis of Inversely Proportional Carrier Sense Threshold and Transmission Power Setting Koji Yamamoto1 Xuedan Yang1 Takayuki Nishio1 Masahiro Morikura1 Hirantha Abeysekera2 1 Graduate School of Informatics, Kyoto University 2 NTT Access Network Service Systems Laboratories, NTT Corporation IEEE CCNC 2017, Jan. 2017
  • 78. Carrier Sense Threshold Setting for Spatial Reuse ▶ 〇 Contention domain: RxPower > CST ▶ ● Contended TXs ▶ ● Simultaneous TXs ▶ → Interference Increasing Carrier Sense Threshold (CST) ▶ ● reduces ▶ → increases In densely deployed WLANs, CST adjustment is crucial However, CST setting can cause throughput starvation RXs are omitted in figures. 73 / 88
  • 79. -82 0 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 -90 -80 -70 -60 -50 -40 -30 -20 -10 0 10 20 AP1 AP2 Receivedpower(dBm) Position (m) p1=13 dBm p2=13 dBm θ1=-82 dBm θ2=-82 dBm ▶ Received signal power > threshold, θx, i.e., two APs detect signals from each other. ▶ Two APs time-share the channel. 74 / 88
  • 80. -82 0 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 -90 -80 -70 -60 -50 -40 -30 -20 -10 0 10 20 AP1 AP2 Receivedpower(dBm) Position (m) p1=13 dBm p2=13 dBm θ1=-82 dBm θ2=-82 dBm +34 dB ▶ AP2 increases θ2 not to detect signals from AP1, then AP2 would transmit independently of AP1 ▶ AP1 should defer its transmission during transmission of AP2 → Starvation due to asymmetric carrier sensing relationship [Mhatre+2007] 75 / 88
  • 81. -82 0 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 -90 -80 -70 -60 -50 -40 -30 -20 -10 0 10 20 AP1 AP2 Receivedpower(dBm) Position (m) p1=13 dBm p2=13 dBm -34 dB θ1=-82 dBm θ2=-82 dBm +34 dB ▶ Reduction in transmission power along with increase in CST [Fuemmeler+2006; Mhatre+2007] Inversely proportional setting (IPS) of CST and transmission power ▶ Symmetric carrier sensing relationship is maintained and APs spatially reuse the channel 76 / 88
  • 82. Numerical example: Expected throughput ▶ Carrier sense range is a function of CST and transmission power 0 1 2 3 0 2 4 6 8 10 12 14 16 n=4 n=10 Throughput(bit/s/Hz) ax= θx /Θ (dB) Throughput Asymptotic throughput Approximation SIR = 30 dB α = 3.5 ▶ The optimal value of CST ax assuming approximation is close to that without approximation ▶ [Iwata+2019] extend the work under nearest AP association M. Iwata, K. Yamamoto, B. Yin, T. Nishio, M. Morikura, and H. Abeysekera, “Analysis of inversely proportional carrier sense threshold and transmission power setting based on received power for IEEE 802.11ax,” in Proc. IEEE Consum. Commun. Netw. Conf. (CCNC), Las Vegas, NV, USA, Jan. 2019 77 / 88
  • 83. Goal of performance modeling ▶ Points represent TXs or TX-RX pairs. ▶ Circle represents carrier sense area (contention domain). We would like to evaluate the following performances without Monte Carlo simulations to ensure reproducibility 1. Individual throughput in a specific topology — Graph theory Part 1 2. Average throughput over possible topologies — SG 3. Distributions of interference and SIR — SG 78 / 88
  • 84. SG Analysis of CSMA ▶ [Nguyen+2007; Baccelli+2009] MHCPP type II for CSMA modeling ▶ [Alfano+2011; Alfano+2014] Distribution of throughput ▶ [Busson+2009] Intensity underestimation of MHCPP ▶ [Kaynia+2011] Optimization of carrier sensing threshold ▶ [ElSawy+2012; ElSawy+2013] Modified hard-core point process ▶ [Venkataraman+2006] Interference distribution from PPP outside the contention domain Approximating MHCPP by PPP with intensity λMHCPP [ElSawy+2012] 79 / 88
  • 85. SIR distribution in Poisson cellular networks [Andrews+2011] ▶ Positions of base stations (BSs) follows a Poisson point process (PPP) ▶ Nearest BS association (Poisson-Voronoi cells) ▶ Rayleigh fading ▶ Exponential-decaying path loss with α > 2 Spatial distribution of SIR in downlink can be derived w/o fading P(SIR > θ) = · · · = 1 1 + 2θ α−2 2F1 ( 1, 1 − 2 α ; 2 − 2 α ; −θ ) = √ θ arctan √ θ (α=4) Closed form when α = 4 0 1 −20 −10 0 10 20 30 P(SIR≤θ) θ (dB) −20 −10 0 10 20 30 with fading 80 / 88
  • 86. Expected sum — Campbell’s theorem for sums [Haenggi2012, §4.2] Let Φ be a PPP on R2 with intensity λ, and f : R2 → R be a measurable function. Then, EΦ [ ∑ x∈Φ ex. interference from x to origin ↓ f(x) ex. sum of interference from Φ ] = λ ∫ R2 f(x) dx ▶ Expected sum can be evaluated as an area integral ▶ Only applicable for PPPs. Thus, PPPs are often assumed 81 / 88
  • 87. Stochastic Geometry Analysis of Normalized SNR Scheduling in Downlink Cellular Networks Takuya Ohto† Koji Yamamoto† Seong-Lyun Kim‡ Takayuki Nishio† Masahiro Morikura† †Kyoto University ‡Yonsei University IEEE Wireless Communications Letters, Aug. 2017
  • 88. 1) Round-robin scheduler A user is selected in rotation independent of the channel state. 2) Proportional fair (PF) scheduler A user with the largest instantaneous rate normalized by the short-term aver- age value, arg max i ratei E[ratei | ri] 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 Normalizeddatarate 3) Normalized SNR scheduler A user with the largest instantaneous SNR normalized by the short-term aver- age value, arg max i SNRi E[SNRi | ri] = arg max i hi 0 1 2 3 4 5 6 NormalizedSNR SNRi . = hiri −α /σ2 Short-term average SNR: E[SNRi | ri] = ri −α /σ2 83 / 88
  • 89. SIR distributions (success prob. / coverage prob.) [Andrews+2011] SIR ccdf under Rayleigh fading channel in Poisson cellular networks P(SIR > θ) = 1 1 + 2θ α−2 2F1 ( 1, 1 − 2 α ; 2 − 2 α ; −θ ) √ θ arctan √ θ (when α = 4) α: path loss exponent User is selected randomly independent of channel state. This performance can be achieved under round-robin scheduling. Normalized SNR scheduling [Ohto+2017] P(SIR > θ) ≈ ∞∑ n=0 (λu/cλb)n c B(n + 1, c) (λu/cλb + 1)n+c+1 × n+1∑ k=1 (n+1 k ) (−1)k+1 1 + 2kθ α−2 2F1 ( 1, 1 − 2 α ; 2 − 2 α ; −kθ ) √ kθ arctan √ kθ (when α = 4) λb: intensity of BSs, λu: intensity of users, c = 3.5 84 / 88
  • 90. SIR ccdf (Interference-limited, Rayleigh, and α = 4) [Ohto+2017] 0.0 0.2 0.4 0.6 0.8 1.0 −10 −5 0 5 10 15 20 λu/λb = 4 λu/λb = 8P(SIR>θ) θ (dB) Normalized SNR (Analysis) (Simulation) Round-robin (Analysis) Round-robin [Andrews+2011] P(SIR > θ) = 1 1 + √ θ arctan √ θ Normalized SNR P(SIR⋆ > θ) ≈ ∞∑ n=0 fN (n) n+1∑ k=1 (n+1 k ) (−1)k+1 1 + √ kθ arctan √ kθ 85 / 88
  • 91. Related works Stochastic geometry analysis of channel-adaptive user scheduling in cellular networks Downlink ▶ [Ohto+2017] T. Ohto, K. Yamamoto, S.-L. Kim, T. Nishio, and M. Morikura, “Stochastic geometry analysis of normalized SNR-based scheduling in downlink cellular networks,” IEEE Wireless Commun. Lett., vol. 6, no. 4, Aug. 2017 ▶ [Yamamoto2018] K. Yamamoto, “SIR distribution and scheduling gain of normalized SNR scheduling in Poisson networks,” in Proc. Int. Symp. Model. Optim. Mobile Ad Hoc Wireless Netw. (WiOpt), Shanghai, China, May 2018 Uplink ▶ [Kamiya+2018] S. Kamiya, K. Yamamoto, S.-L. Kim, T. Nishio, and M. Morikura, “Asymptotic analysis of normalized SNR-based scheduling in uplink cellular networks with truncated channel inversion power control,” in Proc. IEEE Int. Conf. Commun. (ICC), Kansas, MO, USA, May 2018 86 / 88
  • 92. References ▶ M. Haenggi, Stochastic Geometry for Wireless Networks. Cambridge, U.K.: Cambridge Univ. Press, 2012 ▶ F. Baccelli and B. Blaszczyszyn, “Stochastic geometry and wireless networks: Volume II applications,” Found. Trends Netw., vol. 4, no. 1-2, 2009 87 / 88
  • 93. Summary (Part III) Stochastic geometry modeling ▶ Poisson point process — Potential TXs ▶ Mat´ern CSMA point process — Simultaneous TXs ▶ Distribution of interference and SIR Applications ▶ Joint carrier sense threshold and transmission power setting ▶ User scheduling 88 / 88
  • 94. References I [Alfano+2011] G. Alfano, M. Garetto, and E. Leonardi, “New insights into the stochastic geometry analysis of dense CSMA networks,” in Proc. IEEE Int. Conf. Comput. Commun. (INFOCOM), Shanghai, China, Apr. 2011. [Alfano+2014] ——,“New directions into the stochastic geometry analysis of dense CSMA networks,” IEEE Trans. Mobile Comput., vol. 13, no. 2, Feb. 2014. [Andrews+2011] J. G. Andrews, F. Baccelli, and R. K. Ganti, “A tractable approach to coverage and rate in cellular networks,” IEEE Trans. Commun., vol. 59, no. 11, Nov. 2011. [Arulkumaran+2017] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath, “Deep reinforcement learning: A brief survey,” IEEE Signal Process. Mag., vol. 34, no. 6, Nov. 2017. [Baccelli+2009] F. Baccelli and B. Blaszczyszyn, “Stochastic geometry and wireless networks: Volume II applications,” Found. Trends Netw., vol. 4, no. 1-2, 2009. 89 / 88
  • 95. References II [Baddeley+2015] A. Baddeley, E. Rubak, and R. Turner, Spatial Point Patterns: Methodology and Applications with R. CRC Pr., 2015. [Bianchi2000] G. Bianchi, “Performance analysis of the IEEE 802.11 distributed coordination function,” IEEE J. Sel. Areas Commun., vol. 18, no. 3, Mar. 2000. [Busson+2009] A. Busson and G. Chelius, “Point processes for interference modeling in CSMA/CA ad-hoc networks,” in Proc. the 6th ACM Symp. Perform. Eval. Wireless Ad Hoc, Sensor, Ubiquitous Netw. (PE-WASUN), Canary Islands, Spain, Oct. 2009. [ElSawy+2012] H. ElSawy, E. Hossain, and S. Camorlinga, “Characterizing random CSMA wireless networks: A stochastic geometry approach,” in Proc. IEEE Int. Conf. Commun. (ICC), Ottawa, ON, Jun. 2012. 90 / 88
  • 96. References III [ElSawy+2013] H. ElSawy and E. Hossain, “A modified hard core point process for analysis of random CSMA wireless networks in general fading environments,” IEEE Trans. Commun., vol. 61, no. 4, Apr. 2013. [Fuemmeler+2006] J. A. Fuemmeler, N. H. Vaidya, and V. V. Veeravalli, “Selecting transmit powers and carrier sense thresholds in csma protocols for wireless ad hoc networks,” in Proceedings of the 2nd annual international workshop on Wireless internet - WICON ’06, New York, New York, USA, Aug. 2006. [Garetto+2008] M. Garetto, T. Salonidis, and E. W. Knightly, “Modeling per-flow throughput and capturing starvation in CSMA multi-hop wireless networks,” IEEE/ACM Trans. Netw., vol. 16, no. 4, Aug. 2008. [Haenggi2012] M. Haenggi, Stochastic Geometry for Wireless Networks. Cambridge, U.K.: Cambridge Univ. Press, 2012. 91 / 88
  • 97. References IV [Iwata+2019] M. Iwata, K. Yamamoto, B. Yin, T. Nishio, M. Morikura, and H. Abeysekera, “Analysis of inversely proportional carrier sense threshold and transmission power setting based on received power for IEEE 802.11ax,” in Proc. IEEE Consum. Commun. Netw. Conf. (CCNC), Las Vegas, NV, USA, Jan. 2019. [Kamiya+2018] S. Kamiya, K. Yamamoto, S.-L. Kim, T. Nishio, and M. Morikura, “Asymptotic analysis of normalized SNR-based scheduling in uplink cellular networks with truncated channel inversion power control,” in Proc. IEEE Int. Conf. Commun. (ICC), Kansas, MO, USA, May 2018. [Kaynia+2011] M. Kaynia, N. Jindal, and G. E. Oien, “Improving the performance of wireless ad hoc networks through MAC layer design,” IEEE Trans. Wireless Commun., vol. 10, no. 1, Jan. 2011. [Khorov+2019] E. Khorov, A. Kiryanov, A. Lyakhov, and G. Bianchi, “A tutorial on IEEE 802.11ax high efficiency WLANs,” IEEE Commun. Surveys Tuts., vol. 21, no. 1, 2019. 92 / 88
  • 98. References V [Koda+2018] Y. Koda, K. Yamamoto, T. Nishio, and M. Morikura, “Reinforcement learning based predictive handover for pedestrian-aware mmWave networks,” in Proc. IEEE Int. Conf. Comput. Commun. Workshops (INFOCOM Workshops), Honolulu, HI, USA, Apr. 2018. [Koda+2019] Y. Koda, K. Nakashima, K. Yamamoto, T. Nishio, and M. Morikura, “End-to-end learning of proactive handover policy for camera-assisted mmWave networks using deep reinforcement learning,” arXiv preprint arXiv:1904.04585, Apr. 2019. [Liew+2010] S. C. Liew, C. H. Kai, H. C. Leung, and P. Wong, “Back-of-the-envelope computation of throughput distributions in CSMA wireless networks,” IEEE Trans. Mobile Comput., vol. 9, no. 9, Sep. 2010. [Mhatre+2007] V. P. Mhatre, K. Papagiannaki, and F. Baccelli, “Interference mitigation through power control in high density 802.11 WLANs,” in Proc. IEEE Int. Conf. Comput. Commun. (INFOCOM), Anchorage, AK, May 2007. 93 / 88
  • 99. References VI [Mnih+2015] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis, “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, Feb. 2015. [Nakashima+2019] K. Nakashima, S. Kamiya, K. Ohtsu, K. Yamamoto, T. Nishio, and Masahiro Morikura, “Deep reinforcement learning-based channel allocation for wireless LANs with graph convolutional networks,” arXiv preprint arXiv:1905.07144, May 2019. [Nguyen+2007] H. Q. Nguyen, F. Baccelli, and D. Kofman, “A stochastic geometry analysis of dense IEEE 802.11 networks,” in Proc. IEEE Int. Conf. Comput. Commun. (INFOCOM), Anchorage, AK, May 2007. 94 / 88
  • 100. References VII [Ohto+2017] T. Ohto, K. Yamamoto, S.-L. Kim, T. Nishio, and M. Morikura, “Stochastic geometry analysis of normalized SNR-based scheduling in downlink cellular networks,” IEEE Wireless Commun. Lett., vol. 6, no. 4, Aug. 2017. [Schaul+2015] T. Schaul, J. Quan, I. Antonoglou, and D. Silver, “Prioritized experience replay,” in ICLR, Sna Juan, Puerto Rico, May 2015. [Sutton+2018] R. S. Sutton and A. G. Barto, Reinforcement Learning, 2nd ed. MIT Pr., 2018. [Szepesv´ari2010] C. Szepesv´ari, Algorithms for Reinforcement Learning. Morgan and Claypool Pub., 2010. [Venkataraman+2006] J. Venkataraman, M. Haenggi, and O. Collins, “Shot noise models for outage and throughput analyses in wireless ad hoc networks,” in Proc. IEEE Military Commun. Conf. (MILCOM), Oct. 2006. 95 / 88
  • 101. References VIII [vHasselt+2016] H. van Hasselt, A. Guez, and D. Silver, “Deep reinforcement learning with double q-learning,” in 30th AAAI Conf. Artificial Intelligence, Mar. 2016. [Wang+2015] Z. Wang, T. Schaul, M. Hessel, H. van Hasselt, M. Lanctot, and N. de Freitas, “Dueling network architectures for deep reinforcement learning,” arXiv preprint arXiv:1511.06581, Nov. 2015. [Watkins+1992] C. J. C. H. Watkins and P. Dayan, “Q-learning,” Mach. Learn., vol. 8, no. 3-4, May 1992. [Yamamoto2015] K. Yamamoto, “A comprehensive survey of potential game approaches to wireless networks,” IEICE Trans. Commun., vol. E98-B, no. 9, Sep. 2015. [Yamamoto2018] ——,“SIR distribution and scheduling gain of normalized SNR scheduling in Poisson networks,” in Proc. Int. Symp. Model. Optim. Mobile Ad Hoc Wireless Netw. (WiOpt), Shanghai, China, May 2018. 96 / 88
  • 102. References IX [Yin+2017] B. Yin, S. Kamiya, K. Yamamoto, T. Nishio, M. Morikura, and H. Abeysekera, “Mitigating throughput starvation in dense WLANs through potential game-based channel selection,” IEICE Trans. Fundamentals, vol. E100-A, no. 11, Nov. 2017. Some figures in this slide are licensed to speakers only. 97 / 88