Sources of Latency, Jitter, and (Frame) Loss
[Figure: end-to-end pipeline of latency sources. First mile (server): send input → game simulation → game render → capture/encode latency, with NVIDIA Reflex and streaming at up to 240 FPS. Network: transit/peering between server and client (the focus of this talk). Last mile (client): decode latency → post-process → present latency, with adaptive VSync & de-jitter.]
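The figure's decomposition implies a simple additive click-to-photon budget. A minimal sketch of that accounting, where the stage names follow the figure but every millisecond value is a made-up placeholder rather than a GFN measurement:

```python
# Per-frame, click-to-photon latency budget implied by the pipeline above.
# Stage names follow the figure; all numbers are illustrative placeholders.
FIRST_MILE_MS = {"send_input": 1, "game_simulation": 8,
                 "game_render": 4, "capture_encode": 3}
NETWORK_MS = {"transit_peering": 15}            # focus of this talk
LAST_MILE_MS = {"decode": 2, "post_process": 1, "present": 4}

total_ms = sum({**FIRST_MILE_MS, **NETWORK_MS, **LAST_MILE_MS}.values())
print(f"click-to-photon: {total_ms} ms")        # -> 38 ms with these placeholders
```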
Network Latency Impacts CSAT
● "(Input) lag is a common reason some gamers reject cloud services"
● "There are many factors that go into your cloud gaming experience, but by far input latency is one of the biggest"
● "After the issues around game ownership, the biggest concern with game streaming technology is input latency."
● "Lag is one of the banes of many gamers' existence"
N.B.: Jitter, packet loss, and throughput obviously matter too …
[Chart: CSAT impact of network latency, ranging from 0% down to -20%.]
Network Metrics for QoE Adaptation
● Loss, OWD, RTD, BWE, CE marks, and dozens of other network metrics are collected at per-frame frequency for QoS adaptation (see the sketch below)
[Figure: impairments on the path from the streaming servers to the client, and their mitigations: predictable losses → FEC; bursty + constant losses → NACK + FEC; bottlenecks / bursty losses → route adaptation; bandwidth throttling → rate limiting and packet pacing; queue build-up → loss-, OWD-, and BWE-based congestion control; when frame recovery fails → IDR request or error concealment. Src: YouTube]
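As a rough illustration of how per-frame telemetry can drive the mitigations in the figure, here is a minimal sketch. The record fields mirror the metrics named above; the thresholds and the decision function are hypothetical, not GFN's actual logic:

```python
from dataclasses import dataclass

@dataclass
class FrameStats:
    """Network metrics sampled once per video frame (as named above)."""
    owd_ms: float        # one-way delay
    rtd_ms: float        # round-trip delay
    loss_pct: float      # packet loss around this frame
    bwe_kbps: float      # bandwidth estimate
    ce_fraction: float   # share of CE-marked packets (L4S)

def repair_action(s: FrameStats, frame_recovered: bool) -> str:
    """Map per-frame telemetry to one of the figure's mitigations.
    Thresholds are illustrative placeholders."""
    if not frame_recovered:
        return "IDR_REQUEST_OR_CONCEAL"  # resync the stream / hide the artifact
    if s.rtd_ms < 30 and s.loss_pct < 1.0:
        return "NACK"                    # retransmission is cheap on short RTTs
    return "RAISE_FEC"                   # predictable loss: add redundancy instead
```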
Connectivity Issues
30+ data centers, 100+ countries
● Increasing coverage shortens DC-to-client path lengths and cuts down latency
● Careful selection of peers with a high transit degree helps too
● Bottlenecks between transit ISPs are out of our control
Middle-Mile: QoE-Aware Routing
[Scatter plot: streaming rate [kbps] vs. average RTD [ms]. Sessions with low, consistent RTD and high streaming rate get great ratings, mid-range sessions get good ratings, and high-latency sessions get poor ratings.]
→ Observations:
● Minimum latency matters, but user ratings are skewed towards consistent latency & (high) video quality
● Preferring the shortest AS_PATH is useful, but QoE-aware route selection can do better
● Problem 1: bottlenecks between transit / transit+eyeball ISPs
→ Optimization objective (a sketch follows below):
1. Model the network score across networking (latency, loss, rate targets …) & goodput indicators (stream rate)
2. Stream over peer links that maximize the score
3. Update the state dynamically
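A minimal sketch of what steps 1 and 2 of the objective could look like; the field names, weights, and normalization constants are hypothetical illustrations, not GFN's actual model:

```python
from dataclasses import dataclass

@dataclass
class PathStats:
    transit: str             # candidate egress / peer link
    rtd_ms: float            # average round-trip delay over this egress
    loss_pct: float
    rate_kbps: float         # achieved streaming rate (goodput indicator)
    rate_target_kbps: float

def network_score(p: PathStats, w_lat=0.4, w_loss=0.3, w_rate=0.3) -> float:
    """Step 1: fold latency, loss, and goodput into one score in [0, 1].
    Weights and normalizations are illustrative placeholders."""
    lat = max(0.0, 1.0 - p.rtd_ms / 100.0)           # >=100 ms RTD scores 0
    loss = max(0.0, 1.0 - p.loss_pct / 5.0)          # >=5% loss scores 0
    rate = min(1.0, p.rate_kbps / p.rate_target_kbps)
    return w_lat * lat + w_loss * loss + w_rate * rate

def pick_egress(paths: list[PathStats]) -> PathStats:
    """Step 2: stream over the peer link that maximizes the score.
    Step 3 amounts to re-running this as fresh telemetry arrives."""
    return max(paths, key=network_score)
```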
QoE-Aware Routing: Results
[Figure: guided transit switch (TRANSIT 1 → TRANSIT 2) for a top-3 eyeball ISP in the EU: ~28% latency decrease and ~+19% streaming bitrate increase. The system also reacts to transit anomalies: when capacity between two non-NVIDIA peers is exhausted, traffic is re-routed.]
Now auto-steering traffic to 30k+ unique eyeball ISPs (monthly) from 30+ GFN DCs.
Last-Mile: L4S for Cloud Streaming
→ Problem 2: handling impairments in the user's network (bufferbloat, packet loss …)
● L4S [RFC9332] addresses bufferbloat by letting the sender react to queue build-up faster than black-box end-to-end queue build-up estimation allows
● Use of L4S requires a compliant TX/RX and network (ECT(1) marking, CE feedback, on-path AQM, and a new CC)
[Figure: the streamer marks packets ECT(1); the congested network assigns them to a low-latency (LL) queue separate from other traffic and re-marks them CE under load; the client's feedback confirms the % of CE-marked packets.]
● CloudXR 4.0 SDK has initial L4S CC support
● PoC L4S support in GeForce NOW is in evaluation (a CE-feedback sketch follows below)
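To make the role of CE feedback concrete, here is a minimal sketch of a scalable, DCTCP/Prague-style sender response in the spirit of RFC 9332: back off in proportion to the reported CE fraction instead of halving on any loss. The rate bounds and probe step are hypothetical, and this is not the CloudXR or GFN implementation:

```python
def adapt_rate(rate_kbps: float, ce_fraction: float,
               min_kbps: float = 5_000, max_kbps: float = 75_000,
               probe_kbps: float = 500) -> float:
    """One adaptation step per feedback interval (~1 RTT).

    ce_fraction: share of packets the client reported as CE-marked.
    Scalable response: multiplicative decrease proportional to the
    marking level, gentle additive probing when the path is clean.
    """
    if ce_fraction > 0.0:
        rate_kbps *= 1.0 - ce_fraction / 2.0
    else:
        rate_kbps += probe_kbps
    return max(min_kbps, min(max_kbps, rate_kbps))
```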
Results: L4S for Cloud Streaming
DOCSIS Operator 1 (US/upstream L4S): -24% avg RTD, -85.2% P99 RTD (vs. single-queue AQM on US)
DOCSIS Operator 2 (DS+US L4S): -72.4% RTD, -70% P99.9 RTD (vs. single-queue AQM on DS+US)
● On a shared link with an unresponsive TCP sender and extreme congestion, L4S significantly improves latency and loss
● The added CE feedback is invaluable for a faster sender turnaround
● Alternatives: sch_fq_codel / sch_cake would also significantly improve the experience during congestion
Results: L4S for Cloud Streaming
● Consistent* ECT(1) delivery for ~94% of streaming sessions under test (Dec 2023)
● The bleaching issue is more pronounced on a few T1 and client ISPs
● Root cause: configuration or vendor equipment issues (a detection sketch follows below)
[Figure: ECT(1) delivery per path across T1 ISPs #1-#4 and access ISPs (cable ISPs #1-#5, fiber-optic ISPs #1-#4, wireless ISP #1).]
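One way to spot bleaching like this from the client side is to read back the ECN bits of received packets. A Linux-only sketch, assuming the sender marked its packets ECT(1); the port number is hypothetical:

```python
import socket

ECN_MASK, ECT1, CE = 0x03, 0b01, 0b11
# IP_RECVTOS may be missing from the socket module on some builds; 13 on Linux.
IP_RECVTOS = getattr(socket, "IP_RECVTOS", 13)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, IP_RECVTOS, 1)  # deliver TOS as ancillary data
sock.bind(("0.0.0.0", 49005))                      # hypothetical stream port

counts = {"ect1": 0, "ce": 0, "bleached": 0}
for _ in range(10_000):                            # sample a window of packets
    _, ancdata, _, _ = sock.recvmsg(2048, 64)
    for level, ctype, payload in ancdata:
        if level == socket.IPPROTO_IP and ctype == socket.IP_TOS:
            ecn = payload[0] & ECN_MASK
            if ecn == ECT1:
                counts["ect1"] += 1
            elif ecn == CE:
                counts["ce"] += 1                  # marked: ECT(1) survived too
            else:
                counts["bleached"] += 1            # ECT(1) stripped on path

print(counts)
```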
Cloud Streaming Latency: Selected Topics
Ermin Sakic, NVIDIA
Monday, 11th December 2023
Try it out and give us feedback!
nvidia.com/geforce-now
developer.nvidia.com/cloudxr-sdk