CURIOUS: HOW JIOCINEMA HANDLED 32M CONCURRENT USERS
A technical deep dive into streaming video content at scale
Presented by: Ansal KA
JIOCINEMA’S MULTI-CAM IPL STREAMING
 JioCinema is an Indian over-the-top (OTT) media platform offering video streaming. Since 2023, JioCinema has held the exclusive digital rights to broadcast the Indian Premier League (IPL) in India.
 With multiple cameras available on the ground, viewers can now switch to their preferred camera angle while watching the IPL.
 These include the spider-cam view, batsman view, bird's-eye view, wicketkeeper view, and the recently added Hero Cam.
 The camera feeds are produced and mixed in Production Control Rooms (PCRs).
JIOCINEMA STATISTICS
The platform reached 26 billion views during the IPL 2024 season, a 53% increase over the previous year.
During the IPL 2023 final, when Dhoni came out to bat, JioCinema hit 22 million concurrent viewers.
INTRODUCTION
Challenge:
Managing the demands of
over 32 million concurrent
viewers during key IPL
moments (e.g., finals).
Goal:
Zero downtime, uninterrupted
streams, and seamless user
experiences.
Solution Focus:
Comprehensive audits,
scalable infrastructure, CDN
optimization, and
personalized content
management.
Key Points Covered:
CDN management
Backend scaling
Cache strategies
Panic Mode readiness
BASIC FLOW
FRONTEND OPTIMIZATION
 Feature Flags:
• Implemented for controlled feature rollouts, so issues are caught early.
• Features can be rolled back instantly if they cause app crashes or performance drops.
• Enhances agility in handling real-time issues without impacting the entire user base.
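The rollout-and-rollback idea above can be sketched as a simple flag gate. This is a minimal illustration, not JioCinema's actual implementation: the class, flag name, and renderer are all hypothetical, and a real system would fetch flags from a remote config service.

```python
# Minimal feature-flag gate: flipping a flag off in remote config
# rolls the feature back for all clients without a redeploy.

class FeatureFlags:
    def __init__(self, remote_config):
        self._flags = dict(remote_config)   # e.g. fetched at startup

    def update(self, remote_config):
        """Called when the config service pushes new values."""
        self._flags.update(remote_config)

    def is_enabled(self, name, default=False):
        # Unknown flags default to off: a safe fallback
        return self._flags.get(name, default)

flags = FeatureFlags({"multi_cam_selector": True})

def render_player(flags):
    if flags.is_enabled("multi_cam_selector"):
        return "player_with_camera_picker"
    return "player_default"                 # degraded but working UI

print(render_player(flags))                 # feature on
flags.update({"multi_cam_selector": False}) # instant rollback
print(render_player(flags))                 # feature off
```

The key property is that the fallback path (`player_default`) always exists, so disabling a flag can never itself break the app.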
FRONTEND OPTIMIZATION
 Scenario Simulations:
• Tools like Charles Proxy are used to simulate different network
conditions (low bandwidth, high latency).
• Real-world failure scenarios such as DNS failures, API call timeouts,
and backend errors are tested to ensure preparedness.
FRONTEND OPTIMIZATION
 Graceful Degradation:
• When certain features fail, default content is shown (e.g., a
fallback home screen UI).
• Features are classified as Core, P1, and P2, so the most critical
ones are protected first.
• Errors are hidden from the user so the experience stays uninterrupted.
• Failed API calls are retried with exponential backoff to avoid
overloading the servers.
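The fallback pattern can be sketched as a try/except around the non-critical (P2) path. Everything here is illustrative: the function names and the default payload are invented, and the failure is simulated.

```python
# Graceful degradation: if a non-core feature (e.g. personalization)
# fails, silently serve default content instead of surfacing an error.

DEFAULT_HOME = {"rails": ["Trending", "IPL Highlights"]}  # generic fallback

def fetch_personalized_home(user_id):
    # Simulated P2-feature failure under load
    raise TimeoutError("recommendation service overloaded")

def get_home_screen(user_id):
    try:
        return fetch_personalized_home(user_id)
    except Exception:
        # Swallow the error; the user just sees the standard home screen
        return DEFAULT_HOME

print(get_home_screen("u123"))
```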
EXPONENTIAL BACKOFF
A technique for handling failed HTTP requests by progressively increasing the wait time between retries, so retries do not themselves overload the server.
For example, the first retry waits 1 second, the second 2 seconds, the third 4 seconds, and so on.
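A minimal sketch of the retry loop described above. The cap and jitter factor are my additions (standard practice so millions of clients don't retry in lockstep), not details from the talk.

```python
import random
import time

def call_with_backoff(request, max_retries=5, base_delay=1.0, cap=30.0):
    """Retry a failing call, doubling the wait each time (1s, 2s, 4s, ...).
    Random jitter spreads retries out across clients."""
    for attempt in range(max_retries):
        try:
            return request()
        except Exception:
            if attempt == max_retries - 1:
                raise                       # out of retries: surface the error
            delay = min(cap, base_delay * (2 ** attempt))
            time.sleep(delay * random.uniform(0.5, 1.0))
```

Usage: `call_with_backoff(lambda: session.get(url))` retries transient failures and re-raises only after the final attempt fails.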
BACKEND INFRASTRUCTURE
 A content delivery network (CDN) is a network of interconnected servers that speeds up page loading for data-heavy applications.
 By distributing content across servers in different geographic locations, CDNs deliver content faster and more efficiently to users.
 This reduces latency, improves page load times, and provides a better user experience.
BACKEND INFRASTRUCTURE
Multi-CDN Strategy:
• In-house CDN optimizer dynamically balances traffic across multiple CDNs.
• Monitors real-time load on each CDN and routes traffic accordingly to
avoid overburdening any single CDN.
• Reduces the risk of regional failures and improves global load
distribution.
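A toy version of the routing decision: send each new session to the CDN with the most spare capacity, based on a periodically refreshed load report. The CDN names, capacities, and least-utilized policy are all invented for illustration; the real optimizer likely weighs latency, cost, and regional health too.

```python
# Route to the least-utilized CDN in the current load report.

def pick_cdn(load_report, capacity):
    """load_report / capacity: dicts of cdn_name -> current / max sessions."""
    return min(load_report, key=lambda c: load_report[c] / capacity[c])

capacity = {"cdn_a": 10_000_000, "cdn_b": 15_000_000, "cdn_c": 8_000_000}
load     = {"cdn_a":  9_000_000, "cdn_b":  6_000_000, "cdn_c": 7_500_000}

print(pick_cdn(load, capacity))   # cdn_b: 40% utilized vs 90%+ on the others
```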
Database Scaling:
• Preemptive scaling of databases before big matches.
• Simulations of expected load, including API calls, database queries,
and CPU utilization.
• Scaling needs communicated to cloud providers months in advance to
ensure capacity is available.
• Resource thresholds set at ~60% CPU utilization to maintain
headroom during peak times.
MULTI-CDN OPTIMIZATION
Image CDN:
Sticker overload issue resolved by caching stickers and Base64-encoding
new stickers, reducing the burden on the origin server.
Video CDN:
Optimized for seamless video delivery during peak traffic, with special
strategies for handling video spikes during critical match moments.
API CDN:
Frequent API calls (e.g., login, match stats) cached for quicker
responses; dynamic routing across multiple CDNs keeps latency minimal
and improves user experience.
CACHE MANAGEMENT & OFFLOADING
Cache TTL Extension:
Longer Time-to-Live (TTL) on cached
content to reduce backend server load.
Popular content (e.g., high-traffic videos,
repeated API calls) is cached longer to
offload requests from the origin server.
Cache Offloading:
Optimized CDN caching policy with a goal of >90% cache offload
(requests served from the CDN rather than the origin).
Efficient eviction and refresh strategies keep content fresh while
minimizing origin load.
Base64 encoding used for new image-based assets (e.g., stickers) so
they ship inside already-cached responses.
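TTL assignment can be sketched as a small rule table mapping content type to a `Cache-Control` header. The content types and TTL values below are examples I've chosen, not JioCinema's actual policy; the principle is simply that immutable or popular content gets long TTLs so the origin is shielded.

```python
# Longer TTLs for stable content -> higher CDN cache-offload ratio.

TTL_RULES = [
    ("match_stats",   5),       # changes every ball: very short TTL
    ("login_config",  300),     # semi-static configuration
    ("video_segment", 86400),   # immutable once published: cache a day
]

def cache_headers(content_type):
    for kind, ttl in TTL_RULES:
        if content_type == kind:
            return {"Cache-Control": f"public, max-age={ttl}"}
    return {"Cache-Control": "no-store"}   # unknown content: don't cache

print(cache_headers("video_segment"))
```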
PANIC MODE & SNAPSHOT-BASED FAILOVER
Panic Mode Activation:
Triggered when unexpected traffic surges occur (e.g., Dhoni walking
out to bat).
Pre-match snapshots of expected API responses are stored in static
storage.
If the system starts failing under load, the CDN serves static content
from these snapshots, so users see valid content instead of errors.
Static Failover:
Personalization systems such as customized home screens may fail under
load; snapshots ensure standard content is displayed instead.
Ensures continuous operation even during backend failures, reducing
pressure on real-time processing.
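The failover logic can be sketched as a handler that trips into Panic Mode on the first backend failure and serves the pre-computed snapshot from then on. The endpoint, snapshot payload, and trip condition are hypothetical simplifications of what the talk describes.

```python
# Panic Mode sketch: pre-match snapshots are stored ahead of time;
# when the live backend fails under load, serve the snapshot instead.

SNAPSHOTS = {"/api/home": {"rails": ["Live: Final", "Highlights"]}}
PANIC_MODE = {"on": False}

def live_home():
    raise RuntimeError("backend overloaded")   # simulated failure

def handle(path):
    if not PANIC_MODE["on"]:
        try:
            return live_home()                 # normal, personalized path
        except Exception:
            PANIC_MODE["on"] = True            # trip into Panic Mode
    return SNAPSHOTS[path]                     # static, always available

print(handle("/api/home"))
```

In a real deployment the trip would be driven by error-rate metrics, and recovery (turning Panic Mode back off) would be gradual.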
ASYNCHRONOUS PROCESSING WITH KAFKA
Kafka Utilization:
Heavy reliance on Kafka for handling asynchronous tasks during live
events.
Tasks such as viewer-count updates are processed asynchronously to
reduce real-time load on the backend.
Throughput Optimization:
Kafka provides high throughput, with topic partitioning for
scalability.
Resilience: if Kafka is temporarily unavailable, events are buffered
in local storage and processed later to ensure service continuity.
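The local-storage fallback can be sketched as a producer that spills events to a local buffer when the broker is unreachable and replays them once it recovers. A real implementation would use an actual Kafka client (e.g. confluent-kafka) and spool to disk; here the broker is faked so the flow is self-contained.

```python
# Resilient produce: never lose an event just because Kafka is down.

class FlakyBroker:
    def __init__(self):
        self.up = False
        self.received = []

    def send(self, event):
        if not self.up:
            raise ConnectionError("broker unavailable")
        self.received.append(event)

class ResilientProducer:
    def __init__(self, broker):
        self.broker = broker
        self.local_buffer = []          # stand-in for a local disk spool

    def produce(self, event):
        try:
            self.broker.send(event)
        except ConnectionError:
            self.local_buffer.append(event)   # spill instead of dropping

    def replay(self):
        pending, self.local_buffer = self.local_buffer, []
        for event in pending:
            self.produce(event)         # re-buffers if broker still down

broker = FlakyBroker()
producer = ResilientProducer(broker)
producer.produce({"type": "viewer_count", "n": 32_000_000})  # buffered
broker.up = True
producer.replay()
print(broker.received)
```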
LADDER-BASED SCALING DOWN
Smooth Downscaling:
Traffic doesn’t drop off immediately after
a match; users often linger.
Gradual downscaling strategy: from 32M
to 25M to 20M, ensuring smooth
transitions and no service interruptions.
Health Checks:
Liveness checks and readiness probes to
ensure that each component is
functioning as expected before scaling
down.
Optimizes resource use without abruptly
reducing capacity, avoiding
unnecessary service degradation.
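The downscaling ladder can be sketched as a function that descends one rung only when live traffic fits comfortably below it and health checks pass. The rung values (32M, 25M, 20M) come from the slide; the 80%-headroom rule and the health flag are my illustrative assumptions.

```python
# Ladder-based downscaling: step capacity down gradually, never abruptly.

LADDER = [32_000_000, 25_000_000, 20_000_000, 10_000_000]

def next_capacity(current_capacity, live_viewers, healthy):
    """Descend to the highest lower rung whose 80% headroom still
    covers current viewers; hold if unhealthy or traffic is too close."""
    if not healthy:
        return current_capacity           # failed probes: don't shrink
    for rung in LADDER:
        if rung < current_capacity and live_viewers <= rung * 0.8:
            return rung
    return current_capacity

print(next_capacity(32_000_000, 18_000_000, healthy=True))   # step to 25M
print(next_capacity(25_000_000, 22_000_000, healthy=True))   # hold: too close
```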
AD INSERTION MECHANISM
Live Stream Ad Insertion:
• Ad directors monitor live matches and trigger ads based on
commentary cues.
• Each live-stream language has its own dedicated ad director.
Static Ads & Client-Side Ads:
• Over-delivery of ads is avoided by targeting segments of the 32M
users rather than everyone at once.
• Client-side ads are triggered using SCTE-35 markers for precise
ad placement without disrupting the user experience.
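One way to picture the over-delivery guard is per-user frequency capping: track impressions per user and campaign, and stop serving once the cap is hit. The class, cap value, and keying scheme are hypothetical illustrations, not JioCinema's actual ad-server logic.

```python
from collections import defaultdict

# Frequency cap: each user sees a given campaign at most `cap` times.

class AdCapper:
    def __init__(self, cap_per_user):
        self.cap = cap_per_user
        self.seen = defaultdict(int)          # (user, campaign) -> count

    def should_show(self, user_id, campaign):
        key = (user_id, campaign)
        if self.seen[key] >= self.cap:
            return False                      # cap reached: skip this user
        self.seen[key] += 1
        return True

capper = AdCapper(cap_per_user=2)
print([capper.should_show("u1", "cola") for _ in range(3)])  # [True, True, False]
```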
CONCLUSION
& FUTURE
READINESS
• Key Takeaways:
• Scalability: Proactive planning and simulations ensured
readiness for up to 32M users.
• Resilience: Panic Mode and static content delivery helped
prevent system crashes during high load.
• Efficiency: CDN and cache management played a crucial role in
absorbing traffic.
• Ad Scalability: Smooth ad insertion techniques ensured
monetization without affecting the user experience.
• Future Improvements:
• Further refinement of caching policies.
• Enhanced database simulations for faster scaling.
• Even greater automation in scaling and Panic Mode
activation.
QUESTIONS?
 References:
Medium: "Behind the Scenes: How JioCinema Seamlessly Streams IPL Matches to 20 Million Fans"
YouTube: "How JioCinema live streams IPL to 20 million concurrent devices w/ Prachi Sharma | Ep 7"
