SlideShare a Scribd company logo
1 of 86
Twitch Plays Pokémon:
Twitch’s Chat Architecture
John Rizzo
Sr Software Engineer
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
twitch-pokemon
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
Presented at QCon San Francisco
www.qconsf.com
About Me
www.twitch.tv
Twitch Introduction
www.twitch.tv
Twitch Introduction
www.twitch.tv
• Over 800k concurrent users
• Tens of BILLIONS of daily messages
• ~10 Servers
• 2 Engineers
• 19 Amp Energy drinks per day
Twitch Introduction
www.twitch.tv
Architecture Overview
www.twitch.tv
Video Edge API Edge Chat Edge
fChannel
Page
Video Backend Services Chat
Twitch Plays Pokémon Strikes!
www.twitch.tv
Public Reaction
Media outlets have described the proceedings of the game as
being "mesmerizing", "miraculous" and "beautiful chaos", with
one viewer comparing it to "watching a car crash in slow
motion” - Wikipedia
“Some users are hailing the development as part of internet
history…” – BBC
–TwitchChatTeam
www.twitch.tv
10:33pm, February 14: TPP hits 5k concurrent viewers
9:42am, February 15: TPP hits 20k
8:21pm, February 16: Time to take preemptive action
Twitch Plays Pokémon Strikes!
www.twitch.tv
We separate servers into several clusters
• Robust in the face of failure
• Can tune each cluster for its specific purpose
Chat Server Organization
www.twitch.tv
Main
Cluster
Event
Cluster
Group
Cluster
Test
Cluster
8:21pm, February 16: Move TPP onto the event cluster
Twitch Plays Pokémon Strikes!
www.twitch.tv
5:43pm, February 16: TPP hits 50k users, chat system
starts to show signs of stress (but…it’s on the event cluster!)
8:01am, February 17: Twitch chat engineers panic, rush to
the office
9:35am, February 17: Investigation begins
Twitch Plays Pokémon Strikes!
www.twitch.tv
Debugging principle: Start investigating upstream, move
downstream as necessary
Solving Twitch Plays Pokémon
www.twitch.tv
Edge Server: Sits at the “edge” of the public internet and Twitch’s
internal network
Chat Architecture Overview
www.twitch.tv
User Chat Edge
Public Internet Internal Twitch Network
Flash socket
Ingestion
Distribution
Any user:room:edge tuple is valid
Solving Twitch Plays Pokémon
www.twitch.tv
user9
user1:lirik
user2:degentp
user3:witwix
…
user4:witwix
user5:armorra
user6:degentp
…
user2:witwix
user7:cep21
user8:lirik
…
A Note on Instrumentation
www.twitch.tv
Chat Service
Remote statsd
collector
Graphite
cluster
UDP UDP
HTTP
Solving Twitch Plays Pokémon
So let’s take a look at our edge server logs…
Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
Feb 18 06:54:04 tmi_edge: [clue] Message successfully sent to #degentp
Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
Feb 18 06:54:05 tmi_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
Feb 18 06:54:04 tmi_edge: [clue] Message successfully sent to #paragusrants
www.twitch.tv
Solving Twitch Plays Pokémon
Let’s dissect one of these log lines
Feb 18 06:54:04 chat_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
www.twitch.tv
Solving Twitch Plays Pokémon
Let’s dissect one of these log lines
Feb 18 06:54:04 chat_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
www.twitch.tv
Timestamp - when this action was recorded on the server
Solving Twitch Plays Pokémon
Let’s dissect one of these log lines
Feb 18 06:54:04 chat_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
www.twitch.tv
Server - the name of the server that is generating this log file
Solving Twitch Plays Pokémon
Let’s dissect one of these log lines
Feb 18 06:54:04 chat_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
www.twitch.tv
Remote service - the name of the service that edge is connecting
to
Solving Twitch Plays Pokémon
Let’s dissect one of these log lines
Feb 18 06:54:04 chat_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
www.twitch.tv
Event detail
Why did the clue service take so long to respond? Also, what is
the clue service?
Ingestion
Message Ingestion
www.twitch.tv
User Chat Edge
Clue
Public Internet Internal Twitch Network
Distribution
HTTP
Solving Twitch Plays Pokémon
Let’s dissect one of these log lines
Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for:
privmsg
www.twitch.tv
Clue server took longer than 5 seconds to process this message.
Why?
Solving Twitch Plays Pokémon
Clue logs…
Feb 18 06:54:04 chat_clue: WARNING:tornado.general:Connect error on fd 10:
ECONNREFUSED
Feb 18 06:54:04 chat_clue: WARNING:tornado.general:Connect error on fd 15:
ECONNREFUSED
Feb 18 06:54:05 chat_clue: WARNING:tornado.general:Connect error on fd 9:
ECONNREFUSED
Feb 18 06:54:05 chat_clue: WARNING:tornado.general:Connect error on fd 18:
ECONNREFUSED
www.twitch.tv
Not very useful…but we get some info. Clue’s connections are
being refused.
Which machine is clue failing to connect to? Why?
Let’s take a step back…these errors are happening on both main
AND the event clusters. Why?
Are there common services or dependencies?
• Databases (store chat color, badges, bans, etc)
• Cache (to speed up database access)
• Services (authentication, user data, etc)
Investigating Clue
www.twitch.tv
Investigating Clue
www.twitch.tv
Chat:Clue Rails Video
PGBouncer
Postgre
s
Redis
Can rule out databases and services – rest of site is functional
Let’s look closer at our cache – this one is specific to chat servers
Investigating Clue
www.twitch.tv
Redis: where do we start investigating?
Strategy: start high-level then drill down
Investigating Redis
www.twitch.tv
Redis server isn’t being stressed very hard
Investigating Redis
www.twitch.tv
Let’s look at how Clue is using Redis…
Clue Configuration
www.twitch.tv
DB_SERVER=db.internal.twitch.tv
DB_NAME=twitch_db
DB_TIMEOUT=1s
CACHE_SERVER=localhost
CACHE_PORT=2000
CACHE_MAX_CONNS=10
CACHE_TIMEOUT=100ms
…
…
Clue Configuration
www.twitch.tv
DB_SERVER=db.internal.twitch.tv
DB_NAME=twitch_db
DB_TIMEOUT=1s
CACHE_SERVER=localhost
CACHE_PORT=2000
CACHE_MAX_CONNS=10
CACHE_TIMEOUT=100ms
…
…
Looks like our whole cache contains only one local instance?
Redis is single-process and single-threaded!
Redis Configuration
www.twitch.tv
$ ps aux | grep redis
13909 0.0 0.0 2434840 796 s000 S+ grep redis
Redis doesn’t seem to be running locally - what listens on port
2000?
Redis Configuration
www.twitch.tv
$ netstat -lp | grep 2000
tcp 0 0 localhost:2000 *:* LISTEN 2109/haproxy
HAProxy!
HAProxy
www.twitch.tv
• Limits for #connections, requests, etc
• Robust instrumentation
Attribution:
Shareholic.com
Redis Configuration
www.twitch.tv
Are we load balancing across many Redis instances?
DB_SERVER=db.internal.twitch.tv
DB_NAME=twitch_db
DB_TIMEOUT=1s
CACHE_SERVER=localhost
CACHE_PORT=2000
CACHE_MAX_CONNS=10
CACHE_TIMEOUT=100ms
…
…
Redis Configurtaion
www.twitch.tv
Are we load balancing across many Redis instances?
class twitch::haproxy::listeners::chat_redis (
$haproxy_instance = ‘chat-backend',
$proxy_name = 'chat-redis',
$servers = [
'redis2.internal.twitch.tv:6379',
],
...
...
...
Redis Configurtaion
www.twitch.tv
Are we load balancing across many Redis instances?
class twitch::haproxy::listeners::chat_redis (
$haproxy_instance = ’chat-backend',
$proxy_name = 'chat-redis',
$servers = [
'redis2.internal.twitch.tv:6379',
],
...
...
...
We are not load balancing across several instances
Investigating Redis
$ top
Tasks: 281 total, 1 running, 311 sleeping, 0 stopped, 0 zombie
Cpu(s): 10.3%us, 10.5%sy, 0.0%ni, 95.4%id, 0.0%wa, 0.0%hi,
Mem: 24682328k total,6962336k used, 17719992k free, 13644k buffers
Swap: 7999484k total, 0k used, 7999484k free, 4151420k cached
PID PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26109 20 0 76048 128m 1340 S 99 0.2 6133:28 redis-server
3342 20 0 9040 1320 844 R 2 0.0 0:00.01 top
1 20 0 90412 3920 2576 S 0 0.0 103:45.82 init
2 20 0 0 0 0 S 0 0.0 0:05.17 kthreadd
Let’s take a look at the Redis box…
www.twitch.tv
Investigating Redis
Redis is unsprisingly maxing out the CPU
$ top
Tasks: 281 total, 1 running, 311 sleeping, 0 stopped, 0 zombie
Cpu(s): 10.3%us, 10.5%sy, 0.0%ni, 95.4%id, 0.0%wa, 0.0%hi
Mem: 24682328k total,6962336k used, 17719992k free, 13644k buffers
Swap: 7999484k total, 0k used, 7999484k free, 4151420k cached
PID PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26109 20 0 76048 128m 1340 S 99 0.2 6133:28 redis-server
3342 20 0 9040 1320 844 R 2 0.0 0:00.01 top
1 20 0 90412 3920 2576 S 0 0.0 103:45.82 init
2 20 0 0 0 0 S 0 0.0 0:05.17 kthreadd
www.twitch.tv
Redis Optimization Options?
• Optimize Redis at the application-level
• Distribute Redis
www.twitch.tv
Clue Logic
2:00pm: Smarter caching?
Redis Optimization Options?
www.twitch.tv
Redis
Clue Logic
2:00pm: Smarter caching?
Redis Optimization Options?
www.twitch.tv
Redis
Clue
local cache
• 2:23pm: There are some challenges (cache coherence,
implementation difficulty)
• Is there any low-hanging fruit?
• Yes! Rate limiting code!
• 2:33pm: Change has little effect...
Redis Optimization Options?
www.twitch.tv
2:41pm: Distribute Redis?
Redis Optimization Options?
www.twitch.tv
Clue Logic
Redis
Clue
Local Cache
Redis
Redis
Redis
2:56pm: Yes! Sharding by key!
Redis Optimization Options?
www.twitch.tv
set “effinga.chat_color”
get “degentp.chat_color”
get “effinga.chat_color”
set “degentp.chat_color”
Redis
Redis
Redis
Redis
Distribution
Function
3:03pm: How do we implement this?
• Puppet configuration changes
• HAProxy changes
• Redis deploy changes (copy/paste!)
• Do we need to modify any code?
• Can we let HAProxy load balance for us?
• No – LB needs to be aware of Redis protocol
• Changes required at the application level
Distributing Redis
www.twitch.tv
3:52pm: Code surveyed – Two problematic patterns
Distributing Redis
www.twitch.tv
Problematic pattern #1:
Distributing Redis
www.twitch.tv
Distributing Redis
www.twitch.tv
Problematic pattern #1 solution:
Problematic pattern #2:
Distributing Redis
www.twitch.tv
What if we need these keys in different contexts?
Distributing Redis
www.twitch.tv
Problematic pattern #2 solution:
7:21pm: Test (fingers crossed)
8:11pm: Deploy cache changes
9:29pm: Deploy chat code changes
Distributing Redis
www.twitch.tv
10:10pm: Better, but still bad…
Solving Twitch Plays Pokémon
www.twitch.tv
Solving Twitch Plays Pokémon
Let’s use some tools to dig deeper…
$ redis-cli -h redis2.internal.twitch.tv -p 6379 INFO
# Clients
connected_clients:3021
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
$ redis-cli -h redis2.internal.twitch.tv -p 6379 CLIENT LIST |
grep idle | wc -l
2311
www.twitch.tv
Solving Twitch Plays Pokémon
Lots of bad connections
$ redis-cli -h redis2.internal.twitch.tv -p 6379 INFO
# Clients
connected_clients:3021
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
$ redis-cli -h redis2.internal.twitch.tv -p 6379 CLIENT LIST |
grep idle | wc -l
2311
www.twitch.tv
Solving Twitch Plays Pokémon
Let’s grab the pid of one Redis instance
$ sudo svstat /etc/service/redis_*
/etc/service/redis_6379: up (pid 26109) 3543 seconds
/etc/service/redis_6380: up (pid 26111) 3543 seconds
/etc/service/redis_6381: up (pid 26113) 3543 seconds
/etc/service/redis_6382: up (pid 26114) 3544 seconds
www.twitch.tv
Solving Twitch Plays Pokémon
$ sudo svstat /etc/service/redis_*
/etc/service/redis_6379: up (pid 26109) 3543 seconds
/etc/service/redis_6380: up (pid 26111) 3543 seconds
/etc/service/redis_6381: up (pid 26113) 3543 seconds
/etc/service/redis_6382: up (pid 26114) 3544 seconds
www.twitch.tv
Let’s grab the pid of one Redis instance
Solving Twitch Plays Pokémon
$ sudo lsof –p 26109 | grep chat | cut -d ' ' -f 32 | cut -d ':' -f
2 | sort | uniq –c
2012 6421->chat-testing.internal.twitch.tv
121 6421->chat1.internal.twitch.tv
101 6421->chat3.internal.twitch.tv
...
www.twitch.tv
Solving Twitch Plays Pokémon
$ sudo lsof –p 26109 | grep tmi | cut -d ' ' -f 32 | cut -d ':' -f
2 | sort | uniq –c
2012 6421->chat-testing.internal.twitch.tv
121 6421->chat1.internal.twitch.tv
101 6421->chat3.internal.twitch.tv
...
www.twitch.tv
Solving Twitch Plays Pokémon
Lesson learned: Testing is bad
www.twitch.tv
Solving Twitch Plays Pokémon
11:12pm: Shut off testing cluster completely
www.twitch.tv
Users can connect to chat again!
Users can send messages again!
Chat team leaves the office at 11:31pm
Twitch Plays Pokémon is
Solved(?)
www.twitch.tv
Complaints that users don’t see the same messages
A New Bug Arises
www.twitch.tv
Message Distribution
www.twitch.tv
User Chat Edge
Public Internet Internal Twitch Network
Ingestion
Distribution
Message Distribution
www.twitch.tv
User Chat Edge
Public Internet Internal Twitch Network
Clue
Exchange
Message Distribution
www.twitch.tv
Clue Exchange
Exchange
Exchange
HAProxy
Edge
Edge
Edge
Edge
Edge
Edge
• Edge/Clue instrumentation show no errors
• Exchange isn’t even instrumented!
• Let’s fix that…
Solving the Distribution Problem
www.twitch.tv
Solving the Distribution Problem
Let’s look at our new exchange logs
Feb 19 14:11:06 exchange: [exchange] i/o timeout
Feb 19 14:11:06 exchange: [exchange] Exchange success
Feb 19 14:11:06 exchange: [exchange] i/o timeout
Feb 19 14:11:06 exchange: [exchange] i/o timeout
Feb 19 14:11:06 exchange: [exchange] i/o timeout
Feb 19 14:11:06 exchange: [exchange] i/o timeout
Feb 19 14:11:06 exchange: [exchange] i/o timeout
Feb 19 14:11:06 exchange: [exchange] i/o timeout
Feb 19 14:11:06 exchange: [exchange] Exchange success
www.twitch.tv
Ideas?
Solving the Distribution Problem
• These are extremely short and simple requests, but there are
many of them
• We aren’t using HTTP keepalives
www.twitch.tv
Solving the Distribution Problem
Go makes this extremely simple
www.twitch.tv
Lessons Learned
• Better logs/instrumentation to make debugging easier
• Generate fake traffic to force us to handle more load than we
need
• Make sure we use and configure our infrastructure wisely!
www.twitch.tv
What have we done since?
• We now use AWS servers exclusively
• Better Redis infrastructure
• Python -> Go
• Lots of other big and small changes to support new products
and better QoS
www.twitch.tv
Thank you.
Questions?
www.twitch.tv
Video Architecture
www.twitch.tv
PoP Origin
PoP
PoP
Ingest Distribution
Broadcaster
Viewers
Replication
Hierarchy
RTMP Protocol
www.twitch.tv
Attribution:
MMick66 - Wikipedia
HLS Protocol
www.twitch.tv
Video Ingest
www.twitch.tv
Streaming
Encoder
Internet
PoP
Ingest Proxy
Ingest
Video Ingest
www.twitch.tv
Ingest
Auth Service
DB Transcode
Queue
Video Ingest
www.twitch.tv
Transcoder
Transcode/Transmux
Worker
gotranscoder
HLS Origin
Queue
RTMP Data
Video API
Replication
Hierarchy
VOD
Video Distribution
www.twitch.tv
PoP
Video Edge
Protected Replication
HLS Proxy
HLS Cache HLS Cache
Find API
Ingest Proxy
Upstream PoP
Protected Replication
Video Distribution
www.twitch.tv
PoP Replication Hierarchy
PoP
Edge
PR
PoP
Edge
PR
PoP
Edge
PR
PoP
Edge
PR
PoP
Edge
PR
Transcoding
Tier 1 Cache
Thank you.
Questions?
www.twitch.tv
Watch the video with slide synchronization on
InfoQ.com!
https://www.infoq.com/presentations/twitch-
pokemon

More Related Content

What's hot

ZOZOTOWNのマルチクラウドへの挑戦と挫折、そして未来
ZOZOTOWNのマルチクラウドへの挑戦と挫折、そして未来ZOZOTOWNのマルチクラウドへの挑戦と挫折、そして未来
ZOZOTOWNのマルチクラウドへの挑戦と挫折、そして未来Hiromasa Oka
 
MuleSoft Composer for Salesforce.pptx
MuleSoft Composer for Salesforce.pptxMuleSoft Composer for Salesforce.pptx
MuleSoft Composer for Salesforce.pptxAmitSingh192710
 
Bon voyage Docker_Kubernetes
Bon voyage Docker_KubernetesBon voyage Docker_Kubernetes
Bon voyage Docker_Kubernetesssuseraada82
 
Adapt or Die: A Microservices Story at Google
Adapt or Die: A Microservices Story at GoogleAdapt or Die: A Microservices Story at Google
Adapt or Die: A Microservices Story at GoogleApigee | Google Cloud
 
Cloud Native Application
Cloud Native ApplicationCloud Native Application
Cloud Native ApplicationVMUG IT
 
Fast parallel data loading with the bulk API
Fast parallel data loading with the bulk APIFast parallel data loading with the bulk API
Fast parallel data loading with the bulk APISalesforce Developers
 
WAS vs JBoss, WebLogic, Tomcat (year 2015)
WAS vs JBoss, WebLogic, Tomcat (year 2015)WAS vs JBoss, WebLogic, Tomcat (year 2015)
WAS vs JBoss, WebLogic, Tomcat (year 2015)Roman Kharkovski
 
Infrastructure-as-Code (IaC) Using Terraform (Advanced Edition)
Infrastructure-as-Code (IaC) Using Terraform (Advanced Edition)Infrastructure-as-Code (IaC) Using Terraform (Advanced Edition)
Infrastructure-as-Code (IaC) Using Terraform (Advanced Edition)Adin Ermie
 
AWS WAF を活用しよう
AWS WAF を活用しようAWS WAF を活用しよう
AWS WAF を活用しようYuto Ichikawa
 
Snowflake on Googleのターゲットエンドポイントとしての利用
Snowflake on Googleのターゲットエンドポイントとしての利用Snowflake on Googleのターゲットエンドポイントとしての利用
Snowflake on Googleのターゲットエンドポイントとしての利用QlikPresalesJapan
 
Patna MuleSoft Meetup Anypoint Cloudhub 2.0
Patna MuleSoft Meetup Anypoint Cloudhub 2.0Patna MuleSoft Meetup Anypoint Cloudhub 2.0
Patna MuleSoft Meetup Anypoint Cloudhub 2.0shyamraj55
 
[Gaming on AWS] AWS에서 실시간 멀티플레이 게임 구현하기 - 넥슨
[Gaming on AWS] AWS에서 실시간 멀티플레이 게임 구현하기 - 넥슨[Gaming on AWS] AWS에서 실시간 멀티플레이 게임 구현하기 - 넥슨
[Gaming on AWS] AWS에서 실시간 멀티플레이 게임 구현하기 - 넥슨Amazon Web Services Korea
 
デモとディスカッションで体験するOracle DBトラブル対応
デモとディスカッションで体験するOracle DBトラブル対応デモとディスカッションで体験するOracle DBトラブル対応
デモとディスカッションで体験するOracle DBトラブル対応歩 柴田
 
간단한 게임을 쉽고 저렴하게 서비스해보자! ::: AWS Game Master 온라인 시리즈 #1
간단한 게임을 쉽고 저렴하게 서비스해보자! ::: AWS Game Master 온라인 시리즈 #1간단한 게임을 쉽고 저렴하게 서비스해보자! ::: AWS Game Master 온라인 시리즈 #1
간단한 게임을 쉽고 저렴하게 서비스해보자! ::: AWS Game Master 온라인 시리즈 #1Amazon Web Services Korea
 
Best Practices of Infrastructure as Code with Terraform
Best Practices of Infrastructure as Code with TerraformBest Practices of Infrastructure as Code with Terraform
Best Practices of Infrastructure as Code with TerraformDevOps.com
 

What's hot (20)

ZOZOTOWNのマルチクラウドへの挑戦と挫折、そして未来
ZOZOTOWNのマルチクラウドへの挑戦と挫折、そして未来ZOZOTOWNのマルチクラウドへの挑戦と挫折、そして未来
ZOZOTOWNのマルチクラウドへの挑戦と挫折、そして未来
 
MuleSoft Composer for Salesforce.pptx
MuleSoft Composer for Salesforce.pptxMuleSoft Composer for Salesforce.pptx
MuleSoft Composer for Salesforce.pptx
 
Bon voyage Docker_Kubernetes
Bon voyage Docker_KubernetesBon voyage Docker_Kubernetes
Bon voyage Docker_Kubernetes
 
Adapt or Die: A Microservices Story at Google
Adapt or Die: A Microservices Story at GoogleAdapt or Die: A Microservices Story at Google
Adapt or Die: A Microservices Story at Google
 
Cloud Native Application
Cloud Native ApplicationCloud Native Application
Cloud Native Application
 
Terraform
TerraformTerraform
Terraform
 
Fast parallel data loading with the bulk API
Fast parallel data loading with the bulk APIFast parallel data loading with the bulk API
Fast parallel data loading with the bulk API
 
Introduce to Terraform
Introduce to TerraformIntroduce to Terraform
Introduce to Terraform
 
WAS vs JBoss, WebLogic, Tomcat (year 2015)
WAS vs JBoss, WebLogic, Tomcat (year 2015)WAS vs JBoss, WebLogic, Tomcat (year 2015)
WAS vs JBoss, WebLogic, Tomcat (year 2015)
 
Infrastructure-as-Code (IaC) Using Terraform (Advanced Edition)
Infrastructure-as-Code (IaC) Using Terraform (Advanced Edition)Infrastructure-as-Code (IaC) Using Terraform (Advanced Edition)
Infrastructure-as-Code (IaC) Using Terraform (Advanced Edition)
 
AWS WAF を活用しよう
AWS WAF を活用しようAWS WAF を活用しよう
AWS WAF を活用しよう
 
Docker swarm
Docker swarmDocker swarm
Docker swarm
 
Terraform
TerraformTerraform
Terraform
 
Snowflake on Googleのターゲットエンドポイントとしての利用
Snowflake on Googleのターゲットエンドポイントとしての利用Snowflake on Googleのターゲットエンドポイントとしての利用
Snowflake on Googleのターゲットエンドポイントとしての利用
 
Patna MuleSoft Meetup Anypoint Cloudhub 2.0
Patna MuleSoft Meetup Anypoint Cloudhub 2.0Patna MuleSoft Meetup Anypoint Cloudhub 2.0
Patna MuleSoft Meetup Anypoint Cloudhub 2.0
 
[Gaming on AWS] AWS에서 실시간 멀티플레이 게임 구현하기 - 넥슨
[Gaming on AWS] AWS에서 실시간 멀티플레이 게임 구현하기 - 넥슨[Gaming on AWS] AWS에서 실시간 멀티플레이 게임 구현하기 - 넥슨
[Gaming on AWS] AWS에서 실시간 멀티플레이 게임 구현하기 - 넥슨
 
デモとディスカッションで体験するOracle DBトラブル対応
デモとディスカッションで体験するOracle DBトラブル対応デモとディスカッションで体験するOracle DBトラブル対応
デモとディスカッションで体験するOracle DBトラブル対応
 
간단한 게임을 쉽고 저렴하게 서비스해보자! ::: AWS Game Master 온라인 시리즈 #1
간단한 게임을 쉽고 저렴하게 서비스해보자! ::: AWS Game Master 온라인 시리즈 #1간단한 게임을 쉽고 저렴하게 서비스해보자! ::: AWS Game Master 온라인 시리즈 #1
간단한 게임을 쉽고 저렴하게 서비스해보자! ::: AWS Game Master 온라인 시리즈 #1
 
Best Practices of Infrastructure as Code with Terraform
Best Practices of Infrastructure as Code with TerraformBest Practices of Infrastructure as Code with Terraform
Best Practices of Infrastructure as Code with Terraform
 
WAF deployment
WAF deploymentWAF deployment
WAF deployment
 

Viewers also liked

twitch tv presentation
twitch tv presentationtwitch tv presentation
twitch tv presentationGalen Gong
 
State of the Video Game Industry
State of the Video Game IndustryState of the Video Game Industry
State of the Video Game IndustryMargaret Wallace
 
Twitch - Video Walkthrough Feature
Twitch - Video Walkthrough FeatureTwitch - Video Walkthrough Feature
Twitch - Video Walkthrough FeatureCarolyn Jao
 
大规模、高并发实时通信系统的挑战和思路
大规模、高并发实时通信系统的挑战和思路大规模、高并发实时通信系统的挑战和思路
大规模、高并发实时通信系统的挑战和思路Junwen Feng
 
The Livestream Way - Building a Culture of Sales Excellence & Accountability
The Livestream Way - Building a Culture of Sales Excellence & AccountabilityThe Livestream Way - Building a Culture of Sales Excellence & Accountability
The Livestream Way - Building a Culture of Sales Excellence & AccountabilitySam Jacobs
 
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013Amazon Web Services
 
实时全互动直播 毫秒延迟&多人连麦
实时全互动直播 毫秒延迟&多人连麦实时全互动直播 毫秒延迟&多人连麦
实时全互动直播 毫秒延迟&多人连麦Hardway Hou
 
Twitch Plays Pokemon: Collaborative Learning Opportunities with Video Game S...
Twitch Plays Pokemon:  Collaborative Learning Opportunities with Video Game S...Twitch Plays Pokemon:  Collaborative Learning Opportunities with Video Game S...
Twitch Plays Pokemon: Collaborative Learning Opportunities with Video Game S...Eric Sembrat
 
秒秒夺金:直播互动的最佳实践
秒秒夺金:直播互动的最佳实践秒秒夺金:直播互动的最佳实践
秒秒夺金:直播互动的最佳实践Junwen Feng
 
Twitch - Project Presentation
Twitch - Project PresentationTwitch - Project Presentation
Twitch - Project PresentationDaise Rocha
 
eSports: Quite Possibly the Next Big Thing in Media/Entertainment
eSports: Quite Possibly the Next Big Thing in Media/EntertainmenteSports: Quite Possibly the Next Big Thing in Media/Entertainment
eSports: Quite Possibly the Next Big Thing in Media/EntertainmentColin Sebastian
 
Netflix - Freedom & Responsibility
Netflix - Freedom & ResponsibilityNetflix - Freedom & Responsibility
Netflix - Freedom & ResponsibilityChris Ellis
 
Twitch.tv presentation 2015 BS
Twitch.tv presentation 2015 BSTwitch.tv presentation 2015 BS
Twitch.tv presentation 2015 BSfastjack25
 
Livehouse.in 直播影音趨勢
Livehouse.in 直播影音趨勢Livehouse.in 直播影音趨勢
Livehouse.in 直播影音趨勢Kimmy Chen
 
Over the Top Content Delivery: State of the Art and Challenges Ahead
Over the Top Content Delivery: State of the Art and Challenges AheadOver the Top Content Delivery: State of the Art and Challenges Ahead
Over the Top Content Delivery: State of the Art and Challenges AheadAlpen-Adria-Universität
 
網路直播工作流程分享
網路直播工作流程分享網路直播工作流程分享
網路直播工作流程分享Gaspar Hsieh
 
直播效益與趨勢
直播效益與趨勢直播效益與趨勢
直播效益與趨勢Kimmy Chen
 
Facebook chat architecture
Facebook chat architectureFacebook chat architecture
Facebook chat architectureUdaya Kiran
 
중국웨이보-2016년라이브방송통찰보고서
중국웨이보-2016년라이브방송통찰보고서중국웨이보-2016년라이브방송통찰보고서
중국웨이보-2016년라이브방송통찰보고서Young Jong Sihn
 

Viewers also liked (20)

Twitch - an intro
Twitch - an introTwitch - an intro
Twitch - an intro
 
twitch tv presentation
twitch tv presentationtwitch tv presentation
twitch tv presentation
 
State of the Video Game Industry
State of the Video Game IndustryState of the Video Game Industry
State of the Video Game Industry
 
Twitch - Video Walkthrough Feature
Twitch - Video Walkthrough FeatureTwitch - Video Walkthrough Feature
Twitch - Video Walkthrough Feature
 
大规模、高并发实时通信系统的挑战和思路
大规模、高并发实时通信系统的挑战和思路大规模、高并发实时通信系统的挑战和思路
大规模、高并发实时通信系统的挑战和思路
 
The Livestream Way - Building a Culture of Sales Excellence & Accountability
The Livestream Way - Building a Culture of Sales Excellence & AccountabilityThe Livestream Way - Building a Culture of Sales Excellence & Accountability
The Livestream Way - Building a Culture of Sales Excellence & Accountability
 
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
 
实时全互动直播 毫秒延迟&多人连麦
实时全互动直播 毫秒延迟&多人连麦实时全互动直播 毫秒延迟&多人连麦
实时全互动直播 毫秒延迟&多人连麦
 
Twitch Plays Pokemon: Collaborative Learning Opportunities with Video Game S...
Twitch Plays Pokemon:  Collaborative Learning Opportunities with Video Game S...Twitch Plays Pokemon:  Collaborative Learning Opportunities with Video Game S...
Twitch Plays Pokemon: Collaborative Learning Opportunities with Video Game S...
 
秒秒夺金:直播互动的最佳实践
秒秒夺金:直播互动的最佳实践秒秒夺金:直播互动的最佳实践
秒秒夺金:直播互动的最佳实践
 
Twitch - Project Presentation
Twitch - Project PresentationTwitch - Project Presentation
Twitch - Project Presentation
 
eSports: Quite Possibly the Next Big Thing in Media/Entertainment
eSports: Quite Possibly the Next Big Thing in Media/EntertainmenteSports: Quite Possibly the Next Big Thing in Media/Entertainment
eSports: Quite Possibly the Next Big Thing in Media/Entertainment
 
Netflix - Freedom & Responsibility
Netflix - Freedom & ResponsibilityNetflix - Freedom & Responsibility
Netflix - Freedom & Responsibility
 
Twitch.tv presentation 2015 BS
Twitch.tv presentation 2015 BSTwitch.tv presentation 2015 BS
Twitch.tv presentation 2015 BS
 
Livehouse.in 直播影音趨勢
Livehouse.in 直播影音趨勢Livehouse.in 直播影音趨勢
Livehouse.in 直播影音趨勢
 
Over the Top Content Delivery: State of the Art and Challenges Ahead
Over the Top Content Delivery: State of the Art and Challenges AheadOver the Top Content Delivery: State of the Art and Challenges Ahead
Over the Top Content Delivery: State of the Art and Challenges Ahead
 
網路直播工作流程分享
網路直播工作流程分享網路直播工作流程分享
網路直播工作流程分享
 
直播效益與趨勢
直播效益與趨勢直播效益與趨勢
直播效益與趨勢
 
Facebook chat architecture
Facebook chat architectureFacebook chat architecture
Facebook chat architecture
 
중국웨이보-2016년라이브방송통찰보고서
중국웨이보-2016년라이브방송통찰보고서중국웨이보-2016년라이브방송통찰보고서
중국웨이보-2016년라이브방송통찰보고서
 

Similar to Twitch Plays Pokémon: Twitch's Chat Architecture

Openstack: An Open Source Cloud Framework
Openstack: An Open Source Cloud FrameworkOpenstack: An Open Source Cloud Framework
Openstack: An Open Source Cloud FrameworkAndrew Shafer
 
Breaking Smart Speakers: We are Listening to You.
Breaking Smart Speakers: We are Listening to You.Breaking Smart Speakers: We are Listening to You.
Breaking Smart Speakers: We are Listening to You.Priyanka Aash
 
Let's Get Real (time): Server-Sent Events, WebSockets and WebRTC for the soul
Let's Get Real (time): Server-Sent Events, WebSockets and WebRTC for the soulLet's Get Real (time): Server-Sent Events, WebSockets and WebRTC for the soul
Let's Get Real (time): Server-Sent Events, WebSockets and WebRTC for the soulSwanand Pagnis
 
Microservices Antipatterns
Microservices AntipatternsMicroservices Antipatterns
Microservices AntipatternsC4Media
 
WebRTC: A front-end perspective
WebRTC: A front-end perspectiveWebRTC: A front-end perspective
WebRTC: A front-end perspectiveshwetank
 
WebRTC Reborn - Full Stack Toronto
WebRTC Reborn -  Full Stack TorontoWebRTC Reborn -  Full Stack Toronto
WebRTC Reborn - Full Stack TorontoDan Jenkins
 
Tips from Support: Always Carry a Towel and Don’t Panic!
Tips from Support: Always Carry a Towel and Don’t Panic!Tips from Support: Always Carry a Towel and Don’t Panic!
Tips from Support: Always Carry a Towel and Don’t Panic!Perforce
 
Dylan Butler & Oliver Hager - Building a cross platform cryptocurrency app
Dylan Butler & Oliver Hager - Building a cross platform cryptocurrency appDylan Butler & Oliver Hager - Building a cross platform cryptocurrency app
Dylan Butler & Oliver Hager - Building a cross platform cryptocurrency appDevCamp Campinas
 
Advanced WCF Workshop
Advanced WCF WorkshopAdvanced WCF Workshop
Advanced WCF WorkshopIdo Flatow
 
Rollup-as-a-service and why it matters to the next-gen of dApps
Rollup-as-a-service and why it matters to the next-gen of dAppsRollup-as-a-service and why it matters to the next-gen of dApps
Rollup-as-a-service and why it matters to the next-gen of dAppsTinaBregovi
 
Loophole: Timing Attacks on Shared Event Loops in Chrome
Loophole: Timing Attacks on Shared Event Loops in ChromeLoophole: Timing Attacks on Shared Event Loops in Chrome
Loophole: Timing Attacks on Shared Event Loops in Chromecgvwzq
 
Bit_Bucket_x31_Final
Bit_Bucket_x31_FinalBit_Bucket_x31_Final
Bit_Bucket_x31_FinalSam Knutson
 
Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...
Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...
Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...C4Media
 
[System design] Design a tweeter-like system
[System design] Design a tweeter-like system[System design] Design a tweeter-like system
[System design] Design a tweeter-like systemAree Oh
 
Consul administration at scale
Consul administration at scaleConsul administration at scale
Consul administration at scalePierre Souchay
 
WebRTC Standards & Implementation Q&A - Legacy API Support Changes
WebRTC Standards & Implementation Q&A - Legacy API Support ChangesWebRTC Standards & Implementation Q&A - Legacy API Support Changes
WebRTC Standards & Implementation Q&A - Legacy API Support ChangesAmir Zmora
 
How you can benefit from using Redis - Ramirez
How you can benefit from using Redis - RamirezHow you can benefit from using Redis - Ramirez
How you can benefit from using Redis - RamirezCodemotion
 
FFMEET: running a non-profit conference system
FFMEET: running a non-profit conference systemFFMEET: running a non-profit conference system
FFMEET: running a non-profit conference systemAnnika Wickert
 

Similar to Twitch Plays Pokémon: Twitch's Chat Architecture (20)

Openstack: An Open Source Cloud Framework
Openstack: An Open Source Cloud FrameworkOpenstack: An Open Source Cloud Framework
Openstack: An Open Source Cloud Framework
 
Breaking Smart Speakers: We are Listening to You.
Breaking Smart Speakers: We are Listening to You.Breaking Smart Speakers: We are Listening to You.
Breaking Smart Speakers: We are Listening to You.
 
Let's Get Real (time): Server-Sent Events, WebSockets and WebRTC for the soul
Let's Get Real (time): Server-Sent Events, WebSockets and WebRTC for the soulLet's Get Real (time): Server-Sent Events, WebSockets and WebRTC for the soul
Let's Get Real (time): Server-Sent Events, WebSockets and WebRTC for the soul
 
Microservices Antipatterns
Microservices AntipatternsMicroservices Antipatterns
Microservices Antipatterns
 
Distributed "Web Scale" Systems
Distributed "Web Scale" SystemsDistributed "Web Scale" Systems
Distributed "Web Scale" Systems
 
WebRTC: A front-end perspective
WebRTC: A front-end perspectiveWebRTC: A front-end perspective
WebRTC: A front-end perspective
 
WebRTC Reborn - Full Stack Toronto
WebRTC Reborn -  Full Stack TorontoWebRTC Reborn -  Full Stack Toronto
WebRTC Reborn - Full Stack Toronto
 
Tips from Support: Always Carry a Towel and Don’t Panic!
Tips from Support: Always Carry a Towel and Don’t Panic!Tips from Support: Always Carry a Towel and Don’t Panic!
Tips from Support: Always Carry a Towel and Don’t Panic!
 
Dylan Butler & Oliver Hager - Building a cross platform cryptocurrency app
Dylan Butler & Oliver Hager - Building a cross platform cryptocurrency appDylan Butler & Oliver Hager - Building a cross platform cryptocurrency app
Dylan Butler & Oliver Hager - Building a cross platform cryptocurrency app
 
Advanced WCF Workshop
Advanced WCF WorkshopAdvanced WCF Workshop
Advanced WCF Workshop
 
Rollup-as-a-service and why it matters to the next-gen of dApps
Rollup-as-a-service and why it matters to the next-gen of dAppsRollup-as-a-service and why it matters to the next-gen of dApps
Rollup-as-a-service and why it matters to the next-gen of dApps
 
Loophole: Timing Attacks on Shared Event Loops in Chrome
Loophole: Timing Attacks on Shared Event Loops in ChromeLoophole: Timing Attacks on Shared Event Loops in Chrome
Loophole: Timing Attacks on Shared Event Loops in Chrome
 
Bit_Bucket_x31_Final
Bit_Bucket_x31_FinalBit_Bucket_x31_Final
Bit_Bucket_x31_Final
 
Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...
Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...
Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...
 
Preso fcul
Preso fculPreso fcul
Preso fcul
 
[System design] Design a tweeter-like system
[System design] Design a tweeter-like system[System design] Design a tweeter-like system
[System design] Design a tweeter-like system
 
Consul administration at scale
Consul administration at scaleConsul administration at scale
Consul administration at scale
 
WebRTC Standards & Implementation Q&A - Legacy API Support Changes
WebRTC Standards & Implementation Q&A - Legacy API Support ChangesWebRTC Standards & Implementation Q&A - Legacy API Support Changes
WebRTC Standards & Implementation Q&A - Legacy API Support Changes
 
How you can benefit from using Redis - Ramirez
How you can benefit from using Redis - RamirezHow you can benefit from using Redis - Ramirez
How you can benefit from using Redis - Ramirez
 
FFMEET: running a non-profit conference system
FFMEET: running a non-profit conference systemFFMEET: running a non-profit conference system
FFMEET: running a non-profit conference system
 

More from C4Media

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoC4Media
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileC4Media
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020C4Media
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsC4Media
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No KeeperC4Media
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like OwnersC4Media
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaC4Media
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideC4Media
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDC4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine LearningC4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at SpeedC4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerC4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleC4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeC4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereC4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing ForC4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreC4Media
 

More from C4Media (20)

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live Video
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy Mobile
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java Applications
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No Keeper
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like Owners
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 

Recently uploaded

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 

Recently uploaded (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 

Twitch Plays Pokémon: Twitch's Chat Architecture

  • 1. Twitch Plays Pokémon: Twitch’s Chat Architecture John Rizzo Sr Software Engineer
  • 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ twitch-pokemon
  • 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon San Francisco www.qconsf.com
  • 7. • Over 800k concurrent users • Tens of BILLIONS of daily messages • ~10 Servers • 2 Engineers • 19 Amp Energy drinks per day Twitch Introduction www.twitch.tv
  • 8. Architecture Overview www.twitch.tv Video Edge API Edge Chat Edge fChannel Page Video Backend Services Chat
  • 9. Twitch Plays Pokémon Strikes! www.twitch.tv
  • 10. Public Reaction Media outlets have described the proceedings of the game as being "mesmerizing", "miraculous" and "beautiful chaos", with one viewer comparing it to "watching a car crash in slow motion” - Wikipedia “Some users are hailing the development as part of internet history…” – BBC –TwitchChatTeam www.twitch.tv
  • 11. 10:33pm, February 14: TPP hits 5k concurrent viewers 9:42am, February 15: TPP hits 20k 8:21pm, February 16: Time to take preemptive action Twitch Plays Pokémon Strikes! www.twitch.tv
  • 12. We separate servers into several clusters • Robust in the face of failure • Can tune each cluster for its specific purpose Chat Server Organization www.twitch.tv Main Cluster Event Cluster Group Cluster Test Cluster
  • 13. 8:21pm, February 16: Move TPP onto the event cluster Twitch Plays Pokémon Strikes! www.twitch.tv
  • 14. 5:43pm, February 16: TPP hits 50k users, chat system starts to show signs of stress (but…it’s on the event cluster!) 8:01am, February 17: Twitch chat engineers panic, rush to the office 9:35am, February 17: Investigation begins Twitch Plays Pokémon Strikes! www.twitch.tv
  • 15. Debugging principle: Start investigating upstream, move downstream as necessary Solving Twitch Plays Pokémon www.twitch.tv
  • 16. Edge Server: Sits at the “edge” of the public internet and Twitch’s internal network Chat Architecture Overview www.twitch.tv User Chat Edge Public Internet Internal Twitch Network Flash socket Ingestion Distribution
  • 17. Any user:room:edge tuple is valid Solving Twitch Plays Pokémon www.twitch.tv user9 user1:lirik user2:degentp user3:witwix … user4:witwix user5:armorra user6:degentp … user2:witwix user7:cep21 user8:lirik …
  • 18. A Note on Instrumentation www.twitch.tv Chat Service Remote statsd collector Graphite cluster UDP UDP HTTP
  • 19. Solving Twitch Plays Pokémon So let’s take a look at our edge server logs… Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for: privmsg Feb 18 06:54:04 tmi_edge: [clue] Message successfully sent to #degentp Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for: privmsg Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for: privmsg Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for: privmsg Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for: privmsg Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for: privmsg Feb 18 06:54:05 tmi_edge: [clue] Timed out (after 5s) while writing request for: privmsg Feb 18 06:54:04 tmi_edge: [clue] Message successfully sent to #paragusrants www.twitch.tv
  • 20. Solving Twitch Plays Pokémon Let’s dissect one of these log lines Feb 18 06:54:04 chat_edge: [clue] Timed out (after 5s) while writing request for: privmsg www.twitch.tv
  • 21. Solving Twitch Plays Pokémon Let’s dissect one of these log lines Feb 18 06:54:04 chat_edge: [clue] Timed out (after 5s) while writing request for: privmsg www.twitch.tv Timestamp - when this action was recorded on the server
  • 22. Solving Twitch Plays Pokémon Let’s dissect one of these log lines Feb 18 06:54:04 chat_edge: [clue] Timed out (after 5s) while writing request for: privmsg www.twitch.tv Server - the name of the server that is generating this log file
  • 23. Solving Twitch Plays Pokémon Let’s dissect one of these log lines Feb 18 06:54:04 chat_edge: [clue] Timed out (after 5s) while writing request for: privmsg www.twitch.tv Remote service - the name of the service that edge is connecting to
  • 24. Solving Twitch Plays Pokémon Let’s dissect one of these log lines Feb 18 06:54:04 chat_edge: [clue] Timed out (after 5s) while writing request for: privmsg www.twitch.tv Event detail Why did the clue service take so long to respond? Also, what is the clue service?
  • 25. Ingestion Message Ingestion www.twitch.tv User Chat Edge Clue Public Internet Internal Twitch Network Distribution HTTP
  • 26. Solving Twitch Plays Pokémon Let’s dissect one of these log lines Feb 18 06:54:04 tmi_edge: [clue] Timed out (after 5s) while writing request for: privmsg www.twitch.tv Clue server took longer than 5 seconds to process this message. Why?
  • 27. Solving Twitch Plays Pokémon Clue logs… Feb 18 06:54:04 chat_clue: WARNING:tornado.general:Connect error on fd 10: ECONNREFUSED Feb 18 06:54:04 chat_clue: WARNING:tornado.general:Connect error on fd 15: ECONNREFUSED Feb 18 06:54:05 chat_clue: WARNING:tornado.general:Connect error on fd 9: ECONNREFUSED Feb 18 06:54:05 chat_clue: WARNING:tornado.general:Connect error on fd 18: ECONNREFUSED www.twitch.tv Not very useful…but we get some info. Clue’s connections are being refused. Which machine is clue failing to connect to? Why?
  • 28. Let’s take a step back…these errors are happening on both main AND the event clusters. Why? Are there common services or dependencies? • Databases (store chat color, badges, bans, etc) • Cache (to speed up database access) • Services (authentication, user data, etc) Investigating Clue www.twitch.tv
  • 29. Investigating Clue www.twitch.tv Chat:Clue Rails Video PGBouncer Postgre s Redis
  • 30. Can rule out databases and services – rest of site is functional Let’s look closer at our cache – this one is specific to chat servers Investigating Clue www.twitch.tv
  • 31. Redis: where do we start investigating? Strategy: start high-level then drill down Investigating Redis www.twitch.tv Redis server isn’t being stressed very hard
  • 32. Investigating Redis www.twitch.tv Let’s look at how Clue is using Redis…
  • 35. Redis Configuration www.twitch.tv $ ps aux | grep redis 13909 0.0 0.0 2434840 796 s000 S+ grep redis Redis doesn’t seem to be running locally - what listens on port 2000?
  • 36. Redis Configuration www.twitch.tv $ netstat -lp | grep 2000 tcp 0 0 localhost:2000 *:* LISTEN 2109/haproxy HAProxy!
  • 37. HAProxy www.twitch.tv • Limits for #connections, requests, etc • Robust instrumentation Attribution: Shareholic.com
  • 38. Redis Configuration www.twitch.tv Are we load balancing across many Redis instances? DB_SERVER=db.internal.twitch.tv DB_NAME=twitch_db DB_TIMEOUT=1s CACHE_SERVER=localhost CACHE_PORT=2000 CACHE_MAX_CONNS=10 CACHE_TIMEOUT=100ms … …
  • 39. Redis Configurtaion www.twitch.tv Are we load balancing across many Redis instances? class twitch::haproxy::listeners::chat_redis ( $haproxy_instance = ‘chat-backend', $proxy_name = 'chat-redis', $servers = [ 'redis2.internal.twitch.tv:6379', ], ... ... ...
  • 40. Redis Configurtaion www.twitch.tv Are we load balancing across many Redis instances? class twitch::haproxy::listeners::chat_redis ( $haproxy_instance = ’chat-backend', $proxy_name = 'chat-redis', $servers = [ 'redis2.internal.twitch.tv:6379', ], ... ... ... We are not load balancing across several instances
  • 41. Investigating Redis $ top Tasks: 281 total, 1 running, 311 sleeping, 0 stopped, 0 zombie Cpu(s): 10.3%us, 10.5%sy, 0.0%ni, 95.4%id, 0.0%wa, 0.0%hi, Mem: 24682328k total,6962336k used, 17719992k free, 13644k buffers Swap: 7999484k total, 0k used, 7999484k free, 4151420k cached PID PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 26109 20 0 76048 128m 1340 S 99 0.2 6133:28 redis-server 3342 20 0 9040 1320 844 R 2 0.0 0:00.01 top 1 20 0 90412 3920 2576 S 0 0.0 103:45.82 init 2 20 0 0 0 0 S 0 0.0 0:05.17 kthreadd Let’s take a look at the Redis box… www.twitch.tv
  • 42. Investigating Redis Redis is unsprisingly maxing out the CPU $ top Tasks: 281 total, 1 running, 311 sleeping, 0 stopped, 0 zombie Cpu(s): 10.3%us, 10.5%sy, 0.0%ni, 95.4%id, 0.0%wa, 0.0%hi Mem: 24682328k total,6962336k used, 17719992k free, 13644k buffers Swap: 7999484k total, 0k used, 7999484k free, 4151420k cached PID PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 26109 20 0 76048 128m 1340 S 99 0.2 6133:28 redis-server 3342 20 0 9040 1320 844 R 2 0.0 0:00.01 top 1 20 0 90412 3920 2576 S 0 0.0 103:45.82 init 2 20 0 0 0 0 S 0 0.0 0:05.17 kthreadd www.twitch.tv
  • 43. Redis Optimization Options? • Optimize Redis at the application-level • Distribute Redis www.twitch.tv
  • 44. Clue Logic 2:00pm: Smarter caching? Redis Optimization Options? www.twitch.tv Redis
  • 45. Clue Logic 2:00pm: Smarter caching? Redis Optimization Options? www.twitch.tv Redis Clue local cache
  • 46. • 2:23pm: There are some challenges (cache coherence, implementation difficulty) • Is there any low-hanging fruit? • Yes! Rate limiting code! • 2:33pm: Change has little effect... Redis Optimization Options? www.twitch.tv
  • 47. 2:41pm: Distribute Redis? Redis Optimization Options? www.twitch.tv Clue Logic Redis Clue Local Cache Redis Redis Redis
  • 48. 2:56pm: Yes! Sharding by key! Redis Optimization Options? www.twitch.tv set “effinga.chat_color” get “degentp.chat_color” get “effinga.chat_color” set “degentp.chat_color” Redis Redis Redis Redis Distribution Function
  • 49. 3:03pm: How do we implement this? • Puppet configuration changes • HAProxy changes • Redis deploy changes (copy/paste!) • Do we need to modify any code? • Can we let HAProxy load balance for us? • No – LB needs to be aware of Redis protocol • Changes required at the application level Distributing Redis www.twitch.tv
  • 50. 3:52pm: Code surveyed – Two problematic patterns Distributing Redis www.twitch.tv
  • 53. Problematic pattern #2: Distributing Redis www.twitch.tv What if we need these keys in different contexts?
  • 55. 7:21pm: Test (fingers crossed) 8:11pm: Deploy cache changes 9:29pm: Deploy chat code changes Distributing Redis www.twitch.tv
  • 56. 10:10pm: Better, but still bad… Solving Twitch Plays Pokémon www.twitch.tv
  • 57. Solving Twitch Plays Pokémon Let’s use some tools to dig deeper… $ redis-cli -h redis2.internal.twitch.tv -p 6379 INFO # Clients connected_clients:3021 client_longest_output_list:0 client_biggest_input_buf:0 blocked_clients:0 $ redis-cli -h redis2.internal.twitch.tv -p 6379 CLIENT LIST | grep idle | wc -l 2311 www.twitch.tv
  • 58. Solving Twitch Plays Pokémon Lots of bad connections $ redis-cli -h redis2.internal.twitch.tv -p 6379 INFO # Clients connected_clients:3021 client_longest_output_list:0 client_biggest_input_buf:0 blocked_clients:0 $ redis-cli -h redis2.internal.twitch.tv -p 6379 CLIENT LIST | grep idle | wc -l 2311 www.twitch.tv
  • 59. Solving Twitch Plays Pokémon Let’s grab the pid of one Redis instance $ sudo svstat /etc/service/redis_* /etc/service/redis_6379: up (pid 26109) 3543 seconds /etc/service/redis_6380: up (pid 26111) 3543 seconds /etc/service/redis_6381: up (pid 26113) 3543 seconds /etc/service/redis_6382: up (pid 26114) 3544 seconds www.twitch.tv
  • 60. Solving Twitch Plays Pokémon $ sudo svstat /etc/service/redis_* /etc/service/redis_6379: up (pid 26109) 3543 seconds /etc/service/redis_6380: up (pid 26111) 3543 seconds /etc/service/redis_6381: up (pid 26113) 3543 seconds /etc/service/redis_6382: up (pid 26114) 3544 seconds www.twitch.tv Let’s grab the pid of one Redis instance
  • 61. Solving Twitch Plays Pokémon $ sudo lsof –p 26109 | grep chat | cut -d ' ' -f 32 | cut -d ':' -f 2 | sort | uniq –c 2012 6421->chat-testing.internal.twitch.tv 121 6421->chat1.internal.twitch.tv 101 6421->chat3.internal.twitch.tv ... www.twitch.tv
  • 62. Solving Twitch Plays Pokémon $ sudo lsof –p 26109 | grep tmi | cut -d ' ' -f 32 | cut -d ':' -f 2 | sort | uniq –c 2012 6421->chat-testing.internal.twitch.tv 121 6421->chat1.internal.twitch.tv 101 6421->chat3.internal.twitch.tv ... www.twitch.tv
  • 63. Solving Twitch Plays Pokémon Lesson learned: Testing is bad www.twitch.tv
  • 64. Solving Twitch Plays Pokémon 11:12pm: Shut off testing cluster completely www.twitch.tv
  • 65. Users can connect to chat again! Users can send messages again! Chat team leaves the office at 11:31pm Twitch Plays Pokémon is Solved(?) www.twitch.tv
  • 66. Complaints that users don’t see the same messages A New Bug Arises www.twitch.tv
  • 67. Message Distribution www.twitch.tv User Chat Edge Public Internet Internal Twitch Network Ingestion Distribution
  • 68. Message Distribution www.twitch.tv User Chat Edge Public Internet Internal Twitch Network Clue Exchange
  • 70. • Edge/Clue instrumentation show no errors • Exchange isn’t even instrumented! • Let’s fix that… Solving the Distribution Problem www.twitch.tv
  • 71. Solving the Distribution Problem Let’s look at our new exchange logs Feb 19 14:11:06 exchange: [exchange] i/o timeout Feb 19 14:11:06 exchange: [exchange] Exchange success Feb 19 14:11:06 exchange: [exchange] i/o timeout Feb 19 14:11:06 exchange: [exchange] i/o timeout Feb 19 14:11:06 exchange: [exchange] i/o timeout Feb 19 14:11:06 exchange: [exchange] i/o timeout Feb 19 14:11:06 exchange: [exchange] i/o timeout Feb 19 14:11:06 exchange: [exchange] i/o timeout Feb 19 14:11:06 exchange: [exchange] Exchange success www.twitch.tv Ideas?
  • 72. Solving the Distribution Problem • These are extremely short and simple requests, but there are many of them • We aren’t using HTTP keepalives www.twitch.tv
  • 73. Solving the Distribution Problem Go makes this extremely simple www.twitch.tv
  • 74. Lessons Learned • Better logs/instrumentation to make debugging easier • Generate fake traffic to force us to handle more load than we need • Make sure we use and configure our infrastructure wisely! www.twitch.tv
  • 75. What have we done since? • We now use AWS servers exclusively • Better Redis infrastructure • Python -> Go • Lots of other big and small changes to support new products and better QoS www.twitch.tv
  • 77. Video Architecture www.twitch.tv PoP Origin PoP PoP Ingest Distribution Broadcaster Viewers Replication Hierarchy
  • 83. Video Distribution www.twitch.tv PoP Video Edge Protected Replication HLS Proxy HLS Cache HLS Cache Find API Ingest Proxy Upstream PoP Protected Replication
  • 84. Video Distribution www.twitch.tv PoP Replication Hierarchy PoP Edge PR PoP Edge PR PoP Edge PR PoP Edge PR PoP Edge PR Transcoding Tier 1 Cache
  • 86. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/twitch- pokemon