3. A brief Overview
• 560 Million Page Views a Month
• 34 TB of data transferred a month
• 1665 rps (2250 peak) across the web farm
• WISC(HER)
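To put the monthly totals above on the same per-second scale as the rps figure, here is a rough conversion. It is only a sketch: the 30-day month and the decimal terabyte (10**12 bytes) are assumptions, not figures from the talk.

```python
# Convert the monthly totals from the overview into per-second averages.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def per_second(monthly_total: float) -> float:
    """Average rate per second for a quantity reported per month."""
    return monthly_total / SECONDS_PER_MONTH


# 560 million page views a month
page_views_per_second = per_second(560_000_000)

# 34 TB a month, assuming 1 TB = 10**12 bytes, expressed in megabits per second
megabits_per_second = per_second(34e12) * 8 / 1e6

print(f"{page_views_per_second:.0f} page views/s on average")  # 216
print(f"{megabits_per_second:.0f} Mbit/s on average")          # 105
```

These are averages over the whole month, so they sit well below the peak numbers quoted on the slide.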
4. Our First Priority is Performance
Nobody likes a slow site, least of all us.
When your site is slow people leave.
Make your site fast, and the people will stay.
Good write up on moz.com:
http://moz.com/blog/site-speed-are-you-fast-does-it-matter
5. The Performance Toolkit
• Mini Profiler
• OpServer (https://github.com/opserver/Opserver)
• Client Timings (http://teststackoverflow.com/)
20. Tag Engine
• Our Special index of SO
• Tagging is hard
• Written by Marc Gravell
• http://blog.marcgravell.com/2014/04/technical-debt-case-study-tags.html
23. So what does this get you
• 52 ms homepage render time
• 33 ms questions page render time
24. Always See our Performance
• http://stackexchange.com/performance
25. Thank YOU!
Contact:
@GABeech
george@stackoverflow.com
Office Hours:
Wednesday, November 12th
(today…)
2:00pm - 3:30pm
LISA Lab
Editor's Notes
Windows
IIS
SQL Server
C#
HAProxy
Elasticsearch
Redis
Why do I bring up performance in an infra talk? Simple: it drives our design decisions.
Shown to every Dev/SRE on every page
Oneboxed in our chat system
Bubbles up problems
How well are we actually doing when _you_ load the page?
The actual design starts now.
4 Different providers
Selected for different characteristics
Router Redundancy Hot/Standby HSRP/BGP on “T2”
Full BGP tables and HSRP on T1
4B requests/month
3000 req/sec peak
10% CPU 18% peak
Between 600k and 700k concurrent connections (ESTABLISHED, TIME_WAIT, etc.)
Multiple processes allow for granular restarts and segregation of faults
SSL Termination done on the LB
Websockets: The weird connection
Long lived
TCP not HTTP
Request flow
Request comes in: is it HTTP? Yes: send it to the servers. No: terminate HTTPS first, after which it is HTTP.
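The request-flow note reads as a two-way branch on the incoming scheme. A minimal sketch of that decision, where the `Request` type and the step names are hypothetical and only illustrate the order of operations:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    scheme: str  # "http" or "https"
    path: str


def route(request: Request) -> List[str]:
    """Return the ordered steps the load balancer takes for one request."""
    steps = []
    if request.scheme != "http":
        # Not plain HTTP: terminate TLS on the load balancer first.
        steps.append("terminate_tls")
    # Either way, the request continues to the web servers as plain HTTP.
    steps.append("forward_to_servers_as_http")
    return steps


print(route(Request("http", "/questions")))
print(route(Request("https", "/questions")))
```

This is not HAProxy configuration; it only models the branch described in the note.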
Source Port Exhaustion
use 127.0.0.0/8 to resolve
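Some arithmetic shows why extra loopback addresses help here. A TCP connection is identified by source address, source port, destination address and destination port, so a single source address talking to one fixed destination can hold at most one connection per ephemeral port. The sketch below assumes the common Linux default ephemeral range of 32768 to 60999, and assumes the note means spreading source addresses across 127.0.0.0/8; neither detail is stated in the talk.

```python
import ipaddress


def max_connections(source_addresses: int,
                    port_low: int = 32768,
                    port_high: int = 60999) -> int:
    """Upper bound on concurrent connections to ONE destination ip:port.

    Each (source address, source port) pair can be used once per destination,
    so the bound is addresses * ports. The default range is an assumption
    (the usual Linux ip_local_port_range default).
    """
    ports = port_high - port_low + 1
    return source_addresses * ports


loopback = ipaddress.ip_network("127.0.0.0/8")

single_source = max_connections(1)
whole_loopback = max_connections(loopback.num_addresses)

print(f"one source address: {single_source:,} connections")
print(f"all of {loopback}: {whole_loopback:,} connections")
```

With 600k to 700k concurrent connections on the load balancer, a single source address would run out of ports long before the CPU became a problem, which is consistent with the low CPU figures in these notes.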
Server only running at ~12% cpu
We don’t run full SSL everywhere yet
185 req/s 250 peak
15% CPU usage 20% peak
(SO) 343 M Queries per day
(SO) Peak of 7500 queries / second
(SE) 216M Queries per day
(SE) Peak 3200 queries / second
CPU Use: SO 8% Peak 15% — SE 10% Peak 20%
3.65 B operations a day
Peak 60,000/s
3% cpu usage
3 Servers, 32 GB RAM
3644 req/s
3% CPU 10% peak
Replaced Full Text search in SQL Server
Spins up a full copy of SO/SE
Cool thing: it can be upgraded with zero downtime
2 others, not prod
Machine learning
Logstash (300 TB)
TeamCity monitors our development Git repository
Dev Auto builds (Deploy to Meta)
When the build is verified Dev triggers Prod Build
Copy Artifacts from Dev Build