Software architecture for high traffic website

Software architecture for high traffic websites

A case study introducing the architecture of a high-traffic site: stackoverflow.com, a very well-known programming Q&A site.

Presented by Ngô Xuân Hòa at Meetup 4 of the Ha Noi .NET Group.

For details, please see: http://tungnt.net

Published in: Technology

  1. Software architecture for high traffic website. Case study: Stack Overflow. Presenter: Ngô Xuân Hòa (Novaon Adnetwork - Novanet), Hanoi .NET Meetup
  2. Contents ● About Stack Overflow: Founders, Principles ● SO architecture: Beginning, Restructure #1, Restructure #2 ● Open-source libs: StackExchange.Redis, Dapper, Jil
  3. About Stack Overflow
  4. Founders: Jeff Atwood, Joel Spolsky
  5. Timeline, 2008 to 2011: Stack Overflow (2008), Server Fault, Stack Exchange 1.0, Stack Exchange 2.0, Stack Overflow Careers. Rome wasn't built in a day!
  6. ● 100+ Q&A sites ● 600+ million pageviews a month ● 3000+ requests per second ● 16+ million users ● 8+ million questions ● 40+ million answers
  7. Principles ● Performance Is a Feature ● Cache All The Things! ● Reinvention is OK
  8. Stack Overflow Architecture
  9. Two restructurings ● Stack Exchange 1.0: ASP.NET MVC, SQL Server, LINQ to SQL, Wikipedia DB design ● Stack Exchange Network: LINQ to SQL, HAProxy, Redis, Lucene.NET ● Scale Up: cache everything, Elastic Search, reinvention
  10. Stack Exchange 1.0 structure ● Load balancing: Windows NLB ● Web servers: IIS Server, IIS Server ● Database: SQL Server
  11. Windows NLB ● Cons: limited to 8 nodes; cannot detect a failed service. Web tier ● ASP.NET MVC ● LINQ to SQL. SQL Server ● All-in-memory ● Full-text search
  12. ● 16 million pageviews a month ● 3 million unique visitors a month ● 6 million visits a month
  13. Follow none but learn from everyone!
  14. Pros ● Simple. Cons ● Bottleneck: the SQL Server database ● High cost to scale up
  15. Restructure #1 - Stack Exchange Network: HAProxy, Redis Cache, Lucene.NET, Tag Engine
  16. Stack Exchange Network structure: HAProxy, IIS servers, Redis, database. Connection labels in the diagram: http, http, protobuf, sql
  17. Load balancing ● HAProxy: runs on Linux; free. Web tier ● ASP.NET MVC 3 ● LINQ to SQL ● jQuery 1.4.5 ● Lucene.Net. Redis ● In-memory cache ● Master-slave ● Messaging notifications
  18. Three cache types. Local cache ● Uses HttpRuntime.Cache ● Caches: user sessions, view counts, ... Site cache ● Uses Redis ● Caches each site's data: Q&As, acceptance rates, ... Global cache ● Uses Redis ● Caches system data: user info, inbox, ...
  19. Update cache flow - local cache. Diagram nodes: Local Cache, Redis, DB, Other sites. 1 - On startup, subscribe to invalidation messages from Redis. 2.1 - Data is changed (by other sites, apps…). 2.2 - A message is sent to Redis. 3 - Redis sends a notification to the subscribers. 4 - Get the data from the DB and update the local cache
  20. Deployment flow with HAProxy ● Tell HAProxy to take the server out of rotation via a POST ● Delay to let IIS finish current requests (~5 sec) ● Stop the website ● Copy files ● Start the website ● Local testing, update local cache, etc… ● Re-enable the server in HAProxy via another POST
  21. Pros ● High performance ● Low-cost load balancing (using HAProxy) ● Uses Redis messaging for cache invalidation. Cons ● Too many SQL queries
  22. ● 95 million pageviews a month ● 800 requests per second ● 16 million users
  23. Restructure #2 - Scale Up: Cache All the Things, Elastic Search, Reinvention
  24. Stack Exchange Network structure: Elastic Search, Tag Engine, databases, Redis, HAProxy
  25. Five cache levels: network level, local cache, Redis cache, SQL Server cache, SSD ● Network level: browser cache… ● Local cache: HttpRuntime.Cache, caches all data in memory ● Redis cache: caches all data ● SQL Server cache: caches all data in memory (the database servers have 384 GB of RAM)
  26. Cache flow ● Check the local cache ● Else, check the Redis cache and update the local cache ● If the Redis cache doesn't have the data, fetch it from the databases, then update the Redis cache and the local cache
  27. Cache All the Things!
  28. Pros ● Very, very fast (<400 ms) ● Low server load: IIS 10-15% CPU usage, DB 10% CPU usage ● 99% of requests served by cache. Cons ● Data has latency
  29. ● 95 million pageviews a month ● 800 requests per second ● 16 million users
  30. Open-source libs • StackExchange.Redis - high-performance Redis client • Dapper - a micro ORM, very fast • Jil - fast JSON serializer. Reinvention is OK!
  31. Reference sources ● http://stackoverflow.com ● http://highscalability.com ● http://codinghorror.com ● http://www.joelonsoftware.com ● http://nickcraver.com ● http://josephwoodward.co.uk/2014/02/the-architecture-of-stackoverflow/
  32. Thank you! Ngô Xuân Hòa xuanhoa862001@gmail.com
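
The local-cache update flow on slide 19 can be sketched roughly as follows. This is an illustration only, written in Python rather than the .NET stack the slides describe. `MessageBus` is a hypothetical in-process stand-in for Redis pub/sub, and the channel name `invalidate`, the `LocalCache` class and the key `question:1` are all assumptions made for the example, not names from the talk.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


class MessageBus:
    """Hypothetical in-process stand-in for Redis pub/sub (not a Redis client)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: str) -> None:
        # Step 3: the bus notifies every subscriber of the channel.
        for handler in list(self._subscribers[channel]):
            handler(message)


class LocalCache:
    """Per-web-server cache that refreshes itself on invalidation messages."""

    def __init__(self, bus: MessageBus, database: Dict[str, str]) -> None:
        self._database = database
        self._data: Dict[str, str] = {}
        # Step 1: on startup, subscribe to invalidation messages.
        bus.subscribe("invalidate", self._on_invalidate)

    def get(self, key: str) -> str:
        if key not in self._data:
            self._data[key] = self._database[key]
        return self._data[key]

    def _on_invalidate(self, key: str) -> None:
        # Step 4: re-read the changed key from the DB and update the local cache.
        if key in self._database:
            self._data[key] = self._database[key]
        else:
            self._data.pop(key, None)


def demo_invalidation() -> Tuple[Tuple[str, str], Tuple[str, str]]:
    database = {"question:1": "How do I parse JSON?"}
    bus = MessageBus()
    web1 = LocalCache(bus, database)
    web2 = LocalCache(bus, database)
    before = (web1.get("question:1"), web2.get("question:1"))
    # Step 2.1: some other site or app changes the data.
    database["question:1"] = "How do I parse JSON in C#?"
    # Step 2.2: the writer sends an invalidation message.
    bus.publish("invalidate", "question:1")
    after = (web1.get("question:1"), web2.get("question:1"))
    return before, after
```

Without the publish call in step 2.2, both web servers would keep serving the old value from their local caches, which is the staleness this flow is meant to bound.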
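
The deployment flow on slide 20 could look something like the sketch below. It is a simulation under stated assumptions: `FakeLoadBalancer` is a hypothetical stand-in for the HAProxy POST calls, the log messages replace the real stop, copy and start actions, and leaving a server out of rotation when the local test fails is an assumed policy that the slides do not state.

```python
import time
from typing import Callable, Iterable, List, Set


class FakeLoadBalancer:
    """Hypothetical stand-in for HAProxy; real code would issue the POST requests."""

    def __init__(self, servers: Iterable[str]) -> None:
        self.in_rotation: Set[str] = set(servers)

    def disable(self, server: str) -> None:
        self.in_rotation.discard(server)

    def enable(self, server: str) -> None:
        self.in_rotation.add(server)


def deploy_one(
    server: str,
    balancer: FakeLoadBalancer,
    log: List[str],
    drain_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    local_test: Callable[[str], bool] = lambda server: True,
) -> None:
    """Deploy to one server while it is out of rotation."""
    balancer.disable(server)            # take the server out of rotation
    log.append(f"{server}: out of rotation")
    sleep(drain_seconds)                # let in-flight requests finish (~5 sec)
    log.append(f"{server}: website stopped")
    log.append(f"{server}: files copied")
    log.append(f"{server}: website started")
    if not local_test(server):          # local testing, cache warm-up, etc.
        # Assumption: a failing server stays out of rotation.
        raise RuntimeError(f"{server} failed local testing")
    balancer.enable(server)             # put the server back in rotation
    log.append(f"{server}: back in rotation")
```

Running `deploy_one` over the servers one at a time keeps the rest of the pool serving traffic during the rollout.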
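
The cache flow on slide 26 amounts to a read-through lookup across two cache layers. Here is a minimal sketch, assuming plain dictionaries in place of HttpRuntime.Cache, Redis and the SQL Server databases. The `WebServerCache` class and the `database_reads` counter are hypothetical names introduced for the illustration.

```python
from typing import Dict, Tuple


class WebServerCache:
    """Read-through lookup: local cache, then shared (Redis-like) cache, then database."""

    def __init__(self, redis_cache: Dict[str, str], database: Dict[str, str]) -> None:
        self.local_cache: Dict[str, str] = {}   # stand-in for HttpRuntime.Cache
        self.redis_cache = redis_cache          # shared by all web servers
        self.database = database
        self.database_reads = 0                 # counts how often this server hit the DB

    def get(self, key: str) -> str:
        # 1. Check the local cache.
        if key in self.local_cache:
            return self.local_cache[key]
        # 2. Else check the shared cache and update the local cache.
        if key in self.redis_cache:
            value = self.redis_cache[key]
            self.local_cache[key] = value
            return value
        # 3. Else fetch from the database, then update both caches.
        self.database_reads += 1
        value = self.database[key]
        self.redis_cache[key] = value
        self.local_cache[key] = value
        return value


def demo_cache_flow() -> Tuple[str, str, int, int]:
    database = {"user:42": "Jon"}
    shared: Dict[str, str] = {}
    web1 = WebServerCache(shared, database)
    web2 = WebServerCache(shared, database)
    first = web1.get("user:42")    # miss everywhere: one database read
    second = web2.get("user:42")   # served from the shared cache
    web1.get("user:42")            # served from web1's local cache
    return first, second, web1.database_reads, web2.database_reads
```

In the demo only the first lookup reaches the database. The second server is served from the shared layer, which is how a high cache hit rate keeps database load low.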
