Running a Megasite on Microsoft Technologies


MySpace and are two of the most-visited Web sites on the planet. Come to this session to hear about lessons learned using Microsoft technologies to run Web applications on a massive scale. Representatives from talk about lessons learned using an all-Microsoft datacenter. Representatives from MySpace talk about the realities of using Microsoft technologies in a scalable, federated environment using SQL Server 2005, .NET 2.0 and IIS 6 on Windows Server 2003 64-bit editions. This session features an open Q&A with a panel of technical managers and engineers from MySpace and

Published in: Technology

Running a Megasite on Microsoft Technologies

  1. Running A Megasite On Microsoft Technologies (NGW046)
       • Casey Jacobs, Director of Engineering
       • Aber Whitcomb, CTO
       • Chris St.Amand, Sr. System Engineer
       • Jim Benedetto, VP of Technology
  2. Agenda
       • Introduction – Quick Facts
       • – Growing Up
       • Upcoming Technology Enablers
       • Open Panel Discussion
  3. Introduction
  4. Brief History Of
       • Microsoft combines Web platform, ops, and content teams
       • Standardization effort begins, consolidation hosted systems
       • Focus on MSCOM Network Programming and campaign-to-Web integration
       • Single MSCOM group formed
       • Brand, content, site standards, privacy, brand compliance
       • Microsoft launches
       • Information & support publishing; hosting
       • Enable an innovative customer experience online & in-product
       • Product Info, Support, Dev / ITPro Experience, Customer Intelligence, Profile Mgmt & Enterprise Downloads
       • 1995: 30k users / day
       • 2001: 4M UUsers / day
       • 2003: 6.5M UUsers / day
       • 2006: 17.1M UUsers / day
  5. Quick Facts
       • Infrastructure and Application Footprint
       • 5 Internet Data Centers & 3 CDN Partnerships
       • 110 Web Sites, 1000s of Apps and 2138 Databases
       • 80+ Gigabit/sec Bandwidth
       • Solutions at High Scale
           • 13M UUsers/Day & 70M Page Views/Day
           • 10K Req/Sec, 300K Concurrent Connections on 80 Servers
           • 350 Vroots, 190 IIS Web Apps & 12 App Pools
       • Microsoft Update
           • 250M UScans/Day, 12K ASP.NET Req/Sec, 1.1M Concurrent
           • 28.2 Billion Downloads for CY 2005
           • Egress – MS, Akamai & Savvis (30-80+ Gbit/Sec)
  6. MySpace Company Overview
       • Launched Sept. 2003
       • Latest as of February 2006
           • 64+ MM Registered Users
           • 38 MM UUsers & 2.3M Concurrent
           • 260K New Registered Users/Day
           • 23 Billion Page* Views/Month
       • Demographics
           • 50.2% Female / 49.8% Male
           • Primary Age Demo: 14-34
       • Site Trends
           • 260K New Users/Day
           • 430M Total Images
           • Millions of Songs Streamed/Day
           • 1000s of New MP3s/Day
           • 20 Million Comments Posted
       • [Chart: Media Metrix February 2006 Audience Rankings, page views in thousands by Internet rank, for MySpace, Hotmail, Google, eBay, MSN, and Yahoo. Ranked values: #1 29,508; #2 23,566; #3 14,695; #4 9,632; #5 7,329; #6 6,812. Source: comScore Media Metrix, February 2006.]
  7. Quick Facts
       • Infrastructure and Application Footprint
       • 3 Internet Data Centers
       • Server Breakdown
           • 2682 Web and 650 Database Servers
           • 90 Cache Servers with 16 GB RAM
           • 650 Dart servers
           • 60 DB Servers
           • 150 Media servers
       • 3000 disks in SAN architecture
       • Egress Management
           • 17,000 mb/s bandwidth
           • 15,000 mb/s on CDN
  8. Growing up in the Internet World
  9. The beginning (0 users)
       • Two-tiered architecture
           • Single database
           • Load-balanced web servers
       • Great for rapid development
       • Less complexity means faster time to market and lower operational costs
       • Works for small to medium-sized websites, not big ones
  10. A single database is not enough (500k users)
       • Max out a single database
       • Split reads and writes across separate databases
       • Use transactional replication so multiple databases can service reads
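A minimal sketch of the read/write split described on this slide, assuming one writable primary and several read-only replicas kept up to date by replication. The class name, server names, and the SELECT-only rule are illustrative assumptions, not details from the talk.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and rotate reads across replicas.

    Hypothetical helper: the slides do not describe how MySpace
    routed queries, only that reads and writes were split.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # cycle() hands out replicas in round-robin order.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        # Only plain SELECT statements are treated as reads here.
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("primary", ["replica1", "replica2"])
targets = [
    router.route("SELECT * FROM users"),
    router.route("UPDATE users SET name = 'x'"),
    router.route("select 1"),
]
```

Because replication is asynchronous, a read sent to a replica may briefly lag behind the primary, so reads that must see a just-committed write would still need to go to the primary.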
  11. Vertical partitioning (1M users)
       • Transactional replication doesn’t work for all workloads and data types
       • Use a combination of vertical partitioning and replication
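Vertical partitioning here means giving each site feature its own database server. A minimal sketch of that lookup follows; the feature and server names are hypothetical and do not come from the talk.

```python
# Hypothetical feature-to-server map: one database server per site feature.
FEATURE_DATABASES = {
    "profiles": "sql-profiles",
    "mail": "sql-mail",
    "blogs": "sql-blogs",
}


def database_for_feature(feature):
    """Return the server that owns all data for the given feature."""
    try:
        return FEATURE_DATABASES[feature]
    except KeyError:
        raise LookupError(f"no database assigned to feature {feature!r}") from None


mail_server = database_for_feature("mail")
```

The lookup is trivial, which is the appeal of this stage: the cost shows up later, when a single feature outgrows its one server.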
  12. SAN (2M users)
       • Start to reconsider SCSI arrays for the long term
       • SCSI arrays have good performance but reliability issues
       • SANs provide better performance, uptime, and redundancy
       • Move to a CLARiiON and enjoy these benefits
  13. Horizontal partitioning (3M users)
       • Vertical partitions see performance problems
       • Decide we need to re-architect the database
       • Horizontal partitioning is the answer but is difficult to do while in production
  14. Horizontal Partitioning (3M users)
       • All features reside on a single database server
       • Data is partitioned by user ID
       • Some data cannot be partitioned, especially on a social networking site
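Partitioning by user ID can be sketched as a range lookup: every database shares the same schema and owns a contiguous range of user IDs. The range boundaries and server names below are made up for illustration; the talk does not give the actual ranges.

```python
import bisect

# Hypothetical layout: each shard owns user IDs up to and including its bound.
SHARD_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
SHARD_NAMES = ["userdb01", "userdb02", "userdb03"]


def shard_for_user(user_id):
    """Return the database that holds all data for the given user ID."""
    if user_id < 1:
        raise ValueError("user IDs start at 1")
    # bisect_left finds the first shard whose upper bound is >= user_id.
    index = bisect.bisect_left(SHARD_UPPER_BOUNDS, user_id)
    if index == len(SHARD_UPPER_BOUNDS):
        raise LookupError(f"no shard provisioned for user {user_id}")
    return SHARD_NAMES[index]


first = shard_for_user(1)
boundary = shard_for_user(1_000_000)
next_range = shard_for_user(1_000_001)
```

Adding capacity under this scheme means appending a new bound and server name for the next range of user IDs. Data that spans users, such as friend relationships, does not fit this lookup, which is the limitation the last bullet points at.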
  15. Network bottlenecks (5M users)
       • Various areas of the network become saturated
       • Gig uplinks are maxed out
           • Switch to Autonomous network and BGP
           • Get multiple gig links and 10G links
       • Load balancer is maxed out
           • “Must load balance the load balancers”
           • Use DNS
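One common way to "load balance the load balancers" with DNS is round-robin: the name resolves to several load balancer addresses, and the order rotates on each answer. The sketch below simulates that rotation in memory. It is an assumption about the general technique, not a description of MySpace's DNS setup, and the host name and addresses are reserved documentation values.

```python
from collections import deque


class RoundRobinDns:
    """Toy resolver that rotates the address list on every answer."""

    def __init__(self, records):
        self._records = {name: deque(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        addrs = self._records[name]
        answer = list(addrs)
        # Move the first address to the back so the next client sees a new order.
        addrs.rotate(-1)
        return answer


dns = RoundRobinDns({"www.example.test": ["192.0.2.10", "192.0.2.20"]})
first_answer = dns.resolve("www.example.test")
second_answer = dns.resolve("www.example.test")
```

Clients that pick the first address in each answer end up spread across the load balancers, at the cost of coarse balancing and slow failover while cached answers expire.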
  16. Site dependencies (7M users)
       • Separating features on the front end isolates potential bottlenecks
       • Using subdomains is the easiest way
  17. Scalable storage (10M users)
       • Trying to partition storage on the backend is time-consuming and inefficient
       • Maxing out SANs is very costly
       • We realize scalable storage is key
  18. DBs versus Caching (15M users)
       • Databases still having perf issues
           • Databases are expensive
           • Have a lot of transactional overhead
       • Caching tier
           • High-speed cache is perfect for reads
           • LRU algorithm is self-managing
           • Drastically reduces database load
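The "self-managing" property of an LRU cache can be shown in a few lines: when the cache is full, the entry that has gone unused the longest is dropped automatically. This is a minimal in-process sketch built on Python's `OrderedDict`; the class and the `load` callback are hypothetical stand-ins for a cache tier and a database read.

```python
from collections import OrderedDict


class LRUCache:
    """Read-through cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, load):
        if key in self._items:
            # Cache hit: mark the entry as most recently used.
            self._items.move_to_end(key)
            return self._items[key]
        # Cache miss: fall back to the slower source (e.g. the database).
        value = load(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Drop the oldest entry; no manual expiry bookkeeping is needed.
            self._items.popitem(last=False)
        return value


database_reads = []


def load_from_database(key):
    database_reads.append(key)
    return key.upper()


cache = LRUCache(capacity=2)
for key in ["a", "b", "a", "c", "b"]:
    cache.get(key, load_from_database)
```

In this run the second read of "a" never reaches the loader, while "b" is reloaded because it was evicted when "c" arrived. Repeated reads of hot keys are what take load off the database.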
  19. MySpace: Where we are today
  20. Upcoming Technology Enablers: What’s Next for and
  21. SQL Server 2005: Product technology enablers
       • Peer-To-Peer Replication
           • System & Data Center Autonomy
           • Zero “perceived” Application Downtime from Consumers
           • Eliminates Single Point of Failure for R/W Databases
       • Mirroring (SP1)
           • Targeting Replacement of Log Shipping Fail-Over pairs
           • 3 Systems in TAP Program (Technet, Learning & Genuine)
           • Reduced Failover Downtime
               • Log Shipping: 5-15 min avg
               • Mirroring: < 1 min (planned)
       • Table Partitioning
           • Reduced Storage Costs
           • Scale Up at Lower Costs
  22. MySpace Scaling SQL Server
       • V1: Single Instance (< 1 Million Users)
           • Single SQL Server instance supports all users and features
       • V2: Single Instance Replicating to Read-Only Full Copies (< 2 Million Users)
           • Single server handles all write transactions; read transactions are spread across multiple transactional replication copies
       • V3: Vertical Partitioning (< 4 Million Users)
           • Each feature/page of the site on its own SQL Server
  23. MySpace Scaling SQL Server
       • V4: Horizontal Partitioning (< 8 Million Users)
           • All features/pages brought back to single database schema
           • Standard schema across all databases
           • User ranges partitioned across databases
       • V5: Horizontally Partitioned Core with Replicated Content, Vertically Partitioned Features Databases, “Shared Content” Databases (> 8 Million Users)
           • Primary MySpace schema exists across large farm of servers
           • Small amounts of content replicated to all horizontally partitioned servers to allow for features spanning all user ranges
       • V6: Migration to SQL Server 2005 (> 26 Million Users)
  24. SQL Server 2005 64-bit
       • Memory pressure under the 4 GB 32-bit limit
           • Servers loaded with 32 GB of RAM
           • < 4 GB addressable to the memory pools we were stressing
       • Manifestations
           • Connection timeouts
           • Servers going “dark”, requiring restart
           • Rejected connections
       • Problem eliminated on 64-bit architecture
           • Connection/sort memory pools now able to address all 32 GB of RAM
  25. Virtualizing Storage
       • What is it?
           • Software layer between your disks & hosts
       • Advantages
           • Provisioning is very simple, makes capacity planning more predictable
           • Much better performance
           • Can easily add more capacity to a LUN
       • What do we use?
           • 3PAR
           • 14-week bake-off
  26. Longhorn And IIS 7.0: Product technology enablers
       • UNC Content Store
           • Simplified Content Mgmt
           • Reduced Disk Footprint
       • File Replication (DC to DC)
           • Latent/Long links improved 80X (10Mbps vs 850Mbps)
           • Enabler of Geo-Hosting Options
       • Centralized IIS Configs
           • Copy “Host-Host” capability
           • Eliminate complex scripting of metabase & configs
       • Dynamic Content Compression
           • Further reduced Egress
           • Improved Web Perf Delivery
  27. IIS 7.0 Failed Request Tracing
  28. Geo-Targeting Solutions: Demographic management
       • Objective – Enable Targeted Release of Apps and Content
       • Avoid demographic support spikes and further align to marketing campaigns
       • Sensitivity to Time/Frequency of customer online experiences
       • Improve ability to reach last 30% of client population
  29. Open Panel Discussion
  30. © 2006 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.