Running a Megasite on Microsoft Technologies - Presentation Transcript
Running A Megasite On Microsoft Technologies (session NGW046)
Casey Jacobs, Director of Engineering, Microsoft.com
Aber Whitcomb, CTO, MySpace.com
Chris St.Amand, Sr. System Engineer, Microsoft.com
Jim Benedetto, VP of Technology, MySpace.com
Agenda
Introduction – Quick Facts
MySpace.com – Growing Up
Upcoming Technology Enablers
Open Panel Discussion
Introduction
Brief History Of Microsoft.com
Microsoft combines Web platform, ops, and content teams
Standardization effort begins, consolidation hosted systems
Focus on MSCOM Network Programming and campaign-to-Web integration
Single MSCOM group formed
Brand, content, site std’s, Privacy, brand compliance
Microsoft launches www.microsoft.com
Information & support publishing; hosting
Enable an innovative customer experience online & in-product
Product Info, Support, Dev / ITPro Experience, Customer Intelligence, Profile Mgmt & Enterprise Downloads
1995: 30k users / day
2001: 4M UUsers / day
2003: 6.5M UUsers / day
2006: 17.1M UUsers / day
Microsoft.com Quick Facts
Infrastructure and Application Footprint
5 Internet Data Centers & 3 CDN Partnerships
110 Web Sites, 1000s of Apps and 2138 Databases
80+ Gigabit/sec Bandwidth
Solutions at High Scale
www.Microsoft.com
13M UUsers/Day & 70M Page Views/Day
10K Req/Sec, 300K Concurrent Connections on 80 Servers
350 Vroots, 190 IIS Web App’s & 12 App Pools
Microsoft Update
250M UScans/Day, 12K ASP.NET Req/Sec, 1.1M Concurrent
28.2 Billion Downloads for CY 2005
Egress – MS, Akamai & Savvis (30-80+ Gbit/Sec)
MySpace Company Overview
Launched Sept, 2003
Latest as of February 2006
64+ MM Registered Users
38 MM UUsers & 2.3M Concurrent
260K New Registered Users/Day
23 Billion Page Views/Month
Demographics
50.2% Female / 49.8% Male
Primary Age Demo: 14-34
Site Trends
260K New Users/Day
430M Total Images
Millions of Songs Streamed/Day
1000’s of New MP3’s/Day
20 Million Comments Posted
Media Metrix February 2006 Audience Rankings
Source: comScore Media Metrix, February 2006
[Chart: pageviews in ‘000s and Internet rank for MySpace, Hotmail, Google, Ebay, MSN and Yahoo. Values shown: 29,508 (#1), 23,566 (#2), 14,695 (#3), 9,632 (#4), 7,329 (#5), 6,812 (#6)]
MySpace.com Quick Facts
Infrastructure and Application Footprint
3 Internet Data Centers
Server Breakdown
2682 Web and 650 Database Servers
90 Cache Servers with 16 GB RAM
650 Dart servers
60 DB Servers
150 Media servers
3000 disks in SAN architecture
Egress Management
17,000 Mb/s bandwidth
15,000 Mb/s on CDN
MySpace.com Growing up in the Internet World
0 Users: The beginning
Two tiered architecture
Single Database
Load balanced web servers
Great for rapid development
Less complexity means faster time to market and lower operational costs
Works for small to medium sized websites, not big ones
500k Users: A single database is not enough
Max out a single database
Split reads and writes across separate databases
Use transactional replication so multiple databases can service reads
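The read/write split above can be sketched in a few lines. This is a minimal illustration, not MySpace's actual data-access code: the server names and the `ReadWriteRouter` class are hypothetical, and treating every non-SELECT statement as a write is a simplifying assumption.

```python
import itertools


class ReadWriteRouter:
    """Send writes to one primary and rotate reads across replicated copies."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no read-only copies are configured.
        self._read_cycle = itertools.cycle(replicas or [primary])

    def route(self, sql):
        # Assumption: anything that is not a SELECT counts as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_cycle) if is_read else self.primary


# Hypothetical server names, for illustration only.
router = ReadWriteRouter("db-primary", ["db-read-1", "db-read-2"])
print(router.route("SELECT * FROM users WHERE id = 7"))
print(router.route("UPDATE users SET name = 'x' WHERE id = 7"))
```

Because replication is asynchronous, a read routed to a replica may briefly lag behind the latest write; the sketch does not address that.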
1 Million Users: Vertical partitioning
Transactional replication doesn’t work for all workloads and data types
Use a combination of Vertical Partitioning and replication
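Vertical partitioning, as described above, gives each feature its own database server. A rough sketch of the lookup this implies, with hypothetical feature and server names that do not come from the talk:

```python
# Hypothetical feature-to-server map; the names are illustrative only.
FEATURE_DATABASES = {
    "profiles": "sql-profiles",
    "mail": "sql-mail",
    "blogs": "sql-blogs",
}


def database_for_feature(feature):
    """Return the server that owns all data for one site feature."""
    try:
        return FEATURE_DATABASES[feature]
    except KeyError:
        raise ValueError(f"no database assigned to feature {feature!r}") from None


print(database_for_feature("mail"))
```

The appeal is that a slow feature only hurts its own server. The cost is that a page touching several features has to talk to several servers.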
2 Million Users: SAN
Start to reconsider SCSI arrays for the long-term
SCSI arrays have good performance but reliability issues
SANs provide better performance, uptime, and redundancy
Move to a CLARiiON and enjoy these benefits
3 Million Users: Horizontal partitioning
Vertical Partitions see performance problems
Decide we need to re-architect the database
Horizontal partitioning is the answer but is difficult to do while in production
Horizontal Partitioning
All features reside on a single database server
Data is partitioned by user ID
Some data cannot be partitioned, especially on a social networking site
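Partitioning by user ID, as described above, comes down to mapping an ID to the database that holds its range. The sketch below assumes contiguous ranges of one million IDs per database; the range size, the server names and the `shard_for_user` helper are all hypothetical.

```python
import bisect

# Hypothetical layout: each database holds one million consecutive user IDs.
# Upper bounds are exclusive, so userdb-01 holds IDs 0 through 999,999.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
SHARDS = ["userdb-01", "userdb-02", "userdb-03"]


def shard_for_user(user_id):
    """Return the database that holds every feature's data for this user."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    index = bisect.bisect_right(RANGE_UPPER_BOUNDS, user_id)
    if index == len(SHARDS):
        raise ValueError(f"no database provisioned for user {user_id}")
    return SHARDS[index]


print(shard_for_user(999_999))
print(shard_for_user(1_000_000))
```

Adding capacity means appending a new bound and server rather than moving existing users. Data that spans users, such as a friends list, is the part that does not fit this scheme.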
5 Million Users: Network bottlenecks
Various areas of the network become saturated
Gig uplinks are maxed out
Switch to an autonomous network and BGP
Get multiple gig links and 10G links
Load balancer is maxed out
“Must load balance the load balancers”
Use DNS
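One common way to "load balance the load balancers" with DNS is round-robin: the name resolves to several load balancer addresses, and the order rotates on each response. The talk does not say which DNS technique was used, so the simulation below is only an assumption. The `RoundRobinDns` class is hypothetical and the addresses come from the reserved documentation range.

```python
from collections import deque


class RoundRobinDns:
    """Simulate a DNS server that rotates its list of A records per query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        answer = list(self._addresses)
        # Rotate so the next client sees a different first address.
        self._addresses.rotate(-1)
        return answer


# Hypothetical load balancer virtual IPs (RFC 5737 documentation range).
dns = RoundRobinDns(["192.0.2.10", "192.0.2.20", "192.0.2.30"])
for _ in range(3):
    print(dns.resolve()[0])
```

Clients usually connect to the first address returned, so traffic spreads roughly evenly, although resolver caching makes the spread approximate.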
7 Million Users: Site dependencies
Separating features on the front end isolates potential bottlenecks
Using subdomains is the easiest way
10 Million Users: Scalable storage
Trying to partition storage on the backend is time-consuming and inefficient
Maxing out SANs is very costly
We realize scalable storage is key
15 Million Users: Databases versus caching
Databases are still having performance issues
Databases are expensive
Have a lot of transactional overhead
Caching tier
High speed cache is perfect for reads
LRU algorithm is self-managing
Drastically reduces database load
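The caching tier described above can be sketched as a read-through cache with least-recently-used (LRU) eviction. This is a minimal single-process sketch, not the distributed cache MySpace ran; the `LruReadCache` class and the `load_from_db` callback are hypothetical.

```python
from collections import OrderedDict


class LruReadCache:
    """Read-through cache that evicts the least recently used entry when full."""

    def __init__(self, capacity, load_from_db):
        self.capacity = capacity
        self.load_from_db = load_from_db  # called only on a cache miss
        self._items = OrderedDict()

    def get(self, key):
        if key in self._items:
            # Hit: mark the entry as most recently used.
            self._items.move_to_end(key)
            return self._items[key]
        # Miss: go to the database once, then remember the result.
        value = self.load_from_db(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Drop the entry that has gone unused the longest.
            self._items.popitem(last=False)
        return value


db_reads = []


def fake_db(key):
    db_reads.append(key)
    return f"profile-{key}"


cache = LruReadCache(capacity=2, load_from_db=fake_db)
for user in [1, 2, 1, 3, 1, 2]:
    cache.get(user)
print(db_reads)  # [1, 2, 3, 2]
```

"Self-managing" here means nobody decides what to expire: hot keys stay because they keep getting touched, and cold keys fall out on their own.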
MySpace Where we are today
Upcoming Technology Enablers What’s Next for Microsoft.com and MySpace.com?
SQL Server 2005 Product technology enablers
Peer-To-Peer Replication
System & Data Center Autonomy
Zero “perceived” Application Downtime from Consumers
Eliminates Single Point of Failure for R/W Databases
Mirroring (SP1)
Targeting Replacement of Log Shipping Fail-Over pairs
3 Systems in TAP Program (Technet, Learning & Genuine)
Reduced Failover Downtime
Log Shipping: 5-15min Avg
Mirroring < 1min (planned)
Table Partitioning
Reduced Storage Costs
Scale Up at Lower Costs
MySpace Scaling SQL Server
V1: Single Instance – < 1 Million Users
Single SQL Server Instance Supports All Users and Features
V2: Single Instance Replicating to Read-Only Full Copies - < 2 Million Users
Single server handles all write transactions, read transactions spread across multiple transactional replication copies
V3: Vertical Partitioning - < 4 Million Users
Each Feature/Page of the site on its own SQL Server
V4: Horizontal Partitioning - < 8 Million Users
All features/pages brought back to single database schema
Standard schema across all databases
User ranges partitioned across databases
V5: Horizontally Partitioned Core with Replicated Content, Vertically Partitioned Features Databases, “Shared Content” Databases - > 8 Million Users
Primary MySpace schema exists across large farm of servers
Small amounts of content replicated to all horizontally partitioned servers to allow for features spanning all user ranges
V6: Migration to SQL Server 2005 - >26 Million Users
SQL Server 2005 64 bit
Memory pressure under the 4 GB 32-bit limit
Servers loaded with 32 GB of RAM
Less than 4 GB addressable by the memory pools we were stressing
Manifestations
Connection Timeouts
Servers going “dark”, requiring restart
Rejected Connections
Problem eliminated on 64-bit architecture
Connection/sort memory pools now able to address all 32 GB of RAM
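The 4 GB figure above follows directly from pointer width: a 32-bit address can name 2^32 distinct bytes. A quick arithmetic check, which assumes nothing about SQL Server internals beyond that limit:

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(pointer_bits):
    """Number of distinct byte addresses a pointer of this width can express."""
    return 2 ** pointer_bits


limit_32 = addressable_bytes(32)
installed = 32 * GIB

print(limit_32 // GIB)                      # 4
print(installed / limit_32)                 # 8.0
print(addressable_bytes(64) >= installed)   # True
```

So a pool confined to the 32-bit address space could reach at most one eighth of the installed 32 GB, while a 64-bit address space covers all of it.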
Virtualizing Storage
What is it?
Software layer between your disks & hosts
Advantages
Provisioning is very simple, makes capacity planning more predictable
Much better performance
Can easily add more capacity to a LUN
What do we use?
3PAR
14-week bake-off
Longhorn And IIS 7.0 Product technology enablers
UNC Content Store
Simplified Content Mgmt
Reduced Disk Footprint
File Replication (DC to DC)
Latent/Long links improved 80X (10Mbps vs 850Mbps)
Enabler of Geo-Hosting Options
Centralized IIS Config’s
Copy “Host-Host” capability
Eliminate complex scripting of meta-base & config’s
Dynamic Content Compression
Further reduced Egress
Improved Web Perf Delivery
IIS 7.0 Failed Request Tracing
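The egress saving claimed for dynamic content compression in the list above can be illustrated with gzip on a repetitive HTML page. This uses Python's standard `gzip` module as a stand-in, not the IIS compression module, and the sample page is invented.

```python
import gzip


def compression_ratio(body):
    """Return compressed size divided by original size for a response body."""
    return len(gzip.compress(body)) / len(body)


# Invented sample page: dynamic pages tend to repeat the same markup many times.
page = (
    "<html><body>"
    + "<div class='comment'>Thanks for the add!</div>" * 200
    + "</body></html>"
).encode("utf-8")

print(f"{len(page)} bytes before, {len(gzip.compress(page))} bytes after")
print(compression_ratio(page) < 1.0)
```

Real pages compress less dramatically than this artificial one, and the CPU spent compressing each response is the price paid for the lower bandwidth.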
Geo-Targeting Solutions Demographic management
Objective – Enable Targeted Release of App’s and Content
Avoid demographic support spikes and further align to marketing campaigns
MySpace and Microsoft.com are two of the most-visited Web sites on the planet. Come to this session to hear about lessons learned using Microsoft technologies to run Web applications on a massive scale. Representatives from Microsoft.com talk about lessons learned using an all-Microsoft datacenter. Representatives from MySpace talk about the realities of using Microsoft technologies in a scalable, federated environment using SQL Server 2005, .NET 2.0 and IIS 6 on Windows Server 2003 64-bit editions. This session features an open Q&A with a panel of technical managers and engineers from MySpace and Microsoft.com.