Yahoo Communities Architecture Unlikely Bedfellows

Published in: Technology

  1. Yahoo! Communities Architectures (Ian Flint, November 9, 2007)
  2. Agenda • What makes Yahoo! Yahoo!? • Hardware Infrastructure • Software Infrastructure • Operational Infrastructure • Process • Examples
  3. What makes Yahoo! Yahoo!? • What do these sites have in common? – Del.icio.us – Flickr – Yahoo! Groups – Yahoo! Mail – Bix
  4. What makes Yahoo! Yahoo!? • Accountability at the property level – Architecture – Application Operations – Infrastructure Decisions • Incubator Environment – Properties function independently on a common hardware platform – Highly cost-conscious – Open-source attitude
  5. What makes Yahoo! Yahoo!? • Standards at the infrastructure level – Hardware/Software platform – Configuration Management – Operational tools and best practices • Executive Involvement – Cost – Robustness – Redundancy
  6. Hardware Infrastructure: Common Platform
  7. Hardware Infrastructure • Shared Components – Network, Data Center, NAS – Centrally managed by infrastructure team • Load Balancing – DSR is preferred model – Proxy load balancing only where necessary
  8. Hardware Infrastructure • Hardware (x86, RAID/SCSI) – Jointly managed by properties and ops • Hardware Selection – Price/Performance is a constant consideration – Supply chain and provisioning cost – Reliability vs. Price • Single-Homed hosts (even databases) • Pooling across multiple switches • Fast Failover to mitigate risk of switch failure
  9. Hardware Infrastructure Example • Layered Infrastructure • Hosts distributed across multiple racks for power/network redundancy at the pool level • Really Big Load Balancers doing DSR • Diagram (layers, top to bottom): The Internet → Router → Load Balancer → Aggregation Switch → Rack Switches → Hosts
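
To make "redundancy at the pool level" concrete, here is a minimal sketch, assuming hosts are simply placed round-robin across racks. The host and rack names are hypothetical, and nothing here is taken from Yahoo!'s actual provisioning tooling; it only illustrates why spreading a pool across rack switches limits the damage from one switch failing.

```python
from collections import defaultdict


def spread_across_racks(hosts, racks):
    """Assign hosts to racks round-robin so each rack holds an even share."""
    placement = defaultdict(list)
    for i, host in enumerate(hosts):
        placement[racks[i % len(racks)]].append(host)
    return dict(placement)


def surviving_fraction(placement, failed_rack):
    """Fraction of the pool still reachable after one rack (switch) fails."""
    total = sum(len(members) for members in placement.values())
    lost = len(placement.get(failed_rack, []))
    return (total - lost) / total


# Hypothetical pool of six web hosts spread over three racks.
hosts = [f"web{n}" for n in range(1, 7)]
racks = ["rack-a", "rack-b", "rack-c"]
placement = spread_across_racks(hosts, racks)
```

With six hosts on three racks, losing any one rack switch removes two hosts and leaves two thirds of the pool serving traffic.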
  10. Software Infrastructure: Shared Repository
  11. Software Infrastructure • OS (FreeBSD, moving to RHEL) • Databases (MySQL, Oracle) • Development Platforms – PHP (most properties) – C/C++ (primary infrastructure platform) – Java – Python
  12. Software Infrastructure • Installable components – Managed through yinst package manager – Stored on common distribution server – Examples: yapache, yts, yfor, ymon, yiv, vespa
  13. Software Infrastructure • More about yinst – Robust Package Manager • Installation, Versioning, Scripting – Implementation • Software installed on distribution cluster (package repository) • Hosts then pull software (via proxies) • Software stored under a common root • Used for everything from perl modules to common components to applications
  14. Software Infrastructure • Shared Infrastructure enables rapid integration of acquisitions – SDS – UDB – YMDB – SSO • External Infrastructure – Akamai CDN and DNS – Gomez & Keynote
  15. Software Infrastructure - Bix • Global Server Load Balancing between sites • YTS provides Reverse Proxy and Connection Management • Yfor provides fast failover from colo to colo • Media is served via a content delivery network for performance and to reduce load on servers • Diagram: The Internet → Akamai (CDN) → YTS (Reverse Proxy) → Yfor (failover resolver) → Web Servers and Media Servers, with the same stack in both the Primary Colo and the Backup Colo
  16. Software Infrastructure - Bix • Yfor Failover Resolver used for fast failover of database connections • Dual Master MySQL setup for write hosts • Media storage on NetApp NAS device, with snap-mirroring to backup data center
  17. Software Infrastructure - Bix • Yapache reverse proxy in front of Tomcat instance • PHP used to access Yahoo shared services • Static files served from disk • Fairly standard Java environment (Spring, Hibernate, ehcache, c3p0, log4j, etc.) • Diagram: Web Server running Yapache (Yahoo-improved Apache) with Mod_proxy (Reverse Proxy), PHP (to Yahoo Shared Services) and Static Files; Tomcat running the Bix Application with Spring, Hibernate, Lucene, ehcache, Hessian, c3p0, log4j and MySQL Connector
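
The slides do not describe how Yfor works internally. As a minimal sketch of the general failover-resolver idea, assuming a priority-ordered list of targets and some external health check, a logical name resolves to the first target that is currently healthy. The endpoint names and the `is_healthy` function are hypothetical.

```python
def resolve(targets, is_healthy):
    """Return the first healthy target in priority order, or None if all are down."""
    for target in targets:
        if is_healthy(target):
            return target
    return None


# Hypothetical database endpoints: primary colo first, backup colo second.
DB_TARGETS = ["db-master-1.primary.example", "db-master-2.backup.example"]

# Stand-in for a real health check: targets listed here are treated as down.
down = set()


def is_healthy(target):
    return target not in down
```

While the primary is healthy, `resolve(DB_TARGETS, is_healthy)` returns it; once it is added to `down`, the same call returns the backup, which is the behavior clients need for fast failover of database connections.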
  18. Software Infrastructure - Groups • Inbound Groups mail hits a qmail cluster • Mail filtered against real-time blacklist • Mail forwarded to second qmail cluster • Proprietary anti-spam algorithms applied • Mail forwarded to group members • Mail stored on archive servers • Oracle RAC clusters store metadata • Periodic “Electric Potato” measures QoS
  19. Software Infrastructure – Groups • Dynamic content served via web pool running Python/C++ application • CSS and images served via a Squid-fronted pool • Group photos on Y! photos infrastructure backed by Yahoo! Media DB (YMDB) • Database feature implemented as Sleepycat DB hosted on message store • Calendar feature implemented via API calls to calendar.yahoo.com
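
The inbound mail flow for Groups can be read as a staged pipeline. The sketch below is only an illustration of that ordering, assuming a simple set-based blacklist and a caller-supplied spam predicate; the `Message` type, the `deliver` function, and all of the data are hypothetical, and the real anti-spam algorithms are proprietary and not shown in the slides.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender_ip: str
    group: str
    body: str


def deliver(msg, blacklist, looks_like_spam, members, archive):
    """Run one inbound message through the stages in order.

    Returns the recipients the message was forwarded to (empty if rejected).
    """
    # Stage 1 (first cluster): drop mail from blacklisted senders.
    if msg.sender_ip in blacklist:
        return []
    # Stage 2 (second cluster): apply the anti-spam check.
    if looks_like_spam(msg):
        return []
    # Stage 3: forward to the group's members.
    recipients = list(members.get(msg.group, []))
    # Stage 4: store the accepted message in the archive.
    archive.append(msg)
    return recipients
```

Rejecting at the blacklist stage before any content analysis keeps the cheaper check in front of the more expensive one, which matches the order given on the slide.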
  20. Operational Infrastructure: Managing the Platform
  21. Operational Infrastructure • Common Monitoring Infrastructure – Nagios • Main monitor for clusters • Numerous standard plugins • Standards/Best Practices around custom plugins – Ywatch • Basic monitoring of machines over SNMP • Heartbeats plus fundamental metrics (IO, CPU, Disk, etc.) – Ymon • NRPE/NSCA on steroids • Automated forwarding of active and passive checks • Scripted setup – Drraw • Data Visualization • Deep integration with Nagios and ymon
  22. Operational Infrastructure – Rollup Monitoring • Clusters rolled up to centralized monitoring console • Prioritization and correlation of events – Internal Site QOS Monitoring • QOS monitoring for sites • Response time and availability – “The OC” • 24x7, worldwide operations center • Provides tier 1 and 2 support – Centralized CMDB • Configuration Management DB – manages every device • Contact info, escalations, and runbooks included
  23. Operational Infrastructure Example • Application Servers perform checks which are registered by Nagios as passive checks • Metrics are aggregated by metrics module • On-demand graphing is provided by drraw • Nagios alerts are forwarded to central ywatch console • Diagram: Media Servers, Web Servers and DB Servers report to the ymon server (SNMP-Based Checks, ymond, metrics aggregator, yapache, nagios, grapher (rrdtool/drraw), forwarding agent), which forwards to the ywatch server (listener, yapache, Ops Console)
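
A passive check means the application server runs the check itself and reports the result to Nagios rather than being polled. The slides do not show how ymon transports these results, so the sketch below only formats a result in the standard Nagios external-command syntax (`PROCESS_SERVICE_CHECK_RESULT`); the host name, service name, and the `passive_result` helper are hypothetical.

```python
import time

# Standard Nagios plugin return codes.
STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def passive_result(host, service, state, output, now=None):
    """Format one passive service check result as a Nagios external command line."""
    timestamp = int(time.time() if now is None else now)
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{STATES[state]};{output}"
    )


# Hypothetical result reported by a web server about its own queue depth.
line = passive_result("web1", "app_queue", "WARNING", "queue depth 120", now=1194566400)
```

In a stock Nagios setup a line like this is written to the external command file or relayed through NSCA; how ymon does the forwarding is not specified in the deck.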
  24. Processes and Standards: Keeping it sane
  25. Process and Standards • Hardware Review Committee – Strong emphasis on economics – Personal attention from David Filo • Software Review Committee – Thinking through major licensing decisions • Business Continuity Planning – Required of all properties – Must have and test backup data center • Paranoids – Ongoing site scans – Enforcement of standards
  26. Questions?
