Building a platform from symfony at Yahoo!
Presented at symfonyLive 2010 in Paris, France by Dustin Whittle.
Join us for a case study on using open source tools to build a platform for enterprise web applications with symfony. The focus of this session will be on how Yahoo! has built web applications that scale with symfony. Find out what worked and what didn't when building scalable web applications with the symfony framework.
* Why symfony?
* symfony vs ysymfony
* Social Search: Delicious and Answers
* YOS: Developer Tool & Application Platform
* Internal Tools: Customer Care + Dashboards
* The Platform + Components
* Yahoo! symfony Plugins
* Developer Tools - YUI3, YQL, Design Patterns, etc
3. Overview
* Why symfony?
* symfony vs. ysymfony
* What does scaling really mean?
* Social Search: Delicious and Answers
* International: ShopGenie.co.uk, FoxyTunes, Wretch.cc
* Yahoo! Open Strategy
  * What is the Yahoo! Open Stack?
  * Developer Tools: YUI, Design Patterns, Tutorials
  * Data & Social APIs
    * YQL: Yahoo! Query Language
    * Profiles, Connections, Updates, …
    * Geo, Flickr, Delicious, Upcoming
  * YOS SDK for PHP
4. Who am I?
* Working with symfony since open source
* symfony Core Team Member
* Responsible for the development and support of ysymfony at Yahoo!
* Worked with Y! Answers, Delicious, Y! Widgets, Y! Bookmarks, Yahoo! Application Platform
* Developer advocacy for Yahoo! Developer Network
* Consultant
  * Commercial symfony support + training (USA)
8. YAHOO! gives back to open source
YUI | BROWSER PLUS | DESIGN PATTERNS | R3 | YSLOW + PERFORMANCE RULES
9. YAHOO! SHARES ITS DATA THROUGH OPEN APIs AND WEB SERVICES
YQL | PIPES | BOSS | CONTACTS | UPDATES | MAIL | DELICIOUS | FLICKR | UPCOMING | HOTJOBS | MAPS | FIREEAGLE | GEOLOCATION | LOCAL | TRAFFIC | WEATHER | MUSIC | ANSWERS | SHOPPING | FINANCE | TRAVEL
10. YAHOO! ENGAGES COMMUNITIES WITH OPEN HACK EVENTS around the world
Conferences | Hack Days | HackU | Tech Talks | YDN Theater
12. Architecture diagram: Users -> Load Balancers -> Frontend -> Backend
* Frontend: Linux; ysymfony / YUI; Apache + Custom Modules; PHP + APC, PEAR, PECL, Custom Extensions
* Backend: User API; MySQL/Memcache; Web Services; Ad API
13. Why a frontend platform?
* Rasmus says “frameworks are not well suited for Yahoo!”
  * Build applications to requirements
  * Do exactly what you need: no more, no less
  * Understand that frameworks add a lot of overhead
  * Choosing functional components is a better fit
* Whether you choose open source or build your own, everyone uses a framework
  * If you use open source, use only the pieces you need
14. Y! needs from a frontend platform
* Fit existing environment (RHEL/PHP5/Apache)
* Development Cycle: How easy to develop, test, and deploy?
* Clean separation between data, logic, and display (MVC)
* Independent model layer to fit service oriented architecture
* Extensible and pluggable
* Internationalization and localization support
* Detailed documentation and active community of support
* Open source and ability to contribute back
15. Why a framework at all?
* Another software layer (ysymfony, yphp, yapache)
* Factors out common patterns
  * Code Layout
  * Configuration
  * URL Routing
  * Authentication / Security
  * Form Validation / Repopulation
  * Internationalization / Localization
* Encourages good design
  * Abstraction > Consistency > Maintainability
16. The choice to adopt symfony?
* Philosophy
  * Full-stack framework for building complex web applications
  * Adopt best ideas from anywhere, using existing code if available (Mojavi, Prado, Rails, Django)
* Design
  * Clean separation between Model, View, and Controller
  * Controller using modules and actions
  * Views using templates in straight PHP with helpers
  * Easy to reuse view modules to compose a page
    * Layouts, Components, Partials, Slots
17. The choice to use symfony
* Configurability / Flexibility
  * Features we do not want are easily disabled
  * Use of factories for easy customization
* Documentation / Support
* Community
  * The Definitive Guide to symfony
  * Askeet, Jobeet, Cookbooks, Advents
  * Active community with wiki, mailing lists, forums, irc channel
18. Why ysymfony for Yahoo! teams?
* Eliminate common patterns by adding a layer on PHP
  * Code layout/structure (MVC)
  * Configuration
  * Internationalization
* ysymfony is just a toolkit
  * Learn one set of tools
  * Shift between multiple projects
* Consistency
  * Long term maintainability through platform
24. A look at Yahoo! Answers
* http://answers.yahoo.com
* Yahoo! Answers is the largest collection of human knowledge on the Web with more than 135 million users and 515 million answers worldwide (Yahoo! Internal Data, March 2008).
* Yahoo! Answers is the 2nd ranked education & reference site on the web (comScore)
* Available in 26 markets and 12 languages
25. Yahoo! Answers at the beginning
* Started as a small development team on PHP4 from a fork of Yahoo! Taiwan Knowledge+
* Launched December 2005; by December 2006 there were 60 million users and 65 million answers
* The code base eventually became difficult to maintain and to extend with new features
* Large distributed development teams (US / UK)
32. The big picture
* A complete platform for building web applications from frameworks
  * PHP Framework
  * JavaScript Framework
  * CSS Framework
  * UI Design Patterns + Best Practices
  * Development Tools (logger, profiler, debugger, docs)
  * Unit + Functional Testing Frameworks (LIME / YUI Test)
  * Deployment Tools (rsync deployment system)
33. What does Yahoo! change?
* Minor changes to fit our environment
* Most changes are easily implemented via factories/plugins
* Dropped the ORM and pushed down the stack (SOA)
* Added a parallel API Dispatcher (ysfAPIClientPlugin)
* Added dimensions to configurations (ysfDimensionsPlugin)
* Integrated R3 translation/template management (ysfR3Plugin)
  * R3 - http://developer.yahoo.com/r3/
* Created a build and deployment solution (ysfBuildPlugin)
  * Uses internal tools for packaging/deployment
* Integrated support for Y! User Interface libraries (ysfYUIPlugin)
34. Propel or Doctrine or ???
* No ORM for large projects
* Doctrine for medium sized projects
* Service Oriented Architecture
  * Platforms/Backends as services (reusable to all)
  * Thin Controller/Fat Model (where model == services)
* Use PHP as the frontend glue
  * No heavy lifting in PHP = Push down the stack
  * Java/C++/Erlang + JSON/XML
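The "frontend as glue" idea on this slide, where the frontend only fans out to backend services and assembles the results, can be illustrated with a small sketch. This is not the ysfAPIClientPlugin API; it is a minimal Python illustration, and `fetch_profile`, `fetch_updates`, and `dispatch_parallel` are hypothetical stand-ins for real service calls.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_profile(user_id):
    # Hypothetical stand-in for a call to a backend profile service.
    return {"user_id": user_id, "name": "example"}


def fetch_updates(user_id):
    # Hypothetical stand-in for a call to a backend updates service.
    return [{"user_id": user_id, "text": "hello"}]


def dispatch_parallel(calls):
    """Run independent backend calls concurrently and collect results by name.

    `calls` maps a result name to a (function, args) pair.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        futures = {
            name: pool.submit(fn, *args) for name, (fn, args) in calls.items()
        }
        # result() blocks until each call finishes and re-raises its errors.
        return {name: future.result() for name, future in futures.items()}


results = dispatch_parallel({
    "profile": (fetch_profile, (42,)),
    "updates": (fetch_updates, (42,)),
})
```

With real network calls, the total wait is roughly that of the slowest service rather than the sum of all of them, which is the reason to dispatch in parallel.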
35. Localizing with dimensions
* Cascading Configuration based on YAML
  * Framework -> Project -> Application -> Module
* Extending the cascade to be based on dimensions
* Dimensions can be anything (and can be chained together)
  * Data Center + Environment for customizing configurations
  * Culture for localizing user interface + data
  * Theme for customizing look and feel
  * User info (is user on corporate intranet?)
  * Caching
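The cascade described on this slide can be sketched as a series of configuration layers, each tagged with the dimension values it applies to and merged from general to specific. This is only an illustration of the idea, not how ysfDimensionsPlugin is implemented; the layer format, the setting names, and the functions `deep_merge` and `resolve_config` are assumptions made for the example.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a copy of `base` with `override` merged in, recursing into dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(layers, dimensions):
    """Merge every layer whose selector matches the active dimensions.

    `layers` is ordered from most general to most specific; each entry is a
    (selector, settings) pair. An empty selector always matches.
    """
    config = {}
    for selector, settings in layers:
        if all(dimensions.get(name) == value for name, value in selector.items()):
            config = deep_merge(config, settings)
    return config


# Hypothetical layers: defaults, then per-environment, per-culture, and chained.
LAYERS = [
    ({}, {"cache": {"enabled": False, "lifetime": 60}, "theme": "default"}),
    ({"environment": "production"}, {"cache": {"enabled": True}}),
    ({"culture": "fr_FR"}, {"theme": "france"}),
    ({"environment": "production", "culture": "fr_FR"}, {"cache": {"lifetime": 300}}),
]

config = resolve_config(LAYERS, {"environment": "production", "culture": "fr_FR"})
```

Because the resolved result depends only on the layers and the active dimension values, it can be computed once per combination and cached, which matches the "Caching" item on the slide.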
37. A build and deployment system
* Aggregate and minify stylesheets and javascripts
* Rewrite templates, css, js for CDN (Akamai, S3, …)
* Generate translations for configurations + templates
* Generate configuration cache
* Aggregate core classes + remove debug statements
* Run lint, unit, functional tests
* Package applications as .tgz
* Deployment via packages
  * Integrated with Yahoo! internal tools
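The first two build steps (aggregating assets and rewriting references for a CDN) can be sketched as follows. This is not ysfBuildPlugin; the function names, the content-hash naming scheme, and the host `cdn.example.com` are assumptions for illustration, and the comment stripping is a crude stand-in for a real minifier.

```python
import hashlib
import re


def aggregate(sources):
    """Concatenate asset sources, dropping /* ... */ comments and blank lines."""
    combined = "\n".join(sources)
    combined = re.sub(r"/\*.*?\*/", "", combined, flags=re.DOTALL)
    lines = [line.strip() for line in combined.splitlines()]
    return "\n".join(line for line in lines if line)


def versioned_name(basename, content):
    """Embed a short content hash in the file name so CDN caches never go stale."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()[:8]
    stem, _, ext = basename.rpartition(".")
    return f"{stem}.{digest}.{ext}"


def rewrite_for_cdn(template, asset_map, cdn_host):
    """Replace local asset paths in a template with their CDN URLs."""
    for local_path, cdn_name in asset_map.items():
        template = template.replace(local_path, f"{cdn_host}/{cdn_name}")
    return template


css = aggregate(["/* reset */\nbody { margin: 0; }\n", "\n.h1 { color: red; }\n"])
name = versioned_name("site.css", css)
html = rewrite_for_cdn(
    '<link href="/css/site.css">',
    {"/css/site.css": name},
    "https://cdn.example.com",
)
```

Naming the file after its content means a changed stylesheet gets a new URL, so the CDN copy can be cached indefinitely.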
38. Hello World Performance
* Hello world benchmarks are generally not a useful measurement
  * You don’t use a framework to write hello world
  * die('hello world');
* Performance is relative to the features you use
  * ORM
  * I18N
  * Output Escaping
  * Forms/Validation
* Ability to scale != performance
39. What does it mean to scale?
* “A system whose performance improves after adding hardware, proportionally to the capacity added, is said to be a scalable system.”
* High Availability + Scalability + Performance
* Bigger dataset, more traffic, maintainable
* Not about performance
  * PHP is slow, but it is very rarely your bottleneck
  * Languages do not scale, architectures do
* Planning to grow and planning to fail
  * Capacity Planning
  * Business Continuity Planning
40. Scaling – Planning
* Planning hardware purchases and hosting options to have as much as you need without breaking your wallet
* Partitioning and distributing databases to support large datasets and simultaneous transactions
* Offline transactions and queuing
* Monitoring your applications to find and clear bottlenecks
* Providing service APIs and using services from other providers to increase your site's reach and capabilities
* Think Minimal, Plan to grow, Plan to fail
41. Scaling – The basics in PHP
* PHP is rarely the bottleneck (even though it can be slow)
  * “Most performance comes not from the language, but from application design” – Rasmus
* Share Nothing Architecture
  * Independent, self-sufficient, no single point of contention
  * No local storage = No PHP Sessions
    * Use a database (works for distributed)
    * Use a small signed cookie (ideal)
      * Important data in database
      * Individual expiration on session objects
      * Small data items
  * Use a distributed cache
    * Memcache
* Forget about small efficiencies
  * Premature optimization is the root of all evil.
  * " vs. ', echo vs. print
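The "small signed cookie" option can be sketched with an HMAC: the server signs the session payload, so any frontend machine can verify it without shared local storage. This is a minimal Python illustration rather than symfony or Yahoo! code; `SECRET_KEY`, `sign_cookie`, and `verify_cookie` are hypothetical names. The cookie is signed but not encrypted, so it should carry only small, non-secret data, and expiration is left out for brevity.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; keep it server-side


def sign_cookie(data, key=SECRET_KEY):
    """Serialize `data` and append an HMAC-SHA256 signature."""
    raw = json.dumps(data, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw)
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def verify_cookie(cookie, key=SECRET_KEY):
    """Return the original data, or None if the cookie was altered."""
    payload, separator, signature = cookie.rpartition(".")
    if not separator:
        return None
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = sign_cookie({"user_id": 42})
session = verify_cookie(cookie)
```

Every server that holds the key can validate the cookie independently, which is what makes this fit a share-nothing architecture.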
42. Scaling Databases – The basics
* Drop the ORM
  * It’s a choice of convenience over performance
* Master/Slave Replication
  * First steps
  * Helps with reads, writes are still bottleneck
* Partitioning
  * Segmenting data
* Sharding (horizontal partitioning)
  * Segmenting data onto different physical machines
  * Make problems smaller, easier to grow
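Sharding, as described on this slide, needs a rule that maps each record to one machine. A minimal sketch of the simplest such rule, hashing a key and taking it modulo the number of shards, is shown below. The shard names and the `shard_for` function are hypothetical, and the slides do not say which scheme Yahoo! uses.

```python
import hashlib

# Hypothetical shard identifiers; in practice these would be connection settings.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for(key, shards=SHARDS):
    """Pick a shard deterministically from a stable hash of the key."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


shard = shard_for("user:42")
```

The trade-off of plain modulo sharding is that changing the number of shards remaps most keys; consistent hashing or a lookup directory is the usual way to soften that.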
43. Improving latency with Caching
* Always use PHP opcode cache (APC, XCache, etc)
  * Use for routing and i18n cache
* Memcache (distributed cache)
  * Use for view cache
  * Distributed invalidation can be a pain
  * sfViewCacheManager makes this easy!
  * Be intelligent about cache_keys (uri, user, state)
* There is a fine line to caching
  * At what point do you spend more time managing the cache than reading from it?
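The advice to be intelligent about cache keys (uri, user, state) can be sketched as a key builder that includes exactly the inputs that change the rendered output. This is an illustration only, not sfViewCacheManager's key format; `view_cache_key` and its parameters are hypothetical.

```python
import hashlib


def view_cache_key(uri, user_id=None, state=None):
    """Build a deterministic cache key from the inputs that affect the view."""
    parts = [
        "uri=" + uri,
        "user=" + ("anonymous" if user_id is None else str(user_id)),
    ]
    # Sort state names so the same state always yields the same key.
    for name in sorted(state or {}):
        parts.append(f"state.{name}={state[name]}")
    raw = "|".join(parts)
    # Hashing keeps the key short and free of characters a cache may reject.
    return "view:" + hashlib.sha256(raw.encode("utf-8")).hexdigest()


key = view_cache_key("/questions/123", user_id=42, state={"culture": "fr_FR", "theme": "dark"})
```

Leaving the user out of the key for pages that look the same to everyone raises the hit rate; including it where output is personalized avoids serving one user's page to another.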
44. Tweaking Performance
* Do not use .htaccess (move to real apache config)
* Set a minimal include path
* Increase realpath_cache_size + realpath_cache_ttl
* Use apc.stat=0
* Don’t use features you do not need
  * Disable in settings.yml / factories.yml
* Use core_compile to aggregate classes to reduce file i/o
* Remove debug statements + optimize file lookups
  * sfOptimizerPlugin / project:optimize
* Use @routeName and use caching for factories
45. Symfony (v2)
* Symfony 2 is a set of cohesive yet decoupled components
  * This makes it much easier to use a single component to solve a single problem
  * Which makes it easier to build micro frameworks that solve very specific problems
* Yahoo! Teams generally prefer solutions that are specific to their exact problem
  * Selling the full stack can be difficult when a team only wants a few components
* Symfony 2 is the right direction, even if it breaks backwards compatibility
46. Do it yourself for cheap
* Open source software = Free
  * Apache
  * PHP
  * MySQL
  * Memcache / Perlbal / MogileFS / Squid / Gearman
  * symfony / Doctrine / Propel / Swift
  * Nagios / Cacti
* Amazon Shared Infrastructure = Cheap
  * EC2 Cloud
  * S3 Storage + CloudFront
  * SimpleDB
60. What is Yahoo! Developer Network?
* The Yahoo! Developer Network offers open source tools and open data APIs to make it easy for developers to build applications and mashups.
  * 50+ APIs / Web Services
  * Developer Dashboard to create/manage OAuth applications
  * Tutorials + Code Samples on using our APIs
  * Complete API Documentation
  * Yahoo! User Interface libraries
  * ASTRA Flash Components
  * Design Patterns Library
  * Evangelism: Conferences / Theater / Blogs / Events
63. Y! Developer Network – YUI CSS
* CSS Foundation
  * Reset - Neutralizes browser CSS styles
  * Base - Applies consistent style foundation
  * Fonts - Foundation for typography and font-sizing
  * Grids - Thousands of wireframe layouts
* User Interface Design Patterns Library
  * Proven solutions to common interfaces
  * http://developer.yahoo.com/ypatterns/
* Graded Browser Support / Progressive Enhancement
64. Y! Developer Network – Documentation
* More than 275 functional examples
  * http://developer.yahoo.com/yui/examples/
* YSlow + Performance Rules
  * http://developer.yahoo.com/performance
* YUI Blog
  * http://yuiblog.com/
* Mailing List @ Yahoo! Groups
  * http://tech.groups.yahoo.com/group/ydn-javascript/
66. Before YQL
* Thousands of web services that provide valuable data
* Require developers to read documentation and form URLs/queries.
* Data is isolated and can not be combined
* Needs combining, tweaking, shaping even after it gets to the developer.
67. Y! Open Stack – YQL
* SQL-Like Language
  * Synonymous with Data access
  * Familiar to developers
  * Expressive enough to get the right data
* Self Describing - show, desc table
* Allows you to query, filter and join data across Web Services.
69. YQL – Open Tables
* Twitter, Google, Facebook, Friendfeed, Wesabe, Whitepages, Zillow, Search, Weather, Flickr, Upcoming, Delicious, Dopplr, Github, New York Times, Shopping
* …any web service can be as easy as SQL
* Available on github - http://github.com/spullara/yql-tables/
73. YQL - Examples
select * from social.connections
select * from delicious.feeds.popular
select * from flickr.photos.interestingness
select * from friendfeed.status
select * from github.checkins
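A YQL statement like the ones above was sent to the service as a URL query parameter. The sketch below only builds such a URL and sends nothing. The endpoint is the public YQL URL as documented at the time and should be treated as an assumption here, since the service has since been retired; `build_yql_url` is a hypothetical helper. Tables such as social.connections also required OAuth, which this sketch does not cover.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed historical endpoint for unauthenticated YQL queries.
YQL_PUBLIC_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(query, fmt="json", endpoint=YQL_PUBLIC_ENDPOINT):
    """Return the request URL for a YQL statement (no request is made)."""
    return endpoint + "?" + urlencode({"q": query, "format": fmt})


url = build_yql_url("select * from flickr.photos.interestingness")
params = parse_qs(urlsplit(url).query)
```

The point of the slide holds regardless of the endpoint: the statement itself is the whole interface, so switching data sources means changing the table name rather than learning a new API.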