
20080410 Pf Congrez Presentation E Bay V0 2


These are the slides I used for a talk I gave at PFCongrez (April 12th 2008, in NL) about some of the internals and background of Marktplaats, the leading Dutch classifieds site!


  1. 1. A sneak preview of Marktplaats.nl
       PFCongrez, April 12th 2008
       JA. Oldenbeuving, [email_address]
       eBay Inc. Proprietary & Confidential
  2. 2. Who am I?
       • Jilles Oldenbeuving
       • Working for Marktplaats since early 2003
       • Responsible for application development
       • Lots of fun:
           • Great technical and infrastructural challenges
           • Top 3 Dutch website: 59% reach!
           • Real business, where product drives success
           • World class team!
  3. 3. Content
       • eBay’s classifieds portfolio
       • Marktplaats statistics
       • Marktplaats production environment
       • Scaling databases
       • Marktplaats and PHP
  4. 4. eBay’s classifieds portfolio
  6. 6. Marktplaats statistics
       • At peak:
           • Over 71M page views/day
           • 10 new listings/s, 6M total listings
           • 600 search queries/s
           • 900 MB/s uplink traffic
           • 120 user-generated emails/s (sent from user to user)
       • Collection of 20M user images (2TB)
       • Utilizing 600+ servers across 3 datacenters
  8. 8. Marktplaats Production Environment
       [Diagram, labelled "Simplified LAMP". Components shown: LB/Firewall, LB, Application servers, Search Engine, NetCaches, Mogile tracker and storage nodes, Memcache, CS Backend, and MySQL databases (Ads/Users, Hitcounters etc., AdMarkt, etc.), each with read slaves.]
  10. 10. MySQL Replication
       • Adding slave upon slave doesn’t scale well…
           • It only spreads reads
       [Diagram: w/ 1 server: 500 reads/s and 200 writes/s. w/ 2 servers: 250 reads/s and 200 writes/s on each.]
  11. 11. As your site grows…
       • Databases are eventually consumed by writing
       • This cannot be solved by caching read actions
       [Diagram: six servers, each handling 3 reads/s and 400 writes/s.]
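The replication arithmetic on slides 10 and 11 can be reproduced with a small model. This is only a sketch: the function names are made up for illustration, a read and a write are treated as costing the same, and the per-server capacity of 403 operations/s is an assumption chosen so that the result matches the figures on slide 11.

```python
def per_server_load(total_reads, total_writes, servers):
    """Return (reads/s, writes/s) seen by each server in a replicated pool."""
    if servers < 1:
        raise ValueError("need at least one server")
    # Reads are spread over the pool; every server must apply every write.
    return total_reads / servers, total_writes


def max_total_reads(capacity_per_server, total_writes, servers):
    """Reads/s the whole pool can serve under an assumed per-server capacity."""
    # Whatever capacity the replicated writes leave over is available for reads.
    headroom = max(capacity_per_server - total_writes, 0)
    return headroom * servers


# Slide 10: going from 1 to 2 servers halves the reads per server,
# but the writes per server stay the same.
print(per_server_load(500, 200, 1))
print(per_server_load(500, 200, 2))

# Slide 11: at 400 writes/s and an assumed capacity of 403 ops/s per server,
# six servers together can serve only 18 reads/s.
print(max_total_reads(403, 400, 6))
```

In this model, once the write rate approaches what a single server can handle, adding replicas adds almost no read capacity, which is why the writes themselves have to be partitioned.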
  12. 12. Generic MySQL database pool setup
       [Diagram: Active Master and Failover Master (access through VIP), Slave Pool 1 (nine slaves), Slave Pool 2, Lag Slave, Offsite Backup.]
       • Provides:
           • High availability for both writes and reads
           • Scales reads
           • Writes need to be scaled by partitioning (either functionally, or by modulo)
           • Prevents human disasters
           • Long term backups
           • A way to change database schemas without downtime
       • Slaves can vary:
           • Different replication sets (for really high read:write ratios)
           • Different indexes
           • Different access patterns/impact separation (e.g. cronjobs; for key buffers)
       TIP: Abstract this in the code, both the configuration and the physical vs. logical mapping, or look into MySQL Proxy.
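One way to read the tip on slide 12 is sketched below: application code asks for a logical pool, and a small layer maps that to a physical host. Everything here is hypothetical (the `Pool` type, the pool names, the `.example` hostnames and the two-way split of the ads data); the slides do not show how Marktplaats implemented this.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    master_vip: str   # writes go through the VIP in front of the masters
    slaves: tuple     # read slaves of this pool


# Hypothetical logical -> physical mapping, normally loaded from configuration.
POOLS = {
    "ads_0": Pool("ads0-vip.example", ("ads0-s1.example", "ads0-s2.example")),
    "ads_1": Pool("ads1-vip.example", ("ads1-s1.example", "ads1-s2.example")),
    "hitcounters": Pool("hits-vip.example", ("hits-s1.example",)),
}


def logical_pool(function, key=None, shards=1):
    """Pick a logical pool: functional partitioning by name, modulo by key."""
    if shards == 1:
        return function
    return f"{function}_{key % shards}"


def pick_host(pool_name, for_write, rng=random):
    """Map a logical pool to a physical host for one query."""
    pool = POOLS[pool_name]
    if for_write or not pool.slaves:
        return pool.master_vip
    return rng.choice(pool.slaves)


# Ad 7 lives in shard 7 % 2 == 1; writes hit the VIP, reads hit a slave.
pool = logical_pool("ads", key=7, shards=2)
print(pool, pick_host(pool, for_write=True), pick_host(pool, for_write=False))
```

Because callers only ever name a logical pool, slaves can be added or a pool repartitioned by changing the mapping rather than the application code.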
  13. 13. How to manage database schemas?
       • The problem:
           • Hundreds of database instances across Marktplaats
               • Each development environment has its own database
               • Each QA and staging environment
               • Production environment
               • With 10-20 separate database pools each
               • In total 500+ databases
           • Application needs to be consistent with the database version too!
  14. 14. Enter DBC
       • In-house developed tool in PHP
       • Inspects your current database version and application version and brings them in sync
       • Is aware of our database setup
           • Physical
           • Logical
       • DBC is integrated with the build system
  15. 15. DBC
       • Benefits:
           • Lets you “branch” database changes and share them within the project team until the feature is finished
           • Leaves an audit trail of database changes
           • Allows review by a DBA before a change is propagated into a release
           • Consistent and safe rollout of database changes to production
               • Checks the target system before and after
       I am trying to gauge interest in DBC to decide whether to open source it. If you are interested, let me know.
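The slides do not show DBC itself, so the following is only a sketch of the general idea of bringing a database version in sync with an application version. It uses Python's `sqlite3` as a stand-in for MySQL, and the migration list, the `schema_log` table and the function names are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical ordered schema changes: (version, SQL statement).
MIGRATIONS = [
    (1, "CREATE TABLE ads (id INTEGER PRIMARY KEY, title TEXT)"),
    (2, "ALTER TABLE ads ADD COLUMN price INTEGER"),
    (3, "CREATE INDEX idx_ads_price ON ads (price)"),
]


def current_version(conn):
    """Read the database version from the audit table (0 if nothing applied)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_log ("
        "version INTEGER PRIMARY KEY, "
        "applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_log").fetchone()
    return row[0] or 0


def sync(conn, app_version):
    """Apply pending changes so the database matches the application version."""
    db_version = current_version(conn)            # check before
    if db_version > app_version:
        raise RuntimeError("database is newer than the application")
    applied = []
    for version, sql in MIGRATIONS:
        if db_version < version <= app_version:
            conn.execute(sql)
            # One row per applied change doubles as the audit trail.
            conn.execute("INSERT INTO schema_log (version) VALUES (?)", (version,))
            applied.append(version)
    conn.commit()
    if current_version(conn) != app_version:      # check after
        raise RuntimeError("database did not reach the application version")
    return applied


conn = sqlite3.connect(":memory:")
print(sync(conn, 2))   # applies the changes up to version 2
print(sync(conn, 3))   # a later release only applies what is still missing
```

A real tool would also need rollback handling and awareness of the physical pools; none of that is attempted here.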
  17. 17. Marktplaats and PHP
       • Started out as a PHP-only shop in 1999
       • PHP worked great, and scaled well up to a certain point
           • Usage of Marktplaats keeps on growing
           • Application grew immensely in complexity
           • Number of developers quadrupled
       • Java and SOA are gaining more ground
       • One example of limits in PHP
  18. 18. APC Autofilter issue
       • PHP’s speed can be improved by using an opcode cache such as Zend or APC
       • The examples below bypass this, since the path is variable
       • No constants in PHP!
       include('foo.php');
       include($path . '/foo.php');
       include(MY_INCL . '/foo.php');
       if ($a) include('/path/foo.php');
       include('/path/foo.php');
  19. 19. APC Autofilter issue
       • This file is cached, but parent.php is not, since includes are only done at runtime
       • Child is actually created as an incomplete, “mangled” class definition
       • What if parent.php was already cached?
       include_once "parent.php";
       class Child extends Parent {}
  20. 20. APC Autofilter issue
       • By the time child.php includes parent.php, it is already cached
       • Zend changes the opcodes for child.php, removing the include of parent.php; this speeds up execution
       • APC cannot use this version of child.php for caching
       • APC will stop caching child.php altogether (called “Autofilter”)
       • … forever!
       • One of the reasons why Java is gaining more traction within Marktplaats
       include_once "parent.php";
       include_once "child.php";
       $c = new Child();
       Given the thousands of files in Marktplaats’ codebase, this costs ~30% runtime performance!
  21. 21. We’re hiring! Get in touch: [email_address]
