
2TB of RAM ought to be enough for anybody – PG Nordic Day 2014

Online classifieds website Leboncoin.fr is one of the success stories of the French Web: 1/3 of the total internet population in France uses the site each month. The growth has been spectacular and swift, and was made possible by a robust and performant software platform. At the heart of the platform is a large PostgreSQL infrastructure, part of it running on some of the largest PC-class hardware available. In this presentation, we will show how we have grown our infrastructure. In particular, the amazing vertical scalability of PG will be showcased with hard numbers (IOPS, transactions/second, etc.). We will also cover some of the hard lessons we have learned along the way, including near-disasters. Finally, we will look into how innovative features from the PostgreSQL ecosystem enable new approaches to our scalability challenge.

  1. 2TB of RAM ought to be enough for anybody – PG Nordic Day 2014
  2. The Presenter – Renaud Bruyeron, @brew_your_own, Paris, France
  3. The Presenter – short bio • 1998-1999: W3C @ MIT • 1999-2013: FullSIX (top-3 Web agency in France; strong focus on PHP, Java, and Oracle…) • 2013-present: CTO of leboncoin.fr
  4. Schibsted Classified Media – from 1 to 30+ countries in 7 years
  5. (image slide)
  6. Templated deployment in 30 countries w/ shared technology • Technology originally inherited from Blocket.se • Has since evolved to power all 30 sites • Focus on performance and ease of local modifications (architecture diagram: Presentation layer with Web, APIs, and Search; Middleware for data access and business rules; Index; PostgreSQL – components written in C, PHP, and PL/SQL)
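The diagram above puts business rules in PL/SQL (i.e. PL/pgSQL) inside PostgreSQL itself, behind the C middleware. As a rough illustration of that design (a sketch only – the table, function, and rule are invented here, not taken from the SCM codebase):

    -- Illustrative only: a business rule living next to the data,
    -- called by the middleware instead of being coded in it
    CREATE OR REPLACE FUNCTION insert_ad(p_subject text, p_price integer)
    RETURNS integer LANGUAGE plpgsql AS $$
    DECLARE
      v_id integer;
    BEGIN
      IF p_price < 0 THEN
        RAISE EXCEPTION 'price must be non-negative';
      END IF;
      INSERT INTO ads (subject, price) VALUES (p_subject, p_price)
      RETURNING ad_id INTO v_id;
      RETURN v_id;
    END;
    $$;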
  7. PostgreSQL in the SCM Platform – 100+ servers running PostgreSQL, 8TB of data, 50+ million classified ads
  8. Schibsted Classified Media & PostgreSQL: married…and in love :)
  9. #1 classified website in France
  10. History • Project initiated in 2006 • Site launch: early 2007 • Based on technology from Blocket.se • From 1 to 230 people, and from challenger to #1, in 7 years
  11. (image slide)
  12. (image slide)
  13. (image slide)
  14. Explosive growth…with a few bumps along the way!
  15. Big Audience • 250M page views / day • 5M unique visitors / day • 18M UV / month – that's 1/3 of the French internet population… • 600,000+ new ads / day • 25M live ads • #7 most visited website in France
  16. Big Ops • 300+ servers in 2 DCs • 20 servers hosting PG databases (in production)
  17. Built on SCM Technology (architecture diagram: Presentation layer with Web, APIs, and Search; Middleware for data access and business rules; Index; PostgreSQL)
  18. Built on SCM Technology – the Website Data tier: a Master replicating to three slaves – a hot standby, a short-queries slave, and a long-queries slave (same Presentation/Middleware/Index layers as before)
  19. Built on SCM Technology – same topology, plus 15+ support DBs
  20. We use PostgreSQL everywhere! Website Data (Master, hot standby, short-queries and long-queries slaves, 15+ support DBs) and Backoffice & Analytics (ODS, OLAP, BI & CRM)
  21. We try to limit writes! • Index/Search acting as a structured cache • Master DB workload = 70% writes • Slaves used to offload read queries • Main database = 6TB on disk… • +4TB archived away… • 20K LOCs of PL/SQL
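Offloading reads as in slide 21 requires the application (or its connection layer) to tell slaves from the master, and to keep the short-queries slave free of long reports. A minimal sketch of both, assuming nothing beyond stock PostgreSQL (the role name and timeout value are invented for illustration):

    -- true on a hot-standby slave, false on the master: a cheap routing check
    SELECT pg_is_in_recovery();

    -- hypothetical guard for the "short queries" slave: cap the runtime of
    -- the web-facing role so a stray long report cannot monopolize it
    ALTER ROLE web_reader SET statement_timeout = '2s';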
  22. PostgreSQL works beautifully as a DW! • ETL moving 500GB / day from the Website Data into the DW • DW size: 2.2TB, adding 6GB/day (ODS and OLAP fed from the website databases)
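The deck gives no detail on the ETL itself; a typical PostgreSQL-native pattern for bulk daily loads of this size (a sketch only – the schema, table, and file names are invented) is staging via COPY:

    -- load one day's extract into a staging table, then merge into the DW
    BEGIN;
    CREATE TEMP TABLE ads_stage (LIKE dw.fact_ads INCLUDING DEFAULTS);
    -- server-side COPY reads the file as the database server user
    COPY ads_stage FROM '/data/etl/ads_extract.csv' WITH (FORMAT csv);
    INSERT INTO dw.fact_ads SELECT * FROM ads_stage;
    COMMIT;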
  23. Big Iron: HP DL980, 2TB of RAM, 64-80 cores
  24. HTOP on the Master :)
  25. Physical storage for the main PostgreSQL instances (diagram: in each of the 2 DCs, two DL980s attached over Fibre Channel to a 3Par V800 SAN, plus Fusion-IO storage; Master and one slave in DC 1, two slaves in DC 2)
  26. Big Iron: 3Par V800 SAN
  27. Big Iron: thin provisioning with a mix of SSD and FC disks
  28. Big Iron: high performance…
  29. Bragging about it ;)
  30. Despite the caches and the search engine, we get impressive workloads on the master DB: 600 tx/sec
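For context on figures like the 600 tx/sec above: a standard way to measure the commit rate (an assumed method, not necessarily how the team measured it) is to sample pg_stat_database twice and divide the delta by the interval:

    -- sample these counters twice, N seconds apart;
    -- tx/sec = (xact_commit_2 - xact_commit_1) / N
    SELECT datname, xact_commit, xact_rollback
    FROM pg_stat_database
    WHERE datname = current_database();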
  31. How did we get there? 2x DL980 w/ Fusion-IO → 2x 3Par V800 → 2x DL980 w/ FC to the V800s
  32. We did go through growing pains and near-disasters: the « Big One »
  33. Our own « worst day of our lives »: March 1st 2013 (1/2) • Master DB is slowing down dramatically • We find that Slony replication is the culprit • We don't know what to do… until we find a solution on the net that involves cleaning up Slony metadata… (you know where this is going)… • We fumble. We notice. The slave is borked. • Rebuilding the slave with Slony brings the Master down. Oh. God… • We take the Master off the stack, and start rebuilding the slave w/ Slony • …5 days later, we are done (!)…
  34. Our own « worst day of our lives »: March 1st 2013 (2/2) • …but we are not out of the woods yet! Pent-up demand is bringing the site down! • We decide to switch to native replication! • …but the network cards are maxed out by the replication data… triggering a kernel bug… (Murphy, could you please step out of the room?) • We implement network card bonding, and start moving support tables off the main instance… • …and we are done!
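The « switch to native replication » above means PostgreSQL streaming replication in place of Slony. For reference, the 9.x-era wiring looked roughly like this (a sketch of the stock mechanism, not leboncoin's actual configuration; the hostname is a placeholder):

    # master, postgresql.conf
    wal_level = hot_standby
    max_wal_senders = 5

    # slave, postgresql.conf
    hot_standby = on            # allow read-only queries during recovery

    # slave, recovery.conf (pre-PostgreSQL-12 location of these settings)
    standby_mode = 'on'
    primary_conninfo = 'host=master.example.com port=5432 user=replication'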
  35. What's Next?
  36. Vertical scalability has limits! • We are already running on the biggest HW money can buy • Past certain volume levels, execution plans can change radically • Huge instances are difficult to maintain & back up safely – rebuilding the slave in March 2013 took a full 5 days… • Although we are maxing out the HW available to us, especially on writes/s, we are committed to PostgreSQL at the heart of our platform
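One bullet above – plan flips past certain volume levels – is usually caught by re-checking the plans of the hottest queries as tables grow. A minimal sketch (the query and table are invented, and this is a generic technique, not one the deck describes):

    -- run periodically against a production-sized dataset and diff the plans
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM ads
    WHERE category_id = 12
    ORDER BY created_at DESC
    LIMIT 50;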
  37. Key ideas to take our platform to the next level • Sharding – spread the reads & writes horizontally • Unbundling of the schema – move parts of the schema (that can be decoupled) to other instances; spread the workload • PGQ – reduce application-level transaction interleaving by moving parts of the transactions to asynchronous workers
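Of the three ideas, PGQ (the queueing component of Skytools, from the PostgreSQL ecosystem the abstract alludes to) is the most concrete. A minimal sketch of its producer/consumer API, assuming PGQ is installed (the queue, consumer, and payload names are invented):

    -- one-time setup
    SELECT pgq.create_queue('ad_events');
    SELECT pgq.register_consumer('ad_events', 'indexer');

    -- producer: enqueue inside the web transaction instead of doing the
    -- slow work inline; the event commits or rolls back with the transaction
    SELECT pgq.insert_event('ad_events', 'ad_inserted', '{"ad_id": 42}');

    -- consumer loop: fetch a batch, process its events, acknowledge it
    SELECT pgq.next_batch('ad_events', 'indexer');  -- returns a batch id (or NULL)
    SELECT * FROM pgq.get_batch_events(1);          -- 1 = batch id from next_batch
    SELECT pgq.finish_batch(1);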
  38. (image slide)
