FeedBurner: Scalable Web Applications using MySQL and Java
Joe Kottke, Director of Network Operations
Feed Burner Scalability

  1. FeedBurner: Scalable Web Applications using MySQL and Java
     Joe Kottke, Director of Network Operations
  2. What is FeedBurner?
     • Market-leading feed management provider
     • 170,000 bloggers, podcasters and commercial publishers including Reuters, USA TODAY, Newsweek, Ars Technica, BoingBoing…
     • 11 million subscribers in 190 countries
     • Web-based services help publishers expand their reach online, attract subscribers and make money from their content
     • The largest advertising network for feeds
     © 2006 FeedBurner
  3. Scaling history
     • July 2004
       – 300Kbps, 5,600 feeds
       – 3 app servers, 3 web servers, 2 DB servers
     • April 2005
       – 5Mbps, 47,700 feeds
       – My first MySQL Users Conference
       – 6 app servers, 6 web servers (same machines)
     • September 2005
       – 20Mbps, 109,200 feeds
     • Currently
       – 115 Mbps, 270,000 feeds, 100 million hits per day
  4. Scalability Problem 1: Plain old reliability
     • August 2004
     • 3 web servers, 3 app servers, 2 DB servers. Round Robin DNS
     • Single-server failure, seen by 1/3 of all users
  5. Solution: Load Balancers, Monitoring
     • Health Check pages
       – Round trip all the way back to the database
       – Same page monitored by load balancers and monitoring
     • Monitoring
       – Cacti (http://www.cacti.net/)
       – Nagios (http://www.nagios.org)
  6. Health Check

     UserComponent uc = UserComponentFactory.getUserComponent();
     User user = uc.getUser("monitor-user");

     // If first load, mark as down.
     // Let FeedServlet mark things as up in init method (load-on-startup).
     String healthcheck = (String) application.getAttribute("healthcheck");
     if (healthcheck == null || healthcheck.length() < 1) {
         healthcheck = "DOWN";
         application.setAttribute("healthcheck", healthcheck);
     }

     // We return null in case of problem, or if user doesn't exist
     if (user == null) {
         healthcheck = "DOWN";
         application.setAttribute("healthcheck", healthcheck);
     }

     System.out.print(healthcheck);
  7. Cacti
  8. Start/Stop scripts

     #!/bin/bash

     # Source the environment
     . ${HOME}/fb.env

     # Start TOMCAT
     cd ${FB_APPHOME}

     # Remove stale temp files
     find ~/rsspp/catalina/temp/ -type f -exec rm -rf {} \;

     # Remove the work directory
     #rm -rf ~/rsspp/catalina/work/*

     ${CATALINA_HOME}/bin/startup.sh
  9. Start/Stop scripts

     #!/bin/bash
     FB_APPHOME=/opt/fb/fb-app
     JAVA_HOME=/usr
     CATALINA_HOME=/opt/tomcat
     CATALINA_BASE=${FB_APPHOME}/catalina
     CATALINA_OPTS="-Xmx768m -Xms768m -Dnetworkaddress.cache.ttl=0"
     WEBROOT=/opt/fb/webroot
     export JAVA_HOME CATALINA_HOME CATALINA_BASE CATALINA_OPTS WEBROOT
  10. Scalability Problem 2: Stats recording/mgmt
      • Every hit is recorded
      • Certain hits mean more than others
      • Flight recorder
      • Any table management locks
      • Inserts slow way down (90GB table)
  11. Solution: Executor Pool
      • Executor Pool
        – Doug Lea’s concurrency library
        – Use a PooledExecutor so stats inserts happen in a separate thread
        – Spring bean definition:

      <bean id="StatsExecutor" class="EDU.oswego.cs.dl.util.concurrent.PooledExecutor">
        <constructor-arg>
          <bean class="EDU.oswego.cs.dl.util.concurrent.LinkedQueue"/>
        </constructor-arg>
        <property name="minimumPoolSize" value="10" />
        <property name="keepAliveTime" value="5000" />
      </bean>
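The hand-off to a separate thread can be sketched with `java.util.concurrent`, the standard-library successor to Doug Lea's library. This is a swap-in, not the `PooledExecutor` the slide uses. `StatsRecorder`, `recordHit`, and the in-memory list standing in for the stats table are hypothetical names, and the pool settings only approximate the bean's values.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: request threads enqueue a stats write and return immediately.
class StatsRecorder {
    // Stand-in for the stats table; a real implementation would run an INSERT here.
    private final List<String> statsTable = new CopyOnWriteArrayList<>();

    // 10 worker threads, 5000 ms keep-alive, unbounded queue (assumed equivalents
    // of minimumPoolSize, keepAliveTime and LinkedQueue in the bean definition).
    private final ThreadPoolExecutor executor = new ThreadPoolExecutor(
            10, 10, 5000, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());

    // Called on the request thread; the write itself runs on a pool thread.
    void recordHit(String feedId) {
        executor.execute(() -> statsTable.add(feedId));
    }

    // Stops accepting work and waits for queued writes to finish.
    boolean shutdownAndWait(long timeoutMs) {
        executor.shutdown();
        try {
            return executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int recordedCount() {
        return statsTable.size();
    }
}
```

With an unbounded queue a slow table delays the stats, not the request, at the cost of queued writes piling up in memory if inserts stay slow.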
  12. Solution: Lazy rollup
      • Only today’s detailed stats need to go against the real-time table
      • Roll up previous days into sparse summary tables on demand
      • First access to a day’s stats is slow; subsequent requests are fast
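A minimal sketch of the on-demand rollup described above, assuming in-memory maps in place of the real-time and summary tables. `LazyRollup` and its methods are hypothetical names, and the sketch only covers past days, whose detail rows no longer change.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of on-demand rollup; in-memory maps stand in for the tables.
class LazyRollup {
    // day -> raw hits (feed ids), standing in for the detailed real-time table
    private final Map<String, List<String>> detail = new HashMap<>();
    // day -> (feed id -> hit count), standing in for the sparse summary tables
    private final Map<String, Map<String, Integer>> summary = new HashMap<>();
    // Counts how many times the expensive scan ran.
    int rollupsPerformed = 0;

    void recordHit(String day, String feedId) {
        detail.computeIfAbsent(day, d -> new ArrayList<>()).add(feedId);
    }

    // Hits for a past day: the first call scans the detail rows, later calls reuse the summary.
    int hitsFor(String day, String feedId) {
        Map<String, Integer> daySummary = summary.get(day);
        if (daySummary == null) {
            daySummary = new HashMap<>();
            for (String id : detail.getOrDefault(day, new ArrayList<>())) {
                daySummary.merge(id, 1, Integer::sum);
            }
            summary.put(day, daySummary);
            rollupsPerformed++;
        }
        return daySummary.getOrDefault(feedId, 0);
    }
}
```

The cost of the scan lands on whichever request asks first, which is the behaviour the later "Lazy initialization" slide identifies as a problem for popular feeds.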
  13. Scalability Problem 3: Primary DB overload
      • Mostly used master DB server for everything
      • Read vs. Read/Write load didn’t matter in the beginning
      • Slow inserts would block reads when using MyISAM
  14. Solution: Balance read and read/write load
      • Looked at workload
        – Found where we could break up read vs. read/write
        – Created Spring ExtendedDaoObjects
        – Tomcat-managed DataSources
      • Balanced master vs. slave load (Duh)
        – Slave becomes perfect place for snapshot backups
      • Watch for replication problems
        – Merge table problems (later)
        – Slow queries slow down replication
  15. Example: Cacti graph of MySQL handlers
  16. ExtendedDaoObject
      • Application code extends this class and uses getHibernateTemplate() or getReadOnlyHibernateTemplate() depending upon requirements
      • Similar class for JDBC

      public class ExtendedHibernateDaoSupport extends HibernateDaoSupport {
          private HibernateTemplate readOnlyHibernateTemplate;

          public void setReadOnlySessionFactory(SessionFactory sessionFactory) {
              this.readOnlyHibernateTemplate = new HibernateTemplate(sessionFactory);
              readOnlyHibernateTemplate.setFlushMode(HibernateTemplate.FLUSH_NEVER);
          }

          protected HibernateTemplate getReadOnlyHibernateTemplate() {
              return (readOnlyHibernateTemplate == null) ? getHibernateTemplate() : readOnlyHibernateTemplate;
          }
      }
  17. Scalability Problem 4: Total DB overload
      • Everything slowing down
      • Using DB as cache
      • Database is the ‘shared’ part of all app servers
      • Ran into table size limit defaults on MyISAM (4GB). We were lazy.
        – Had to use Merge tables as a bridge to newer larger tables
  18. Solution: Stop using the database
      • Where possible :)
      • Multi-level caching
        – Local VM caching (EHCache, memory only)
        – Memcached (http://www.danga.com/memcached/)
        – And finally, database.
      • Memcached
        – Fault-tolerant, but client handles that.
        – Shared nothing
        – Data is transient, can be recreated
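The lookup order of the multi-level cache (local VM, then shared cache, then database) can be sketched as follows. Plain maps stand in for EHCache and memcached, and a function stands in for the database query. `MultiLevelCache` is a hypothetical name and the sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical read-through lookup: local map, then shared cache, then the database.
class MultiLevelCache {
    private final Map<String, String> local = new HashMap<>();   // stands in for EHCache
    private final Map<String, String> shared;                    // stands in for memcached
    private final Function<String, String> database;             // stands in for a DB query

    MultiLevelCache(Map<String, String> shared, Function<String, String> database) {
        this.shared = shared;
        this.database = database;
    }

    String get(String key) {
        String value = local.get(key);
        if (value != null) {
            return value;                     // fastest path: same VM
        }
        value = shared.get(key);
        if (value == null) {
            value = database.apply(key);      // slowest path, taken only on a full miss
            if (value == null) {
                return null;                  // nothing to cache
            }
            shared.put(key, value);
        }
        local.put(key, value);
        return value;
    }
}
```

Because cached data is transient and can be recreated, losing either cache level only costs an extra trip to the next level down.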
  19. Scalability Problem 5: Lazy initialization
      • Our stats get rolled up on demand
        – Popular feeds slowed down the whole system
      • FeedCount chicklet calculation
        – Every feed gets its circulation calculated at the same time
        – Contention on the table
  20. Solution: BATCH PROCESSING
      • For FeedCount, we staggered the calculation
        – Still would run into contention
        – Stats stuff again slowed down at 1AM Chicago time.
      • We now process the rolled-up data every night
        – Delay showing the previous circulation in the FeedCount until roll-up is done.
      • Still wasn’t enough
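The slides do not say how the FeedCount calculation was staggered. One possible approach, shown here purely as an assumption, is to give each feed a stable offset inside a recalculation window so that feeds do not all hit the table at the same moment. `StaggeredSchedule` and `offsetSeconds` are hypothetical names.

```java
// Hypothetical sketch: give each feed a stable offset inside a recalculation window.
class StaggeredSchedule {
    // Returns the offset, in seconds from the start of the window,
    // at which the given feed would be recalculated.
    static int offsetSeconds(long feedId, int windowSeconds) {
        if (windowSeconds <= 0) {
            throw new IllegalArgumentException("windowSeconds must be positive");
        }
        // floorMod keeps the result in [0, windowSeconds) even for negative ids.
        return (int) Math.floorMod(feedId, (long) windowSeconds);
    }
}
```

Spreading work this way lowers the peak but not the total load, which is consistent with the slide's note that contention remained.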
  21. Scalability Problem 6: Stats writes, again
      • Too much writing to master DB
      • More and more data stored associated with each feed
      • More stats tracking
        – Ad Stats
        – Item Stats
        – Circulation Stats
  22. Solution: Merge Tables
      • After the nightly rollup, we truncate the subtable from 2 days ago
      • Gotcha with truncating a subtable:
        – FLUSH TABLES; TRUNCATE TABLE ad_stats0;
        – Could succeed on master, but fail on slave
      • The right way to truncate a subtable:
        – ALTER TABLE ad_stats TYPE=MERGE UNION=(ad_stats1,ad_stats2);
        – TRUNCATE TABLE ad_stats0;
        – ALTER TABLE ad_stats TYPE=MERGE UNION=(ad_stats0,ad_stats1,ad_stats2);
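The three-statement sequence above (detach the subtable, truncate it, re-attach it) can be generated for any subtable. The helper below is a hypothetical sketch that only builds the SQL strings and does not execute them. It keeps the `TYPE=MERGE` spelling from the slide; later MySQL versions use `ENGINE=MERGE` instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds the statement sequence for truncating one subtable.
class MergeTableRotation {
    static List<String> truncateSubtable(String mergeTable, List<String> subtables, String target) {
        if (!subtables.contains(target)) {
            throw new IllegalArgumentException(target + " is not a subtable of " + mergeTable);
        }
        List<String> remaining = new ArrayList<>(subtables);
        remaining.remove(target);

        List<String> statements = new ArrayList<>();
        // 1. Detach the target so the merge table no longer references it.
        statements.add(alter(mergeTable, remaining));
        // 2. Truncate the detached subtable.
        statements.add("TRUNCATE TABLE " + target);
        // 3. Re-attach the full set of subtables.
        statements.add(alter(mergeTable, subtables));
        return statements;
    }

    private static String alter(String mergeTable, List<String> subtables) {
        return "ALTER TABLE " + mergeTable + " TYPE=MERGE UNION=(" + String.join(",", subtables) + ")";
    }
}
```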
  23. Solution: Horizontal Partitioning
      • Constantly identifying hot spots in the database
        – Ad serving
        – Flare serving
        – Circulation (constant writes, occasional reads)
      • Move hottest tables/queries off to own clusters
        – Hibernate and certain lazy patterns allow this
        – Keeps the driving tables from slowing down
  24. Scalability Problem 7: Master DB Failure
      • Still using just a primary and slave
      • Master crash: Single point of failure
      • No easy way to promote a slave to a master
  25. Solution: No easy answer
      • Still using auto_increment
        – Multi-master replication is out
      • Tried DRBD + HeartBeat
        – Disk is replicated block-by-block
        – Hot primary, cold secondary
      • Didn’t work as we hoped
        – Myisamchk takes too long after failure
        – I/O + CPU overhead
      • InnoDB is supposedly better
  26. Our multi-master solution
      • Low-volume master cluster
        – Uses DRBD + HeartBeat
        – Works well under smaller load
        – Does mapping to feed data clusters
      • Feed Data Cluster
        – Standard Master + Slave(s) structure
        – Can be added as needed
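The mapping role of the low-volume master cluster can be sketched as a directory that records which feed data cluster owns each feed. The in-memory map stands in for the mapping database, and round-robin assignment of new feeds is an assumption; the slides do not describe the placement policy. `ClusterDirectory` and the cluster names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a small mapping store says which feed data cluster owns each feed.
class ClusterDirectory {
    private final Map<Long, String> feedToCluster = new HashMap<>(); // stands in for the mapping DB
    private final String[] clusters;
    private int next = 0;

    ClusterDirectory(String... clusters) {
        if (clusters.length == 0) {
            throw new IllegalArgumentException("need at least one cluster");
        }
        this.clusters = clusters;
    }

    // Returns the cluster that holds this feed, assigning unseen feeds round-robin.
    String clusterFor(long feedId) {
        return feedToCluster.computeIfAbsent(feedId, id -> {
            String chosen = clusters[next];
            next = (next + 1) % clusters.length;
            return chosen;
        });
    }
}
```

Because a feed's assignment is stored and never recomputed, this scheme lets clusters be added without moving existing feeds.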
  27. Mapping / Marshalling Database Cluster
  28. Scalability Problem 8: Power Failure
      • Chicago has ‘questionable’ infrastructure.
      • Battery backup, generators can be problematic
      • Colo techs have been known to hit the Big Red Switch
      • Needed a disaster recovery/secondary site
        – Active/Active not possible for us. Yet.
        – Would have to keep fast connection to redundant site
        – Would require 100% of current hardware, but would lie quiet
  29. Code Name: Panic App
      • Product Name: Feed Insurance
      • Elegant, simple solution
      • Not Java (sorry)
      • Perl-based feed fetcher
        – Downloads copies of feeds, saved as flat XML files
        – Synchronized out to local and remote servers
        – Special rules for click tracking, dynamic GIFs, etc
  30. General guidelines
      • Know your DB workload
        – Cacti really helps with this
      • ‘EXPLAIN’ all of your queries
        – Helps keep crushing queries out of the system
      • Cache everything that you can
      • Profile your code
        – Usually only needed on hard-to-find leaks
  31. Our settings / what we use
      • Don’t always need the latest and greatest
        – Hibernate 2.1
        – Spring
        – DBCP
        – MySQL 4.1
        – Tomcat 5.0.x
      • Let the container manage DataSources
  32. JDBC
      • Hibernate/iBatis/Name-Your-ORM-Here
        – Use ORM when appropriate
        – Watch the queries that your ORM generates
        – Don't be afraid to drop to JDBC
      • Driver parameters we use:

      # For Internationalization of Ads, multi-byte characters in general
      useUnicode=true
      characterEncoding=UTF-8

      # Biggest performance bits
      cacheServerConfiguration=true
      useLocalSessionState=true

      # Some other settings that we've needed as things have evolved
      useServerPrepStmts=false
      jdbcCompliantTruncation=false
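As an illustration of how the driver parameters above might be supplied, the sketch below assembles them into a MySQL JDBC URL query string. The slides do not say whether the parameters were set in the URL or in the DataSource configuration. `JdbcUrlBuilder`, the host and the database name are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: turns the listed driver parameters into a JDBC URL.
class JdbcUrlBuilder {
    static String build(String host, String database) {
        // LinkedHashMap keeps the parameters in the order listed on the slide.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("useUnicode", "true");
        params.put("characterEncoding", "UTF-8");
        params.put("cacheServerConfiguration", "true");
        params.put("useLocalSessionState", "true");
        params.put("useServerPrepStmts", "false");
        params.put("jdbcCompliantTruncation", "false");

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(entry.getKey() + "=" + entry.getValue());
        }
        return "jdbc:mysql://" + host + "/" + database + "?" + query;
    }
}
```

When the URL is embedded in an XML configuration file, each `&` has to be written as `&amp;`.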
  33. Thank You
      Questions?
      joek@feedburner.com