How to Build a SaaS App With Twitter-like Throughput on Just 9 Servers

Velocity Conference 2011 presentation by New Relic CEO Lew Cirne. New Relic’s multitenant SaaS web application monitoring service collects and persists over 90,000 metrics every second on a sustained basis, while still delivering an average page load time of 1.5 seconds. In this presentation, Lew Cirne discusses how good architecture and good tools can help you handle a very large amount of data while still providing fast service. He shows how we scale to support customer growth, how we monitor our system, and what traps to look out for.
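
As a quick sanity check, the per-second rates quoted for this talk can be converted to daily row counts. The sketch below is only arithmetic on the published figures (90,000 per second in the description, 100K per second on the slides); the function name `rows_per_day` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def rows_per_day(rows_per_second: int) -> int:
    """Convert a sustained per-second insertion rate to rows per day."""
    return rows_per_second * SECONDS_PER_DAY


# Rates taken from the description and the slides, respectively.
description_rate = rows_per_day(90_000)
slide_rate = rows_per_day(100_000)

print(f"{description_rate:,}")
print(f"{slide_rate:,}")
```

Both results are above 7 billion, which is consistent with the ">7 billion rows per day" figure on the slides.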

Published in: Technology

    1. How to build an app with Twitter-like throughput on just 9 servers... Lew Cirne, Founder & CEO - New Relic
    2. I’m Lew Cirne @sweetlew
    3. What our app does
       • APM as a Service
       • In-app agent instrumentation (BCI, etc.)
       • 150,000+ app processes monitored, globally (10K customers)
       • Each process reports a few hundred metrics per minute
       • 5 languages (Ruby, Java, PHP, .NET, Python)
    4. Each day we collect 20 billion measurements, from 150,000 application processes, for over 10,000 customers.
    5. Each day we collect 20 billion measurements, from 150,000 application processes, for over 10,000 customers. All on 9 servers.
    6. We capture “Timeslices”
       • Each one is about 250 bytes
       • A single tweet is about the same size
       • Example: Response Time, 4 hours from 11:04 to 15:04
         Count: 1242, Avg: 337 ms, Min: 0.63 ms, Max: 95669 ms, Std Dev: 782
    7. Timeslice insertion rate: 100K/second
       • >7 billion rows per day
       • Twitter peak insertion rate: 8K rows per second
       • 9 servers handle all data collection
    8. Collecting is one thing...
       • We provide realtime monitoring
       • One minute granularity
       • Data is almost always stale
       • Each user/account has different data
       • Page caching and other easy solutions don’t work for us.
    9. Our most popular page... Average Full Page Load Time: 2.4 Sec
    10. Our most popular page... Average Full Page Load Time: 2.4 Sec
    11. Main App Software stack
        • User Interface & REST API: Rails 2.3
        • Data Collectors: Servlets on Jetty
        • Data Store: MySQL, sharded by accounts
    12. Simplified architecture...
        • Customer’s environment, connected over HTTP
        • 9 Collector / Aggregator / DBs: sustained 100K insertion rate per second; 24-core Intel Nehalem, 48 GB RAM, SAS-attached RAID 5
        • No virtualization (either cloud or datacenter)
        • 2 Web App Servers: 12-core Intel Nehalem, 48 GB RAM
    13. Even more data! On May 17, we launched Real User Monitoring
        • Using Episodes to measure browser load time of every page view
        • Browser reports data to our ‘Beacon’ servers
        • Monitoring >1 billion page views per week
        • Doubled our total inbound HTTP requests in a MONTH
    14. Beacon Architecture
        • Real User Browsers: billions of metrics from across the globe
        • RUM Beacons (Servlets): response time 0.15 ms; capture and enqueue (in-memory)
        • Asynchronously aggregate and forward Timeslices to our Collectors
        • Currently at EC2
        • Over 1 billion user sessions measured for performance in first month.
    15. Challenges
        • Data Purging
        • Determining what to pre-aggregate
        • Large Accounts
        • MySQL Optimization and Tuning
        • I/O performance - (virtualized to dedicated) ...
    16. 5 Lessons Learned
    17. Lesson 1: Keep it simple
    18. Lesson 2: Less is more
    19. Lesson 3: Trendy != Reliable
    20. Lesson 4: Plan for scale
    21. Lesson 5: Use the right technology for a given task (logos shown: Episodes, New Relic, Java, Ruby, Nginx, Jetty, Rails)
    22. See New Relic monitor New Relic at our booth
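
The timeslice on slide 6 (count, average, min, max, standard deviation) and the "aggregate and forward" step on slide 14 suggest a small summary record that can be merged without keeping raw samples. The Python sketch below is one way such a record might look. It is not New Relic's implementation: the class name `Timeslice`, its field names, and the choice to store a running sum and sum of squares are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Timeslice:
    """Hypothetical summary of one metric over one time window."""
    count: int = 0
    total: float = 0.0            # sum of observed values (ms)
    minimum: float = math.inf
    maximum: float = -math.inf
    sum_squares: float = 0.0      # sum of squared values, used for std dev

    def record(self, value_ms: float) -> None:
        """Fold a single measurement into the summary."""
        self.count += 1
        self.total += value_ms
        self.minimum = min(self.minimum, value_ms)
        self.maximum = max(self.maximum, value_ms)
        self.sum_squares += value_ms * value_ms

    def merge(self, other: "Timeslice") -> "Timeslice":
        """Combine two summaries, e.g. two one-minute windows."""
        return Timeslice(
            count=self.count + other.count,
            total=self.total + other.total,
            minimum=min(self.minimum, other.minimum),
            maximum=max(self.maximum, other.maximum),
            sum_squares=self.sum_squares + other.sum_squares,
        )

    @property
    def average(self) -> float:
        return self.total / self.count if self.count else 0.0

    @property
    def std_dev(self) -> float:
        """Population standard deviation derived from the running sums."""
        if self.count == 0:
            return 0.0
        variance = self.sum_squares / self.count - self.average ** 2
        return math.sqrt(max(variance, 0.0))


# Usage: two windows summarized separately, then merged.
first = Timeslice()
for value in (100.0, 300.0):
    first.record(value)

second = Timeslice()
second.record(200.0)

merged = first.merge(second)
print(merged.count, merged.average, merged.minimum, merged.maximum)
```

Because every field is a sum, a minimum, or a maximum, merging is associative: summaries for short windows can be rolled up into longer ones in any order, which is the property a pre-aggregation scheme like the one mentioned under "Challenges" would rely on.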