1 Billion Events per Day, Israel 3rd Java Technology Day, June 22, 2009

We presented at the Israeli 3rd Java Technology Day, the largest Sun Microsystems/MySQL event in Israel. We presented the essential parts of building a real-life web/enterprise system that must handle 1 billion events per day (a case study from ad network billing systems). We covered the adoption rate of the Internet, load balancers (HAProxy, Apache, Radware, F5, Cisco), web servers, in-memory databases (IMDB, including Memcached, Gigaspaces, Terracotta and Oracle Coherence) and finally sharding (including vertical, static horizontal and dynamic). A great example of a performance-boosting architecture.

Published in: Technology, News & Politics
  • Initial setup of a u-Page from the toolbar
    - 14M users today; 20M users in 2009Q1; 600K downloads per day
    - Moving from 1:N templates to 10-15% of users updating their page (from a 1:1000 templates/users ratio to 1:7)
    - Statistics: user changes: 1 per week; upload: 2 times a day
    - Wants to use the default if the user has not modified the template. However, wants to support changes pushed from templates to users' pages, even if those pages were changed
    - The same situation exists now, but the data is currently saved on the user's desktop
    Current applicative cache:
    - int: toolbar_id, list of int: widget_id[]
    - int: widget_id, object: widget
    - int: user_id, object: user
    Table design:
    - User id, tab id and user_tab_id are GUIDs, meaning that they can be distributed between databases
    - The others are not (may need to be supported?)
    - Expected XML schema size: 45KB (up to 500KB)
    Why migration to MySQL will be problematic (they are already using the current SQL Server SP features):
    - Maximal row size (http://dev.mysql.com/doc/refman/5.0/en/innodb-restrictions.html): "The maximum row length, except for VARBINARY, VARCHAR, BLOB and TEXT columns, is slightly less than half of a database page. That is, the maximum row length is about 8000 bytes"
    - Scope identity
    - FOR XML
    - Bulk insert from XML
    - Rollback/Transaction
    - BEGIN TRY
    Applicative changes by the user:
    - Move
    - Delete
    - Add
    - Minimize
    - Change internal parameters
    Options to be considered:
    - A large SQL Server based machine
    - Sharding based on MySQL or SQL Server
    - Gigaspaces solution: felt that it's a large machine, and they prefer the database way
    Other options:
    - Saving only XML in the current way
    - Save the summed page configuration in an XML (tab based), so that little needs to be read from the DB
    - Write 20K files of 40KB each on my laptop HD: 149s
    - Read XML: 109.5s
    - Write to DB: 1789s
    - Use .NET serialization
    - Save the XML on disk in order to avoid variable-length fields
    - Use memcached to hold the hash of users?
    Things to be considered:
    - Which horizontal sharding algorithm should be selected
    - Hibernate Shards: provided by Google. Still in beta-testing phase
    - Which tables should be split to different databases (vertical sharding)
    - How do you manage so many databases (distribute data and so on)
    - There is not really an option to do that
    - Defining optimal table sizes
    - Retrieval of data from the disk vs. getting data from the tables: Tab (1 per displayed page), Zone (3 per displayed page), Widget instance (10-300 per displayed page + should be extracted with/without zones)
    - OLAP solution to merge data
    OLAP solution
    Toolbar design:
    - Saving
  • Our Methodology
    Performance problems are extremely complex and, due to the different technologies deployed, each case is unique. A "typical" performance problem requires delving into databases, application servers, client technology, code in differing programming languages, and system and software architectures. RockeTier implements a unique methodology to simplify the problem and evaluate each performance bottleneck, providing immediate effective relief and, when necessary, designing a gradual roadmap to speed up your software system and make it scalable and robust.
    Our 5-step methodology:
    1. Detect: Pinpoint your performance bottlenecks in specific components using various tools, including load and stress tools, code profiling, database profiling, network sniffing and code review.
    2. Rate: Grade each bottleneck by importance and provide immediate practical recommendations and performance boost estimations.
    3. Immediate effective relief: Provide immediate fixes and workarounds in a short time frame, helping you meet your urgent business needs.
    4. Roadmap Planning: When necessary, redesign next generation solutions, using proven robust and scalable solutions such as grid and in-memory databases.
    5. Scale up and Scale out: In cases where redesign is necessary, RockeTier provides implementation or a software design description (SDD), and guidance for in-house programmers for the implementation of the next generation scalable system, which will meet your growing business needs.
    Your Value
    - Business: Achieve your business performance requirements.
    - GreenIT: Protect the environment and reduce CO2 emissions.
    - Bottom Line: Reducing hardware and 3rd party software cost.
    The Performance Experts: Success Stories
    - The Finance Sector: An international insurance company managing over 20 billion US dollars in assets was facing poor performance in its core life insurance policy software system. The RockeTier team detected bottlenecks originating from several software infrastructure modules. A practical solution was implemented. The customer's success criterion was a 20% decrease in insurance policy creation run time. Our solution provided a 40% decrease in run time!
    - Telecom: A VC-backed start-up company was facing critical installation problems at the leading Israeli cellular operator. Knowing that existing system performance would not meet client requirements, the company asked RockeTier to help it boost its performance. RockeTier evaluated the system and implemented a workaround to the system database architecture, boosting the overall system performance by 30%. Following that, the RockeTier team designed the company's next generation architecture, meeting a throughput of 3000Mbps by design.
    - Web 2.0: A start-up company providing an innovative electronic advertising and billing system was facing its technological limits. The RockeTier team evaluated and redesigned its system architecture and is currently implementing a scale-out grid mechanism and caching algorithms. The solution supports 20 times the original capacity using the same hardware. Moreover, it supports semi-linear growth (by simple scale out) and high availability requirements.
    RockeTier at a Glance
    RockeTier is a software solutions company that uses its knowledge and skills to help companies from both the enterprise sector and the start-up industry. RockeTier has numerous success stories in solving customers' system performance bottlenecks and scale-out limitations, providing immediate improvements and workarounds in a short time frame and, when necessary, redesign and implementation of next generation solutions employing grid and/or in-memory databases in the Web 2.0, Telecom and finance markets.
    "20% reduction in transaction time within 3 months"
    "Boost Performance by a factor of 200"
    "200 million events per day"
  • The world is changing faster and faster:
    - You have to minimize NRE
    - You must support unexpected demand
    - You must provide top service (people now leave a cell operator after a single incident, rather than after 5, as 10 years ago)
    - Firms are vulnerable: Citi is worth $20 billion instead of $200 billion a year ago
    - Brick-and-mortar bookstores 15 years ago, and Amazon… Will it happen again to banks, insurance, real estate agencies…
    - Is your market the next for penetration: finance? Real estate?
    - How do you beat a rival that does not exist yet?
    - http://www.johnmwillis.com/ibm/cloud-computing-and-the-enterprise/
  • Sharded database: OLTP
    - Little/no reporting
    OLAP: must be implemented for reporting
    - Loads data from sharded DBs
    - Custom mechanism
    - Any commercial
  • Start with nothing: storage, FW, LB, Server and grow… Can buy servers for more than hour
  • [email_address] http://top-performance.blogspot.com http://www.rocketier.com
  • Transcript of "1 Billion Events per Day, Israel 3rd Java Technology Day, June 22, 2009"

    1. 1 Billion Events Per Day: The Internet Building Blocks. Moshe Kaplan, RockeTier, The Performance Experts. moshe.[email_address].com, http://top-performance.blogspot.com
    2. 1 Billion Events Per Day: The Internet Building Blocks. [email_address], http://top-performance.blogspot.com
    3. RockeTier
       - The 1 billion events per day software development company:
       - Consulting
       - Boosting
       - Development
    4. Assumptions
       - Cloud Computing → Virtualization
       - Virtualization → Low end servers
       - Very large databases → High end servers
       - Therefore:
         - Very large databases ≠ Cloud Computing
    5. Assumptions…
    6. Major Options
    7. Presentation Objectives
       - Who is using MySQL?
       - MySQL Limitations
       - How to get over this?
         - Move to another DB and scale up…
         - Vertical Sharding
         - Horizontal Sharding
       - Sharding test case
    8. Who is Using MySQL?
    9. MySQL Limitations
       - Table sizes: 50-100M records per table
       - Reads: 50 queries/second
    10. Why Do I Care?
        - From 0 to 100 (US mass adoption)
          - Phone: 100 yrs
          - Radio: 40 yrs
          - TV: 30 yrs
          - Mobile: 20 yrs
          - Internet: 10 yrs
          - Facebook: 2 yrs
    11. 100K New Users/Week
    12. The Network Effect
    13. What Should I Do?
        - Oracle
        - SQL Server
        - $$$
    14. The 3 Stages System
    15. Ad Network Reference Architecture
    16. Step I – Load Balancing
        - Software: HAProxy, Apache
        - Hardware: Cisco, F5, Radware-Alteon
    17. Step II – Web Server
    18. Step III – In Memory Database
        - Diagram labels: Banner 1 Hit, Banner 2 Hit; Validate, Validate; UPDATE … SET HIT=HIT+1, UPDATE … SET HIT=HIT+1
    19. Step III – In Memory Database
        - Diagram labels: Banner 1 Hit, Banner 2 Hit; Validate, Validate; UPDATE … SET HIT=HIT+1, UPDATE … SET HIT=HIT+1; IMDB; UPDATE SET HIT=HIT+41, UPDATE SET HIT=HIT+22, UPDATE SET HIT=HIT+87
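The two Step III slides contrast one `UPDATE … SET HIT=HIT+1` per banner hit with batched updates such as `HIT=HIT+41` issued from an in-memory layer. The following is a minimal Python sketch of that idea, not the presenter's implementation: the `HitAggregator` class, the `banners` table and the use of SQLite as a stand-in database are all assumptions made for illustration.

```python
import sqlite3
from collections import Counter


class HitAggregator:
    """Accumulates banner hits in memory and flushes them in batches (hypothetical)."""

    def __init__(self, connection):
        self.connection = connection
        self.pending = Counter()

    def record_hit(self, banner_id):
        # No database round trip here: only an in-memory increment.
        self.pending[banner_id] += 1

    def flush(self):
        # One UPDATE per banner, carrying the accumulated delta (HIT = HIT + n).
        rows = [(count, banner_id) for banner_id, count in self.pending.items()]
        with self.connection:
            self.connection.executemany(
                "UPDATE banners SET hit = hit + ? WHERE id = ?", rows
            )
        self.pending.clear()
        return len(rows)


def demo():
    # SQLite in memory stands in for the real database (assumption).
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE banners (id INTEGER PRIMARY KEY, hit INTEGER NOT NULL)"
    )
    connection.executemany("INSERT INTO banners VALUES (?, 0)", [(1,), (2,)])
    aggregator = HitAggregator(connection)
    for _ in range(41):
        aggregator.record_hit(1)
    for _ in range(22):
        aggregator.record_hit(2)
    statements = aggregator.flush()
    totals = dict(connection.execute("SELECT id, hit FROM banners"))
    return statements, totals
```

In this sketch 63 hits reach the database as 2 statements. The trade-off is that hits still pending in memory are lost if the process dies before `flush()` runs, so the flush interval bounds the possible loss.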
    20. And Finally… Sharding
    21. Vertical Sharding
    22. Horizontal Sharding
        - Static Hashing
          - Complex growth
          - Simple
        - Diagram labels: Mod 10 = 0, Mod 10 = 1, Mod 10 = 2, Mod 10 = 3, Mod 10 = 4, Mod 10 = 5, Mod 10 = 6, Mod 10 = 7, Mod 10 = 8, Mod 10 = 9
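To illustrate why static hashing is simple but grows badly, here is a small Python sketch. It assumes integer keys, and the function names are hypothetical. It routes a key with the slide's modulo rule and measures how many keys change shard when an eleventh shard is added.

```python
def shard_for(key, shard_count):
    """Static hashing: the shard is the key modulo the number of shards."""
    return key % shard_count


def moved_fraction(keys, old_count, new_count):
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for key in keys if shard_for(key, old_count) != shard_for(key, new_count)
    )
    return moved / len(keys)
```

For example, `shard_for(1234, 10)` is `4`, and `moved_fraction(range(1000), 10, 11)` is `0.9`: going from 10 to 11 shards relocates 90% of these keys, which is the "complex growth" the slide warns about.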
    23. Horizontal Sharding
        - Key locations are defined in a directory
          - Simple growth
          - Directory is SPOF
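A directory-based scheme can be sketched as an explicit key-to-shard table. The Python sketch below is illustrative only: `ShardDirectory`, its least-loaded placement rule and the shard names are assumptions, not something the slides specify.

```python
class ShardDirectory:
    """Maps every key to a shard explicitly; all lookups go through this table."""

    def __init__(self, shards):
        self.shards = list(shards)
        self.locations = {}

    def assign(self, key):
        # Place a new key on the shard that currently holds the fewest keys
        # (a hypothetical placement rule; recounting on every call keeps the
        # sketch short but would be too slow for a real directory).
        if key not in self.locations:
            load = {shard: 0 for shard in self.shards}
            for shard in self.locations.values():
                load[shard] += 1
            self.locations[key] = min(self.shards, key=lambda shard: load[shard])
        return self.locations[key]

    def lookup(self, key):
        return self.locations[key]

    def add_shard(self, shard):
        # Growth is simple: existing keys stay where they are.
        self.shards.append(shard)
```

Adding a shard moves no existing key, which is the "simple growth" property. Every read and write depends on the directory being reachable, which is why the slide marks it as a single point of failure.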
    24. Horizontal Sharding
        - Static hashing with directory mapping
          - Simple growth
          - Small Directory still SPOF
        - Diagram label: Mod 1000 = 4
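One way to read this slide is as a two-step lookup: hash the key into a fixed number of buckets (1000, following the `Mod 1000` label), then map each bucket to a shard through a small directory. The Python sketch below follows that reading. The class name, the round-robin initial layout and the shard names are assumptions.

```python
BUCKET_COUNT = 1000  # fixed for the lifetime of the system (from the "Mod 1000" label)


class BucketDirectory:
    """Small directory: at most BUCKET_COUNT entries, however many keys exist."""

    def __init__(self, shards):
        # Spread the buckets round-robin over the initial shards (assumption).
        self.bucket_to_shard = {
            bucket: shards[bucket % len(shards)] for bucket in range(BUCKET_COUNT)
        }

    def shard_for(self, key):
        return self.bucket_to_shard[key % BUCKET_COUNT]

    def move_bucket(self, bucket, new_shard):
        # Growth: reassign whole buckets; only the keys in them need copying.
        self.bucket_to_shard[bucket] = new_shard
```

Because the directory holds 1000 entries rather than one entry per key, it is cheap to cache or replicate on every application server, although it remains a component every request depends on.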
    25. Horizontal Sharding
        - Each key signed by DB# generated on creation
          - Simple growth
          - New key generation is SPOF
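A key "signed by DB#" can be sketched by embedding the shard number in the key itself, so routing needs no lookup at all. In the Python sketch below, the three-digit suffix layout, `KeyGenerator` and `shard_of` are hypothetical choices made for illustration.

```python
import itertools

SHARD_DIGITS = 3  # hypothetical layout: the last three digits hold the shard number


class KeyGenerator:
    """Central key generator; the slide flags this component as the SPOF."""

    def __init__(self):
        self._sequence = itertools.count(1)

    def new_key(self, shard_number):
        if not 0 <= shard_number < 10 ** SHARD_DIGITS:
            raise ValueError("shard number out of range")
        # Sequence number in the high digits, shard number in the low digits.
        return next(self._sequence) * 10 ** SHARD_DIGITS + shard_number


def shard_of(key):
    # Routing reads the shard number straight out of the key.
    return key % 10 ** SHARD_DIGITS
```

New shards can be introduced simply by starting to issue keys that carry their number, and existing keys never move.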
    26. Sharding Management
        - Starting shards on the fly
        - Shutting down shards on the fly
        - Distributing changes in DB schema
    27. Reporting
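The speaker notes for the reporting slide say that the sharded databases serve OLTP and that reporting needs a separate mechanism that loads data from the shards. A minimal Python sketch of such a loader follows. The `hits` table, the helper names and the in-memory SQLite shards are assumptions standing in for the real databases.

```python
import sqlite3
from collections import Counter


def make_shard(rows):
    # Each in-memory SQLite database stands in for one shard (assumption).
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE hits (banner_id INTEGER, hit INTEGER)")
    connection.executemany("INSERT INTO hits VALUES (?, ?)", rows)
    return connection


def load_report(shards):
    """Aggregate per shard, then merge the partial sums into one report."""
    totals = Counter()
    query = "SELECT banner_id, SUM(hit) FROM hits GROUP BY banner_id"
    for shard in shards:
        for banner_id, hit in shard.execute(query):
            totals[banner_id] += hit
    return dict(totals)
```

Pushing the `GROUP BY` down to each shard keeps the amount of data shipped to the reporting side small. Sums merge correctly this way, whereas averages or distinct counts would need the underlying counts or rows instead.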
    28. Best Practices
        - $connection = new_db_connection("customer://1234");
        - $statement = $connection->prepare($sql_statement, $params);
        - $result = $statement->execute();
    29. Lessons
        - Vertical Sharding:
          - User Actions, Users, Comments, Items
        - Horizontal Sharding
        - Denormalization
        - MySQL Replication
    30. Lessons
        - 100M views per day
        - The path to Sharding:
          - Single server
          - Single master with multiple read slaves
          - Partitioned
          - Sharding
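The "single master with multiple read slaves" stage on the path above can be sketched as a router that sends writes to the master and rotates reads over the replicas. This Python sketch is illustrative only: the class name and the rule of classifying a statement by its first keyword are assumptions.

```python
import itertools


class ReplicatedRouter:
    """Routes writes to the master and spreads reads over the read replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def connection_for(self, sql):
        # Classify by the first keyword: SELECT is a read, anything else a write.
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb == "SELECT":
            return next(self._replicas)
        return self.master
```

Reads scale with the number of replicas, but every write still lands on the single master, which is what eventually pushes a growing system on to partitioning and sharding. A read that must see its own preceding write may also need to go to the master because of replication lag.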
    31. Lessons
        - Master-Master replication
        - Each Shard is 50% loaded
        - 40K queries/second
    32. The Bottom Line: Grow ∞
    33. Startup your Engines. Thank you. [email_address], http://top-performance.blogspot.com
