Database Sharding the Right Way: Easy, Reliable, and Open source - HighLoad++ 2012

The CUBRID team gave this presentation at the Russian HighLoad++ Conference in October 2012. It covers Big Data management through database sharding. The CUBRID open source RDBMS provides native sharding support with load balancing, connection pooling, and automatic failover.

Published in: Technology

  1. 1. Database Sharding the Right Way: Easy, Reliable, and Open Source.
  3. 3. Growing in the Wild. The story by CUBRID Database Developers. View on Slideshare http://profyclub.ru/docs/439
  7. 7. =Big Business Opportunity
  8. 8. SQL: Enterprise; Vendor dependency; Scalability constraints; Common interface. NoSQL: Open Source; Scalable; Non-standard API.
  10. 10. SQL; Transactions; NoSQL => No ACID; Standard Interface; Experts
  11. 11. DBMS market ($MM), 2009 / 2010 / 2011: Worldwide 21,359 / 23,252 / 26,701 (11.8%); Korea 349 / 395 / 478 (17%); Ratio 1.6% / 1.7% / 1.8%. [Chart: Korea vs. Worldwide, 2009 to 2011.] Source: Gartner, 2012
  12. 12. RDBMS is still the best choice for mission-critical data
  13. 13. Database Sharding
  14. 14. Name | Type | Requirements (DB) | Requirements (ETC) | Interface
      • Hibernate shards | AS framework | DBMS w/ Hibernate support | Hibernate, JVM | Java
      • dbShards | AS & Middleware | MySQL | | Java, C
      • Gizzard (Twitter) | Middleware | Any storage | JVM | Java
      • Spider for MySQL | Middleware & Storage Engine | MySQL | | Any
      • CUBRID SHARD | Middleware | CUBRID, MySQL, Oracle | | Any
  17. 17. Is there such RDBMS?
  18. 18. CUBRID 9.0
  21. 21. Easy Installation
  22. 22. http://www.cubrid.org/downloads
  24. 24. SHARD_KEY_MODULAR = 256
      SHARD_KEY_LIBRARY_NAME = ''
      SHARD_KEY_FUNCTION_NAME = ''
  25. 25. • id • user_id = • order_no • …
  26. 26.
      /* Maps an integer shard key value to one of `mod` shard keys. */
      int user_get_shard_key(int type, void *val)
      {
          int mod = 2;

          if (val == NULL) {
              return ERROR_ON_ARGUMENT;
          }

          switch (type) {
          case SHARD_U_TYPE_INT: {
              int ival = *(int *)val;
              return ival % mod;
          }
          case SHARD_U_TYPE_STRING:
              /* String keys are not handled by this function. */
              return ERROR_ON_MAKE_SHARD_KEY;
          default:
              return ERROR_ON_ARGUMENT;
          }
      }
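The modulo logic of the key function on slide 26 can be mirrored in Java to experiment with how integer keys spread across shard keys before building a C library. This is only a sketch: `ShardKeySketch` and `shardKey` are hypothetical names and not part of CUBRID. It also assumes that a shard key should never be negative, so it uses `Math.floorMod` rather than `%`, which yields a negative result for a negative left operand in both C and Java.

```java
import java.util.OptionalInt;

// Hypothetical helper, not part of CUBRID: mirrors the mod-2 logic of
// user_get_shard_key for quick experiments on the client side.
final class ShardKeySketch {
    // Same modulus as the slide's example function.
    static final int MOD = 2;

    // Returns the shard key for an Integer value, or an empty result for
    // null or unsupported types (the slide's function returns error codes).
    static OptionalInt shardKey(Object val) {
        if (val instanceof Integer) {
            int ival = (Integer) val;
            // Assumption: keys must be non-negative, hence floorMod.
            return OptionalInt.of(Math.floorMod(ival, MOD));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        Object[] samples = {100, 7, -3, "abc"};
        for (Object v : samples) {
            System.out.println(v + " -> " + shardKey(v));
        }
    }
}
```

Running `main` prints the shard key chosen for each sample value, which makes it easy to see whether a given modulus distributes real IDs evenly.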
  27. 27. Configuring CUBRID SHARD is very easy!
  28. 28. $> cubrid createdb shard1
      $> csql -S -u dba shard1 -c "create user shard password 'shard123'"
      $> cubrid server start shard1
  29. 29. $> csql -C -u shard -p shard123 shard1@localhost -c "CREATE TABLE users (id BIGINT PRIMARY KEY, name VARCHAR(20), age SMALLINT)"
  30. 30. $> cubrid shard start
      @ cubrid shard start
      ++ cubrid shard start: success
  31. 31. connectionURL ="jdbc:cubrid:localhost:45511:shard1:shard:shard123:";
  32. 32. String query = "SELECT name FROM student WHERE student_no = /*+ shard_key */ ?;";
      PreparedStatement query_stmt = connection.prepareStatement(query);
      query_stmt.setInt(1, 100);
      ResultSet rs = query_stmt.executeQuery();
      // fetch resultset

      key_column   range min (hash result)   range max (hash result)   shard_id
      student_no   0                         63                        0
      student_no   64                        127                       1
      student_no   128                       191                       2
      student_no   192                       255                       3
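The range table on slide 32 can be read as a two-step lookup: hash the shard key value into 0..255, then find the range that contains the hash. The sketch below illustrates that reading. It assumes the hash for an integer key is simply the key modulo `SHARD_KEY_MODULAR` (256 on slide 24); the slides do not spell out the hash, and the class and method names are hypothetical, not CUBRID API.

```java
// Hypothetical sketch, not the CUBRID implementation: routes an integer
// shard key to a shard_id using the range table from the slide.
final class ShardRangeSketch {
    static final int SHARD_KEY_MODULAR = 256;

    // Rows of {min, max, shard_id}, copied from the slide's table.
    static final int[][] RANGES = {
        {0, 63, 0},
        {64, 127, 1},
        {128, 191, 2},
        {192, 255, 3},
    };

    // Assumption: the hash result is the key modulo SHARD_KEY_MODULAR,
    // kept non-negative with floorMod.
    static int hash(int key) {
        return Math.floorMod(key, SHARD_KEY_MODULAR);
    }

    // Finds the shard whose [min, max] range contains the hash result.
    static int shardIdFor(int key) {
        int h = hash(key);
        for (int[] range : RANGES) {
            if (h >= range[0] && h <= range[1]) {
                return range[2];
            }
        }
        throw new IllegalStateException("no range covers hash " + h);
    }

    public static void main(String[] args) {
        int[] studentNos = {100, 300, 255};
        for (int no : studentNos) {
            System.out.println("student_no " + no + " -> shard " + shardIdFor(no));
        }
    }
}
```

Under these assumptions, the query on the slide with `student_no = 100` hashes to 100, which falls in the 64 to 127 range and is therefore routed to shard 1.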
  33. 33. SELECT name FROM student WHERE student_no = /*+ shard_key */ ?;
  34. 34. How did we tackle the unique ID problem?
  36. 36. CUBRID SHARD Performance
  37. 37. Description | Quantity | OS (64bit) / CPU / MEM
      • Agent to generate load and NDrive App Simulator | 8 | CentOS 5.3 / Xeon 2G-8core / 8G
      • CUBRID Shard | 1 | CentOS 5.3 / Xeon 2.27G-16core / 24G
      • CUBRID Broker | 1 | CentOS 5.3 / Xeon 2.27G-16core / 24G
      • Meta DB | 4 | CentOS 5.x / Xeon 2.33G-4core / 8G
      • User DB | 1 | CentOS 5.3 / Xeon 2.5G-8core / 8G
  38. 38. [Chart: Load Generator Performance. RPS vs. # of concurrent users (32 to 512).] [Chart: Performance trend when load is increased. Series: proxy cpu, RPS, metadb TPS, Mean Time (ms); x-axis 64 to 320.]
  39. 39. - Performance is similar up to 128 Vusers. - Without SHARD, 128 Vusers is the maximum. - With SHARD, as the number of Vusers increases, maximum performance is achieved together with shorter response times and lower CPU utilization. [Chart x-axis: 64 to 320 Vusers.]
  40. 40. TPC-C Performance Test
  41. 41. • AWS Xlarge instance: 7GB RAM, 20 EC2 units • Ubuntu 12.04 64-bit • CUBRID 9.0 (beta), no sharding • MySQL 5.5.28 • Buffer: 2.8GB data_buffer_size (CUBRID), 2.8GB innodb_buffer_pool_size (MySQL) • Default configurations
  42. 42. [Chart: TPC-C Index, MySQL 5.5.28 vs. CUBRID 9.0; bar values 44.18 and 42.66.]
  45. 45. What’s next for CUBRID?
  47. 47. www.cubrid.org. Esen Sagynov, CUBRID Project Manager, esen@cubrid.org. CUBRID Q&A: www.cubrid.org/questions
