ScimoreDB @ CommunityDays 2011


Scimore presentation @ CommunityDays 2011, Copenhagen.

A quick introduction to the motivation for ScimoreDB.

An explanation of the cluster topology: sharding/grouping and replication.


  1. YesSQL
  2. Evolution of databases – birth of NoSQL
     15 years ago:
       • Availability requirements were different from today (ATMs shut down at 2 AM, services had maintenance windows).
       • Small amounts of data.
       • Database loads were small.
     Today the internet has changed the game:
       • 24x7 availability.
       • Large data.
       • Insane database loads.
     Tomorrow:
       • Switching to hosted apps and thin clients means even larger loads.
  3. NoSQL – mosaic of options
       • Key-value caches: memcached, repcached, coherence, infinispan, eXtreme scale, jboss cache, velocity, terracotta
       • Key-value stores: keyspace, flare, schema-free, RAMCloud
       • Eventually consistent key-value stores: dynamo, voldemort, Dynomite, SubRecord, MotionDb, Dovetaildb
       • Ordered key-value stores: tokyo tyrant, lightcloud, NMDB, luxio, memcachedb, actord
       • Data-structures servers: redis
       • Tuple stores: gigaspaces, coord, apache river
       • Object databases: ZopeDB, db4o, Shoal
       • Document stores: CouchDB, Mongo, Jackrabbit, XML Databases, ThruDB, CloudKit, Persevere, Riak Basho, Scalaris
       • Wide columnar stores: BigTable, HBase, Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI
  4. Best of two worlds
     SQL:
       • Transactions
       • Consistency
       • Ad-hoc query language
       • Common language
     NoSQL:
       • Scales horizontally
       • Super fast
       • Always available
       • Commodity hardware
  5. History of ScimoreDB
     Driven by demand:
       • 1999: Jubii, memory based / COM interface
       • 2003: transaction/disk enabled
       • 2004: distributed and DQL
       • 2005: Scimore founded
       • 2007: SQL
       • 2009: embedded
       • 2010: replication, merge/bi-directional
       • 2011: new distributed version for massive scale, fault tolerant
  6. ScimoreDB v.4
     Native SQL database for Windows:
       • Distributed
       • Elastic
       • Fault tolerant
       • Transactional / consistent
       • Scales on commodity hardware
  7. Going distributed is easy
       • You used to select a primary key and indexes per table. Now you additionally need to select a distribution per table.
       • All existing SQL queries continue to run.
       • There is no magic – it's just doing it the way you would program your own sharding and map-reduce layer!
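
The "own sharding and map-reduce layer" idea can be illustrated with a minimal sketch. This is not ScimoreDB code: the in-memory `shards` list and the function names are hypothetical, and the column names `o_c_id` and `o_amount` are borrowed from the Orders table in the demo slide. Each shard computes a partial aggregate locally (map), and a coordinator merges the partials (reduce), roughly what a distributed `SELECT o_c_id, SUM(o_amount) ... GROUP BY o_c_id` would have to do.

```python
from collections import Counter

# Hypothetical in-memory shards standing in for partition groups.
shards = [
    [{"o_c_id": 1, "o_amount": 2}, {"o_c_id": 2, "o_amount": 1}],
    [{"o_c_id": 1, "o_amount": 5}],
    [{"o_c_id": 3, "o_amount": 4}, {"o_c_id": 2, "o_amount": 3}],
]


def map_shard(rows):
    """Map step: sum o_amount per customer using only one shard's rows."""
    partial = Counter()
    for row in rows:
        partial[row["o_c_id"]] += row["o_amount"]
    return partial


def reduce_partials(partials):
    """Reduce step: merge the per-shard partial sums into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total


def total_amount_per_customer(all_shards):
    """Scatter the aggregate to every shard, then gather and merge."""
    return dict(reduce_partials(map_shard(rows) for rows in all_shards))


print(total_amount_per_customer(shards))
```

A real layer would run the map step on the nodes themselves and ship only the partial results over the network; the sketch keeps everything in one process to show the shape of the computation.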
  8. Partition Groups (shard)
     [Diagram: three partition groups (Group1, Group2, Group3) holding six nodes between them (Node #1 to Node #6).]
  9. Distributed data over a large number of partition groups
       • Scale horizontally for writes
       • Scale for large data sets
     [Diagram: nine groups holding eighteen nodes between them.]
  11. Many nodes in each group – higher replication
        • Safety – decide how safe you want to be.
        • Slower on insert/update – more machines need to be updated.
        • Fast reads – more machines hold the same data.
      [Diagram: two groups holding twelve nodes between them.]
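
The trade-off on this slide can be made concrete with a small sketch. It assumes a simple write-all, read-one scheme, which the slides do not confirm as ScimoreDB's actual protocol. The `ReplicatedGroup` class and its counter are hypothetical names used only for illustration.

```python
import itertools


class ReplicatedGroup:
    """Toy partition group: every node holds a full copy of the group's data."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]
        self._next_reader = itertools.cycle(range(node_count))
        self.writes_performed = 0  # counts physical writes across nodes

    def put(self, key, value):
        # Assumed write-all: an insert/update touches every node,
        # so write cost grows with the number of nodes in the group.
        for node in self.nodes:
            node[key] = value
            self.writes_performed += 1

    def get(self, key):
        # Read-one: any node can answer, so reads are spread round-robin.
        return self.nodes[next(self._next_reader)].get(key)


group = ReplicatedGroup(node_count=3)
group.put("c_id:1", "Alice")
print(group.writes_performed)  # one logical write, three physical writes
print(group.get("c_id:1"))
```

Adding nodes to a group raises the cost of each write but gives more copies to survive failures and more machines to serve reads, which is the choice the slide describes.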
  14. Partitioning
        • Distribute to all – replicated on all groups.
        • Partition by column hash value.
        • Round-robin.
        • Relation.
  15. Partitioning by column hash value
      Column [col1] -> hash(100) MOD 1024
      Select * from table where col1 = 100
      [Diagram: hash range from 0 to 1024 with a midpoint at 512, shared between Group1 and Group2; nodes shown are Node #1 to Node #4.]
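
A minimal sketch of this routing, under stated assumptions: CRC32 stands in for the unknown hash function, and the assignment of buckets 0 to 511 to Group1 and 512 to 1023 to Group2 is a guess based on the split at 512 in the diagram. All function names are hypothetical.

```python
import zlib

NUM_BUCKETS = 1024  # from the slide: hash(value) MOD 1024

# Assumed bucket ranges; the slide only shows a split at 512 between two groups.
GROUP_RANGES = [
    ("Group1", range(0, 512)),
    ("Group2", range(512, 1024)),
]


def bucket_for(value):
    """Map a column value to a bucket in [0, NUM_BUCKETS)."""
    # CRC32 is a stand-in hash, not ScimoreDB's actual function.
    return zlib.crc32(str(value).encode("utf-8")) % NUM_BUCKETS


def group_for_bucket(bucket):
    """Find the group whose bucket range contains the given bucket."""
    for name, buckets in GROUP_RANGES:
        if bucket in buckets:
            return name
    raise ValueError(f"bucket {bucket} is outside 0..{NUM_BUCKETS - 1}")


def group_for(value):
    """Route a lookup such as 'where col1 = 100' to a single group."""
    return group_for_bucket(bucket_for(value))


print(group_for(100))
```

Because the same value always hashes to the same bucket, an equality query on the partitioning column only needs to contact one group, while a query on any other column still has to be sent to all of them.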
  16. Demo on Amazon EC2
      Customer:
        • c_id bigint
        • c_name varchar
        • c_zip varchar
      Products:
        • p_id bigint
        • p_Name varchar
        • p_price money
      Orders:
        • o_id autobigint
        • o_c_id bigint
        • o_p_id bigint
        • o_amount int
        • o_date datetime
        • o_price money
  17. Performance
        • Single 8-core machine, simple select: 75,000 queries/s (10 client threads)
        • Vodafone cluster of 6 machines: 21,000 transactions inserting 1 row
        • DTU cluster of 31 small machines, TPC-C: 140,000 transactions/s (35% insert, 35% update, 30% select)
  18. ACID transactions
        • Crash-safe recovery
        • Row and table level locking
        • Dynamic two-phase commit (D2PC)
        • Dynamic group commit
        • Transaction isolation levels (read committed, repeatable read)
        • In-doubt transaction state
        • Multiversion concurrency control (MVCC)
        • Local and distributed deadlock detection
        • Write-ahead logging
        • Fuzzy checkpoint – non-blocking checkpoints
        • B+-Tree
        • Page compression
        • TEXT/NTEXT field compression
        • System tables: performance, monitoring, schema
        • T-SQL
        • Recursive queries/CTE
        • Security – users and roles
        • Free text (Lucene)
        • ScimoreOS, fiber-based task scheduling
        • 100% asynchronous, I/O-completion based
        • NUMA aware
        • Distributed query optimizer
        • Distributed tree execution
        • Query prioritization and throttling
  19. Questions