ScimoreDB @ CommunityDays 2011


Scimore presentation @ CommunityDays 2011, Copenhagen. A quick introduction to the motivation for ScimoreDB, followed by an explanation of the cluster topology: sharding/grouping and replication.

Evolution of databases: birth of NoSQL
15 years ago: Availability requirements were different from today (ATM shutdown at 2 AM, service maintenance windows). Small amounts of data. Database loads were small.
Today: The internet has changed the game. 24x7 availability. Large data. Insane database loads.
Tomorrow: Switching to hosted apps and thin clients. Even larger loads.

NoSQL: mosaic of options
key-value caches: memcached, repcached, coherence, infinispan, eXtreme scale, jboss cache, velocity, terracotta
key-value store: keyspace, flare, schema-free, RAMCloud
eventually consistent key-value store: dynamo, voldemort, Dynomite, SubRecord, MotionDb, Dovetaildb
ordered key-value store: tokyo tyrant, lightcloud, NMDB, luxio, memcachedb, actord
data-structures server: redis
tuple store: gigaspaces, coord, apache river
object database: ZopeDB, db4o, Shoal
document store: CouchDB, Mongo, Jackrabbit, XML Databases, ThruDB, CloudKit, Persevere, Riak Basho, Scalaris
wide columnar store: BigTable, HBase, Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI

Best of two worlds
SQL: transactions, consistency, ad-hoc query language, common language.
NoSQL: scales horizontally, super fast, always available, commodity hardware.

History of ScimoreDB
Driven by demand:
1999: Jubii, memory based / COM interface
2003: transaction/disk enabled
2004: distributed and DQL
2005: Scimore founded
2007: SQL
2009: embedded
2010: replication, merge/bi-directional
2011: new distributed version for massive scale, fault tolerant

ScimoreDB v.4
Native SQL database for Windows
Distributed
Elastic
Fault tolerant
Transactional / consistent
Scales on commodity hardware

Going distributed is easy
You used to select a primary key and indexes per table. Now you additionally need to select a distribution per table. All existing SQL queries continue to run. There is no magic: it simply does what you would do if you programmed your own sharding and map-reduce layer.
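To make the "own sharding and map-reduce layer" remark concrete, here is a minimal sketch of what such a hand-written layer could look like. It is not ScimoreDB code: the modulo-based `shard_for`, the in-memory `shards` list, and the `customer_id`/`amount` fields are all assumptions chosen for illustration.

```python
from collections import defaultdict

NUM_SHARDS = 3  # assumption: three shards, picked only for the example


def shard_for(key: int) -> int:
    """Choose a shard from the distribution column (assumed: plain modulo)."""
    return key % NUM_SHARDS


# Each shard holds its own slice of a hypothetical orders table.
shards = [[] for _ in range(NUM_SHARDS)]


def insert(order: dict) -> None:
    """Route a row to exactly one shard based on its customer_id."""
    shards[shard_for(order["customer_id"])].append(order)


def total_amount_per_customer() -> dict:
    """Scatter an aggregate query to every shard, then merge the results."""
    # Map step: every shard aggregates its local rows independently.
    partials = []
    for rows in shards:
        local = defaultdict(int)
        for row in rows:
            local[row["customer_id"]] += row["amount"]
        partials.append(local)

    # Reduce step: merge the per-shard partial sums into one answer.
    merged = defaultdict(int)
    for partial in partials:
        for customer_id, amount in partial.items():
            merged[customer_id] += amount
    return dict(merged)


for customer_id, amount in [(1, 10), (2, 5), (1, 7), (4, 3)]:
    insert({"customer_id": customer_id, "amount": amount})

result = total_amount_per_customer()
```

The point of the slide is that a distributed SQL engine does this routing and merging for you, so the application keeps issuing ordinary queries.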

Partition groups (shards)
[Diagram: three partition groups (Group1, Group2, Group3) holding six nodes, Node #1 to Node #6.]

Distribute data over a large number of partition groups

Scale for large data sets
[Diagram: many partition groups, with nodes spread across them.]

Many nodes in each group: higher replication

Slower on insert/update: more machines need to be updated

Fast reads: more machines with the same data
[Diagram: two partition groups, each holding many nodes.]
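The trade-off on these slides can be illustrated with a toy model. The sketch below assumes that every node in a group holds a full copy of the group's data, that a write touches every node, and that a read can be served by any single node. The `PartitionGroup` class and its counters are hypothetical and say nothing about how ScimoreDB actually schedules replication.

```python
class PartitionGroup:
    """Toy model of one partition group with fully replicated nodes."""

    def __init__(self, node_count: int) -> None:
        self.nodes = [dict() for _ in range(node_count)]
        self.node_writes = 0      # how many individual node updates happened
        self._next_reader = 0     # rotates reads across the replicas

    def write(self, key, value) -> None:
        # A write must reach every replica, so cost grows with group size.
        for node in self.nodes:
            node[key] = value
            self.node_writes += 1

    def read(self, key):
        # Any replica can answer, so reads spread over all nodes.
        node = self.nodes[self._next_reader % len(self.nodes)]
        self._next_reader += 1
        return node[key]


small = PartitionGroup(node_count=2)
large = PartitionGroup(node_count=6)
for group in (small, large):
    group.write("k", "v")

reads = [large.read("k") for _ in range(6)]
```

In this model one logical write costs two node updates in the small group and six in the large one, while the six reads against the large group each land on a different replica.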

Partitioning
Distribute to all: replicated on all groups.
Partition by column hash value.
Round-robin.
Relation.
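Two of the listed strategies are easy to sketch side by side. The example below is an illustration under assumptions, not ScimoreDB's implementation: the group names, the `storage` dictionaries, and the helper functions are all made up for the example, and the relation-based strategy is left out because the slide gives no detail about it.

```python
from itertools import cycle

GROUPS = ["Group1", "Group2", "Group3"]  # assumption: three partition groups


def distribute_to_all(row, storage: dict) -> None:
    """Store a copy of the row in every group (fully replicated table)."""
    for group in GROUPS:
        storage[group].append(row)


def make_round_robin():
    """Return a placement function that hands rows to groups in turn."""
    order = cycle(GROUPS)

    def place(row, storage: dict) -> None:
        storage[next(order)].append(row)

    return place


replicated = {group: [] for group in GROUPS}
spread = {group: [] for group in GROUPS}
place_round_robin = make_round_robin()

for row in ["r1", "r2", "r3", "r4"]:
    distribute_to_all(row, replicated)
    place_round_robin(row, spread)
```

Distribute-to-all suits small lookup tables that every group needs locally, while round-robin spreads rows evenly but gives no way to find a row from its key alone.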

Partitioning by column hash value
Column [col1] -> hash(100) MOD 1024
Select * from table where col1 = 100
[Diagram: hash range from 0 to 1024 with a split at 512, shared between Group1 and Group2; Node #1 to Node #4 are spread over the two groups.]
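The lookup for `col1 = 100` can be sketched as follows. Several details are assumptions: the slide does not name the hash function, so CRC32 from Python's `zlib` stands in for it, and the mapping of the lower and upper half of the slot range to Group1 and Group2 is a guess, since the diagram's layout did not survive.

```python
import zlib

SLOTS = 1024  # from the slide: hash(value) MOD 1024

# Assumed ownership of the slot range; the real assignment is not known.
GROUP_RANGES = [
    (0, 512, "Group1"),
    (512, 1024, "Group2"),
]


def slot_for(value) -> int:
    """Hash a column value into one of the 1024 slots (CRC32 is a stand-in)."""
    return zlib.crc32(str(value).encode("utf-8")) % SLOTS


def group_for(value) -> str:
    """Find the partition group that owns the slot of this column value."""
    slot = slot_for(value)
    for low, high, group in GROUP_RANGES:
        if low <= slot < high:
            return group
    raise ValueError(f"slot {slot} is not covered by any group")


# A query such as: Select * from table where col1 = 100
# only needs to be sent to the group that owns the slot of 100.
target_group = group_for(100)
```

Because the hash is deterministic, every client computes the same target group for the same value, so an equality query on the distribution column touches a single group instead of the whole cluster.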

Demo on Amazon EC2
Customer: c_id bigint, c_name varchar, c_zip varchar
Products: p_id bigint, p_Name varchar, p_price money
Orders: o_id autobigint, o_c_id bigint, o_p_id bigint, o_amount int, o_date datetime, o_price money

1. YesSQL

17. Performance
Single machine, 8 cores, simple select: 75,000 queries/s (10 client threads).
Vodafone cluster of 6 machines: 21,000 transactions, each inserting 1 row.
DTU cluster of 31 small machines, TPC-C: 140,000 transactions/s (35% insert, 35% update, 30% select).

18. ACID transactions
Crash-safe recovery
Row and table level locking
Dynamic two-phase commit (D2PC)
Dynamic group commit
Transaction isolation levels (read committed, repeatable read)
In-doubt transaction state
Multiversion concurrency control (MVCC)
Local and distributed deadlock detection
Write-ahead logging
Fuzzy checkpoint: non-blocking checkpoints
B+-Tree
Page compression
TEXT/NTEXT field compression
System tables: performance, monitoring, schema
T-SQL
Recursive queries/CTE
Security: users and roles
Free text (Lucene)
ScimoreOS, fiber-based task scheduling
100% asynchronous, I/O-completion based
NUMA aware
Distributed query optimizer
Distributed tree execution
Query prioritization and throttling

19. Questions
