YesSQL
Evolution of database – birth of NoSQL
  • 15 years ago: availability requirements were different from today (ATM shutdown at 2 AM, service maintenance windows). Small amounts of data. Database loads were small.
  • Today, the internet has changed the game: 24x7 availability. Large data. Insane database loads.
  • Tomorrow: switching to hosted apps and thin clients. Even larger load.
NoSQL – mosaic of options
  • key-value caches: memcached, repcached, coherence, infinispan, eXtreme scale, jboss cache, velocity, terracotta
  • key-value store: keyspace, flare, schema-free, RAMCloud
  • eventually consistent key-value store: dynamo, voldemort, Dynomite, SubRecord, MotionDb, Dovetaildb
  • ordered key-value store: tokyo tyrant, lightcloud, NMDB, luxio, memcachedb, actord
  • data-structures server: redis
  • tuple store: gigaspaces, coord, apache river
  • object database: ZopeDB, db4o, Shoal
  • document store: CouchDB, Mongo, Jackrabbit, XML Databases, ThruDB, CloudKit, Persevere, Riak Basho, Scalaris
  • wide columnar store: BigTable, HBase, Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI
Best of two worlds
  • SQL: transactions, consistency, ad-hoc query language, common language
  • NoSQL: scales horizontally, super fast, always available, commodity hardware
History of ScimoreDB
Driven by demand:
  • 1999: Jubii - memory based / COM interface
  • 2003: transaction/disc enabled
  • 2004: distributed and DQL
  • 2005: Scimore founded
  • 2007: SQL
  • 2009: embedded
  • 2010: replication, merge/bi-directional
  • 2011: new distributed version for massive scale, fault tolerant
ScimoreDB v.4
  • Native SQL database for Windows
  • Distributed
  • Elastic
  • Fault tolerant
  • Transactional / consistent
  • Scale on commodity hardware
Going distributed is easy
  • You used to select a primary key and indexes per table.
  • Now you additionally need to select a distribution per table.
  • All existing SQL queries continue to run.
  • There is no magic – it's just doing it the way you would program your own sharding and map-reduce layer!
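The "own sharding and map-reduce layer" idea can be pictured with a small Python sketch. This is not ScimoreDB code: the shard count, the modulo routing and the helper names (`shard_for`, `insert`, `total_amount`) are assumptions made up for illustration.

```python
NUM_SHARDS = 3  # hypothetical shard count, chosen for illustration


def shard_for(key: int) -> int:
    """Route a row to a shard by its key (plain modulo, for illustration)."""
    return key % NUM_SHARDS


def insert(shards, row):
    """Store the row on the single shard that owns its key."""
    shards[shard_for(row["id"])].append(row)


def total_amount(shards):
    """Map: each shard sums its own rows. Reduce: add the partial sums."""
    partials = [sum(r["amount"] for r in rows) for rows in shards.values()]
    return sum(partials)


# Each shard is modelled as an in-memory list of rows.
shards = {s: [] for s in range(NUM_SHARDS)}
for i, amount in enumerate([10, 20, 30, 40]):
    insert(shards, {"id": i, "amount": amount})

print(total_amount(shards))  # 100
```

The application only calls `insert` and `total_amount`; routing and merging stay hidden, which is roughly what keeping existing SQL queries unchanged implies.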
Partition groups (shards)
  [Diagram: six nodes, Node #1 to Node #6, arranged in three partition groups, Group1 to Group3]
Distributed data over a large number of partition groups
  • Scale for large data sets
  [Diagram: many partition groups with a few nodes each]
Many nodes in each group – higher replication
  • Slower on insert/update – more machines need to be updated
  • Fast reads – more machines with the same data
  [Diagram: two groups, each containing many nodes]
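The write/read trade-off of larger groups can be sketched with a toy model. The `PartitionGroup` class below is hypothetical and only counts operations; it assumes every node in a group holds the same data and that reads rotate across nodes, which the slides do not spell out.

```python
import itertools


class PartitionGroup:
    """Hypothetical model of one group whose nodes all hold the same data."""

    def __init__(self, node_count: int):
        self.nodes = [dict() for _ in range(node_count)]
        self._next_reader = itertools.cycle(range(node_count))
        self.node_writes = 0  # counts per-node write operations

    def write(self, key, value):
        # Every replica must apply the change, so cost grows with group size.
        for node in self.nodes:
            node[key] = value
            self.node_writes += 1

    def read(self, key):
        # Any replica can answer, so reads spread across the group.
        return self.nodes[next(self._next_reader)][key]


small, large = PartitionGroup(2), PartitionGroup(6)
for group in (small, large):
    group.write("c_id:1", "Alice")

print(small.node_writes, large.node_writes)  # 2 6
print(large.read("c_id:1"))  # Alice
```

One logical write costs six node updates in the larger group, while any of its six nodes can serve the read.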
Partitioning
  • Distribute to all – replicated on all groups.
  • Partition by column hash value.
  • Round-robin.
  • Relation.
Partitioning by column hash value
  • Column [col1] -> hash(100) MOD 1024
  • Select * from table where col1 = 100
  [Diagram: the hash range 0 to 1024, with a boundary at 512, mapped onto Group1 and Group2; nodes Node #1 to Node #4]
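A minimal Python sketch of the hash routing above. Only the `MOD 1024` step and the 512 boundary come from the slide. The hash function (CRC32 as a stand-in), the assignment of each half to a group and the names `bucket_for` and `group_for` are assumptions.

```python
import zlib

BUCKETS = 1024  # from the slide: hash(value) MOD 1024

# Assumed layout: two equal bucket ranges split at 512. Which group owns
# which half is not recoverable from the slide, so this mapping is a guess.
RANGES = [(0, 512, "Group1"), (512, 1024, "Group2")]


def bucket_for(value) -> int:
    """Hash a column value into one of the 1024 buckets (CRC32 is a stand-in)."""
    return zlib.crc32(str(value).encode("utf-8")) % BUCKETS


def group_for(value) -> str:
    """Find the partition group whose bucket range contains the value's bucket."""
    bucket = bucket_for(value)
    for low, high, group in RANGES:
        if low <= bucket < high:
            return group
    raise ValueError(f"bucket {bucket} is not covered by any range")


# A query such as: Select * from table where col1 = 100
# only needs to be sent to the one group that owns the bucket for 100.
print(group_for(100))
```

Because the bucket is computed from the value in the WHERE clause, an equality lookup on the partitioning column can go to a single group instead of all of them.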
Demo on Amazon EC2
  Customer
    c_id      bigint
    c_name    varchar
    c_zip     varchar
  Products
    p_id      bigint
    p_Name    varchar
    p_price   money
  Orders
    o_id      autobigint
    o_c_id    bigint
    o_p_id    bigint
    o_amount  int
    o_date    datetime
    o_price   money

ScimoreDB @ CommunityDays 2011

Performance
  • Single machine, 8 cores, simple select: 75,000 queries/s (10 client threads)
  • Vodafone cluster of 6 machines: 21,000 transactions inserting 1 row
  • DTU cluster of 31 small machines, TPC-C: 140,000 transactions/s (35% insert, 35% update, 30% select)

  • ACID transactions
  • Crash safe recovery
  • Row & table level locking
  • Dynamic phase commit (D2PC)
  • Dynamic group commit
  • Transaction isolation levels (read committed, repeatable read)
  • In-doubt transaction state
  • Multiversion concurrency control (MVCC)
  • Local and distributed deadlock detection
  • Write-ahead logging
  • Fuzzy checkpoint - non-blocking checkpoints
  • B+-Tree
  • Page compression
  • TEXT/NTEXT field compression
  • System tables: performance, monitoring, schema
  • T-SQL
  • Recursive queries/CTE
  • Security – users & roles
  • Free text (lucene)
  • ScimoreOS, fiber based task scheduling
  • 100% asynchronous, IO-completion based
  • NUMA aware
  • Distributed query optimizer
  • Distributed tree execution
  • Query prioritization and throttling