Multiple ways of storing 
-> Data <- 
SQL -> NOSQL -> NEWSQL 
Tony Rogerson 
@tonyrogerson 
tonyrogerson@torver.net 
dataidol.com/tonyrogerson
Agenda 
Data structures 
◦ Relational, Key/Value pair, Document, Graph, Column/Column Family Store 
Key Concepts 
◦ Hashing, Partitioning, Sharding, ACID, BASE 
Technology Areas 
◦ SQL, NoSQL, NewSQL
Who-am-I 
Freelance SQL Server professional and Data Specialist 
Fellow BCS, MSc in BI, PGCert in Data Science 
Started out in 1986 – VSAM, System W, Application System, DB2, Oracle, SQL Server since 4.21a 
Awarded SQL Server MVP yearly since ’97 
Founded UK SQL Server User Group back in ’99, founder member of DDD, SQL Bits, SQL Relay, SQL Santa 
Interested in commodity-based distributed processing of data.
Data Structures 
WAYS OF STRUCTURING DATA
What is data? 
Tony Rogerson 
tonyrogerson@torver.net 
Harpenden 
36 on 2014-01-01, 36 on 2014-05-01, 38 on 2014-10-15 
46 
44
Data needs context and structure 
Tony Rogerson                FullName 
tonyrogerson@torver.net      Email 
Harpenden                    PostalTown 
36 on 2014-01-01, 
36 on 2014-05-01,            {WaistInches, RecordedOn} 
38 on 2014-10-15 
46                           ChestInches 
44                           Age 
Schema gives context
Relational [Tables] 
People 
FullName (PK)   Email                     PostalTown   WaistInches   ChestInches   AgeYears 
Tony Rogerson   tonyrogerson@torver.net   Harpenden                  46            44 
WaistInches 
FullName (FK)   WaistInches   RecordedDate 
Tony Rogerson   36            2014-01-01 
Tony Rogerson   36            2014-05-01 
Tony Rogerson   38            2014-10-15 
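A minimal sketch of the two relational tables, using Python's built-in sqlite3 module. The in-memory SQLite database and the column types are assumptions made for illustration, not the setup used in the talk; the values come from the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE People (
    FullName    TEXT PRIMARY KEY,
    Email       TEXT,
    PostalTown  TEXT,
    ChestInches INTEGER,
    AgeYears    INTEGER
);
CREATE TABLE WaistInches (
    FullName     TEXT REFERENCES People (FullName),
    WaistInches  INTEGER,
    RecordedDate TEXT
);
""")

conn.execute(
    "INSERT INTO People VALUES (?, ?, ?, ?, ?)",
    ("Tony Rogerson", "tonyrogerson@torver.net", "Harpenden", 46, 44),
)
conn.executemany(
    "INSERT INTO WaistInches VALUES (?, ?, ?)",
    [
        ("Tony Rogerson", 36, "2014-01-01"),
        ("Tony Rogerson", 36, "2014-05-01"),
        ("Tony Rogerson", 38, "2014-10-15"),
    ],
)

# The repeating waist measurements live in their own table and are joined back.
rows = conn.execute("""
    SELECT p.FullName, p.PostalTown, w.WaistInches, w.RecordedDate
    FROM People AS p
    JOIN WaistInches AS w ON w.FullName = p.FullName
    ORDER BY w.RecordedDate
""").fetchall()
```

The schema carries the context: each value is reachable by table and column name, and the foreign key ties every measurement to exactly one person.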
Key/Value pair (EAV) 
Entity        Attribute      Value 
Person        FullName       Tony Rogerson 
Person        Email          tonyrogerson@torver.net 
Person        PostalTown     Harpenden 
Person        ChestInches    46 
Person        Age            44 
WaistInches   FullName       Tony Rogerson 
WaistInches   WaistInches    36 
WaistInches   RecordedDate   2014-01-01 
WaistInches   FullName       Tony Rogerson 
WaistInches   WaistInches    36 
WaistInches   RecordedDate   2014-05-01 
WaistInches   FullName       Tony Rogerson 
WaistInches   WaistInches    38 
WaistInches   RecordedDate   2014-10-15 
Examples: Riak, Dynamo, Redis, FoundationDB etc. 
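A small sketch of reading EAV rows back into whole entities. The entity ids (`person:1`, `waist:1`, ...) are an assumption added here: the rows on the slide carry only the entity type, so something extra is needed to tell the three waist records apart.

```python
from collections import defaultdict

# (entity id, attribute, value) triples; the ids are hypothetical surrogate keys.
eav_rows = [
    ("person:1", "FullName", "Tony Rogerson"),
    ("person:1", "Email", "tonyrogerson@torver.net"),
    ("person:1", "PostalTown", "Harpenden"),
    ("person:1", "ChestInches", 46),
    ("person:1", "Age", 44),
    ("waist:1", "FullName", "Tony Rogerson"),
    ("waist:1", "WaistInches", 36),
    ("waist:1", "RecordedDate", "2014-01-01"),
    ("waist:2", "FullName", "Tony Rogerson"),
    ("waist:2", "WaistInches", 36),
    ("waist:2", "RecordedDate", "2014-05-01"),
]


def pivot(rows):
    """Group attribute/value pairs by entity id into one dict per entity."""
    entities = defaultdict(dict)
    for entity_id, attribute, value in rows:
        entities[entity_id][attribute] = value
    return dict(entities)


entities = pivot(eav_rows)
```

Every attribute is just another row, so adding a new attribute needs no schema change, but rebuilding an entity costs a pass over all of its rows.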
Document 
JSON Schema 
{ 
  "FullName" : "string", 
  "Email" : "string", 
  "PostalTown" : "string", 
  "WaistInches" : { 
    "WaistInches" : "number", 
    "RecordedDate" : "string" }, 
  "ChestInches" : "number", 
  "Age" : "number" 
} 
JSON Document 
{ 
  "FullName" : "Tony Rogerson", 
  "Email" : "tonyrogerson@torver.net", 
  "PostalTown" : "Harpenden", 
  "WaistInches" : [ 
    { "WaistInches" : 36, "RecordedDate" : "2014-01-01" }, 
    { "WaistInches" : 36, "RecordedDate" : "2014-05-01" } ], 
  "ChestInches" : 46, 
  "Age" : 44 
} 
Examples: MongoDB, Couchbase, CouchDB etc. 
JSON vs XML discussion: http://stackoverflow.com/questions/4862310/json-and-xml-comparison 
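A short sketch of working with the document form, using Python's standard json module. Picking the "latest" measurement by comparing the date strings is an assumption that relies on the ISO yyyy-mm-dd format sorting correctly as text.

```python
import json

raw = """
{
  "FullName": "Tony Rogerson",
  "Email": "tonyrogerson@torver.net",
  "PostalTown": "Harpenden",
  "WaistInches": [
    {"WaistInches": 36, "RecordedDate": "2014-01-01"},
    {"WaistInches": 36, "RecordedDate": "2014-05-01"}
  ],
  "ChestInches": 46,
  "Age": 44
}
"""

doc = json.loads(raw)

# The repeating group travels inside the document: no join is needed to reach it.
latest = max(doc["WaistInches"], key=lambda m: m["RecordedDate"])
```

The whole person, including the nested measurements, is read and written as one unit.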
Schema Design 
E.g. 100 machine cluster 
Document Database Normal Form (Relational) 
{ 
"firstName": "John", 
"lastName": "Smith", 
"isAlive": true, 
"age": 25, 
"height_cm": 167.6, 
"address": { 
"streetAddress": "21 2nd Street", 
"city": "New York", 
"state": "NY", 
"postalCode": "10021-3100" 
}, 
"phoneNumbers": [ 
{ "type": "home", "number": "212 555-1234" }, 
{ "type": "office", "number": "646 555-4567" } 
] 
} 
Document Database: object data stored together (collection) 
Normal Form (Relational): object data stored separately (tables: person, address, phoneNumbers)
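To make the contrast concrete, here is a sketch that splits the nested document into rows for three separate tables. The `person_id` key and the `normalise` helper are hypothetical names introduced for this example only.

```python
document = {
    "firstName": "John",
    "lastName": "Smith",
    "isAlive": True,
    "age": 25,
    "height_cm": 167.6,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": "10021-3100",
    },
    "phoneNumbers": [
        {"type": "home", "number": "212 555-1234"},
        {"type": "office", "number": "646 555-4567"},
    ],
}


def normalise(person_id, doc):
    """Split one document into person, address and phoneNumbers rows."""
    person = {k: v for k, v in doc.items() if k not in ("address", "phoneNumbers")}
    person["person_id"] = person_id
    # Child rows carry the parent's key so they can be joined back later.
    address = dict(doc["address"], person_id=person_id)
    phone_numbers = [dict(p, person_id=person_id) for p in doc["phoneNumbers"]]
    return person, address, phone_numbers


person, address, phone_numbers = normalise(1, document)
```

On a large cluster the document form keeps everything for one object on one machine, whereas the normalised rows may land on different machines and have to be joined across them.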
MongoDB Example 
Use ESTER for MongoVUE 
What do documents look like?
Graph 
SQL (inherently very poor performance): 
◦ Nested Sets 
◦ Recursive CTE 
Represents “connected” data 
All about understanding and exploring relationships 
Examples: 
Neo4j, Virtuoso, AllegroGraph. 
[Diagram: nodes Tony, Dave, Fred and Sid, joined by relationships]
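A sketch of the recursive CTE approach, run through Python's built-in sqlite3 module. The `Knows` table and the edges between Tony, Dave, Fred and Sid are assumptions made up for the example (the slide does not say who is connected to whom), and the data is kept acyclic so the recursion terminates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Knows (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO Knows VALUES (?, ?)",
    [("Tony", "Dave"), ("Dave", "Fred"), ("Fred", "Sid")],  # hypothetical edges
)

# Walk outwards from Tony one relationship at a time.
reachable = conn.execute("""
    WITH RECURSIVE reach (name, depth) AS (
        SELECT 'Tony', 0
        UNION
        SELECT k.dst, r.depth + 1
        FROM Knows AS k
        JOIN reach AS r ON k.src = r.name
    )
    SELECT name, depth FROM reach ORDER BY depth
""").fetchall()
```

Each extra hop is another self-join over the edge table, which is the cost a graph database aims to avoid by storing the relationships directly.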
Column 
Values stored as a key-value pair: 
◦ Column Name (unique) 
◦ Value 
◦ Timestamp 
Important bit: a column may not appear in each row! 
Column Family is: container for columns and rows (like but not a relational table) 
Relational Table: Fixed Columns 
Column Family: determined by application – flexible 
Examples: Cassandra, Druid, HBase
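A toy sketch of a column family held in memory: each row keeps only the columns it actually has, and each column is a name, a value and a timestamp. The `ColumnFamily` class and the rule that the newest timestamp wins are assumptions for illustration, not the API of any of the products listed.

```python
import time


class ColumnFamily:
    """Container of rows; every row holds its own, possibly different, columns."""

    def __init__(self):
        self.rows = {}  # row key -> {column name -> (value, timestamp)}

    def put(self, row_key, column, value, timestamp=None):
        ts = time.time() if timestamp is None else timestamp
        row = self.rows.setdefault(row_key, {})
        current = row.get(column)
        # Assumed conflict rule: keep whichever write has the newest timestamp.
        if current is None or ts >= current[1]:
            row[column] = (value, ts)

    def get(self, row_key, column, default=None):
        cell = self.rows.get(row_key, {}).get(column)
        return default if cell is None else cell[0]


people = ColumnFamily()
people.put("tony", "Email", "tonyrogerson@torver.net", timestamp=1)
people.put("tony", "ChestInches", 46, timestamp=2)
people.put("tony", "ChestInches", 45, timestamp=1)  # older write, ignored
people.put("dave", "PostalTown", "Harpenden", timestamp=1)  # hypothetical row
```

The "dave" row has no Email column at all, which is different from a relational row holding a NULL in a fixed column.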
Column storage 
Examples: 
Cassandra, Druid, HBase 
http://www.datastax.com/docs/1.1/ddl/column_family 
Stored as…
SQL Server Columnstore 
Table sliced into rowgroups (a rowgroup is a batch of rows) 
Each rowgroup is compressed in a column-wise manner 
A column segment is one column of data from within the rowgroup 
There is one column segment per table column in each rowgroup, which is then compressed onto storage. 
So: a table has rows (sliced into rowgroups), rowgroups have columns (each column having a column segment)
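The rowgroup and column segment layout can be sketched in a few lines. The tiny rowgroup size and the run-length encoding are assumptions chosen to keep the example readable; they stand in for, and are not, SQL Server's actual sizes and compression.

```python
from itertools import groupby


def to_rowgroups(rows, columns, rowgroup_size):
    """Slice rows into rowgroups, then pivot each rowgroup into column segments."""
    rowgroups = []
    for start in range(0, len(rows), rowgroup_size):
        batch = rows[start:start + rowgroup_size]
        segments = {col: [row[i] for row in batch] for i, col in enumerate(columns)}
        rowgroups.append(segments)
    return rowgroups


def run_length_encode(segment):
    """Stand-in compression: collapse runs of equal values into (value, count)."""
    return [(value, len(list(run))) for value, run in groupby(segment)]


rows = [("Tony Rogerson", 36), ("Tony Rogerson", 36), ("Tony Rogerson", 38)]
rowgroups = to_rowgroups(rows, ("FullName", "WaistInches"), rowgroup_size=2)
compressed = run_length_encode(rowgroups[0]["WaistInches"])
```

Values from the same column sit next to each other, so repeated values compress well and a query touching one column reads only that column's segments.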
Demo: SQL Sparse columns
Key Concepts 
SHARDING, PARTITIONING, HASHING
Hashing 
Distributed Database Cluster has fixed number of data nodes 
Your data is spread across the database cluster 
◦ 10 node cluster; each data item may reside on 3 nodes 
◦ Which 3 nodes? 
Data key is Hashed to a number – hashing algorithm is deterministic 
data-node = f( data-key ) 
◦ print ( checksum( 'All hale to the ale' ) * 1.) % 10 
◦ print ( checksum( 'And a glass of wine for the ladies' ) * 1.) % 10
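The same idea as a Python sketch. MD5 from hashlib is swapped in for the T-SQL checksum, and placing the extra copies on the next nodes in sequence is an assumption; real systems use their own placement rules.

```python
import hashlib


def nodes_for_key(data_key, node_count=10, replicas=3):
    """Map a data key to the nodes holding its copies: data-node = f(data-key)."""
    digest = hashlib.md5(data_key.encode("utf-8")).digest()
    first = int.from_bytes(digest[:8], "big") % node_count
    # Assumed placement: the remaining copies go on the following nodes.
    return [(first + i) % node_count for i in range(replicas)]


ale_nodes = nodes_for_key("All hale to the ale")
wine_nodes = nodes_for_key("And a glass of wine for the ladies")
```

Because the hash is deterministic, any client can work out where a key lives without asking a central directory. With a plain modulo, changing the number of nodes remaps most keys.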
Partitioning 
Chop big table up into “horizontal partitions” 
Partition key required 
Each partition is self-contained, binding rows by the partitioning key 
Access all data through logical view over all partitions 
Table by table basis
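A sketch of horizontal partitioning by range. The boundary dates and the use of RecordedDate as the partition key are assumptions for the example; a boundary value is placed in the partition to its right.

```python
from bisect import bisect_right

# Hypothetical boundaries: three partitions split at these dates.
BOUNDARIES = ["2014-04-01", "2014-08-01"]
partitions = [[] for _ in range(len(BOUNDARIES) + 1)]


def partition_for(recorded_date):
    """Return the index of the partition that owns this partition key value."""
    return bisect_right(BOUNDARIES, recorded_date)


def insert(row):
    partitions[partition_for(row["RecordedDate"])].append(row)


def all_rows():
    """Logical view over all partitions, as if the table were one piece."""
    return [row for part in partitions for row in part]


for waist, recorded in [(36, "2014-01-01"), (36, "2014-05-01"), (38, "2014-10-15")]:
    insert({"WaistInches": waist, "RecordedDate": recorded})
```

A query that filters on the partition key only needs to touch the partition that owns that range.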
Shared Nothing 
Partitioning+ 
Each Shard is self-contained and has all the procs, meta-data and of course your partition of data 
Shard Key common to multiple tables, for example CustomerID, Email Address. 
Greater autonomy across the distributed database 
Seeing the entire database as a logical unit is more difficult – joining is a nightmare 
[Diagram: Node 1, Node 2, Node 3]
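A sketch of routing by a shard key that is common to several tables. The `Customers` and `Orders` tables, the SHA-1 based routing and the helper names are all assumptions made for the example.

```python
import hashlib


class Shard:
    """One self-contained node: its own copy of every table, holding only its rows."""

    def __init__(self, name):
        self.name = name
        self.tables = {"Customers": [], "Orders": []}


shards = [Shard("Node 1"), Shard("Node 2"), Shard("Node 3")]


def shard_for(customer_id):
    """Route by the shard key so related rows from every table land together."""
    digest = hashlib.sha1(str(customer_id).encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:4], "big") % len(shards)]


def insert(table, row):
    shard_for(row["CustomerID"]).tables[table].append(row)


def orders_for(customer_id):
    # Single-shard query: one node holds the customer and all of their orders.
    home = shard_for(customer_id)
    return [o for o in home.tables["Orders"] if o["CustomerID"] == customer_id]


def all_orders():
    # Cross-shard query: every node has to be visited and the results combined.
    return [o for shard in shards for o in shard.tables["Orders"]]


insert("Customers", {"CustomerID": 1, "Email": "tonyrogerson@torver.net"})
insert("Orders", {"CustomerID": 1, "OrderID": 100})
insert("Orders", {"CustomerID": 2, "OrderID": 101})
```

Work for one customer stays on one node; anything that spans customers, such as a join over all orders, has to fan out across the cluster.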
Sharding: each node holds a subset of data 
Replication: each node holds a full copy of data, kept in sync 
[Diagram: Node 1, Node 2, Node 3]
ACID (Atomicity, Consistency, Isolation, Durability) 
BASE (Basically Available, Soft-state, Eventually Consistent) 
ACID is a Transactional model 
Not specific to the relational database 
◦ e.g. Hive (interface to Hadoop) provides ACID facilities 
Durability: write-ahead logging is expensive (latency from serialisation of writes) 
Distributed transactions – Two Phase Commit (2PC) 
◦ Poor scalability because of latency 
◦ ACID across distributed nodes is a bad design choice 
◦ Partition/shard the database and keep ACID in-node only 
[Diagram: 2PC transaction. A coordinator sends an INSERT to two subordinates; all or nothing]
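A toy sketch of two-phase commit with one coordinator and two subordinates. It runs in a single process with no network, logging or crash recovery, so it only shows the all-or-nothing voting; the class and function names are made up for the example.

```python
class Subordinate:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.pending = None
        self.committed = []

    def prepare(self, row):
        """Phase 1: stage the row and vote yes or no."""
        if self.can_commit:
            self.pending = row
        return self.can_commit

    def commit(self):
        """Phase 2 (all voted yes): make the staged row permanent."""
        self.committed.append(self.pending)
        self.pending = None

    def rollback(self):
        """Phase 2 (someone voted no): discard the staged row."""
        self.pending = None


def two_phase_insert(subordinates, row):
    """Coordinator: commit everywhere, or nowhere."""
    votes = [s.prepare(row) for s in subordinates]
    if all(votes):
        for s in subordinates:
            s.commit()
        return True
    for s in subordinates:
        s.rollback()
    return False


healthy = [Subordinate("A"), Subordinate("B")]
ok = two_phase_insert(healthy, {"value": "A"})

mixed = [Subordinate("A"), Subordinate("B", can_commit=False)]
failed = two_phase_insert(mixed, {"value": "A"})
```

The coordinator cannot finish until every subordinate has answered both rounds, which is where the latency cost of distributed ACID comes from.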
ACID (Atomicity, Consistency, Isolation, Durability) 
BASE (Basically Available, Soft-state, Eventually Consistent) 
BASE is a transactional model (of sorts) 
Specific to Distributed database model 
Basically Available – all or some of the system is available 
Node 1 Node 2 Node 3
ACID (Atomicity, Consistency, Isolation, Durability) 
BASE (Basically Available, Soft-state, Eventually Consistent) 
Soft-state 
Eventually Consistent 
System may change over time [as replicas become up-to-date (consistent)] 
[Diagram: Insert value ‘A’ into a cluster of Node 1, Node 2, Node 3]
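A toy sketch of eventual consistency across three nodes. The write is acknowledged after reaching one node and is copied to the others later; the `Cluster` class and its replication queue are assumptions for illustration only.

```python
from collections import deque


class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None


class Cluster:
    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]
        self.queue = deque()  # replication work still outstanding

    def write(self, value):
        """Acknowledge after one node; the other nodes catch up later."""
        self.replicas[0].value = value
        for replica in self.replicas[1:]:
            self.queue.append((replica, value))

    def replicate_one(self):
        """Deliver one outstanding update, as background replication would."""
        if self.queue:
            replica, value = self.queue.popleft()
            replica.value = value

    def read_all(self):
        return [r.value for r in self.replicas]


cluster = Cluster(["Node 1", "Node 2", "Node 3"])
cluster.write("A")
before = cluster.read_all()  # soft state: replicas disagree for a while
cluster.replicate_one()
cluster.replicate_one()
after = cluster.read_all()   # eventually consistent
```

Between the write and the last delivery, a read may return the old or the new value depending on which node answers.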
SQL 
AH – THE COMMON DENOMINATOR OF AN ACCESS LAYER
What is SQL? 
SQL is NOT a method of storing data! 
SQL is a language, it’s just syntax 
Relational Theory = thinking in sets 
SQL is a language that follows (but does not obey) relational theory 
With SQL we associate ACID (but durability is now optional in SQL Server 2014, via delayed durability)
NoSQL 
NOT ONLY SQL 
NO SQL
Origins NoSQL? 
First NoSQL database was an open source relational database 
NoSQL (really NoREL) started in the mid-2000s 
Realisation that ACID doesn’t scale easily 
Should really be NoACID (mutually exclusive for some ’70s developers) 
Hadoop came out of Yahoo 
Cassandra, Riak and others are derivatives of Amazon Dynamo 
NoSQL basically means: ACID doesn’t scale, SQL is too restrictive, and I’m a developer and I like complexity.
But why the need for “NoSQL”? 
Feb 2001 
◦ BigData - http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf 
Basically Scale-Up (SAN) costs too much and doesn’t scale well 
Sick of vendor lock in and associated costs – open source software running across cheap commodity machines (Redundant Array of Inexpensive Servers) 
Availability, Resilience – by design – by software and not expensive hardware 
Existing Relational Databases (with SQL as their only language) expensive and too slow (ACID) 
BASE v ACID 
SQL implements a rigid and inflexible framework (or does it)
Eventual Consistency in SQL Server 
Asynchronous Availability Groups/Database Mirroring 
Replication 
Eventual / Causal Consistency 
◦ Eventual no good for order specific [and important] transactions 
◦ Like Merge replication 
◦ Causal: deliver messages in correct order [e.g. service broker] 
◦ Like Transactional Replication
MongoDB – Replica Set 
[Diagram: ESTER (10.0.0.1), ROSIE (10.0.0.2) and HAZEL (10.0.0.3); one primary replicating to the secondaries, with a heart-beat between members] 
$ mongo --host 10.0.0.1 --port 27017 
• 1 Master – Multiple Secondaries 
• 1 R/W – Multiple Readers 
• Setup: 
  • Use replication.replSetName in mongo config file 
  • On Primary: 
    • rs.initiate() 
    • rs.add( "---secondary address" ) 
    • rs.add( "---secondary address" ) 
    • rs.status()
MongoDB - Sharding 
Shards: data chopped up into multiple ranges; the range a document falls in determines which shard it sits on 
Shards are standalone or Replica-Set MongoDB instances (data storage) 
Config servers: store configuration information about the shards.
MongoDB – Sharding (with Replica-Set) 
[Diagram labels, in slide order] 
mongod: port 27017, replSet: rsDemoRS2 
DAISY 10.0.0.4 
CONISTON 10.0.0.11 
POPPY 10.0.0.5 
KARLI 10.0.0.6 
mongod: port 27017, replSet: rsDemo 
mongos: port 27020 (on ESTER, HAZEL, ROSIE) 
ROSIE 10.0.0.2 
config servers: port 27019 (shard information, points to replica sets) 
ESTER 10.0.0.1 
primary 
HAZEL 10.0.0.3 
secondaries 
replication, heart-beat 
THIRLMERE 10.0.0.13 
primary 
ULLSWATER 10.0.0.12 
secondaries 
replication, heart-beat 
DAISY 10.0.0.4 
Query Balancer 
Query
NewSQL 
SCALABLE ACID!
Relational Databases catch up 
Maintains ACID 
Same scalability and performance of NoSQL systems 
Some Vendors: Clustrix, MemSQL, NuoDB, VoltDB, Postgres-XL 
Auto-sharding, auto-partitioning 
Queries need to take place on same box to save latency 
http://www.postgres-xl.org/overview/
Summary / Q & A / Discuss

Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Neo4j
 
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Pitangent Analytics & Technology Solutions Pvt. Ltd
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
ScyllaDB
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
saastr
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
Alex Pruden
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
Pablo Gómez Abajo
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 

Recently uploaded (20)

Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsConnector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Artificial Intelligence and Electronic Warfare
Artificial Intelligence and Electronic WarfareArtificial Intelligence and Electronic Warfare
Artificial Intelligence and Electronic Warfare
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid ResearchHarnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
 
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 

NoSQL, SQL, NewSQL - methods of structuring data.

  • 1. Multiple ways of storing -> Data <- SQL -> NOSQL -> NEWSQL Tony Rogerson @tonyrogerson tonyrogerson@torver.net dataidol.com/tonyrogerson
  • 2. Agenda Data structures ◦ Relational, Key/Value pair, Document, Graph, Column/Column Family Store ◦ Key Concepts ◦ Hashing, Partitioning, Sharding, ACID, BASE Technology Areas ◦ SQL, NoSQL, NewSQL
  • 3. Who-am-I Freelance SQL Server professional and Data Specialist Fellow BCS, MSc in BI, PGCert in Data Science Started out in 1986 – VSAM, System W, Application System, DB2, Oracle, SQL Server since 4.21a Awarded SQL Server MVP yearly since 97 Founded UK SQL Server User Group back in ’99, founder member of DDD, SQL Bits, SQL Relay, SQL Santa Interested in commodity based distributed processing of Data.
  • 4. Data Structures WAYS OF STRUCTURING DATA
  • 5. What is data? Tony Rogerson tonyrogerson@torver.net Harpenden 36 on 2014-01-01, 36 on 2014-05-01, 38 on 2014-10-15 46 44
  • 6. Data needs context and structure Tony Rogerson FullName tonyrogerson@torver.net Email Harpenden PostalTown 36 on 2014-01-01, 36 on 2014-05-01, {WaistInches, RecordedOn} 38 on 2014-10-15 46 ChestInches 44 Ages Schema gives Context
  • 7. Relational [Tables] FullName (PK) Email PostalTown WaistInches ChestInches AgeYears Tony Rogerson tonyrogerson@torver.net Harpenden 46 44 FullName (FK) WaistInches RecordedDate Tony Rogerson 36 2014-01-01 Tony Rogerson 36 2014-05-01 Tony Rogerson 38 2014-10-01 People WaistInches Tony Rogerson tonyrogerson@torver.net Harpenden 36 on 2014-01-01, 36 on 2014-05-01, 38 on 2014-10-15 46 44
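The two tables on this slide can be sketched with Python's built-in sqlite3 module. This is a minimal illustration rather than the author's schema: the column types are assumptions, the talk itself targets SQL Server, and the rows are the ones shown in the slide's tables.

```python
import sqlite3

# In-memory database; table and column names follow the slide.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE People (
        FullName    TEXT PRIMARY KEY,
        Email       TEXT,
        PostalTown  TEXT,
        ChestInches INTEGER,
        AgeYears    INTEGER
    )""")
conn.execute("""
    CREATE TABLE WaistInches (
        FullName     TEXT REFERENCES People(FullName),
        WaistInches  INTEGER,
        RecordedDate TEXT
    )""")

conn.execute(
    "INSERT INTO People VALUES (?, ?, ?, ?, ?)",
    ("Tony Rogerson", "tonyrogerson@torver.net", "Harpenden", 46, 44),
)
conn.executemany(
    "INSERT INTO WaistInches VALUES (?, ?, ?)",
    [
        ("Tony Rogerson", 36, "2014-01-01"),
        ("Tony Rogerson", 36, "2014-05-01"),
        ("Tony Rogerson", 38, "2014-10-01"),
    ],
)

# Join the repeating group back to its parent row.
rows = conn.execute("""
    SELECT p.FullName, w.WaistInches, w.RecordedDate
    FROM People AS p
    JOIN WaistInches AS w ON w.FullName = p.FullName
    ORDER BY w.RecordedDate
""").fetchall()
```

The repeating waist measurements live in their own table and are tied back to the person through the FullName foreign key, which is the point the slide makes.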
  • 8. Key/Value pair (EAV) Entity Attribute Value Person FullName Tony Rogerson Person Email tonyrogerson@torver.net Person PostalTown Harpenden Person ChestInches 46 Person Age 44 WaistInches FullName Tony Rogerson WaistInches WaistInches 36 WaistInches RecordedDate 2014-01-01 WaistInches FullName Tony Rogerson WaistInches WaistInches 36 WaistInches RecordedDate 2014-05-01 WaistInches FullName Tony Rogerson WaistInches WaistInches 38 WaistInches RecordedDate 2014-10-01 Examples: Riak, Dynamo, Redis, FoundationDB etc. Tony Rogerson tonyrogerson@torver.net Harpenden 36 on 2014-01-01, 36 on 2014-05-01, 38 on 2014-10-15 46 44
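A rough sketch of the entity/attribute/value idea, using a plain Python dict as the store. The composite key layout (`entity:id:attribute`) and the `tony` identifier are assumptions made for illustration; the slide does not show how keys are composed, and real stores such as Riak or Redis each have their own conventions.

```python
# Flat key/value store modelled as a dict (hypothetical key layout).
store = {}

def put(entity, entity_id, attribute, value):
    """Store one attribute of one entity under a composite key."""
    store[f"{entity}:{entity_id}:{attribute}"] = value

def get_entity(entity, entity_id):
    """Reassemble an entity by scanning for keys that share its prefix."""
    prefix = f"{entity}:{entity_id}:"
    return {
        key[len(prefix):]: value
        for key, value in store.items()
        if key.startswith(prefix)
    }

put("Person", "tony", "FullName", "Tony Rogerson")
put("Person", "tony", "Email", "tonyrogerson@torver.net")
put("Person", "tony", "PostalTown", "Harpenden")
put("Person", "tony", "ChestInches", 46)
put("Person", "tony", "Age", 44)

person = get_entity("Person", "tony")
```

Every value is reached through its key alone, so the "schema" exists only in how the application builds and interprets keys.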
  • 9. Document JSON Schema { "FullName" : "string", "Email" : "string", "PostalTown" : "string", "WaistInches" : { "WaistInches" : "number", "RecordedDate" : "string" }, "ChestInches" : "number", "Age" : "number" } JSON Document { "FullName" : "Tony Rogerson", "Email" : "tonyrogerson@torver.net", "PostalTown" : "Harpenden", "WaistInches" : [ { "WaistInches" : 36, "RecordedDate" : "2014-01-01" }, { "WaistInches" : 36, "RecordedDate" : "2014-05-01" } ], "ChestInches" : 46, "Age" : 44 } Examples: MongoDB, Couchbase, CouchDB etc. JSON vs XML discussion: http://stackoverflow.com/questions/4862310/json-and-xml-comparison Tony Rogerson tonyrogerson@torver.net Harpenden 36 on 2014-01-01, 36 on 2014-05-01, 38 on 2014-10-15 46 44
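The document on this slide can be built and round-tripped with Python's standard json module. This is only a sketch of the shape of the data: no document database is involved, and the variable names are illustrative.

```python
import json

# The whole person, including the repeating waist readings, is one document.
document = {
    "FullName": "Tony Rogerson",
    "Email": "tonyrogerson@torver.net",
    "PostalTown": "Harpenden",
    "WaistInches": [
        {"WaistInches": 36, "RecordedDate": "2014-01-01"},
        {"WaistInches": 36, "RecordedDate": "2014-05-01"},
    ],
    "ChestInches": 46,
    "Age": 44,
}

serialised = json.dumps(document)   # what would be sent to the store
restored = json.loads(serialised)   # what a reader would get back

# Nested data is reached by walking the document, with no join required.
latest_reading = max(restored["WaistInches"], key=lambda r: r["RecordedDate"])
```

Compared with the relational layout, the measurements travel with the person, so a single read returns everything.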
  • 10. Schema Design E.g. 100 machine cluster Document Database Normal Form (Relational) { "firstName": "John", "lastName": "Smith", "isAlive": true, "age": 25, "height_cm": 167.6, "address": { "streetAddress": "21 2nd Street", "city": "New York", "state": "NY", "postalCode": "10021-3100" }, "phoneNumbers": [ { "type": "home", "number": "212 555-1234" }, { "type": "office", "number": "646 555-4567" } ] } person address phoneNumbers Object data stored together (collection) Object data stored separately (tables)
  • 11. MongoDB Example Use ESTER for MongoVUE What do documents look like?
  • 12. Graph SQL (inherently very poor performance): ◦ Nested Sets ◦ Recursive CTE Represents “connected” data All about understanding and exploring relationships Examples: Neo4j, Virtuoso, Allegro. Tony Dave Fred Sid Node Relationship
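Exploring relationships in connected data can be sketched with an adjacency list and a breadth-first traversal. The node names come from the slide, but the edges between them are assumptions, since the transcript does not record how the diagram connects them.

```python
from collections import deque

# Hypothetical relationships between the slide's four nodes.
graph = {
    "Tony": ["Dave", "Fred"],
    "Dave": ["Sid"],
    "Fred": [],
    "Sid": [],
}

def hops_from(graph, start):
    """Breadth-first search returning the hop count to every reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances

distances = hops_from(graph, "Tony")
```

A graph database performs this kind of traversal natively, whereas in SQL the same question typically needs a recursive CTE or a nested-set encoding.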
  • 13. Examples: Cassandra, Druid, HBase Column Values stored as a key-value pair Column Name (unique) Value Timestamp Important bit: It may not appear in each row! Column Family is: container for columns and rows (like but not a relational table) Relational Table: Fixed Columns Column Family: determined by application – flexible
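A column family can be pictured as a map of row keys to a map of columns, where each column carries a name, a value and a timestamp. The sketch below assumes a "latest timestamp wins" rule for conflicting writes; the function and variable names are illustrative and do not correspond to any particular product's API.

```python
import time

# row key -> {column name -> (value, timestamp)}
column_family = {}

def put_column(row_key, name, value, timestamp=None):
    """Write one column; a newer timestamp replaces an older one (assumed rule)."""
    ts = time.time() if timestamp is None else timestamp
    row = column_family.setdefault(row_key, {})
    current = row.get(name)
    if current is None or ts >= current[1]:
        row[name] = (value, ts)

# Rows need not share the same columns.
put_column("tony", "Email", "tonyrogerson@torver.net", timestamp=1)
put_column("tony", "PostalTown", "Harpenden", timestamp=1)
put_column("dave", "Email", "dave@example.com", timestamp=1)

# A later write replaces the value; a stale write is ignored.
put_column("tony", "PostalTown", "St Albans", timestamp=5)
put_column("tony", "PostalTown", "Luton", timestamp=3)
```

Here the row for `dave` has no PostalTown column at all, which is the "may not appear in each row" point from the slide.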
  • 14. Column storage Examples: Cassandra, Druid, HBase http://www.datastax.com/docs/1.1/ddl/column_family Stored as…
  • 15. SQL Server Columnstore Table sliced into rowgroups (a group of rows – a batch) Each rowgroup compressed in column-wise manner Column segment is a column of data from within the rowgroup Column segment per column in table which is then compressed onto storage. SO: a table has rows (sliced into rowgroups), rowgroups have columns (each column having a column segment)
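The rowgroup and column-segment layout can be sketched in a few lines. Run-length encoding is used here only as a stand-in for compression; it is an assumption for illustration and not a description of how SQL Server actually compresses segments. The sample rows and the rowgroup size are likewise hypothetical.

```python
from itertools import groupby

def to_rowgroups(rows, rowgroup_size):
    """Slice the table into batches of rows."""
    return [rows[i:i + rowgroup_size] for i in range(0, len(rows), rowgroup_size)]

def to_column_segments(rowgroup, columns):
    """Pivot one rowgroup so each column's values are stored together."""
    return {column: [row[column] for row in rowgroup] for column in columns}

def run_length_encode(segment):
    """Toy compression: collapse runs of equal values into (value, count)."""
    return [(value, len(list(run))) for value, run in groupby(segment)]

rows = [
    {"Town": "Harpenden", "Waist": 36},
    {"Town": "Harpenden", "Waist": 36},
    {"Town": "Harpenden", "Waist": 38},
    {"Town": "Luton", "Waist": 34},
    {"Town": "Luton", "Waist": 34},
]

rowgroups = to_rowgroups(rows, rowgroup_size=3)
segments = [to_column_segments(group, ["Town", "Waist"]) for group in rowgroups]
compressed_town = run_length_encode(segments[0]["Town"])
```

Storing a column's values side by side is what makes repeated values cheap to compress and lets a query read only the columns it needs.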
  • 16. Demo: SQL Sparse columns
  • 17. Key Concepts SHARDING, PARTITIONING, HASHING
  • 18. Hashing Distributed Database Cluster has fixed number of data nodes Your data is spread across the database cluster ◦ 10 node cluster; each data item may reside on 3 nodes ◦ Which 3 nodes? Data key is Hashed to a number – hashing algorithm is deterministic data-node = f( data-key ) ◦ print ( checksum( 'All hale to the ale' ) * 1.) % 10 ◦ print ( checksum( 'And a glass of wine for the ladies' ) * 1.) % 10
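The `data-node = f( data-key )` idea can be sketched as follows. MD5 from hashlib is swapped in for the T-SQL `checksum` used on the slide, and placing the extra copies on the next nodes in sequence is an assumption; real systems choose replica placement in their own ways.

```python
import hashlib

NODE_COUNT = 10   # fixed number of data nodes, as on the slide
REPLICAS = 3      # each data item resides on 3 nodes

def nodes_for_key(data_key, node_count=NODE_COUNT, replicas=REPLICAS):
    """Deterministically map a key to the nodes that hold its copies."""
    digest = hashlib.md5(data_key.encode("utf-8")).digest()
    first = int.from_bytes(digest[:8], "big") % node_count
    # Assumed placement rule: the following nodes hold the other copies.
    return [(first + offset) % node_count for offset in range(replicas)]

ale_nodes = nodes_for_key("All hale to the ale")
wine_nodes = nodes_for_key("And a glass of wine for the ladies")
```

Because the function is deterministic, any client can work out where a key lives without asking a central directory.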
  • 19. Partitioning Chop big table up into “horizontal partitions” Partition key required Each partition is self-contained binding rows by the partitioning key Access all data through logical view over all partitions Table by table basis
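Horizontal partitioning by a partition key can be sketched with range boundaries and a logical view over all partitions. The boundary dates and function names below are hypothetical and chosen only to spread the slide deck's sample readings across partitions.

```python
from bisect import bisect_right

# Hypothetical range boundaries on the partition key (RecordedDate).
BOUNDARIES = ["2014-05-01", "2014-09-01"]
partitions = [[] for _ in range(len(BOUNDARIES) + 1)]

def partition_for(key):
    """Return the index of the partition whose range contains the key."""
    return bisect_right(BOUNDARIES, key)

def insert(row):
    """Route a row to its partition using the partition key."""
    partitions[partition_for(row["RecordedDate"])].append(row)

def select_all():
    """Logical view: read across every partition as if it were one table."""
    return [row for partition in partitions for row in partition]

insert({"FullName": "Tony Rogerson", "WaistInches": 36, "RecordedDate": "2014-01-01"})
insert({"FullName": "Tony Rogerson", "WaistInches": 36, "RecordedDate": "2014-05-01"})
insert({"FullName": "Tony Rogerson", "WaistInches": 38, "RecordedDate": "2014-10-01"})
```

A query that filters on the partition key only needs to touch one partition, while the view keeps the table looking whole to everything else.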
  • 20. Shared Nothing Partitioning+ Each Shard is self-contained and has all the procs, meta-data and of course your partition of data Shard Key common to multiple tables, for example CustomerID, Email Address. Greater autonomy across the distributed database Seeing the entire database as a logical unit is more difficult – joining is a nightmare Node 1 Node 2 Node 3
  • 21. Sharding Sync Node 1 Node 2 Node 3 Full copy of data Subset of data Replication
  • 22. ACID (Atomicity, Consistency, Isolation, Durability) BASE (Basically Available, Soft-state, Eventually Consistent) ACID is a Transactional model Not specific to the relational database ◦ e.g. HIVE (interface to HADOOP) provides ACID facilities Durability: write ahead Logging expensive (latency from serialisation of writes) Distributed transactions – Two Phase Commit (2PC) ◦ Poor scalability because of Latency ◦ ACID across distributed nodes bad design choice ◦ Partition/Shard database and ACID in-node only Coordinator Subordinate Subordinate INSERT 2PC Transaction All or nothing
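The coordinator and subordinate roles in two-phase commit can be sketched in memory. This is a toy model under strong assumptions: there is no network, no logging and no crash recovery, and the class and method names are illustrative only.

```python
class Subordinate:
    """A participant that votes in phase 1 and applies the outcome in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.pending = None
        self.committed = []

    def prepare(self, value):
        # Phase 1: hold the change and vote yes or no.
        if self.can_commit:
            self.pending = value
        return self.can_commit

    def commit(self):
        self.committed.append(self.pending)
        self.pending = None

    def rollback(self):
        self.pending = None


def two_phase_commit(subordinates, value):
    """Coordinator: commit everywhere only if every subordinate votes yes."""
    votes = [s.prepare(value) for s in subordinates]
    if all(votes):
        for s in subordinates:
            s.commit()
        return True
    for s in subordinates:
        s.rollback()
    return False


healthy = [Subordinate("node1"), Subordinate("node2")]
ok = two_phase_commit(healthy, "INSERT A")

mixed = [Subordinate("node1"), Subordinate("node2", can_commit=False)]
failed = two_phase_commit(mixed, "INSERT B")
```

Each transaction costs at least two round trips to every subordinate, which is where the latency and scalability concerns on the slide come from.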
  • 23. ACID (Atomicity, Consistency, Isolation, Durability) BASE (Basically Available, Soft-state, Eventually Consistent) BASE is a Transactional model(ish) Specific to Distributed database model Basically Available – all or some of the system is available Node 1 Node 2 Node 3
  • 24. ACID (Atomicity, Consistency, Isolation, Durability) BASE (Basically Available, Soft-state, Eventually Consistent) Soft-state Eventually Consistent System may change over time [as replicas become up-to-date (consistent)] Node 1 Node 2 Node 3 Insert value ‘A’
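Soft state and eventual consistency can be illustrated with a small model in which a write lands on one node and reaches the others later. The single-writer design, the explicit replication queue and all names here are assumptions made for the sketch, not a description of any specific product.

```python
class Cluster:
    """Three in-memory replicas with asynchronous, queued replication."""

    def __init__(self, node_count=3):
        self.replicas = [{} for _ in range(node_count)]
        self.pending = []  # (target node, key, value) not yet applied

    def write(self, key, value):
        # The write is acknowledged once the first node has it.
        self.replicas[0][key] = value
        for node in range(1, len(self.replicas)):
            self.pending.append((node, key, value))

    def read(self, node, key):
        return self.replicas[node].get(key)

    def replicate_one(self):
        # Apply the oldest outstanding change, if any.
        if self.pending:
            node, key, value = self.pending.pop(0)
            self.replicas[node][key] = value

    def is_consistent(self):
        return all(replica == self.replicas[0] for replica in self.replicas)


cluster = Cluster()
cluster.write("value", "A")
stale_read = cluster.read(2, "value")          # replica 3 has not caught up yet
consistent_before = cluster.is_consistent()

while cluster.pending:
    cluster.replicate_one()
fresh_read = cluster.read(2, "value")
consistent_after = cluster.is_consistent()
```

Between the write and the end of replication, different nodes give different answers to the same read, which is the "system may change over time" behaviour the slide describes.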
  • 25. SQL AH – THE COMMON DENOMINATOR OF AN ACCESS LAYER
  • 26. What is SQL? SQL is NOT a method of storing data! SQL is a language, it’s just syntax Relational Theory = thinking in sets SQL is a language that follows (but does not obey) relational theory With SQL we associate ACID (but durability is now optional in SQL Server 2014)
  • 27. NoSQL NOT ONLY SQL NO SQL
  • 28. Origins NoSQL? First NoSQL database was an open source relational database NoSQL (really NoREL) started in mid-2000s Realisation that ACID doesn’t scale easily Should really be NoACID (Mutually exclusive for some ’70s developers) Hadoop – came out of Yahoo Cassandra, Riak and others are derivatives of Amazon Dynamo NoSQL basically means: ACID doesn’t scale, SQL is too restrictive, and I’m a developer and I like complexity.
  • 29. But why the need for “NoSQL”? Feb 2001 ◦ BigData - http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf Basically Scale-Up (SAN) costs too much and doesn’t scale well Sick of vendor lock-in and associated costs – open source software running across cheap commodity machines (Redundant Array of Inexpensive Servers) Availability, Resilience – by design – by software and not expensive hardware Existing Relational Databases (with SQL as their only language) expensive and too slow (ACID) BASE v ACID SQL implements a rigid and inflexible framework (or does it)
  • 30. Eventual Consistency in SQL Server Asynchronous Availability Groups/Database Mirroring Replication Eventual / Causal Consistency ◦ Eventual no good for order specific [and important] transactions ◦ Like Merge replication ◦ Causal: deliver messages in correct order [e.g. service broker] ◦ Like Transactional Replication
  • 31. MongoDB – Replica Set primary $ mongo --host 10.0.0.1 --port 27017 ROSIE 10.0.0.2 ESTER 10.0.0.1 HAZEL 10.0.0.3 secondaries replication replication Heart-beat • 1 Master – Multiple Secondaries • 1 R/W – Multiple Readers • Setup: • Use replication.replSetName in mongo config file • On Primary: • rs.initiate() • rs.add( "---secondary address" ) • rs.add( "---secondary address" ) • rs.status()
  • 32. MongoDB - Sharding Shards of data (data chopped up into multiple ranges, range depends where it sits) Standalone or Replica-Set MongoDB instances (data storage) Stores configuration information about the Shards.
  • 33. MongoDB – Sharding (with Replica-Set) mongod: port 27017, replSet: rsDemoRS2 DAISY 10.0.0.4 CONISTON 10.0.0.11 POPPY 10.0.0.5 KARLI 10.0.0.6 mongod: port 27017, replSet: rsDemo mongos: port 27020 (on ESTER, HAZEL, ROSIE) ROSIE 10.0.0.2 config servers port 27019 (shard information point to replica sets) ESTER 10.0.0.1 primary HAZEL 10.0.0.3 secondaries replication replication Heart-beat THIRLMERE 10.0.0.13 primary ULLSWATER 10.0.0.12 secondaries replication replication Heart-beat DAISY 10.0.0.4 Query Balancer Query
  • 35. Relational Databases catch up Maintains ACID Same scalability and performance of NoSQL systems Some Vendors: Clustrix, MemSQL, NuoDB, VoltDB, Postgres-XL Auto-sharding, auto-partitioning Queries need to take place on same box to save latency http://www.postgres-xl.org/overview/
  • 36. Summary / Q & A / Discuss

Editor's Notes

  1. Today’s environment is a polyglot database; that is to say, it’s made up of a number of different database sources and possibly types. In this session we’ll look at some of the options for storing data – relational, key/value, document etc. I’ll give an overview of what SQL, NoSQL and NewSQL are, to give you some context for today’s world of data storage.