SQL or NoSQL?
Lars Thorup, Zealake
September, 2016
Lars Thorup
● Software developer/architect
● JavaScript, C#
● Test Driven Development
● Coach
● Agile engineering practices
● Founder
● BestBrains
● Zealake
● Triggerz
● @larsthorup
Agenda
● My history with databases
● Databases - what are they good for?
● SQL and NoSQL - what is the difference?
● Redis - a NoSQL database
● Matching use cases to database systems
● Redis - data structures and algorithms
My history with databases
● Databases
● Pre 1980 - many competing database models
● 1980-2010 - SQL dominates
● 2010-now - many competing NoSQL database models
● Myself
● 1990-2016 - SQL for administrative systems, documents, e-commerce, music, collaboration tools, data analytics
● 2015 - Redis and Neo4J for social media
Databases - what are they good for?
● Make data available
● Across the globe
● From multiple computers
● Across long time spans
● Prevent data loss
● Quickly search for, fetch, and update data
● Ensure consistency in data
Example database systems
● SQL
● Relational: SQL Server, PostgreSQL, Oracle
● NoSQL
● Key-value: DynamoDB, Berkeley DB, S3
● Document: MongoDB, RethinkDB
● Data structure: Redis
● Graph: Neo4J
● Columns: Cassandra, HBase
Example database use cases
● Banks: accounts, owners, transactions
● Social media: posts, comments, ratings
● Caching: user sessions, generated pages
● Sales analytics: counts, sums, locations, averages, hierarchies
SQL and NoSQL - what is the difference?
● What kind of data do we store?
● How many machines do we use?
● Will there be type checking?
● Will we have to code the lookup algorithms?
● How do we prevent inconsistent data?
Typical SQL database
● Many small tables with lots of columns
● Single instance on a large server
● Explicit column types, referential constraints
● Advanced and efficient standard query language
● Transactions over complex updates
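Explicit column types, referential constraints, and multi-statement transactions can be tried out with SQLite, which ships with Python. This is only a sketch: the `owner` and `account` tables, their columns, and the helper names are invented for illustration, and SQLite is an embedded file database rather than a large server.

```python
import sqlite3

def make_bank():
    """Create an in-memory database with two related tables (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite checks references only when asked
    db.execute("CREATE TABLE owner (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    db.execute(
        "CREATE TABLE account ("
        " id INTEGER PRIMARY KEY,"
        " owner_id INTEGER NOT NULL REFERENCES owner(id),"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    db.execute("INSERT INTO owner VALUES (1, 'lars')")
    db.execute("INSERT INTO account VALUES (1, 1, 100), (2, 1, 0)")
    db.commit()
    return db

def transfer(db, src, dst, amount):
    """Move money between accounts; both updates commit or neither does."""
    try:
        with db:  # commits on success, rolls back if an exception escapes
            db.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            # An overdraft violates the CHECK constraint and undoes the credit above
            db.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(db):
    """Return all balances ordered by account id."""
    return [row[0] for row in db.execute("SELECT balance FROM account ORDER BY id")]
```

A rejected transfer leaves both balances untouched, which is the kind of safety the application gets without writing any bookkeeping code of its own.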
Typical NoSQL database
● Collections of JSON documents
● Cluster of servers with shards and replicas
● Application may handle evolving document structures
● Specific low-level query language
● Single-update transactions
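With an implicit schema, handling evolving document structures becomes application code. A minimal sketch, assuming a hypothetical session document that gained a `version` field and a nested `profile` in its second shape (neither shape comes from the slides):

```python
import json

def upgrade_session(doc):
    """Return the document in the newest shape, whatever version was stored."""
    doc = dict(doc)  # work on a copy so the caller's document is not changed
    version = doc.get("version", 1)
    if version == 1:
        # Assumed v1 shape: {"name": ..., "level": ...} with no version field
        doc["profile"] = {"name": doc.pop("name")}
        doc["version"] = 2
    return doc

# Two documents written by different releases of the application
stored = [
    '{"name": "lars", "level": 5}',
    '{"version": 2, "profile": {"name": "anna"}, "level": 3}',
]
sessions = [upgrade_session(json.loads(text)) for text in stored]
names = [session["profile"]["name"] for session in sessions]
```

The database accepts both shapes without complaint; the upgrade-on-read function is the migration.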
Categorizing a database system
                      SQL                  NoSQL
Impedance mismatch    tables and columns   documents
Distribution          server               cluster
Schema                explicit             implicit
Query engine          optimizing           manual
Redis - one NoSQL database
                      SQL                  NoSQL
Impedance mismatch    tables and columns   documents
Distribution          server               cluster
Schema                explicit             implicit
Query engine          optimizing           manual
Redis
Redis
● REmote DIctionary Server, started in 2009
● Popular, fast, robust
● In-memory
● Single-threaded
● Many data types
● dictionaries, lists, sets, sorted sets
● Other features
● key expiry
● publish - subscribe
Redis demo
● string values (session count)
● dictionary values (session)
● list values (lucene index queue)
● sorted set values (front page posts)
● expiry (session)
● http://redis.io/topics/data-types-intro
Demo: string values
● Example: Global objects
incr 'session:id'
set 'session:42' '{"name":"lars", "level": 5}'
get 'session:42'
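How the counter and the JSON value fit together can be sketched with a toy in-process stand-in. `StringStore` is a hypothetical class written for this note, not a Redis client; it only imitates the three commands used above.

```python
import json

class StringStore:
    """Toy in-process stand-in for Redis string keys (not a Redis client)."""

    def __init__(self):
        self.data = {}

    def incr(self, key):
        # Like INCR: a missing key counts as 0 and the value is kept as a string
        value = int(self.data.get(key, "0")) + 1
        self.data[key] = str(value)
        return value

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

store = StringStore()
session_id = store.incr("session:id")  # allocate the next session number
store.set(f"session:{session_id}", json.dumps({"name": "lars", "level": 5}))
session = json.loads(store.get(f"session:{session_id}"))
```

The store only sees opaque strings; encoding and decoding the JSON is the application's job.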
Demo: dictionary values
● Example: Session object
hset 'session:42' info '{"name":"lars"}'
hset 'session:42' level "5"
hgetall 'session:42'
hincrby 'session:42' level 1
hget 'session:42' level
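The point of a dictionary value is that one field can change without rewriting the whole object. A toy model of that behaviour, with hypothetical helper functions named after the commands (this imitates the semantics in plain Python and does not talk to Redis):

```python
from collections import defaultdict

# key -> {field: value}; values are kept as strings, as Redis does
hashes = defaultdict(dict)

def hset(key, field, value):
    hashes[key][field] = str(value)

def hget(key, field):
    return hashes[key].get(field)

def hincrby(key, field, amount):
    # A missing field counts as 0, then the new value is stored back as a string
    new_value = int(hashes[key].get(field, "0")) + amount
    hashes[key][field] = str(new_value)
    return new_value

hset("session:42", "info", '{"name":"lars"}')
hset("session:42", "level", "5")
new_level = hincrby("session:42", "level", 1)
```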
Demo: list values
● Example: lucene indexing queue
rpush 'index:lucene' "42"
rpush 'index:lucene' "105"
rpush 'index:lucene' "7"
lrange 'index:lucene' 0 -1
lpop 'index:lucene'
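Pushing on the right and popping from the left gives a first-in, first-out queue. A small sketch with Python's `deque`; the `IndexQueue` class and its method names are made up here to mirror the commands above and are not part of any Redis library.

```python
from collections import deque

class IndexQueue:
    """Toy FIFO queue mirroring RPUSH / LRANGE 0 -1 / LPOP."""

    def __init__(self):
        self.items = deque()

    def rpush(self, value):
        self.items.append(value)      # add at the tail
        return len(self.items)        # RPUSH also reports the new length

    def lrange_all(self):
        return list(self.items)       # same view as LRANGE key 0 -1

    def lpop(self):
        # Take from the head; an empty queue yields None
        return self.items.popleft() if self.items else None

queue = IndexQueue()
for post_id in ["42", "105", "7"]:
    queue.rpush(post_id)
pending = queue.lrange_all()
first = queue.lpop()
```

The indexer that pops `"42"` first processes posts in the order they were queued.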
Demo: sorted set values
● Example: front page posts
zadd 'post:score' 17 "42"
zadd 'post:score' 39 "43"
zadd 'post:score' 22 "44"
zincrby 'post:score' 1 "42"
zrevrange 'post:score' 0 -1
zrevrank 'post:score' "42"
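The ranking behaviour can be imitated with a plain dictionary that is sorted on demand. This is a toy model only: the `PostScores` class is hypothetical, and re-sorting on every read is far less efficient than what a real sorted set does.

```python
class PostScores:
    """Toy model of a sorted set: member -> score, ordered when read."""

    def __init__(self):
        self.scores = {}

    def zadd(self, score, member):
        self.scores[member] = float(score)

    def zincrby(self, amount, member):
        self.scores[member] = self.scores.get(member, 0.0) + amount
        return self.scores[member]

    def zrevrange(self):
        # Highest score first; ties are broken by member name, also descending
        return sorted(self.scores, key=lambda m: (self.scores[m], m), reverse=True)

    def zrevrank(self, member):
        if member not in self.scores:
            return None
        return self.zrevrange().index(member)

posts = PostScores()
posts.zadd(17, "42")
posts.zadd(39, "43")
posts.zadd(22, "44")
posts.zincrby(1, "42")            # one more vote for post 42
front_page = posts.zrevrange()
rank_of_42 = posts.zrevrank("42")
```

Post 42 ends with score 18, so it stays last on the front page, at zero-based rank 2.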
Demo: expiry
● Example: session
set 'session:42' '{"name":"lars", "level": 5}'
expire 'session:42' 20
ttl 'session:42'
get 'session:42'
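Expiry can be modelled as a deadline stored next to the value and checked on every access. A sketch under that assumption; `ExpiringStore` is a hypothetical in-process class, and the injectable `clock` parameter exists only so the behaviour can be checked without waiting 20 seconds.

```python
import time

class ExpiringStore:
    """Toy in-process model of SET / EXPIRE / TTL / GET (not a Redis client)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.values = {}
        self.deadlines = {}  # key -> time at which the key disappears

    def _drop_if_expired(self, key):
        deadline = self.deadlines.get(key)
        if deadline is not None and self.clock() >= deadline:
            self.values.pop(key, None)
            del self.deadlines[key]

    def set(self, key, value):
        self.values[key] = value
        self.deadlines.pop(key, None)  # a plain SET removes any earlier expiry

    def expire(self, key, seconds):
        self._drop_if_expired(key)
        if key not in self.values:
            return 0
        self.deadlines[key] = self.clock() + seconds
        return 1

    def ttl(self, key):
        self._drop_if_expired(key)
        if key not in self.values:
            return -2   # Redis reports -2 for a missing key
        if key not in self.deadlines:
            return -1   # and -1 for a key that never expires
        return int(self.deadlines[key] - self.clock())

    def get(self, key):
        self._drop_if_expired(key)
        return self.values.get(key)
```

Once the deadline passes, `get` returns nothing, so an idle session cleans itself up without a separate job.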
Demo: simple transactions
● Example: maintaining indexes
multi
set 'session:42' '{"name":"lars", "level": 5}'
hset 'session:by:name' "lars" "42"
expire 'session:42' 20
exec
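Between `multi` and `exec` the commands are only queued; nothing changes until `exec` applies them together. A toy model of that queueing, with a hypothetical `Multi` class and helper functions (plain Python, no Redis connection, and expiry left out for brevity):

```python
def apply_set(data, key, value):
    data[key] = value

def apply_hset(data, key, field, value):
    data.setdefault(key, {})[field] = value

class Multi:
    """Toy model of MULTI/EXEC: queue commands, then apply them in one step."""

    def __init__(self, data):
        self.data = data
        self.queued = []

    def set(self, key, value):
        self.queued.append((apply_set, (key, value)))

    def hset(self, key, field, value):
        self.queued.append((apply_hset, (key, field, value)))

    def exec(self):
        # Apply every queued command back to back, then clear the queue
        for command, args in self.queued:
            command(self.data, *args)
        applied = len(self.queued)
        self.queued = []
        return applied

data = {}
tx = Multi(data)
tx.set("session:42", '{"name":"lars", "level": 5}')
tx.hset("session:by:name", "lars", "42")
visible_before_exec = dict(data)   # still empty: nothing has run yet
applied = tx.exec()
```

The session and its by-name index entry become visible at the same moment, so no reader sees one without the other.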
Demo: not so simple transactions
● Example: doing updates
watch 'session:42'
get 'session:42'
multi
set 'session:42' '{"name":"lars", "level": 7}'
exec
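`watch` makes the update optimistic: if another client writes the key between the read and `exec`, the transaction is dropped and the application has to retry. The sketch below imitates that with a write counter per key. `WatchableStore`, `exec_set`, and `level_up` are hypothetical names, and the retry loop is one possible policy, not something the slides prescribe.

```python
import json

class WatchableStore:
    """Toy model of WATCH + MULTI/EXEC using a write counter per key."""

    def __init__(self):
        self.data = {}
        self.versions = {}  # key -> number of writes so far

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1

    def get(self, key):
        return self.data.get(key)

    def watch(self, key):
        # Remember which version of the key we are about to read
        return (key, self.versions.get(key, 0))

    def exec_set(self, watched, value):
        key, seen_version = watched
        if self.versions.get(key, 0) != seen_version:
            return False  # someone else wrote the key: discard the update
        self.set(key, value)
        return True

def level_up(store, key, amount, retries=3):
    """Read-modify-write with retry when the watched key changed underneath us."""
    for _ in range(retries):
        watched = store.watch(key)
        session = json.loads(store.get(key))
        session["level"] += amount
        if store.exec_set(watched, json.dumps(session)):
            return True
    return False
```

The retry loop is the coding effort that a SQL transaction would otherwise hide.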
Redis demo questions
Redis - one NoSQL database
                      SQL                  NoSQL
Impedance mismatch    tables and columns   documents
Distribution          server               cluster
Schema                explicit             implicit
Query engine          optimizing           manual
Redis
Trade-off: coding effort
● SQL
● Distribution: sharding and clustering
● Impedance mismatch: Object-relational mapping
● Explicit schema: fixed, declared up-front, requires migrations
● NoSQL
● Manual query optimization
● Difficult transactional safety
● Implicit and dynamic schema migrations
Questions!
