Glynn Bird – Developer Advocate – IBM Cloud Data Services
NoSQL for SQL users
Introduction
@glynn_bird glynn.bird@uk.ibm.com
Glynn Bird
Developer Advocate
IBM Cloud Data Services
http://www.glynnbird.com
Agenda
• NoSQL vs SQL
• Types of NoSQL
• Scaling
• Querying and Data Modelling
• Replication
• Demo
SQL vs NoSQL
RDBMS
• Relational Database Management Systems
• SQL language developed by IBM in the 1970s
• RDBMSs power lots of IT systems
• Oracle, IBM DB2, MySQL, PostgreSQL, etc.
RDBMS downsides
• Scalability
• Availability
• Price
NoSQL
• NoSQL = "Not only SQL"
• A response to use cases for which an RDBMS is not a good fit
• Easier to scale
Types of NoSQL
• Key-Value
• Document
• BigTable
• Graph
Development Cycle
SQL vs NoSQL - Development Cycle
SQL
• Build
• Migrate staging database
• Test
• Migrate production
• Deploy
NoSQL
• Build
• Test
• Deploy
Database migrations are costly
• Adding/updating/deleting columns
• May cause interruption to service
• Often performed "out of hours"
• Have to be carefully planned in multi-server deployments
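One reason a document store can often skip the migration step is that documents in the same database need not share a shape, so a new field can be handled in application code when a document is read. A minimal sketch, assuming a `verified` field was introduced after some documents were already stored; `normalizeUser` and its defaults are hypothetical names for illustration, not part of any Cloudant API:

```javascript
// Hypothetical helper: read old and new document shapes without a migration.
// Older documents lack "verified" and "socialmedia"; defaults are applied on read.
function normalizeUser(doc) {
  return {
    ...doc,
    verified: doc.verified === true,
    socialmedia: Array.isArray(doc.socialmedia) ? doc.socialmedia : []
  };
}

// Two documents stored at different times, with different fields (sample data).
const oldDoc = { firstname: "Glynn", lastname: "Bird" };
const newDoc = {
  firstname: "Glynn",
  lastname: "Bird",
  verified: true,
  socialmedia: [{ type: "twitter", handle: "glynn_bird" }]
};

console.log(normalizeUser(oldDoc).verified); // false
console.log(normalizeUser(newDoc).verified); // true
```

The trade-off is that the application, rather than the database, becomes responsible for coping with every document shape still in use.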
Scalability
Scaling a web application
Scaling a SQL database
Scaling a Cloudant database
• Database-as-a-Service
• Free/PAYG/Dedicated/Local
• Sign up and start using
• Scale by adding nodes
• More data
• More concurrency
Scaling other NoSQL databases
Querying
SQL Tables
users
• userid*
• firstname
• lastname
• registration_date
• dob
• address1
• ...
socialmediaprofiles
• socmedid*
• userid*
• socmed_type
• url
• profile
SQL
SELECT *
FROM users
LEFT JOIN socialmediaprofiles
  ON users.userid = socialmediaprofiles.userid
WHERE registration_date > '2015-01-01'
  AND verified = true
  AND socialmedia = true
ORDER BY registration_date
NoSQL Data model
{
  "firstname": "Glynn",
  "lastname": "Bird",
  "dob": "1986-10-02",
  "registration_date": "2015-02-04",
  "verified": true,
  "address": { "address1": "10", "postcode": "W1A 1AA" },
  "socialmedia": [
    { "type": "twitter", "handle": "glynn_bird" },
    { "type": "github", "username": "glynnbird" }
  ]
}
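To show how the two SQL tables relate to this single document, here is a minimal sketch that folds each user's `socialmediaprofiles` rows into an embedded `socialmedia` array. The rows are made-up sample data, `toDocuments` is a hypothetical helper, and using the stringified `userid` as `_id` is an assumption for illustration:

```javascript
// Rows as they might come back from the two SQL tables (sample data).
const users = [
  { userid: 1, firstname: "Glynn", lastname: "Bird", registration_date: "2015-02-04" }
];
const socialmediaprofiles = [
  { socmedid: 10, userid: 1, socmed_type: "twitter", url: "https://twitter.com/glynn_bird" },
  { socmedid: 11, userid: 1, socmed_type: "github", url: "https://github.com/glynnbird" }
];

// Hypothetical helper: embed each user's profile rows in the user document,
// so that one document read replaces the LEFT JOIN.
function toDocuments(userRows, profileRows) {
  return userRows.map((user) => {
    const { userid, ...fields } = user;
    return {
      _id: String(userid), // document ids are strings
      ...fields,
      socialmedia: profileRows
        .filter((p) => p.userid === userid)
        .map((p) => ({ type: p.socmed_type, url: p.url }))
    };
  });
}

const docs = toDocuments(users, socialmediaprofiles);
console.log(JSON.stringify(docs[0], null, 2));
```

Embedding suits data that is read together with its parent; data shared between many parents may still be better kept in separate documents.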
Cloudant Query
{
  "selector": {
    "$and": [
      { "registration_date": { "$gt": "2015-01-01" } },
      { "verified": true },
      { "socialmedia": true }
    ]
  },
  "sort": [
    "registration_date:string"
  ]
}
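To make the selector semantics concrete, here is a toy in-memory matcher for a small subset of the syntax: `$and`, `$gt` and literal equality. It is a sketch for illustration only, not how Cloudant evaluates queries, and the three documents are made-up sample data:

```javascript
// Toy matcher for a subset of selector syntax: $and, $gt and plain equality.
// An illustration of the semantics, not Cloudant's implementation.
function matches(doc, selector) {
  return Object.entries(selector).every(([key, condition]) => {
    if (key === "$and") {
      // Every sub-selector must match.
      return condition.every((sub) => matches(doc, sub));
    }
    const value = doc[key];
    if (condition !== null && typeof condition === "object") {
      // Operator form, e.g. { "$gt": "2015-01-01" }
      return Object.entries(condition).every(([op, operand]) => {
        if (op === "$gt") return value !== undefined && value > operand;
        throw new Error("Unsupported operator in this sketch: " + op);
      });
    }
    // Literal form, e.g. { "verified": true }
    return value === condition;
  });
}

const selector = {
  "$and": [
    { "registration_date": { "$gt": "2015-01-01" } },
    { "verified": true }
  ]
};

const docs = [
  { _id: "a", registration_date: "2015-02-04", verified: true },
  { _id: "b", registration_date: "2014-11-30", verified: true },
  { _id: "c", registration_date: "2015-03-01", verified: false }
];

const hits = docs.filter((d) => matches(d, selector)).map((d) => d._id);
console.log(hits); // [ 'a' ]
```

The `$gt` comparison works on date strings here because ISO 8601 dates sort lexicographically in date order.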
MapReduce
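function(doc) {
  // Index only verified users with at least one social media profile.
  // The extra check skips documents that have no socialmedia field at all.
  if (doc.verified && doc.socialmedia && doc.socialmedia.length > 0) {
    // Key on registration_date so the view is ordered and range-queryable by date.
    emit(doc.registration_date, null);
  }
}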
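To see what such a map function produces, the sketch below runs a named version of it over a few sample documents and sorts the emitted rows by key, roughly as a view index would. `buildView` is a hypothetical stand-in for the view engine, `emit` is passed as an argument here rather than being a global as it is in Cloudant, and the documents are made up:

```javascript
// A named version of the map function, with emit passed in so it can run locally.
function mapVerifiedSocial(doc, emit) {
  if (doc.verified && doc.socialmedia && doc.socialmedia.length > 0) {
    emit(doc.registration_date, null);
  }
}

// Hypothetical mini view builder: run the map over every document,
// collect the emitted rows and sort them by key.
function buildView(docs, mapFn) {
  const rows = [];
  for (const doc of docs) {
    mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  return rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const sampleDocs = [
  { _id: "u1", registration_date: "2015-02-04", verified: true, socialmedia: [{ type: "twitter" }] },
  { _id: "u2", registration_date: "2015-01-15", verified: true, socialmedia: [{ type: "github" }] },
  { _id: "u3", registration_date: "2015-03-01", verified: false, socialmedia: [{ type: "twitter" }] },
  { _id: "u4", registration_date: "2015-01-20", verified: true, socialmedia: [] }
];

const rows = buildView(sampleDocs, mapVerifiedSocial);
console.log(rows.map((r) => r.id)); // [ 'u2', 'u1' ]
```

Because the rows are kept sorted by key, a query for a date range only has to read a contiguous slice of the index.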
CRUD – Document
• Direct document lookup by _id
• Use when you want a single document and can find it by its _id
Primary Index
• Exists “OOTB” (out of the box)
• Stored in a B-tree
• Primary key → doc._id
• Use when you can find documents based on their _id
• Pull back a range of keys
Secondary Index (view)
• Built using MapReduce
• Stored in a B-tree
• Key → user-defined field(s)
• Use when you need to analyze data or get a range of keys
• Ex: count data fields, sum/average numeric results, advanced stats, group by date, etc.
Search Index
• Built using Lucene
• FTI (full-text index): any or all fields can be indexed
• Ad-hoc queries
• Find documents based on their contents
• Can do groups, facets, and basic geo queries (bbox & sort by distance)
GeoSpatial Index
• Stored in an R*, TPR, or KD tree
• Lat/long coordinates in GeoJSON
• Complex geometries (polygon, circularstring, etc.)
• Advanced relations (intersect, overlaps, etc.)
Cloudant Query
• “Mongo-style” querying
• Built natively in Erlang
• Ad-hoc queries
• Lots of operators (>, <, IN, OR, AND, etc.)
• Intuitive for people coming from Mongo or SQL backgrounds
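As an example of "pull back a range of keys", a MapReduce view can be queried over HTTP with `startkey` and `endkey` parameters, each JSON-encoded. The sketch below only builds the URL; the account, database, design document and view names are placeholders, and `viewRangeUrl` is a hypothetical helper:

```javascript
// Build the URL for a range query against a MapReduce view.
// The account, database, design document and view names are placeholders.
function viewRangeUrl(baseUrl, db, ddoc, view, startkey, endkey) {
  const params = new URLSearchParams({
    // Keys are JSON-encoded, so string keys carry their double quotes.
    startkey: JSON.stringify(startkey),
    endkey: JSON.stringify(endkey)
  });
  return `${baseUrl}/${db}/_design/${ddoc}/_view/${view}?${params}`;
}

const url = viewRangeUrl(
  "https://ACCOUNT.cloudant.com",
  "users",
  "app",
  "by_registration_date",
  "2015-01-01",
  "2015-12-31"
);
console.log(url);
```

An HTTP GET on the resulting URL, with credentials, would return the view rows whose keys fall inside the range.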
Replication
Cloudant Replication
• Replicate data from one cluster to another
• Replicate data to browser/mobile and back
• No data loss
• Offline-first apps/websites
• http://www.glynnbird.com/
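Cluster-to-cluster replication can be started through the CouchDB-style `_replicate` endpoint, which takes a JSON body naming a `source` and a `target`. The sketch below builds such a request without sending it; the account names and database names are placeholders, and `replicationRequest` is a hypothetical helper:

```javascript
// Build (but do not send) a request for the _replicate endpoint.
// URLs and database names are placeholders for illustration.
function replicationRequest(baseUrl, source, target, continuous = false) {
  return {
    url: `${baseUrl}/_replicate`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // create_target asks for the target database to be created if it is missing.
      body: JSON.stringify({ source, target, create_target: true, continuous })
    }
  };
}

const req = replicationRequest(
  "https://ACCOUNT.cloudant.com",
  "https://ACCOUNT.cloudant.com/users",
  "https://OTHER.cloudant.com/users-copy"
);
console.log(req.url);
console.log(req.options.body);
// To run it for real, pass req.url and req.options to fetch() with credentials added.
```

Setting `continuous` to true keeps the replication running so later changes are copied as they happen.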
Simple Search Service
Simple Search Service
• Free, open-source Bluemix App – install with one click
• Upload your .csv or .tsv
  – Imports data into Cloudant
  – Indexes everything for search
  – Presents HTTP Search API
• Demo!
https://developer.ibm.com/clouddataservices/simple-search-service/
Simple Search Service Architecture
Simple Search Service – Production Architecture
Cloudant use-cases
• Big Data – Large data sets
• Scalable operational data store
• Search – faceted, full-text search
• Geo-spatial – geographic, GIS systems, GeoJSON
• Offline-first – replicating data to mobile devices
Glynn Bird
Developer Advocate, Cloud Data Services
glynn.bird@uk.ibm.com
@glynn_bird
github.com/glynnbird
www.glynnbird.com
