SQL to NoSQL: Top 6 Questions
Glynn Bird
Developer Advocate @ IBM
@glynn_bird
Agenda
• Top 6 Questions When Moving to NoSQL
1. Why NoSQL?
2. Rows and Tables Become ... What?
3. How do I query data?
4. Will I Have to Rebuild My App?
5. Does it replicate?
6. How do I get data in and out?
• Live Q&A
1. Why NoSQL?
But, What Is NoSQL, Really?
• Umbrella term for databases using non-SQL query languages
• Key-Value stores
• Column-family stores
• Document stores
• Graph stores
• Some also say "non-relational," because data is not decomposed into separate tables, rows, and columns
• It’s still possible to represent relationships in NoSQL
• The question is, are these relationships always necessary?
NoSQL Document Stores
• These are databases like MongoDB, Apache CouchDB™, Cloudant, and Amazon DynamoDB
• Optimized for "semi-structured" or "schema-optional" data
• People say "unstructured," but that's inaccurate
• Each document has its own structure
[Figure: 2.0, multi-node clustering, Cloudant Geo, Cloudant Query (Mango), Cloudant Search (Lucene), Dashboard]
Schema Flexibility
• Cloudant uses JavaScript Object Notation (JSON) as its data format
• Cloudant is based on Apache CouchDB. In both systems, a "database" is simply a collection of JSON documents
{
  "docs": [
    {
      "_id": "df8cecd9809662d08eb853989a5ca2f2",
      "_rev": "1-8522c9a1d9570566d96b7f7171623270",
      "Movie_runtime": 162,
      "Movie_rating": "PG-13",
      "Person_name": "Zoe Saldana",
      "Actor_actor_id": "0757855",
      "Movie_genre": "AVYS",
      "Movie_name": "Avatar",
      "Actor_movie_id": "0499549",
      "Movie_earnings_rank": "1",
      "Person_pob": "New Jersey, USA",
      "Person_id": "0757855",
      "Movie_id": "0499549",
      "Movie_year": 2009,
      "Person_dob": "1978-06-19"
    }
  ]
}
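To make "a database is simply a collection of JSON documents" concrete, here is a minimal Python sketch. It assumes an in-memory list rather than a real Cloudant database. The first document reuses a few fields from the example above; the second document and the `field_names` helper are hypothetical, added only to show that each document can carry its own structure.

```python
import json

# A "database" modeled as a plain list of JSON documents.
# The first document borrows fields from the slide's example; the second
# is a made-up document with a different shape (schema-optional data).
raw = """
{
  "docs": [
    {"_id": "df8cecd9809662d08eb853989a5ca2f2",
     "Movie_name": "Avatar", "Movie_year": 2009, "Person_name": "Zoe Saldana"},
    {"_id": "example-2", "Movie_name": "Example Film", "tags": ["hypothetical"]}
  ]
}
"""

def field_names(database):
    """Return, per document _id, the sorted list of fields that document carries."""
    return {doc["_id"]: sorted(doc) for doc in database["docs"]}

database = json.loads(raw)
shapes = field_names(database)
```

Both documents live in the same collection even though their field sets differ; nothing had to be declared up front.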
The Cloudant Data Layer
• Distributed NoSQL data persistence layer
• Available as a fully-managed DBaaS, or managed by you on-premises
• Transactional JSON document database with REST API
• Spreads data across data centers & devices for scale & high availability
• Ideal for apps that require:
• Massive, elastic scalability
• High availability
• Geo-location services
• Full-text search
• Offline-first design for occasionally connected users
Not One DB Server; a Cluster of Servers
• A Cloudant cluster
• Horizontal scale
• Redundant load balancers backed by multiple DB servers
• Designed for durability
• Saves multiple copies of data
• Spreads copies across cluster
• All replicas do reads & writes
• Access Cloudant over the Web
• Developers get an API
• Cloudant manages it all behind the scenes
Horizontal Scaling
• Shard across many commodity servers vs. few expensive ones
• Cost grows linearly with performance, not exponentially
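The idea of sharding across many servers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hash-modulo placement, the `shard_for` and `distribute` helpers, and the shard count of 4 are hypothetical and are not a description of how Cloudant actually partitions data.

```python
import hashlib

def shard_for(doc_id: str, shard_count: int) -> int:
    """Map a document _id to a shard number with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count

def distribute(doc_ids, shard_count):
    """Group document ids by the shard they hash to."""
    shards = {n: [] for n in range(shard_count)}
    for doc_id in doc_ids:
        shards[shard_for(doc_id, shard_count)].append(doc_id)
    return shards

# 1000 made-up document ids spread over 4 hypothetical shards.
ids = [f"user:{n}" for n in range(1000)]
placement = distribute(ids, 4)
```

Because the hash is stable, the same `_id` always lands on the same shard, so reads know where to look without a central lookup table.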
2. Rows and Tables Become ... What?
... This!
SQL Terms/Concepts --> Document Store Terms/Concepts
database --> database
table --> bunch of documents
row --> document
column --> field
materialized view --> index/database view/secondary index
primary key --> "_id":
table JOIN operations --> entity relations
Rows --> Documents
• Use some field to group documents by schema
• Example:
"type":"user" or "type":"book"
"_id":"user:456" or "_id":"book:9988"
Tables --> Databases
• Put all tables in one database; use a "type" field to distinguish them
• Model entity relationships with secondary indexes
• http://wiki.apache.org/couchdb/EntityRelationship
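The last two slides (rows become documents, tables share one database) can be sketched together in Python. Everything here is an assumption for illustration: the `row_to_document` and `build_index` helpers, the sample rows, and the `author` field are hypothetical, and the index is a plain in-memory dictionary rather than a Cloudant secondary index. The slide offers `"type"` and a prefixed `"_id"` as alternatives; the sketch sets both.

```python
from collections import defaultdict

def row_to_document(table, primary_key, row):
    """Turn one relational row into a document with "type" and a prefixed "_id"."""
    doc = {k: v for k, v in row.items() if k != primary_key}
    doc["_id"] = f"{table}:{row[primary_key]}"
    doc["type"] = table
    return doc

def build_index(documents, doc_type, field):
    """Secondary index: value of `field` -> _ids of documents of `doc_type`."""
    index = defaultdict(list)
    for doc in documents:
        if doc.get("type") == doc_type and field in doc:
            index[doc[field]].append(doc["_id"])
    return dict(index)

# Rows from two hypothetical tables, stored side by side in one "database".
user_rows = [{"id": 456, "name": "Example Author"}]
book_rows = [
    {"id": 9988, "title": "First Book", "author": "user:456"},
    {"id": 9989, "title": "Second Book", "author": "user:456"},
]
database = [row_to_document("user", "id", r) for r in user_rows]
database += [row_to_document("book", "id", r) for r in book_rows]

# One-to-many relation (author -> books) answered from the index, not a JOIN.
books_by_author = build_index(database, "book", "author")
```

Looking up `books_by_author["user:456"]` plays the role a JOIN between the two tables would have played.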
3. How Do You Query NoSQL?
Indexes and Queries
• An "index" in Cloudant is not strictly a performance optimization
• Instead, it is more akin to a "materialized view" in RDBMS terms
• An index is also called a "database view" in Cloudant
• Index, then query
• You need one before you can do the other
• Create the index, then query it by URL
• You can create a secondary index on any field within a document
• You get a primary index (based on the reserved "_id": field) by default
• Indexes are precomputed and updated in real time
• Indexes are updated using incremental MapReduce
• You don't need to rebuild the entire index every time a document is changed, added, or deleted
• Performant at big-honkin' scale
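The "no full rebuild" point can be illustrated with a toy Python index that is adjusted one document change at a time. This is a sketch under loud assumptions: the `IncrementalIndex` class is hypothetical, it covers only a map-style step (no reduce), and it is not how Cloudant implements incremental MapReduce internally.

```python
class IncrementalIndex:
    """Toy secondary index kept current one document change at a time."""

    def __init__(self, field):
        self.field = field
        self.by_key = {}   # indexed value -> set of document _ids
        self.seen = {}     # document _id -> value it was last indexed under

    def _remove(self, doc_id):
        # Drop only this document's old entry; other entries are untouched.
        if doc_id in self.seen:
            old_key = self.seen.pop(doc_id)
            self.by_key[old_key].discard(doc_id)
            if not self.by_key[old_key]:
                del self.by_key[old_key]

    def upsert(self, doc):
        """Index a new or changed document without rebuilding the index."""
        self._remove(doc["_id"])
        if self.field in doc:
            key = doc[self.field]
            self.by_key.setdefault(key, set()).add(doc["_id"])
            self.seen[doc["_id"]] = key

    def delete(self, doc_id):
        self._remove(doc_id)

    def query(self, key):
        return sorted(self.by_key.get(key, set()))

index = IncrementalIndex("type")
index.upsert({"_id": "user:456", "type": "user"})
index.upsert({"_id": "book:9988", "type": "book"})
index.upsert({"_id": "user:456", "type": "admin"})  # changed document
index.delete("book:9988")                           # deleted document
```

Each change costs work proportional to the one document involved, which is the property the slide is pointing at.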
One Cloudant DB, Many Indexes
The Cloudant API
Cloudant Query
curl -X POST 'https://<accountname>.cloudant.com/users/_find' \
  -H 'Content-Type: application/json' \
  -d '{
    "selector": {
      "age": {
        "$gt": 25,
        "$lte": 50
      }
    }
  }'
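The same query can be prepared from Python's standard library. This is a minimal sketch: the `build_find_request` helper and the account name `example-account` are hypothetical, authentication is left out, and the request is only built, not sent.

```python
import json
import urllib.request

def build_find_request(account, database, selector):
    """Build (but do not send) a POST to the database's _find endpoint."""
    url = f"https://{account}.cloudant.com/{database}/_find"
    body = json.dumps({"selector": selector}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_find_request(
    "example-account",  # hypothetical account name
    "users",
    {"age": {"$gt": 25, "$lte": 50}},
)
# Sending it would be urllib.request.urlopen(request), once credentials are added.
```

The selector is ordinary JSON, so it can be assembled as a Python dictionary and serialized at the last moment.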
4. Will I Have to Rebuild My App?
Yes
By ripping out the bad parts:
• Extract, Transform, Load
• Schema migrations
• JOINs that don't scale
Each of My Tables Becomes a Different Type of JSON Document?
No
• Fancy explanation:
• Best practice is to denormalize data out of 3rd normal form
• Or, less fancy:
• Smoosh relationships for each entry all together into one JSON doc
• Denormalization
• Approach to data modeling that shards well and scales well
• Works well with data that is somewhat static, or infrequently updated
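A small Python sketch of the "smooshing" step, assuming three normalized tables held as lists of dictionaries. The table names, the sample values, and the `denormalize` helper are all hypothetical.

```python
# Rows as they might sit in three normalized tables (all values hypothetical).
people = [{"person_id": 1, "name": "Example Person"}]
addresses = [{"person_id": 1, "city": "Example City", "country": "USA"}]
languages = [
    {"person_id": 1, "language": "English"},
    {"person_id": 1, "language": "French"},
]

def denormalize(person):
    """Fold a person's related rows into one self-contained document."""
    pid = person["person_id"]
    address = next((a for a in addresses if a["person_id"] == pid), None)
    doc = {
        "_id": f"person:{pid}",
        "name": person["name"],
        "languages": [l["language"] for l in languages if l["person_id"] == pid],
    }
    if address is not None:
        # Embed the address as a nested object instead of a foreign key.
        doc["address"] = {k: v for k, v in address.items() if k != "person_id"}
    return doc

documents = [denormalize(p) for p in people]
```

Reading one person now takes one document fetch; the trade-off is that a change to shared data has to be written into every document that embeds it, which is why this suits infrequently updated data.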
A smooshed and griddled cheese sandwich
Example
{
  "_id": "johnsmith@us.ibm.com",
  "_rev": "12-89e6128fb2d3e2e14559e796b6a71c9d",
  "name": "John Smith",
  "title": "Technical Sales Manager",
  "products": [ "Cloudant", "Information Server" ],
  "languages": [ "English" ],
  "geolocation": {
    "coordinates": [ -122.18258, 37.880058 ],
    "type": "point"
  },
  "address": {
    "street": "63 Citron knoll",
    "city": "Orinda",
    "state": "CA",
    "country": "USA"
  }
}
5. Does it replicate?
{
  "_id": "johnsmith@us.ibm.com",
  "_rev": "12-89e6128fb2d3e2e14559e796b6a71c9d",
  "name": "John Smith",
  "title": "Technical Sales Manager",
  "products": [ "Cloudant", "Information Server" ],
  "languages": [ "English" ],
  "geolocation": {
    "coordinates": [ -122.18258, 37.880058 ],
    "type": "point"
  },
  "address": {
    "street": "63 Citron knoll",
    "city": "Orinda",
    "state": "CA",
    "country": "USA"
  }
}
Replication targets
• Apache CouchDB
• IBM Cloudant
• PouchDB (client & server)
• Cloudant Sync Libraries
www.glynnbird.com
• My home page
• Cloudant database of articles
• Replicated to PouchDB
• Appcache for offline first
• http://www.glynnbird.com/
6. How do I get data in and out?
• Yes, data can be moved in and out
• https://cloudant.com/for-developers/migrating-data/
• But every use case is different and everyone’s data is different
• Lots of DIY tools on GitHub that could work for you
• Cloudant’s Homegrown CSV --> JSON Tools
• Python: https://github.com/claudiusli/csv-import
• Java: https://github.com/cavanaugh-ibm/db-data-loader
• Node: https://github.com/glynnbird/couchimport
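The CSV --> JSON step these tools perform can be sketched with Python's standard library. This is not the implementation of any of the linked tools: the `csv_to_bulk_docs` helper and the sample data are hypothetical. The `{"docs": [...]}` wrapper matches the shape shown in the earlier movie example, which is also the shape CouchDB's `_bulk_docs` endpoint accepts.

```python
import csv
import io
import json

def csv_to_bulk_docs(csv_text, doc_type):
    """Convert CSV text into a {"docs": [...]} payload, one document per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    docs = []
    for row in reader:
        doc = dict(row)          # every CSV value arrives as a string
        doc["type"] = doc_type   # tag documents so they can share a database
        docs.append(doc)
    return {"docs": docs}

# Made-up sample data; a real import would read from a file instead.
sample = "name,city\nExample Person,Example City\nAnother Person,Other City\n"
payload = csv_to_bulk_docs(sample, "person")
body = json.dumps(payload)
```

Since every CSV value is a string, a real import would likely also convert numeric and date columns before upload; that step is left out here.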
Simple Data Pipe
• https://github.com/ibm-cds-labs/pipes
Simple Search Service
https://developer.ibm.com/clouddataservices/simple-search-service/
Glynn Bird
Developer Advocate, Cloud Data Services
glynn.bird@uk.ibm.com
@glynn_bird
github.com/glynnbird
Legal Slide #1
© "Apache", "CouchDB", "Apache CouchDB", "Apache Lucene," "Lucene", and the CouchDB logo are trademarks or registered
trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.
Legal Slide #2
© Copyright IBM Corporation 2016
IBM and the IBM Cloudant logo are trademarks of International Business Machines Corp., registered in many
jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current
list of IBM trademarks is available on the Web at "Copyright and trademark information" at
ibm.com/legal/copytrade.shtml

Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 

SQL To NoSQL - Top 6 Questions Before Making The Move

  • 1. SQL to NoSQL: Top 6 Questions Glynn Bird Developer Advocate @ IBM @glynn_bird
  • 2. Agenda • Top 6 Questions When Moving to NoSQL 1. Why NoSQL? 2. Rows and Tables Become ... What? 3. Will I Have to Rebuild My App? 4. How do I query data? 5. What's _rev? 6. Does it replicate? • Live Q&A
  • 4. But, What Is NoSQL, Really? • Umbrella term for databases using non-SQL query languages • Key-Value stores • Column-family stores • Document stores • Graph stores • Some also say "non-relational," because data is not decomposed into separate tables, rows, and columns • It’s still possible to represent relationships in NoSQL • The question is, are these relationships always necessary?
  • 5. NoSQL Document Stores • That's databases like MongoDB, Apache CouchDB™, Cloudant, and Dynamo • Optimized for "semi-structured" or "schema-optional" data • People say "unstructured," but that's inaccurate • Each document has its own structure
  • 7. Schema Flexibility • Cloudant uses JavaScript Object Notation (JSON) as its data format • Cloudant is based on Apache CouchDB. In both systems, a "database" is simply a collection of JSON documents { "docs": [ { "_id": "df8cecd9809662d08eb853989a5ca2f2", "_rev": "1-8522c9a1d9570566d96b7f7171623270", "Movie_runtime": 162, "Movie_rating": "PG-13", "Person_name": "Zoe Saldana", "Actor_actor_id": "0757855", "Movie_genre": "AVYS", "Movie_name": "Avatar", "Actor_movie_id": "0499549", "Movie_earnings_rank": "1", "Person_pob": "New Jersey, USA", "Person_id": "0757855", "Movie_id": "0499549", "Movie_year": 2009, "Person_dob": "1978-06-19" } ] }
  • 8. The Cloudant Data Layer • Distributed NoSQL data persistence layer • Available as a fully-managed DBaaS, or managed by you on-premises • Transactional JSON document database with REST API • Spreads data across data centers & devices for scale & high availability • Ideal for apps that require: • Massive, elastic scalability • High availability • Geo-location services • Full-text search • Offline-first design for occasionally connected users
  • 9. Not One DB Server; a Cluster of Servers • A Cloudant cluster • Horizontal scale • Redundant load balancers backed by multiple DB servers • Designed for durability • Saves multiple copies of data • Spreads copies across cluster • All replicas do reads & writes • Access Cloudant over the Web • Developers get an API • Cloudant manages it all behind the scenes
  • 10. Horizontal Scaling • Shard across many commodity servers vs. few expensive ones • Performance improves linearly with cost, not exponentially
  • 11. 2. Rows and Tables Become ... What?
  • 12. ... This! SQL Terms/Concepts --> Document Store Terms/Concepts • database --> database • table --> bunch of documents • row --> document • column --> field • materialized view --> index/database view/secondary index • primary key --> "_id": • table JOIN operations --> entity relations
  • 13. Rows --> Documents • Use some field to group documents by schema • Example: "type":"user" or "type":"book", "_id":"user:456" or "_id":"book:9988"
  • 14. Tables --> Databases • Put all tables in one database; use "type": to distinguish • Model entity relationships with secondary indexes • http://wiki.apache.org/couchdb/EntityRelationship
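As a rough illustration of the two slides above, the sketch below tags each document with a "type" field and an "_id" prefix, then keeps two former "tables" side by side in one collection. The helper names (`make_doc`, `docs_of_type`) and the sample field values are made up for this example, and a plain Python list stands in for the database.

```python
import json


def make_doc(doc_type, key, fields):
    """Build a document tagged two ways: a "type" field and an "_id" prefix."""
    doc = dict(fields)
    doc["_id"] = f"{doc_type}:{key}"
    doc["type"] = doc_type
    return doc


# Two former "tables" living together in one database (a plain list here).
database = [
    make_doc("user", 456, {"name": "Example User"}),
    make_doc("book", 9988, {"title": "Example Book"}),
]


def docs_of_type(docs, doc_type):
    """Pick out one 'table' by filtering on the "type" field."""
    return [d for d in docs if d.get("type") == doc_type]


users = docs_of_type(database, "user")
print(json.dumps(users, indent=2))
```

In a real deployment the filtering would normally be done by an index or query on "type" rather than by scanning every document in application code.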
  • 15. 3. How do you query NoSQL?
  • 16. Indexes and Queries • An "index" in Cloudant is not strictly a performance optimization • Instead, more akin to "materialized view" in RDBMS terms • Index also called a "database view" in Cloudant • Index, then query • You need one before you can do the other • Create index, then query by URL • Can create a secondary index on any field within a document • You get primary index (based on reserved "_id": field) by default • Indexes precomputed, updated in real time • Indexes are updated using incremental MapReduce • You don't need to rebuild the entire index every time a document is changed, added, or deleted • Performant at big-honkin' scale
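To make "index, then query" and "incremental" a little more concrete, here is a toy in-memory secondary index. It is not Cloudant's MapReduce engine, only a sketch of the idea that a change to one document touches just that document's index entries. The class name `IncrementalIndex` and its methods are invented for this illustration.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy secondary index on one field, updated one document at a time."""

    def __init__(self, field):
        self.field = field
        self.by_key = defaultdict(set)  # field value -> set of document ids
        self.key_of = {}                # document id -> value currently indexed

    def _unindex(self, doc_id):
        # Remove only this document's old entry; nothing else is rebuilt.
        if doc_id in self.key_of:
            old_key = self.key_of.pop(doc_id)
            self.by_key[old_key].discard(doc_id)
            if not self.by_key[old_key]:
                del self.by_key[old_key]

    def upsert(self, doc):
        """Index a new document or re-index a changed one."""
        self._unindex(doc["_id"])
        if self.field in doc:
            key = doc[self.field]
            self.by_key[key].add(doc["_id"])
            self.key_of[doc["_id"]] = key

    def delete(self, doc_id):
        self._unindex(doc_id)

    def query(self, key):
        """Look up document ids by indexed value without scanning documents."""
        return sorted(self.by_key.get(key, ()))


index = IncrementalIndex("type")
index.upsert({"_id": "user:456", "type": "user"})
index.upsert({"_id": "book:9988", "type": "book"})
index.upsert({"_id": "user:456", "type": "admin"})  # changed document
index.delete("book:9988")                           # deleted document
print(index.query("admin"))
```

Each `upsert` or `delete` does work proportional to one document, which is the property the slide is pointing at when it says the whole index does not need rebuilding.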
  • 17. One Cloudant DB, Many Indexes • The Cloudant API
  • 18. Cloudant Query • curl -X POST 'https://<accountname>.cloudant.com/users/_find' -H 'Content-Type: application/json' -d '{ "selector": { "age": { "$gt": 25, "$lte": 50 } } }'
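The same query can be prepared from Python with only the standard library. This is a sketch under assumptions: the account name `example-account` is a placeholder, the JSON Content-Type header is added here rather than taken from the slide, and the request is built but not sent so the example runs offline. The `matches` function is a local approximation of just the two operators used in the slide, not Cloudant's query engine.

```python
import json
import urllib.request

ACCOUNT = "example-account"  # placeholder, substitute a real account name
url = f"https://{ACCOUNT}.cloudant.com/users/_find"
selector = {"age": {"$gt": 25, "$lte": 50}}

request = urllib.request.Request(
    url,
    data=json.dumps({"selector": selector}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) would send it; left out on purpose.


def matches(doc, selector):
    """Evaluate a selector locally, supporting only $gt and $lte."""
    for field, conditions in selector.items():
        if field not in doc:
            return False
        value = doc[field]
        for op, operand in conditions.items():
            if op == "$gt":
                ok = value > operand
            elif op == "$lte":
                ok = value <= operand
            else:
                raise ValueError(f"operator not covered by this sketch: {op}")
            if not ok:
                return False
    return True


sample = [{"_id": "user:1", "age": 25}, {"_id": "user:2", "age": 26},
          {"_id": "user:3", "age": 50}, {"_id": "user:4", "age": 51}]
print([d["_id"] for d in sample if matches(d, selector)])
```

The selector reads as "age greater than 25 and less than or equal to 50", so the boundaries behave differently at each end.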
  • 19. 4. Will I Have to Rebuild My App?
  • 20. Yes • By ripping out the bad parts: • Extract, Transform, Load • Schema migrations • JOINs that don't scale
  • 21. Each of My Tables Becomes a Different Type of JSON Document?
  • 22. No • Fancy explanation: • Best practice is to denormalize data, rather than keep it in 3rd normal form • Or, less fancy: • Smoosh relationships for each entry all together into one JSON doc • Denormalization • Approach to data modeling that shards well and scales well • Works well with data that is somewhat static, or infrequently updated • (Image: a smooshed and griddled cheese sandwich)
  • 23. Example { "_id": "johnsmith@us.ibm.com", "_rev": "12-89e6128fb2d3e2e14559e796b6a71c9d", "name": "John Smith", "title": "Technical Sales Manager", "products": [ "Cloudant", "Information Server"], "languages": [ "English" ], "geolocation": { "coordinates": [ -122.18258, 37.880058 ], "type": "point" }, "address": { "street": "63 Citron knoll", "city": "Orinda", "state": "CA", "country": "USA" } }
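A document like the example above might be assembled from relational rows roughly as follows. The table layout (`people`, `person_products`, `person_languages`, `addresses`) and the `denormalize` helper are hypothetical, chosen only to show the "smooshing" step. The geolocation field is left out for brevity and "_rev" is omitted because the database assigns it.

```python
import json

# Hypothetical normalized rows, as they might come out of four SQL tables.
people = [{"person_id": 1, "email": "johnsmith@us.ibm.com",
           "name": "John Smith", "title": "Technical Sales Manager"}]
person_products = [{"person_id": 1, "product": "Cloudant"},
                   {"person_id": 1, "product": "Information Server"}]
person_languages = [{"person_id": 1, "language": "English"}]
addresses = [{"person_id": 1, "street": "63 Citron knoll", "city": "Orinda",
              "state": "CA", "country": "USA"}]


def denormalize(person):
    """Fold every row related to one person into a single JSON document."""
    pid = person["person_id"]
    doc = {
        "_id": person["email"],
        "name": person["name"],
        "title": person["title"],
        "products": [r["product"] for r in person_products
                     if r["person_id"] == pid],
        "languages": [r["language"] for r in person_languages
                      if r["person_id"] == pid],
    }
    for row in addresses:
        if row["person_id"] == pid:
            # Embed the address; the foreign key is no longer needed.
            doc["address"] = {k: v for k, v in row.items() if k != "person_id"}
    return doc


doc = denormalize(people[0])
print(json.dumps(doc, indent=2))
```

The JOINs happen once, at write time, so reading the person back is a single document fetch.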
  • 24. 5. Does it replicate? { "_id": "johnsmith@us.ibm.com", "_rev": "12-89e6128fb2d3e2e14559e796b6a71c9d", "name": "John Smith", "title": "Technical Sales Manager", "products": [ "Cloudant", "Information Server" ], "languages": [ "English" ], "geolocation": { "coordinates": [ -122.18258, 37.880058 ], "type": "point" }, "address": { "street": "63 Citron knoll", "city": "Orinda", "state": "CA", "country": "USA" } }
  • 25. Replication targets • Apache CouchDB • IBM Cloudant • PouchDB (client & server) • Cloudant Sync Libraries
  • 26. www.glynnbird.com • My home page • Cloudant database of articles • Replicated to PouchDB • Appcache for offline first • http://www.glynnbird.com/
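Between CouchDB-compatible servers, a replication is typically started by POSTing a small JSON body to the `_replicate` endpoint. The sketch below only builds that body. The URLs are placeholders, the helper name `replication_request` is invented here, and PouchDB and the Cloudant Sync libraries expose their own client-side APIs instead.

```python
import json


def replication_request(source_url, target_url, continuous=False):
    """Return the JSON body for a POST to a server's /_replicate endpoint."""
    body = {"source": source_url, "target": target_url}
    if continuous:
        # Keep replicating as new changes arrive instead of stopping once.
        body["continuous"] = True
    return json.dumps(body)


# Placeholder URLs; real ones would also need credentials.
payload = replication_request(
    "https://example-account.cloudant.com/articles",
    "http://localhost:5984/articles",
    continuous=True,
)
print(payload)
```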
  • 27. 6. How do I get data in and out? • Yes • https://cloudant.com/for-developers/migrating-data/ • But every use case is different and everyone’s data is different • Lots of DIY tools on GitHub that could work for you • Cloudant’s Homegrown CSV --> JSON Tools • Python: https://github.com/claudiusli/csv-import • Java: https://github.com/cavanaugh-ibm/db-data-loader • Node: https://github.com/glynnbird/couchimport
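For a sense of what a CSV --> JSON import involves, here is a minimal sketch that turns CSV rows into the `{"docs": [...]}` shape accepted by the `_bulk_docs` endpoint. It is not how any of the tools listed above work internally. The column names, the sample data, and the `csv_to_bulk_docs` helper are assumptions made for illustration.

```python
import csv
import io
import json

# Made-up sample export; a real import would read from a file instead.
CSV_TEXT = "id,name,age\n456,Example User,30\n457,Another User,41\n"


def csv_to_bulk_docs(csv_text, doc_type):
    """Convert CSV rows into a {"docs": [...]} payload for _bulk_docs."""
    docs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc = dict(row)
        # Reuse the old primary key as a prefixed "_id".
        doc["_id"] = f"{doc_type}:{doc.pop('id')}"
        doc["type"] = doc_type
        # CSV values are all strings; restore numeric columns explicitly.
        doc["age"] = int(doc["age"])
        docs.append(doc)
    return {"docs": docs}


bulk_payload = csv_to_bulk_docs(CSV_TEXT, "user")
print(json.dumps(bulk_payload, indent=2))
```

Type conversion is the part that differs most from dataset to dataset, which is one reason the slide says every use case is different.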
  • 28. Simple Data Pipe • https://github.com/ibm-cds-labs/pipes
  • 30. Glynn Bird Developer Advocate, Cloud Data Services glynn.bird@uk.ibm.com @glynn_bird github.com/glynnbird
  • 31. Legal Slide #1 • "Apache", "CouchDB", "Apache CouchDB", "Apache Lucene", "Lucene", and the CouchDB logo are trademarks or registered trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.
  • 32. Legal Slide #2 • © Copyright IBM Corporation 2016. IBM and the IBM Cloudant logo are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at ibm.com/legal/copytrade.shtml