NoSQL Databases and Analytic Use Cases

Koverse CTO Aaron Cordova's (@aaroncordova) talk from the 2014 INFORMS conference - "The Business of Big Data"

Transcript of "NoSQL Databases and Analytic Use Cases"

  1. NoSQL Databases and Analytic Use Cases
       Aaron Cordova, INFORMS
  2. NoSQL
       • Perhaps better is “Non-Relational”
       • Departure from conventional relational db
       • Trade traditional features for simplicity, scalability, flexibility
  3. Types of NoSQL DBs
       Columnar: BigTable, HBase, Accumulo, Cassandra
       Graph: Neo4j, OrientDB
       Key-Value: Dynamo, Riak, Voldemort, BerkeleyDB
       Document: MongoDB, CouchDB, MarkLogic (XML)
  4. Trades
       Give up: Cross-row Transactions, Relational JOINS, Type Checking, SQL
       Gain: Simplicity, Scalability (distributed), Schema Flexibility, Geographic distribution, Programmatic APIs
  5. NoSQL Distributed
       Name   Age  Phone
       Bob    43   555-1212
       Jenny  32   555-1213
       Sally  28   555-1214
       Joe    45   555-1215
       Up to Petabytes
  6. Consistency
       Three copies of the same table; the second copy disagrees on Jenny's phone number.
       Copy 1:
         Name   Age  Phone
         Bob    43   555-1212
         Jenny  32   555-1213
         Sally  28   555-1214
         Joe    45   555-1215
       Copy 2:
         Name   Age  Phone
         Bob    43   555-1212
         Jenny  32   867-5309
         Sally  28   555-1214
         Joe    45   555-1215
       Copy 3:
         Name   Age  Phone
         Bob    43   555-1212
         Jenny  32   555-1213
         Sally  28   555-1214
         Joe    45   555-1215
       Diagram labels: X, Multiple Data Centers, Single Data Center
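The three table copies on the Consistency slide can be compared mechanically. The following is a minimal Python sketch, assuming each copy is reduced to a name-to-phone mapping; the `divergent_keys` helper is hypothetical and illustrates only how disagreement between replicas can be detected, not how any particular database reconciles it.

```python
# Each replica is reduced to a name -> phone mapping taken from the slide.
replicas = [
    {"Bob": "555-1212", "Jenny": "555-1213", "Sally": "555-1214", "Joe": "555-1215"},
    {"Bob": "555-1212", "Jenny": "867-5309", "Sally": "555-1214", "Joe": "555-1215"},
    {"Bob": "555-1212", "Jenny": "555-1213", "Sally": "555-1214", "Joe": "555-1215"},
]


def divergent_keys(copies):
    """Return the sorted keys whose values are not identical in every copy."""
    all_keys = set().union(*copies)  # union of the key sets of all replicas
    return sorted(
        key for key in all_keys
        if len({copy.get(key) for copy in copies}) > 1
    )


# Keys on which at least two replicas disagree.
conflicts = divergent_keys(replicas)
print(conflicts)
```

An eventually consistent store tolerates this kind of temporary disagreement between data centers, while a highly consistent store aims to prevent readers from observing it.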
  7. Consistency
       Geographically Distributed, Eventually Consistent: Dynamo, Riak, Voldemort, Cassandra, MongoDB, CouchDB
       Single Data Center, Highly Consistent: BigTable, HBase, Accumulo, Cassandra, Neo4j, OrientDB, MongoDB, MarkLogic (XML)
  8. Programmability
       Diagram labels: SQL, Objects, DB  VS  Objects, DB
  9. Programmability
       Diagram: Web Client (JavaScript) ↔ JSON ↔ Node.js server (JavaScript) ↔ JSON ↔ MongoDB
  10. Analytics
  11. Analytics
       Diagram: Business Activity → Operational DB (×3; updates, transactions) → Analytical DB (denormalized, aggregations) → Business Intelligence
  12. Analytics
       Diagram: Business Activity → OLTP (×3) → ETL (schema knowledge; joins happen here) → OLAP → Business Intelligence
  13. Analytics
       Diagram: Business Activity → OLTP (×3) → ? → NoSQL DB → Business Intelligence
  14. NoSQL and Analytics
       • Importing operational data can create a scale problem
       • Combining operational data can create sparse data
       • Operational schemas may change
  15. NoSQL and Analytics
       Scalability, Schema Flexibility
  16. Full Outer Join
       Cust.name  Cust.age  Orders.shoes  Facebook.likes  …
       Bob        43        $50           -               …
       Sarah      32        $25           5/5/14          …
       Sally      28        -             4/3/12          …
       -          -         $35           11/1/13         …
       -          -         -             9/24/12         …
       Joe        45        $45           -               …
       …          …         …             …               …
       Billions of rows, thousands of columns, sparse
  17. BigTable Data Model
       Row ID  Column          Value
       R000    Cust.name       Bob
       R000    Cust.age        43
       R000    Orders.shoes    $50
       R002    Cust.name       Sally
       R002    Cust.age        32
       R002    Facebook.likes  4/3/12
       …       …               …
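The row ID / column / value layout can be produced from a wide, sparse join result by emitting one entry per non-empty cell. Below is a minimal Python sketch of that idea, assuming the first three rows of the Full Outer Join slide as input. The `to_triples` helper, the `R000`-style row-ID scheme, and the use of `None` for the slide's "-" cells are illustrative assumptions, and nothing here is the API of BigTable, HBase, or Accumulo.

```python
# Wide, sparse rows taken from the "Full Outer Join" slide.
# None stands in for an empty cell (shown as "-" on the slide).
wide_rows = [
    {"Cust.name": "Bob", "Cust.age": 43, "Orders.shoes": "$50", "Facebook.likes": None},
    {"Cust.name": "Sarah", "Cust.age": 32, "Orders.shoes": "$25", "Facebook.likes": "5/5/14"},
    {"Cust.name": "Sally", "Cust.age": 28, "Orders.shoes": None, "Facebook.likes": "4/3/12"},
]


def to_triples(rows):
    """Flatten wide rows into (row_id, column, value) triples, skipping empty cells."""
    triples = []
    for index, row in enumerate(rows):
        row_id = f"R{index:03d}"  # hypothetical row-ID scheme: R000, R001, ...
        for column, value in row.items():
            if value is not None:  # empty cells are simply not stored
                triples.append((row_id, column, value))
    return triples


triples = to_triples(wide_rows)
for row_id, column, value in triples:
    print(row_id, column, value)
```

Because empty cells produce no entry at all, storage grows with the number of populated cells rather than with rows times columns, which is what makes a table with thousands of mostly empty columns practical.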
  18. MongoDB Data Model
       {
         Cust.name: “Bob”,
         Cust.age: 43,
         Orders.shoes: $50
       },
       {
         Cust.name: “Sally”,
         Cust.age: 32,
         Facebook.likes: 4/3/12
       },
       …
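In a document model, each record carries only the fields it actually has, so the set of fields has to be discovered from the data rather than declared up front. The sketch below uses plain Python dictionaries as stand-ins for the two documents on the slide; it does not call MongoDB. The `discover_fields` helper is hypothetical, and storing `$50` and `4/3/12` as strings is an assumption.

```python
import json
from collections import Counter

# The two documents from the slide, written as Python dicts.
documents = [
    {"Cust.name": "Bob", "Cust.age": 43, "Orders.shoes": "$50"},
    {"Cust.name": "Sally", "Cust.age": 32, "Facebook.likes": "4/3/12"},
]


def discover_fields(docs):
    """Count how many documents contain each field name."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.keys())
    return counts


field_counts = discover_fields(documents)

# Each document serializes to JSON on its own, with no shared schema.
print(json.dumps(documents[0]))
print(dict(field_counts))
```

Fields that appear in only some documents (here `Orders.shoes` and `Facebook.likes`) show up with lower counts, which gives a rough picture of how sparse the combined data is.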
  19. NoSQL Data Loading Shift
       NoSQL Analytics: Composite, Sparse Schemas; Scale out; Aggressive Indexing; Data Discovery
       Conventional BI: Data cleaning; Regularization; Denormalization; Star Schema; Known operational Schemas
  20. Analytics
       Diagram: Business Activity → OLTP (×3) → NoSQL DB (schema discovery; joins happen here) → Business Intelligence
  21. NoSQL Analytics Shift
       Transformations: MapReduce; Pre-computed; Large answers; Simple Lookups
       Queries: SQL; Computed on the fly; Small answers; Roll up; Drill down
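The contrast between pre-computed transformations and on-the-fly queries can be illustrated with a toy map/shuffle/reduce pipeline. This is a single-process Python sketch, not Hadoop MapReduce. The order records and the `map_phase`, `shuffle`, and `reduce_phase` functions are hypothetical, chosen only to show the aggregate being computed once so that later reads are simple lookups.

```python
from collections import defaultdict

# Hypothetical order records: (customer, amount in dollars).
orders = [("Bob", 50), ("Sarah", 25), ("Joe", 45), ("Bob", 20)]


def map_phase(records):
    """Emit one (key, value) pair per input record."""
    for customer, amount in records:
        yield customer, amount


def shuffle(pairs):
    """Group emitted values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's values into a single pre-computed answer."""
    return {key: sum(values) for key, values in groups.items()}


# The transformation runs once, over all the data, ahead of time.
precomputed = reduce_phase(shuffle(map_phase(orders)))

# Afterwards, answering a question is a simple key lookup.
total_for_bob = precomputed["Bob"]
print(total_for_bob)
```

The trade is that the full result set is materialized up front (a "large answer"), in exchange for reads that need no aggregation at query time.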
  22. Analytics
       Diagram: Business Activity → OLTP (×3) → NoSQL DB → Business Intelligence
       Labels on the NoSQL DB: MapReduce Transformations, Fast Lookups
  23. MapReduce Analytics
       Supported: SQL (Hive), Statistical Modeling, Machine Learning, Text Analytics, Feature Extraction, Image Processing, Graph Analysis
  24. MapReduce Analytic Workflow
       Diagram labels: Reusable Transforms, Searchable Collections
  25. Combined-Data Security
       Requirements: Physically co-located data; Strong logical access control; Role-based
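When data from several sources is physically stored together, access has to be enforced logically at read time. The following Python sketch shows one way role-based filtering could look, assuming each stored cell carries the single role required to read it. The cell layout, the role names, and the `visible_cells` helper are all hypothetical and do not represent the security API of any specific database.

```python
# Hypothetical co-located cells: (row_id, column, value, required_role).
cells = [
    ("R001", "Cust.name", "Sarah", "sales"),
    ("R001", "Orders.shoes", "$25", "sales"),
    ("R001", "Facebook.likes", "5/5/14", "marketing"),
]


def visible_cells(stored_cells, user_roles):
    """Return only the cells whose required role is held by the reader."""
    return [
        (row_id, column, value)
        for row_id, column, value, required_role in stored_cells
        if required_role in user_roles
    ]


sales_view = visible_cells(cells, {"sales"})
full_view = visible_cells(cells, {"sales", "marketing"})
print(sales_view)
print(full_view)
```

Filtering per cell rather than per table lets one physical store serve readers with different entitlements without keeping separate copies of the data.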
  26. Questions?
  27. Contact Info
       Aaron Cordova
       1-855-403-1399
       www.koverse.com
       info@koverse.com