© 2015 MapR Technologies© 2016 MapR Technologies1
Real-World NoSQL Schema Design
Tugdual Grall
April 13, 2016
{"about" : "me"}
Tugdual “Tug” Grall
• MapR
• Technical Evangelist
• MongoDB
• Technical Evangelist
• Couchbase
• Technical Evangelist
• eXo
• CTO
• Oracle
• Developer/Product Manager
• Mainly Java/SOA
• Developer in consulting firms
• Web
• @tgrall
• http://tgrall.github.io
• tgrall

• NantesJUG co-founder

• Pet Project :
• http://www.resultri.com
• tug@mapr.com
• tugdual@gmail.com
Database Schema Design
“ERD” - Logical
For RDBMS:
Entities => Tables
Attributes => Columns
Relationships => Foreign Keys
Many-to-Many => Junction Table
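The mapping above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the artist, label and artist_label tables and the sample rows are assumptions chosen for the sketch, not a schema from the talk.

```python
import sqlite3

# In-memory database so the sketch has no side effects.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Entities => tables, attributes => columns
    CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE label  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Many-to-many => junction table; relationships => foreign keys
    CREATE TABLE artist_label (
        artist_id INTEGER NOT NULL REFERENCES artist(id),
        label_id  INTEGER NOT NULL REFERENCES label(id),
        PRIMARY KEY (artist_id, label_id)
    );
""")

# Illustrative sample rows.
conn.execute("INSERT INTO artist VALUES (1, 'Nirvana')")
conn.executemany("INSERT INTO label VALUES (?, ?)", [(1, "Sub Pop"), (2, "DGC")])
conn.executemany("INSERT INTO artist_label VALUES (?, ?)", [(1, 1), (1, 2)])

# Reading the relationship back requires a join through the junction table.
labels = [row[0] for row in conn.execute(
    "SELECT l.name FROM label l "
    "JOIN artist_label al ON al.label_id = l.id "
    "WHERE al.artist_id = ? ORDER BY l.name", (1,))]
```

Even this small many-to-many relationship needs three tables and a join to read back, which is the cost the later slides set out to reduce.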
Data is Doubling Every Two Years
[Chart: total data stored, 1980 to 2020, split into structured data and semi-structured data]
Unstructured data will account for more than 80% of the data collected by organizations
Source: Human-Computer Interaction & Knowledge Discovery in Complex Unstructured, Big Data
Big Datastore
Distributed File System
HDFS/MapR-FS
NoSQL Database
HBase/MapR-DB
Store data as File or Row?
HDFS / MapR-FS
• Data stored as “files”
• Fast for large scans
• Slow random reads/writes
HBase/MapR-DB
• Data stored as rows/documents
• Fast random reads/writes
NoSQL Database
Contrast Relational and HBase-Style NoSQL
Relational
• Rows containing fields
• Fields contain primitive types
• Structure is fixed and uniform
• Structure is pre-defined
• Referential integrity (optional)
• Expressions over sets of rows
HBase / MapR-DB
• Rows contain fields
• Fields contain bytes
• Structure is flexible
• No pre-defined structure
• Single key
• Column families
• Timestamps
• Versions
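A toy in-memory model may help picture the HBase-style side of this contrast: a single row key, declared column families, free-form column qualifiers, byte values, and timestamped versions. The TinyTable class below is a hypothetical teaching aid, not the HBase or MapR-DB API.

```python
from collections import defaultdict


class TinyTable:
    """Toy model of an HBase-style table (illustration only)."""

    def __init__(self, families):
        # Column families are the only structure declared up front.
        self.families = set(families)
        # row key -> (family, qualifier) -> {timestamp: bytes}
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, qualifier, value, ts):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        if not isinstance(value, bytes):
            raise TypeError("cells hold raw bytes")
        # Qualifiers are created on first write; old versions are kept.
        self.rows[row_key][(family, qualifier)][ts] = value

    def get(self, row_key, family, qualifier, versions=1):
        cell = self.rows[row_key][(family, qualifier)]
        newest_first = sorted(cell.items(), reverse=True)
        return newest_first[:versions]


table = TinyTable(families=["default"])
table.put("nirvana", "default", "name", b"nirvana", ts=1)
table.put("nirvana", "default", "name", b"Nirvana", ts=2)
latest = table.get("nirvana", "default", "name")
history = table.get("nirvana", "default", "name", versions=2)
```

Reads return the newest version by default, while older versions stay available under their timestamps.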
Mix Models for Databases
• Allow complex objects in field values
• JSON style lists and objects
• Allow references to objects via join
• Includes references localized within lists
• Lists of objects and objects of lists are isomorphic to tables so …
• Complex data in tables,
• But also tables in complex data,
• Even tables containing complex data containing tables
A Catalog of NoSQL Idioms
Tables as Objects, Objects as Tables
Row-wise form (a table with columns c1, c2, c3) is a list of objects:
[ { c1:v1, c2:v2, c3:v3 },
  { c1:v1, c2:v2, c3:v3 },
  { c1:v1, c2:v2, c3:v3 } ]
Column-wise form (the same table) is an object containing lists:
{ c1:[v1, v2, v3],
  c2:[v1, v2, v3],
  c3:[v1, v2, v3] }
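The equivalence between the two forms can be sketched in a few lines of Python. The helper names to_columns and to_rows are made up for this example, and both assume a regular table (every row has the same keys, every list has the same length).

```python
def to_columns(rows):
    """List of objects -> object containing lists."""
    if not rows:
        return {}
    return {key: [row[key] for row in rows] for key in rows[0]}


def to_rows(columns):
    """Object containing lists -> list of objects."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


rows = [
    {"c1": 1, "c2": "a", "c3": True},
    {"c1": 2, "c2": "b", "c3": False},
    {"c1": 3, "c2": "c", "c3": True},
]
columns = to_columns(rows)
round_trip = to_rows(columns)
```

Converting one way and back returns the original rows, which is what lets a table be stored inside a field value and vice versa.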
A first example:

Time-series data
Column names as data
• When column names are not pre-defined, they can convey information
• Examples
• Time offsets within a window for time series
• Top-level domains for web crawlers
• Vendor IDs for customer purchase profiles
• A predefined schema is impossible with this idiom
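As a rough sketch of the time-series case, the column name below is the sample's offset within its window, so the name itself is data. The one-hour window, the key format and the metric name are assumptions for illustration, and a plain dict stands in for the wide-column table.

```python
WINDOW_SECONDS = 3600  # assumed window size: one row per metric per hour


def cell_for(metric, epoch_seconds, value):
    """Return (row_key, column_qualifier, value) for one sample."""
    window_start = epoch_seconds - (epoch_seconds % WINDOW_SECONDS)
    offset = epoch_seconds - window_start  # the column name carries the offset
    row_key = f"{metric}:{window_start}"
    return row_key, str(offset), value


table = {}  # row_key -> {column qualifier -> value}
samples = [(7200, 20.5), (7260, 20.7), (10800, 21.0)]  # toy epoch seconds
for ts, temperature in samples:
    row_key, qualifier, value = cell_for("sensor42.temp", ts, temperature)
    table.setdefault(row_key, {})[qualifier] = value
```

The set of column names depends on when samples arrive, so it cannot be listed in advance.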
Relational Model for Time-Series
NoSQL Table Design: Point-by-Point
Table Design: Hybrid Point-by-Point + Sub-table
After the window closes, the data in the row is restated as a column-oriented tabular value in a different column family.
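One way to picture the restating step is the sketch below. The column family names "points" and "packed", the "window" qualifier and the JSON encoding are all assumptions for illustration; the slide does not specify them.

```python
import json


def close_window(row):
    """Restate point-by-point cells as one column-wise value, then drop the points.

    `row` maps column family -> {qualifier -> value}.
    """
    points = row.get("points", {})
    offsets = sorted(points, key=int)  # qualifiers are time offsets stored as text
    packed = {
        "offset": [int(o) for o in offsets],
        "value": [points[o] for o in offsets],
    }
    row.setdefault("packed", {})["window"] = json.dumps(packed)
    row["points"] = {}  # the many small cells are no longer needed
    return row


row = {"points": {"60": 20.7, "0": 20.5, "120": 20.6}}
close_window(row)
restated = json.loads(row["packed"]["window"])
```

Writes stay cheap while the window is open (one small cell per sample), and later reads fetch a single compact value.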
A second example:
Music Application
https://musicbrainz.org/
MusicBrainz on NoSQL
• Artists, albums, tracks and labels are key objects
• Reality check:
• Add works (compositions), recordings, releases, release groups
• 7 tables for artist alone
• 12 for place, 7 for label, 17 for release/group, 8 for work
• (but only 4 for recording!)
• Total of 12 + 7 + 17 + 8 + 4 = 48 tables
• But wait, there’s more!
• 10 annotation tables, 10 edit tables, 19 tag tables, 5 rating tables, 86 link
tables, 5 cover art tables and 3 tables for CD timing info (138 total)
• And 50 more tables that aren’t documented yet
180 tables
not shown
236 tables

to describe 7 kinds of things
Can we do better?
artist
id
gid
name
sort_name
begin_date
end_date
ended
type
gender
area
begin_area
end_area
comment
list<ipi>
list<isni>
list<alias>
list<release_id>
list<recording_id>
{id, recording_id, name, list<credit>, length}
recording
id
gid
list<credit>
name
list<track_ref>
{id, format, name, list<track>}
release_group
id
gid
name
list<credit>
type
list<release_id>
27 tables reduce to 4
so far
Further Reductions
• All 86 link tables become properties on artists, releases and other
entities
• All 44 tag, rating and annotation tables become list properties
• All 5 cover art tables become lists of file references
• Current score: 162 tables become 4
• You get the idea
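The reduction can be sketched as folding child-table rows into list properties on the parent object. The embed helper, the table shapes and the sample values (including the placeholder alias) are made up for this example and are not the MusicBrainz schema.

```python
from collections import defaultdict

# Illustrative rows, as they might come out of three relational tables.
artist_rows = [{"id": 1, "name": "Nirvana", "sort_name": "Nirvana"}]
alias_rows = [{"artist_id": 1, "alias": "alias-a"}]
tag_rows = [{"artist_id": 1, "tag": "grunge"}, {"artist_id": 1, "tag": "rock"}]


def embed(parents, child_rows, foreign_key, value_field, list_name):
    """Attach each child row's value to its parent as a list property."""
    grouped = defaultdict(list)
    for child in child_rows:
        grouped[child[foreign_key]].append(child[value_field])
    for parent in parents:
        parent[list_name] = grouped.get(parent["id"], [])
    return parents


artists = [dict(row) for row in artist_rows]
embed(artists, alias_rows, "artist_id", "alias", "aliases")
embed(artists, tag_rows, "artist_id", "tag", "tags")
```

After folding, one read of the artist returns its aliases and tags with no join.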
Artist in HBase/MapR-DB
get '/apps/db/music/artist_hbase', 'nirvana'
COLUMN CELL
default:begin_date timestamp=1460500945476, value=1988-01-01
default:end_date timestamp=1460500945509, value=1994-04-05
default:ended timestamp=1460500945538, value=true
default:name timestamp=1460500945438, value=Nirvana
list:albums timestamp=1460500945578, value=[
{"title":"In Utero", "released" : "1993-09-21"},
{"title":"Nevermind", "released" : "1991-09-24"},
{"title":"Bleach", "released" : "1989-06-15"}]
NoSQL Data Model
Document databases
• Ease of use
• Developer friendly
• Flexible
Scalable databases
• Scalable, parallel
• Binary API
• Performant
Scalable
databases
Document
databases
MapR JSON DB
or
OJAI + other DB
JSON Document : Flexible Schema
• Supports data types
• Complex Data Structure
• Developer Friendly
• http://json.org
• Flexible
• Easy to evolve
• Start right, stay right
{
  "_id" : "001-003-ABC",
  "first_name" : "John",
  "last_name" : "Doe",
  "email" : "jdoe@doe.com",
  "dob" : "1970-04-23",
  "points" : 55000,
  "interests" : ["sports", "movies"],
  "address" : {
    "street" : "1212 Maple Street",
    "city" : "San Jose",
    "state" : "CA",
    "zip" : "95101"
  },
  "sessions" : [
    {"id":"CoD4","ts":1439824477},
    {"id":"fifa14","ts":1439565276}
  ]
}
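To show what typed, nested values look like to application code, here is a small sketch that parses a trimmed copy of the document with Python's standard json module. The variable names are illustrative only.

```python
import json

profile = json.loads("""
{
  "_id": "001-003-ABC",
  "points": 55000,
  "interests": ["sports", "movies"],
  "address": {"city": "San Jose", "state": "CA"},
  "sessions": [
    {"id": "CoD4", "ts": 1439824477},
    {"id": "fifa14", "ts": 1439565276}
  ]
}
""")

# Nested document: reached with plain key access.
city = profile["address"]["city"]

# Array of documents: pick the most recent session by timestamp.
latest_session = max(profile["sessions"], key=lambda s: s["ts"])

# Each attribute keeps its own type after parsing.
kinds = {name: type(value).__name__ for name, value in profile.items()}
```

No schema was declared anywhere: the structure and the types travel with the document itself.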
JSON Document : Flexible Schema
{
  "_id" : "001-003-ABC",
  "first_name" : "John",
  "last_name" : "Doe",
  "email" : "jdoe@doe.com",
  "dob" : "1970-04-23",
  "points" : 55000,
  "interests" : ["sports", "movies"],
  "address" : {
    "street" : "1212 Maple Street",
    "city" : "San Jose",
    "state" : "CA",
    "zip" : "95101"
  },
  "sessions" : [
    {"id":"CoD4","ts":1439824477},
    {"id":"fifa14","ts":1439565276}
  ]
}
Annotations:
• Attributes: the field names ("_id", "first_name", "last_name" and so on)
• Types: String ("first_name"), Numbers ("points"), Arrays ("interests")
• Nested Doc: "address"
• Arrays of Doc: "sessions"
NoSQL: use cases
Key Value
• Session Management
• User Profile/Preferences
• Shopping Cart
Document
• Event Logging
• Content Management
• Web Analytics
• Product Catalog
• Single View
• e-Commerce
Wide Column
• Event Logging
• Content Management
• Counters
Graph
• Social Network
• Routing/Dispatch
• Recommendation on Social Graph
MapR-DB JSON
MapR-DB HBase
Flexible Schema in Action
Product Catalog - RDBMS
SELECT * FROM (
  SELECT
    ce.sku,
    ea.attribute_id,
    ea.attribute_code,
    CASE ea.backend_type
      WHEN 'varchar' THEN ce_varchar.value
      WHEN 'int' THEN ce_int.value
      WHEN 'text' THEN ce_text.value
      WHEN 'decimal' THEN ce_decimal.value
      WHEN 'datetime' THEN ce_datetime.value
      ELSE ea.backend_type
    END AS value,
    ea.is_required AS required
  FROM catalog_product_entity AS ce
  LEFT JOIN eav_attribute AS ea
    ON ce.entity_type_id = ea.entity_type_id
  LEFT JOIN catalog_product_entity_varchar AS ce_varchar
    ON ce.entity_id = ce_varchar.entity_id
    AND ea.attribute_id = ce_varchar.attribute_id
    AND ea.backend_type = 'varchar'
  LEFT JOIN catalog_product_entity_int AS ce_int
    ON ce.entity_id = ce_int.entity_id
    AND ea.attribute_id = ce_int.attribute_id
    AND ea.backend_type = 'int'
  LEFT JOIN catalog_product_entity_text AS ce_text
    ON ce.entity_id = ce_text.entity_id
    AND ea.attribute_id = ce_text.attribute_id
    AND ea.backend_type = 'text'
  LEFT JOIN catalog_product_entity_decimal AS ce_decimal
    ON ce.entity_id = ce_decimal.entity_id
    AND ea.attribute_id = ce_decimal.attribute_id
    AND ea.backend_type = 'decimal'
  LEFT JOIN catalog_product_entity_datetime AS ce_datetime
    ON ce.entity_id = ce_datetime.entity_id
    AND ea.attribute_id = ce_datetime.attribute_id
    AND ea.backend_type = 'datetime'
  WHERE ce.sku = 'rp-prod132546'
) AS tab
WHERE tab.value != '';
“Entity-Attribute-Value” (EAV) pattern: the query needed to get a single product
Product Catalog - NoSQL/Document
Store the product “as a business object”
{
  "_id" : "rp-prod132546",
  "name" : "Marvel T2 Athena",
  "brand" : "Pinarello",
  "category" : "bike",
  "type" : "Road Bike",
  "price" : 2949.99,
  "size" : "55cm",
  "wheel_size" : "700c",
  "frameset" : {
    "frame" : "Carbon Toryaca",
    "fork" : "Onda 2V C"
  },
  "groupset" : {
    "chainset" : "Camp. Athena 50/34",
    "brake" : "Camp."
  },
  "wheelset" : {
    "wheels" : "Camp. Zonda",
    "tyres" : "Vittoria Pro"
  }
}
To get a single product:
products.findById("rp-prod132546")
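The single-lookup idea can be sketched with a toy collection keyed by "_id". The Collection class and its find_by_id method are hypothetical stand-ins written for this example; they are not the API of MapR-DB, OJAI or any other driver.

```python
class Collection:
    """Toy document collection keyed by "_id" (illustration only)."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc):
        self._docs[doc["_id"]] = doc

    def find_by_id(self, doc_id):
        # One key lookup returns the whole business object, nested parts included.
        return self._docs.get(doc_id)


products = Collection()
products.insert({
    "_id": "rp-prod132546",
    "name": "Marvel T2 Athena",
    "brand": "Pinarello",
    "price": 2949.99,
    "frameset": {"frame": "Carbon Toryaca", "fork": "Onda 2V C"},
})

bike = products.find_by_id("rp-prod132546")
missing = products.find_by_id("no-such-product")
```

Compared with the EAV query, there is nothing to join or pivot: the stored shape is already the shape the application wants.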
Easy variation with Documents
{
  "_id" : "rp-prod132546",
  "name" : "Marvel T2 Athena",
  "brand" : "Pinarello",
  "category" : "bike",
  "type" : "Road Bike",
  "price" : 2949.99,
  "size" : "55cm",
  "wheel_size" : "700c",
  "frameset" : {
    "frame" : "Carbon Toryaca",
    "fork" : "Onda 2V C"
  },
  "groupset" : {
    "chainset" : "Camp. Athena 50/34",
    "brake" : "Camp."
  },
  "wheelset" : {
    "wheels" : "Camp. Zonda",
    "tyres" : "Vittoria Pro"
  }
}
{
  "_id" : "rp-prod106702",
  "name" : "Ultegra SPD-SL 6800",
  "brand" : "Shimano",
  "category" : "pedals",
  "type" : "Components",
  "price" : 112.99,
  "features" : [
    "Low profile design increases ...",
    "Supplied with floating SH11 cleats",
    "Weight: 260g (pair)"
  ]
}
{
  "_id" : "rp-prod113104",
  "name" : "Bianchi Pride Jersey SS15",
  "brand" : "Nalini",
  "category" : "Jersey",
  "type" : "Clothing",
  "price" : 76.99,
  "features" : [
    "100% Polyester",
    "3/4 hidden zip",
    "3 rear pocket"
  ],
  "color" : "black"
}
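Application code can work across documents of different shapes by treating missing fields as optional. The sketch below uses trimmed copies of the three products; the helper name field_names and the "n/a" default are assumptions for illustration.

```python
catalog = [
    {"_id": "rp-prod132546", "category": "bike", "price": 2949.99, "size": "55cm"},
    {"_id": "rp-prod106702", "category": "pedals", "price": 112.99,
     "features": ["Weight: 260g (pair)"]},
    {"_id": "rp-prod113104", "category": "Jersey", "price": 76.99,
     "features": ["100% Polyester"], "color": "black"},
]


def field_names(docs):
    """Union of top-level field names across all documents."""
    return sorted({name for doc in docs for name in doc})


# Fields shared by every product can be queried uniformly.
under_200 = [doc["_id"] for doc in catalog if doc["price"] < 200]

# Fields that only some products carry are read with a default.
with_features = [doc["_id"] for doc in catalog if "features" in doc]
colors = [doc.get("color", "n/a") for doc in catalog]
```

Each product carries only the attributes that make sense for it, with no empty columns and no per-type attribute tables.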
Back-end data
matches
front-end expectations
Add features easily
• Requirement: “users can vote and comment on products”
Before:
{
  "_id" : "rp-prod113104",
  "name" : "Bianchi Pride Jersey SS15",
  "brand" : "Nalini",
  "category" : "Jersey",
  "type" : "Clothing",
  "price" : 76.99,
  "features" : [
    "100% Polyester",
    "3/4 hidden zip",
    "3 rear pocket"
  ],
  "color" : "black"
}
After:
{
  "_id" : "rp-prod113104",
  "name" : "Bianchi Pride Jersey SS15",
  "brand" : "Nalini",
  "category" : "Jersey",
  "type" : "Clothing",
  "price" : 76.99,
  "features" : [
    "100% Polyester",
    "3/4 hidden zip",
    "3 rear pocket"
  ],
  "color" : "black",
  "comments" : [{…}, {…}],
  "ratings" : [{…, "value" : 5}, {…, "value" : 3}],
  "rating" : 4
}
Done: just store the data!
Means less time to market
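A minimal sketch of the new feature on the application side: ratings and comments are appended to the product document, and the average is stored with it. The "user" and "text" field names are assumptions, since the slide elides the contents of those objects.

```python
def add_rating(product, user, value):
    """Append a rating and refresh the stored average."""
    product.setdefault("ratings", []).append({"user": user, "value": value})
    values = [rating["value"] for rating in product["ratings"]]
    product["rating"] = sum(values) / len(values)
    return product


def add_comment(product, user, text):
    """Append a comment; the list is created on first use."""
    product.setdefault("comments", []).append({"user": user, "text": text})
    return product


jersey = {"_id": "rp-prod113104", "name": "Bianchi Pride Jersey SS15", "price": 76.99}
add_rating(jersey, "user-1", 5)
add_rating(jersey, "user-2", 3)
add_comment(jersey, "user-1", "Fits well")
```

Products that nobody has rated stay untouched, so no migration of existing documents is needed.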
JSON documents make

data modeling easy
A look at an artist
find '/apps/db/music/artist_json', 'nirvana'
{
  "_id" : "nirvana",
  "name" : "Nirvana",
  "begin_date" : "1988-01-01",
  "end_date" : "1994-04-05",
  "ended" : true,
  "albums" : [
    {"title":"In Utero", "released" : "1993-09-21"},
    {"title":"Nevermind", "released" : "1991-09-24"},
    {"title":"Bleach", "released" : "1989-06-15"}
  ]
}
Artist in HBase/MapR-DB
get '/apps/db/music/artist_hbase', 'nirvana'
COLUMN CELL
default:begin_date timestamp=1460500945476, value=1988-01-01
default:end_date timestamp=1460500945509, value=1994-04-05
default:ended timestamp=1460500945538, value=true
default:name timestamp=1460500945438, value=Nirvana
list:albums timestamp=1460500945578, value=[
{"title":"In Utero", "released" : "1993-09-21"},
{"title":"Nevermind", "released" : "1991-09-24"},
{"title":"Bleach", "released" : "1989-06-15"}]
How does it work?
• MapR-DB JSON inherits almost all aspects of MapR-DB
– Column families (but now defined on the fly)
– Column-level security
• MapR-DB JSON
– does NOT store the JSON document as a “string”
– stores “real” data types (not JavaScript types)
• OJAI (open source) can add similar capabilities to other databases
• http://ojai.io/ (/ OH-hy /)
Developers love JSON because…
• The API is really easy to use:
– you deal with your business objects
• All the operations needed to manipulate fields and structure
– Add/remove fields
– Update values, including sub-documents and arrays
– Increments
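The operations listed above can be sketched on a plain Python dict, addressing fields with dotted paths. The helper names (set_field, delete_field, increment, append) are invented for this example and do not mirror the OJAI or MapR-DB mutation API.

```python
def _parent(doc, path):
    """Walk a dotted path, creating sub-documents as needed.

    Returns the container that holds the last key, and that key.
    """
    *parents, last = path.split(".")
    for key in parents:
        doc = doc.setdefault(key, {})
    return doc, last


def set_field(doc, path, value):
    container, key = _parent(doc, path)
    container[key] = value


def delete_field(doc, path):
    container, key = _parent(doc, path)
    container.pop(key, None)


def increment(doc, path, amount=1):
    container, key = _parent(doc, path)
    container[key] = container.get(key, 0) + amount


def append(doc, path, item):
    container, key = _parent(doc, path)
    container.setdefault(key, []).append(item)


user = {"_id": "001-003-ABC", "points": 55000, "address": {"city": "San Jose"}}
set_field(user, "address.zip", "95101")   # add a field inside a sub-document
increment(user, "points", 500)            # increment a numeric value
append(user, "interests", "music")        # add to an array, creating it if absent
delete_field(user, "address.city")        # remove a field
```

Each call touches one field and leaves the rest of the document alone, which is the kind of targeted update the slide describes.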
Why is the developer so happy?
• Can build a quick and dirty MVP that evolves
• Can update the application a little bit at a time
• Can tune performance later using column families
Why is the developer so happy?
• Expressive: lets you say what you need to say
– Developers hate circumlocution … say what you mean, don’t repeat yourself
• Efficient (remember how much easier it was to get one product?)
– Simple designs run better because you can get them right
• Human readable: you can introspect
SQL on NoSQL
HBase Columnar
select
  convert_from(t.row_key, 'UTF8') id,
  convert_from(t.`default`.`name`, 'UTF8') name,
  convert_from(convert_from(t.list.albums, 'UTF8'), 'JSON') albums
from hbase.`/apps/db/music/artist_hbase` t
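The conversions are needed because HBase cells hold raw bytes. As a rough Python analogy (not how the SQL engine is implemented), the two nested conversion steps amount to decoding bytes as UTF-8 text and then parsing that text as JSON. The row below is a trimmed, illustrative copy of the artist row.

```python
import json


def from_utf8(raw):
    """Bytes -> text, the first conversion step in the query."""
    return raw.decode("utf-8")


def from_json(text):
    """Text -> structured value, the second conversion step."""
    return json.loads(text)


# Cells as a wide-column store would hold them: raw bytes everywhere.
hbase_row = {
    "row_key": b"nirvana",
    ("default", "name"): b"Nirvana",
    ("list", "albums"): b'[{"title": "Nevermind", "released": "1991-09-24"}]',
}

result = {
    "id": from_utf8(hbase_row["row_key"]),
    "name": from_utf8(hbase_row[("default", "name")]),
    "albums": from_json(from_utf8(hbase_row[("list", "albums")])),
}
```

With a document database the values are already typed, so none of these conversions appear in the query.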
Document Database (MapR-DB JSON)
select
t._id,
t.name,
t.albums
from maprdb.`/apps/db/music/artist_json` t
Summary
Some Tips
• Carefully select rowkey/id & keys
• Design for “your application”
• Design for “questions” not “answers” (Highly Scalable Blog)
• Rows are updated atomically
• Use pre-aggregation
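Two of these tips, row key selection and pre-aggregation, can be sketched together. The key format "page#day", the "total" suffix and the sample page views are assumptions for illustration; a Counter stands in for a table of counters.

```python
from collections import Counter

page_views = Counter()  # row key -> count


def row_key(page, day):
    """Composite key: 'views of this page on this day' becomes a single read."""
    return f"{page}#{day}"


def record_view(page, timestamp):
    day = timestamp[:10]  # "YYYY-MM-DD" prefix of an ISO timestamp
    # Aggregate at write time so reads never scan raw events.
    page_views[row_key(page, day)] += 1
    page_views[row_key(page, "total")] += 1


events = [
    ("/home", "2016-04-13T09:00:00"),
    ("/home", "2016-04-13T17:30:00"),
    ("/home", "2016-04-14T08:15:00"),
]
for page, ts in events:
    record_view(page, ts)
```

The key is designed around the question the application will ask, and the answer is maintained as the data arrives.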
Q&A
@tgrall maprtech
tug@mapr.com
Engage with us!
MapR
maprtech
mapr-technologies

More Related Content

What's hot

Hexagonal Architecture - PHP Barcelona Monthly Talk (DDD)
Hexagonal Architecture - PHP Barcelona Monthly Talk (DDD)Hexagonal Architecture - PHP Barcelona Monthly Talk (DDD)
Hexagonal Architecture - PHP Barcelona Monthly Talk (DDD)Carlos Buenosvinos
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBMike Dirolf
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using KafkaKnoldus Inc.
 
JDBC - JPA - Spring Data
JDBC - JPA - Spring DataJDBC - JPA - Spring Data
JDBC - JPA - Spring DataArturs Drozdovs
 
Event Driven Systems with Spring Boot, Spring Cloud Streams and Kafka
Event Driven Systems with Spring Boot, Spring Cloud Streams and KafkaEvent Driven Systems with Spring Boot, Spring Cloud Streams and Kafka
Event Driven Systems with Spring Boot, Spring Cloud Streams and KafkaVMware Tanzu
 
The Art of Metaprogramming in Java
The Art of Metaprogramming in Java  The Art of Metaprogramming in Java
The Art of Metaprogramming in Java Abdelmonaim Remani
 
Learning git
Learning gitLearning git
Learning gitSid Anand
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsJonas Bonér
 
Thousands of Threads and Blocking I/O
Thousands of Threads and Blocking I/OThousands of Threads and Blocking I/O
Thousands of Threads and Blocking I/OGeorge Cao
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & FeaturesDataStax Academy
 
Conhecendo Apache Kafka
Conhecendo Apache KafkaConhecendo Apache Kafka
Conhecendo Apache KafkaRafa Noronha
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013mumrah
 
Introduction to Design Patterns and Singleton
Introduction to Design Patterns and SingletonIntroduction to Design Patterns and Singleton
Introduction to Design Patterns and SingletonJonathan Simon
 
Storing 16 Bytes at Scale
Storing 16 Bytes at ScaleStoring 16 Bytes at Scale
Storing 16 Bytes at ScaleFabian Reinartz
 
Asynchronous API in Java8, how to use CompletableFuture
Asynchronous API in Java8, how to use CompletableFutureAsynchronous API in Java8, how to use CompletableFuture
Asynchronous API in Java8, how to use CompletableFutureJosé Paumard
 
Advanced Node.JS Meetup
Advanced Node.JS MeetupAdvanced Node.JS Meetup
Advanced Node.JS MeetupLINAGORA
 

What's hot (20)

Hexagonal Architecture - PHP Barcelona Monthly Talk (DDD)
Hexagonal Architecture - PHP Barcelona Monthly Talk (DDD)Hexagonal Architecture - PHP Barcelona Monthly Talk (DDD)
Hexagonal Architecture - PHP Barcelona Monthly Talk (DDD)
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Reactive Java (33rd Degree)
Reactive Java (33rd Degree)Reactive Java (33rd Degree)
Reactive Java (33rd Degree)
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
JDBC - JPA - Spring Data
JDBC - JPA - Spring DataJDBC - JPA - Spring Data
JDBC - JPA - Spring Data
 
Event Driven Systems with Spring Boot, Spring Cloud Streams and Kafka
Event Driven Systems with Spring Boot, Spring Cloud Streams and KafkaEvent Driven Systems with Spring Boot, Spring Cloud Streams and Kafka
Event Driven Systems with Spring Boot, Spring Cloud Streams and Kafka
 
The Art of Metaprogramming in Java
The Art of Metaprogramming in Java  The Art of Metaprogramming in Java
The Art of Metaprogramming in Java
 
Learning git
Learning gitLearning git
Learning git
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
 
Thousands of Threads and Blocking I/O
Thousands of Threads and Blocking I/OThousands of Threads and Blocking I/O
Thousands of Threads and Blocking I/O
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Conhecendo Apache Kafka
Conhecendo Apache KafkaConhecendo Apache Kafka
Conhecendo Apache Kafka
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
 
Introduction to Design Patterns and Singleton
Introduction to Design Patterns and SingletonIntroduction to Design Patterns and Singleton
Introduction to Design Patterns and Singleton
 
Storing 16 Bytes at Scale
Storing 16 Bytes at ScaleStoring 16 Bytes at Scale
Storing 16 Bytes at Scale
 
Java NIO.2
Java NIO.2Java NIO.2
Java NIO.2
 
JUnit 5
JUnit 5JUnit 5
JUnit 5
 
Asynchronous API in Java8, how to use CompletableFuture
Asynchronous API in Java8, how to use CompletableFutureAsynchronous API in Java8, how to use CompletableFuture
Asynchronous API in Java8, how to use CompletableFuture
 
Advanced Node.JS Meetup
Advanced Node.JS MeetupAdvanced Node.JS Meetup
Advanced Node.JS Meetup
 
Apache Zookeeper
Apache ZookeeperApache Zookeeper
Apache Zookeeper
 

Viewers also liked

MongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMike Friedman
 
Impala: A Modern, Open-Source SQL Engine for Hadoop
Impala: A Modern, Open-Source SQL Engine for HadoopImpala: A Modern, Open-Source SQL Engine for Hadoop
Impala: A Modern, Open-Source SQL Engine for HadoopAll Things Open
 
hive HBase Metastore - Improving Hive with a Big Data Metadata Storage
hive HBase Metastore - Improving Hive with a Big Data Metadata Storagehive HBase Metastore - Improving Hive with a Big Data Metadata Storage
hive HBase Metastore - Improving Hive with a Big Data Metadata StorageDataWorks Summit/Hadoop Summit
 
HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live W...
HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live W...HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live W...
HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live W...Cloudera, Inc.
 
HBaseCon 2012 | Developing Real Time Analytics Applications Using HBase in th...
HBaseCon 2012 | Developing Real Time Analytics Applications Using HBase in th...HBaseCon 2012 | Developing Real Time Analytics Applications Using HBase in th...
HBaseCon 2012 | Developing Real Time Analytics Applications Using HBase in th...Cloudera, Inc.
 
BIG Data & Hadoop Applications in E-Commerce
BIG Data & Hadoop Applications in E-CommerceBIG Data & Hadoop Applications in E-Commerce
BIG Data & Hadoop Applications in E-CommerceSkillspeed
 
HBaseCon 2012 | HBase, the Use Case in eBay Cassini
HBaseCon 2012 | HBase, the Use Case in eBay Cassini HBaseCon 2012 | HBase, the Use Case in eBay Cassini
HBaseCon 2012 | HBase, the Use Case in eBay Cassini Cloudera, Inc.
 
HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase
HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase
HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase Cloudera, Inc.
 
How we solved Real-time User Segmentation using HBase
How we solved Real-time User Segmentation using HBaseHow we solved Real-time User Segmentation using HBase
How we solved Real-time User Segmentation using HBaseDataWorks Summit
 
Magento scalability from the trenches (Meet Magento Sweden 2016)
Magento scalability from the trenches (Meet Magento Sweden 2016)Magento scalability from the trenches (Meet Magento Sweden 2016)
Magento scalability from the trenches (Meet Magento Sweden 2016)Divante
 
Surprising failure factors when implementing eCommerce and Omnichannel eBusiness
Surprising failure factors when implementing eCommerce and Omnichannel eBusinessSurprising failure factors when implementing eCommerce and Omnichannel eBusiness
Surprising failure factors when implementing eCommerce and Omnichannel eBusinessDivante
 
Omnichannel Customer Experience
Omnichannel Customer ExperienceOmnichannel Customer Experience
Omnichannel Customer ExperienceDivante
 
HBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
HBaseCon 2012 | HBase Schema Design - Ian Varley, SalesforceHBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
HBaseCon 2012 | HBase Schema Design - Ian Varley, SalesforceCloudera, Inc.
 

Viewers also liked (13)

MongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World Examples
 
Impala: A Modern, Open-Source SQL Engine for Hadoop
Impala: A Modern, Open-Source SQL Engine for HadoopImpala: A Modern, Open-Source SQL Engine for Hadoop
Impala: A Modern, Open-Source SQL Engine for Hadoop
 
hive HBase Metastore - Improving Hive with a Big Data Metadata Storage
hive HBase Metastore - Improving Hive with a Big Data Metadata Storagehive HBase Metastore - Improving Hive with a Big Data Metadata Storage
hive HBase Metastore - Improving Hive with a Big Data Metadata Storage
 
HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live W...
HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live W...HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live W...
HBaseCon 2012 | Gap Inc Direct: Serving Apparel Catalog from HBase for Live W...
 
HBaseCon 2012 | Developing Real Time Analytics Applications Using HBase in th...
HBaseCon 2012 | Developing Real Time Analytics Applications Using HBase in th...HBaseCon 2012 | Developing Real Time Analytics Applications Using HBase in th...
HBaseCon 2012 | Developing Real Time Analytics Applications Using HBase in th...
 
BIG Data & Hadoop Applications in E-Commerce
BIG Data & Hadoop Applications in E-CommerceBIG Data & Hadoop Applications in E-Commerce
BIG Data & Hadoop Applications in E-Commerce
 
HBaseCon 2012 | HBase, the Use Case in eBay Cassini
HBaseCon 2012 | HBase, the Use Case in eBay Cassini HBaseCon 2012 | HBase, the Use Case in eBay Cassini
HBaseCon 2012 | HBase, the Use Case in eBay Cassini
 
HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase
HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase
HBaseCon 2013:High-Throughput, Transactional Stream Processing on Apache HBase
 
How we solved Real-time User Segmentation using HBase
How we solved Real-time User Segmentation using HBaseHow we solved Real-time User Segmentation using HBase
How we solved Real-time User Segmentation using HBase
 
Magento scalability from the trenches (Meet Magento Sweden 2016)
Magento scalability from the trenches (Meet Magento Sweden 2016)Magento scalability from the trenches (Meet Magento Sweden 2016)
Magento scalability from the trenches (Meet Magento Sweden 2016)
 
Surprising failure factors when implementing eCommerce and Omnichannel eBusiness
Surprising failure factors when implementing eCommerce and Omnichannel eBusinessSurprising failure factors when implementing eCommerce and Omnichannel eBusiness
Surprising failure factors when implementing eCommerce and Omnichannel eBusiness
 
Omnichannel Customer Experience
Omnichannel Customer ExperienceOmnichannel Customer Experience
Omnichannel Customer Experience
 
HBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
HBaseCon 2012 | HBase Schema Design - Ian Varley, SalesforceHBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
HBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
 

Similar to Real-World NoSQL Schema Design

An R primer for SQL folks
An R primer for SQL folksAn R primer for SQL folks
An R primer for SQL folksThomas Hütter
 
HUG Italy meet-up with Tugdual Grall, MapR Technical Evangelist
HUG Italy meet-up with Tugdual Grall, MapR Technical EvangelistHUG Italy meet-up with Tugdual Grall, MapR Technical Evangelist
HUG Italy meet-up with Tugdual Grall, MapR Technical EvangelistSpagoWorld
 
Fast track to getting started with DSE Max @ ING
Fast track to getting started with DSE Max @ INGFast track to getting started with DSE Max @ ING
Fast track to getting started with DSE Max @ INGDuyhai Doan
 
Fast Cars, Big Data - How Streaming Can Help Formula 1
Fast Cars, Big Data - How Streaming Can Help Formula 1Fast Cars, Big Data - How Streaming Can Help Formula 1
Fast Cars, Big Data - How Streaming Can Help Formula 1Tugdual Grall
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsGeorge Stathis
 
Evolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLEvolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLMapR Technologies
 
Real World Use Cases: Hadoop and NoSQL in Production
Real World Use Cases: Hadoop and NoSQL in ProductionReal World Use Cases: Hadoop and NoSQL in Production
Real World Use Cases: Hadoop and NoSQL in ProductionCodemotion
 
HBase and Drill: How Loosely Typed SQL is Ideal for NoSQL
HBase and Drill: How Loosely Typed SQL is Ideal for NoSQLHBase and Drill: How Loosely Typed SQL is Ideal for NoSQL
HBase and Drill: How Loosely Typed SQL is Ideal for NoSQLMapR Technologies
 
HBase and Drill: How loosley typed SQL is ideal for NoSQL
HBase and Drill: How loosley typed SQL is ideal for NoSQLHBase and Drill: How loosley typed SQL is ideal for NoSQL
HBase and Drill: How loosley typed SQL is ideal for NoSQLDataWorks Summit
 
"Real-time data processing with Spark & Cassandra", jDays 2015 Speaker: "Duy-...
"Real-time data processing with Spark & Cassandra", jDays 2015 Speaker: "Duy-..."Real-time data processing with Spark & Cassandra", jDays 2015 Speaker: "Duy-...
"Real-time data processing with Spark & Cassandra", jDays 2015 Speaker: "Duy-...hamidsamadi
 
Real-World NoSQL Schema Design

  • 1. Real-World NoSQL Schema Design, Tugdual Grall, April 13, 2016 (© 2015 MapR Technologies, © 2016 MapR Technologies)
  • 2. {"about" : "me"}: Tugdual "Tug" Grall (@tgrall)
      • MapR: Technical Evangelist
      • MongoDB: Technical Evangelist
      • Couchbase: Technical Evangelist
      • eXo: CTO
      • Oracle: Developer/Product Manager, mainly Java/SOA
      • Developer in consulting firms
      • Web: @tgrall, http://tgrall.github.io, tgrall
      • NantesJUG co-founder
      • Pet project: http://www.resultri.com
      • tug@mapr.com, tugdual@gmail.com
  • 3. Database Schema Design: "ERD" (logical). For RDBMS:
      • Entities => Tables
      • Attributes => Columns
      • Relationships => Foreign Keys
      • Many-to-Many => Junction Table
  • 4. Data is Doubling Every Two Years. [Chart: total data stored, 1980 to 2020, structured vs. semi-structured data.] Unstructured data will account for more than 80% of the data collected by organizations. Source: Human-Computer Interaction & Knowledge Discovery in Complex Unstructured, Big Data
  • 5. Big Datastore
      • Distributed File System: HDFS/MapR-FS
      • NoSQL Database: HBase/MapR-DB
  • 6. Store data as File or Row?
      HDFS / MapR-FS
      • Data stored as "files"
      • Fast with large scans
      • Slow random reads/writes
      HBase / MapR-DB
      • Data stored as rows/documents
      • Fast with random reads/writes
  • 7. NoSQL Database
  • 8. Contrast Relational and HBase-Style NoSQL
      Relational
      • Rows containing fields
      • Fields contain primitive types
      • Structure is fixed and uniform
      • Structure is pre-defined
      • Referential integrity (optional)
      • Expressions over sets of rows
      HBase / MapR-DB
      • Rows contain fields
      • Fields contain bytes
      • Structure is flexible
      • No pre-defined structure
      • Single key
      • Column families
      • Timestamps
      • Versions
  • 9. Mix Models for Databases
      • Allows complex objects in field values
        • JSON-style lists and objects
      • Allows references to objects via join
        • Includes references localized within lists
      • Lists of objects and objects of lists are isomorphic to tables, so:
        • Complex data in tables,
        • But also tables in complex data,
        • Even tables containing complex data containing tables
  • 10. A Catalog of NoSQL Idioms
  • 11. Tables as Objects, Objects as Tables
      • Row-wise form (columns c1, c2, c3), as a list of objects:
        [ { c1:v1, c2:v2, c3:v3 }, { c1:v1, c2:v2, c3:v3 }, { c1:v1, c2:v2, c3:v3 } ]
      • Column-wise form (columns c1, c2, c3), as an object containing lists:
        { c1:[v1, v2, v3], c2:[v1, v2, v3], c3:[v1, v2, v3] }
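The equivalence between the two forms on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and sample values are made up here, and the sketch assumes every row has the same keys and every column list has the same length.

```python
def rows_to_columns(rows):
    """List of objects -> object containing lists (assumes uniform keys)."""
    if not rows:
        return {}
    return {key: [row[key] for row in rows] for key in rows[0]}


def columns_to_rows(columns):
    """Object containing lists -> list of objects (assumes equal-length lists)."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*(columns[k] for k in keys))]


# Made-up sample data standing in for v1, v2, v3 on the slide.
rows = [
    {"c1": 1, "c2": "a", "c3": True},
    {"c1": 2, "c2": "b", "c3": False},
    {"c1": 3, "c2": "c", "c3": True},
]
columns = rows_to_columns(rows)
round_trip = columns_to_rows(columns)
```

Converting in either direction loses nothing, which is why a table can be stored inside a single field value and a field value can be expanded back into a table.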
  • 12. A first example: Time-series data
  • 13. Column names as data
      • When column names are not pre-defined, they can convey information
      • Examples
        • Time offsets within a window for time series
        • Top-level domains for web crawlers
        • Vendor IDs for customer purchase profiles
      • Predefined schema is impossible for this idiom
  • 14. Relational Model for Time-Series
  • 15. NoSQL Table Design: Point-by-Point
  • 16. Table Design: Hybrid Point-by-Point + Sub-table. After the window closes, the data in the row is restated as a column-oriented tabular value in a different column family.
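The point-by-point and hybrid designs can be sketched with plain Python dictionaries standing in for a wide-column table. Everything named here is an assumption made for illustration: the one-hour window, the `metric:window_start` row key, and the two column families called `points` and `compacted`. None of it comes from the slides, and no real HBase or MapR-DB API is used.

```python
WINDOW_SECONDS = 3600  # hypothetical window: one row per metric per hour


def row_key(metric, ts):
    """Row key = metric name plus the start of the window that contains ts."""
    window_start = ts - (ts % WINDOW_SECONDS)
    return f"{metric}:{window_start}"


def put_point(table, metric, ts, value):
    """Point-by-point write: the column name is the time offset in the window."""
    row = table.setdefault(row_key(metric, ts), {"points": {}, "compacted": {}})
    row["points"][ts % WINDOW_SECONDS] = value


def compact_row(row):
    """After the window closes, restate the points as one column-wise value
    stored in a different (hypothetical) column family."""
    offsets = sorted(row["points"])
    row["compacted"]["data"] = {
        "offsets": offsets,
        "values": [row["points"][offset] for offset in offsets],
    }
    row["points"] = {}  # the per-point columns are no longer needed


table = {}
put_point(table, "cpu.load", 7200 + 5, 0.42)
put_point(table, "cpu.load", 7200 + 65, 0.57)
put_point(table, "cpu.load", 10800 + 1, 0.61)  # falls into the next window's row
compact_row(table["cpu.load:7200"])
```

The column names (5, 65, 1) carry the time offsets themselves, which is the "column names as data" idiom. The compacted value is the column-wise "object containing lists" form from the earlier slide.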
  • 17. A second example: Music Application, https://musicbrainz.org/
  • 18. MusicBrainz on NoSQL
      • Artists, albums, tracks and labels are key objects
      • Reality check:
        • Add works (compositions), recordings, release, release group
        • 7 tables for artist alone
        • 12 for place, 7 for label, 17 for release/group, 8 for work
        • (but only 4 for recording!)
        • Total of 12 + 7 + 17 + 8 + 4 = 48 tables
      • But wait, there's more!
        • 10 annotation tables, 10 edit tables, 19 tag tables, 5 rating tables, 86 link tables, 5 cover art tables and 3 tables for CD timing info (138 total)
        • And 50 more tables that aren't documented yet
  • 19.
  • 20. 180 tables not shown
  • 21. 236 tables to describe 7 kinds of things
  • 22. Can we do better?
  • 23.
  • 24. artist: id, gid, name, sort_name, begin_date, end_date, ended, type, gender, area, begin_area, end_area, comment, list<ipi>, list<isni>, list<alias>, list<release_id>, list<recording_id>
  • 25.
  • 26.
      • {id, recording_id, name, list<credit>, length}
      • recording: id, gid, list<credit>, name, list<track_ref>
      • {id, format, name, list<track>}
      • release_group: id, gid, name, list<credit>, type, list<release_id>
  • 27. 27 tables reduce to 4
  • 28. 27 tables reduce to 4 so far
  • 29. Further Reductions
      • All 86 link tables become properties on artists, releases and other entities
      • All 44 tag, rating and annotation tables become list properties
      • All 5 cover art tables become lists of file references
      • Current score: 162 tables become 4
      • You get the idea
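The reduction described above amounts to folding child tables into list properties on the parent entity. A minimal Python sketch follows. The `embed` helper, the field names and the sample rows are all invented for illustration and are not taken from the MusicBrainz schema.

```python
from collections import defaultdict


def embed(parents, children, foreign_key, list_name, value_field):
    """Fold the rows of a child table into a list property on each parent."""
    grouped = defaultdict(list)
    for child in children:
        grouped[child[foreign_key]].append(child[value_field])
    return [{**parent, list_name: grouped.get(parent["id"], [])} for parent in parents]


# Made-up rows standing in for an artist table and one of its side tables.
artists = [{"id": 1, "name": "Nirvana"}, {"id": 2, "name": "Example Band"}]
artist_alias = [
    {"artist": 1, "name": "alias-a"},
    {"artist": 1, "name": "alias-b"},
]

artist_docs = embed(artists, artist_alias, "artist", "aliases", "name")
```

Each side table handled this way disappears as a table and reappears as a `list<...>` field, which is how the table count drops.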
  • 30. Artist in HBase/MapR-DB
      get '/apps/db/music/artist_hbase', 'nirvana'
      COLUMN               CELL
      default:begin_date   timestamp=1460500945476, value=1988-01-01
      default:end_data     timestamp=1460500945509, value=1994-04-05
      default:ended        timestamp=1460500945538, value=true
      default:name         timestamp=1460500945438, value=Nirvana
      list:albums          timestamp=1460500945578, value=[
        {"title":"In Utero", "released" : "1993-09-21"},
        {"title":"Nevermind", "released" : "1991-09-24"},
        {"title":"Bleach", "released" : "1989-06-15"}]
  • 31. NoSQL Data Model
  • 32. Scalable databases and document databases
      • Document databases: ease of use, developer friendly, flexible
      • Scalable databases: scalable, parallel, binary API, performant
  • 33. Scalable databases + document databases: MapR JSON DB, or OJAI + other DB
  • 34. JSON Document: Flexible Schema
      • Supports data types
      • Complex data structures
      • Developer friendly
      • http://json.org
      • Flexible
      • Easy to evolve
      • Start right, stay right
      {
        "_id" : "001-003-ABC",
        "first_name" : "John",
        "last_name" : "Doe",
        "email" : "jdoe@doe.com",
        "dob" : "1970-04-23",
        "points" : 55000,
        "interests" : ["sports", "movies"],
        "address" : {
          "street" : "1212 Maple Street",
          "city" : "San Jose",
          "state" : "CA",
          "zip" : "95101"
        },
        "sessions" : [
          {"id":"CoD4", "ts":1439824477},
          {"id":"fifa14", "ts":1439565276}
        ]
      }
  • 35. JSON Document: Flexible Schema (the document from slide 34, annotated with attribute types)
      • String: "_id", "first_name", "last_name", "email", "dob"
      • Number: "points"
      • Array: "interests"
      • Nested document: "address"
      • Array of documents: "sessions"
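The attribute types called out on this slide can be checked by parsing a trimmed copy of the document with Python's standard `json` module. The `describe` helper and its labels are assumptions chosen to mirror the slide's annotations, not part of any MapR or OJAI API.

```python
import json

# A trimmed copy of the slide's document.
doc = json.loads("""
{
  "_id": "001-003-ABC",
  "first_name": "John",
  "points": 55000,
  "interests": ["sports", "movies"],
  "address": {"city": "San Jose", "state": "CA"},
  "sessions": [{"id": "CoD4", "ts": 1439824477}, {"id": "fifa14", "ts": 1439565276}]
}
""")


def describe(value):
    """Label a parsed JSON value the way the slide annotates it."""
    if isinstance(value, dict):
        return "nested document"
    if isinstance(value, list):
        if value and all(isinstance(item, dict) for item in value):
            return "array of documents"
        return "array"
    if isinstance(value, bool):  # checked before numbers: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "null"


types = {name: describe(value) for name, value in doc.items()}
```

Note that the parser rejects the two slips a hand-typed document often contains: a key missing its closing quote and a missing comma between members.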
  • 36. NoSQL: use cases
      • Key Value: Session Management, User Profile/Preferences, Shopping Cart
      • Document: Event Logging, Content Management, Web Analytics, Product Catalog, Single View, e-Commerce
      • Wide Column: Event Logging, Content Management, Counters
      • Graph: Social Network, Routing/Dispatch, Recommendation on Social Graph
      • MapR-DB JSON, MapR-DB HBase
  • 37. NoSQL: use cases (same content as slide 36)
  • 38. Flexible Schema in Action
  • 39. Product Catalog - RDBMS. "Entity Value Attribute" pattern. To get a single product:
      SELECT * FROM (
        SELECT
          ce.sku,
          ea.attribute_id,
          ea.attribute_code,
          CASE ea.backend_type
            WHEN 'varchar' THEN ce_varchar.value
            WHEN 'int' THEN ce_int.value
            WHEN 'text' THEN ce_text.value
            WHEN 'decimal' THEN ce_decimal.value
            WHEN 'datetime' THEN ce_datetime.value
            ELSE ea.backend_type
          END AS value,
          ea.is_required AS required
        FROM catalog_product_entity AS ce
        LEFT JOIN eav_attribute AS ea
          ON ce.entity_type_id = ea.entity_type_id
        LEFT JOIN catalog_product_entity_varchar AS ce_varchar
          ON ce.entity_id = ce_varchar.entity_id
          AND ea.attribute_id = ce_varchar.attribute_id
          AND ea.backend_type = 'varchar'
        LEFT JOIN catalog_product_entity_int AS ce_int
          ON ce.entity_id = ce_int.entity_id
          AND ea.attribute_id = ce_int.attribute_id
          AND ea.backend_type = 'int'
        LEFT JOIN catalog_product_entity_text AS ce_text
          ON ce.entity_id = ce_text.entity_id
          AND ea.attribute_id = ce_text.attribute_id
          AND ea.backend_type = 'text'
        LEFT JOIN catalog_product_entity_decimal AS ce_decimal
          ON ce.entity_id = ce_decimal.entity_id
          AND ea.attribute_id = ce_decimal.attribute_id
          AND ea.backend_type = 'decimal'
        LEFT JOIN catalog_product_entity_datetime AS ce_datetime
          ON ce.entity_id = ce_datetime.entity_id
          AND ea.attribute_id = ce_datetime.attribute_id
          AND ea.backend_type = 'datetime'
        WHERE ce.sku = 'rp-prod132546'
      ) AS tab
      WHERE tab.value != '';
  • 40. Product Catalog - NoSQL/Document. Store the product "as a business object":
      {
        "_id" : "rp-prod132546",
        "name" : "Marvel T2 Athena",
        "brand" : "Pinarello",
        "category" : "bike",
        "type" : "Road Bike",
        "price" : 2949.99,
        "size" : "55cm",
        "wheel_size" : "700c",
        "frameset" : { "frame" : "Carbon Toryaca", "fork" : "Onda 2V C" },
        "groupset" : { "chainset" : "Camp. Athena 50/34", "brake" : "Camp." },
        "wheelset" : { "wheels" : "Camp. Zonda", "tyres" : "Vittoria Pro" }
      }
      To get a single product: products.findById("rp-prod132546")
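The contrast between the two product-catalog slides can be sketched with in-memory Python structures. This is an illustration under stated assumptions: the per-type "tables" are plain lists of made-up rows, and `load_product_eav` and `find_by_id` are hypothetical helpers, not the API of any database.

```python
# Hypothetical EAV storage: one table per value type,
# each row is (entity_id, attribute_code, value).
eav_varchar = [(1, "name", "Marvel T2 Athena"), (1, "brand", "Pinarello")]
eav_decimal = [(1, "price", 2949.99)]
sku_to_entity = {"rp-prod132546": 1}


def load_product_eav(sku):
    """Reassemble one product by visiting every typed table (the 'joins')."""
    entity_id = sku_to_entity[sku]
    product = {"_id": sku}
    for typed_table in (eav_varchar, eav_decimal):
        for row_entity, attribute, value in typed_table:
            if row_entity == entity_id:
                product[attribute] = value
    return product


# Document storage: the product is stored whole, keyed by its identifier.
products = {
    "rp-prod132546": {
        "_id": "rp-prod132546",
        "name": "Marvel T2 Athena",
        "brand": "Pinarello",
        "price": 2949.99,
    }
}


def find_by_id(product_id):
    """A single key lookup returns the complete business object."""
    return products[product_id]


from_eav = load_product_eav("rp-prod132546")
from_document = find_by_id("rp-prod132546")
```

Both paths return the same object. The difference is that the EAV path has to touch one table per value type, while the document path is one lookup.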
  • 41. Easy variation with Documents
      {
        "_id" : "rp-prod132546",
        "name" : "Marvel T2 Athena",
        "brand" : "Pinarello",
        "category" : "bike",
        "type" : "Road Bike",
        "price" : 2949.99,
        "size" : "55cm",
        "wheel_size" : "700c",
        "frameset" : { "frame" : "Carbon Toryaca", "fork" : "Onda 2V C" },
        "groupset" : { "chainset" : "Camp. Athena 50/34", "brake" : "Camp." },
        "wheelset" : { "wheels" : "Camp. Zonda", "tyres" : "Vittoria Pro" }
      }
      {
        "_id" : "rp-prod106702",
        "name" : "Ultegra SPD-SL 6800",
        "brand" : "Shimano",
        "category" : "pedals",
        "type" : "Components",
        "price" : 112.99,
        "features" : [
          "Low profile design increases ...",
          "Supplied with floating SH11 cleats",
          "Weight: 260g (pair)"
        ]
      }
      {
        "_id" : "rp-prod113104",
        "name" : "Bianchi Pride Jersey SS15",
        "brand" : "Nalini",
        "category" : "Jersey",
        "type" : "Clothing",
        "price" : 76.99,
        "features" : [ "100% Polyester", "3/4 hidden zip", "3 rear pocket" ],
        "color" : "black"
      }
  • 42. Back-end data matches front-end expectations
  • 43. Add features easily • Requirement: “users can vote and comment on products” • Before: { "_id" : "rp-prod113104", "name" : "Bianchi Pride Jersey SS15", "brand" : "Nalini", "category" : "Jersey", "type" : "Clothing", "price" : 76.99, "features" : [ "100% Polyester", "3/4 hidden zip", "3 rear pocket" ], "color" : "black" } • After: { "_id" : "rp-prod113104", "name" : "Bianchi Pride Jersey SS15", "brand" : "Nalini", "category" : "Jersey", "type" : "Clothing", "price" : 76.99, "features" : [ "100% Polyester", "3/4 hidden zip", "3 rear pocket" ], "color" : "black", "comments" : [{…}, {…}], "ratings" : [{…, "value" : 5}, {…, "value" : 3}], "rating" : 4 } • Done: just store the data! That means less time to market.
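A minimal sketch of this change in application code follows. The `add_rating` helper and the `user` and `text` field names are assumptions for illustration (the slide elides the contents of each comment and rating); only `comments`, `ratings`, `value`, and `rating` come from the slide.

```python
def add_rating(product, user, value, comment=None):
    """Append a rating (and optional comment), then refresh the stored average."""
    product.setdefault("ratings", []).append({"user": user, "value": value})
    if comment is not None:
        product.setdefault("comments", []).append({"user": user, "text": comment})
    values = [r["value"] for r in product["ratings"]]
    # Keep the average in the document so reads do not have to recompute it.
    product["rating"] = sum(values) / len(values)
    return product

jersey = {"_id": "rp-prod113104", "name": "Bianchi Pride Jersey SS15", "price": 76.99}
add_rating(jersey, "alice", 5, comment="Great fit")
add_rating(jersey, "bob", 3)
print(jersey["rating"])
```

The new fields appear on the first write that needs them; documents that have never been rated are left untouched.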
  • 44. JSON documents make data modeling easy
  • 45. An artist lookup: find '/apps/db/music/artist_json', 'nirvana' { "_id" : "nirvana", "name" : "Nirvana", "begin_date" : "1988-01-01", "end_data" : "1994-04-05", "ended" : true, "albums" : [ {"title":"In Utero", "released" : "1993-09-21"}, {"title":"Nevermind", "released" : "1991-09-24"}, {"title":"Bleach", "released" : "1989-06-15"} ] }
  • 46. Artist in HBase/MapR-DB get '/apps/db/music/artist_hbase', 'nirvana' COLUMN CELL default:begin_date timestamp=1460500945476, value=1988-01-01 default:end_data timestamp=1460500945509, value=1994-04-05 default:ended timestamp=1460500945538, value=true default:name timestamp=1460500945438, value=Nirvana list:albums timestamp=1460500945578, value=[ {"title":"In Utero", "released" : "1993-09-21"},
 {"title":"Nevermind", "released" : "1991-09-24"},
 {"title":"Bleach", "released" : "1989-06-15"}]
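The mapping between the JSON document and the HBase-style row above can be sketched as follows. The `to_cells` helper is hypothetical; the `default` and `list` column families follow the shell output, and the rule that nested values are stored as JSON text is an assumption based on the `list:albums` cell.

```python
import json

def to_cells(doc, list_family="list", default_family="default"):
    """Flatten a document into {"family:qualifier": bytes}; nested values become JSON text."""
    cells = {}
    for field, value in doc.items():
        if field == "_id":
            continue  # the id becomes the row key, not a cell
        if isinstance(value, (list, dict)):
            cells[f"{list_family}:{field}"] = json.dumps(value).encode("utf-8")
        else:
            # Booleans are written as lowercase text to match the shell output.
            text = str(value).lower() if isinstance(value, bool) else str(value)
            cells[f"{default_family}:{field}"] = text.encode("utf-8")
    return doc["_id"], cells

artist = {
    "_id": "nirvana",
    "name": "Nirvana",
    "begin_date": "1988-01-01",
    "end_data": "1994-04-05",
    "ended": True,
    "albums": [{"title": "In Utero", "released": "1993-09-21"}],
}

row_key, cells = to_cells(artist)
for column in sorted(cells):
    print(column, cells[column])
```

Every cell is an opaque byte string, so the type information (`true` as a boolean, the albums as a list) has to be restored by whoever reads the row.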
  • 47. How does it work? • MapR-DB JSON inherits almost all aspects of MapR-DB – Column families (but now defined on the fly) – Column-level security • MapR-DB JSON – does NOT store the JSON document as a “string” – stores “real” data types (not JavaScript types) • OJAI (open source) can add similar capabilities to other databases • http://ojai.io/ (pronounced “OH-hy”)
  • 48. Developers love JSON because… • The API is really easy to use: – you deal with your business objects • Operations cover both fields and structure: – Add/remove fields – Update values, including sub-documents and arrays – Increments
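The field-level operations listed above can be sketched on a plain Python dict. `set_field`, `remove_field`, and `increment` are hypothetical helpers that take dotted paths; they illustrate the idea only and are not the OJAI or MapR-DB mutation API.

```python
def _parent(doc, path, create=False):
    """Walk a dotted path and return (container, last_key)."""
    *parents, last = path.split(".")
    node = doc
    for key in parents:
        # With create=True, missing sub-documents are added on the way down.
        node = node.setdefault(key, {}) if create else node[key]
    return node, last

def set_field(doc, path, value):
    """Add a field or update its value, including inside sub-documents."""
    node, key = _parent(doc, path, create=True)
    node[key] = value

def remove_field(doc, path):
    """Remove a field; the sub-document that holds it must already exist."""
    node, key = _parent(doc, path)
    node.pop(key, None)

def increment(doc, path, amount=1):
    """Increase a numeric field, starting from 0 if it is missing."""
    node, key = _parent(doc, path, create=True)
    node[key] = node.get(key, 0) + amount

product = {"_id": "rp-prod132546", "price": 2949.99,
           "frameset": {"frame": "Carbon Toryaca"}}
set_field(product, "frameset.fork", "Onda 2V C")   # add a nested field
set_field(product, "price", 2799.99)               # update a value
remove_field(product, "frameset.frame")            # remove a field
increment(product, "stats.views")                  # counter created on first use
increment(product, "stats.views", 2)
print(product)
```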
  • 49. Why is the developer so happy? • Can build a quick and dirty MVP that evolves • Can update the application a little at a time • Can tune performance later using column families
  • 50. Why is the developer so happy? • Expressive: lets you say what you need to say – Developers hate circumlocution: say what you mean, don’t repeat yourself • Efficient (remember how much easier it was to get one product?) – Simple designs run better because you can get them right • Human readable: you can introspect
  • 51. SQL on NoSQL
  • 52. HBase Columnar select convert_from(t.row_key, 'UTF8') id, convert_from(t.`default`.`name`, 'UTF8') name, convert_from(convert_from(t.list.albums, 'UTF8'), 'JSON') albums from hbase.`/apps/db/music/artist_hbase` t
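The nested `convert_from` calls in the query above do two things in sequence: decode raw bytes as UTF-8 text, then parse that text as JSON. The Python analogue below shows the same two steps; `raw_row` is a made-up stand-in for one scanned row, and `decode_artist` is a hypothetical helper.

```python
import json

# Made-up stand-in for one row as raw bytes, keyed like the HBase columns.
raw_row = {
    "row_key": b"nirvana",
    "default:name": b"Nirvana",
    "list:albums": b'[{"title": "Nevermind", "released": "1991-09-24"}]',
}

def decode_artist(row):
    """Mirror the two conversion steps: bytes -> UTF-8 text -> parsed JSON."""
    return {
        "id": row["row_key"].decode("utf-8"),
        "name": row["default:name"].decode("utf-8"),
        "albums": json.loads(row["list:albums"].decode("utf-8")),
    }

decoded = decode_artist(raw_row)
print(decoded["albums"][0]["title"])
```

Every reader of the table has to repeat this conversion, because the byte-oriented store does not record the types itself.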
  • 53. Document Database (MapR-DB JSON) select t._id, t.name, t.albums from maprdb.`/apps/db/artist_json` t
  • 54. Summary
  • 55. Some Tips • Carefully select the rowkey/id and keys • Design for “your application” • Design for “questions” not “answers” (Highly Scalable Blog) • Rows are updated atomically • Use pre-aggregation
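Two of these tips, choosing the row key around the question being asked and pre-aggregating, can be sketched together. The play-count scenario, the `artist#day` key layout, and all function names below are assumptions invented for this example.

```python
from collections import defaultdict

# Hypothetical table: row key -> counters, maintained at write time.
daily_plays = defaultdict(lambda: {"plays": 0})

def row_key(artist_id, day):
    """Composite key: rows for one artist sort together, ordered by day."""
    return f"{artist_id}#{day}"

def record_play(artist_id, day):
    # Pre-aggregation: bump the counter on write instead of counting events on read.
    daily_plays[row_key(artist_id, day)]["plays"] += 1

def plays_for_artist(artist_id):
    """Emulate a prefix scan over the sorted row keys."""
    prefix = f"{artist_id}#"
    return {key: row["plays"]
            for key, row in sorted(daily_plays.items())
            if key.startswith(prefix)}

for day in ("2016-04-12", "2016-04-13", "2016-04-13"):
    record_play("nirvana", day)
record_play("pixies", "2016-04-13")

nirvana_plays = plays_for_artist("nirvana")
print(nirvana_plays)
```

The key layout answers the question "plays per day for one artist" with a single range scan, and the counter is already summed when it is read.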
  • 57. Q&A @tgrall maprtech tug@mapr.com Engage with us! MapR maprtech mapr-technologies