SlideShare a Scribd company logo
1 of 44
Download to read offline
NoSQL data models
Viet-Trung Tran
is.hust.edu.vn/~trungtv/
1	
  
Eras of Databases
•  Why NoSQL?
2	
  
Before NoSQL
3	
  
RDBMS one-size-fits-all-needs
4	
  
ICDE 2005 conference
5	
  
The	
  last	
  25	
  years	
  of	
  commercial	
  DBMS	
  development	
  can	
  be	
  summed	
  up	
  in	
  a	
  single	
  phrase:	
  
"one	
  size	
  fits	
  all".	
  This	
  phrase	
  refers	
  to	
  the	
  fact	
  that	
  the	
  tradi.onal	
  DBMS	
  architecture	
  
(originally	
  designed	
  and	
  op.mized	
  for	
  business	
  data	
  processing)	
  has	
  been	
  used	
  to	
  support	
  
many	
  data-­‐centric	
  applica.ons	
  with	
  widely	
  varying	
  characterisCcs	
  and	
  requirements.	
  In	
  this	
  
paper,	
  we	
  argue	
  that	
  this	
  concept	
  is	
  no	
  longer	
  applicable	
  to	
  the	
  database	
  market,	
  and	
  that	
  the	
  
commercial	
  world	
  will	
  fracture	
  into	
  a	
  collecCon	
  of	
  independent	
  database	
  engines,	
  some	
  of	
  
which	
  may	
  be	
  unified	
  by	
  a	
  common	
  front-­‐end	
  parser.	
  We	
  use	
  examples	
  from	
  the	
  stream-­‐
processing	
  market	
  and	
  the	
  data-­‐warehouse	
  market	
  to	
  bolster	
  our	
  claims.	
  We	
  also	
  briefly	
  
discuss	
  other	
  markets	
  for	
  which	
  the	
  tradiConal	
  architecture	
  is	
  a	
  poor	
  fit	
  and	
  argue	
  for	
  a	
  criCcal	
  
rethinking	
  of	
  the	
  current	
  factoring	
  of	
  systems	
  services	
  into	
  products.	
  
After NoSQL
6	
  
RDBMS vs. others
7	
  
NoSQL landscape
8	
  
NoSQL raising
9	
  
10	
  
Why NoSQL
•  “The whole point of seeking alternatives [to
RDBMS systems] is that you need to solve a
problem that relational databases are a bad
fit for.” Eric Evans - Rackspace 
11	
  
Why NoSQL [cont'd]
•  ACID does not scale
•  Web applications have different needs
–  Scalability 
–  Elasticity
–  Flexible schema/ semi-structured data
–  Geographically distributed
•  Web applications do not always need
–  Transaction
–  Strong consistency
–  Complex queries
12	
  
NoSQL use cases
•  Massive data volume (Big volume)
– Google, Amazon, Yahoo, Facebook – 10-100K
servers 
•  Extreme query workload 
•  Schema evolution 
13	
  
Relational data model revisited
•  Data is usually stored in row by row
manner (row store)
•  Standardized query language (SQL)
•  Data model defined before you add
data
•  Joins merge data from multiple tables
–  Results are tables
•  Pros: Mature ACID transactions with fine-
grain security controls, widely used
•  Cons: Requires up front data modeling,
does not scale well 
14	
  
Oracle,	
  MySQL,	
  PostgreSQL,	
  
MicrosoP	
  SQL	
  Server,	
  IBM	
  
DB/2	
  	
  
Key/value data model
•  Simple key/value interface
– GET, PUT, DELETE
•  Value can contain any kind of
data
•  Pros
•  Cons
•  Berkley DB, Memcache,
DynamoDB, Redis, Riak
15	
  
Key/value vs. table
•  A table with two columns
and a simple interface
– Add a key-value
– For this key, give me the value
– Delete a key
•  Super fast and easy to scale
(no joins) 
16	
  
Key/value vs. locker
17	
  
vs. Relational Model
18	
  
Memcached
•  Open source in-memory key-value caching system
•  Make effective use of RAM on many distributed web servers
•  Designed to speed up dynamic web applications by alleviating
database load
–  Simple interface for highly distributed RAM caches
–  30ms read times typical
•  Designed for quick deployment, ease of development
•  APIs in many languages
19	
  
•  Open source in-memory key-value store with optional
durability
•  Focus on high speed reads and writes of common data
structures to RAM
•  Allows simple lists, sets and hashes to be stored
within the value and manipulated
•  Many features that developers like expiration,
transactions, pub/sub, partitioning 
20	
  
•  Scalable key-value store
•  Fastest growing product in Amazon's history
•  Focus on throughput on storage and predictable read
and write times
•  Strong integration with S3 and Elastic MapReduce
21	
  
•  Open source distributed key-value store with support
and commercial versions by Basho
•  A "Dynamo-inspired" database
•  Focus on availability, fault-tolerance, operational
simplicity and scalability
•  Support for replication and auto-sharding and
rebalancing on failures
•  Support for MapReduce, fulltext search and secondary
indexes of value tags
•  Written in ERLANG 
22	
  
Column family store
•  Dynamic schema, column-oriented data
model
•  Sparse, distributed persistent multi-
dimensional sorted map
(row, column (family), timestamp) -> cell
contents
23	
  
Column families
•  Group columns into "Column
families"
•  Group column families into
"Super-Columns"
•  Be able to query all columns
with a family or super family
•  Similar data grouped together
to improve speed 
24	
  
Column family data model vs.
relational
•  Sparse matrix, preserve table structure
– One row could have millions of columns but can
be very sparse
•  Hybrid row/column stores
•  Number of columns is extendible
– New columns to be inserted without doing an
"alter table"
25	
  
Bigtable
•  ACM TOCS 2008
•  Fault-tolerant, persistent
•  Scalable
–  Thousands of servers
–  Terabytes of in-memory data
–  Petabyte of disk-based data
–  Millions of reads/writes per
second, efficient scans
•  Self-managing
–  Servers can be added/
removed dynamically
–  Servers adjust to load
imbalance
26	
  
•  Open-source Bigtable, written in JAVA
•  Part of Apache Hadoop project
27	
  
Hadoop?
28	
  
•  Apache open source column family database
•  Supported by DataStax
•  Peer-to-peer distribution model
•  Strong reputation for linear scale out (millions of writes/
second)
•  Written in Java and works well with HDFS and
MapReduce 
29	
  
Graph data model
•  Core abstractions: Nodes, Relationships, Properties on both
30	
  
Graph database (store)
•  A database stored data in an explicitly graph structure
•  Each node knows its adjacent nodes
•  Queries are really graph traversals
31	
  
Compared to Relational
Databases
OpCmized	
  for	
  aggregaCon	
   OpCmized	
  for	
  connecCons	
  
Compared to Key Value Stores
OpCmized	
  for	
  simple	
  look-­‐ups	
   OpCmized	
  for	
  traversing	
  connected	
  data	
  
Compared to Document Stores
OpCmized	
  for	
  “trees”	
  of	
  data	
   OpCmized	
  for	
  seeing	
  the	
  forest	
  and	
  the	
  
trees,	
  and	
  the	
  branches,	
  and	
  the	
  trunks	
  
35	
  
36	
  
•  Graph database designed to be easy to use by
Java developers
•  Disk-based (not just RAM)
•  Full ACID 
•  High Availability (with Enterprise Edition)
•  32 Billion Nodes, 32 Billion Relationships, 

64 Billion Properties
•  Embedded java library
•  REST API
37	
  
Document store
•  Documents, not value, not
tables
•  JSON or XML formats
•  Document is identified by
ID
•  Allow indexing on
properties
38	
  
Relational data mapping
•  T1–HTML into Objects
•  T2–Objects into SQL Tables
•  T3–Tables into Objects
•  T4–Objects into HTML
39	
  
Web Service in the middle
•  T1 – HTML into Java Objects
•  T2 – Java Objects into SQL Tables
•  T3 – Tables into Objects
•  T4 – Objects into HTML
•  T5 – Objects to XML
•  T6 – XML to Objects
40	
  
T1	
  
T3	
  
T2	
  
T4	
  
Object	
  Middle	
  
Tier	
  
Relational	
  
Database	
  
Web	
  Browser	
  
T5	
  
Web	
  Service	
  
T6	
  
Discussion
•  Object-relational mapping has become one
of the most complex components of building
applications today
– Java Hibernate Framework
– JPA
•  To avoid complexity is to keep your
architecture very simple
41	
  
Document mapping
•  Documents in the database
•  Documents in the application
•  No object middle tier
•  No "shredding"
•  No reassembly
•  Simple!
42	
  
ApplicaCon	
  Layer	
   Database	
  
Document	
   Document	
  
•  Open Source JSON data store created by 10gen
•  Master-slave scale out model
•  Strong developer community
•  Sharding built-in, automatic
•  Implemented in C++ with many APIs (C++, JavaScript,
Java, Perl, Python etc.)
43	
  
•  Apache project
•  Open source JSON data store
•  Written in ERLANG
•  RESTful JSON API
•  B-Tree based indexing, shadowing b-tree versioning
•  ACID fully supported
•  View model
•  Data compaction
•  Security
44	
  

More Related Content

What's hot

NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture OverviewChristopher Foot
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL DatabasesDerek Stainer
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7abdulrahmanhelan
 
Nosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptxNosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptxRadhika R
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL DatabasesBADR
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Databasenehabsairam
 
Sql vs NoSQL
Sql vs NoSQLSql vs NoSQL
Sql vs NoSQLRTigger
 
Introduction to NOSQL databases
Introduction to NOSQL databasesIntroduction to NOSQL databases
Introduction to NOSQL databasesAshwani Kumar
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...Simplilearn
 
Lecture4 big data technology foundations
Lecture4 big data technology foundationsLecture4 big data technology foundations
Lecture4 big data technology foundationshktripathy
 
The Basics of MongoDB
The Basics of MongoDBThe Basics of MongoDB
The Basics of MongoDBvaluebound
 

What's hot (20)

NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture Overview
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL Databases
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7
 
Nosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptxNosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptx
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Database
 
Polyglot Persistence
Polyglot Persistence Polyglot Persistence
Polyglot Persistence
 
Sql vs NoSQL
Sql vs NoSQLSql vs NoSQL
Sql vs NoSQL
 
Key-Value NoSQL Database
Key-Value NoSQL DatabaseKey-Value NoSQL Database
Key-Value NoSQL Database
 
NoSQL et Big Data
NoSQL et Big DataNoSQL et Big Data
NoSQL et Big Data
 
Introduction to NOSQL databases
Introduction to NOSQL databasesIntroduction to NOSQL databases
Introduction to NOSQL databases
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
SQOOP PPT
SQOOP PPTSQOOP PPT
SQOOP PPT
 
Lecture4 big data technology foundations
Lecture4 big data technology foundationsLecture4 big data technology foundations
Lecture4 big data technology foundations
 
Sqoop
SqoopSqoop
Sqoop
 
The Basics of MongoDB
The Basics of MongoDBThe Basics of MongoDB
The Basics of MongoDB
 
NoSQL
NoSQLNoSQL
NoSQL
 

Viewers also liked

5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2Fabio Fumarola
 
Data Modeling for NoSQL
Data Modeling for NoSQLData Modeling for NoSQL
Data Modeling for NoSQLTony Tam
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2Fabio Fumarola
 
Big Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingBig Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingDATAVERSITY
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureVenu Anuganti
 
Solr cloud the 'search first' nosql database extended deep dive
Solr cloud the 'search first' nosql database   extended deep diveSolr cloud the 'search first' nosql database   extended deep dive
Solr cloud the 'search first' nosql database extended deep divelucenerevolution
 
آموزش مدیریت بانک اطلاعاتی اوراکل - بخش سوم
آموزش مدیریت بانک اطلاعاتی اوراکل - بخش سومآموزش مدیریت بانک اطلاعاتی اوراکل - بخش سوم
آموزش مدیریت بانک اطلاعاتی اوراکل - بخش سومfaradars
 
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)MongoDB
 
Data Modeling for Integration of NoSQL with a Data Warehouse
Data Modeling for Integration of NoSQL with a Data WarehouseData Modeling for Integration of NoSQL with a Data Warehouse
Data Modeling for Integration of NoSQL with a Data WarehouseDaniel Upton
 
NoSQL and MapReduce
NoSQL and MapReduceNoSQL and MapReduce
NoSQL and MapReduceJ Singh
 
NoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data ModelersNoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data ModelersKaren Lopez
 
SDEC2011 NoSQL Data modelling
SDEC2011 NoSQL Data modellingSDEC2011 NoSQL Data modelling
SDEC2011 NoSQL Data modellingKorea Sdec
 
NoSQL databases and managing big data
NoSQL databases and managing big dataNoSQL databases and managing big data
NoSQL databases and managing big dataSteven Francia
 
Leveraging SAP HANA with Apache Hadoop and SAP Analytics
Leveraging SAP HANA with Apache Hadoop and SAP AnalyticsLeveraging SAP HANA with Apache Hadoop and SAP Analytics
Leveraging SAP HANA with Apache Hadoop and SAP AnalyticsMethod360
 
Big Data Modeling and Analytic Patterns – Beyond Schema on Read
Big Data Modeling and Analytic Patterns – Beyond Schema on ReadBig Data Modeling and Analytic Patterns – Beyond Schema on Read
Big Data Modeling and Analytic Patterns – Beyond Schema on ReadThink Big, a Teradata Company
 

Viewers also liked (20)

5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2
 
Data Modeling for NoSQL
Data Modeling for NoSQLData Modeling for NoSQL
Data Modeling for NoSQL
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2
 
Big Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingBig Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data Modeling
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data Architecture
 
NoSQL Consepts
NoSQL ConseptsNoSQL Consepts
NoSQL Consepts
 
BigData, NoSQL & ElasticSearch
BigData, NoSQL & ElasticSearchBigData, NoSQL & ElasticSearch
BigData, NoSQL & ElasticSearch
 
Redis vs Memcached
Redis vs MemcachedRedis vs Memcached
Redis vs Memcached
 
Drupal 6 Database layer
Drupal 6 Database layerDrupal 6 Database layer
Drupal 6 Database layer
 
Solr cloud the 'search first' nosql database extended deep dive
Solr cloud the 'search first' nosql database   extended deep diveSolr cloud the 'search first' nosql database   extended deep dive
Solr cloud the 'search first' nosql database extended deep dive
 
آموزش مدیریت بانک اطلاعاتی اوراکل - بخش سوم
آموزش مدیریت بانک اطلاعاتی اوراکل - بخش سومآموزش مدیریت بانک اطلاعاتی اوراکل - بخش سوم
آموزش مدیریت بانک اطلاعاتی اوراکل - بخش سوم
 
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
 
Data Modeling for Integration of NoSQL with a Data Warehouse
Data Modeling for Integration of NoSQL with a Data WarehouseData Modeling for Integration of NoSQL with a Data Warehouse
Data Modeling for Integration of NoSQL with a Data Warehouse
 
NoSQL and MapReduce
NoSQL and MapReduceNoSQL and MapReduce
NoSQL and MapReduce
 
NoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data ModelersNoSQL and Data Modeling for Data Modelers
NoSQL and Data Modeling for Data Modelers
 
Real-World NoSQL Schema Design
Real-World NoSQL Schema DesignReal-World NoSQL Schema Design
Real-World NoSQL Schema Design
 
SDEC2011 NoSQL Data modelling
SDEC2011 NoSQL Data modellingSDEC2011 NoSQL Data modelling
SDEC2011 NoSQL Data modelling
 
NoSQL databases and managing big data
NoSQL databases and managing big dataNoSQL databases and managing big data
NoSQL databases and managing big data
 
Leveraging SAP HANA with Apache Hadoop and SAP Analytics
Leveraging SAP HANA with Apache Hadoop and SAP AnalyticsLeveraging SAP HANA with Apache Hadoop and SAP Analytics
Leveraging SAP HANA with Apache Hadoop and SAP Analytics
 
Big Data Modeling and Analytic Patterns – Beyond Schema on Read
Big Data Modeling and Analytic Patterns – Beyond Schema on ReadBig Data Modeling and Analytic Patterns – Beyond Schema on Read
Big Data Modeling and Analytic Patterns – Beyond Schema on Read
 

Similar to Nosql data models

No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageBethmi Gunasekara
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...Qian Lin
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxRahul Borate
 
Oracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data ArchitectureOracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data ArchitectureArthur Gimpel
 
Introduction to no sql database
Introduction to no sql databaseIntroduction to no sql database
Introduction to no sql databaseHeman Hosainpana
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxRahul Borate
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon RedshiftKel Graham
 
Chapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesChapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesMaynooth University
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...Institute of Contemporary Sciences
 

Similar to Nosql data models (20)

No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data Storage
 
Sql Server2008
Sql Server2008Sql Server2008
Sql Server2008
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
 
Nosql databases
Nosql databasesNosql databases
Nosql databases
 
NoSQL.pptx
NoSQL.pptxNoSQL.pptx
NoSQL.pptx
 
Revision
RevisionRevision
Revision
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
 
Oracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data ArchitectureOracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data Architecture
 
No SQL
No SQLNo SQL
No SQL
 
Introduction to no sql database
Introduction to no sql databaseIntroduction to no sql database
Introduction to no sql database
 
Database Technologies
Database TechnologiesDatabase Technologies
Database Technologies
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon Redshift
 
NoSql Brownbag
NoSql BrownbagNoSql Brownbag
NoSql Brownbag
 
Chapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesChapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choices
 
Modern database
Modern databaseModern database
Modern database
 
NOsql Presentation.pdf
NOsql Presentation.pdfNOsql Presentation.pdf
NOsql Presentation.pdf
 
Mis assignment (database)
Mis assignment (database)Mis assignment (database)
Mis assignment (database)
 
No sql databases
No sql databasesNo sql databases
No sql databases
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 

More from Viet-Trung TRAN

Bắt đầu tìm hiểu về dữ liệu lớn như thế nào - 2017
Bắt đầu tìm hiểu về dữ liệu lớn như thế nào - 2017Bắt đầu tìm hiểu về dữ liệu lớn như thế nào - 2017
Bắt đầu tìm hiểu về dữ liệu lớn như thế nào - 2017Viet-Trung TRAN
 
Dynamo: Amazon’s Highly Available Key-value Store
Dynamo: Amazon’s Highly Available Key-value StoreDynamo: Amazon’s Highly Available Key-value Store
Dynamo: Amazon’s Highly Available Key-value StoreViet-Trung TRAN
 
Pregel: Hệ thống xử lý đồ thị lớn
Pregel: Hệ thống xử lý đồ thị lớnPregel: Hệ thống xử lý đồ thị lớn
Pregel: Hệ thống xử lý đồ thị lớnViet-Trung TRAN
 
Mapreduce simplified-data-processing
Mapreduce simplified-data-processingMapreduce simplified-data-processing
Mapreduce simplified-data-processingViet-Trung TRAN
 
Tìm kiếm needle trong Haystack: Hệ thống lưu trữ ảnh của Facebook
Tìm kiếm needle trong Haystack: Hệ thống lưu trữ ảnh của FacebookTìm kiếm needle trong Haystack: Hệ thống lưu trữ ảnh của Facebook
Tìm kiếm needle trong Haystack: Hệ thống lưu trữ ảnh của FacebookViet-Trung TRAN
 
giasan.vn real-estate analytics: a Vietnam case study
giasan.vn real-estate analytics: a Vietnam case studygiasan.vn real-estate analytics: a Vietnam case study
giasan.vn real-estate analytics: a Vietnam case studyViet-Trung TRAN
 
A Vietnamese Language Model Based on Recurrent Neural Network
A Vietnamese Language Model Based on Recurrent Neural NetworkA Vietnamese Language Model Based on Recurrent Neural Network
A Vietnamese Language Model Based on Recurrent Neural NetworkViet-Trung TRAN
 
A Vietnamese Language Model Based on Recurrent Neural Network
A Vietnamese Language Model Based on Recurrent Neural NetworkA Vietnamese Language Model Based on Recurrent Neural Network
A Vietnamese Language Model Based on Recurrent Neural NetworkViet-Trung TRAN
 
Large-Scale Geographically Weighted Regression on Spark
Large-Scale Geographically Weighted Regression on SparkLarge-Scale Geographically Weighted Regression on Spark
Large-Scale Geographically Weighted Regression on SparkViet-Trung TRAN
 
Recent progress on distributing deep learning
Recent progress on distributing deep learningRecent progress on distributing deep learning
Recent progress on distributing deep learningViet-Trung TRAN
 
success factors for project proposals
success factors for project proposalssuccess factors for project proposals
success factors for project proposalsViet-Trung TRAN
 
OCR processing with deep learning: Apply to Vietnamese documents
OCR processing with deep learning: Apply to Vietnamese documents OCR processing with deep learning: Apply to Vietnamese documents
OCR processing with deep learning: Apply to Vietnamese documents Viet-Trung TRAN
 
Paper@Soict2015: GPSInsights: towards a scalable framework for mining massive...
Paper@Soict2015: GPSInsights: towards a scalable framework for mining massive...Paper@Soict2015: GPSInsights: towards a scalable framework for mining massive...
Paper@Soict2015: GPSInsights: towards a scalable framework for mining massive...Viet-Trung TRAN
 
Introduction to BigData @TCTK2015
Introduction to BigData @TCTK2015Introduction to BigData @TCTK2015
Introduction to BigData @TCTK2015Viet-Trung TRAN
 
From neural networks to deep learning
From neural networks to deep learningFrom neural networks to deep learning
From neural networks to deep learningViet-Trung TRAN
 
From decision trees to random forests
From decision trees to random forestsFrom decision trees to random forests
From decision trees to random forestsViet-Trung TRAN
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringViet-Trung TRAN
 

More from Viet-Trung TRAN (20)

Bắt đầu tìm hiểu về dữ liệu lớn như thế nào - 2017
Bắt đầu tìm hiểu về dữ liệu lớn như thế nào - 2017Bắt đầu tìm hiểu về dữ liệu lớn như thế nào - 2017
Bắt đầu tìm hiểu về dữ liệu lớn như thế nào - 2017
 
Dynamo: Amazon’s Highly Available Key-value Store
Dynamo: Amazon’s Highly Available Key-value StoreDynamo: Amazon’s Highly Available Key-value Store
Dynamo: Amazon’s Highly Available Key-value Store
 
Pregel: Hệ thống xử lý đồ thị lớn
Pregel: Hệ thống xử lý đồ thị lớnPregel: Hệ thống xử lý đồ thị lớn
Pregel: Hệ thống xử lý đồ thị lớn
 
Mapreduce simplified-data-processing
Mapreduce simplified-data-processingMapreduce simplified-data-processing
Mapreduce simplified-data-processing
 
Tìm kiếm needle trong Haystack: Hệ thống lưu trữ ảnh của Facebook
Tìm kiếm needle trong Haystack: Hệ thống lưu trữ ảnh của FacebookTìm kiếm needle trong Haystack: Hệ thống lưu trữ ảnh của Facebook
Tìm kiếm needle trong Haystack: Hệ thống lưu trữ ảnh của Facebook
 
giasan.vn real-estate analytics: a Vietnam case study
giasan.vn real-estate analytics: a Vietnam case studygiasan.vn real-estate analytics: a Vietnam case study
giasan.vn real-estate analytics: a Vietnam case study
 
Giasan.vn @rstars
Giasan.vn @rstarsGiasan.vn @rstars
Giasan.vn @rstars
 
A Vietnamese Language Model Based on Recurrent Neural Network
A Vietnamese Language Model Based on Recurrent Neural NetworkA Vietnamese Language Model Based on Recurrent Neural Network
A Vietnamese Language Model Based on Recurrent Neural Network
 
A Vietnamese Language Model Based on Recurrent Neural Network
A Vietnamese Language Model Based on Recurrent Neural NetworkA Vietnamese Language Model Based on Recurrent Neural Network
A Vietnamese Language Model Based on Recurrent Neural Network
 
Large-Scale Geographically Weighted Regression on Spark
Large-Scale Geographically Weighted Regression on SparkLarge-Scale Geographically Weighted Regression on Spark
Large-Scale Geographically Weighted Regression on Spark
 
Recent progress on distributing deep learning
Recent progress on distributing deep learningRecent progress on distributing deep learning
Recent progress on distributing deep learning
 
success factors for project proposals
success factors for project proposalssuccess factors for project proposals
success factors for project proposals
 
GPSinsights poster
GPSinsights posterGPSinsights poster
GPSinsights poster
 
OCR processing with deep learning: Apply to Vietnamese documents
OCR processing with deep learning: Apply to Vietnamese documents OCR processing with deep learning: Apply to Vietnamese documents
OCR processing with deep learning: Apply to Vietnamese documents
 
Paper@Soict2015: GPSInsights: towards a scalable framework for mining massive...
Paper@Soict2015: GPSInsights: towards a scalable framework for mining massive...Paper@Soict2015: GPSInsights: towards a scalable framework for mining massive...
Paper@Soict2015: GPSInsights: towards a scalable framework for mining massive...
 
Deep learning for nlp
Deep learning for nlpDeep learning for nlp
Deep learning for nlp
 
Introduction to BigData @TCTK2015
Introduction to BigData @TCTK2015Introduction to BigData @TCTK2015
Introduction to BigData @TCTK2015
 
From neural networks to deep learning
From neural networks to deep learningFrom neural networks to deep learning
From neural networks to deep learning
 
From decision trees to random forests
From decision trees to random forestsFrom decision trees to random forests
From decision trees to random forests
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filtering
 

Recently uploaded

1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一F La
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 

Recently uploaded (20)

1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 

Nosql data models

  • 1. NoSQL data models Viet-Trung Tran is.hust.edu.vn/~trungtv/ 1  
  • 2. Eras of Databases •  Why NoSQL? 2  
  • 5. ICDE 2005 conference 5   The  last  25  years  of  commercial  DBMS  development  can  be  summed  up  in  a  single  phrase:   "one  size  fits  all".  This  phrase  refers  to  the  fact  that  the  tradi.onal  DBMS  architecture   (originally  designed  and  op.mized  for  business  data  processing)  has  been  used  to  support   many  data-­‐centric  applica.ons  with  widely  varying  characterisCcs  and  requirements.  In  this   paper,  we  argue  that  this  concept  is  no  longer  applicable  to  the  database  market,  and  that  the   commercial  world  will  fracture  into  a  collecCon  of  independent  database  engines,  some  of   which  may  be  unified  by  a  common  front-­‐end  parser.  We  use  examples  from  the  stream-­‐ processing  market  and  the  data-­‐warehouse  market  to  bolster  our  claims.  We  also  briefly   discuss  other  markets  for  which  the  tradiConal  architecture  is  a  poor  fit  and  argue  for  a  criCcal   rethinking  of  the  current  factoring  of  systems  services  into  products.  
  • 10. 10  
  • 11. Why NoSQL •  “The whole point of seeking alternatives [to RDBMS systems] is that you need to solve a problem that relational databases are a bad fit for.” Eric Evans - Rackspace 11  
  • 12. Why NoSQL [cont'd] •  ACID does not scale •  Web applications have different needs –  Scalability –  Elasticity –  Flexible schema/ semi-structured data –  Geographically distributed •  Web applications do not always need –  Transaction –  Strong consistency –  Complex queries 12  
  • 13. NoSQL use cases •  Massive data volume (Big volume) – Google, Amazon, Yahoo, Facebook – 10-100K servers •  Extreme query workload •  Schema evolution 13  
  • 14. Relational data model revisited •  Data is usually stored in row by row manner (row store) •  Standardized query language (SQL) •  Data model defined before you add data •  Joins merge data from multiple tables –  Results are tables •  Pros: Mature ACID transactions with fine- grain security controls, widely used •  Cons: Requires up front data modeling, does not scale well 14   Oracle,  MySQL,  PostgreSQL,   MicrosoP  SQL  Server,  IBM   DB/2    
  • 15. Key/value data model •  Simple key/value interface – GET, PUT, DELETE •  Value can contain any kind of data •  Pros •  Cons •  Berkley DB, Memcache, DynamoDB, Redis, Riak 15  
  • 16. Key/value vs. table •  A table with two columns and a simple interface – Add a key-value – For this key, give me the value – Delete a key •  Super fast and easy to scale (no joins) 16  
  • 19. Memcached •  Open source in-memory key-value caching system •  Make effective use of RAM on many distributed web servers •  Designed to speed up dynamic web applications by alleviating database load –  Simple interface for highly distributed RAM caches –  30ms read times typical •  Designed for quick deployment, ease of development •  APIs in many languages 19  
  • 20. •  Open source in-memory key-value store with optional durability •  Focus on high speed reads and writes of common data structures to RAM •  Allows simple lists, sets and hashes to be stored within the value and manipulated •  Many features that developers like expiration, transactions, pub/sub, partitioning 20  
  • 21. •  Scalable key-value store •  Fastest growing product in Amazon's history •  Focus on throughput on storage and predictable read and write times •  Strong integration with S3 and Elastic MapReduce 21  
  • 22. •  Open source distributed key-value store with support and commercial versions by Basho •  A "Dynamo-inspired" database •  Focus on availability, fault-tolerance, operational simplicity and scalability •  Support for replication and auto-sharding and rebalancing on failures •  Support for MapReduce, fulltext search and secondary indexes of value tags •  Written in ERLANG 22  
  • 23. Column family store •  Dynamic schema, column-oriented data model •  Sparse, distributed persistent multi- dimensional sorted map (row, column (family), timestamp) -> cell contents 23  
  • 24. Column families •  Group columns into "Column families" •  Group column families into "Super-Columns" •  Be able to query all columns with a family or super family •  Similar data grouped together to improve speed 24  
  • 25. Column family data model vs. relational •  Sparse matrix, preserve table structure – One row could have millions of columns but can be very sparse •  Hybrid row/column stores •  Number of columns is extendible – New columns to be inserted without doing an "alter table" 25  
  • 26. Bigtable •  ACM TOCS 2008 •  Fault-tolerant, persistent •  Scalable –  Thousands of servers –  Terabytes of in-memory data –  Petabyte of disk-based data –  Millions of reads/writes per second, efficient scans •  Self-managing –  Servers can be added/ removed dynamically –  Servers adjust to load imbalance 26  
  • 27. •  Open-source Bigtable, written in JAVA •  Part of Apache Hadoop project 27  
  • 29. •  Apache open source column family database •  Supported by DataStax •  Peer-to-peer distribution model •  Strong reputation for linear scale out (millions of writes/ second) •  Written in Java and works well with HDFS and MapReduce 29  
  • 30. Graph data model •  Core abstractions: Nodes, Relationships, Properties on both 30  
  • 31. Graph database (store) •  A database stored data in an explicitly graph structure •  Each node knows its adjacent nodes •  Queries are really graph traversals 31  
  • 32. Compared to Relational Databases OpCmized  for  aggregaCon   OpCmized  for  connecCons  
  • 33. Compared to Key Value Stores OpCmized  for  simple  look-­‐ups   OpCmized  for  traversing  connected  data  
  • 34. Compared to Document Stores OpCmized  for  “trees”  of  data   OpCmized  for  seeing  the  forest  and  the   trees,  and  the  branches,  and  the  trunks  
  • 35. 35  
  • 36. 36  
  • 37. •  Graph database designed to be easy to use by Java developers •  Disk-based (not just RAM) •  Full ACID •  High Availability (with Enterprise Edition) •  32 Billion Nodes, 32 Billion Relationships, 
 64 Billion Properties •  Embedded java library •  REST API 37  
  • 38. Document store •  Documents, not value, not tables •  JSON or XML formats •  Document is identified by ID •  Allow indexing on properties 38  
  • 39. Relational data mapping •  T1–HTML into Objects •  T2–Objects into SQL Tables •  T3–Tables into Objects •  T4–Objects into HTML 39  
  • 40. Web Service in the middle •  T1 – HTML into Java Objects •  T2 – Java Objects into SQL Tables •  T3 – Tables into Objects •  T4 – Objects into HTML •  T5 – Objects to XML •  T6 – XML to Objects 40   T1   T3   T2   T4   Object  Middle   Tier   Relational   Database   Web  Browser   T5   Web  Service   T6  
  • 41. Discussion •  Object-relational mapping has become one of the most complex components of building applications today – Java Hibernate Framework – JPA •  To avoid complexity is to keep your architecture very simple 41  
  • 42. Document mapping •  Documents in the database •  Documents in the application •  No object middle tier •  No "shredding" •  No reassembly •  Simple! 42   ApplicaCon  Layer   Database   Document   Document  
  • 43. •  Open Source JSON data store created by 10gen •  Master-slave scale out model •  Strong developer community •  Sharding built-in, automatic •  Implemented in C++ with many APIs (C++, JavaScript, Java, Perl, Python etc.) 43  
  • 44. •  Apache project •  Open source JSON data store •  Written in ERLANG •  RESTful JSON API •  B-Tree based indexing, shadowing b-tree versioning •  ACID fully supported •  View model •  Data compaction •  Security 44