SlideShare a Scribd company logo
1 of 28
Download to read offline
Types of Databases
Where to use what ?
So many names and technologies (aka confusing)
Azure data warehouse
Blob
Redis
Cassandra
Druid
Redis
Graphite
MySQL
MemSQL
…. plus 10’s of more options in the market
Break it down
1. How is data stored
Row oriented,
Column oriented,
Sorted string,
Document,
Object store,
Key-value in memory,
Time series
2. Partitioning - Scale up and down
3. Replication - Consistency
4. Atomicity - All or none
5. Isolation - consistent view of data
through the transaction
2. Partitioning
It’s all about Key
Key Data
2. Partitioning
Writing the data
Key Data
Specified as partition key
Or
Generated by system
Range partition (Manual intervention)
Hash partition (Scaling issues)
Consistent hashing (Avoid shuffle during scaling)
Round robin (Even distribution)
State Less
State Full
2. Partitioning
Writing the data
Key Data
Specified as partition key
Or
Generated by system
Partition 1
Partition 2
Partition n
Redirect
Logic
2. Partitioning
Reading it back
Key (?) Data
Partition 1
Partition 2
Partition n
Partitioning key
columns are specified
Partitioning key columns are
NOT specified
Local indexes
Local indexes
Local indexes
2. Partitioning
Reading it back
Key (?) Data
Partition 1
Partition 2
Partition n
Partitioning key
columns are specified
Partitioning key columns are
NOT specified
Local indexes
Local indexes
Local indexes
Process 1
Process 2
Process n
Collect
Render
Output
MPP
Massively parallel processing
3. Replication
Centralised model - Master slave
Round robin
based
partitioning
requires
centralised
metastore to
keep track of
states
3. Replication
Decentralised model - Peer to Peer
How is data stored & CRUD operations
Data format is for the partition of Data
Your replication and partitioning strategy
is Independent of
Storage format
Data storage - Row oriented
Write path
Data storage - Row oriented
Read path
Ordering
of columns
matter
(a,b,c) is
different
from (c,b,a)
Penalty for
updating all index
trees for the table
Statistics refresh
can be deferred -
Hybrids
Data storage - Row oriented - Examples
Row level operations
Multiple types of query searches - by virtue of different indexes
Inefficiencies -
Analytical querying !
Lot of seeks from Disk (Range based queries)
Efficiency(Scan) >>> Efficiency(Seek)
Entire Row is fetched to operate on Few columns
Big drawback for Analytical queries.
Data storage - Column oriented
In-memory table / Memtable
Threshold
Write path
Data storage - Column oriented
Read path
Data storage - Column oriented
Hybrid Hybrid
Efficient for columnar aggregates and joins - Analytical queries
Efficient for filtering data based on condition
Inefficient for frequent updates (causes lot of soft deletes/tombstones)
Inefficient for retrieval of selected few rows
Compaction overheads
Data storage - Sorted String
Immutable concept of Columnar but Storage is Row level
Row
based
data
Threshold
Write path
Data storage - Sorted String
Read path
Data storage - Sorted String
Conceptually SSTable => Segments
Efficient for range based queries - Scan on disk
Low latency Inserts
Peer to peer protocol. Multi datacenter replication.
Inefficient for interleaved reads - filter queries. Potentially traverse complete table.
Inefficient for aggregates and joins
Compaction overheads
Query first approach
Data storage - Key, Value = Doc : Document
Key Data
Data is a Document whose schema can vary.
Usually a json format is standard.
Query ability may be required on certain columns
in the document.
Ability to specify a column within document as key
for partitioning
Data storage - Key, Value = File : Object store
Key Data
Data is a large file.
Query ability is not required.
Eventual consistency is fine.
Metadata layer to provide a file system look and feel
Data storage - Key, Value = Minimal data : Cache
Key Data
Data is in few MB’s
Lightweight data structures used for persisting value
In-memory and fast
Ideal for caching use cases
Hybrid
Data storage - Key, Value = Periodic : Time Series
Key Data
Data captured from
data-stream/device-measurements periodically at
high frequency.
Size of value is not large. Older values should have
the capability to be aggregated and stored
Using concept of
sorted string database
De-Normalisation vs Normalisation
Types of Databases

More Related Content

What's hot

Web Application Development using PHP Chapter 6
Web Application Development using PHP Chapter 6Web Application Development using PHP Chapter 6
Web Application Development using PHP Chapter 6Mohd Harris Ahmad Jaal
 
ASP.NET Session 7
ASP.NET Session 7ASP.NET Session 7
ASP.NET Session 7Sisir Ghosh
 
File organization in database
File organization in databaseFile organization in database
File organization in databaseAfrasiyab Haider
 
Object Relational Database Management System(ORDBMS)
Object Relational Database Management System(ORDBMS)Object Relational Database Management System(ORDBMS)
Object Relational Database Management System(ORDBMS)Rabin BK
 
358 33 powerpoint-slides_16-files-their-organization_chapter-16
358 33 powerpoint-slides_16-files-their-organization_chapter-16358 33 powerpoint-slides_16-files-their-organization_chapter-16
358 33 powerpoint-slides_16-files-their-organization_chapter-16sumitbardhan
 
Indexing and hashing
Indexing and hashingIndexing and hashing
Indexing and hashingJeet Poria
 
File Types in Data Structure
File Types in Data StructureFile Types in Data Structure
File Types in Data StructureProf Ansari
 
Deriving an Emergent Relational Schema from RDF Data
Deriving an Emergent Relational Schema from RDF DataDeriving an Emergent Relational Schema from RDF Data
Deriving an Emergent Relational Schema from RDF DataGraph-TA
 
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...stepheneisenhauer
 
Intro databases (Table, Record, Field)
Intro databases (Table, Record, Field)Intro databases (Table, Record, Field)
Intro databases (Table, Record, Field)Maryam Fida
 
Dynamic multi level indexing Using B-Trees And B+ Trees
Dynamic multi level indexing Using B-Trees And B+ TreesDynamic multi level indexing Using B-Trees And B+ Trees
Dynamic multi level indexing Using B-Trees And B+ TreesPooja Dixit
 
Bdam presentation on parquet
Bdam presentation on parquetBdam presentation on parquet
Bdam presentation on parquetManpreet Khurana
 
RDF Graph Data Management in Oracle Database and NoSQL Platforms
RDF Graph Data Management in Oracle Database and NoSQL PlatformsRDF Graph Data Management in Oracle Database and NoSQL Platforms
RDF Graph Data Management in Oracle Database and NoSQL PlatformsGraph-TA
 
Managing RDF data with graph databases
Managing RDF data with graph databasesManaging RDF data with graph databases
Managing RDF data with graph databasesGraph-TA
 
Lecture12 abap on line
Lecture12 abap on lineLecture12 abap on line
Lecture12 abap on lineMilind Patil
 
Indexing and Hashing
Indexing and HashingIndexing and Hashing
Indexing and Hashingsathish sak
 
StaTIX - Statistical Type Inference on Linked Data
StaTIX - Statistical Type Inference on Linked DataStaTIX - Statistical Type Inference on Linked Data
StaTIX - Statistical Type Inference on Linked DataArtem Lutov
 

What's hot (20)

Web Application Development using PHP Chapter 6
Web Application Development using PHP Chapter 6Web Application Development using PHP Chapter 6
Web Application Development using PHP Chapter 6
 
ASP.NET Session 7
ASP.NET Session 7ASP.NET Session 7
ASP.NET Session 7
 
File organization in database
File organization in databaseFile organization in database
File organization in database
 
Object Relational Database Management System(ORDBMS)
Object Relational Database Management System(ORDBMS)Object Relational Database Management System(ORDBMS)
Object Relational Database Management System(ORDBMS)
 
358 33 powerpoint-slides_16-files-their-organization_chapter-16
358 33 powerpoint-slides_16-files-their-organization_chapter-16358 33 powerpoint-slides_16-files-their-organization_chapter-16
358 33 powerpoint-slides_16-files-their-organization_chapter-16
 
Indexing and hashing
Indexing and hashingIndexing and hashing
Indexing and hashing
 
File Types in Data Structure
File Types in Data StructureFile Types in Data Structure
File Types in Data Structure
 
Deriving an Emergent Relational Schema from RDF Data
Deriving an Emergent Relational Schema from RDF DataDeriving an Emergent Relational Schema from RDF Data
Deriving an Emergent Relational Schema from RDF Data
 
Lect 8 updated (1)
Lect 8 updated (1)Lect 8 updated (1)
Lect 8 updated (1)
 
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
 
Intro databases (Table, Record, Field)
Intro databases (Table, Record, Field)Intro databases (Table, Record, Field)
Intro databases (Table, Record, Field)
 
Dynamic multi level indexing Using B-Trees And B+ Trees
Dynamic multi level indexing Using B-Trees And B+ TreesDynamic multi level indexing Using B-Trees And B+ Trees
Dynamic multi level indexing Using B-Trees And B+ Trees
 
Bdam presentation on parquet
Bdam presentation on parquetBdam presentation on parquet
Bdam presentation on parquet
 
RDF Graph Data Management in Oracle Database and NoSQL Platforms
RDF Graph Data Management in Oracle Database and NoSQL PlatformsRDF Graph Data Management in Oracle Database and NoSQL Platforms
RDF Graph Data Management in Oracle Database and NoSQL Platforms
 
Database management system session 6
Database management system session 6Database management system session 6
Database management system session 6
 
Managing RDF data with graph databases
Managing RDF data with graph databasesManaging RDF data with graph databases
Managing RDF data with graph databases
 
Ardbms
ArdbmsArdbms
Ardbms
 
Lecture12 abap on line
Lecture12 abap on lineLecture12 abap on line
Lecture12 abap on line
 
Indexing and Hashing
Indexing and HashingIndexing and Hashing
Indexing and Hashing
 
StaTIX - Statistical Type Inference on Linked Data
StaTIX - Statistical Type Inference on Linked DataStaTIX - Statistical Type Inference on Linked Data
StaTIX - Statistical Type Inference on Linked Data
 

Similar to Types of Databases

Ch 7 Physical D B Design
Ch 7  Physical D B  DesignCh 7  Physical D B  Design
Ch 7 Physical D B Designguest8fdbdd
 
Overview of MongoDB and Other Non-Relational Databases
Overview of MongoDB and Other Non-Relational DatabasesOverview of MongoDB and Other Non-Relational Databases
Overview of MongoDB and Other Non-Relational DatabasesAndrew Kandels
 
Database Systems - A Historical Perspective
Database Systems - A Historical PerspectiveDatabase Systems - A Historical Perspective
Database Systems - A Historical PerspectiveKaroly K
 
Learning Cassandra NoSQL
Learning Cassandra NoSQLLearning Cassandra NoSQL
Learning Cassandra NoSQLPankaj Khattar
 
Apache ignite as in-memory computing platform
Apache ignite as in-memory computing platformApache ignite as in-memory computing platform
Apache ignite as in-memory computing platformSurinder Mehra
 
NoSQL(NOT ONLY SQL)
NoSQL(NOT ONLY SQL)NoSQL(NOT ONLY SQL)
NoSQL(NOT ONLY SQL)Rahul P
 
vFabric SQLFire for high performance data
vFabric SQLFire for high performance datavFabric SQLFire for high performance data
vFabric SQLFire for high performance dataVMware vFabric
 
Indy pass writing efficient queries – part 1 - indexing
Indy pass   writing efficient queries – part 1 - indexingIndy pass   writing efficient queries – part 1 - indexing
Indy pass writing efficient queries – part 1 - indexingeddiew
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internalsnarsiman
 
Mba admission in india
Mba admission in indiaMba admission in india
Mba admission in indiaEdhole.com
 
Dipping Your Toes: Azure Data Lake for DBAs
Dipping Your Toes: Azure Data Lake for DBAsDipping Your Toes: Azure Data Lake for DBAs
Dipping Your Toes: Azure Data Lake for DBAsBob Pusateri
 
Performance and predictability
Performance and predictabilityPerformance and predictability
Performance and predictabilityRichardWarburton
 
Apache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data modelApache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data modelAndrey Lomakin
 
Modeling data and best practices for the Azure Cosmos DB.
Modeling data and best practices for the Azure Cosmos DB.Modeling data and best practices for the Azure Cosmos DB.
Modeling data and best practices for the Azure Cosmos DB.Mohammad Asif
 
Basics of Distributed Systems - Distributed Storage
Basics of Distributed Systems - Distributed StorageBasics of Distributed Systems - Distributed Storage
Basics of Distributed Systems - Distributed StorageNilesh Salpe
 
Time series database by Harshil Ambagade
Time series database by Harshil AmbagadeTime series database by Harshil Ambagade
Time series database by Harshil AmbagadeSigmoid
 

Similar to Types of Databases (20)

Ch 7 Physical D B Design
Ch 7  Physical D B  DesignCh 7  Physical D B  Design
Ch 7 Physical D B Design
 
Overview of MongoDB and Other Non-Relational Databases
Overview of MongoDB and Other Non-Relational DatabasesOverview of MongoDB and Other Non-Relational Databases
Overview of MongoDB and Other Non-Relational Databases
 
OldSQL to NewSQL
OldSQL to NewSQL OldSQL to NewSQL
OldSQL to NewSQL
 
Database Systems - A Historical Perspective
Database Systems - A Historical PerspectiveDatabase Systems - A Historical Perspective
Database Systems - A Historical Perspective
 
Learning Cassandra NoSQL
Learning Cassandra NoSQLLearning Cassandra NoSQL
Learning Cassandra NoSQL
 
Apache ignite as in-memory computing platform
Apache ignite as in-memory computing platformApache ignite as in-memory computing platform
Apache ignite as in-memory computing platform
 
NoSQL(NOT ONLY SQL)
NoSQL(NOT ONLY SQL)NoSQL(NOT ONLY SQL)
NoSQL(NOT ONLY SQL)
 
vFabric SQLFire for high performance data
vFabric SQLFire for high performance datavFabric SQLFire for high performance data
vFabric SQLFire for high performance data
 
Indy pass writing efficient queries – part 1 - indexing
Indy pass   writing efficient queries – part 1 - indexingIndy pass   writing efficient queries – part 1 - indexing
Indy pass writing efficient queries – part 1 - indexing
 
Vault2016
Vault2016Vault2016
Vault2016
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internals
 
Data storage and indexing
Data storage and indexingData storage and indexing
Data storage and indexing
 
Mba admission in india
Mba admission in indiaMba admission in india
Mba admission in india
 
Dipping Your Toes: Azure Data Lake for DBAs
Dipping Your Toes: Azure Data Lake for DBAsDipping Your Toes: Azure Data Lake for DBAs
Dipping Your Toes: Azure Data Lake for DBAs
 
Performance and predictability
Performance and predictabilityPerformance and predictability
Performance and predictability
 
SQLServer Database Structures
SQLServer Database Structures SQLServer Database Structures
SQLServer Database Structures
 
Apache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data modelApache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data model
 
Modeling data and best practices for the Azure Cosmos DB.
Modeling data and best practices for the Azure Cosmos DB.Modeling data and best practices for the Azure Cosmos DB.
Modeling data and best practices for the Azure Cosmos DB.
 
Basics of Distributed Systems - Distributed Storage
Basics of Distributed Systems - Distributed StorageBasics of Distributed Systems - Distributed Storage
Basics of Distributed Systems - Distributed Storage
 
Time series database by Harshil Ambagade
Time series database by Harshil AmbagadeTime series database by Harshil Ambagade
Time series database by Harshil Ambagade
 

Recently uploaded

Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxchadhar227
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制vexqp
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1ranjankumarbehera14
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...nirzagarg
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareGraham Ware
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...gajnagarg
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubaikojalkojal131
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样wsppdmt
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdftheeltifs
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...nirzagarg
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Klinik kandungan
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjurptikerjasaptiker
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxThe-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxVivek487417
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...gajnagarg
 

Recently uploaded (20)

Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdf
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxThe-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 

Types of Databases

  • 1. Types of Databases Where to use what ?
  • 2.
  • 3. So many names and technologies (aka confusing) Azure data warehouse Blob Redis Cassandra Druid Redis Graphite MySQL MemSQL …. plus 10’s of more options in the market
  • 4. Break it down 1. How is data stored Row oriented, Column oriented, Sorted string, Document, Object store, Key-value in memory, Time series 2. Partitioning - Scale up and down 3. Replication - Consistency 4. Atomicity - All or none 5. Isolation - consistent view of data through the transaction
  • 5. 2. Partitioning It’s all about Key Key Data
  • 6. 2. Partitioning Writing the data Key Data Specified as partition key Or Generated by system Range partition (Manual intervention) Hash partition (Scaling issues) Consistent hashing (Avoid shuffle during scaling) Round robin (Even distribution) State Less State Full
  • 7. 2. Partitioning Writing the data Key Data Specified as partition key Or Generated by system Partition 1 Partition 2 Partition n Redirect Logic
  • 8. 2. Partitioning Reading it back Key (?) Data Partition 1 Partition 2 Partition n Partitioning key columns are specified Partitioning key columns are NOT specified Local indexes Local indexes Local indexes
  • 9. 2. Partitioning Reading it back Key (?) Data Partition 1 Partition 2 Partition n Partitioning key columns are specified Partitioning key columns are NOT specified Local indexes Local indexes Local indexes Process 1 Process 2 Process n Collect Render Output MPP Massively parallel processing
  • 10. 3. Replication Centralised model - Master slave Round robin based partitioning requires centralised metastore to keep track of states
  • 12. How is data stored & CRUD operations Data format is for the partition of Data Your replication and partitioning strategy is Independent of Storage format
  • 13. Data storage - Row oriented Write path
  • 14. Data storage - Row oriented Read path Ordering of columns matter (a,b,c) is different from (c,b,a) Penalty for updating all index trees for the table Statistics refresh can be deferred - Hybrids
  • 15. Data storage - Row oriented - Examples Row level operations Multiple types of query searches - by virtue of different indexes Inefficiencies -
  • 16. Analytical querying ! Lot of seeks from Disk (Range based queries) Efficiency(Scan) >>> Efficiency(Seek) Entire Row is fetched to operate on Few columns Big drawback for Analytical queries.
  • 17. Data storage - Column oriented In-memory table / Memtable Threshold Write path
  • 18. Data storage - Column oriented Read path
  • 19. Data storage - Column oriented Hybrid Hybrid Efficient for columnar aggregates and joins - Analytical queries Efficient for filtering data based on condition Inefficient for frequent updates (causes lot of soft deletes/tombstones) Inefficient for retrieval of selected few rows Compaction overheads
  • 20. Data storage - Sorted String Immutable concept of Columnar but Storage is Row level Row based data Threshold Write path
  • 21. Data storage - Sorted String Read path
  • 22. Data storage - Sorted String Conceptually SSTable => Segments Efficient for range based queries - Scan on disk Low latency Inserts Peer to peer protocol. Multi datacenter replication. Inefficient for interleaved reads - filter queries. Potentially traverse complete table. Inefficient for aggregates and joins Compaction overheads Query first approach
  • 23. Data storage - Key, Value = Doc : Document Key Data Data is a Document whose schema can vary. Usually a json format is standard. Query ability may be required on certain columns in the document. Ability to specify a column within document as key for partitioning
  • 24. Data storage - Key, Value = File : Object store Key Data Data is a large file. Query ability is not required. Eventual consistency is fine. Metadata layer to provide a file system look and feel
  • 25. Data storage - Key, Value = Minimal data : Cache Key Data Data is in few MB’s Lightweight data structures used for persisting value In-memory and fast Ideal for caching use cases Hybrid
  • 26. Data storage - Key, Value = Periodic : Time Series Key Data Data captured from data-stream/device-measurements periodically at high frequency. Size of value is not large. Older values should have the capability to be aggregated and stored Using concept of sorted string database