Database Systems
A Historical Perspective
Károly Kálmán
June 22, 2023
Topics Covered
Historical databases
Relational databases
Non-relational databases
Future directions
Historical Databases
Historical Databases (No Database)
All data is stored in memory
It's a start
✔ Fast
✔ Store anything in any format
✖ No persistent and durable storage
Historical Databases (Flat File)
Ted Scott ▫ $100 ▫ Apple ☷ Ai Joe ▫ $900 ▫ Peach ☷
◺       ◿        ↑       ↑
  field          │       │
  value          │       └─ record separator
                 └─ field separator
✔ Persistent
✔ Store anything (records can be different)
✖ Low-level access, programmer needed
✖ Complex queries are hard and slow
➤ Today: for small data sets in some domains
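The "low-level access" drawback can be illustrated with a minimal sketch. It assumes the slide's ▫ and ☷ symbols are the literal field and record separators; the function names are hypothetical and not from the talk.

```python
# Separators taken from the slide's drawing (an assumption about the real format).
FIELD_SEP = "▫"
RECORD_SEP = "☷"


def parse_flat_file(text):
    """Split raw text into records, and each record into stripped field values."""
    records = []
    for raw in text.split(RECORD_SEP):
        if raw.strip():  # skip the empty tail after the last record separator
            records.append([field.strip() for field in raw.split(FIELD_SEP)])
    return records


def find_by_first_field(records, value):
    """Even a simple lookup is a full scan: no index and no query language."""
    return [record for record in records if record and record[0] == value]


data = "Ted Scott ▫ $100 ▫ Apple ☷ Ai Joe ▫ $900 ▫ Peach ☷"
records = parse_flat_file(data)
print(records)
print(find_by_first_field(records, "Ai Joe"))
```

Every new question about the data needs another hand-written function like `find_by_first_field`, which is why a programmer is needed.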
Historical Databases (Hierarchical)
CTO
╱ ╲
Head1 Head2
╱ ╲
Mngr1 Mngr2
✔ Defined structure
✔ Faster than flat file
✖ Navigation through the hierarchy only (up-down)
✖ "Programmer perspective" needed
➤ Today: LDAP, Active Directory
Historical Databases (Navigational)
John ── Alice ── Maggie Rob
│ │ │ │
Richard ── Scott Susie ── Nancy
✔ Relaxed navigation
✔ Very fast
✖ Still pre-determined navigation (no ad hoc queries)
✖ "Programmer perspective" needed
➤ Today: IBM Information Management System v15
Relational Databases
Relational Database Management System (RDBMS)
E. F. Codd in 1970 (IBM)
Relational model of data
Based on formal (math) rules
Optimal database design (normal forms, NF)
Data access optimization
User friendly
Very popular
MySQL, Oracle, MS SQL, Sybase, MS Access, etc.
RDBMS Concepts (Table)
Database = tables + table cross-reference + keys
RDBMS Concepts (Keys)
Primary Key (PK) is a unique identifier for an entity
Keys are needed to make relations
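A small sketch of how keys create relations, using Python's built-in sqlite3 module. The `children` table, its columns and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE families (id INTEGER PRIMARY KEY, s_name TEXT)")
conn.execute(
    "CREATE TABLE children ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " family_id INTEGER REFERENCES families(id))"
)
conn.execute("INSERT INTO families VALUES (1, 'Zimmer')")
conn.execute("INSERT INTO children VALUES (10, 'Anna', 1)")

# The foreign key lets a query cross-reference the two tables.
rows = conn.execute(
    "SELECT c.name, f.s_name FROM children c JOIN families f ON c.family_id = f.id"
).fetchall()
print(rows)

# A second row with the same primary key is rejected.
try:
    conn.execute("INSERT INTO families VALUES (1, 'Vogler')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```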
Structured Query Language (SQL)
SQL = Structured Query Language
ANSI Standard
Declarative language
Focus on what to do, not how to do it
User friendly
Abstractions for non-programmers
English-like language
Pure SQL applications (MS Access)
Not fancy, but no programming needed
Structured Query Language (Table Operations)
Create table
CREATE TABLE families (f_name varchar(50), s_name varchar(50), id int);
Modify table (add column)
ALTER TABLE families ADD COLUMN child_name varchar(50);
Delete table
DROP TABLE families;
DROP TABLE = NoSQL :)
Structured Query Language (Data Operations)
Insert new data
INSERT INTO families VALUES ('Philip', 'Zimmer', 3);
Query for data
SELECT f_name, s_name FROM families WHERE id > 2;
Modify existing data
UPDATE families SET f_name = 'Jonas' WHERE f_name = 'Jhn';
Structured Query Language (Transaction)
Transaction
Multiple operations treated as a single unit of work
Either all operations succeed or all fail
Example
BEGIN TRANSACTION;
INSERT INTO families VALUES ('Philip', 'Zimmer', 3);
INSERT INTO families VALUES ('Hans', 'Vogler', 347);
COMMIT;
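The all-or-nothing behaviour can be observed with a short sketch using Python's sqlite3 module. To force a failure, it assumes `id` is a primary key and reuses the value 3 for the second row; both are changes made for the demonstration, not part of the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE families (f_name TEXT, s_name TEXT, id INTEGER PRIMARY KEY)")

try:
    with conn:  # commits if the block succeeds, rolls back if it raises
        conn.execute("INSERT INTO families VALUES ('Philip', 'Zimmer', 3)")
        # Duplicate id (assumption for the demo): this statement fails.
        conn.execute("INSERT INTO families VALUES ('Hans', 'Vogler', 3)")
except sqlite3.IntegrityError:
    pass

# The first insert was rolled back together with the failed one.
count = conn.execute("SELECT COUNT(*) FROM families").fetchone()[0]
print(count)
```

Although the first insert succeeded on its own, the table stays empty because both statements belong to one unit of work.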
ACID Model
ACID defines who sees what changes and when
ACID transaction control properties
Atomic: all operations succeed, or everything rolls back to the prior state
Consistent: database is in a valid state when the transaction finishes
Isolated: concurrent transactions do not affect one another
4 isolation levels (speed vs consistency)
Durable: committed results are permanent, even after a failure
Typical 3-tier System Architecture
RDBMS Drawbacks
Scaling is hard (ACID)
Expensive
'Free' solutions are not mature enough for 9...9% availability
Non-structured data is hard to store
NoSQL to the rescue
For the majority of uses, an RDBMS is enough
Distributed Databases
Multiple database servers
Data duplicated
Performance/availability increases
But complexity too
Distributed Databases (Replicas)
Master-Slave
Master serves r/w and replicates data to slaves
Slaves serve reads only
Master-Master
Multiple masters that serve r/w
Replication between masters
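A toy sketch of the master-slave setup, assuming replication runs asynchronously as a separate step. The class names `Primary` and `Replica` are hypothetical stand-ins for master and slave.

```python
class Replica:
    """Read-only copy; it changes only when the primary replicates to it."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class Primary:
    """Serves reads and writes, and ships pending changes to its replicas."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = []  # changes not yet shipped to the replicas

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))

    def read(self, key):
        return self.data.get(key)

    def replicate(self):
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()


replica = Replica()
primary = Primary([replica])
primary.write("color", "Black")
stale = replica.read("color")  # replication has not run yet
primary.replicate()
fresh = replica.read("color")
print(stale, fresh)
```

The gap between `stale` and `fresh` is the replication lag that readers of a replica have to tolerate.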
Distributed Databases (Sharding and Federation)
Sharding
Break data into smaller chunks by key
Store chunks on different servers
Federation
Databases by domain functions
No single monolith database
Query impact (linking tables)
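Sharding by key can be sketched as a routing function. The example below assumes simple hash-modulo placement and made-up server names; real systems often use more elaborate schemes.

```python
import hashlib

SHARDS = ["server-0", "server-1", "server-2"]  # hypothetical server names


def shard_for(key):
    """Pick a shard from a stable hash of the key.

    hashlib is used because Python's built-in hash() of strings is salted per process.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


# One in-memory dictionary stands in for each server's storage.
stores = {name: {} for name in SHARDS}


def put(key, value):
    stores[shard_for(key)][key] = value


def get(key):
    return stores[shard_for(key)].get(key)


put("family:3", "Zimmer")
put("family:347", "Vogler")
print(get("family:3"), get("family:347"))
```

A lookup by key touches exactly one server, while a query that links data across keys may have to visit all of them.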
Data Warehouse
Current and historical data
Store structured data (schema)
Query focused (business analytics)
Large and central data store
Data Mart
Specific views by business departments
Based on data warehouse
Multiple data marts, not a single monolith
More summarized than data warehouse
Data Lake
Central location for all data
Store raw data (no schema)
Purpose of data is not defined
Data science
Data Pipeline
Process to move or transform data between systems
...
Data Mesh
Architectural pattern
Data ownership and distribution
Analytical data (optimizing the business)
Historical and aggregated view
Operational data (running the business)
Current and transactional state
Non-Relational Databases
(NoSQL)
NoSQL Databases
Schema/structure definition is optional
Store anything (mix data in collections)
Need to know major use cases before design
Performance
Very good for expected use cases
Bad for unexpected use cases
Varied transaction support (eventual consistency, quorum)
Query language complexities
Scalable distributed systems
Consistency Models
When a reader sees a system change
Weak
Reader may or may not ever see the change
Eventual
Reader will eventually see the change
Strong
Reader sees the change immediately
CAP Theorem
Eric Brewer (conjecture, presented at PODC 2000)
CAP theorem (Reliability)
Consistency: a read receives the most recent data or an error
Availability: a request receives a (non-error) response, possibly with old data
Partition tolerance: system keeps operating when the network is not reliable
Choose two (but P is effectively mandatory)
Some systems support configurable CAP modes
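One common way such configurable modes show up is as quorum sizes for reads and writes. The sketch below is a simplified model, not any specific product's API: with N replicas, W write acknowledgements and R replicas read, a read is guaranteed to overlap the latest write when R + W > N. It deliberately picks the worst-case replicas for the read.

```python
N = 3  # number of replicas


def make_replicas():
    return [{"version": 0, "value": None} for _ in range(N)]


def write(replicas, value, version, w):
    """Update only the first w replicas; the others lag behind."""
    for replica in replicas[:w]:
        replica["version"] = version
        replica["value"] = value


def read(replicas, r_count):
    """Ask the last r_count replicas (worst case) and return the newest value seen."""
    asked = replicas[-r_count:]
    newest = max(asked, key=lambda replica: replica["version"])
    return newest["value"]


replicas = make_replicas()
write(replicas, "new", version=1, w=2)
strong = read(replicas, r_count=2)  # 2 + 2 > 3: the read set overlaps the write set
weak = read(replicas, r_count=1)    # 1 + 2 = 3: the read can miss the write
print(strong, weak)
```

Smaller quorums answer faster and stay available with fewer reachable nodes, at the price of possibly stale reads.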
BASE Model
Similar to ACID, but for NoSQL
BASE model properties
Basically Available: system guarantees availability
Soft state: system state may change over time, even with no input
Eventual consistency: system becomes consistent over time, provided no new input is received
NoSQL Databases (Historical)
XML Database
Wasteland
Object Store
Programmer's database
NoSQL Databases (Key-Value)
123 ↠ firstName = "Arthur" ⌁ surName = "Legend"
8874 ↠ color = "Black" ⌁ make = "Ford"
Very Fast
Simple to use
Access by keys only
Caches (Infinispan, Redis, Memcached, Ignite, etc.)
NoSQL Databases (Document I.)
Store JSON structured data
Documents can have different fields
{ ⌲ Document 1 Start
name: { ⌲ Complex field
first: "John" ⌲ Simple field
last: "Dee"
}
birth: "2/2/1982" ⌲ Document 1 field (only)
} ⌲ Document 1 End
{ ⌲ Document 2 Start
fullName : ⌲ Simple field
"James Doe"
} ⌲ Document 2 End
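The two documents above can be modelled as plain Python dictionaries to show what "documents can have different fields" means for queries. The `find` helper and its dotted-path syntax are hypothetical, loosely in the style of document stores.

```python
documents = [
    {"name": {"first": "John", "last": "Dee"}, "birth": "2/2/1982"},
    {"fullName": "James Doe"},
]


def find(docs, path, value):
    """Return documents whose dotted path (e.g. 'name.first') equals value."""
    hits = []
    for doc in docs:
        node = doc
        for part in path.split("."):
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:  # the field is missing in this document
                break
        if node == value:
            hits.append(doc)
    return hits


print(find(documents, "name.first", "John"))
print(find(documents, "fullName", "James Doe"))
print(find(documents, "birth", "1/1/1990"))
```

A document without the requested field simply does not match; no schema error is raised.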
NoSQL Databases (Document II.)
Effective document (text) store
Free-text search engine
Documents are JSON based
Various query formats
Varied transaction support (single doc.)
Couchbase, Elasticsearch, MongoDB, etc.
NoSQL Databases (Wide Column I.)
Rows (keys) with many (~1000) columns
Write optimized (call logs, bank transactions, etc.)
SQL-like query language
Limited ACID support
Heavyweight systems
HBase, Cassandra, etc.
NoSQL Databases (Wide Column II.)
NoSQL Databases (Graph I.)
Based on directed graph
Nodes, properties and relations
Replacement for complex relational models
High-level query language
ACID transactions
Neo4j (Cypher), GraphDB (SPARQL), etc.
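A minimal sketch of the directed-graph model in plain Python, without a graph database. The nodes, the `KNOWS` relation and the `reachable` helper are invented for illustration.

```python
from collections import deque

# Hypothetical directed graph: node -> list of (relation, target) edges.
edges = {
    "John": [("KNOWS", "Alice")],
    "Alice": [("KNOWS", "Maggie")],
    "Maggie": [],
}


def reachable(start, relation):
    """Breadth-first walk that follows only edges with the given relation."""
    seen = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for rel, target in edges.get(node, []):
            if rel == relation and target != start and target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


print(reachable("John", "KNOWS"))
```

In a relational model the same question needs one self-join per hop, which is where graph stores tend to be more convenient.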
NoSQL Databases (Graph II.)
NoSQL Databases (Time I.)
Data points (measurements) over a time interval
Regular intervals (metrics)
Irregular intervals (events)
Data is more useful as an aggregate (continuous queries)
SQL-like query language with time-related additions
No transaction concept
PK is a high-precision timestamp
Data modification is rare (append only)
InfluxDB, Kdb+, Prometheus, etc.
NoSQL Databases (Time II.)
Example measurement:
weather,location=us-midwest temperature=82 144488740
   |    ─────────┬───────── ──────┬───────     |
measurement     tag             field      timestamp
measurement ≈ table
tag ≈ indexed field
field ≈ not indexed field
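The structure of the example measurement can be made concrete with a small parser. This is a simplified sketch: it assumes no escaped spaces or commas and only numeric field values, and `parse_point` is a hypothetical helper rather than part of any client library.

```python
def parse_point(line):
    """Parse 'measurement,tag=v field=v timestamp' into its four parts."""
    head, fields_part, timestamp = line.split(" ")
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = {
        key: float(value)
        for key, value in (pair.split("=", 1) for pair in fields_part.split(","))
    }
    return measurement, tags, fields, int(timestamp)


point = parse_point("weather,location=us-midwest temperature=82 144488740")
print(point)
```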
NoSQL Databases (Computing Grid)
Calculations performed in a computing grid
Move program logic to data, not the other way around
Ignite, Infinispan, etc.
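"Move program logic to data" can be sketched as follows. The node names, the partition contents and `run_on_nodes` are hypothetical, and the function call stands in for shipping code over the network.

```python
# Hypothetical grid: each node holds one partition of the data.
partitions = {
    "node-a": [100, 900],
    "node-b": [250],
}


def run_on_nodes(task):
    """Send the function to every node; only the small per-node results travel back."""
    return {node: task(rows) for node, rows in partitions.items()}


partial_sums = run_on_nodes(sum)
total = sum(partial_sums.values())
print(partial_sums, total)
```

Only one number per node crosses the network instead of every row.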
NoSQL Drawbacks
Operational/developer experience needed
Complex Infrastructure
Planned usage drives database design
Data de-normalization might be needed (!)
ACID/BASE compliance varies
Complex queries can be hard
Large distributed systems are always in a state of partial failure
>> Distributed systems are hard <<
NoSQL and RDBMS
Term          NoSQL     RDBMS
Consistency   Weak      Strong
Performance   Varies    Good
Language      Custom    SQL
DevOps        Complex   Simpler
Node Count    >3        1
Scalability   Good      Poor
>> Use whatever is the best for the problem <<
Popular Database Combinations (2019)
A Distributed System Example
Future Directions
Cloud (hosted database)
Hybrid (multiple NoSQL modes)
NewSQL (SQL ↔ NoSQL convergence)
NIC/RDMA (cross memory access)
RAM Store (very fast)
FPGA (custom hardware)
Thank You!
Questions?
github.com/kk-sw
slideshare.net/kksw1/presentations
linkedin.com/in/károly-k-java/
kksw.nhely.hu
