5. What is a Database?
A database is an organized collection of data stored and accessed
electronically
6. Importance of the right database
▪ Databases are a core part of any application
architecture
▪ Database architecture will determine
▫ How many users can be served by the application
▫ Time it takes to respond to a request
▫ How much downtime the system will have
▪ Migrating Databases is an expensive/time-consuming
operation
8. Questions to ask yourself
▪ Who are my users?
▪ What are my users’ short term and long term goals and
requirements?
▪ Which languages and tools should I be using to meet those
requirements?
▪ How should I grow my team to match my users and their
requirements?
9. Decision Framework
Maintenance
● Maturity
● Maintainability
● Community Support
Functional Requirements
● Query Patterns
● Evolution
● Language Support
Scale
● Scalability
● Latency
● Reliability
● Consistency
11. In Focus Today
Relational DBs
▪ MySQL
▪ SQL Server
▪ Oracle
NoSQL DBs
▪ Cassandra
▪ MongoDB
▪ Neo4j
Evented Datastores
▪ Kafka
▪ Azure EventHubs
12. In Focus Today
▪ Storage structure
▫ How is the data logically stored
▪ Querying support
▫ Ways the data can be accessed
▪ Scalability
▫ How to account for future growth
▪ Use cases
▫ Where to use these databases
14. Relational Databases Overview
▪ “SQL” (Structured Query Language)
Databases
▪ Tables can have (Foreign Key)
relationships with each other
▪ Allow creation of tables with Fixed
Structure
▪ Powerful Query interface with SQL
▪ Can be vertically scaled easily,
horizontal scaling requires more
effort
15. Storage Structure
▪ Foreign Key - Primary Key
Relationship
▪ Columns are assigned a
datatype
▪ Columns can have
constraints
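The storage ideas above (primary key / foreign key relationships, typed columns, column constraints) can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a recommended schema: the table names, column names, and the try_insert helper are all assumptions made up for the example.

```python
import sqlite3


def build_schema(conn):
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        """CREATE TABLE customers (
               id    INTEGER PRIMARY KEY,
               email TEXT NOT NULL UNIQUE
           )"""
    )
    conn.execute(
        """CREATE TABLE orders (
               id           INTEGER PRIMARY KEY,
               customer_id  INTEGER NOT NULL REFERENCES customers(id),
               amount_cents INTEGER NOT NULL CHECK (amount_cents > 0)
           )"""
    )


def try_insert(conn, sql, params):
    """Hypothetical helper: True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
build_schema(conn)

customer_ok = try_insert(
    conn, "INSERT INTO customers (id, email) VALUES (?, ?)", (1, "a@example.com")
)
# Customer 99 does not exist, so the foreign key rejects this order.
orphan_order = try_insert(
    conn, "INSERT INTO orders (customer_id, amount_cents) VALUES (?, ?)", (99, 500)
)
# The CHECK constraint rejects a non-positive amount.
negative_order = try_insert(
    conn, "INSERT INTO orders (customer_id, amount_cents) VALUES (?, ?)", (1, -5)
)
valid_order = try_insert(
    conn, "INSERT INTO orders (customer_id, amount_cents) VALUES (?, ?)", (1, 500)
)
```

Only the first and last inserts are accepted; the database itself refuses the other two, which is the point of declaring constraints in the schema.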
16. Querying Capabilities
▪ SQL
▫ Joins!
▫ Subqueries
▫ Filter, etc.
▪ Object Relational Mapper (ORM)
▪ Mature Language Support
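A small sketch of the SQL features listed above (a join, a subquery, and a filter), again using Python's built-in sqlite3 module. The tables and sample rows are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      INTEGER NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders VALUES (1, 1, 40), (2, 1, 60), (3, 2, 25);
    """
)

# Join: combine rows from both tables to get total spend per customer.
totals = conn.execute(
    """SELECT c.name, SUM(o.amount)
       FROM customers AS c
       JOIN orders AS o ON o.customer_id = c.id
       GROUP BY c.name
       ORDER BY c.name"""
).fetchall()

# Subquery: customers who have never placed an order.
no_orders = conn.execute(
    """SELECT name FROM customers
       WHERE id NOT IN (SELECT customer_id FROM orders)"""
).fetchall()

# Filter: orders of at least 40.
big_orders = conn.execute(
    "SELECT id FROM orders WHERE amount >= 40 ORDER BY id"
).fetchall()
```

None of these questions had to be planned for when the tables were designed, which is what makes the query interface flexible.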
17. Scalability
▪ Vertical Scaling (Bigger
Machines)
▪ Horizontal Scaling
▫ Replicas (same data in
multiple machines)
▫ Sharding (distributing
data)
▫ Joins could suffer
18. Use Cases
▪ Relational Query Requirements
▪ Flexibility in Query Patterns
▫ Data Analytics
▪ Column Constraints Requirements
▫ Transaction Processing
▪ Small to Large Scale*
▫ Boutique Ecommerce Website
*With Sharding and Replicas
20. NoSQL Databases Overview
▪ “NoSQL” is a loaded term that loosely refers to any database
not built around SQL
▪ Started getting popular in the late 2000s
▪ Schema is not pre-defined: “Schemaless” or “Schema-on-Read”
▪ Instead of breaking data into multiple units (or tables), the
data is typically stored together as one.
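A rough sketch of what “Schema-on-Read” means in practice: records are stored as-is, and the reader decides how to interpret them. The field names and the read_product function are assumptions for illustration, not part of any particular database's API.

```python
import json

# Stored records were written at different times and do not share one shape.
raw_records = [
    '{"name": "Lamp", "price": 20}',
    '{"name": "Desk", "price": "150", "tags": ["office"]}',
]


def read_product(raw: str) -> dict:
    """Apply the schema at read time: coerce types and fill in defaults."""
    doc = json.loads(raw)
    return {
        "name": str(doc["name"]),
        "price": float(doc.get("price", 0)),
        "tags": list(doc.get("tags", [])),
    }


products = [read_product(raw) for raw in raw_records]
```

Nothing stopped the second record from storing its price as text; the cost of that freedom is that every reader has to handle the variations.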
21. Storage Structure*
Wide Column DBs
▪ Tables, rows, columns
▪ “Column Families” are stored together
▪ No relationships
▪ No/limited joins
▪ E.g. Cassandra
Document DBs
▪ JSON-like storage structure
▪ Data stored in one nested structure called a “Document”
▪ E.g. MongoDB
*There are other NoSQL stores, such as key-value (Redis), graph (Neo4j), and search
engines (Elastic); we will skip those in today’s conversation
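The two storage shapes can be pictured with plain Python data structures. This is only a mental model under simplifying assumptions: it uses in-memory dicts, not the MongoDB or Cassandra APIs, and every key and field name below is invented.

```python
import json

# Document store: one order, with its customer and items nested inside,
# is kept together as a single JSON-like document.
order_document = {
    "_id": "order-1001",
    "customer": {"name": "Ada", "email": "ada@example.com"},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

# Wide column store: row key -> column family -> columns.
# Columns in the same family are stored together, and rows need not
# have the same columns.
wide_column_table = {
    "user-42": {
        "profile": {"name": "Ada", "country": "UK"},
        "activity": {"last_login": "2024-01-01", "login_count": 3},
    },
    "user-43": {
        "profile": {"name": "Grace"},
    },
}


def read_column_family(table: dict, row_key: str, family: str) -> dict:
    """Fetch one column family for one row; empty if either is missing."""
    return table.get(row_key, {}).get(family, {})


# The whole document round-trips as one unit, with no joins needed.
restored = json.loads(json.dumps(order_document))
total_qty = sum(item["qty"] for item in restored["items"])

ada_profile = read_column_family(wide_column_table, "user-42", "profile")
grace_activity = read_column_family(wide_column_table, "user-43", "activity")
```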
23. Querying Capabilities
▪ Custom Query Languages
▫ Range from SQL-like
(CQL/Cassandra) to very
different
(JavaScript/MongoDB)
▪ REST API Support
▪ Querying primarily dependent
on IDs
▪ Programming Language Support
mileage may vary
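Why querying tends to depend on IDs can be sketched with a toy collection. DocumentCollection is a hypothetical in-memory class, not a real driver; it assumes no secondary indexes exist, so any query on a field other than the ID has to examine every document.

```python
class DocumentCollection:
    def __init__(self):
        self._docs = {}          # documents keyed by their ID
        self.docs_examined = 0   # how many documents queries had to look at

    def insert(self, doc: dict) -> None:
        self._docs[doc["_id"]] = doc

    def get_by_id(self, doc_id: str):
        # Direct lookup: one document examined, regardless of collection size.
        self.docs_examined += 1
        return self._docs.get(doc_id)

    def find(self, **criteria):
        # No secondary index in this sketch, so this is a full scan.
        matches = []
        for doc in self._docs.values():
            self.docs_examined += 1
            if all(doc.get(k) == v for k, v in criteria.items()):
                matches.append(doc)
        return matches


players = DocumentCollection()
for i in range(1000):
    players.insert({"_id": f"player-{i}", "level": i % 50})

by_id = players.get_by_id("player-7")
examined_for_id = players.docs_examined

players.docs_examined = 0
level_7 = players.find(level=7)
examined_for_scan = players.docs_examined
```

In this sketch the ID lookup examines 1 document and the field query examines all 1000. Real NoSQL stores often offer secondary indexes to narrow that gap, with their own trade-offs.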
24. Use Cases
▪ Fixed access patterns: cases where the data is always read and
written together
▪ High Scale and Low latency requirements
▫ Gaming servers
▪ Evolving Data
▫ Product Catalogs
26. Evented Datastores Overview
▪ Also called “Message
Queues”
▪ Storage of “Events”
▪ Event = self-contained, immutable object
with timestamp
▪ “Producers” and
“Consumers” of events
▪ Used in conjunction with
SQL/NoSQL stores
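The producer/consumer model above can be sketched as an append-only log in plain Python. Event, EventLog, Consumer, and produce are hypothetical names for illustration; this is not the Kafka or Azure Event Hubs client API.

```python
import time
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Event:
    """Self-contained and immutable: fields cannot be reassigned after creation."""
    name: str
    payload: str      # e.g. JSON text
    timestamp: float


class EventLog:
    def __init__(self):
        self._events = []

    def append(self, event: Event) -> int:
        """Add an event to the end of the log and return its offset."""
        self._events.append(event)
        return len(self._events) - 1

    def read(self, offset: int, max_events: int = 10):
        return self._events[offset:offset + max_events]


def produce(log: EventLog, name: str, payload: str) -> int:
    """Producer side: stamp the event with the current time and append it."""
    return log.append(Event(name, payload, time.time()))


class Consumer:
    """Each consumer tracks its own position, so consumers do not affect each other."""

    def __init__(self, log: EventLog):
        self.log = log
        self.offset = 0

    def poll(self):
        batch = self.log.read(self.offset)
        self.offset += len(batch)
        return batch


log = EventLog()
for event_name in ("signup", "login", "purchase"):
    produce(log, event_name, "{}")

billing = Consumer(log)
analytics = Consumer(log)
billing_batch = billing.poll()
analytics_batch = analytics.poll()
```

Both consumers receive all three events, because reading does not remove anything from the log; a second poll returns nothing until a producer appends more.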
27. Difference from Conventional Databases
▪ Focus on transient data; data is usually deleted after some time
▪ Query tools mostly focused on offsets (message number)
▪ No support for secondary indexes / other search capabilities
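A minimal sketch of the “transient data” point, assuming a time-based retention policy. RetentionLog is a hypothetical class, and timestamps are passed in explicitly (as seconds) to keep the example deterministic.

```python
from collections import deque


class RetentionLog:
    def __init__(self, retention_seconds: float):
        self.retention_seconds = retention_seconds
        self._events = deque()   # (offset, timestamp, payload), oldest first
        self._next_offset = 0

    def append(self, timestamp: float, payload: str) -> int:
        offset = self._next_offset
        self._events.append((offset, timestamp, payload))
        self._next_offset += 1
        return offset

    def expire(self, now: float) -> None:
        """Drop events older than the retention period, oldest first."""
        while self._events and now - self._events[0][1] > self.retention_seconds:
            self._events.popleft()

    def read_from(self, offset: int):
        # The only query here is "everything at or after this offset":
        # there is no index on payload contents.
        return [(o, p) for o, _t, p in self._events if o >= offset]


log = RetentionLog(retention_seconds=60)
log.append(0, "a")
log.append(30, "b")
log.append(100, "c")

log.expire(now=100)
remaining = log.read_from(0)
```

After expiry only the newest event is left, and it keeps its original offset of 2; offsets are positions in the stream, not row numbers that get reused.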
28. Storage Structure
▪ Data could be in multiple
formats
▫ JSON
▫ Text
▫ Binary
▪ Schema constraints can
sometimes be added
29. Querying Capabilities
▪ Consumers read data by offset, i.e.,
message number; producers append
new messages
▪ Some high level functions
▫ Data manipulations
▫ Aggregation over small
periods of time
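Reading from an offset and aggregating over small periods of time can be sketched as follows. This assumes fixed, non-overlapping (“tumbling”) windows, which is one common choice; read_from and count_per_window are hypothetical helpers, and the log is a plain list of (timestamp, payload) pairs.

```python
from collections import Counter


def read_from(log, offset: int, limit: int = 100):
    """Return up to `limit` messages starting at message number `offset`."""
    return log[offset:offset + limit]


def count_per_window(events, window_seconds: int) -> dict:
    """Count events per fixed time window, keyed by the window's start time."""
    counts = Counter()
    for timestamp, _payload in events:
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(counts)


# (timestamp in seconds, payload)
log = [(0.5, "a"), (3.0, "b"), (9.9, "c"), (10.0, "d"), (25.0, "e")]

all_counts = count_per_window(read_from(log, 0), window_seconds=10)
late_counts = count_per_window(read_from(log, 3), window_seconds=10)
```

With 10-second windows, the first three events fall in the window starting at 0, and the last two fall in the windows starting at 10 and 20.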