SQL Database Design
For Developers
Scott Keck-Warren
php[tek] 2024
@scottKeckWarren@phpc.social
1. Never Worked With a DBA On a Project
2. Every Project Has At Least One SQL Database
My (Early) Relationship to SQL Database:
Less Than Ideal
How Did I Get Better?
Trial By Fire
Problems
• “Random” slowness
• Data inconsistencies
• Weird Bugs
Scott Keck-Warren
Director of Technology
@ WeCare Connect
Scott Keck-Warren
PHP Developer
Scott Keck-Warren
Content Creator
@php[architect] YouTube
Community Corner Podcast
Scott’s Rules For Database Design
1. Normalize Your Database For Data Deduplication
2. Use The Database Engine to Keep Data Clean
3. Proactively Add Indexes to Keep Queries Performant
Users Table
• Email address
• Password
• Active state
• Hire Date
• Listing of previous passwords
• Office Name
• Office City
• Office Zip
Users Table
• Email address (string)
• Password (string)
• Active state (string)
• Hire Date (string)
• Listing of previous passwords (string)
• Office Name (string)
• Office City (string)
• Office Zip (string)
Users Table
email              password  active  hire_date     previous_password    office_name  office_phone   office_city  office_zip
alice@example.com  hash1     1       1/1/2024      hash1, hash5, hash6  Main Office  555-555-5555   Saginaw      48609
avery@example.com  NULL      1       8/11/2024     hash2, hash7, hash8  main office  5555555555     Saginaw      48609
scott@example.com  hash3     1       May 11th, 23  hash3                Man office   (555)555-5555  Saginaw      48609
scott@example.com  hash4     1       Tuesday       hash4                Main         555/555/5555   Saginaw      48609
Normalize Your Database For Data Deduplication
“[T]he process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity.”
- “Database normalization” on Wikipedia
• UNF: Unnormalized form
• 1NF: First normal form
• 2NF: Second normal form
• 3NF: Third normal form
• EKNF: Elementary key normal form
• BCNF: Boyce–Codd normal form
• 4NF: Fourth normal form
• ETNF: Essential tuple normal form
• 5NF: Fifth normal form
• DKNF: Domain-key normal form
• 6NF: Sixth normal form
• Boyce–Codd Normal Form:
  • X should be a superkey for every functional dependency (FD) X -> Y in a given relation.
Unnormalized Form
• The table doesn’t meet any of the conditions of normalization
• Essentially a spreadsheet
First Normal Form (1NF)
1. The table contains a unique identifier, also called the primary key, that is used to identify the row.
2. Each column contains atomic values (values that cannot be broken down).
1NF - users
• A unique identifier should be one of:
  • Auto-incrementing int
  • UUID
id  email              password  active  hire_date     previous_password    office_name  office_phone   office_city  office_zip
1   alice@example.com  hash1     1       1/1/2024      hash1, hash5, hash6  Main Office  555-555-5555   Saginaw      48609
2   avery@example.com  NULL      1       8/11/2024     hash2, hash7, hash8  main office  5555555555     Saginaw      48609
3   scott@example.com  hash3     1       May 11th, 23  hash3                Man office   (555)555-5555  Saginaw      48609
4   scott@example.com  hash4     1       Tuesday       hash4                Main         555/555/5555   Saginaw      48609
1NF - user_password_histories
id  user_id  password
1   1        hash1
2   1        hash5
3   1        hash6
4   2        hash2
5   2        hash7
6   2        hash8
7   3        hash3
8   4        hash4
1NF - users
id  email              password  active  hire_date     office_name  office_phone   office_city  office_zip
1   alice@example.com  hash1     1       1/1/2024      Main Office  555-555-5555   Saginaw      48609
2   avery@example.com  NULL      1       8/11/2024     main office  5555555555     Saginaw      48609
3   scott@example.com  hash3     1       May 11th, 23  Man office   (555)555-5555  Saginaw      48609
4   scott@example.com  hash4     1       Tuesday       Main         555/555/5555   Saginaw      48609
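The 1NF step above, moving the multi-valued previous_password cell into its own user_password_histories table, might be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for the MySQL server used in the talk, and the in-memory database and the `unnormalized` list are assumptions made for the example.

```python
import sqlite3

# Rows as they look before 1NF: one "cell" holds several password hashes.
# This list is a hypothetical in-memory copy of the slide data.
unnormalized = [
    (1, "alice@example.com", ["hash1", "hash5", "hash6"]),
    (2, "avery@example.com", ["hash2", "hash7", "hash8"]),
    (3, "scott@example.com", ["hash3"]),
    (4, "scott@example.com", ["hash4"]),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT
    );
    CREATE TABLE user_password_histories (
        id       INTEGER PRIMARY KEY,  -- auto-assigned row id
        user_id  INTEGER,
        password TEXT
    );
""")

# Each hash becomes its own row, so every column holds one atomic value.
for user_id, email, hashes in unnormalized:
    conn.execute("INSERT INTO users (id, email) VALUES (?, ?)", (user_id, email))
    conn.executemany(
        "INSERT INTO user_password_histories (user_id, password) VALUES (?, ?)",
        [(user_id, password_hash) for password_hash in hashes],
    )

history_rows = conn.execute(
    "SELECT id, user_id, password FROM user_password_histories ORDER BY id"
).fetchall()
```

Here `history_rows` ends up with one row per previous password, in the same shape as the user_password_histories slide.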
Second Normal Form (2NF)
1. Is already in 1NF
2. Every non-key column depends on the whole primary key of the table
2NF - offices
id  name         phone          city     zip
1   Main Office  555-555-5555   Saginaw  48609
2   main office  5555555555     Saginaw  48609
3   Man office   (555)555-5555  Saginaw  48609
4   Main         555/555/5555   Saginaw  48609
2NF - users
id  email              password  active  hire_date     office_name  office_phone   office_city  office_zip  office_id
1   alice@example.com  hash1     1       1/1/2024      Main Office  555-555-5555   Saginaw      48609       1
2   avery@example.com  NULL      1       8/11/2024     main office  5555555555     Saginaw      48609       2
3   scott@example.com  hash3     1       May 11th, 23  Man office   (555)555-5555  Saginaw      48609       3
4   scott@example.com  hash4     1       Tuesday       Main         555/555/5555   Saginaw      48609       4
2NF - users
id  email              password  active  hire_date     office_id
1   alice@example.com  hash1     1       1/1/2024      1
2   avery@example.com  NULL      1       8/11/2024     2
3   scott@example.com  hash3     1       May 11th, 23  3
4   scott@example.com  hash4     1       Tuesday       4
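Once office details live in their own table and users only carry an office_id, a single update to an office is seen by every user linked to it. A rough sketch of that payoff, again with Python's sqlite3 standing in for MySQL; the two users sharing office 1 and the new phone number are hypothetical cleaned-up data, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offices (
        id    INTEGER PRIMARY KEY,
        name  TEXT,
        phone TEXT
    );
    CREATE TABLE users (
        id        INTEGER PRIMARY KEY,
        email     TEXT,
        office_id INTEGER
    );
    INSERT INTO offices (id, name, phone) VALUES (1, 'Main Office', '555-555-5555');
    -- Hypothetical cleaned-up data: both users point at the same office row.
    INSERT INTO users (id, email, office_id) VALUES
        (1, 'alice@example.com', 1),
        (2, 'avery@example.com', 1);
""")

# The phone number is stored once, so one UPDATE fixes it for everyone.
conn.execute("UPDATE offices SET phone = '555-555-0000' WHERE id = 1")

user_offices = conn.execute("""
    SELECT users.email, offices.name, offices.phone
    FROM users
    JOIN offices ON offices.id = users.office_id
    ORDER BY users.id
""").fetchall()
```

Both rows in `user_offices` carry the updated phone number, with no chance of one user's copy drifting out of sync.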
Third Normal Form (3NF)
1. Is already in 2NF
2. Non-key columns depend on the primary key directly, not transitively through another non-key column
3NF - offices
id  name         phone          city     zip
1   Main Office  555-555-5555   Saginaw  48609
2   main office  5555555555     Saginaw  48609
3   Man office   (555)555-5555  Saginaw  48609
4   Main         555/555/5555   Saginaw  48609
3NF - zips
id     city     state
48609  Saginaw  MI
48640  Midland  MI
48642  Midland  MI
48901  Lansing  MI
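With city and state moved into a zips table, an office only needs to store the ZIP code and can look the rest up with a join. A minimal sketch under a few assumptions: sqlite3 stands in for MySQL, the ZIP is stored as text, and the offices column is named zip_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE zips (
        id    TEXT PRIMARY KEY,  -- the ZIP code itself, as on the slide
        city  TEXT,
        state TEXT
    );
    CREATE TABLE offices (
        id     INTEGER PRIMARY KEY,
        name   TEXT,
        phone  TEXT,
        zip_id TEXT              -- assumed column name for the reference
    );
    INSERT INTO zips (id, city, state) VALUES
        ('48609', 'Saginaw', 'MI'),
        ('48640', 'Midland', 'MI'),
        ('48642', 'Midland', 'MI'),
        ('48901', 'Lansing', 'MI');
    INSERT INTO offices (id, name, phone, zip_id) VALUES
        (1, 'Main Office', '555-555-5555', '48609');
""")

# City and state depend on the ZIP, not on the office, so they are joined in.
office_location = conn.execute("""
    SELECT offices.name, zips.city, zips.state
    FROM offices
    JOIN zips ON zips.id = offices.zip_id
    WHERE offices.id = 1
""").fetchone()
```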
Old Table Structure
New Table Structure
Use The Database Engine to Keep Data Clean
Why?
mysql> insert into users (password) values ("just a password?");
Query OK, 1 row affected (0.01 sec)
Garbage In -> Garbage Out
Garbage Data -> Bugs
To Prevent Bugs:
Make The Database
Work For Us
id  email              password  active  hire_date     office_id
1   alice@example.com  hash1     1       1/1/2024      1
2   avery@example.com  NULL      1       8/11/2024     2
3   scott@example.com  hash3     1       May 11th, 23  3
4   scott@example.com  hash4     1       Tuesday       4
5   (empty)            Hash12    2       2024-04-01    1000
Use Correct Column Types
• Numeric: INT, TINYINT, BIGINT, FLOAT, REAL, etc.
• Date/Time: DATE, TIME, DATETIME, etc.
• String: CHAR, VARCHAR, TEXT, etc.
• Binary: BLOB, etc.
mysql> insert into users (hire_date) values ("tuesday");
ERROR 1292 (22007): Incorrect date value: 'tuesday' for column 'hire_date' at row 1
mysql> insert into users (hire_date) values ("May 11th, 23");
ERROR 1292 (22007): Incorrect date value: 'May 11th, 23' for column 'hire_date' at row 1
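The same idea, letting the database refuse a malformed hire_date, can be tried locally without a MySQL server. The sketch below is an approximation rather than the talk's schema: SQLite has no dedicated DATE column type, so a CHECK constraint that compares the value to its own date() normalization plays the role of MySQL's DATE type. The helper name `try_insert` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id        INTEGER PRIMARY KEY,
        -- date() returns NULL for anything that is not a valid date, so only
        -- well-formed YYYY-MM-DD strings satisfy the check.
        hire_date TEXT CHECK (hire_date IS date(hire_date))
    )
""")

def try_insert(value):
    """Return True if the database accepts the hire_date, False if it rejects it."""
    try:
        conn.execute("INSERT INTO users (hire_date) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False

results = {value: try_insert(value) for value in ["2024-04-01", "tuesday", "May 11th, 23"]}
```

Only the ISO-formatted date is accepted; the two values from the slides are rejected before they can reach the table.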
Use NOT NULL for Required Fields
mysql> insert into users (password) values ("just a password?");
Query OK, 1 row affected (0.01 sec)
After email is declared NOT NULL:
mysql> insert into users (password) values ("just a password?");
ERROR 1364 (HY000): Field 'email' doesn't have a default value
mysql> insert into users (email, password) values ("s@s", "just a password?");
ERROR 1364 (HY000): Field 'active' doesn't have a default value
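A table definition that produces this kind of rejection could look like the sketch below. It assumes email, password, and active are the required columns, uses sqlite3 as a stand-in for MySQL (so the error text differs from ERROR 1364), and the `insert_user` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id       INTEGER PRIMARY KEY,
        email    TEXT    NOT NULL,
        password TEXT    NOT NULL,
        active   INTEGER NOT NULL
    )
""")

def insert_user(**columns):
    """Insert the given columns; return None on success or the error text."""
    # Column names come from this script's own keyword arguments, never from user input.
    names = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    try:
        conn.execute(
            f"INSERT INTO users ({names}) VALUES ({marks})",
            tuple(columns.values()),
        )
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

missing_email = insert_user(password="just a password?")
missing_active = insert_user(email="s@s", password="just a password?")
complete = insert_user(email="s@s", password="just a password?", active=1)
```

Only the last call succeeds; the first two return a NOT NULL error, so an incomplete row never lands in the table.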
Use UNIQUE for Unique Values
mysql> insert into users (email) values ("scott@keck-warren.com");
Query OK, 1 row affected (0.01 sec)
mysql> insert into users (email) values ("scott@keck-warren.com");
Query OK, 1 row affected (0.02 sec)
After adding a UNIQUE index on email:
mysql> insert into users (email) values ("scott@keck-warren.com");
ERROR 1062 (23000): Duplicate entry 'scott@keck-warren.com' for key 'users.email'
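As a small sketch of the UNIQUE rule, the snippet below declares the email column UNIQUE and tries to insert the same address twice. It assumes sqlite3 as a stand-in for MySQL, so the message differs from ERROR 1062, and the reduced two-column table is only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    )
""")

conn.execute("INSERT INTO users (email) VALUES (?)", ("scott@keck-warren.com",))
try:
    # Same address again: the UNIQUE constraint should refuse it.
    conn.execute("INSERT INTO users (email) VALUES (?)", ("scott@keck-warren.com",))
    duplicate_error = None
except sqlite3.IntegrityError as exc:
    duplicate_error = str(exc)

user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```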
Use Foreign Keys For References To Other Tables
id  name         phone          city     zip_id
1   Main Office  555-555-5555   Saginaw  48609
2   main office  5555555555     Saginaw  48609
3   Man office   (555)555-5555  Saginaw  48609
4   Main         555/555/5555   Saginaw  48609

id  name         phone         city     zip_id
1   Main Office  555-555-5555  Saginaw  48609
Constraints - Foreign Keys
Orphaned Rows
mysql> insert into users (email, office_id) values ("s@kw.com", 1000000);
Query OK, 1 row affected (0.03 sec)
Foreign Key Constraints
With a foreign key from users.office_id to offices.id:
mysql> insert into users (email, office_id) values ("s@kw.com", 1000000);
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`databasetalk`.`users`, CONSTRAINT `users_ibfk_1` FOREIGN KEY (`office_id`) REFERENCES `offices` (`id`))
mysql> insert into users (email, office_id) values ("s@kw.com", 1);
Query OK, 1 row affected (0.01 sec)
mysql> delete from offices where id = 1;
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`databasetalk`.`users`, CONSTRAINT `users_ibfk_1` FOREIGN KEY (`office_id`) REFERENCES `offices` (`id`))
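The three outcomes above (orphan insert refused, valid insert accepted, parent delete refused) can be reproduced with a small sketch. It assumes sqlite3 in place of MySQL, which needs foreign key enforcement switched on per connection and reports a shorter error message; the `run` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE offices (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE users (
        id        INTEGER PRIMARY KEY,
        email     TEXT NOT NULL,
        office_id INTEGER NOT NULL REFERENCES offices (id)
    );
    INSERT INTO offices (id, name) VALUES (1, 'Main Office');
""")

def run(sql, params=()):
    """Execute one statement; return None on success or the error text."""
    try:
        conn.execute(sql, params)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

# No office 1000000 exists, so this row would be an orphan.
orphan = run("INSERT INTO users (email, office_id) VALUES (?, ?)", ("s@kw.com", 1000000))
# Office 1 exists, so the reference is valid.
valid = run("INSERT INTO users (email, office_id) VALUES (?, ?)", ("s@kw.com", 1))
# A user still points at office 1, so the parent row cannot be deleted.
delete_parent = run("DELETE FROM offices WHERE id = 1")
```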
mysql> insert into user_password_histories (user_id, password) values (2, "test1");
Query OK, 1 row affected (0.01 sec)
mysql> insert into user_password_histories (user_id, password) values (2, "test2");
Query OK, 1 row affected (0.01 sec)
With ON DELETE CASCADE on user_password_histories.user_id, deleting the user also removes the history rows:
mysql> delete from users where id = 2;
Query OK, 1 row affected (0.01 sec)
mysql> select * from user_password_histories where user_id = 2;
Empty set (0.00 sec)
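A delete that also clears the child rows is what a foreign key declared with ON DELETE CASCADE does. The sketch below shows that behavior under the assumption that this is how the history table was set up, with sqlite3 standing in for MySQL and a trimmed-down schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce the cascade
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL
    );
    CREATE TABLE user_password_histories (
        id       INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES users (id) ON DELETE CASCADE,
        password TEXT NOT NULL
    );
    INSERT INTO users (id, email) VALUES (2, 'avery@example.com');
    INSERT INTO user_password_histories (user_id, password) VALUES
        (2, 'test1'),
        (2, 'test2');
""")

count_sql = "SELECT COUNT(*) FROM user_password_histories WHERE user_id = 2"
before = conn.execute(count_sql).fetchone()[0]

# Deleting the parent row removes its history rows in the same statement.
conn.execute("DELETE FROM users WHERE id = 2")
after = conn.execute(count_sql).fetchone()[0]
```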
Downsides to Constraints
Performance
Use Triggers For Complex Requirements
• Triggers add additional power to the DB
• Fire on insert, update, or delete
mysql> insert into users (active) values (2);
ERROR 1644 (45000): active must be 0 or 1
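A trigger that enforces the "active must be 0 or 1" rule might look like the sketch below. It is written for SQLite so it can run with Python's sqlite3 module; MySQL triggers would raise the error with a SIGNAL statement instead of RAISE(ABORT, ...). The trigger names and the `add_user` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id     INTEGER PRIMARY KEY,
        email  TEXT NOT NULL,
        active INTEGER NOT NULL
    );

    -- Reject new rows whose active flag is anything other than 0 or 1.
    CREATE TRIGGER users_active_insert
    BEFORE INSERT ON users
    FOR EACH ROW
    WHEN NEW.active NOT IN (0, 1)
    BEGIN
        SELECT RAISE(ABORT, 'active must be 0 or 1');
    END;

    -- Apply the same rule when an existing row is changed.
    CREATE TRIGGER users_active_update
    BEFORE UPDATE ON users
    FOR EACH ROW
    WHEN NEW.active NOT IN (0, 1)
    BEGIN
        SELECT RAISE(ABORT, 'active must be 0 or 1');
    END;
""")

def add_user(active):
    """Insert a user with the given active flag; return None or the error text."""
    try:
        conn.execute("INSERT INTO users (email, active) VALUES (?, ?)", ("s@kw.com", active))
        return None
    except sqlite3.DatabaseError as exc:
        return str(exc)

rejected = add_user(2)
accepted = add_user(1)
```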
Proactively Add Indexes to Keep Queries Performant
Indexes in Databases
[Figure: an index on hire_date with ordered entries 2023-12-03, 2023-12-04, 2023-12-05, 2023-12-06]
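One way to see what an index on hire_date changes is to ask the database for its query plan before and after creating it. The sketch below does that with SQLite's EXPLAIN QUERY PLAN, used here as a stand-in for MySQL's EXPLAIN; the 1,000 synthetic rows, the index name users_hire_date, and the `plan` helper are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, hire_date TEXT)")

# Synthetic rows spread over the days of December 2023.
conn.executemany(
    "INSERT INTO users (email, hire_date) VALUES (?, ?)",
    [(f"user{n}@example.com", f"2023-12-{(n % 28) + 1:02d}") for n in range(1000)],
)

def plan(sql):
    """Return the human-readable plan lines SQLite reports for a statement."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT id FROM users WHERE hire_date = '2023-12-04'"

plan_without_index = plan(query)  # expected to be a scan of the whole table
conn.execute("CREATE INDEX users_hire_date ON users (hire_date)")
plan_with_index = plan(query)     # expected to be a search using the new index
```

Comparing the two plans on real queries is a reasonable first check before adding an index to a production table.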
Start Out Simple
What You Need to Know
1. Normalize Your Database For Data Deduplication
2. Use The Database Engine to Keep Data Clean
3. Proactively Add Indexes to Keep Queries Performant
What You Need to Know
1. The table contains a unique identifier, also called the primary key, that is used to identify the row.
2. Each column contains atomic values (values that cannot be broken down).
3. Every non-key column depends on the whole primary key of the table.
4. Non-key columns depend on the primary key directly, not transitively through another non-key column.
What You Need to Know
• Make the DB Work With You
• Correct Column Types
• NOT NULL for Required Fields
• UNIQUE for Unique Values
• Foreign Keys For References To Other Tables
• Triggers For Complex Requirements
What You Need to Know
• Use indexes on commonly searched columns
• Start simple
• See recorded talks about how to add
Thank The Speakers
Thank The Organizers
Questions/Follow Me
• Questions
• Please rate the talk
• <link>
• @scottKeckWarren@phpc.social
• @scottKeckWarren@twitter.com

SQL Database Design For Developers at php[tek] 2024

  • 1.
    SQL Database Design ForDevelopers Scott Keck-Warren php[tek] 2024 @scottKeckWarren@phpc.social
  • 2.
    1.Never Worked Witha DBA On a Project 2.Every Project Has At Least One SQL Database
  • 3.
    1.Never Worked Witha DBA On a Project 2.Every Project Has At Least One SQL Database
  • 4.
    My (Early) Relationshipto SQL Database: Less Than Ideal
  • 5.
    How Did IGet Better?
  • 6.
  • 8.
    Problems • “Random” slowness •Data inconsistencies • Weird Bugs
  • 11.
  • 12.
    Scott Keck-Warren Director ofTechnology @ WeCare Connect
  • 13.
  • 14.
    Scott Keck-Warren Director ofTechnology @ WeCare Connect
  • 15.
    Scott Keck-Warren Content Creator @php[architect]YouTube Community Corner Podcast
  • 16.
    Scott’s Rules ForDatabase Design
  • 17.
    Never Worked Witha DBA On a Project
  • 18.
    Scott’s Rules ForDatabase Design 1. Normalize Your Database For Data Deduplication 2. Use The Database Engine to Keep Data Clean 3. Proactively Add Indexes to Keep Queries Performant
  • 19.
  • 20.
    Users Table • Emailaddress • Password • Active state • Hire Date • Listing of previous passwords • Office Name • Office City • Office Zip
  • 21.
    Users Table • Emailaddress (string) • Password (string) • Active state (string) • Hire Date (string) • Listing of previous passwords (string) • Office Name (string) • Office City (string) • Office Zip (string)
  • 22.
    Users Table email passwordactive hire_date previous_ password office_name office_phone office_city office_zip alice@exa mple.com hash1 1 1/1/2024 hash1 hash5 hash6 Main Office 555-555-5555 Saginaw 48609 avery@exa mple.com NULL 1 8/11/2024 hash2 hash7 hash8 main office 5555555555 Saginaw 48609 scott@exa mple.com hash3 1 May 11th, 23 hash3 Man office (555)555-5555 Saginaw 48609 scott@exa mple.com hash4 1 Tuesday hash4 Main 555/555/5555 Saginaw 48609
  • 23.
    Normalize Your DatabaseFor Data Deduplication
  • 24.
    Normalize Your DatabaseFor Data Deduplication “[T]he process of structuring a relational database in accordance with a series of so- called normal forms in order to reduce data redundancy and improve data integrity.” -“Database normalization” on Wikipedia
  • 25.
    Normalize Your DatabaseFor Data Deduplication • UNF: Unnormalized form • 1NF: First normal form • 2NF: Second normal form • 3NF: Third normal form • EKNF: Elementary key normal form • BCNF: Boyce–Codd normal form • 4NF: Fourth normal form • ETNF: Essential tuple normal form • 5NF: Fifth normal form • DKNF: Domain-key normal form • 6NF: Sixth normal form
  • 26.
    Normalize Your DatabaseFor Data Deduplication • UNF: Unnormalized form • 1NF: First normal form • 2NF: Second normal form • 3NF: Third normal form • EKNF: Elementary key normal form • BCNF: Boyce–Codd normal form • 4NF: Fourth normal form • ETNF: Essential tuple normal form • 5NF: Fifth normal form • DKNF: Domain-key normal form • 6NF: Sixth normal form
  • 27.
    Normalize Your DatabaseFor Data Deduplication • Boyce–Codd Normal Form: • X should be a superkey for every functional dependency (FD) X−>Y in a given relation.
  • 28.
  • 29.
    Unnormalized Form • Atable doesn’t meet any of the conditions of normalization • Essentially a spreadsheet email password active hire_date previous_ password office_name office_phone office_city office_zip alice@exa mple.com hash1 1 1/1/2024 hash1 hash5 hash6 Main Office 555-555-5555 Saginaw 48609 avery@exa mple.com NULL 1 8/11/2024 hash2 hash7 hash8 main office 5555555555 Saginaw 48609 scott@exa mple.com hash3 1 May 11th, 23 hash3 Man office (555)555-5555 Saginaw 48609 scott@exa mple.com hash4 1 Tuesday hash4 Main 555/555/5555 Saginaw 48609
  • 30.
  • 31.
    First Normal Form(1NF) 1. The table contains a unique identifier, also called the primary key, that is used to identify the row. 2. Each column contains atomic values (values that can not be broken down)
  • 32.
    1NF - users emailpassword active hire_date previous_p assword office_ name office_phone office_city office_zip alice@exa mple.com hash1 1 1/1/2024 hash1 hash5 hash6 Main Office 555-555-5555 Saginaw 48609 avery@exa mple.com NULL 1 8/11/2024 hash2 hash7 Hash8 main office 5555555555 Saginaw 48609 scott@exa mple.com hash3 1 May 11th, 23 hash3 Man office (555)555-5555 Saginaw 48609 scott@exa mple.com hash4 1 Tuesday hash4 Main 555/555/5555 Saginaw 48609
  • 33.
    1NF - users •A unique identifier should be: • Auto-incrementing int • UUID
  • 34.
    1NF - users idemail password active hire_date previous_ password office_ name office_phone office_cit y office_zip 1 alice@exa mple.com hash1 1 1/1/2024 hash1 hash5 hash6 Main Office 555-555-5555 Saginaw 48609 2 avery@ex ample.com NULL 1 8/11/2024 hash2 hash7 Hash8 main office 5555555555 Saginaw 48609 3 scott@exa mple.com hash3 1 May 11th, 23 hash3 Man office (555)555- 5555 Saginaw 48609 4 scott@exa mple.com hash4 1 Tuesday hash4 Main 555/555/5555 Saginaw 48609
  • 35.
    1NF - users idemail password active hire_date previous_ password office_ name office_phone office_cit y office_zip 1 alice@exa mple.com hash1 1 1/1/2024 hash1 hash5 hash6 Main Office 555-555-5555 Saginaw 48609 2 avery@ex ample.com NULL 1 8/11/2024 hash2 hash7 hash8 main office 5555555555 Saginaw 48609 3 scott@exa mple.com hash3 1 May 11th, 23 hash3 Man office (555)555- 5555 Saginaw 48609 4 scott@exa mple.com hash4 1 Tuesday hash4 Main 555/555/5555 Saginaw 48609
  • 36.
  • 37.
  • 38.
    1NF - user_password_histories iduser_id password 1 1 hash1 2 1 hash5 3 1 hash6 4 2 hash2 5 2 hash7 6 2 hash8 7 3 hash3 8 4 hash4
  • 39.
    1NF - users idemail password active hire_date previous_ password office_ name office_phone office_cit y office_zip 1 alice@exa mple.com hash1 1 1/1/2024 hash1 hash5 hash6 Main Office 555-555-5555 Saginaw 48609 2 avery@ex ample.com NULL 1 8/11/2024 hash2 hash7 Hash8 main office 5555555555 Saginaw 48609 3 scott@exa mple.com hash3 1 May 11th, 23 hash3 Man office (555)555- 5555 Saginaw 48609 4 scott@exa mple.com hash4 1 Tuesday hash4 Main 555/555/5555 Saginaw 48609
  • 40.
    1NF - users idemail password active hire_date office_ name office_phone office_city office_zip 1 alice@exa mple.com hash1 1 1/1/2024 Main Office 555-555-5555 Saginaw 48609 2 avery@exa mple.com NULL 1 8/11/2024 main office 5555555555 Saginaw 48609 3 scott@exa mple.com hash3 1 May 11th, 23 Man office (555)555-5555 Saginaw 48609 4 scott@exa mple.com hash4 1 Tuesday Main 555/555/5555 Saginaw 48609
  • 41.
  • 42.
    Second Normal Form(2NF) 1. Is already in 1NF 2. All the non-key columns are dependent on the primary key of the table
  • 43.
    Second Normal Form(2NF) id email password active hire_date office_ name office_phone office_city office_zip 1 alice@exa mple.com hash1 1 1/1/2024 Main Office 555-555-5555 Saginaw 48609 2 avery@exa mple.com NULL 1 8/11/2024 main office 5555555555 Saginaw 48609 3 scott@exa mple.com hash3 1 May 11th, 23 Man office (555)555-5555 Saginaw 48609 4 scott@exa mple.com hash4 1 Tuesday Main 555/555/5555 Saginaw 48609
  • 44.
    2nd - offices idname phone city zip 1 Main Office 555-555-5555 Saginaw 48609 2 main office 5555555555 Saginaw 48609 3 Man office (555)555-5555 Saginaw 48609 4 Main 555/555/5555 Saginaw 48609
  • 45.
    2NF - users idemail password active hire_date office_ name office_phone office_city office_zip 1 alice@exa mple.com hash1 1 1/1/2024 Main Office 555-555-5555 Saginaw 48609 2 avery@exa mple.com NULL 1 8/11/2024 main office 5555555555 Saginaw 48609 3 scott@exa mple.com hash3 1 May 11th, 23 Man office (555)555-5555 Saginaw 48609 4 scott@exa mple.com hash4 1 Tuesday Main 555/555/5555 Saginaw 48609
  • 46.
    2NF - users idemail password active hire_date office_ name office_phone office_cit y office_zip office_id 1 alice@exa mple.com hash1 1 1/1/2024 Main Office 555-555-5555 Saginaw 48609 1 2 avery@ex ample.com NULL 1 8/11/2024 main office 5555555555 Saginaw 48609 2 3 scott@exa mple.com hash3 1 May 11th, 23 Man office (555)555- 5555 Saginaw 48609 3 4 scott@exa mple.com hash4 1 Tuesday Main 555/555/5555 Saginaw 48609 4
  • 47.
    2NF - users idemail password active hire_date office_id 1 alice@example.co m hash1 1 1/1/2024 1 2 avery@example.c om NULL 1 8/11/2024 2 3 scott@example.co m hash3 1 May 11th, 23 3 4 scott@example.co m hash4 1 Tuesday 4
  • 48.
  • 49.
    Third Normal Form(3NF) 1. Is already in 2NF 2. It contains columns that are non-transitively dependent on the primary key
  • 50.
    3NF - offices idname phone city zip 1 Main Office 555-555-5555 Saginaw 48609 2 main office 5555555555 Saginaw 48609 3 Man office (555)555-5555 Saginaw 48609 4 Main 555/555/5555 Saginaw 48609
  • 51.
    3NF - zips idcity 48609 Saginaw 48640 Midland 48642 Midland 48901 Lansing
  • 52.
    3NF - zips idcity state 48609 Saginaw MI 48640 Midland MI 48642 Midland MI 48901 Lansing MI
  • 53.
  • 54.
  • 55.
    Use The DatabaseEngine to Keep Data Clean
  • 56.
  • 57.
  • 59.
    mysql> insert intousers (password) values (“just a password?"); Query OK, 1 row affected (0.01 sec)
  • 60.
    mysql> insert intousers (password) values (“just a password?"); Query OK, 1 row affected (0.01 sec)
  • 61.
    Garbage In ->Garbage Out
  • 62.
  • 63.
    To Prevent Bugs: MakeThe Database Work For Us
  • 64.
    id email passwordactive hire_date office_id 1 alice@example.co m hash1 1 1/1/2024 1 2 avery@example.c om NULL 1 8/11/2024 2 3 scott@example.co m hash3 1 May 11th, 23 3 4 scott@example.co m hash4 1 Tuesday 4 5 Hash12 2 2024-04-01 1000
  • 65.
  • 66.
    Use Correct ColumnTypes id email password active hire_date office_id 1 alice@example.co m hash1 1 1/1/2024 1 2 avery@example.c om NULL 1 8/11/2024 2 3 scott@example.co m hash3 1 May 11th, 23 3 4 scott@example.co m hash4 1 Tuesday 4 5 Hash12 2 2024-04-01 1000
  • 67.
    Use Correct ColumnTypes • Numeric: INT, TINYINT, BIGINT, FLOAT, REAL, etc. • Date/Time: DATE, TIME, DATETIME, etc. • String: CHAR, VARCHAR, TEXT, etc. • Binary data types such as: BLOB, etc.
  • 68.
  • 69.
  • 70.
    Use Correct ColumnTypes mysql> insert into users (hire_date) values ("tuesday"); ERROR 1292 (22007): Incorrect date value: 'tuesday' for column 'hire_date' at row 1 mysql> insert into users (hire_date) values ("May 11th, 23"); ERROR 1292 (22007): Incorrect date value: 'May 11th, 23' for column 'hire_date' at row 1
  • 71.
    Use NOT NULLfor Required Fields
  • 72.
    Use NOT NULLfor Required Fields mysql> insert into users (password) values (“just a password?"); Query OK, 1 row affected (0.01 sec)
  • 73.
    Use NOT NULLfor Required Fields
  • 74.
    Use NOT NULLfor Required Fields
  • 75.
    Use NOT NULLfor Required Fields mysql> insert into users (password) values (“just a password?”); ERROR 1364 (HY000): Field ‘email' doesn't have a default value
  • 76.
    Use NOT NULLfor Required Fields mysql> insert into users (password) values (“just a password?”); ERROR 1364 (HY000): Field ‘email' doesn't have a default value
  • 77.
    Use NOT NULLfor Required Fields mysql> insert into users (email, password) values (“s@s”, "just a password?"); ERROR 1364 (HY000): Field 'active' doesn't have a default value
  • 78.
    Use UNIQUE forUnique Values
  • 79.
    Use UNIQUE forUnique Values mysql> insert into users (email) values ("scott@keck-warren.com"); Query OK, 1 row affected (0.01 sec) mysql> insert into users (email) values ("scott@keck-warren.com"); Query OK, 1 row affected (0.02 sec)
  • 80.
    Use UNIQUE forUnique Values
  • 81.
    Use UNIQUE forUnique Values
  • 82.
    Use UNIQUE forUnique Values mysql> insert into users (email) values ("scott@keck-warren.com"); ERROR 1062 (23000): Duplicate entry 'scott@keck-warren.com' for key 'users.email'
  • 83.
    Use UNIQUE forUnique Values
  • 84.
    Use Foreign KeysFor References To Other Tables
  • 85.
    Use Foreign KeysFor References To Other Tables id name phone city zip_id 1 Main Office 555-555-5555 Saginaw 48609 2 main office 5555555555 Saginaw 48609 3 Man office (555)555-5555 Saginaw 48609 4 Main 555/555/5555 Saginaw 48609
  • 86.
    Use Foreign KeysFor References To Other Tables id name phone city zip_id 1 Main Office 555-555-5555 Saginaw 48609 2 main office 5555555555 Saginaw 48609 3 Man office (555)555-5555 Saginaw 48609
  • 87.
    Use Foreign KeysFor References To Other Tables id name phone city zip_id 1 Main Office 555-555-5555 Saginaw 48609 2 main office 5555555555 Saginaw 48609
  • 88.
    Use Foreign KeysFor References To Other Tables id name phone city zip_id 1 Main Office 555-555-5555 Saginaw 48609
  • 89.
  • 92.
  • 94.
    mysql> insert intousers (email, office_id) values ("s@kw.com", 1000000); Query OK, 1 row affected (0.03 sec)
  • 95.
    mysql> insert intousers (email, office_id) values ("s@kw.com", 1000000); Query OK, 1 row affected (0.03 sec)
  • 96.
  • 97.
  • 98.
  • 99.
  • 100.
  • 101.
    Foreign Key Constraints mysql>insert into users (email, office_id) values ("s@kw.com", 1000000); ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`databasetalk`.`users`, CONSTRAINT `users_ibfk_1` FOREIGN KEY (`office_id`) REFERENCES `offices` (`id`))
  • 102.
    Foreign Key Constraints mysql>insert into users (email, office_id) values ("s@kw.com", 1000000); ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`databasetalk`.`users`, CONSTRAINT `users_ibfk_1` FOREIGN KEY (`office_id`) REFERENCES `offices` (`id`))
  • 103.
    Foreign Key Constraints mysql>insert into users (email, office_id) values ("s@kw.com", 1); Query OK, 1 row affected (0.01 sec)
  • 104.
    Foreign Key Constraints mysql>delete from offices where id = 1; ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`databasetalk`.`users`, CONSTRAINT `users_ibfk_1` FOREIGN KEY (`office_id`) REFERENCES `offices` (`id`))
  • 105.
    Foreign Key Constraints mysql>delete from offices where id = 1; ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`databasetalk`.`users`, CONSTRAINT `users_ibfk_1` FOREIGN KEY (`office_id`) REFERENCES `offices` (`id`))
  • 106.
  • 107.
  • 108.
  • 109.
    Foreign Key Constraints mysql>insert into user_password_histories (user_id, password) values (2, "test1"); Query OK, 1 row affected (0.01 sec) mysql> insert into user_password_histories (user_id, password) values (2, "test2"); Query OK, 1 row affected (0.01 sec)
  • 110.
    Foreign Key Constraints mysql>delete from users where id = 2; Query OK, 1 row affected (0.01 sec) mysql> select * from user_password_histories where user_id = 2; Empty set (0.00 sec)
  • 111.
    Foreign Key Constraints mysql>delete from users where id = 2; Query OK, 1 row affected (0.01 sec) mysql> select * from user_password_histories where user_id = 2; Empty set (0.00 sec)
  • 112.
  • 113.
  • 114.
  • 115.
  • 116.
    Use Triggers ForComplex Requirements
  • 117.
    Use Triggers ForComplex Requirements • Triggers add additional power to DB • Operate based on create, update, or delete
  • 118.
    Use Triggers ForComplex Requirements id email password active hire_date office_id 1 alice@example.co m hash1 1 1/1/2024 1 2 avery@example.c om NULL 1 8/11/2024 2 3 scott@example.co m hash3 1 May 11th, 23 3 4 scott@example.co m hash4 1 Tuesday 4 5 Hash12 2 2024-04-01 1000
  • 119.
    Use Triggers For Complex Requirements
  • 120.
    Use Triggers For Complex Requirements
  • 121.
    Use Triggers For Complex Requirements
  • 122.
    Use Triggers For Complex Requirements
  • 123.
    Use Triggers For Complex Requirements
  • 124.
    Use Triggers For Complex Requirements
  • 125.
    Use Triggers For Complex Requirements
    mysql> insert into users (active) values (2);
    ERROR 1644 (45000): active must be 0 or 1
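The error above is what MySQL reports when a trigger raises `SIGNAL SQLSTATE '45000'`. The trigger itself did not survive in the slide text, so the following is only a sketch of one way to get that behavior. The trigger names and the cut-down `users` table are hypothetical.

```sql
-- Cut-down users table (assumed columns) so the example stands alone.
CREATE TABLE users (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    email  VARCHAR(255),
    active TINYINT NOT NULL DEFAULT 1
) ENGINE=InnoDB;

-- DELIMITER is a mysql client command; it lets the trigger body contain semicolons.
DELIMITER //

-- Hypothetical trigger name. Rejects new rows whose active flag is not 0 or 1.
CREATE TRIGGER users_active_before_insert
BEFORE INSERT ON users
FOR EACH ROW
BEGIN
    IF NEW.active NOT IN (0, 1) THEN
        SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'active must be 0 or 1';
    END IF;
END//

-- Same rule for updates, so an existing row cannot be changed to a bad value.
CREATE TRIGGER users_active_before_update
BEFORE UPDATE ON users
FOR EACH ROW
BEGIN
    IF NEW.active NOT IN (0, 1) THEN
        SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'active must be 0 or 1';
    END IF;
END//

DELIMITER ;
```

Inserts and updates are separate trigger events, which is why the rule appears twice.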
  • 127.
    Proactively Add Indexes to Keep Queries Performant
  • 132.
  • 134.
  • 135.
  • 136.
    Indexes in Databases
    Index: hire_date
    2023-12-03
    2023-12-04
    2023-12-05
    2023-12-06
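A sketch of how the `hire_date` index pictured above might be created and checked, assuming `hire_date` has already been converted to a `DATE` column. The index name and the cut-down `users` table are hypothetical.

```sql
-- Cut-down users table (assumed columns) so the example stands alone.
CREATE TABLE users (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    email     VARCHAR(255) NOT NULL,
    hire_date DATE NOT NULL
) ENGINE=InnoDB;

-- Hypothetical index name. Keeps hire_date values sorted for fast lookups.
CREATE INDEX users_hire_date_index ON users (hire_date);

-- EXPLAIN shows whether MySQL plans to use the index for this range query.
EXPLAIN
SELECT id, email
FROM users
WHERE hire_date >= '2023-01-01'
  AND hire_date <  '2024-01-01';
```

The query filters on a date range rather than `YEAR(hire_date) = 2023`, because wrapping the column in a function prevents a plain index on `hire_date` from being used.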
  • 141.
  • 142.
  • 143.
    What You Need to Know
    1. Normalize Your Database For Data Deduplication
    2. Use The Database Engine to Keep Data Clean
    3. Proactively Add Indexes to Keep Queries Performant
  • 144.
    What You Need to Know
    1. The table contains a unique identifier, also called the primary key, that is used to identify the row.
    2. Each column contains atomic values (values that cannot be broken down).
    3. All the non-key columns are dependent on the primary key of the table.
    4. It contains columns that are non-transitively dependent on the primary key.
  • 145.
    What You Need to Know
    • Make the DB Work With You
    • Correct Column Types
    • NOT NULL for Required Fields
    • UNIQUE for Unique Values
    • Foreign Keys For References To Other Tables
    • Triggers For Complex Requirements
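The first three items in the list above can be sketched as changes to an all-strings table like the one the talk starts from. The column sizes and the key name are assumptions. The sketch runs against an empty table; on real data, the existing values would need cleaning before the type changes succeed.

```sql
-- Starting point: everything stored as a nullable string (assumed sizes).
CREATE TABLE users (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    email     VARCHAR(255),
    active    VARCHAR(255),
    hire_date VARCHAR(255)
) ENGINE=InnoDB;

-- Correct column types plus NOT NULL for required fields.
ALTER TABLE users
    MODIFY email     VARCHAR(255) NOT NULL,
    MODIFY active    TINYINT      NOT NULL DEFAULT 1,
    MODIFY hire_date DATE         NOT NULL;

-- UNIQUE for values that must never appear twice (hypothetical key name).
ALTER TABLE users
    ADD UNIQUE KEY users_email_unique (email);
```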
  • 146.
    What You Need to Know
    • Use indexes on commonly searched columns
    • Start simple
    • See recorded talks about how to add
  • 148.
  • 149.
  • 150.
    Questions/Follow Me
    • Questions
    • Please rate the talk
    • <link>
    • @scottKeckWarren@phpc.social
    • @scottKeckWarren@twitter.com

Editor's Notes

  • #2 Ask people for photos Good morning all! 2 Shocking facts
  • #4 Anyone else in this boat Like to think I’m good at working with DBs Know that there’s always something to learn
  • #5 Early in my journey: just threw data into it and it spit it back Might have been magic Core piece of technology that I don’t understand
  • #7 <slide> Didn’t have a more senior level developer who could mentor So I had to figure it out Not necessarily a bad thing because that’s how I work best
  • #8 Push a new feature users are initially happy But as usage grows we start finding problems
  • #10 Angry people Customers My boss Not ideal
  • #11 Results in me fixing things under distress Nights/weekends Once in a bathroom at a Holiday Inn
  • #12 My goal is to have you learn from my trauma
  • #13 For those of you who haven’t met me my name is …
  • #14 Professional PHP Developer for 16 years // team lead/CTO role for 11 of those 16
  • #15 Currently Director of Technology at WeCare Connect Use PHP and MySQL for our backend Also …
  • #17 That being said My goal for today is to give you <slide> These are the rules I give new hires so they can understand our team's design
  • #18 So we have to figure it out ourselves
  • #19 All of these rules exist to prevent bugs or performance problems
  • #20 Like examples Today’s example is <slide> from a project
  • #21 Initial version of this database as it existed when we took over project
  • #22 Track everything using a string
  • #27 Only going to talk about the first four forms today as the others are hard to understand and demo 1 and 2 give us huge bang for our buck and we start looking at diminishing returns around 3
  • #30 A table doesn’t meet any of the conditions of normalization Essentially a spreadsheet
  • #33 The table contains a unique identifier, also called the primary key, that is used to identify the row.
  • #35 Make it auto incrementing a primary key so the database knows how to handle it
  • #36 Each column contains atomic values (values that can not be broken down) To solve this we need to create another table
  • #39 A lot of normalization is fixed with more tables
  • #41 Now in 1NF Still a lot of duplication and mismatched data
  • #44 Review users table Three sections Primary key Second section - all related to that Third section - not related First X columns are dependent
  • #45 Fix? It’s a new table Create offices
  • #46 Link our offices table to the users table
  • #48 Drop all the office columns 2nf
  • #51 When columns are transitively dependent one column's data relies on another column through a third column. For example, our offices' city column is dependent on the zip column which is dependent on the office's id.
  • #52 To fix this we'll split out the zip in a new table.
  • #58 As many validation rules as possible <slide has a bunch>
  • #59 Not going to prevent lazy me Right to the DB This is just hiding future bugs want to prevent that
  • #64 <slide> Not the other way around Let’s start with one of the most basic constraints
  • #65 Looking back at our users table Still issues: Blank emails, Date problem, Duplicate emails, Deleted Sites Problem We can and should enforce rules at application level but …
  • #67 Next thing: weird dates Want dates in the “correct” format Right now if someone asks for all the employees hired in 2023 getting that information will be a challenge Especially the person who starts on Tuesday
  • #68 List of all the types in MySQL SQL has a ton of types to best fit our needs
  • #69 Switch this column to a date Reformat a little and we get consistent values Now easy to find everyone who started in 2023
  • #73 Might have required for field but Show insert missing email
  • #75 Embrace NOT NULL for required columns
  • #80 Example insert 2 users with same email and password
  • #82 Embrace unique constraints Allows us to specify this column is unique Good for things we never ever want to see two of; email is the best option
  • #84 Can specify multiple columns for uniqueness Example: multi-tenant database could support email address uniqueness per office
  • #86 Gave users access to clean up offices So they started deleting the duplicates
  • #90 Deleted locations so the values don’t match
  • #92 This table is using a join which is breaking the results Users are assigned to locations that no longer exist
  • #93 Users that belong to non-existent offices
  • #95 Need some way to say what’s valid
  • #97 Allow us to define the relationship of one column to another table
  • #115 Performance
  • #116 Not enough that I ever worry about it but each FK requires lookups
  • #118 “Magic” according to some developers
  • #119 Active column can accept any integer value
  • #127 I also like this for complex requirements that a standard column doesn’t cover Ex: if a row is one type different fields are not null
  • #129 Indexes In Life I love to cook Love to try new recipes
  • #130 Leftover food from recipe Now get a neural network to figure out But could use cookbooks
  • #131 Option 1 Go through every page looking for matches Slow as most don’t meet our criteria
  • #132 Option 2 Go to back of book to the index and look up ingredients Use that to look up recipes Much faster
  • #133 Same
  • #135 Database is going to look at every row Fine when you have 100 users Slow when you have 10 million
  • #136 We’re going to use indexes to tell the database common things we’re going to query on <click>
  • #138 For example, I’m going to search commonly on email and active so that’s a prime candidate
  • #144 All of these rules exist to prevent bugs or performance problems
  • #148 Thank the sponsors