2. What is a database?
• Database: Collection of related data and the tools
to manage and use that data
• Collection refers to a group of like things
– Students at SPSCC represent a collection
• What belongs in the group is determined by a
purpose, task or need:
– What will the data be used for?
– Knowing the purpose makes it possible to determine what
specific data is needed to complete the task
3. Manage & Use Data
• Add new data
• Edit existing data
• Remove data
• Find data
– Filtering: limit by characteristics
– Sorting: order by value
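The four operations above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the students table, its columns, and the sample rows are made up for this example.

```python
import sqlite3

# Hypothetical table, columns, and rows, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, major TEXT, credits INTEGER)")

# Add new data
conn.executemany(
    "INSERT INTO students (name, major, credits) VALUES (?, ?, ?)",
    [("Ana", "Biology", 30), ("Ben", "History", 45), ("Cy", "Biology", 12)],
)

# Edit existing data
conn.execute("UPDATE students SET credits = 50 WHERE name = 'Ben'")

# Remove data
conn.execute("DELETE FROM students WHERE name = 'Cy'")

# Find data by filtering: limit rows by a characteristic
biology = conn.execute(
    "SELECT name FROM students WHERE major = 'Biology'"
).fetchall()

# Find data by sorting: order rows by a value
by_credits = conn.execute(
    "SELECT name, credits FROM students ORDER BY credits DESC"
).fetchall()
```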
4. Database Tools
• How data is stored doesn’t matter
– May be a list
– May be post-it notes
• Tools may be simple or complex
– Piece of paper and pencil
– Spreadsheet
• A Grocery List is a database
– Using pen/paper, can add items to buy, change items
to buy, remove items to buy
– Use a different list for different days or stores
5. What is a relational database?
• Incorporates basic definition of a database
• Data organized in a set of tables, where the data and
relationships between data are modeled based on
the real world
– Table: Group of records about one kind of
thing (entity)
– Record: Entry for one entity (row)
– Field: Single value describing characteristic of one entity
(column)
• Reduces data entry, size of files, number of errors
• Helps to ensure the accuracy and validity of data
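As a rough sketch of the table/record/field vocabulary, the snippet below builds one table with Python's sqlite3 module and reads a single field from a single record. The table name, column names, and sample row are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets a record's fields be read by column name

# Table: a group of records about one kind of thing (students).
# Each column is a field. All names here are hypothetical.
conn.execute("CREATE TABLE students (student_id INTEGER, name TEXT, major TEXT)")

# Record: one row describing one student.
conn.execute("INSERT INTO students VALUES (1, 'Ana', 'Biology')")

record = conn.execute("SELECT * FROM students").fetchone()
field_names = record.keys()   # the columns of the table
major = record["major"]       # a single field value for this record
```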
6. Relationships – 1:1
• One to one: for each record in one table there is a
single corresponding record in a second table
• Similar to splitting a table in two:
– Given a Persons table (name, address), there could be a
Students table (school ID, major); each person can
be only a single student
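One possible way to model the persons/students split, sketched with Python's sqlite3 module. Using the person's ID as the primary key of the students table is a design choice made for this example, not the only option; all names and sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default

conn.execute(
    "CREATE TABLE persons (person_id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
# person_id is both the primary key of students and a reference to persons,
# so each person can appear in students at most once (one to one).
conn.execute("""
    CREATE TABLE students (
        person_id INTEGER PRIMARY KEY REFERENCES persons(person_id),
        school_id TEXT,
        major TEXT
    )
""")

conn.execute("INSERT INTO persons VALUES (1, 'Ana', '12 Elm St')")
conn.execute("INSERT INTO students VALUES (1, 'S100', 'Biology')")

# A second student row for the same person violates the one-to-one rule.
try:
    conn.execute("INSERT INTO students VALUES (1, 'S200', 'History')")
    second_student_allowed = True
except sqlite3.IntegrityError:
    second_student_allowed = False
```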
7. Relationships – 1:M
• One to many: for each record in one table, there
can be any number of related records (none, one, or
many) in a second table
• In simplest form, represents ownership
– One student completes many assignments
• Each assignment “belongs” to a single student, and only that
student
8. Relationships – M:N
• Many to many: for each record in both
tables, there can be many matching records in
the other table
• Most common kind of relationship
– One student takes many classes, each class has many
students
• Requires a third table to create relationship
– Third table “joins” entries in the two original tables
• An Enrollments table would identify which student is in
which class
– Join table has at least two foreign keys
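The students/classes example can be sketched with an enrollments join table in Python's sqlite3 module. Table names, column names, and sample rows are assumptions for illustration; the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE classes  (class_id   INTEGER PRIMARY KEY, title TEXT);

    -- Join table: two foreign keys, one to each original table.
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(student_id),
        class_id   INTEGER REFERENCES classes(class_id),
        PRIMARY KEY (student_id, class_id)
    );

    INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO classes  VALUES (10, 'Databases'), (20, 'Statistics');
    INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
""")

def classes_for(student_name):
    """Titles of all classes one student takes."""
    rows = conn.execute("""
        SELECT c.title
        FROM students s
        JOIN enrollments e ON e.student_id = s.student_id
        JOIN classes c     ON c.class_id = e.class_id
        WHERE s.name = ?
        ORDER BY c.title
    """, (student_name,)).fetchall()
    return [title for (title,) in rows]

def students_in(class_title):
    """Names of all students in one class."""
    rows = conn.execute("""
        SELECT s.name
        FROM classes c
        JOIN enrollments e ON e.class_id = c.class_id
        JOIN students s    ON s.student_id = e.student_id
        WHERE c.title = ?
        ORDER BY s.name
    """, (class_title,)).fetchall()
    return [name for (name,) in rows]
```

Each row in enrollments pairs one student with one class, so the same student or class can appear in many rows.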
9. Primary Keys
• Keys provide a means to find specific rows
• Primary key defines unique, required value in a
table
– Provides a way to get one row in the table
– May be one or more columns
• Column(s) in primary key must have a value
• Value(s) must be different for each row
• Table can have only one primary key
• Student ID represents a value that is different
(unique) for each student in the Students table
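A minimal sketch of the "unique and required" rules, using Python's sqlite3 module with a hypothetical students table. One SQLite-specific detail: for non-integer keys, SQLite only rejects missing values when NOT NULL is written explicitly, so the example spells it out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; NOT NULL is stated explicitly because SQLite
# does not imply it for a TEXT primary key.
conn.execute("CREATE TABLE students (student_id TEXT PRIMARY KEY NOT NULL, name TEXT)")

def try_insert(student_id, name):
    """Return True if the row was accepted, False if the key rules rejected it."""
    try:
        conn.execute("INSERT INTO students VALUES (?, ?)", (student_id, name))
        return True
    except sqlite3.IntegrityError:
        return False

first = try_insert("S100", "Ana")      # accepted
duplicate = try_insert("S100", "Ben")  # rejected: value must differ for each row
missing = try_insert(None, "Cy")       # rejected: key column must have a value

# The primary key gives a way to get exactly one row.
one_row = conn.execute(
    "SELECT name FROM students WHERE student_id = ?", ("S100",)
).fetchall()
```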
10. Foreign Keys
• Foreign key is a value in one table that refers to a
unique value in a different table
– “Foreign” means outside
• A student ID in enrollments refers to an entry in the Students
table
– Usually refers to primary key, but can use any unique
index
• Foreign key must “look like” related primary
key
– Same number of fields
• Field names don’t have to match
– Data types must match
11. Referential Integrity
• Ensures that data is consistent
– Value in foreign key must exist in related primary key
– Prevents “orphans”: records on the many side without a
valid “parent” on the one side
• Creates limits on both tables
– Can’t enter a row in the many side with a foreign key
value that doesn’t exist on the one side
– Can’t remove a row on the one side if there are
related rows in the many side
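Both limits can be observed in a small sketch using Python's sqlite3 module. The tables and rows are made up for this example, and SQLite only enforces foreign keys after the PRAGMA shown below is turned on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(student_id),
        class TEXT
    );
    INSERT INTO students VALUES (1, 'Ana');
    INSERT INTO enrollments VALUES (1, 'Databases');
""")

def rejected(sql):
    """Return True if the statement was blocked by an integrity rule."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Many side: student 99 does not exist, so this row would be an orphan.
orphan_rejected = rejected("INSERT INTO enrollments VALUES (99, 'Databases')")

# One side: student 1 still has related enrollments, so it cannot be removed.
parent_delete_rejected = rejected("DELETE FROM students WHERE student_id = 1")
```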
12. Cascade Update/Delete
• Can implement referential integrity to help
manage changes automatically
• Cascade update passes changes to primary key
values to the related rows on the many side
– If student ID is changed in students, change the
student ID in related rows in Enrollments to match
the new value
• Cascade delete deletes related rows from the
many table when a row from the one-side is
deleted
– If a student is deleted from Students table, delete
related rows in Enrollments
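The Students/Enrollments behavior described above can be sketched with Python's sqlite3 module. The schema and sample rows are assumptions for illustration; the cascade rules are declared on the foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for cascades to run in SQLite
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(student_id)
            ON UPDATE CASCADE
            ON DELETE CASCADE,
        class TEXT
    );
    INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO enrollments VALUES
        (1, 'Databases'), (1, 'Statistics'), (2, 'Databases');
""")

def enrollment_rows():
    return conn.execute(
        "SELECT student_id, class FROM enrollments ORDER BY student_id, class"
    ).fetchall()

# Cascade update: changing the student ID also changes the related rows.
conn.execute("UPDATE students SET student_id = 7 WHERE student_id = 1")
after_update = enrollment_rows()

# Cascade delete: deleting the student also deletes the related rows.
conn.execute("DELETE FROM students WHERE student_id = 2")
after_delete = enrollment_rows()
```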
13. Normalizing a database
• Process of organizing data in database to:
– Reduce redundancy: don’t repeat values
• Some repetition is needed for primary keys/foreign keys
– Reduce inconsistent dependencies: changing one
value shouldn’t require a change to a second value
• Rather than store Price, Quantity, Total Price (which is Price *
Quantity), store Price, Quantity and calculate Total Price
when needed
• Helps to ensure each table is about one thing
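The Price/Quantity/Total Price point can be sketched as follows: only Price and Quantity are stored, and Total Price is calculated in the query. The order_items table and its rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No total_price column: it would depend on price and quantity.
    CREATE TABLE order_items (item TEXT, price REAL, quantity INTEGER);
    INSERT INTO order_items VALUES ('Pen', 1.5, 4), ('Notebook', 3.0, 2);
""")

# Total Price is calculated when needed, not stored.
TOTALS_QUERY = """
    SELECT item, price * quantity AS total_price
    FROM order_items
    ORDER BY item
"""

before = conn.execute(TOTALS_QUERY).fetchall()

# Changing Quantity requires no second change to keep the total correct.
conn.execute("UPDATE order_items SET quantity = 10 WHERE item = 'Pen'")
after = conn.execute(TOTALS_QUERY).fetchall()
```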
14. Using Normalization
• Different degrees of normalization are referred
to as “forms”
• Database designer determines how far to
normalize
– Most relational databases are in 3rd normal form
– Some “de-normalization” is common
• Each higher level of normalization leads to more tables with
fewer columns
• More joins in queries are required to make data
useful, understandable
15. First Normal Form
• First Normal Form is most basic level
• Each row/column combination has only one
value
– Address should be broken up to
Street, City, State, Zip
• Eliminate repeating groups
– Don’t have multiple phone number columns in a
Students table
16. First Normal Form Example
• Example: Instead of using a single field for all
items purchased, each item gets its own row with
its own quantity
Not Normalized:
OrderID | CustomerID | OrderDate | Items Purchased
1st Normal Form:
OrderID | CustomerID | OrderDate | ItemID | Quantity | ItemName
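As a rough illustration of moving to first normal form, the sketch below splits one multi-valued Items Purchased field into one row per item. The sample order, the "quantity x name" text format, and the item ID lookup are all assumptions invented for this example.

```python
# Made-up sample order; the "quantity x name" text format is an assumption.
unnormalized_order = {
    "OrderID": 1,
    "CustomerID": 42,
    "OrderDate": "2024-01-15",
    "ItemsPurchased": "4 x Pen; 2 x Notebook",
}

# Hypothetical lookup from item name to ItemID.
ITEM_IDS = {"Pen": 100, "Notebook": 200}

def to_first_normal_form(order):
    """Return one row per purchased item, each cell holding a single value."""
    rows = []
    for entry in order["ItemsPurchased"].split(";"):
        quantity, item_name = entry.strip().split(" x ")
        rows.append({
            "OrderID": order["OrderID"],
            "CustomerID": order["CustomerID"],
            "OrderDate": order["OrderDate"],
            "ItemID": ITEM_IDS[item_name],
            "Quantity": int(quantity),
            "ItemName": item_name,
        })
    return rows

rows = to_first_normal_form(unnormalized_order)
```

The order details repeat on every row at this stage; the later normal forms remove that repetition.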
17. Second Normal Form
• Remove fields that are not fully dependent on
the key, and place in separate table(s)
– Each row should be about just one thing
– Listing the grades a student receives doesn’t help
describe a student – grades belong in a different table
18. Second Normal Form Example
• Example: The items on an order are not determined by
the customer or the order date; they depend on the
order itself (an order can include many different
things), so item details move to a separate table
linked by OrderID
1st Normal Form:
OrderID | CustomerID | OrderDate | ItemID | Quantity | ItemName
2nd Normal Form (two tables):
OrderID | CustomerID | OrderDate
OrderID | ItemID | Quantity | ItemName
19. Third Normal Form
• All non-key columns are mutually independent
– A change in one field does not require a change in
another field in the table (i.e. no calculations)
• All fields contribute to describing the key
(making the record unique)
• There are limits to how far to go:
– A change in city could require a change in state and
zip code
– Must either add many small tables or accept some
de-normalization
20. Third Normal Form Example
• If the ItemID that’s part of an order changes, the
item name would have to change too; break
Products out into its own table
2nd Normal Form (two tables):
OrderID | CustomerID | OrderDate
OrderID | ItemID | Quantity | ItemName
3rd Normal Form (three tables):
OrderID | CustomerID | OrderDate
OrderID | ItemID | Quantity
ItemID | ItemName
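The three-table layout can be sketched in Python's sqlite3 module as shown here. The table names (orders, order_items, products), the snake_case column names, and the sample rows are assumptions for illustration. A join recovers the item name when it is needed, and renaming a product touches exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT
    );
    CREATE TABLE products (item_id INTEGER PRIMARY KEY, item_name TEXT);
    CREATE TABLE order_items (
        order_id INTEGER REFERENCES orders(order_id),
        item_id  INTEGER REFERENCES products(item_id),
        quantity INTEGER,
        PRIMARY KEY (order_id, item_id)
    );

    -- Made-up sample data.
    INSERT INTO orders VALUES (1, 42, '2024-01-15'), (2, 43, '2024-01-16');
    INSERT INTO products VALUES (100, 'Pen'), (200, 'Notebook');
    INSERT INTO order_items VALUES (1, 100, 4), (1, 200, 2), (2, 100, 1);
""")

def order_lines(order_id):
    """Item name and quantity for each line of one order."""
    return conn.execute("""
        SELECT p.item_name, oi.quantity
        FROM order_items oi
        JOIN products p ON p.item_id = oi.item_id
        WHERE oi.order_id = ?
        ORDER BY p.item_name
    """, (order_id,)).fetchall()

# The item name is stored once, so a rename is a single-row change
# that every order sees through the join.
conn.execute("UPDATE products SET item_name = 'Gel Pen' WHERE item_id = 100")
```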
21. Normalization Summary
• A change in one field should not require change
in another field in the table
– No calculations
• All fields help describe the key
– Each record is unique
– Each table stores information about one “thing”