INT 1010 07-3.pdf

Introduction to Information Technology
7.3. Databases: Relational Data Model and others
INT-1010
Prof C
Luis R Castellanos
1
07.3
Databases:
Relational Data Model and others

2
Data and
Databases
Before
Databases
Relational
Model
Databases and
security
Database
concepts
Database
design

3
Relational Model Data Warehouse Data Mining
Relational
Model
Business
Intelligence
and Analytics

4
Relational
Model
Databases

5
The relational data model was introduced
by E. F. Codd in 1970. Currently, it is the
most widely used data model.
The relational model has provided the
basis for:
• Research on the theory of
data/relationship/constraint
• Numerous database design methodologies
• The standard database access language called
structured query language (SQL)
• Almost all modern commercial database
management systems
The relational data model describes the
world as “a collection of inter-related
relations (or tables).”

6
Relation
A relation, also known as a table or file,
is a subset of the Cartesian product of a
list of domains characterized by a name.
And within a table, each row represents
a group of related data values.
A row, or record, is also known as a
tuple.
students
Row
The columns in a table are fields and
they are also referred to as attributes.
You can also think of it this way: an
attribute is used to define the record
and a record contains a set of attributes.
Field

7
Table
A database is composed of multiple tables and each table holds the data. Here is
shown a database that contains three tables.

8
Column
A database stores pieces of information
or facts in an organized way.
Understanding how to use and get the
most out of databases requires us to
understand that method of
organization.
The principal storage units are called
columns or fields or attributes.
These house the basic components of
data into which your content can be
broken down.
Field

9
Now, let’s look at the example of an ID
card in the left figure to see the
relationship between fields and their
data.
First Name Fam Name Nationality Salary DoB Marital Sta Shift Place Issue Valid until
Let’s take a look at the following table:

10
Domain
A domain is the original set of atomic values
used to model data.
By atomic value, we mean that each value in
the domain is indivisible as far as the
relational model is concerned.
For example:
The domain of Marital Status has a set of
possibilities: Married, Single, Divorced.
The domain of Shift has the set of all possible
days: {Mon, Tue, Wed…}.

11
The domain of Salary is the set of all
floating-point numbers greater than 0
and less than 200,000.
The domain of First Name is the set of
character strings that represents
names of people.
In summary, a domain is a set of
acceptable values that a column is
allowed to contain.
This is based on various properties and
the data type for the column.

12
Records
Just as the content of any one
document or item needs to be broken
down into its constituent bits of data
for storage in the fields, the link
between them also needs to be
available so that they can be
reconstituted into their whole form.
Records allow us to do this.
Records contain fields that are related,
such as a customer or an employee.
As noted earlier, a tuple is another
term used for record.

13
Records and fields form the basis of all databases.
A simple table gives us the clearest picture of how records and fields work
together in a database storage project.
Attribute values
Rows
Attribute name

14
The simple table example in the figure shows us how fields can hold a range of
different sorts of data.
This one has:
• A Record ID field: this is an ordinal number; its data type is an integer.
• A PubDate field: this is displayed as month/day/year; its data type is date.
• An Author field: this is displayed as Initial. Surname; its data type is text.
• A Title field text: free text can be entered here.

15
You can command the database to sift through its
data and organize it in a particular way.
For example, you can request that a selection of
records be limited by date:
1. all before a given date,
2. all after a given date or
3. all between two given dates.
Similarly, you can choose to have records sorted
by date.
Because the field, or record, containing the data is
set up as a Date field, the database reads the
information in the Date field not just as numbers
separated by slashes, but rather, as dates that
must be ordered according to a calendar system.
Other date formats:
DD/MM/YYYY
YYYY/MM/DD

16
Degree
The degree is the number of attributes in a table.
In our example the figure we have been working with, has a 4 degree.

17
Table Properties
• A table has a name that is distinct from all other
tables in the database.
• There are no duplicate rows; each row is distinct.
• Entries in columns are atomic. The table does not
contain repeating groups or multivalued attributes.
• Entries from columns are from the same domain
based on their data type including:
• number (numeric, integer, float, smallint,…)
• character (string)
• date
• logical (true or false)
• Operations combining different data types are
disallowed.
• Each attribute has a distinct name.
• The sequence of columns is insignificant.
• The sequence of rows is insignificant.
record
field

18
Relational
Model
Business
Intelligence
and Analytics

19
Data Warehouse
Databases

20
As organizations have begun to utilize
databases as the centerpiece of their
operations, the need to fully understand and
leverage the data they are collecting has
become more and more apparent.
However, directly analyzing the data that is
needed for day-to-day operations is not a good
idea; we do not want to tax the operations of
the company more than we need to.
Further, organizations also want to analyze
data in a historical sense: How does the data
we have today compare with the same set of
data this time last month, or last year?

21
From these needs arose the concept of
the data warehouse.
The concept of the data warehouse is
simple: extract data from one or more
of the organization’s databases and
load it into the data warehouse (which
is itself another database) for storage
and analysis.
However, the execution of this concept
is not that simple.

22
A data warehouse should be designed
so that it meets the following criteria:
• It uses non-operational data. This means
that the data warehouse is using a copy of
data from the active databases that the
company uses in its day-to-day operations,
so the data warehouse must pull data from
the existing databases on a regular,
scheduled basis.
• The data is time-variant. This means that
whenever data is loaded into the data
warehouse, it receives a time stamp, which
allows for comparisons between different
time periods.

23
• The data is standardized. Because the data
in a data warehouse usually comes from
several different sources, it is possible that
the data does not use the same definitions
or units. For example, our Events table in
our Student Clubs database lists the event
dates using the mm/dd/yyyy format (e.g.,
01/10/2013). A table in another database
might use the format yy/mm/dd (e.g.,
13/01/10) for dates. In order for the data
warehouse to match up dates, a standard
date format would have to be agreed upon
and all data loaded into the data warehouse
would have to be converted to use this
standard format. This process is called
extraction-transformation-load (ETL).

24
There are two primary schools of thought
when designing a data warehouse:
bottom-up and top-down.
• The bottom-up approach starts by creating
small data warehouses, called data marts, to
solve specific business problems. As these data
marts are created, they can be combined into a
larger data warehouse.
• The top-down approach suggests that we
should start by creating an enterprise-wide data
warehouse and then, as specific business needs
are identified, create smaller data marts from
the data warehouse.

25
Benefits of Data Warehouses
Organizations find data warehouses
quite beneficial for a number of
reasons:
• The process of developing a data warehouse
forces an organization to better understand
the data that it is currently collecting and,
equally important, what data is not being
collected.
• A data warehouse provides a centralized
view of all data being collected across the
enterprise and provides a means for
determining data that is inconsistent.

26
• Once all data is identified as consistent, an
organization can generate one version of the
truth. This is important when the company
wants to report consistent statistics about
itself, such as revenue or number of
employees.
• By having a data warehouse, snapshots of
data can be taken over time. This creates a
historical record of data, which allows for
an analysis of trends.
• A data warehouse provides tools to combine
data, which can provide new information
and analysis.

27
Relational
Model
Business
Intelligence
and Analytics

28
Data Mining
Databases

29
Data mining
Data mining is the process of analyzing data to find
previously unknown trends, patterns, and associations in
order to make decisions. Generally, data mining is
accomplished through automated means against
extremely large data sets, such as a data warehouse. Some
examples of data mining include:
• An analysis of sales from a large grocery chain might determine
that milk is purchased more frequently the day after it rains in
cities with a population of less than 50,000.
• A bank may find that loan applicants whose bank accounts show
particular deposit and withdrawal patterns are not good credit
risks.
• A baseball team may find that collegiate baseball players with
specific statistics in hitting, pitching, and fielding make for more
successful major league players.

30
In some cases, a data-mining project is
begun with a hypothetical result in
mind.
For example, a grocery chain may
already have some idea that buying
patterns change after it rains and want
to get a deeper understanding of
exactly what is happening.
In other cases, there are no
presuppositions and a data-mining
program is run against large data sets
in order to find patterns and
associations.

31

32
Privacy Concerns
The increasing power of data mining has
caused concerns for many, especially in the
area of privacy.
In today’s digital world, it is becoming easier
than ever to take data from disparate sources
and combine them to do new forms of analysis.
In fact, a whole industry has sprung up around
this technology: data brokers.
These firms combine publicly accessible data
with information obtained from the
government and other sources to create vast
warehouses of data about people and
companies that they can then sell.

33
Relational
Model
Business
Intelligence
and Analytics

34
Business
Intelligence and
Analytics
Databases

35
Business Intelligence and
Business Analytics
With tools such as data warehousing
and data mining at their disposal,
businesses are learning how to use
information to their advantage.
The term business intelligence is
used to describe the process that
organizations use to take data they are
collecting and analyze it in the hopes of
obtaining a competitive advantage.

36
Besides using data from their internal
databases, firms often purchase
information from data brokers to get a
big-picture understanding of their
industries.
Business analytics is the term used
to describe the use of internal company
data to improve business processes
and practices.

37

38
Knowledge Management
All companies accumulate knowledge over
the course of their existence.
Some of this knowledge is written down or
saved, but not in an organized fashion.
Much of this knowledge is not written down;
instead, it is stored inside the heads of its
employees.
Knowledge management is the process of
formalizing the capture, indexing, and storing
of the company’s knowledge in order to
benefit from the experiences and insights that
the company has captured during its
existence.

39

40

41

42
What is a relational data
model?
1. has its schema viewed as a graph in
which object types are nodes and
relationship types are arcs
2. stores data as records which are
connected to one another through links
3. describes the world as “a collection of
inter-related relations (or tables).”
4. both data and their relationships are
contained in a single structure known as
an object

43
True or False?
In the relational model, the columns
are known as fields or attributes, and
rows are known as records.
True False

44
What is a data warehouse?
1. extracts data from one or more of the
organization’s DBs and loads it into
another DB for storage and analysis
2. data transfer is initiated by the server
rather than the client so he can store data
3. process of finding anomalies, patterns
and correlations in large data sets to
predict outcomes
4. graphical representation of information
and data

45
What is a data mining?
1. process of extracting data from the own
database
2. extracts data from one or more database
into another database
3. process of analyzing data to find
previously unknown trends, patterns, and
associations
4. harvest data that can be sold to a third
party

47

Textbook
49
https://eng.libretexts.org/Courses/Prince_
Georges_Community_College/INT_1010%
3A_Concepts_in_Computing
Purchase of a book is not
required.

Professor C
50
castellr@pgcc.edu
eLearning Expert
BS & MS in Systems Engineering
BS & MS in Military Science and Arts
HC Dr in Education
IT Professor | Spanish Instructor
LCINT1010.wordpress.com
Presentation created in 01/2022.
Slides last updated on 03/2023

INT-1010
Prof C
Luis R Castellanos
51
07.3
Databases:
Relational Data Model and others

INT 1010 07-3.pdf

Recommended

Recommended

More Related Content

Similar to INT 1010 07-3.pdf

Similar to INT 1010 07-3.pdf (20)

More from Luis R Castellanos

More from Luis R Castellanos (20)

Recently uploaded

Recently uploaded (20)

INT 1010 07-3.pdf