1
Managing Data Resources
2
The Name of the Game
• Information is a valuable resource.
• It is expensive to collect, maintain, and use.
• The goal of database management is to
– maximize the benefits gained from information
• maximize the accuracy of information
– minimize the costs associated with information
3
Keeping Track of Things
• Entity - person, place, thing or event on
which we maintain information.
• Attribute - A single piece of information
describing a particular entity.
4
Data Hierarchy
• Database - a collection of related files
• File - a collection of uniform records
• Record - a collection of related fields
• Field - a collection of bytes
• Byte (& words)
• Bit
5
Terminology
Generic      Database   Spreadsheet
-----        Table      Table/Sheet
Entity       Record     Row
Attribute    Field      Column
6
Key Field (Attribute)
• A key field is an attribute that uniquely
identifies a record in a file.
– Examples: SSN, NAID
• The values in the key field MUST be
unique.
• It is possible to use several fields to form a
composite key.
– Example: Lastname + firstname + middlename
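The uniqueness rule can be checked mechanically. The following is a minimal sketch in Python, assuming records are held as dictionaries; the function name `find_duplicate_keys`, the field names, and the sample records are all hypothetical and not taken from the slides.

```python
from collections import Counter

def find_duplicate_keys(records, key_fields):
    """Return the key values that appear in more than one record."""
    counts = Counter(tuple(rec[f] for f in key_fields) for rec in records)
    return [key for key, n in counts.items() if n > 1]

# Hypothetical sample records (made-up names and IDs).
students = [
    {"id": "S001", "last": "Smith", "first": "Ann", "middle": "B"},
    {"id": "S002", "last": "Smith", "first": "Ann", "middle": "B"},
    {"id": "S003", "last": "Jones", "first": "Lee", "middle": "C"},
]

# Single-field key: every value is distinct, so "id" qualifies as a key.
print(find_duplicate_keys(students, ["id"]))
# Composite key built from name fields: two records share the same value,
# so this combination fails the uniqueness requirement for this data.
print(find_duplicate_keys(students, ["last", "first", "middle"]))
```

In this sample the name-based composite key collides, which suggests why a composite key should be tested against real data before it is relied on.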
7
Natural Keys
• It is convenient and desirable to use
attributes which “naturally occur” with an
entity as a key.
• Example - most students have an SSN by the
time they enroll at HU, so the SSN would
be a natural key.
8
Accessing Information
• Look up items (records) by the value of their
key.
• Methods of access:
– Sequential Access
– Direct Access
– Indexed Sequential Access
9
Ordered vs. Unordered
• A database file (collection of records) may
be:
• ordered - physically arranged in the file so
that the key field increases (or decreases) in
a sequential fashion.
• unordered - physically arranged in the file
so the key field has no ordered relation with
the preceding or succeeding key.
10
Costs & Benefits of Ordering
• “In general” a record can be found faster in
an ordered list than in an unordered list.
– I’ll use the terms file and list interchangeably.
• “In general” you can turn an unordered list
into an ordered list by sorting.
• Sorting is a cost of keeping a list ordered.
• In this course we will generally be dealing
with ordered lists.
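One common way an ordered list pays off is binary search. The slides do not name a specific method, so this is an illustrative choice; the key values and the function name `binary_search` are made up for the sketch.

```python
def binary_search(sorted_keys, target):
    """Return (index or None, number of keys examined)."""
    lo, hi, probes = 0, len(sorted_keys) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] == target:
            return mid, probes
        if sorted_keys[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return None, probes

# Sorting is the up-front cost of keeping the list ordered.
keys = sorted([42, 7, 19, 88, 3, 61, 25, 70])
index, probes = binary_search(keys, 61)
print(keys, index, probes)
```

With 8 keys no lookup examines more than 4 of them, whereas a scan of an unordered list may have to examine all 8.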
11
Sequential Access
• Look at key of first record in file,
• if not the target then look at next record,
• if not the target then look at next record, …
• If the file has N records, on average you will
have to look at about N/2 records to find a random target.
• Question - Why not just “skip over” some
of the records?
12
Sequential Access
• An employee database might use SSN as
the key field.
• If the target SSN is 540-12-3763, and
• the first record SSN is 120-11-0007, then
• how many records should you skip?
• The key values alone do not tell you how far
to skip, so sequential access has to examine
the records one by one.
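The N/2 estimate can be checked with a short Python sketch. It assumes every key is equally likely to be the target; the key values and the function name `sequential_search` are hypothetical.

```python
def sequential_search(keys, target):
    """Examine keys one at a time from the start; return records examined."""
    for examined, key in enumerate(keys, start=1):
        if key == target:
            return examined
    return len(keys)  # target absent: every record was examined

# Hypothetical key values; any N distinct keys would do.
keys = list(range(1000, 1100))  # N = 100 records
total = sum(sequential_search(keys, k) for k in keys)
average = total / len(keys)
print(average)  # equals (N + 1) / 2 when each key is searched for once
```

For N = 100 the average is 50.5 records, which is the "about N/2" figure from the slide.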
13
Sequential Access
• Historically data was stored on tapes.
• Tapes store information sequentially and
“only” allow for sequential access.
• DASD (direct access storage devices, i.e. disk
drives) can also store files sequentially. Files
are written to the disk track-by-track,
cylinder-by-cylinder in a “physically
contiguous” fashion.
14
Direct Access
• Direct access means that given a value for
the key attribute the system can move
“directly” to the corresponding record
without having to look at any intervening
records in the file.
• Direct access requires that the system
“know” the physical location of the target
record on the disk.
15
Hashing Algorithms
• To find the physical location on the disk a
computation is performed on the key value
which yields a “unique” physical address
for the corresponding record.
• Perfect hashing algorithms get you to a
unique address.
• Imperfect algorithms may hash several keys
to the same address.
16
Hashing Example
• Suppose that I were using SSN as the key
and wanted to keep track of 100 entities.
• Select 101 (the prime number closest to the
number of records) and divide the SSN by
it.
• The remainder will always be a number
between 0 and 100.
17
Hashing Example
• The remainder represents the disk address.
– A remainder of 52 could represent cylinder 5, surface 2
• If two or more SSNs have the same
remainder (hash to the same address) this is
called a collision. Essentially these records
are then searched sequentially.
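The division-remainder scheme can be sketched in Python as follows. A dictionary of lists stands in for the disk addresses, which is an assumption made for illustration; the function names are hypothetical, and the third SSN is a made-up value chosen so that it collides with the first.

```python
PRIME = 101  # divisor chosen on the slide; addresses run from 0 to 100

def hash_address(ssn):
    """Map an SSN string such as '540-12-3763' to an address in 0..100."""
    return int(ssn.replace("-", "")) % PRIME

def store(buckets, ssn, record):
    """Append the record to the bucket at its hashed address."""
    buckets.setdefault(hash_address(ssn), []).append((ssn, record))

def lookup(buckets, ssn):
    """Go directly to the bucket, then search it sequentially."""
    for key, record in buckets.get(hash_address(ssn), []):
        if key == ssn:
            return record
    return None

buckets = {}
store(buckets, "540-12-3763", "employee A")
store(buckets, "120-11-0007", "employee B")
store(buckets, "540-12-3864", "employee C")  # made-up SSN, collides with A

print(hash_address("540-12-3763"), hash_address("120-11-0007"))
print(lookup(buckets, "540-12-3864"))
```

The first and third SSNs both hash to address 3, so the lookup for the third one has to step past the first record in that bucket. That is the sequential search after a collision.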
18
Direct Access Note
• The physical addresses in Direct Access
have no relation to the sequential “order” of
the keys.
• For any two adjacent sequential keys there
is no guarantee about the relationship
between their physical locations on the disk;
they may not be “physically contiguous”.
19
Sequential vs. Direct Access
• Sequential Access
– good when you want to process all records in
key order, next record is always ready to be
read/written.
• Direct Access
– good when you want to process records in a
random order, next record can be found
directly.
20
Indexed Sequential Access Method (ISAM)
• Combines a sequential file with one or more
levels of indexes.
• Each index relates a physical location to the
highest key value stored in that location.
• You find the physical location by looking in
each level of the index and then sequentially
searching the last physical location.
21
ISAM
• In the library the books are laid out
sequentially by call number (the key).
• Look at floor index to determine the floor
• Look at shelf index to determine the shelf
• Sequentially search the shelf
22
ISAM
• ISAM tries to give the best of both worlds.
• When you want to process items
sequentially you have an underlying
sequential file.
• When you want direct access you go
through the indexes to get close, then a
“small” sequential search at end.
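The index-then-scan idea can be sketched in Python with a single index level, although the slides allow for more. The block size, the key values, and the function name `isam_lookup` are assumptions made for illustration.

```python
from bisect import bisect_left

# Sorted records split into fixed-size blocks ("shelves"); the data is made up.
BLOCK_SIZE = 3
keys = [3, 7, 19, 25, 42, 61, 70, 88, 93]
blocks = [keys[i:i + BLOCK_SIZE] for i in range(0, len(keys), BLOCK_SIZE)]

# One index level: the highest key value stored in each block.
index = [block[-1] for block in blocks]

def isam_lookup(target):
    """Return (block number, position in block), or None if absent."""
    b = bisect_left(index, target)  # first block whose highest key >= target
    if b == len(blocks):
        return None  # target is larger than every key in the file
    for pos, key in enumerate(blocks[b]):  # "small" sequential search
        if key == target:
            return b, pos
    return None

print(index)
print(isam_lookup(42), isam_lookup(19), isam_lookup(50))
```

Processing `blocks` in order yields every record in key order, while `isam_lookup` reaches any single record after examining one index and at most one block.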
23
Traditional File Systems
• Also called:
– flat file organization
– data file approach
• Typically an organization or a department
within an organization would develop their
applications and associated data files in an
independent fashion.
24
Problems with Traditional Files
• Data Redundancy
– conflicting data
• Program-Data Dependence
– lack of flexibility
• Lack of Data Sharing
– no common names for attributes & entities
• Poor Security
25
DBMS Approach
• Database Management Systems approach
places a common interface between the
users of data (the application programs) and
the data files.
26
DBMS Components
• Data Definition Language, DDL
• Data Manipulation Language, DML
– Structured Query Language, SQL
• Data Dictionary, DD
27
Logical & Physical Views
• Logical View
– how the user sees the data
• Physical View
– how the data is physically saved on the storage
media
• The DBMS gives each user their own
logical view while storing the data using a
single physical view.
28
Advantages of DBMS
• Complexity & Confusion reduced
– all data stored in single centralized physical
view
• Data redundancy & inconsistency reduced
– data dictionary shows what data elements are
available, data element only present “once”
• Program-data dependence reduced
– each user can get desired logical view
29
Advantages of DBMS
• Security
– single point of access to data
• Reduced cost
– the initial costs of purchasing a DBMS and of
related staff are high, but savings in future
development and maintenance usually offset
these costs
• Access & Flexibility
– DML usually provides easier access to data
30
Designing Databases
• Hierarchical Data Model
• Network Data Model
• Relational Data Model
31
Hierarchical Data Model
Author 1
  Book 1 -> Publisher A
  Book 2 -> Publisher B
  Book 3 -> Publisher A
32
Hierarchical Data Model
• Data records are broken into segments
• Each segment contains some attributes
• Segments are arranged into a hierarchical
“tree-like” structure
• Physical location pointers join related
segments into records
• Child segments can only have one parent
33
Network Data Model
Author 1
Book 1 Book 2 Book 3
Publisher A Publisher B
34
Network Data Model
• Same organization as hierarchical data
model
• Except that a child segment can have
multiple parents
35
Relational Data Model
Author 1
Author 2
Author 3
Book 1
Book 2
Book 3
Book 4
Book 5
Publisher 1
Publisher 2
36
Relating Fields
A1 Author 1
A2 Author 2
A3 Author 3
Book 1 A1 P1
Book 2 A3 P2
Book 3 A2 P2
Book 4 A1 P2
Book 5 A1 P1
P1 Publisher 1
P2 Publisher 2
38
Relational Data Model
Publisher-table
ID   Publisher
P1   Publisher 1
P2   Publisher 2
Author-table
ID   Author
A1   Author 1
A2   Author 2
A3   Author 3
Book-table
Title    AID   PID
Book 1   A1    P1
Book 2   A3    P2
Book 3   A2    P2
Book 4   A1    P2
Book 5   A1    P1
39
Relational Data Model
• Data Records are broken into segments
• Each segment contains some attributes
• Segments are arranged in tables
• There are NO “physical” location pointers
between tables
• Relations between tables are “implied” by
relating fields
40
Relations Generated When Asked
• Relationships between segments are not
predefined by pointers in the relational
model.
• Tables are JOINed together to display
relationships.
• JOINs occur at query time.
• Tables must have a common data element to
be joined.
41
Example JOIN
SELECT
Author, Title, Publisher
FROM
"Author-table", "Book-table", "Publisher-table"
WHERE
"Author-table".ID = "Book-table".AID AND
"Book-table".PID = "Publisher-table".ID
42
Results of Join
Answer-table
Author     Title    Publisher
Author 1   Book 1   Publisher 1
Author 1   Book 4   Publisher 2
Author 1   Book 5   Publisher 1
Author 2   Book 3   Publisher 2
Author 3   Book 2   Publisher 2
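The join can be run as-is against an in-memory SQLite database from Python. This sketch assumes underscore table names (`author_table`, `book_table`, `publisher_table`) in place of the hyphenated ones on the slides, because hyphens would need quoting in SQL. The `ORDER BY` clause is added only to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author_table (id TEXT PRIMARY KEY, author TEXT);
    CREATE TABLE publisher_table (id TEXT PRIMARY KEY, publisher TEXT);
    CREATE TABLE book_table (title TEXT, aid TEXT, pid TEXT);
    INSERT INTO author_table VALUES
        ('A1', 'Author 1'), ('A2', 'Author 2'), ('A3', 'Author 3');
    INSERT INTO publisher_table VALUES
        ('P1', 'Publisher 1'), ('P2', 'Publisher 2');
    INSERT INTO book_table VALUES
        ('Book 1', 'A1', 'P1'), ('Book 2', 'A3', 'P2'), ('Book 3', 'A2', 'P2'),
        ('Book 4', 'A1', 'P2'), ('Book 5', 'A1', 'P1');
""")

# The join happens at query time; no pointers link the tables beforehand.
rows = conn.execute("""
    SELECT author, title, publisher
    FROM author_table, book_table, publisher_table
    WHERE author_table.id = book_table.aid
      AND book_table.pid = publisher_table.id
    ORDER BY author, title
""").fetchall()

for row in rows:
    print(row)
```

Each book row is matched to exactly one author and one publisher through the relating fields `aid` and `pid`, so the answer has one row per book.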
43
Relational Model Operations
• Selection
– select which rows to display
• Projection
– select which columns to display
• Join
– combine two or more tables
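The three operations can be sketched in plain Python over lists of dictionaries, reusing the Book-table and Author-table data from the earlier slides. The function names `select`, `project`, and `join` are hypothetical helpers and not part of any library.

```python
books = [
    {"title": "Book 1", "aid": "A1", "pid": "P1"},
    {"title": "Book 2", "aid": "A3", "pid": "P2"},
    {"title": "Book 3", "aid": "A2", "pid": "P2"},
    {"title": "Book 4", "aid": "A1", "pid": "P2"},
    {"title": "Book 5", "aid": "A1", "pid": "P1"},
]
authors = [
    {"id": "A1", "author": "Author 1"},
    {"id": "A2", "author": "Author 2"},
    {"id": "A3", "author": "Author 3"},
]

def select(rows, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def project(rows, columns):
    """Projection: keep only the named columns of every row."""
    return [{c: row[c] for c in columns} for row in rows]

def join(left, right, left_col, right_col):
    """Join: pair rows whose relating fields hold the same value."""
    return [{**l, **r} for l in left for r in right if l[left_col] == r[right_col]]

p2_books = select(books, lambda row: row["pid"] == "P2")
titles = project(p2_books, ["title"])
joined = join(books, authors, "aid", "id")
print(titles)
print(len(joined), joined[1]["author"])
```

A typical query chains the three: join the tables, select the rows of interest, then project the columns to display.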
44
Types of Relations
• 1-1
– 1-to-1
• 1-n
– 1-to-many
• n-n
– many-to-many
45
Name of the game
• Using the relational model,
• Represent each type of relationship
– as simply as possible (using the fewest tables),
– with a minimum of duplicated data, and
– with a minimum of wasted space (empty fields)
46
Tables needed for 1-1
Book
Author    Title
Author1   Book1
Author2   Book2
Author3   Book3
47
Tables needed for 1-n
Author
ID   Name
1    Author1
2    Author2
3    Author3
Book (ID is the author's ID)
ID   Title
1    Book1
1    Book2
2    Book3
3    Book4
2    Book5
48
Tables needed for n-n
Author
ID   Name
1    Author1
2    Author2
3    Author3
Writes
AID  BID
1    1
1    2
2    1
2    2
3    1
3    5
3    4
1    5
2    5
Book
ID   Title
1    Book1
2    Book2
3    Book3
4    Book4
5    Book5
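The Writes table can be read in either direction. A minimal Python sketch using the data from this slide follows; the function names `books_by` and `authors_of` are hypothetical.

```python
authors = {1: "Author1", 2: "Author2", 3: "Author3"}
books = {1: "Book1", 2: "Book2", 3: "Book3", 4: "Book4", 5: "Book5"}

# One (AID, BID) pair per author-book link, as in the Writes table.
writes = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 5), (3, 4), (1, 5), (2, 5)]

def books_by(author_id):
    """Titles of all books the given author has written."""
    return sorted(books[bid] for aid, bid in writes if aid == author_id)

def authors_of(book_id):
    """Names of all authors of the given book."""
    return sorted(authors[aid] for aid, bid in writes if bid == book_id)

print(books_by(3))
print(authors_of(1))
print(authors_of(3))
```

Neither the Author table nor the Book table repeats any data or carries empty fields. Every link lives in Writes, and a book with no link (Book3 here) simply has no row there.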
49
Advantages & Disadvantages
• Hierarchical & Network Data Models
– faster for “pre-defined” queries
– slower for ad-hoc queries
– inflexible, more expensive to maintain
• Relational Data Models
– flexible, less expensive to maintain
– most queries require joins and are slower than
“pre-defined” queries mentioned above
50
Entity-relationship diagram
• A conceptual model useful in database
design.
• Illustrates the relationships between various
entities in the database.
• Entities are represented by rectangles.
• Relationships are represented by diamonds.
• Attributes can be assigned to both entities
and relationships.
51
ER-Diagram
Authors (ID, Last_Name, First_Name, Middle_Name, DOB, DOD) -- write (n to n) -- Books (Title, Date, Edition)
Publishers (Name, Address, Phone) -- publish (1 to n) -- Books
52
Centralized Database
• All database files are stored on a central
computer.
• All database processing is performed by the
central computer.
• Problems
– can overload central system
– not very fault tolerant
– communications costs can be high
53
Distributed Databases
• Distributed Processing
– processing is performed locally by processors
connected by a communications network.
• Distributed Databases
– the physical files that make up the database are
stored in more than one location
54
Distributed Databases
• Duplicate Database
– each location has its own copy of the entire
database.
• Partitioned Database
– each location has a copy of the portion of the
database that it needs.
55
Distributed Databases
• Central Index
– Records are stored locally, but a centralized
index is maintained to quickly locate any
record.
• Ask-the-network
– Records are stored locally and the network
must be polled each time a record is needed.
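The difference between the two lookup strategies can be sketched in Python. Dictionaries stand in for the sites and the network, and the site names, keys, and function names are made up for illustration.

```python
# Each site holds its own records locally (hypothetical data).
sites = {
    "north": {101: "record A", 102: "record B"},
    "south": {201: "record C"},
}

# Central index: one entry per key, naming the site that holds it.
central_index = {key: site for site, recs in sites.items() for key in recs}

def via_central_index(key):
    """Return (record, sites contacted) using the central index."""
    site = central_index.get(key)
    if site is None:
        return None, 0
    return sites[site][key], 1

def ask_the_network(key):
    """Return (record, sites contacted) by polling sites one at a time."""
    polled = 0
    for recs in sites.values():
        polled += 1
        if key in recs:
            return recs[key], polled
    return None, polled

print(via_central_index(201))
print(ask_the_network(201))
```

The central index contacts exactly one site per lookup but must be kept up to date. Polling needs no index, but may contact every site before it finds the record.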
56
Data Warehousing
• A data warehouse is a database, with associated
reporting and query tools, that stores current
and historical data.
• The data is extracted from various operational
systems and consolidated for management
reporting and analysis.
57
A Data Warehouse...
• Sits on top of existing isolated legacy
systems, “islands of information”, to
provide an enterprise-wide database.
• Provides single platform, standardized
access to current operational data and
historical data (not normally maintained on
legacy systems).
58
Obstacles to Database Implementation
• Organizational
– structural changes
– political changes
• Cost/benefit considerations
• Placement of Data Management Function
– need data administration and planning at
highest possible organizational level

Week - 06, 07, 08 DataMgt Chapter 3.pptx

  • 2. 2 The Name of the Game • Information is a valuable resource. • It is expensive to collect, maintain, and use. • The goal of database management it to – maximize the benefits gained from information • maximize the accuracy of information – minimize the costs associated with information
  • 3. 3 Keeping Track of Things • Entity - person, place, thing or event on which we maintain information. • Attribute - A single piece of information describing a particular entity.
  • 4. 4 Data Hierarchy • Database - a collection of related files • File - a collection of uniform records • Record - a collection of related fields • Field - a collection of bytes • Byte (& words) • Bit
  • 5. 5 Terminology • Generic Database Spreadsheet • ----- Table Table/Sheet • Entity Record Row • Attribute Field Column
  • 6. 6 Key Field(Attribute) • A key field is an attribute that uniquely identifies a record in a file. – Examples: SSN, NAID • The values in the key field MUST be unique. • It is possible to use several fields to form a composite key. – Example: Lastname + firstname + middlename
  • 7. 7 Natural Keys • It is convenient and desirable to use attributes which “naturally occur” with an entity as a key. • Example - most students have a SSN by the time they enroll at HU, so the SSN would be natural key.
  • 8. 8 Accessing Information • Lookup items(records) by the value of their key. • Methods of access: – Sequential Access – Direct Access – Indexed Sequential Access
  • 9. 9 Ordered vs. Unordered • A database file (collection of records) may be: • ordered - physically arranged in the file so that the key field increases (or decreases) in a sequential fashion. • unordered - physically arranged in the file so the key field has no ordered relation with the preceding or succeeding key.
  • 10. 10 Costs & Benefits of Ordering • “In general” a record can be found faster in an ordered list than in an unordered list. – I’ll use the term file & list interchangeably. • “In general” you can turn an unordered list into an ordered list by sorting. • Sorting is a cost of keeping a list ordered. • In this course we will generally be dealing with ordered lists.
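The trade-off on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the key values and the helper name `find_in_ordered` are hypothetical, and it assumes records can be reached by position (as in an in-memory list), which a tape would not allow.

```python
from bisect import bisect_left

def find_in_ordered(sorted_keys, target):
    """Binary search: return the position of target in sorted_keys, or -1."""
    position = bisect_left(sorted_keys, target)
    if position < len(sorted_keys) and sorted_keys[position] == target:
        return position
    return -1

# Hypothetical key values, in no particular order.
unordered = [42, 7, 19, 88, 3, 56]

# Sorting is the cost paid to get (and keep) an ordered list.
ordered = sorted(unordered)

found_at = find_in_ordered(ordered, 42)
missing = find_in_ordered(ordered, 5)
```

The search itself inspects only a handful of keys, but every insertion into the list must preserve the order, which is the ongoing cost the slide refers to.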
  • 11. 11 Sequential Access • Look at key of first record in file, • if not the target then look at next record, • if not the target then look at next record, … • If a file has N records, on average you will have to look at about N/2 records to find a random target. • Question - Why not just “skip over” some of the records?
  • 12. 12 Sequential Access • An employee database might use SSN as the key field. • If the target SSN is 540-12-3763, and • the first record SSN is 120-11-0007, then • how many records should you skip? • This is why sequential access has to look at every record.
  • 13. 13 Sequential Access • Historically data was stored on tapes. • Tapes store information sequentially and “only” allow for sequential access. • DASD (disk drives) can also store files sequentially. Files are written to the disk track-by-track, cylinder-by-cylinder in a “physically contiguous” fashion.
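The sequential access procedure from the slides above can be sketched as a simple scan that counts how many records it inspects. The record layout (`key`, `name`) and the 100-record file are assumptions made for illustration only.

```python
def sequential_search(records, target_key):
    """Scan from the first record onward; return (record, records_inspected)."""
    inspected = 0
    for record in records:
        inspected += 1
        if record["key"] == target_key:
            return record, inspected
    return None, inspected

# Hypothetical file of N = 100 records with keys 1..100.
records = [{"key": k, "name": f"employee-{k}"} for k in range(1, 101)]

# Look up every key once and average the number of records inspected.
total_inspected = sum(sequential_search(records, k)[1] for k in range(1, 101))
average_inspected = total_inspected / len(records)
```

Averaged over all possible targets, the scan inspects (N + 1) / 2 records, which is the "about N/2" figure quoted on the slide.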
  • 14. 14 Direct Access • Direct access means that given a value for the key attribute the system can move “directly” to the corresponding record without having to look at any intervening records in the file. • Direct access requires that the system “know” the physical location of the target record on the disk.
  • 15. 15 Hashing Algorithms • To find the physical location on the disk a computation is performed on the key value which yields a “unique” physical address for the corresponding record. • Perfect hashing algorithms get you to a unique address. • Imperfect algorithms may hash several keys to the same address.
  • 16. 16 Hashing Example • Suppose that I were using SSN as the key and wanted to keep track of 100 entities. • Select 101 (a prime number closest to the number of records) and divide this into the SSN. • Remainder will always be a number between 0 and 100.
  • 17. 17 Hashing Example • The remainder represents the disk address. – A remainder of 52 could represent – cylinder 5, surface 2 • If two or more SSNs have the same remainder (hash to the same address) this is called a collision. Essentially these records are then searched sequentially.
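A minimal sketch of the division-remainder hashing described in the two slides above, assuming in-memory "buckets" stand in for disk addresses. The function names and the second SSN are hypothetical; the second SSN was chosen to differ from the first by exactly 101 so that the two keys collide.

```python
TABLE_SIZE = 101  # the prime divisor used on the slide

def hash_ssn(ssn):
    """Divide the SSN by 101 and return the remainder (0..100)."""
    return int(ssn.replace("-", "")) % TABLE_SIZE

# One bucket per possible remainder; a bucket stands in for a disk address.
buckets = [[] for _ in range(TABLE_SIZE)]

def store(ssn, record):
    buckets[hash_ssn(ssn)].append((ssn, record))

def lookup(ssn):
    """Go directly to the bucket, then search it sequentially (handles collisions)."""
    for stored_ssn, record in buckets[hash_ssn(ssn)]:
        if stored_ssn == ssn:
            return record
    return None

store("540-12-3763", "first employee")
store("540-12-3864", "second employee")  # hypothetical SSN, 101 higher: same remainder
```

Both keys land in the same bucket, so `lookup` falls back to a short sequential search inside it, as the slide describes for collisions.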
  • 18. 18 Direct Access Note • The physical addresses in Direct Access have no relation to the sequential “order” of the keys. • For any two adjacent sequential keys there is no guarantee about the relationship between their physical locations on the disk, they may not be “physically contiguous”.
  • 19. 19 Sequential vs. Direct Access • Sequential Access – good when you want to process all records in key order, next record is always ready to be read/written. • Direct Access – good when you want to process records in a random order, next record can be found directly.
  • 20. 20 Indexed Sequential Access Method (ISAM) • Combines a sequential file with one or more levels of indexes. • Each index relates a physical location to the highest key value stored in that location. • You find physical location by looking in each level of the index and then sequentially searching the last physical location.
  • 21. 21 ISAM • In the library the books are laid out sequentially by call number (the key). • Look at floor index to determine the floor • Look at shelf index to determine the shelf • Sequentially search the shelf
  • 22. 22 ISAM • ISAM tries to give the best of both worlds. • When you want to process items sequentially you have an underlying sequential file. • When you want direct access you go through the indexes to get close, then a “small” sequential search at end.
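The index-then-scan lookup from the ISAM slides can be sketched with a single index level. This is a simplified illustration, not a real ISAM implementation: the block size, the key values, and the function names are assumptions, and a real file would typically use more than one index level.

```python
from bisect import bisect_left

def build_isam(sorted_records, block_size=4):
    """Split a sorted file into blocks and index each block by its highest key."""
    blocks = [sorted_records[i:i + block_size]
              for i in range(0, len(sorted_records), block_size)]
    index = [block[-1][0] for block in blocks]
    return index, blocks

def isam_lookup(index, blocks, target_key):
    """Use the index to pick a block, then search that block sequentially."""
    position = bisect_left(index, target_key)  # first block whose highest key >= target
    if position == len(index):
        return None  # target is larger than every key in the file
    for key, value in blocks[position]:
        if key == target_key:
            return value
    return None

# Hypothetical sorted file: call numbers 10, 20, ..., 120.
records = [(key, f"book-{key}") for key in range(10, 130, 10)]
index, blocks = build_isam(records)
```

Processing the whole file in key order only requires walking `blocks` from start to end, while a single lookup touches the index plus one small block.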
  • 23. 23 Traditional File Systems • Also called: – flat file organization – data file approach • Typically an organization or a department within an organization would develop their applications and associated data files in an independent fashion.
  • 24. 24 Problems with Traditional Files • Data Redundancy – conflicting data • Program-Data Dependence – lack of flexibility • Lack of Data Sharing – no common names for attributes & entities • Poor Security
  • 25. 25 DBMS Approach • Database Management Systems approach places a common interface between the users of data (the application programs) and the data files.
  • 26. 26 DBMS Components • Data Definition Language, DDL • Data Manipulation Language, DML – Structured Query Language, SQL • Data Dictionary, DD
  • 27. 27 Logical & Physical Views • Logical View – how the user sees the data • Physical View – how the data is physically saved on the storage media • The DBMS gives each user their own logical view while storing the data using a single physical view.
  • 28. 28 Advantages of DBMS • Complexity & Confusion reduced – all data stored in single centralized physical view • Data redundancy & inconsistency reduced – data dictionary shows what data elements are available, data element only present “once” • Program-data dependence reduced – each user can get desired logical view
  • 29. 29 Advantages of DBMS • Security – single point of access to data • Reduced cost – initial purchase cost of DBMS and related staff are high, but savings in future development and maintenance usually offset these costs • Access & Flexibility – DML usually provides easier access to data
  • 30. 30 Designing Databases • Hierarchical Data Model • Network Data Model • Relational Data Model
  • 31. 31 Hierarchical Data Model Author 1 Book 1 Book 2 Book 3 Publisher A Publisher B Publisher A
  • 32. 32 Hierarchical Data Model • Data records are broken into segments • Each segment contains some attributes • Segments are arranged into a hierarchical “tree-like” structure • Physical location pointers join related segments into records • Child segments can only have one parent
  • 33. 33 Network Data Model Author 1 Book 1 Book 2 Book 3 Publisher A Publisher B
  • 34. 34 Network Data Model • Same organization as hierarchical data model • Except that a child segment can have multiple parents
  • 35. 35 Relational Data Model Author 1 Author 2 Author 3 Book 1 Book 2 Book 3 Book 4 Book 5 Publisher 1 Publisher 2
  • 36. 36 Relating Fields A1 Author 1 A2 Author 2 A3 Author 3 Book 1 A1 P1 Book 2 A3 P2 Book 3 A2 P2 Book 4 A1 P2 Book 5 A1 P1 P1 Publisher 1 P2 Publisher 2
  • 37. 37 Relating Fields A1 Author 1 A2 Author 2 A3 Author 3 Book 1 A1 P1 Book 2 A3 P2 Book 3 A2 P2 Book 4 A1 P2 Book 5 A1 P1 P1 Publisher 1 P2 Publisher 2
  • 38. 38 Relational Data Model ID Publisher P1 Publisher 1 P2 Publisher 2 ID Author A1 Author 1 A2 Author 2 A3 Author 3 Publisher-table Author-table Title AID PID Book 1 A1 P1 Book 2 A3 P2 Book 3 A2 P2 Book 4 A1 P2 Book 5 A1 P1 Book-table
  • 39. 39 Relational Data Model • Data Records are broken into segments • Each segment contains some attributes • Segments are arranged in tables • There are NO “physical” location pointers between tables • Relations between tables are “implied” by relating fields
  • 40. 40 Relations Generated When Asked • Relationships between segments are not predefined by pointers in the relational model. • Tables are JOINed together to display relationships. • JOINs occur at query time. • Tables must have a common data element to be joined.
  • 41. 41 Example JOIN SELECT Author, Title, Publisher FROM Author-table, Book-table, Publisher-table WHERE Author-table.ID = Book-table.AID AND Book-table.PID = Publisher-table.ID
  • 42. 42 Results of Join Author Title Publisher Author 1 Book 1 Publisher 1 Author 1 Book 4 Publisher 2 Author 1 Book 5 Publisher 1 Author 2 Book 3 Publisher 2 Author 3 Book 2 Publisher 2 Answer-table
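The three-table join can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table and column names (`author`, `book`, `publisher`, `name`) are simplified stand-ins for the slide's labels, and the rows are copied from the Author, Book and Publisher tables on slide 38.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE author    (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE publisher (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE book      (title TEXT, aid TEXT, pid TEXT);

INSERT INTO author VALUES ('A1', 'Author 1'), ('A2', 'Author 2'), ('A3', 'Author 3');
INSERT INTO publisher VALUES ('P1', 'Publisher 1'), ('P2', 'Publisher 2');
INSERT INTO book VALUES
    ('Book 1', 'A1', 'P1'), ('Book 2', 'A3', 'P2'), ('Book 3', 'A2', 'P2'),
    ('Book 4', 'A1', 'P2'), ('Book 5', 'A1', 'P1');
""")

# The join is computed at query time by matching the relating fields.
rows = conn.execute("""
SELECT author.name, book.title, publisher.name
FROM author, book, publisher
WHERE author.id = book.aid AND book.pid = publisher.id
ORDER BY author.name, book.title
""").fetchall()

for row in rows:
    print(row)
```

No pointers link the tables; the relationship exists only because `book.aid` and `book.pid` hold values that also appear in the other tables' `id` columns.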
  • 43. 43 Relational Model Operations • Selection – select which rows to display • Projection – select which columns to display • Join – combine two or more tables
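The three operations can be illustrated on plain Python lists of dictionaries, where each dictionary is a row. The helper names `select`, `project` and `join` are hypothetical, and the rows reuse the Book and Author tables from the earlier slides; a real DBMS would do this through its query language.

```python
books = [
    {"title": "Book 1", "aid": "A1", "pid": "P1"},
    {"title": "Book 2", "aid": "A3", "pid": "P2"},
    {"title": "Book 3", "aid": "A2", "pid": "P2"},
    {"title": "Book 4", "aid": "A1", "pid": "P2"},
    {"title": "Book 5", "aid": "A1", "pid": "P1"},
]
authors = [
    {"id": "A1", "author": "Author 1"},
    {"id": "A2", "author": "Author 2"},
    {"id": "A3", "author": "Author 3"},
]

def select(rows, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def project(rows, columns):
    """Projection: keep only the named columns of every row."""
    return [{column: row[column] for column in columns} for row in rows]

def join(left, right, left_column, right_column):
    """Join: combine every pair of rows whose relating fields match."""
    return [{**left_row, **right_row}
            for left_row in left
            for right_row in right
            if left_row[left_column] == right_row[right_column]]

books_by_a1 = select(books, lambda row: row["aid"] == "A1")
titles_by_a1 = project(books_by_a1, ["title"])
books_with_authors = join(books, authors, "aid", "id")
```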
  • 44. 44 Types of Relations • 1-1 – 1-to-1 • 1-n – 1-to-many • n-n – many-to-many
  • 45. 45 Name of the game • Using the relational model, • Represent each type of relationship – as simply as possible (using the fewest tables), – with a minimum of duplicated data, and – with a minimum of wasted space (empty fields)
  • 46. 46 Tables needed for 1-1 Author Title Author1 Book1 Author2 Book2 Author3 Book3 Book
  • 47. 47 Tables needed for 1-n ID Name 1 Author1 2 Author2 3 Author3 Author ID Title 1 Book1 1 Book2 2 Book3 3 Book4 2 Book5 Book
  • 48. 48 Tables needed for n-n ID Name 1 Author1 2 Author2 3 Author3 AID BID 1 1 1 2 2 1 2 2 3 1 3 5 3 4 1 5 2 5 ID Title 1 Book1 2 Book2 3 Book3 4 Book4 5 Book5 Author Book Writes
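A many-to-many relationship resolved through a third "Writes" table can be sketched as follows. The Author and Book rows follow the slide, but the `writes` pairs here are an illustrative subset chosen for this example, not a transcription of the slide's table, and the function names are hypothetical.

```python
authors = {1: "Author1", 2: "Author2", 3: "Author3"}
books = {1: "Book1", 2: "Book2", 3: "Book3", 4: "Book4", 5: "Book5"}

# Each (author_id, book_id) pair records that one author wrote one book.
# Illustrative pairs only.
writes = [(1, 1), (1, 2), (2, 1), (3, 4)]

def books_by_author(author_id):
    """Follow the Writes table from an author to all of their books."""
    return [books[bid] for aid, bid in writes if aid == author_id]

def authors_of_book(book_id):
    """Follow the Writes table from a book to all of its authors."""
    return [authors[aid] for aid, bid in writes if bid == book_id]
```

Because each pairing is its own row, neither the Author table nor the Book table needs repeated rows or empty fields, however many co-authors a book has.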
  • 49. 49 Advantages & Disadvantages • Hierarchical & Network Data Models – faster for “pre-defined” queries – slower for ad-hoc queries – inflexible, more expensive to maintain • Relational Data Models – flexible, less expensive to maintain – most queries require joins and are slower than “pre-defined” queries mentioned above
  • 50. 50 Entity-relationship diagram • A conceptual model useful in database design. • Illustrates the relationships between various entities in the database. • Entities are represented by rectangles. • Relationships represented by diamonds. • Attributes can be assigned to both entities and relationships.
  • 52. 52 Centralized Database • All database files are stored on a central computer. • All database processing is performed by the central computer. • Problems – can overload central system – not very fault tolerant – communications costs can be high
  • 53. 53 Distributed Databases • Distributed Processing – processing is performed locally by processors connected by a communications network. • Distributed Databases – the physical files that make up the database are stored in more than one location
  • 54. 54 Distributed Databases • Duplicate Database – each location has its own copy of the entire database. • Partitioned Database – each location has a copy of the portion of the database that it needs.
  • 55. 55 Distributed Databases • Central Index – Records are stored locally, but a centralized index is maintained to quickly locate any record. • Ask-the-network – Records are stored locally and the network must be polled each time a record is needed.
  • 56. 56 Data Warehousing • A database with associated reporting and query tools, • that stores current and historical data extracted from various operational systems • and consolidated for management reporting and analysis.
  • 57. 57 A Data Warehouse... • Sits on top of existing isolated legacy systems, “islands of information”, to provide an enterprise-wide database. • Provides single platform, standardized access to current operational data and historical data (not normally maintained on legacy systems).
  • 58. 58 Obstacles to Database Implementation • Organizational – structural changes – political changes • Cost/benefit considerations • Placement of Data Management Function – need data administration and planning at highest possible organizational level
