Department of Information Technology
Database Technologies (ITB4201)
File Operation
Dr. C.V. Suresh Babu
Professor
Department of IT
Hindustan Institute of Science & Technology
Action Plan
• Overview
• File organisation and Access
• File Directories
• File Sharing
• Record Blocking
• Quiz
Files
• Files are the central element to most applications
– file as an input to applications
– file as an output for long-term storage and for later access
• Desirable properties of files:
– Long-term existence
– Controlled sharing between processes
– Structure that is convenient for particular applications
File Structure
Fields and Records
• Fields
– Basic element of data
• e.g., student’s last name
– Contains a single value
– Characterized by its length and data type
• Records
– Collection of related fields
• e.g., a student record
– Treated as a unit
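As a rough illustration of fields and records, a fixed-format record can be sketched with Python's standard `struct` module. The student fields (`last_name`, `roll_no`, `gpa`), their lengths and the layout string are assumptions made for this example only.

```python
import struct

# Hypothetical student record: each field has a fixed length and data type.
#   last_name : 20-byte string
#   roll_no   : 4-byte unsigned integer
#   gpa       : 8-byte float
RECORD_LAYOUT = struct.Struct("<20sId")   # "<" = little-endian, no padding

def pack_student(last_name: str, roll_no: int, gpa: float) -> bytes:
    """Pack the three fields into one fixed-length record."""
    return RECORD_LAYOUT.pack(last_name.encode("utf-8"), roll_no, gpa)

def unpack_student(record: bytes) -> tuple[str, int, float]:
    """Split a record back into its fields, treating the record as a unit."""
    raw_name, roll_no, gpa = RECORD_LAYOUT.unpack(record)
    return raw_name.rstrip(b"\x00").decode("utf-8"), roll_no, gpa

record = pack_student("Kumar", 42, 8.5)
print(RECORD_LAYOUT.size)        # 32 (bytes per record)
print(unpack_student(record))    # ('Kumar', 42, 8.5)
```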
File Structure
File and Database
• File
– Collection of similar records
– Treated as a single entity and may be referenced by name
– Access control restrictions usually apply at the file level
• Database
– Collection of related data
– Explicit relationships exist among elements
– Consists of one or more files
A Big Picture
• How to identify and locate a selected file?
• How to enforce user access control in shared systems?
• How to organize records as a sequence of blocks for I/O?
• Individual block I/O requests must be scheduled to optimize performance
• How to organize records in a file and access a particular record in a file?
File Organization
• The basic operations that a user or application may perform on
a file are performed at the record level
– The file is viewed as having some structure that organizes the records
• File organization refers to the logical structuring of records
– Determined by the way in which files are accessed (access method)
Criteria for File Organization
• Important criteria include:
– Short access time
– Ease of update
– Economy of storage
– Simple maintenance
– Reliability
• Priority will differ depending on the use
– For batch mode file processing, rapid access for retrieval of a
single record is of minimal concern
• These criteria may conflict
– Indexes are a primary means of increasing the speed of access to data,
but they take extra space and so conflict with economy of storage
The Pile
• Data are collected in the order they arrive
– No structure
• Purpose is to accumulate a mass of data and save it
• Records may have different fields
– Each field should be self-describing (field name + value)
– The length of each field must be known (from delimiters, a length
subfield, or a default for the field type)
• Record access is by exhaustive search
• Used when data are collected and stored prior to
processing or data are not easy to organize
• Advantages
– Uses space well when data vary in size and structure
– Adequate for exhaustive searches
– Easy to update
• Drawback
– Unsuitable for most applications
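A minimal sketch of a pile, assuming records are held in memory as Python dictionaries so that every field carries its own name. The sample records and the helper names `append_record` and `exhaustive_search` are made up for illustration.

```python
# A pile: records are appended in arrival order and may have different fields.
# Each field is self-describing (name + value); here a dict plays that role.
pile: list[dict[str, str]] = []

def append_record(record: dict[str, str]) -> None:
    """Add a record at the end of the pile; no ordering is maintained."""
    pile.append(record)

def exhaustive_search(field: str, value: str) -> list[dict[str, str]]:
    """Examine every record, since the pile offers no structure to exploit."""
    return [r for r in pile if r.get(field) == value]

append_record({"name": "Asha", "dept": "IT"})
append_record({"sensor": "t1", "reading": "31.5"})        # different fields
append_record({"name": "Ravi", "dept": "IT", "year": "2"})

print(exhaustive_search("dept", "IT"))   # the two records whose dept is IT
```

Updating is a plain append, which is why the pile is easy to update, while every lookup has to touch all records.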
The Sequential File
• Fixed format used for records
• Records are of the same length
– same number of fixed-length fields in a particular order
• Only the values of fields need to be stored
• Field name and length are attributes of the file
structure
• Key field
– Uniquely identifies the record
– Records are stored in key sequence
• Advantages
– Optimal for batch applications that process all the records
– Easily stored on tape as well as disk
• Drawback
– Poor performance for interactive applications: finding a key match
requires a sequential search of the file, with considerable processing
and delay
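The cost of a sequential search for a key match can be sketched as follows. The in-memory list of `(key, value)` tuples stands in for a file stored in key sequence and is an assumption of this example.

```python
from typing import Optional

def sequential_search(records: list[tuple[int, str]], key: int) -> tuple[Optional[str], int]:
    """Scan key-ordered records from the start; return (value, records read)."""
    accesses = 0
    for record_key, value in records:
        accesses += 1
        if record_key == key:
            return value, accesses
        if record_key > key:        # keys are sorted, so the key is absent
            break
    return None, accesses

# Hypothetical file of (key, value) records stored in key sequence.
records = [(k, f"record-{k}") for k in range(10, 110, 10)]   # keys 10..100

print(sequential_search(records, 70))   # ('record-70', 7)
print(sequential_search(records, 75))   # (None, 8): stops at key 80
```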
Indexed Sequential File
• An index is added to support random access
– An index record contains a key field and a pointer into
the main file
– The index is a sequential file
– For searching
• Search the index to find the highest key value that is equal to
or precedes the desired key value
• Search continues in the main file at the location indicated by
the pointer
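The two search steps above can be sketched as follows. This is a simplified in-memory model: the file contents and the spacing of one index entry per 20 records are assumptions, and the index is searched with binary search (`bisect`) for brevity, whereas the index described on the slide is itself a sequential file.

```python
from bisect import bisect_right

# Main file: records in key sequence. Hypothetical (key, value) tuples.
main_file = [(k, f"record-{k}") for k in range(0, 1000, 5)]   # keys 0, 5, ..., 995

# Index: one (key, position) entry for every 20th record of the main file.
index = [(main_file[pos][0], pos) for pos in range(0, len(main_file), 20)]
index_keys = [key for key, _ in index]

def lookup(key: int):
    """Return the value stored under key, or None if it is absent."""
    # Step 1: find the highest index key that is equal to or precedes the key.
    slot = bisect_right(index_keys, key) - 1
    if slot < 0:
        return None                      # key is smaller than every indexed key
    # Step 2: continue sequentially in the main file from the pointer.
    _, pos = index[slot]
    for record_key, value in main_file[pos:]:
        if record_key == key:
            return value
        if record_key > key:
            break
    return None

print(lookup(505))   # record-505
print(lookup(503))   # None
```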
Indexed Sequential File
Example
• Consider searching a particular key value in a sequential file with
1 million records
– without index
• requires on average one-half million record accesses
– with an index containing 1000 entries with the keys in the index evenly
distributed over the main file
• requires on average 500 accesses to the index file + 500 accesses to the main
file
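The arithmetic in this example can be checked with a few lines of Python. The function name `average_accesses` is hypothetical, and the calculation assumes that every key is equally likely to be requested and that both the index and the main file are searched sequentially.

```python
def average_accesses(num_records: int, index_entries: int = 0) -> float:
    """Average record accesses for a sequential search, with or without an index.

    Assumes every key is equally likely and that index keys are spread evenly
    over the main file, as in the example above.
    """
    if index_entries == 0:
        return num_records / 2                    # scan half the file on average
    records_per_entry = num_records / index_entries
    return index_entries / 2 + records_per_entry / 2

print(average_accesses(1_000_000))          # 500000.0
print(average_accesses(1_000_000, 1000))    # 1000.0 (500 index + 500 main file)
```

Under these assumptions the index cuts the average search length from 500,000 accesses to 1,000.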
Indexed Sequential File
• An overflow file is added
– A new record is added to the overflow file and is located by
following a pointer from its predecessor record
– The indexed sequential file is occasionally merged with the
overflow file in batch mode
• The index and overflow file greatly reduce the time required to access
a single record, without sacrificing the sequential nature of the file
Indexed File
• Records are accessed only through their indexes
– no restriction on the placement of records
– allows variable-length records
• Uses multiple indexes for different key fields
– An exhaustive index contains one entry for every
record in the main file
– A partial index contains entries to records where the
field of interest exists
• When a new record is added to the main file, all of the index files
must be updated.
• Used mostly in applications where
– timeliness of information is critical and
– data are rarely processed exhaustively
– examples: airline reservation systems and inventory control systems
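A small sketch of an indexed file with one exhaustive and one partial index. The flight records, the field names and the helpers `build_index` and `add_record` are invented for this example; list positions stand in for pointers into the main file.

```python
# Main file: records in no particular order; positions act as pointers.
# The record contents are made up for illustration.
main_file = [
    {"flight": "AI101", "origin": "MAA", "meal": "veg"},
    {"flight": "6E202", "origin": "DEL"},                  # no "meal" field
    {"flight": "UK303", "origin": "MAA", "meal": "non-veg"},
]

def build_index(field: str) -> dict[str, list[int]]:
    """Map each value of `field` to the positions of the records holding it.

    Records lacking the field get no entry, so the index is exhaustive only
    when every record has the field, and partial otherwise.
    """
    index: dict[str, list[int]] = {}
    for pos, record in enumerate(main_file):
        if field in record:
            index.setdefault(record[field], []).append(pos)
    return index

flight_index = build_index("flight")   # exhaustive: one entry per record
meal_index = build_index("meal")       # partial: only records with a meal

def add_record(record: dict[str, str]) -> None:
    """Append a record to the main file and update every index."""
    main_file.append(record)
    pos = len(main_file) - 1
    for field, index in (("flight", flight_index), ("meal", meal_index)):
        if field in record:
            index.setdefault(record[field], []).append(pos)

add_record({"flight": "SG404", "origin": "BLR", "meal": "veg"})
print([main_file[p]["flight"] for p in meal_index["veg"]])   # ['AI101', 'SG404']
```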
File Directory
• Contains information about files
– Attributes
– Location
– Ownership
• Directory itself is a file owned by the operating system
Directory Elements
• Basic Information
– File name: must be unique
– File type: e.g., text, binary
– File organization
• Address Information
– Volume: device on which file is stored
– Starting address: e.g., cylinder, track on disk
– Size used: in bytes, words or blocks
– Size allocated: maximum size of the file
• Access Control Information
– Owner: able to grant/deny access to other users and to change these privileges
– Access information: e.g., user’s name and password for each authorized user
– Permitted actions: controls reading, writing, executing, transmitting over a
network
• Usage Information
– Date Created, Identity of Creator, Date Last Read Access, Identity of Last Reader,
Date Last Modified
Hierarchical, or Tree-Structured Directory
• Master directory with user directories
underneath it
• Each user directory may have
subdirectories and files as entries
• Each directory and subdirectory can be
organized as a sequential file
• Makes it easy to enforce access restrictions on directories
• Makes it easy to organize collections of files
• Minimizes the difficulty of assigning unique names
Naming
• The tree structure allows users to find a file by following a path
from the root or master directory down various branches until
the file is reached
• The series of directory names, culminating in the file name
itself, constitutes a pathname for the file
• Duplicate filenames are possible if they have different
pathnames
• Usually an interactive user or a
process is associated with a current
or working directory
– Files are referenced relative to the
working directory unless an explicit full
pathname is used
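Pathname resolution against a working directory can be sketched with the standard `posixpath` module. The directory names are hypothetical, and POSIX-style `/` separators are assumed.

```python
import posixpath

def resolve(working_dir: str, name: str) -> str:
    """Return the full pathname for `name` as seen from `working_dir`.

    A name that already starts at the root is an explicit full pathname and is
    used as given; anything else is taken relative to the working directory.
    """
    # posixpath.join discards working_dir when name is absolute.
    return posixpath.normpath(posixpath.join(working_dir, name))

# Hypothetical directory tree: /user_a/reports and /user_b/reports
print(resolve("/user_a", "reports/q1.txt"))      # /user_a/reports/q1.txt
print(resolve("/user_b", "reports/q1.txt"))      # /user_b/reports/q1.txt
print(resolve("/user_b", "/user_a/notes.txt"))   # /user_a/notes.txt
```

The first two calls show the same file name, `q1.txt`, existing twice without conflict because the pathnames differ.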
File Sharing
• In a multiuser system, there is almost always a requirement for
allowing files to be shared among a number of users
• Two issues
– Access rights
– Management of simultaneous access
Access Rights
• A wide variety of access rights have been used by various
systems
– often as a hierarchy, with each right implying those that precede it.
• None
– The user may not even learn that the file exists; this is enforced by
not allowing the user to read the directory that includes the file
• Knowledge
– User can only determine that the file exists and who its owner is
• Execution
– The user can load and execute a program but cannot copy it, e.g.,
proprietary programs
• Reading
– The user can read the file for any purpose, including copying and
execution
• Appending
– The user can add data to the file but cannot modify or delete any of
the file’s contents
• Updating
– The user can modify, delete, and add to the file’s data.
• Changing protection
– User can change access rights granted to other users
• Deletion
– User can delete the file
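One way to model a hierarchy in which each right implies those that precede it is an ordered enumeration. This is a simplified sketch: it assumes a strictly linear ordering in the sequence listed above, which real systems need not follow, and the names `Right` and `allows` are hypothetical.

```python
from enum import IntEnum

class Right(IntEnum):
    """Access rights in the order listed above; a higher right implies lower ones."""
    NONE = 0
    KNOWLEDGE = 1
    EXECUTION = 2
    READING = 3
    APPENDING = 4
    UPDATING = 5
    CHANGING_PROTECTION = 6
    DELETION = 7

def allows(granted: Right, requested: Right) -> bool:
    """True if the granted right is the requested one or one that implies it."""
    return granted >= requested

print(allows(Right.READING, Right.EXECUTION))   # True: reading implies execution
print(allows(Right.APPENDING, Right.UPDATING))  # False
```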
User Classes
• Access can be provided to different classes of users
– Owner: usually the file's creator, has full rights and may grant rights
to others
– Specific users: individual users who are designated by user ID
– User groups: a set of users identified as a group
– All: all users who have access to this system
Simultaneous Access
• When more than one user is granted access to append to or update a
file, the OS or file management system must enforce discipline
• A user may lock the entire file or individual records during an
update
• Mutual exclusion and deadlock are issues for shared access (see the
readers/writers problem)
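A minimal sketch of the locking discipline, using threads in one process as stand-ins for users. The `ReadWriteLock` class is written for this example and is not a standard library type; it favours readers, so a steady stream of readers could starve a writer, and it does not address deadlock.

```python
import threading

class ReadWriteLock:
    """Many readers may hold the lock at once; a writer needs it exclusively."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self) -> None:
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self) -> None:
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self) -> None:
        with self._cond:
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True

    def release_write(self) -> None:
        with self._cond:
            self._writing = False
            self._cond.notify_all()

# Hypothetical shared record updated by several threads under the write lock.
lock = ReadWriteLock()
record = {"balance": 0}

def deposit(amount: int) -> None:
    lock.acquire_write()
    try:
        record["balance"] += amount      # update happens under mutual exclusion
    finally:
        lock.release_write()

threads = [threading.Thread(target=deposit, args=(10,)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

lock.acquire_read()
print(record["balance"])                 # 500
lock.release_read()
```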
Blocks and records
• Records are the logical unit of access of a structured file
• Blocks are the unit for I/O with secondary storage
• For I/O to be performed, records must be organized as blocks.
• Three methods of blocking are common
– Fixed-length blocking
– Variable-length spanned blocking
– Variable-length unspanned blocking
Fixed Blocking
• Fixed-length records are used, and an integral number of
records are stored in a block
• Unused space at the end of a block is internal fragmentation
• Common for sequential files with fixed-length records
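The blocking arithmetic for fixed-length records can be sketched as below. The block and record sizes are assumed values, and `fixed_blocking` is a hypothetical helper name.

```python
def fixed_blocking(block_size: int, record_size: int, num_records: int) -> dict[str, int]:
    """Blocking figures for fixed-length records stored whole in blocks."""
    if record_size > block_size:
        raise ValueError("a record must fit in one block")
    per_block = block_size // record_size          # integral number of records
    wasted_per_block = block_size - per_block * record_size
    blocks_needed = -(-num_records // per_block)   # ceiling division
    return {
        "records_per_block": per_block,
        "wasted_bytes_per_block": wasted_per_block,
        "blocks_needed": blocks_needed,
    }

# Example sizes are assumptions: 4096-byte blocks, 300-byte records.
print(fixed_blocking(block_size=4096, record_size=300, num_records=1000))
# {'records_per_block': 13, 'wasted_bytes_per_block': 196, 'blocks_needed': 77}
```

Here 13 records fill 3,900 of the 4,096 bytes, so 196 bytes at the end of each full block are internal fragmentation.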
[Figure: Fixed Blocking]
Variable-Length Spanned Blocking
• Variable-length records are used and are packed into blocks
with no unused space
• Some records may span multiple blocks
– Continuation is indicated by a pointer to the successor block
• Advantage
– Efficient for storage and does not limit the size of records
• Drawbacks
– Difficult to implement
– Records that span two blocks require two I/O operations
[Figure: Variable Blocking: Spanned]
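A sketch of spanned packing, assuming records are described only by their sizes in bytes and ignoring the space a real implementation would need for continuation pointers. The function names and sample sizes are made up for this example.

```python
def pack_spanned(record_sizes: list[int], block_size: int) -> list[list[tuple[int, int]]]:
    """Pack records back to back; each block lists (record number, bytes stored)."""
    blocks: list[list[tuple[int, int]]] = [[]]
    free = block_size
    for number, size in enumerate(record_sizes):
        remaining = size
        while remaining > 0:
            if free == 0:                 # current block is full: start a new one
                blocks.append([])
                free = block_size
            piece = min(remaining, free)  # a record may continue in the next block
            blocks[-1].append((number, piece))
            remaining -= piece
            free -= piece
    return blocks

def blocks_touched(blocks: list[list[tuple[int, int]]], number: int) -> int:
    """Number of blocks (hence I/O operations) needed to read one record."""
    return sum(any(n == number for n, _ in block) for block in blocks)

layout = pack_spanned([60, 50, 120, 30], block_size=100)
print(layout)
# [[(0, 60), (1, 40)], [(1, 10), (2, 90)], [(2, 30), (3, 30)]]
print(blocks_touched(layout, 1))   # 2: record 1 spans two blocks
```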
Variable-Length Unspanned Blocking
• Uses variable-length records without spanning
• Drawbacks
– Space is wasted in most blocks, because the remainder of a block
cannot be used if the next record is larger than the remaining
unused space
– Limits record size to the size of a block
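A sketch of unspanned packing that also counts the space left unused when a record does not fit. Records are represented by their sizes only, and the sizes and the name `pack_unspanned` are assumptions of this example.

```python
def pack_unspanned(record_sizes: list[int], block_size: int) -> tuple[list[list[int]], int]:
    """Place whole records in blocks; return (blocks of record sizes, wasted bytes)."""
    blocks: list[list[int]] = [[]]
    free = block_size
    wasted = 0
    for size in record_sizes:
        if size > block_size:
            raise ValueError("record size is limited to the size of a block")
        if size > free:                   # record does not fit: leave the rest unused
            wasted += free
            blocks.append([])
            free = block_size
        blocks[-1].append(size)
        free -= size
    return blocks, wasted

blocks, wasted = pack_unspanned([60, 50, 120, 30], block_size=128)
print(blocks)   # [[60, 50], [120], [30]]
print(wasted)   # 26 bytes left unused in the first two blocks
```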
[Figure: Variable Blocking: Unspanned]
Revisit the Big Picture
• Describes the location of all files plus their attributes
• Only authorized users are allowed to access particular files in
particular ways
• Records must be organized as a sequence of blocks for output and
unblocked after input
• Individual block I/O requests must be scheduled to optimize
performance
• User views the file as having some structure that organizes the
records; different access methods reflect different file structures
Test Yourself
1. A file is:
a) an abstract data type
b) logical storage unit
c) usually non volatile
d) volatile
2. A large collection of files is called a ____________
a) Fields
b) Records
c) Database
d) Sectors
3. A unit of storage that can store one or more records in a hash file organization is denoted as
a) Buckets
b) Disk pages
c) Blocks
d) Nodes
4. How can variable length records arise in a file
a) Storage of multiple record types in a file
b) Record types that allow variable lengths for one or more fields
c) Record types that allow repeating fields, such as arrays or multisets
d) All of the mentioned
5. The slotted page structure is used for _________
a) Organizing records in a block
b) Organizing blocks in a database
c) Deleting records from a block
d) None of the mentioned
Answers
1. A file is:
a) an abstract data type
b) logical storage unit
c) usually non volatile
d) volatile
2. A large collection of files is called a ____________
a) Fields
b) Records
c) Database
d) Sectors
3. A unit of storage that can store one or more records in a hash file organization is denoted as
a) Buckets
b) Disk pages
c) Blocks
d) Nodes
4. How can variable length records arise in a file
a) Storage of multiple record types in a file
b) Record types that allow variable lengths for one or more fields
c) Record types that allow repeating fields, such as arrays or multisets
d) All of the mentioned
5. The slotted page structure is used for _________
a) Organizing records in a block
b) Organizing blocks in a database
c) Deleting records from a block
d) None of the mentioned

Database File operation

  • 1. Department of Information Technology 1Data base Technologies (ITB4201) File Operation Dr. C.V. Suresh Babu Professor Department of IT Hindustan Institute of Science & Technology
  • 2. Department of Information Technology 2Data base Technologies (ITB4201) Action Plan • Overview • File organisation and Access • File Directories • File Sharing • Record Blocking • Quiz
  • 3. Department of Information Technology 3Data base Technologies (ITB4201) Files • Files are the central element to most applications – file as an input to applications – file as an output for long-term storage and for later access • Desirable properties of files: – Long-term existence – Controlled sharing between processes – Structure that is convenient for particular applications
  • 4. Department of Information Technology 4Data base Technologies (ITB4201) File Structure Fields and Records • Fields – Basic element of data • e.g., student’s last name – Contains a single value – Characterized by its length and data type • Records – Collection of related fields • e.g., a student record – Treated as a unit
  • 5. Department of Information Technology 5Data base Technologies (ITB4201) File Structure File and Database • File – Collection of similar records – Treated as a single entity and may be referenced by name – Access control restrictions usually apply at the file level • Database – Collection of related data – Explicit relationships exist among elements – Consists of one or more files
  • 6. Department of Information Technology 6Data base Technologies (ITB4201) A Big Picture How to identify and locate a selected file? How to enforce user access control in shared systems? How to organize records as a sequence of blocks for I/O? individual block I/O requests must be scheduled for optimizing performanceHow to organize records in a file and access a particular record in a file?
  • 7. Department of Information Technology 7Data base Technologies (ITB4201) Roadmap • Overview • File organisation and Access • File Directories • File Sharing • Record Blocking
  • 8. Department of Information Technology 8Data base Technologies (ITB4201) File Organization • The basic operations that a user or application may perform on a file are performed at the record level – The file is viewed as having some structure that organizes the records • File organization refers to the logical structuring of records – Determined by the way in which files are accessed (access method)
  • 9. Department of Information Technology 9Data base Technologies (ITB4201) Criteria for File Organization • Important criteria include: – Short access time – Ease of update – Economy of storage – Simple maintenance – Reliability
  • 10. Department of Information Technology 10Data base Technologies (ITB4201) Criteria for File Organization • Priority will differ depending on the use – For batch mode file processing, rapid access for retrieval of a single record is of minimal concern • These criteria may conflict – Use of indexes (conflict with economy of storage) can be a primary means of increasing the speed of access to data
  • 11. Department of Information Technology 11Data base Technologies (ITB4201) The Pile • Data are collected in the order they arrive – No structure • Purpose is to accumulate a mass of data and save it • Records may have different fields – field should be self-describing (field name + value) – field length should be known (delimiters, subfield or default for a field type)
  • 12. Department of Information Technology 12Data base Technologies (ITB4201) The Pile • Record access is by exhaustive search • Used when data are collected and stored prior to processing or data are not easy to organize •  Uses space well when data vary in size and structure •  Adequate for exhaustive searches •  Easy to update •  Unsuitable for most applications
  • 13. Department of Information Technology 13Data base Technologies (ITB4201) The Sequential File • Fixed format used for records • Records are of the same length – same number of fixed-length fields in a particular order • Only the values of fields need to be stored • Field name and length are attributes of the file structure
  • 14. Department of Information Technology 14Data base Technologies (ITB4201) The Sequential File • Key field – Uniquely identifies the record – Records are stored in key sequence •  Optimal for batch applications if they involve the processing of all the records •  Easily stored on tape and disk •  Poor performance for interactive applications – considerable processing and delay due to the sequential search of the file for a key match
  • 15. Department of Information Technology 15Data base Technologies (ITB4201) Indexed Sequential File • An index is added to support random access – An index record contains a key field and a pointer into the main file – The index is a sequential file – For searching • Search the index to find the highest key value that is equal to or precedes the desired key value • Search continues in the main file at the location indicated by the pointer
  • 16. Indexed Sequential File Example • Consider searching for a particular key value in a sequential file with 1 million records – without an index • requires on average one-half million record accesses – with an index containing 1000 entries, with the keys in the index evenly distributed over the main file • requires on average 500 accesses to the index file + 500 accesses to the main file, i.e., about 1000 accesses in total
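The two-step search and the access estimate can be sketched as follows. The function names are hypothetical, the main file is modelled as a sorted list of keys, and the index is searched with a binary search (`bisect`) rather than the sequential index scan assumed in the slide's estimate.

```python
from bisect import bisect_right

def build_index(keys, step):
    # One index entry (key, position) for every `step`-th record of the main file.
    return [(keys[i], i) for i in range(0, len(keys), step)]

def lookup(keys, index, wanted):
    """Return the position of `wanted` in the main file, or None."""
    index_keys = [k for k, _ in index]
    slot = bisect_right(index_keys, wanted) - 1   # highest index key <= wanted
    if slot < 0:
        return None
    pos = index[slot][1]                          # pointer into the main file
    while pos < len(keys) and keys[pos] <= wanted:
        if keys[pos] == wanted:
            return pos
        pos += 1
    return None

def expected_accesses(n_records, n_index_entries):
    # Average accesses with sequential scans: (without index, with index).
    without = n_records / 2
    with_index = n_index_entries / 2 + (n_records / n_index_entries) / 2
    return without, with_index

keys = list(range(0, 2000, 2))        # 1000 even keys
index = build_index(keys, 100)        # 10 index entries
print(lookup(keys, index, 500), expected_accesses(1_000_000, 1000))
```

With the figures from the slide, `expected_accesses` reproduces the comparison of roughly 500,000 accesses against 500 + 500.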
  • 17. Indexed Sequential File • An overflow file is added • A new record is added to the overflow file and is located by following a pointer from its predecessor record • The indexed sequential file is occasionally merged with the overflow file in batch mode • Greatly reduces the time required to access a single record, without sacrificing the sequential nature
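A small sketch of the overflow mechanism, under assumed data structures: each record is a list `[key, value, pointer]`, where the pointer is a position in the overflow list or `None`. The class and method names are hypothetical.

```python
from bisect import bisect_right

class IndexedSequentialFile:
    def __init__(self, records):
        # Main file in key order; each record carries an overflow pointer.
        self.main = [[k, v, None] for k, v in sorted(records)]
        self.overflow = []            # records added since the last merge

    def insert(self, key, value):
        keys = [rec[0] for rec in self.main]
        i = bisect_right(keys, key) - 1
        if i < 0:
            raise ValueError("sketch assumes the key is not below the first main key")
        self.overflow.append([key, value, None])
        new = len(self.overflow) - 1
        prev = self.main[i]           # predecessor record in the main file
        # Walk the predecessor's chain so that the chain stays in key order.
        while prev[2] is not None and self.overflow[prev[2]][0] < key:
            prev = self.overflow[prev[2]]
        self.overflow[new][2] = prev[2]
        prev[2] = new

    def scan(self):
        # Sequential processing: follow overflow pointers after each main record.
        for key, value, ptr in self.main:
            yield key, value
            while ptr is not None:
                key, value, ptr = self.overflow[ptr]
                yield key, value

    def merge(self):
        # Batch merge: rebuild the main file and empty the overflow file.
        self.__init__(list(self.scan()))

f = IndexedSequentialFile([(10, "A"), (20, "B"), (40, "D")])
f.insert(30, "C")
f.insert(25, "B2")
print(list(f.scan()))
```

Insertions never move main-file records, and `scan` still returns every record in key order, which is the property the slide calls the sequential nature.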
  • 18. Indexed File • Records are accessed only through their indexes – no restriction on the placement of records – allows variable-length records • Uses multiple indexes for different key fields – An exhaustive index contains one entry for every record in the main file – A partial index contains entries to records where the field of interest exists
  • 19. Indexed File • When a new record is added to the main file, all of the index files must be updated. • Used mostly in applications where – timeliness of information is critical and – data are rarely processed exhaustively – examples: airline reservation systems and inventory control systems
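The difference between an exhaustive and a partial index can be illustrated as below. The field names `id` and `email` and the dictionary-based indexes are assumptions made for the example.

```python
# Hypothetical indexed file: the main file has no ordering constraint,
# and every access goes through one of the indexes.
records = []     # main file
by_id = {}       # exhaustive index: one entry for every record
by_email = {}    # partial index: only records that have an "email" field

def add(record: dict) -> int:
    pos = len(records)
    records.append(record)
    # Every index that applies must be updated on each insertion.
    by_id[record["id"]] = pos
    if "email" in record:
        by_email[record["email"]] = pos
    return pos

add({"id": 1, "name": "Asha", "email": "asha@example.com"})
add({"id": 2, "name": "Ravi"})
print(records[by_email["asha@example.com"]]["name"], len(by_id), len(by_email))
```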
  • 20. Roadmap • Overview • File organisation and Access • File Directories • File Sharing • Record Blocking
  • 21. File Directory • Contains information about files – Attributes – Location – Ownership • Directory itself is a file owned by the operating system
  • 22. Directory Elements • Basic Information – File name: must be unique within its directory – File type: e.g., text, binary – File organization • Address Information – Volume: device on which file is stored – Starting address: e.g., cylinder, track on disk – Size used: in bytes, words or blocks – Size allocated: maximum size of the file
  • 23. Directory Elements • Access Control Information – Owner: able to grant/deny access to other users and to change these privileges – Access information: e.g., user’s name and password for each authorized user – Permitted actions: controls reading, writing, executing, transmitting over a network • Usage Information – Date Created, Identity of Creator, Date Last Read Access, Identity of Last Reader, Date Last Modified
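The directory elements listed above map naturally onto a record type. The following is a sketch only: the attribute names, the use of plain strings for dates, and the rule that the owner may perform every action are assumptions, not a description of any particular file system.

```python
from dataclasses import dataclass, field

@dataclass
class DirectoryEntry:
    # Basic information
    name: str
    file_type: str
    organization: str
    # Address information
    volume: str
    starting_address: int
    size_used: int          # in blocks
    size_allocated: int     # in blocks
    # Access control information
    owner: str
    permitted_actions: dict = field(default_factory=dict)   # user -> set of actions
    # Usage information (dates kept as plain strings in this sketch)
    date_created: str = ""
    date_last_modified: str = ""

    def may(self, user: str, action: str) -> bool:
        # Assumption: the owner is allowed every action.
        if user == self.owner:
            return True
        return action in self.permitted_actions.get(user, set())

    def free_space(self) -> int:
        return self.size_allocated - self.size_used

entry = DirectoryEntry("report.txt", "text", "sequential", "disk0", 4096,
                       size_used=3, size_allocated=8, owner="asha",
                       permitted_actions={"ravi": {"read"}})
print(entry.may("ravi", "read"), entry.may("ravi", "write"), entry.free_space())
```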
  • 24. Hierarchical, or Tree-Structured Directory • Master directory with user directories underneath it • Each user directory may have subdirectories and files as entries • Each directory and subdirectory can be organized as a sequential file
  • 25. Hierarchical, or Tree-Structured Directory • Easily enforce access restriction on directories. • Easily organize collections of files. • Minimize the difficulty in assigning unique names.
  • 26. Naming • The tree structure allows users to find a file by following a path from the root or master directory down various branches until the file is reached • The series of directory names, culminating in the file name itself, constitutes a pathname for the file • Duplicate filenames are possible if they have different pathnames
  • 27. Naming • Usually an interactive user or a process is associated with a current or working directory – Files are referenced relative to the working directory unless an explicit full pathname is used
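Pathname resolution in a tree-structured directory can be sketched with nested dictionaries. The directory contents, the `/` separator and the `resolve` function are assumptions for illustration.

```python
# Hypothetical directory tree: a dict is a directory, a string is a file.
# The same file name appears twice, under different pathnames.
tree = {
    "user_a": {"docs": {"report.txt": "file A"}},
    "user_b": {"docs": {"report.txt": "file B"}},
}

def resolve(path: str, cwd: str = "/"):
    # A path that does not start at the root is relative to the working directory.
    if not path.startswith("/"):
        path = cwd.rstrip("/") + "/" + path
    node = tree
    for part in (p for p in path.split("/") if p):
        node = node[part]      # raises KeyError if a component does not exist
    return node

print(resolve("/user_a/docs/report.txt"))
print(resolve("docs/report.txt", cwd="/user_b"))
```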
  • 28. Roadmap • Overview • File organisation and Access • File Directories • File Sharing • Record Blocking
  • 29. File Sharing • In a multiuser system, there is almost always a requirement for allowing files to be shared among a number of users • Two issues – Access rights – Management of simultaneous access
  • 30. Access Rights • A wide variety of access rights have been used by various systems – often as a hierarchy, with each right implying those that precede it. • None – The user may not even learn of the file’s existence, because the user is not allowed to read the user directory that includes this file • Knowledge – The user can only determine that the file exists and who its owner is
  • 31. Access Rights (cont.) • Execution – The user can load and execute a program but cannot copy it, e.g., proprietary programs • Reading – The user can read the file for any purpose, including copying and execution • Appending – The user can add data to the file but cannot modify or delete any of the file’s contents
  • 32. Access Rights (cont.) • Updating – The user can modify, delete, and add to the file’s data. • Changing protection – The user can change access rights granted to other users • Deletion – The user can delete the file
  • 33. User Classes • Access can be provided to different classes of users – Owner: usually the file’s creator, has full rights and may grant rights to others – Specific users: individual users who are designated by user ID – User groups: a set of users identified as a group – All: all users who have access to this system
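The hierarchy of access rights and the user classes can be combined in a short sketch. The ordering of the enum follows the order in which the slides list the rights; the access-list layout and the precedence rule (owner, then specific user, then groups, then all) are assumptions for illustration.

```python
from enum import IntEnum

class Right(IntEnum):
    # Ordered so that each right implies those that precede it.
    NONE = 0
    KNOWLEDGE = 1
    EXECUTION = 2
    READING = 3
    APPENDING = 4
    UPDATING = 5
    CHANGING_PROTECTION = 6
    DELETION = 7

def allows(granted: Right, needed: Right) -> bool:
    return granted >= needed

def effective_right(acl: dict, user: str, groups: list) -> Right:
    # Assumed precedence: owner, then specific user, then groups, then all.
    if user == acl["owner"]:
        return Right.DELETION
    if user in acl["users"]:
        return acl["users"][user]
    group_rights = [acl["groups"][g] for g in groups if g in acl["groups"]]
    if group_rights:
        return max(group_rights)
    return acl["all"]

acl = {
    "owner": "asha",
    "users": {"ravi": Right.UPDATING},
    "groups": {"staff": Right.READING},
    "all": Right.KNOWLEDGE,
}
print(allows(effective_right(acl, "chen", ["staff"]), Right.EXECUTION))
```

Because the rights form a hierarchy, a single comparison answers any permission question once the effective right is known.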
  • 34. Simultaneous Access • When more than one user is granted access to append to or update a file, the OS or file management system must enforce discipline • A user may lock the entire file or individual records during an update • Mutual exclusion and deadlock are issues for shared access (see the readers/writers problem)
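Record-level locking can be sketched with threads standing in for users. This is an illustration under assumptions: the records are counters in a dictionary, and one `threading.Lock` per record provides the mutual exclusion; a real file system would lock on-disk records instead.

```python
import threading

def run_updates(n_users: int) -> int:
    records = {"r1": 0, "r2": 0}
    # One lock per record, so updates to different records do not block each other.
    record_locks = {key: threading.Lock() for key in records}

    def update(key: str, delta: int) -> None:
        with record_locks[key]:        # mutual exclusion during the update
            records[key] += delta

    users = [threading.Thread(target=update, args=("r1", 1))
             for _ in range(n_users)]
    for t in users:
        t.start()
    for t in users:
        t.join()
    return records["r1"]

print(run_updates(100))
```

Each update acquires a single lock, so this sketch cannot deadlock; deadlock becomes a concern once an update needs several records and users acquire their locks in different orders.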
  • 35. Roadmap • Overview • File organisation and Access • File Directories • File Sharing • Record Blocking
  • 36. Blocks and Records • Records are the logical unit of access of a structured file • Blocks are the unit for I/O with secondary storage • For I/O to be performed, records must be organized as blocks. • Three methods of blocking are common – Fixed blocking – Variable-length spanned blocking – Variable-length unspanned blocking
  • 37. Fixed Blocking • Fixed-length records are used, and an integral number of records are stored in a block • Unused space at the end of a block is internal fragmentation • Common for sequential files with fixed-length records
  • 38. Fixed Blocking
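The arithmetic behind fixed blocking is short enough to show directly. The function name and the example sizes (4096-byte blocks, 100-byte records) are assumptions for illustration.

```python
def fixed_blocking(block_size: int, record_size: int, n_records: int):
    """Return (records per block, blocks needed, unused bytes per full block)."""
    per_block = block_size // record_size        # integral number of records
    if per_block == 0:
        raise ValueError("a record does not fit in one block")
    blocks = -(-n_records // per_block)          # ceiling division
    wasted = block_size % record_size            # internal fragmentation
    return per_block, blocks, wasted

print(fixed_blocking(4096, 100, 1000))
```

With these sizes each block holds 40 records and leaves 96 bytes unused, so 1000 records occupy 25 blocks.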
  • 39. Variable-Length Spanned Blocking • Variable-length records are used and are packed into blocks with no unused space • Some records may span multiple blocks – Continuation is indicated by a pointer to the successor block • Efficient for storage and does not limit the size of records
  • 40. Variable Blocking: Spanned • Difficult to implement • Records that span two blocks require two I/O operations
  • 41. Variable-Length Unspanned Blocking • Uses variable-length records without spanning • Wasted space in most blocks because of the inability to use the remainder of a block if the next record is larger than the remaining unused space • Limits record size to the size of a block
  • 42. Variable Blocking: Unspanned
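The space trade-off between the two variable-length methods can be compared with a small sketch. The function names are hypothetical, and the spanned count ignores the space taken by continuation pointers and record-length fields.

```python
def unspanned_blocks(sizes, block_size):
    # A record that does not fit in the remainder of a block starts a new block.
    blocks, free = 0, 0
    for size in sizes:
        if size > block_size:
            raise ValueError("unspanned blocking limits records to the block size")
        if size > free:
            blocks += 1
            free = block_size
        free -= size
    return blocks

def spanned_blocks(sizes, block_size):
    # Records are packed end to end and may continue into the next block.
    return -(-sum(sizes) // block_size)          # ceiling division

sizes = [60, 60, 60, 60]
print(unspanned_blocks(sizes, 100), spanned_blocks(sizes, 100))
```

For four 60-byte records and 100-byte blocks, unspanned blocking needs four blocks while spanned blocking needs three, at the price of records that straddle a block boundary and need two I/O operations.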
  • 43. Revisit the Big Picture • Describes the location of all files plus their attributes • Only authorized users are allowed to access particular files in particular ways • Records must be organized as a sequence of blocks for output and unblocked after input • Individual block I/O requests must be scheduled for optimizing performance • User views the file as having some structure that organizes the records; different access methods reflect different file structures
  • 44. Test Yourself 1. A file is: a) an abstract data type b) a logical storage unit c) usually non-volatile d) volatile 2. A large collection of files is called ____________ a) Fields b) Records c) Database d) Sectors 3. A unit of storage that can store one or more records in a hash file organization is denoted as a) Buckets b) Disk pages c) Blocks d) Nodes 4. How can variable-length records arise in a file? a) Storage of multiple record types in a file b) Record types that allow variable lengths for one or more fields c) Record types that allow repeating fields, such as arrays or multisets d) All of the mentioned 5. The slotted page structure is used for _________ a) Organizing records in a block b) Organizing blocks in a database c) Deleting records from a block d) None of the mentioned