Advanced Database Lecture Notes


Published in: Technology

  1. Date: 11/1/2013. Advanced Database Design Lecture Notes. Jasour Obeidat.

     Chapter 17: Physical DB Design for Relational DB (Transparencies)

     Q1. What are the sources of information for physical design?
     ANS:
       1. The logical data model.
       2. The documentation that describes the model.

     Q2. Multiple Choice Questions (MCQ)
     - The DB design phase concerned with WHAT is:
       a. Logical  b. Physical  c. Conceptual
     ANS: (a)
     - Physical DB design is concerned with:
       a. WHAT  b. HOW  c. WHO
     ANS: (b)

     Q3. Define the term Physical Design.
     ANS:
     It is the process of producing a description of the implementation of the database on secondary storage.

     Q4. What does physical design describe?
     ANS:
       1. Base relations.
       2. File organizations.
       3. Indexes used to achieve efficient access to the data.
       4. Integrity constraints.
       5. Security measures.

     Q5. What is the goal of translating the logical data model for the target DBMS?
     ANS:
     To produce a relational database schema from the logical data model that can be implemented in the target DBMS.

     Q6. Why do we need to know the functionality of the target DBMS?
     ANS:
       1. To know how to create base relations.
       2. To know whether it supports the definition of primary, alternate, secondary, and foreign keys.
       3. To know whether it supports domains.
       4. To know whether it supports general constraints.
       5. To know whether it supports integrity constraints.
       6. To know whether it supports NOT NULL.

     Q7. What are the steps involved in translating the logical data model for the target DBMS?
     ANS:
       1. Design base relations.
       2. Design a representation of the derived data.
       3. Design general constraints.

     Q8. Why do we need the "design base relations" step?
     ANS:
     To decide how to represent the base relations identified in the logical data model in the target DBMS.

     Q9. In the design base relations step, what do we have to define for each relation?
     ANS:
       1. The name of the relation.
       2. The list of simple attributes, in brackets.
       3. The primary key (PK), alternate keys (AK), and foreign keys (FK).
       4. Referential integrity constraints for each FK identified in the relation.

     Q10. In the design base relations step, what do we have to define for each attribute from the data dictionary?
     ANS:
       1. The attribute's domain (data type, length, and domain constraints).
       2. An optional default value for the attribute, and whether the attribute can hold NULL.
       3. Whether the attribute is derived and, if so, how it is computed.

     Q11. Why do we need to design a representation of derived data?
     ANS:
     To decide how to represent the derived data identified in the logical data model in the target DBMS.

     Q12. How do we design a representation for derived data?
     ANS:
       1. By examining the logical data model and the data dictionary, we produce a list of derived attributes.

     Middle East University of Jordan (MEU)
  2. Q12 (continued)
       2. A derived attribute has two options: either store it in the database, or calculate it every time it is required.
       3. The option chosen is based on:
          - The additional cost of storing the derived data in the relation and keeping it consistent with the operational data from which it is derived.
          - The cost of calculating the derived data every time it is required.
       4. The less expensive option is chosen, subject to performance constraints.

     Q13. Why do we need to design general constraints?
     ANS:
     Because some DBMSs provide more facilities than others for defining enterprise (general) constraints.

     PART TWO

     Q1. Why do we need to define file organizations and indexes?
     ANS:
       1. To determine the optimal file organization for storing the base relations.
       2. To determine the indexes needed to achieve acceptable performance, that is, the way tuples and relations are held on secondary storage.

     Q2. What are the steps involved in designing file organizations and indexes?
     ANS:
       1. Analyze transactions.
       2. Choose file organizations.
       3. Choose indexes.
       4. Estimate disk space requirements.

     Q3. How do we analyze transactions?
     ANS:
       1. Attempt to identify performance criteria, such as:
          - Transactions that run frequently and have a significant impact on performance.
          - Transactions that are critical to the organization.
          - The peak load, that is, the times at which demand on the attributes/relations of the database will be highest.
       2. Use the transaction analysis information to identify the parts of the database that may cause performance problems.
       3. Identify the high-level functionality of each transaction, such as the attributes it updates or the search criteria used in a query.
       4. Often we cannot analyze all transactions, so we investigate the most important ones by:
          - Using the transaction/relation cross-reference matrix, which shows the relations accessed by each transaction.
          - Using the transaction usage map, which shows the heavily used relations.
       5. Focus on the parts of the database that may be problematic by:
          - Mapping the paths from transactions to relations.
          - Determining the relations that are most frequently accessed by transactions.
          - Analyzing the data usage of selected transactions that involve these relations.

     Q4. Why do we need to choose file organizations?
     ANS:
     We need to choose a specific file organization in order to determine an efficient file organization for each base relation, such as:
          i. Heap.
          ii. Hash.
          iii. Indexed Sequential Access Method (ISAM).
          iv. Clusters.
          v. B+-trees.

     Q5. TRUE or FALSE question.
     ( ) Most DBMSs may not allow the designer to choose the file organization.
     ANS: TRUE.
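The transaction/relation cross-reference matrix and the search for heavily used relations described in Q3 can be sketched in a few lines of Python. This is only an illustration: the transaction names, relation names, and hourly frequencies are invented and do not come from the notes.

```python
from collections import defaultdict

# Hypothetical workload: transaction -> (runs per hour, relations it accesses).
# Every name and number here is made up for the example.
TRANSACTIONS = {
    "T1_list_staff_by_branch": (50, ["Branch", "Staff"]),
    "T2_register_client": (5, ["Client"]),
    "T3_properties_for_rent": (120, ["Branch", "PropertyForRent"]),
    "T4_update_salary": (2, ["Staff"]),
}


def cross_reference_matrix(transactions):
    """Return {relation: {transaction: bool}}: which transaction accesses which relation."""
    relations = sorted({r for _, rels in transactions.values() for r in rels})
    return {
        rel: {tx: rel in rels for tx, (_, rels) in transactions.items()}
        for rel in relations
    }


def relation_load(transactions):
    """Sum the hourly frequency of every transaction that accesses each relation."""
    load = defaultdict(int)
    for freq, rels in transactions.values():
        for rel in rels:
            load[rel] += freq
    return dict(load)


def heavily_used(transactions, threshold):
    """Relations whose total hourly access count reaches the threshold, busiest first."""
    load = relation_load(transactions)
    return sorted((r for r, n in load.items() if n >= threshold), key=lambda r: -load[r])


if __name__ == "__main__":
    for rel, row in cross_reference_matrix(TRANSACTIONS).items():
        print(rel, [tx for tx, used in row.items() if used])
    print(heavily_used(TRANSACTIONS, threshold=100))
```

The relations returned by `heavily_used` are the ones whose file organization and indexes deserve the closest attention in the later steps.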
  3. PART III

     Q1. Why do we need to choose indexes?
     ANS:
     To determine whether adding indexes to a relation will improve the performance of the database.

     Q2. What are the two approaches used in choosing indexes?
     ANS:
       1. One approach is to keep the tuples in the relation unordered and add secondary indexes as necessary.
       2. The other approach is to order the tuples in the relation by specifying a primary index or a clustering index.

     Q3. In the approach that specifies a primary or clustering index, what do we have to do?
     ANS:
     Choose the attribute for ordering or clustering the tuples as:
          - The attribute most often used for JOIN operations, so that the JOIN becomes more efficient.
          - The attribute most often used to access the tuples of the relation in order of that attribute.

     Q4. MCQ
     - If the chosen ordering attribute is the primary key of the relation, the index is called a:
       a. Primary Index  b. Clustering Index
     ANS: (a)
     - If the chosen ordering attribute is not a key of the relation, the index is a:
       a. Primary Index  b. Clustering Index
     ANS: (b)

     Q5. TRUE or FALSE question.
     ( ) Each relation can have either a primary index or a clustering index.
     ANS: TRUE

     Q6. What does a secondary index provide?
     ANS:
     A mechanism for specifying an additional key for a base relation, which makes retrieving data more efficient.

     Q7. The overhead of maintaining secondary indexes has to be balanced against the performance improvement gained when retrieving data. What does this overhead include?
     ANS:
       1. Adding an index record to every secondary index whenever a new tuple is inserted.
       2. The extra disk space needed to store each secondary index.
       3. Updating a secondary index whenever the corresponding tuple is updated.
       4. Possible performance degradation during query optimization, because the optimizer has to consider all the secondary indexes.

     Q8. What are the guidelines for drawing up an index "wish-list"?
     ANS:
       1. Do not index small relations.
       2. Index the primary key if it is not already the key of the file organization.
       3. Add a secondary index on attributes used in built-in functions.
       4. Add a secondary index on attributes involved in selection, JOIN, ORDER BY, or GROUP BY operations.
       5. Add a secondary index on a foreign key that is frequently accessed.
       6. Avoid indexing an attribute that is frequently updated.
       7. Avoid indexing an attribute whose domain consists of long character strings.
       8. Add a secondary index on attributes that could result in an index-only plan.
       9. Avoid indexing an attribute if the queries on it retrieve a significant proportion of the relation.
       10. Add a secondary index on any attribute heavily used as a secondary key.
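The effect of adding a secondary index (Q6 to Q8 above) can be observed with SQLite through Python's standard `sqlite3` module. This is a minimal sketch, not part of the original notes: the `staff` table, its columns, and the index name `idx_staff_branch` are hypothetical, and the exact wording of the query plan depends on the SQLite version.

```python
import sqlite3

# In-memory database with a hypothetical staff relation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff ("
    " staff_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " branch_no TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO staff (staff_no, name, branch_no) VALUES (?, ?, ?)",
    [(i, f"staff{i}", f"B{i % 20:03d}") for i in range(1, 2001)],
)


def plan(sql):
    """Return the optimizer's plan for a query as one string (last column holds the detail text)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


QUERY = "SELECT name FROM staff WHERE branch_no = 'B007'"

# Without a secondary index the only access path is a scan of the relation.
plan_before = plan(QUERY)

# Secondary index on a non-key attribute that appears in the search criteria.
conn.execute("CREATE INDEX idx_staff_branch ON staff (branch_no)")
plan_after = plan(QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

The price of the faster retrieval is the overhead listed in Q7: every later INSERT or UPDATE on `branch_no` must also maintain `idx_staff_branch`, and the index occupies extra space.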
  4. Chapter 22: Distributed Database & DDBMS

     Q1. What is a distributed database?
     ANS:
     A logically interrelated collection of shared data (and a description of this data) physically distributed over a computer network.

     Q2. What is a distributed DBMS?
     ANS:
     Software that permits the management of the distributed database and makes the distribution transparent to users.

     Q3. What is distributed processing?
     ANS:
     A centralized database that can be accessed over a computer network.

     Q4. What is a parallel DBMS?
     ANS:
     A DBMS running across multiple processors and disks, designed to execute operations in parallel whenever possible in order to improve performance.

     Q5. Why do we need parallel DBMSs?
     ANS:
     A system based on a single processor can no longer meet the requirements for:
          - Reliability.
          - Scalability.
          - Cost-effectiveness.
          - Performance.

     Q6. What is the idea behind a parallel DBMS?
     ANS:
     A parallel DBMS links multiple smaller machines to achieve the same throughput as a single larger machine, with greater scalability and reliability.

     Q7. What are the architectures used in parallel DBMSs?
     ANS:
          - Shared memory.
          - Shared disk.
          - Shared nothing.

     Q8. What are the advantages of a DDBMS?
     ANS:
       1. Economics.
       2. Reflects the organizational structure.
       3. Improved performance.
       4. Improved availability.
       5. Improved reliability.
       6. Improved shareability and local autonomy.
       7. Modular growth.

     Q9. What are the disadvantages of a DDBMS?
     ANS:
       1. Cost.
       2. Security.
       3. Complexity.
       4. Lack of standards.
       5. Lack of experience.
       6. Database design is more complex.
       7. Integrity control is more difficult.

     Q10. What are the types of DDBMS?
     ANS:
       1. Homogeneous DDBMS.
       2. Heterogeneous DDBMS.

     Q11. Define a homogeneous DDBMS.
     ANS:
          - All sites use the same DBMS product.
          - This approach supports incremental growth and increases performance.
          - It is much easier to design and manage.

     Q12. Define a heterogeneous DDBMS.
     ANS:
          - Sites may run different DBMS products, possibly with different data models.
          - It occurs when sites have already implemented their own databases and integration is considered later.
          - Translation is required to allow for:
            1. Different hardware.
            2. Different DBMS products.
            3. Different hardware and different DBMS products.
          - The typical solution is to use gateways.
  5. PART II: Distributed DB design

     Q1. What are the key issues in DDB design?
     ANS:
       1. Fragmentation: a relation may be divided into sub-relations (fragments), which are then distributed over the sites.
       2. Allocation: each fragment is stored at the site that gives the optimal distribution.
       3. Replication: a copy of a fragment may be maintained at several sites.

     Q2. Why do we need to fragment?
     ANS:
       1. Usage:
          - Applications work with fragments, much as they work with views, rather than with entire relations.
       2. Efficiency:
          - Data is stored close to where it is most frequently used.
          - Data that is not needed by local applications is not stored.
       3. Parallelism:
          - With fragments as the unit of distribution, a transaction can be divided into several subqueries that operate on fragments.
       4. Security:
          - Data that is not needed by local applications is not stored, and so is not available to unauthorized users.

     Q3. What are the disadvantages of fragmentation?
     ANS:
       1. Performance.
       2. Integrity.

     Q4. What are the correctness rules of fragmentation?
     ANS:
       1. Completeness.
       2. Reconstruction.
       3. Disjointness.

     Q5. Define the completeness rule formally.
     ANS:
     If relation R is decomposed into fragments R1, R2, ..., Rn, each data item (for example, each tuple) in R must appear in at least one fragment.

     Q6. Define the reconstruction rule formally.
     ANS:
     It must be possible to define a relational operation that reconstructs relation R from the fragments, where:
          - In vertical fragmentation (VF) the operation is JOIN.
          - In horizontal fragmentation (HF) the operation is UNION.

     Q7. Define the disjointness rule formally.
     ANS:
     If a data item di appears in fragment Ri, it should not appear in any other fragment. The exception is vertical fragmentation, where the primary key attributes are repeated in every fragment.
     Note that:
          - In vertical fragmentation the data item is an attribute.
          - In horizontal fragmentation the data item is a tuple.

     Q8. What are the types of fragmentation?
     ANS:
       1. Vertical fragmentation.
       2. Horizontal fragmentation.
       3. Mixed fragmentation.
       4. Derived fragmentation.

     Q9. TRUE or FALSE question.
     ( ) If a relation is small, it is recommended not to fragment it.
     ANS: TRUE
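The three correctness rules (Q4 to Q7) can be checked mechanically on a toy relation. The sketch below is an illustration only: it models a relation as a list of Python dictionaries, assumes `staff_no` is the primary key, and uses invented data and helper names.

```python
# A relation is modelled as a list of dicts; staff_no is the assumed primary key.
STAFF = [
    {"staff_no": 1, "name": "Ann", "branch": "B003", "salary": 30000},
    {"staff_no": 2, "name": "Bob", "branch": "B005", "salary": 12000},
    {"staff_no": 3, "name": "Cy", "branch": "B003", "salary": 18000},
]
KEY = "staff_no"


def horizontal(relation, predicate):
    """Horizontal fragment: the subset of tuples that satisfy the predicate."""
    return [dict(t) for t in relation if predicate(t)]


def vertical(relation, attributes):
    """Vertical fragment: a subset of attributes, always repeating the primary key."""
    return [{a: t[a] for a in (KEY, *attributes)} for t in relation]


def union(*fragments):
    """Reconstruction for HF: union of the fragments, ordered by key."""
    merged = {t[KEY]: t for frag in fragments for t in frag}
    return [merged[k] for k in sorted(merged)]


def join(left, right):
    """Reconstruction for VF: natural join of two fragments on the primary key."""
    right_by_key = {t[KEY]: t for t in right}
    return [{**t, **right_by_key[t[KEY]]} for t in left if t[KEY] in right_by_key]


def is_complete_hf(relation, fragments):
    """Completeness for HF: every tuple of R appears in at least one fragment."""
    found = {t[KEY] for frag in fragments for t in frag}
    return all(t[KEY] in found for t in relation)


def is_disjoint_hf(fragments):
    """Disjointness for HF: no tuple appears in more than one fragment."""
    keys = [t[KEY] for frag in fragments for t in frag]
    return len(keys) == len(set(keys))


def is_disjoint_vf(*attribute_lists):
    """Disjointness for VF: non-key attributes are not repeated across fragments."""
    attrs = [a for attr_list in attribute_lists for a in attr_list]
    return len(attrs) == len(set(attrs))


h1 = horizontal(STAFF, lambda t: t["branch"] == "B003")
h2 = horizontal(STAFF, lambda t: t["branch"] != "B003")
v1 = vertical(STAFF, ["name"])
v2 = vertical(STAFF, ["branch", "salary"])
```

With these fragments, `union(h1, h2)` and `join(v1, v2)` both give back the original `STAFF` relation, which is exactly what the reconstruction rule asks for.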
  6. Chapter 14: Indexing Structures for Files
     PART I: Elmasri Edition Contents

     Q1. What are the types of single-level index?
     ANS:
          - Primary index.
          - Secondary index.
          - Clustering index.

     Q2. Define the term single-level index.
     ANS:
     It is an auxiliary file that makes accessing the data file, and searching for a particular record in it, more efficient.

     Q3. Describe the basics of an index and its form.
     ANS:
       1. An index may be defined on one field of the data file.
       2. An index may be defined on several fields of the data file.
     * The general form of an index entry is:
       <Field Value, Pointer to Record>
     * Index = access path on the field.

     Q4. Why does the index file occupy fewer disk blocks than the data file?
     ANS:
     Because index entries are much smaller than data records.

     Q5. What does a binary search over the index file yield?
     ANS:
     It yields pointers to the file records.

     Q6. What are the kinds of index in terms of density?
     ANS:
          - Dense index: there is an index entry for every search key value in the data file.
          - Non-dense (sparse) index: there are index entries for only some of the search key values in the data file.

     Q7. What are the characteristics of a primary index?
     ANS:
       1. It is defined on an ordered file.
       2. The data file is ordered on a key field.
       3. It includes one index entry for each block in the data file.
       4. Each index entry holds the key field value of the first record in the block, which is called the block anchor.
       5. It is an example of a non-dense (sparse) index, because it has one entry per block of the data file (the block anchor's key) rather than one entry per record.

     Q8. What are the characteristics of a clustering index?
     ANS:
       1. It is defined on an ordered data file.
       2. The data file is ordered on a non-key field. Unlike the ordering field of a primary index, this field is not required to have a distinct value for each record in the data file.
       3. It includes one index entry for each distinct value of the field.
       4. Each index entry points to the first data block that contains a record with that value.
       5. It is an example of a non-dense index.

     Q9. What are the characteristics of a secondary index?
     ANS:
       1. A secondary index provides a secondary means of accessing a data file for which some primary access path already exists.
       2. A secondary index may be defined on a candidate key, which has a unique value for each record in the data file, or on a non-key field, which has duplicate values in the data file.
       3. Each index entry contains two fields:
          a. The first field has the same data type as the non-ordering field of the data file being indexed.
          b. The second field contains either a record pointer or a block pointer.
       4. It includes one index entry for each record in the data file, so it is an example of a dense index.
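The difference between a sparse primary index built from block anchors (Q7) and a dense index (Q6) can be sketched as follows. This is a simplified in-memory model for illustration: a "block" is a Python list, a "block pointer" is a list position, and the block size of 3 is arbitrary.

```python
from bisect import bisect_right

BLOCK_SIZE = 3  # records per block; an arbitrary value chosen for the example


def build_blocks(sorted_keys, block_size=BLOCK_SIZE):
    """Split an ordered file (here just a sorted list of key values) into blocks."""
    return [sorted_keys[i:i + block_size] for i in range(0, len(sorted_keys), block_size)]


def build_primary_index(blocks):
    """Sparse index: one <block anchor key, block number> entry per block."""
    return [(block[0], block_no) for block_no, block in enumerate(blocks)]


def build_dense_index(blocks):
    """Dense index: one <key, block number> entry per record."""
    return [(key, block_no) for block_no, block in enumerate(blocks) for key in block]


def lookup(primary_index, blocks, key):
    """Binary-search the anchors, then search inside the single candidate block.

    Returns the block number holding the key, or None if the key is absent.
    """
    anchors = [anchor for anchor, _ in primary_index]
    pos = bisect_right(anchors, key) - 1  # last anchor that is <= key
    if pos < 0:
        return None  # key is smaller than the first anchor
    block_no = primary_index[pos][1]
    return block_no if key in blocks[block_no] else None


blocks = build_blocks([2, 5, 8, 11, 14, 17, 20, 23])
primary = build_primary_index(blocks)
dense = build_dense_index(blocks)
```

Here the sparse index holds 3 entries against 8 for the dense one, which is the reason Q4 gives for the index file occupying fewer blocks than the data file.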
  7. PART II: Multi-Level Index

     Q1. What is the idea behind the multi-level index?
     ANS:
       1. Because a single-level index is itself an ordered file, we can create an index on the index. The original index is called the first level, and the index on the index is called the second level.
       2. We can repeat this process to create a third level, a fourth level, and so on, until all the entries of the top level fit in one disk block.
       3. A multi-level index can be built on any type of first-level index (primary, clustering, or secondary), as long as the first-level index consists of more than one disk block.

     Q2. MCQ
     A multi-level index is a form of:
       a. Search tree  b. B-tree  c. B+-tree
     ANS: (a)

     Q3. TRUE or FALSE question. Why?
     ( ) Insertion or deletion of index entries does not cause a problem in a multi-level index.
     ANS: FALSE
     Reason: it causes a problem because every level of the index is an ordered file.

     Q4. What is the difference between a B-tree and a B+-tree?
     ANS:
       1. In a B-tree, pointers to data records exist at all levels of the tree.
       2. In a B+-tree, pointers to data records exist only at the leaf nodes.
       3. A B+-tree can have fewer levels than the corresponding B-tree.
       4. A B+-tree can hold more search values than the corresponding B-tree.
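The "index on the index" construction from Q1 of this part can be followed numerically. The sketch below assumes a fixed fan-out (index entries per disk block) and counts the entries at each level until the top level fits in one block; the function name and the figures in the usage note are illustrative and not taken from the notes.

```python
import math


def index_levels(first_level_entries, fan_out):
    """Return the number of entries at each index level, first level first.

    Each higher level holds one entry per block of the level below it.
    Levels are added until the current top level fits in a single block.
    """
    if first_level_entries < 1 or fan_out < 2:
        raise ValueError("need at least one entry and a fan-out of at least 2")
    levels = [first_level_entries]
    blocks = math.ceil(first_level_entries / fan_out)
    while blocks > 1:
        levels.append(blocks)  # the next level indexes the blocks of this one
        blocks = math.ceil(blocks / fan_out)
    return levels


if __name__ == "__main__":
    # Illustrative figures: 30,000 first-level entries, 68 entries per block.
    print(index_levels(30000, 68))
```

With 30,000 first-level entries and a fan-out of 68, the levels hold 30,000, 442, and 7 entries, so three levels are enough. A search then reads one block per level instead of performing a binary search over the 442 blocks of the first level.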