CS 542 Database Management Systems
J Singh, March 14, 2011
Plan for today
  Putting it all together:
    Storage Hierarchy
    Secondary Storage Management
    System Catalogs
  By the second break, we will have arrived at a place where we have most of the tools to build our own database
  After the second break: Data Modeling
    This topic does not fit with the other three, but some of you will need it for your project, so I am including it
Storage Hierarchy
  A “typical” system is shown here
  Many different levels
    Main memory (2011): 1 μsec access / word, $12.50/GB
    Disk (2011, ref): 3.4 msec access / page, $1.35/GB
  Further away from primary storage:
    Cost per MB decreases
    Access speed decreases
    Storage capacity increases
  Secondary and tertiary storage must be non-volatile
  Source: Wikipedia
DBMS vs. OS File System
  The OS does disk space and buffer management already, so why not let the OS manage these tasks?
  Differences in OS support: portability issues
  Some limitations, e.g., files don’t span multiple disk devices
  Buffer management in a DBMS requires the ability to:
    pin a page in the buffer pool
    force a page to disk (important for implementing concurrency control (CC) and recovery)
    adjust the replacement policy, and pre-fetch pages based on access patterns in typical DB operations
Structure of a DBMS
  A typical DBMS has a layered architecture.
    Disk: storage hierarchy, RAID
    Disk Space Management: roles, free blocks
    Buffer Management: buffer pool, replacement policy
    Files and Access Methods: file organization (heaps, sorted files, indexes); file- and page-level storage
  [Figure: layer stack, top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB (Index Files, System Catalog, Data Files)]
  These layers must consider concurrency control and recovery
Five-minute rule (p1)
  Jim Gray, 1985, 1997
  When it comes to improving database performance:
    Pay per MB when buying RAM
      If RAM is cheaper, we can afford to buy more
    Pay for speed when buying disk
      With a faster disk, we can get away with less RAM
  The critical question: at what point does the cost of keeping data in memory balance the cost of getting it from disk?
  Source: ACM Queue
Five-minute Rule (p2)
  In 1985 context: (ref)
    Block size: 1KB
    Disk: 15 I/Os per second = $15K → 1 I/O per sec = $1K; + overhead → $2K
    Memory: $5K/MB → $5/KB, so 1KB = $5
    Trade-off
      Spend $5 for 1K in memory to save $2K in I/O cost? Sure!
      Any RAM cost up to $2K is good!
      $2K/$5 = 400 seconds
    Five-minute rule: have enough RAM to keep any data that will be used within 5 minutes
  In 1997 context: (ref)
    Re-validated and re-published
  In 2008 context: (ref)
    Still valid, but for much larger block sizes (64KB). Why?
    Disk speeds have not increased at the same rate as RAM capacity, making it more economical to bring in bigger blocks.
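The 400-second figure can be read as a break-even interval. A page that is re-read once every T seconds uses 1/T of one I/O-per-second of disk capability, so its disk cost is (cost of 1 I/O per second) / T; keeping it in RAM costs the price of one page of memory. The two are equal when T is the ratio of the two prices. A minimal sketch of that arithmetic, using only the 1985 numbers from the slide (the function name is made up for illustration):

```python
def break_even_seconds(cost_of_one_io_per_sec, ram_cost_per_page):
    """Re-reference interval at which caching a page costs the same as re-reading it.

    A page touched every T seconds consumes 1/T I/Os per second, which costs
    cost_of_one_io_per_sec / T. Setting that equal to ram_cost_per_page gives T.
    """
    return cost_of_one_io_per_sec / ram_cost_per_page


# 1985 figures from the slide: $2K per I/O-per-second, $5 per 1KB page of RAM.
t_1985 = break_even_seconds(2000.0, 5.0)
print(f"break-even interval: {t_1985:.0f} s (about {t_1985 / 60:.1f} minutes)")
```

Pages re-referenced more often than the break-even interval are cheaper to keep in memory; pages touched less often are cheaper to leave on disk.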
Extending the Hierarchy (p1)
  Flash memories (Solid-State Disks) (ref)
    Fit nicely at the half-way point between memory and disk
    2011 price: 80¢/GB
    Persistent storage
    Low power
  Implications of the 5-minute rule:
    RAM-Flash buffer size: 4KB
    Flash-Disk buffer size: 256KB
  Special uses?
    Index structures?
    Materialized views?
Extending the Hierarchy (p2)
  Principles of caching also extend into the cloud. Comparison:
    Disk (2011): 3.4 msec access, $1.35/GB
    Amazon S3 (2011): highly redundant storage constructed from cheap disks; 100-250 msec access (across the network), $0.14/GB
  Potential applications:
    How to turn a key-value store (e.g., Amazon SimpleDB) into a document store
      Use SimpleDB for its indexing capabilities
      Use S3 for storing documents
    Database backups / checkpoints into the cloud
Disks
  Secondary storage device of choice.
  Data is stored and retrieved in units called disk blocks or pages.
  Unlike RAM, time to retrieve a disk page varies depending upon location on disk.
    Therefore, relative placement of pages on disk has a major impact on DBMS performance
Components of a Disk
  [Figure labels: tracks, arm movement, arm assembly, spindle, disk head, sector, platters]
  The platters spin.
    15,000 rpm available
    7,200 or 5,400 rpm are more typical
  The arm assembly is moved in or out to position a head on a desired track. Tracks under heads make a cylinder (imaginary!).
  Only one head reads/writes at any one time.
  Block size is a multiple of sector size (which is fixed).
Accessing a Disk Page
  Time to access (read/write) a disk block:
    seek time (moving arms to position disk head on track)
    rotational delay (waiting for block to rotate under head)
    transfer time (actually moving data to/from disk surface)
  Seek time and rotational delay dominate.
    Seek time varies from about 1 to 20 msec
    Rotational delay varies from 0 to 10 msec
    Transfer rate is about 1 msec per 4KB page
  Lower I/O cost: reduce seek/rotation delays
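To see why seek time and rotational delay dominate, it may help to plug the slide's figures into the access-time sum. The sketch below takes the midpoint of each range as a "typical" value, which is an assumption and not something the slide states, and compares reading pages scattered at random with reading the same number of pages laid out contiguously (one seek and one rotational wait in total). The function names are illustrative only.

```python
# Midpoints of the ranges on the slide; treating them as typical values is an assumption.
AVG_SEEK_MS = (1 + 20) / 2         # seek time ranges from about 1 to 20 msec
AVG_ROTATION_MS = (0 + 10) / 2     # rotational delay ranges from 0 to 10 msec
TRANSFER_MS_PER_PAGE = 1.0         # about 1 msec per 4KB page


def random_read_ms(num_pages):
    """Every page pays its own seek, rotational delay, and transfer."""
    return num_pages * (AVG_SEEK_MS + AVG_ROTATION_MS + TRANSFER_MS_PER_PAGE)


def sequential_read_ms(num_pages):
    """Contiguous pages: one seek and one rotational wait, then pure transfer."""
    return AVG_SEEK_MS + AVG_ROTATION_MS + num_pages * TRANSFER_MS_PER_PAGE


pages = 100
print(f"random:     {random_read_ms(pages):.1f} ms")
print(f"sequential: {sequential_read_ms(pages):.1f} ms")
```

Under these assumptions the sequential read is more than ten times cheaper, which is the motivation for arranging a file's blocks sequentially on disk.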
Disk Access Optimizations
  Instead of responding to access requests in FIFO order, respond to them in an order that takes the disk characteristics into account
    Disk Scheduling
  Distribute data accesses among several disks and have them return data in parallel
    also known as Disk Striping
  Mirror disks
  Prefetch for sequential accesses
  Locate blocks strategically on disk to minimize seek times and rotational delay
Disk Scheduling
  The Elevator Algorithm (like an elevator in a building)
    Sort all requests by cylinder, outer to inner
    Move arm inward and return results for each request
    Sort all requests inner to outer
    Move arm outward and return results for each request
    Repeat
  [Figure labels: tracks, arm movement, arm assembly, spindle, disk head, sector, platters]
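One sweep of the elevator algorithm can be sketched as follows. This is a simplified batch version under stated assumptions: cylinders are numbered from 0 at the outer edge, all requests are known up front, and the arm starts by moving inward. The request list and starting cylinder are made-up illustration data, and `elevator_order` and `arm_travel` are hypothetical helper names.

```python
def elevator_order(requests, head, moving_inward=True):
    """Order cylinder requests for one inward sweep followed by one outward sweep.

    Assumes cylinder 0 is the outermost track, so "inward" means increasing numbers.
    """
    ahead = sorted(c for c in requests if c >= head)                   # served while moving inward
    behind = sorted((c for c in requests if c < head), reverse=True)   # served while moving outward
    return ahead + behind if moving_inward else behind + ahead


def arm_travel(order, head):
    """Total number of cylinders the arm crosses when serving requests in this order."""
    total = 0
    for cylinder in order:
        total += abs(cylinder - head)
        head = cylinder
    return total


pending = [98, 183, 37, 122, 14, 124, 65, 67]   # illustrative request queue, in arrival order
start = 53
swept = elevator_order(pending, start)
print("FIFO travel:    ", arm_travel(pending, start))
print("elevator order: ", swept)
print("elevator travel:", arm_travel(swept, start))
```

A real scheduler would keep accepting requests during the sweep; the batch form is only meant to show how sorting by cylinder cuts arm movement compared with FIFO.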
Disk Striping
  Distribute the data among several disks
    Seek time goes up, not down!
    But data transfer time can go down
  [Figure: four disks holding R1, R5, R9 / R2, R6, R10 / R3, R7, R11 / R4, R8, R12]
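The layout in the figure (R1, R5, R9 on the first disk, R2, R6, R10 on the second, and so on) looks like round-robin placement. Assuming that is the intent, the mapping from block number to disk is a single modulo operation; `disk_for_block` is a hypothetical helper name.

```python
def disk_for_block(block_number, num_disks):
    """Round-robin striping: blocks R1, R2, ... are dealt out to disks 0, 1, ... in turn."""
    return (block_number - 1) % num_disks


NUM_DISKS = 4
layout = {
    disk: [f"R{b}" for b in range(1, 13) if disk_for_block(b, NUM_DISKS) == disk]
    for disk in range(NUM_DISKS)
}
for disk, blocks in layout.items():
    print(f"disk {disk}: {blocks}")
```

Consecutive blocks land on different disks, so a sequential read of R1 through R4 can be transferred by all four disks in parallel.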
Mirror disks
  Can read from either copy, whichever is faster
  Must write to both copies
  [Figure: Copy 1, Copy 2]
Locating Blocks for Sequential Access
  ‘Next’ block concept:
    blocks on same track, followed by
    blocks on same cylinder, followed by
    blocks on adjacent cylinder
  Blocks in a file should be arranged sequentially on disk (by ‘next’), to minimize seek and rotational delay.
  For a sequential scan, pre-fetching several pages at a time is a big win
CS-542 Database Management Systems
Redundant Array of Independent Disks (RAID)
Introduction to RAID
  Arrangement of several disks that gives the abstraction of a single large disk
  Goals: increase performance and reliability
  Techniques
    Data striping
      Data is partitioned
      Definition: the size of each partition is called the striping unit
      Partitions are distributed over several disks
    Redundancy
      More disks → increased reliability
      Redundant information allows reconstruction of data if a disk fails
RAID Levels 0 and 1
  Level 0: No redundancy
    Best write performance
    Not best in reading. (Why?)
  Level 1: Mirrored (two identical copies)
    Each disk has a mirror image
    Parallel reads; a write involves two disks.
    Maximum transfer rate = transfer rate of one disk
RAID Level 4
  [Figure: data blocks 11110000, 10101010, 00111000 and parity block 01100010]
  Arrangement
    Uses n data disks and 1 parity disk
    Block is striped across the n disks. The parity disk holds the XOR of the blocks.
  Read: from the n data disks
  Write: update the data disks and also the parity disk
  Upon crash: remaining disks are used to reconstruct the data
  Problem: performance bottleneck on the parity disk
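The bit patterns on the slide can be used to check the parity scheme by hand. The sketch below computes the parity block as the XOR of the three data blocks, rebuilds a lost data block from the survivors plus parity, and shows the usual shortcut for updating parity on a small write (new parity = old parity XOR old data XOR new data). The new data value in the last step is made up for illustration, and the helper name is hypothetical.

```python
from functools import reduce


def xor_all(blocks):
    """XOR a list of equal-sized blocks (represented here as small integers)."""
    return reduce(lambda a, b: a ^ b, blocks)


data = [0b11110000, 0b10101010, 0b00111000]   # the three data blocks from the slide
parity = xor_all(data)
print(f"parity:    {parity:08b}")

# Suppose the second data disk crashes: XOR of the survivors and parity restores it.
rebuilt = xor_all([data[0], data[2], parity])
print(f"rebuilt:   {rebuilt:08b}")

# Small write to the first disk: parity can be updated without reading the other data disks.
new_block = 0b00001111                        # illustrative new contents
new_parity = parity ^ data[0] ^ new_block
print(f"new parity {new_parity:08b}")
```

Every write touches the parity disk, which is the bottleneck the slide refers to.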
RAID Level 5
  Arrangement
    1/n of the cylinders of each disk are set aside for parity
    Thus the bottleneck is distributed evenly
Disk Space Management
  Lowest layer of DBMS software manages space on disk.
  Higher levels call upon this layer to:
    allocate/de-allocate a page
    read/write a page
  Higher levels don’t need to know how this is done, or how free space is managed.
Buffer Management in a DBMS
  [Figure: page requests from higher levels arrive at the buffer pool in main memory; frames hold disk pages or are free; the DB resides on disk; choice of frame dictated by replacement policy]
  Data must be in RAM for DBMS to operate on it!
  Table of <frame#, pageid> pairs is maintained.
When a Page is Requested ...
  If requested page is not in buffer pool:
    Choose a frame for replacement
    If frame is dirty, write it to disk
    Read requested page into chosen frame
  Pin the page and return its address.
  If requests can be predicted (e.g., sequential scans), pages can be pre-fetched
More on Buffer Management
  The requestor of a page must unpin it and indicate whether the page has been modified: a dirty bit is used for this.
  A page in the pool may be requested many times, so a pin count is used. A page is a candidate for replacement iff pin count = 0.
  CC & recovery may entail additional I/O when a frame is chosen for replacement. (Write-Ahead Log protocol; more later.)
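The request, pin, unpin, and write-back steps can be put together in a few lines. This is a toy sketch, not how any particular DBMS implements its buffer manager: the "disk" is a Python dict, the class and method names are invented, and the replacement policy is a placeholder that simply takes the first unpinned frame.

```python
class BufferPool:
    """Toy buffer manager with pin counts and dirty bits (all names are illustrative)."""

    def __init__(self, num_frames, disk):
        self.num_frames = num_frames
        self.disk = disk          # page_id -> contents; stands in for the disk
        self.frames = {}          # page_id -> {"data", "pin_count", "dirty"}
        self.disk_writes = 0

    def pin(self, page_id):
        """Return the frame holding page_id, reading it from disk if necessary."""
        frame = self.frames.get(page_id)
        if frame is None:
            if len(self.frames) >= self.num_frames:
                self._evict()
            frame = {"data": self.disk[page_id], "pin_count": 0, "dirty": False}
            self.frames[page_id] = frame
        frame["pin_count"] += 1
        return frame

    def unpin(self, page_id, dirty):
        """The requestor releases the page and reports whether it modified it."""
        frame = self.frames[page_id]
        if frame["pin_count"] == 0:
            raise ValueError("page is not pinned")
        frame["pin_count"] -= 1
        frame["dirty"] = frame["dirty"] or dirty

    def _evict(self):
        """Placeholder policy: evict the first frame whose pin count is 0."""
        victim = next((p for p, f in self.frames.items() if f["pin_count"] == 0), None)
        if victim is None:
            raise RuntimeError("all frames are pinned")
        frame = self.frames.pop(victim)
        if frame["dirty"]:        # a dirty frame is written back before it is reused
            self.disk[victim] = frame["data"]
            self.disk_writes += 1


disk = {1: "A", 2: "B", 3: "C"}
pool = BufferPool(num_frames=2, disk=disk)
frame = pool.pin(1)
frame["data"] = "A-modified"
pool.unpin(1, dirty=True)
pool.pin(2)                       # page 2 stays pinned
pool.pin(3)                       # pool is full: page 1 is evicted and written back
print(disk[1], sorted(pool.frames), pool.disk_writes)
```

The sketch leaves out the write-ahead-log check that a real system performs before writing a dirty page.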
Buffer Replacement Policy
  Frame is chosen for replacement by a replacement policy:
    Least-recently-used (LRU), Clock, MRU, etc.
  Policy can have a big impact on the number of I/Os; depends on access pattern.
  Sequential flooding: nasty situation caused by LRU + repeated sequential scans.
    # buffer frames < # pages in file means each page request causes an I/O.
    MRU is much better in this situation (but not in all situations, of course).
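Sequential flooding is easy to reproduce in a small simulation. The sketch below counts page reads for repeated scans of a file that is one page larger than the buffer pool, under LRU and under MRU. The function name and the chosen sizes (3 frames, 4 pages, 5 scans) are illustrative assumptions.

```python
from collections import OrderedDict


def count_ios(num_frames, num_pages, num_scans, policy):
    """Count page reads for repeated sequential scans under 'LRU' or 'MRU' replacement."""
    pool = OrderedDict()                  # keys ordered from least to most recently used
    ios = 0
    for _ in range(num_scans):
        for page in range(num_pages):
            if page in pool:
                pool.move_to_end(page)    # hit: mark as most recently used
                continue
            ios += 1                      # miss: the page must be read from disk
            if len(pool) >= num_frames:
                # popitem(last=True) drops the most recently used page, last=False the least
                pool.popitem(last=(policy == "MRU"))
            pool[page] = None
    return ios


lru_ios = count_ios(num_frames=3, num_pages=4, num_scans=5, policy="LRU")
mru_ios = count_ios(num_frames=3, num_pages=4, num_scans=5, policy="MRU")
print("LRU reads:", lru_ios)
print("MRU reads:", mru_ios)
```

With LRU, each page is evicted just before the scan comes back to it, so every one of the 20 requests is a miss; MRU keeps most of the file resident across scans.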
Representing Addresses (p1)
  We need pointers, especially in object-oriented databases. Two kinds of addresses:
    Physical (e.g., host, driveID, cylinder, surface, sector (block), offset)
    Logical (unique ID)
  Physical addresses are very long
    8B is the minimum; up to 16B in some systems
  Example: a database that is designed to last 100 years. If the database grows to encompass 1 million machines and each machine creates 1 object every nanosecond, then we could have 2^77 objects. 10 bytes are needed to represent addresses for that many objects.
Representing Addresses (p2)
  We need a map table for flexibility.
  The level of indirection gives the flexibility.
  For example, we often move records around, either within a block or from block to block.
  What about the programs that are pointing to these records? They are going to have dangling pointers if they work with physical addresses.
  With logical addresses, we only need to update the map table!
  [Figure: map table with two columns, logical address → physical address]
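A minimal sketch of the indirection, assuming a physical address can be reduced to a (block, offset) pair and a logical address is just an integer ID. The class and its method names are hypothetical.

```python
class MapTable:
    """Maps logical addresses (record IDs) to physical addresses (block, offset)."""

    def __init__(self):
        self._physical = {}

    def insert(self, logical, block, offset):
        self._physical[logical] = (block, offset)

    def move(self, logical, block, offset):
        """Relocate a record: only the map entry changes, not the pointers held by clients."""
        self._physical[logical] = (block, offset)

    def resolve(self, logical):
        return self._physical[logical]


table = MapTable()
table.insert(logical=42, block=7, offset=128)
client_pointer = 42                      # clients hold the logical address only

table.move(logical=42, block=9, offset=0)
print(table.resolve(client_pointer))     # still resolves after the record has moved
```

The price of the flexibility is one extra lookup on every dereference.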
Pointer Swizzling (p1)
  Typical DB structure:
    Data maintained by server process, using physical or logical addresses of perhaps 8 bytes.
    Application programs are clients with their own (conventional memory) address spaces.
  When blocks and records are copied to client's memory, DB addresses must be swizzled = translated to virtual-memory addresses.
    Allows conventional pointer following.
    Especially important in OODBMS, where pointers-as-data are common.
  DBMS uses a translation table
  [Figure: translation table, DB address → memory address]
Pointer Swizzling (p2)
  DBMS uses a translation table
  Map Table vs. Translation Table
    Logical and physical addresses are both representations of the database address.
    In contrast, memory addresses in the translation table are for copies of the corresponding object in memory.
    All addressable items in the database have entries in the map table, while only those items currently in memory are mentioned in the translation table.
  [Figure: translation table with columns DB-addr and Mem-addr, database address → memory address]
Swizzling Example
  [Figure: disk and memory; Block 1 is read into memory; labels: Swizzled, Unswizzled, Block 2]
Pointer Swizzling (p3)
  Swizzling options:
    Never swizzle. Keep a translation table of DB pointers → local pointers; consult the table to follow any DB pointer.
      Problem: time to follow pointers.
    Automatic swizzling. When a block is copied to memory, replace all its DB pointers by local pointers.
      Problem: requires knowing where every pointer is (use block and record headers for schema info).
      Problem: large investment if not too many pointer-followings occur.
    Swizzle on demand. When a block is copied to memory, enter its own address and those of member records into the translation table, but do not translate pointers within the block. If we follow a pointer, translate it the first time.
      Problem: requires a bit in pointer fields for DB/local.
      Problem: extra decision at each pointer following.
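Swizzle on demand can be sketched with a pointer object that carries the DB/local bit. This is an illustration under simplifying assumptions: DB addresses are plain strings, in-memory records are dicts, and the target of every followed pointer is already in memory (a real system would fetch the block on a translation-table miss). All names are hypothetical.

```python
class Pointer:
    """A pointer field that holds either a DB address or a direct in-memory reference."""

    def __init__(self, db_addr):
        self.swizzled = False     # the DB/local bit from the slide
        self.target = db_addr     # DB address until swizzled, then the in-memory object


translation_table = {}            # DB address -> in-memory object
lookups = 0                       # how many times the translation table was consulted


def load_block(records):
    """Copy a block into memory: register its records, leave pointers untranslated."""
    translation_table.update(records)


def follow(pointer):
    """Follow a pointer, translating it the first time only."""
    global lookups
    if not pointer.swizzled:      # the extra decision paid at each pointer following
        lookups += 1
        pointer.target = translation_table[pointer.target]
        pointer.swizzled = True
    return pointer.target


load_block({"db:100": {"name": "Bud"}, "db:200": {"name": "Miller"}})
p = Pointer("db:100")
first = follow(p)
second = follow(p)                # already swizzled: no table lookup this time
print(first, lookups)
```

Before such a record is written back to disk, any swizzled pointers in it would have to be unswizzled again.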
Pinned Records
  Pinned record = some swizzled pointer points to it
  Pointers to pinned records have to be unswizzled before the pinned record is returned to disk
  We need to know where the pointers to it are
  Implementation: keep a linked list of all (swizzled) records pointing to a record.
  [Figure: swizzled pointers to record y, chained in a linked list]
Variable-Length Data
  Skipped discussion of fixed-length records; please read in the book.
  Real complexity is with variable-length records
    Varying-size data items (e.g., address)
    Repeating fields (stars-to-movie relationship)
  Sliding records
    Use an offset table in a block, pointing to current records.
    If a record grows, slide records around the block.
    Not enough space? Create an overflow block; the offset table must indicate “record moved.”
System Catalogs
  Meta information is stored in system catalogs.
  For each index:
    structure (e.g., B+ tree) and search key fields
  For each relation:
    name, file name, file structure (e.g., heap file)
    attribute name and type, for each attribute
    index name, for each index
    integrity constraints
  For each view:
    view name and definition
  Plus statistics, authorization, buffer pool size, etc.
  Catalogs are themselves stored as relations
    Example: Attribute Catalog
    Conceptually speaking… [table omitted]
Example: MySQL Information Schema
  INFORMATION_SCHEMA Tables
    The INFORMATION_SCHEMA SCHEMATA Table
    The INFORMATION_SCHEMA TABLES Table
    The INFORMATION_SCHEMA COLUMNS Table
    The INFORMATION_SCHEMA STATISTICS Table
    The INFORMATION_SCHEMA USER_PRIVILEGES Table
    The INFORMATION_SCHEMA SCHEMA_PRIVILEGES Table
    The INFORMATION_SCHEMA TABLE_PRIVILEGES Table
    The INFORMATION_SCHEMA COLUMN_PRIVILEGES Table
    The INFORMATION_SCHEMA CHARACTER_SETS Table
    … (Total 18 tables)
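The idea that catalogs are themselves relations can be tried without a MySQL server. The sketch below uses SQLite through Python's built-in sqlite3 module as a stand-in: SQLite keeps its catalog in a table called sqlite_master and exposes column metadata through PRAGMA table_info, which play a role similar to INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.COLUMNS. The Beers table and index name are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Beers (name TEXT PRIMARY KEY, manf TEXT)")
conn.execute("CREATE INDEX beers_by_manf ON Beers(manf)")

# The catalog is an ordinary relation and can be queried with ordinary SQL.
catalog = conn.execute("SELECT type, name, tbl_name FROM sqlite_master").fetchall()
for row in catalog:
    print(row)

# Column metadata: each row is (cid, name, type, notnull, default_value, pk).
columns = conn.execute("PRAGMA table_info(Beers)").fetchall()
print([(col[1], col[2]) for col in columns])
```

In MySQL the analogous query would select from INFORMATION_SCHEMA.TABLES or INFORMATION_SCHEMA.COLUMNS.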
Summary
  Disks provide cheap, non-volatile storage.
    Random access, but cost depends on location of page on disk
    Important to arrange data sequentially to minimize seek and rotation delays.
  Buffer manager brings pages into RAM.
    Page stays in RAM until released by requestor.
    Written to disk when frame chosen for replacement.
    Frame to replace based on replacement policy.
    Tries to pre-fetch several pages at a time.
More Summary
  DBMS vs. OS File Support
    DBMS needs features not found in many OSs.
      forcing a page to disk
      controlling the order of page writes to disk
      files spanning disks
      ability to control pre-fetching and page replacement policy based on predictable access patterns
  Two mapping structures help us map addresses
    Map tables take us from logical addresses to physical addresses
    Translation tables take us from physical addresses to in-memory addresses (where applicable)
    Swizzling helps keep track of where in memory
Even More Summary
  Catalog relations store information about relations, indexes and views.
    Information common to all records in collection.
CS 542 - Database Management Systems
Data Modeling
Data Modeling Techniques
  Entity-Relationship Modeling
    E/R diagrams allow us to sketch database schema designs
    Designs are pictures called entity-relationship diagrams
    Weak entity sets
      Skipping; please read in the book if interested, not on exam
    Converting E/R diagrams to relations
  Unified Modeling Language (UML)
    Skipping; please read in the book if interested, not on exam
  Object Definition Language (ODL)
    Skipping; please read in the book if interested, not on exam
Framework for E/R
  Design is a serious business.
  The “boss” (or customer) knows they want a database, but they don’t know what they want in it.
  Sketching the key components is an efficient way to develop a working database.
Entity Sets
  Entity = “thing” or object.
  Entity set = collection of similar entities.
    Similar to a class in object-oriented languages.
  Attribute = property of (the entities of) an entity set.
    Attributes are simple values, e.g. integers or character strings, not structs, sets, etc.
  In an entity-relationship diagram:
    Entity set = rectangle.
    Attribute = oval, with a line to the rectangle representing its entity set.
Example: Beer Manufacturers
  Entity set Beers has two attributes, name and manf (manufacturer).
  Each Beers entity has values for these two attributes, e.g. (Bud, Anheuser-Busch)
  [Figure: entity set Beers with attributes name and manf]
Relationships
  A relationship connects two or more entity sets.
  It is represented by a diamond, with lines to each of the entity sets involved.
  [Figure: entity sets Bars (name, addr, license), Beers (name, manf), Drinkers (name, addr); relationships Sells (Bars, Beers), Likes (Drinkers, Beers), Frequents (Drinkers, Bars)]
    Bars sell some beers.
    Drinkers like some beers.
    Drinkers frequent some bars.
    Note: license = beer, full, none
Relationship Set
  The current “value” of an entity set is the set of entities that belong to it.
    Example: the set of all bars in our database.
  The “value” of a relationship is a relationship set, a set of tuples with one component for each related entity set.
    For the relationship Sells, we might have a relationship set like: [table omitted]
Multiway Relationships
  Sometimes, we need a relationship that connects more than two entity sets.
  Suppose that drinkers will only drink certain beers at certain bars.
    Our three binary relationships Likes, Sells, and Frequents do not allow us to make this distinction.
    But a 3-way relationship would.
Example: 3-Way Relationship
  [Figure: relationship Preferences connecting Bars (name, addr, license), Beers (name, manf), and Drinkers (name, addr)]
A Typical Relationship Set
  Each row of a relationship set typically consists of foreign keys to other tables
Many-Many Relationships
  Focus: binary relationships, such as Sells between Bars and Beers.
  In a many-many relationship, an entity of either set can be connected to many entities of the other set.
    E.g., a bar sells many beers; a beer is sold by many bars.
Many-One Relationships
  Some binary relationships are many-one from one entity set to another.
    Each entity of the first set is connected to at most one entity of the second set.
    But an entity of the second set can be connected to zero, one, or many entities of the first set.
  FavBeer (Drinkers → Beers) is many-one
    A drinker has at most one favBeer
    A beer can be the favorite of any number of drinkers, including zero
One-One Relationships
  In a one-one relationship, each entity of either entity set is related to at most one entity of the other set.
  Example: relationship Best-seller between entity sets Manfs (manufacturer) and Beers.
    A beer cannot be made by more than one manufacturer
    No manufacturer can have more than one best-seller (assume no ties).
Representing “Multiplicity”
  Show a many-one relationship by an arrow entering the “one” side.
  Show a one-one relationship by arrows entering both entity sets.
  Rounded arrow = “exactly one,” i.e., each entity of the first set is related to exactly one entity of the target set.
Example: Many-One Relationship
  [Figure: Drinkers and Beers connected by two relationships, Likes and Favorite]
  Notice: two relationships connect the same entity sets, but are different.
Example: One-One Relationship
  Consider Best-seller between Manfs and Beers.
  But a beer manufacturer has to have a best-seller.
    Shown with a rounded arrow
  Some beers are not the best-seller of any manufacturer
    A rounded arrow to Manfs would be inappropriate
  [Figure: Manfs and Beers connected by Best-seller]
    A manufacturer has exactly one best-seller.
    A beer is the best-seller for 0 or 1 manufacturer.
Attributes on Relationships
  Sometimes it is useful to attach an attribute to a relationship.
  Think of this attribute as a property of tuples in the relationship set.
  [Figure: Sells between Bars and Beers, with attribute price]
  Price is a function of both the bar and the beer, not of one alone.
Subclasses in E/R Diagrams
  Subclass = fewer entities, more properties.
  Example: Ales are a kind of beer.
    Not every beer is an ale, but some are.
    In addition to all the properties (attributes and relationships) of beers, suppose ales also have the attribute color.
  Assume subclasses form a tree.
    I.e., no multiple inheritance.
  Isa triangles indicate the subclass relationship.
  [Figure: Beers (name, manf) isa Ales (color)]
E/R vs. Object-Oriented Subclasses
  In OO, objects are in one class only.
    Subclasses inherit from superclasses.
  In contrast, E/R entities have representatives in all subclasses to which they belong.
    Rule: if entity e is represented in a subclass, then e is represented in the superclass (and recursively up the tree).
  [Figure: Beers (name, manf) isa Ales (color); example entity Pete’s Ale]
Designating Keys
  Show keys by underlining the attribute
  [Figure: Beers (name, manf) isa Ales (color)]
Example: Good
  [Figure: Beers (name) connected by ManfBy to Manfs (name, addr)]
  This design gives the address of each manufacturer exactly once.
Example: Bad
  [Figure: Beers (name, manf) connected by ManfBy to Manfs (name, addr)]
  This design states the manufacturer of a beer twice: as an attribute and as a related entity.
Example: Bad
  [Figure: Beers (name, manf, manfAddr)]
  This design repeats the manufacturer’s address once for each beer and loses the address if there are temporarily no beers for a manufacturer.
Example: Good
  [Figure: Beers (name) connected by ManfBy to Manfs (name, addr)]
  Manfs deserves to be an entity set because of the nonkey attribute addr.
  Beers deserves to be an entity set because it is the “many” of the many-one relationship ManfBy.
Example: Bad
  [Figure: Beers (name) connected by ManfBy to Manfs (name)]
  Since the manufacturer is nothing but a name, and is not at the “many” end of any relationship, it should not be an entity set.
From E/R Diagrams to Relations
  Entity set → relation.
    Attributes → attributes.
  Relationships → relations whose attributes are only:
    The keys of the connected entity sets.
    Attributes of the relationship itself.
Entity Set → Relation
  Relation: Beers(name, manf)
  [Figure: entity set Beers with attributes name and manf]
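Relationship → Relation
  [Figure: Drinkers (name, addr) and Beers (name, manf); relationships Likes and Favorite between Drinkers and Beers; Buddies on Drinkers with roles 1 and 2; Married on Drinkers with roles husband and wife]
  Likes(drinker, beer)
  Favorite(drinker, beer)
  Buddies(name1, name2)
  Married(husband, wife)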
Combining Relations
  OK to combine into one relation:
    The relation for an entity set E
    The relations for many-one relationships of which E is the “many.”
  Example: Drinkers(name, addr) and Favorite(drinker, beer) combine to make Drinker1(name, addr, favBeer).
Risk with Many-Many Relationships
  Combining Drinkers with Likes would be a mistake. It leads to redundancy, as:
    name     addr         beer
    Sally    123 Maple    Bud
    Sally    123 Maple    Miller
  Redundancy: Sally's addr is repeated in every row.
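The two rules (fold a many-one relationship into the "many" side, keep a many-many relationship in its own relation) can be tried out with Python's built-in sqlite3 module. This is a sketch only: the column types, the key choices, Sally's favorite beer, and the manufacturer listed for Miller are illustrative assumptions, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Beers (name TEXT PRIMARY KEY, manf TEXT);

    -- Drinkers combined with the many-one relationship Favorite (Drinker1 on the slide).
    CREATE TABLE Drinkers (
        name    TEXT PRIMARY KEY,
        addr    TEXT,
        favBeer TEXT REFERENCES Beers(name)
    );

    -- The many-many relationship Likes stays separate: one row per (drinker, beer) pair.
    CREATE TABLE Likes (
        drinker TEXT REFERENCES Drinkers(name),
        beer    TEXT REFERENCES Beers(name),
        PRIMARY KEY (drinker, beer)
    );
""")

# Illustrative data only.
conn.executemany("INSERT INTO Beers VALUES (?, ?)",
                 [("Bud", "Anheuser-Busch"), ("Miller", "Miller Brewing")])
conn.execute("INSERT INTO Drinkers VALUES (?, ?, ?)", ("Sally", "123 Maple", "Bud"))
conn.executemany("INSERT INTO Likes VALUES (?, ?)", [("Sally", "Bud"), ("Sally", "Miller")])

addr_rows = conn.execute("SELECT addr FROM Drinkers WHERE name = 'Sally'").fetchall()
liked = conn.execute(
    "SELECT beer FROM Likes WHERE drinker = 'Sally' ORDER BY beer").fetchall()
print(addr_rows, liked)
```

Sally's address is stored once no matter how many beers she likes, which is the redundancy the combined relation would have introduced.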
Subclasses: Three Approaches
  Object-oriented: one relation per subset of subclasses, with all relevant attributes.
  Use nulls: one relation; entities have NULL in attributes that don’t belong to them.
  E/R style: one relation for each subclass:
    Key attribute(s).
    Attributes of that subclass.
Example: Subclass RelationsBeersnamemanfisaAlescolor
Object-OrientedBeersAlesGood for queries like “find the color of ales made by Pete’s”BeersnamemanfisaAlescolor
E/R StyleBeersAlesGood for queries like “find all beers, including ales, made by Pete’s”BeersnamemanfisaAlescolor
Using NULLSBeersSaves space and does everything with one tableBeersnamemanfisaAlescolor
Data Models for NoSQL Databases
  Class discussion at next meeting. How would you represent a many-to-many relationship in:
    Amazon SimpleDB?
    Cassandra?
    Google App Engine?
    MongoDB?
    Redis?
    Other?
  Inviting a 3-minute presentation (on 3/21) for 20 bonus points
    Only one presentation per DB
    Please volunteer by Tuesday noon if interested
    I will let you know by Wednesday noon if you were selected
Summary
  Data modeling is an essential part of designing an application
    Intersects business and technology
  Essential elements
    Entities
    Relationships
    Is-a relationships
    Multiplicity
  Has to be done with an eye toward the long term
    (But has to avoid analysis paralysis)
    Attributes can be added later, but entities and relationships are baked in at the beginning and are very hard to change later
    Pay particular attention to multiplicity of relationships
  Best to separate modeling from “table design”
    Needed for all databases, relational or not.
Next meetings
  March 21: Sort and Join Processing
    Sort: Chapter 15
    Join: Sections 16.1 - 16.4
CS 542 Putting it all together -- Storage Management
CS 542 Putting it all together -- Storage Management
CS 542 Putting it all together -- Storage Management

CS 542 Putting it all together -- Storage Management

  • 1.
    CS 542 DatabaseManagement SystemsJ Singh March 14, 2011
  • 2.
    Plan for todayPuttingit all together. Storage HierarchySecondary Storage ManagementSystem CatalogsBy the second break, we will have arrived at a place where we have most of the tools to build our own databaseAfter the second break,Data ModelingThis topic does not fit with the other three but some of you will need it for your project so I am including it
  • 3.
    Storage HierarchyA “typical”system is shown here Many different levelsMain Memory (2011) 1μsec access / word, $12.50/GBDisk (2011, ref)3.4 msec access / page, $1.35/GBFurther away from primary storage,Cost per MB decreasesAccess speed decreasesStorage capacity increasesSecondary- and tertiary-storage must be non-volatileSource: Wikipedia
  • 4.
    DBMS vs. OSFile System OS does disk space & buffer mgmt already So why not let OS manage these tasks?Differences in OS support: Portability issuesSome limitations, e.g., files don’t span multiple disk devices.Buffer management in DBMS requires ability to:pin a page in buffer pool, force a page to disk (important for implementing CC & recovery),adjust replacement policy, and pre-fetch pages based on access patterns in typical DB operations.
  • 5.
    Structure of aDBMSA typical DBMS has a layered architecture.Disk Storage hierarchy, RAIDDisk Space ManagementRoles, Free blocks Buffer ManagementBuffer Pool, Replacement policy Files and Access MethodsFile organizationheaps, sorted files, indexesFile and Page level storageQuery Optimizationand ExecutionRelational OperatorsFiles and Access MethodsBuffer ManagementDisk Space ManagementDBThese layersmust considerconcurrencycontrol andrecoveryIndex FilesSystem CatalogData Files
  • 6.
    Five-minute rule (p1)JimGray, 1985, 1997When it comes to improving database performance,Pay per MB when buying RAMIf RAM is cheaper, we can afford to buy morePay for speed when buying diskFaster disk, can get away with less RAMThe critical question:At what point does the cost of keeping data in memory balance the cost of getting it from disk?Source: ACM Queue
  • 7.
    Five-minute Rule (p2)In1985 context: (ref)Block size: 1KBDisk:15 I/Os per second = $15K1 I/O per sec  $1K + overhead  $2KMemory$5K /MB  $5 /KB 1KB = $5Trade-offSpend $5 for 1K in memory to save $2K in I/O cost? Sure!Any RAM cost up to $2K is good!$2K/$5 = 400 secondsFive minute rule:Have enough RAM to keep any data that will be used within 5 minutesIn 1997 context: (ref)Re-validated and re-publishedIn 2008 context: (ref)Still valid, but for much larger block sizes (64KB). Why?Disk speeds have not increased at the same rate as RAM capacity, making it more economical to bring in bigger blocks.
  • 8.
    Extending the Hierarchy(p1)Flash memories (Solid-State Disks) (ref)Fit nicely at the half-way point between memory and disk2011 price: 80¢/GBPersistent StorageLow PowerImplications of 5-minute rule:RAM-Flash buffer size 4KBFlash-Disk buffer size 256KBSpecial uses?Index Structures?Materialized Views?
  • 9.
    Extending the Hierarchy(P2)Principles of caching also extend into the cloud. Comparison:Disk (2011)3.4 msec access, $1.35/GBAmazon S3 (2011)Highly Redundant storage constructed from cheap disks100-250 msec access (across the network), $0.14/GBPotential Applications,How to turn a key-value store (e.g., Amazon SimpleDB) into a document storeUse SimpleDB for its indexing capabilities Use S3 for storing documentsDatabase backups / checkpoints into the cloud
  • 10.
    DisksSecondary storage deviceof choice. Data is stored and retrieved in units:called disk blocks or pages.Unlike RAM, time to retrieve a disk page varies depending upon location on disk. Therefore, relative placement of pages on disk has major impact on DBMS performance
  • 11.
    Components of aDisk TracksArm movementArm assemblySpindleThe platters spin.15,000 rpm available7,200 or 5,400 rpm are more typicalThe arm assembly is moved in or out to position a head on a desired track. Tracks under heads make a cylinder (imaginary!).Only one head reads/writes at any one time.Disk headSectorPlattersBlock size is a multiple of sector size (which is fixed).
  • 12.
    Accessing a DiskPageTime to access (read/write) a disk block:seek time (moving arms to position disk head on track)rotational delay (waiting for block to rotate under head)transfer time (actually moving data to/from disk surface)Seek time and rotational delay dominate.Seek time varies from about 1 to 20msecRotational delay varies from 0 to 10msecTransfer rate is about 1msec per 4KB pageLower I/O cost: reduce seek/rotation delays
  • 13.
    Disk Access OptimizationsInsteadof responding to access requests in FIFO order, respond to them in an order that takes the disk characteristics into accountDisk SchedulingDistribute data accesses among several disks and have them return data in parallelalso known as Disk StripingMirror disksPrefetch for sequential accessesLocate blocks strategically on disk to minimize seek times and rotational delay
  • 14.
    Disk SchedulingTracksArm movementArmassemblySpindleThe Elevator Algorithm (like an elevator in a building)Sort all requests by cylinder outer to inner,Move arm inward and return results for each requestSort all requests inner to outer,Move arm outward and return results for each requestRepeatDisk headSectorPlatters
  • 15.
    Disk StripingDistribute thedata among several disksSeek time goes up, not down!But data transfer time can go downR1R5R9R2R6R10R3R7R11R4R8R12
  • 16.
    Mirror disksCan readfrom either copy, whichever is fasterMust write to both copiesCopy 1Copy 2
  • 17.
    Locating Blocks forSequential access‘Next’ block concept: blocks on same track, followed byblocks on same cylinder, followed byblocks on adjacent cylinderBlocks in a file should be arranged sequentially on disk (by `next’), to minimize seek and rotational delay.For a sequential scan, pre-fetching several pages at a time is a big win
  • 18.
    CS-542 Database ManagementSystemsRedundant Array of Independent Disks (RAID)
  • 19.
    Introduction to RAIDArrangementof several disks that gives the abstraction of a single large diskGoals: Increase Performance and ReliabilityTechniquesData stripingData is partitionedDefinition: size of each partition is called the striping unitPartitions are distributed over several disksRedundancyMore disks  Increased reliabilityRedundant information allows reconstruction of data if disk fails
  • 20.
    RAID Levels 0and 1Level 0: No redundancyBest write performanceNot best in reading. (Why?)Level 1: Mirrored (two identical copies)Each disk has a mirror imageParallel reads, a write involves two disks.Maximum transfer rate = transfer rate of one disk
  • 21.
    RAID Level 411110000101010100011100001100010ArrangementUsesn data disks and 1 parity diskBlock is striped across the n disks. The parity disk holds XOR of the blocks.Read: from the n data disksWrite: update the data disks and also the parity diskUpon crash: remaining disks are used to reconstruct the dataProblem: performance bottleneck on the parity disk
  • 22.
    RAID Level 5Arrangement1/nof the cylinders of each disk are set aside for parityThus the bottleneck is distributed evenly
  • 23.
    Disk Space ManagementLowestlayer of DBMS software manages space on disk.Higher levels call upon this layer to:allocate/de-allocate a pageread/write a pageHigher levels don’t need to know how this is done, or how free space is managed.
  • 24.
    Buffer Management ina DBMSDBPage Requests from Higher LevelsBUFFER POOLdisk pagefree frameMAIN MEMORYDISKchoice of frame dictatedby replacement policyData must be in RAM for DBMS to operate on it!Table of <frame#, pageid> pairs is maintained.
  • 25.
    When a Pageis Requested ...If requested page is not in buffer pool:Choose a frame for replacementIf frame is dirty, write it to diskRead requested page into chosen framePin the page and return its address. If requests can be predicted (e.g., sequential scans), pages can be pre-fetched
  • 26.
    More on BufferManagementRequestor of page must unpin it, and indicate whether page has been modified: dirty bit is used for this.Page in pool may be requested many times, a pin count is used. A page is a candidate for replacement iffpin count = 0.CC & recovery may entail additional I/O when a frame is chosen for replacement. (Write-Ahead Log protocol; more later.)
  • 27.
    Buffer Replacement PolicyFrameis chosen for replacement by a replacement policy:Least-recently-used (LRU), Clock, MRU etc.Policy can have big impact on # of I/O’s; depends on access pattern.Sequential flooding: Nasty situation caused by LRU + repeated sequential scans.# buffer frames < # pages in file means each page request causes an I/O. MRU much better in this situation (but not in all situations, of course).
  • 28.
    Representing addresses (p1)Weneed pointers especially in object oriented databases. Two kind of addresses:Physical (e.g. host, driveID, cylinder, surface, sector (block), offset)Logical (unique ID). Physical addresses are very long8B is the minimum – up to 16B in some systemsExample: A database that is designed to last 100 years. If the database grows to encompass 1 million machines and each machine creates 1 object each nanoseconds then we could have 277 objects.10 bytes are needed to represent addresses for that many objects.
  • 29.
    Representing Addresses (p2)
    We need a map table for flexibility; the level of indirection is what provides it.
    For example, we often move records around, either within a block or from block to block.
    What about the programs that point to these records? If they work with physical addresses, they are left with dangling pointers.
    With logical addresses, we only need to update the map table!
    [Figure: map table with a logical-address column and a physical-address column.]
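A minimal sketch of the idea, assuming a physical address is just a (block, offset) pair and a logical address is a small integer. The class `MapTable` and its method names are hypothetical.

```python
class MapTable:
    """Toy map table: logical ID -> physical (block, offset)."""

    def __init__(self):
        self._next_id = 0
        self._table = {}

    def insert(self, block, offset):
        """Register a new record and hand back its logical address."""
        logical = self._next_id
        self._next_id += 1
        self._table[logical] = (block, offset)
        return logical

    def move(self, logical, block, offset):
        """The record moved: only the map table entry changes."""
        self._table[logical] = (block, offset)

    def resolve(self, logical):
        """Translate a logical address to the current physical address."""
        return self._table[logical]

table = MapTable()
record_id = table.insert(block=7, offset=0)
table.move(record_id, block=9, offset=128)   # holders of record_id are unaffected
```

Every program that stored `record_id` still reaches the record after the move, at the cost of one extra lookup per access.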
  • 30.
    Pointer Swizzling (p1)
    Typical DB structure:
      Data is maintained by a server process, using physical or logical addresses of perhaps 8 bytes.
      Application programs are clients with their own (conventional memory) address spaces.
    When blocks and records are copied to a client's memory, DB addresses must be swizzled, i.e., translated to virtual-memory addresses.
      This allows conventional pointer following.
      It is especially important in an OODBMS, where pointers-as-data are common.
    The DBMS uses a translation table from DB address to memory address.
  • 31.
    Pointer Swizzling (p2)
    The DBMS uses a translation table.
    Map table vs. translation table:
      Logical and physical addresses are both representations of the database address.
      In contrast, memory addresses in the translation table are for copies of the corresponding object in memory.
      All addressable items in the database have entries in the map table, while only those items currently in memory are mentioned in the translation table.
    [Figure: translation table with a database-address (DBaddr) column and a memory-address (Mem-addr) column.]
  • 32.
  • 33.
    Pointer Swizzling (p3)
    Swizzling options:
    Never swizzle. Keep a translation table of DB pointers → local pointers; consult the table to follow any DB pointer.
      Problem: time to follow pointers.
    Automatic swizzling. When a block is copied to memory, replace all its DB pointers by local pointers.
      Problem: requires knowing where every pointer is (use block and record headers for schema info).
      Problem: large investment if not too many pointer-followings occur.
    Swizzle on demand. When a block is copied to memory, enter its own address and those of member records into the translation table, but do not translate pointers within the block. If we follow a pointer, translate it the first time.
      Problem: requires a bit in pointer fields for DB/local.
      Problem: extra decision at each pointer following.
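Swizzle on demand can be sketched as follows. The classes `Pointer` and `Client`, and the dictionary standing in for the database server, are assumptions for illustration; the `swizzled` flag plays the role of the DB/local bit in the pointer field.

```python
class Pointer:
    """A pointer field; `swizzled` is the DB/local bit."""

    def __init__(self, db_addr):
        self.swizzled = False
        self.target = db_addr    # a DB address until swizzled, then a memory reference

class Client:
    """Toy client that swizzles pointers the first time they are followed."""

    def __init__(self, database):
        self.database = database   # dict db_addr -> record; stands in for the server
        self.translation = {}      # db_addr -> in-memory copy (loaded items only)
        self.lookups = 0

    def load(self, db_addr):
        """Copy a record into client memory and note it in the translation table."""
        if db_addr not in self.translation:
            self.translation[db_addr] = dict(self.database[db_addr])
        return self.translation[db_addr]

    def follow(self, ptr):
        if not ptr.swizzled:       # the extra decision made on every follow
            self.lookups += 1
            ptr.target = self.load(ptr.target)
            ptr.swizzled = True
        return ptr.target

client = Client({"r1": {"name": "Bud"}})
ptr = Pointer("r1")
first = client.follow(ptr)     # translates and swizzles
second = client.follow(ptr)    # plain memory reference from now on
```

Only the first `follow` pays for a translation-table lookup; later follows return the in-memory copy directly.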
  • 34.
    Pinned Records
    Pinned record = some swizzled pointer points to it.
    Pointers to a pinned record have to be unswizzled before the pinned record is returned to disk.
    We need to know where the pointers to it are.
    Implementation: keep a linked list of all (swizzled) records pointing to a record.
    [Figure: swizzled pointers among records x and y.]
  • 35.
    Variable-Length Data
    We skipped the discussion of fixed-length records; please read it in the book.
    The real complexity is with variable-length records:
      Varying-size data items (e.g., address)
      Repeating fields (stars-to-movie relationship)
    Sliding records
      Use an offset table in a block, pointing to the current records.
      If a record grows, slide records around the block.
      Not enough space? Create an overflow block; the offset table must indicate “record moved.”
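The sliding-records scheme can be sketched with a byte array and an offset table. This is a simplified illustration under stated assumptions: the class `Block`, its fields, and the way the "record moved" marker is stored (a `None` offset plus an entry in `moved`) are invented for the example, and updating a record that has already moved is not handled.

```python
class Block:
    """Toy block: records packed in a byte array, located via an offset table."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = bytearray()
        self.offsets = []    # slot -> (offset, length), or None if the record moved
        self.moved = {}      # slot -> (overflow block, slot in that block)

    def insert(self, record):
        if len(self.data) + len(record) > self.capacity:
            raise ValueError("block full")
        self.offsets.append((len(self.data), len(record)))
        self.data += record
        return len(self.offsets) - 1

    def read(self, slot):
        if slot in self.moved:                    # follow the "record moved" marker
            block, other_slot = self.moved[slot]
            return block.read(other_slot)
        offset, length = self.offsets[slot]
        return bytes(self.data[offset:offset + length])

    def update(self, slot, record, overflow):
        offset, length = self.offsets[slot]
        fits = len(self.data) - length + len(record) <= self.capacity
        replacement = record if fits else b""
        # Splice the bytes in place; every later record slides by `shift`.
        self.data[offset:offset + length] = replacement
        shift = len(replacement) - length
        for i in range(slot + 1, len(self.offsets)):
            if self.offsets[i] is not None:
                later_offset, later_length = self.offsets[i]
                self.offsets[i] = (later_offset + shift, later_length)
        if fits:
            self.offsets[slot] = (offset, len(record))
        else:
            # Not enough space: store the record in the overflow block.
            self.offsets[slot] = None
            self.moved[slot] = (overflow, overflow.insert(record))
```

Because callers refer to records by slot number rather than byte offset, sliding a record or moving it to an overflow block does not invalidate their references.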
  • 36.
    System Catalogs
    Meta information is stored in system catalogs.
    For each index: structure (e.g., B+ tree) and search key fields
    For each relation:
      name, file name, file structure (e.g., heap file)
      attribute name and type, for each attribute
      index name, for each index
      integrity constraints
    For each view: view name and definition
    Plus statistics, authorization, buffer pool size, etc.
    Catalogs are themselves stored as relations.
    Example: Attribute Catalog (conceptually speaking…)
  • 37.
    Example: MySQL Information Schema
    INFORMATION_SCHEMA tables:
      The INFORMATION_SCHEMA SCHEMATA table
      The INFORMATION_SCHEMA TABLES table
      The INFORMATION_SCHEMA COLUMNS table
      The INFORMATION_SCHEMA STATISTICS table
      The INFORMATION_SCHEMA USER_PRIVILEGES table
      The INFORMATION_SCHEMA SCHEMA_PRIVILEGES table
      The INFORMATION_SCHEMA TABLE_PRIVILEGES table
      The INFORMATION_SCHEMA COLUMN_PRIVILEGES table
      The INFORMATION_SCHEMA CHARACTER_SETS table
      … (Total 18 tables)
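The point that catalogs are themselves relations can be tried without a MySQL server. The sketch below swaps in SQLite (via Python's standard `sqlite3` module) as a stand-in: SQLite keeps its catalog in a table named `sqlite_master` rather than in INFORMATION_SCHEMA, so the table and column names here are SQLite's, not MySQL's. The `Beers` table and `BeersByManf` index are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Beers (name TEXT, manf TEXT)")
conn.execute("CREATE INDEX BeersByManf ON Beers (manf)")

# The catalog is an ordinary relation, so ordinary SQL queries it.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

# Attribute-level catalog data for one relation: (attribute name, type).
attributes = [(row[1], row[2])
              for row in conn.execute("PRAGMA table_info(Beers)")]
```

Here `catalog` holds one row for the relation and one for its index, and `attributes` plays the role of the attribute catalog for `Beers`.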
  • 38.
    Summary
    Disks provide cheap, non-volatile storage.
      Random access, but cost depends on the location of the page on disk.
      It is important to arrange data sequentially to minimize seek and rotation delays.
    The buffer manager brings pages into RAM.
      A page stays in RAM until released by the requestor.
      It is written to disk when its frame is chosen for replacement.
      The frame to replace is chosen by the replacement policy.
      The buffer manager tries to pre-fetch several pages at a time.
  • 39.
    More Summary
    DBMS vs. OS file support: a DBMS needs features not found in many OSs:
      forcing a page to disk
      controlling the order of page writes to disk
      files spanning disks
      ability to control pre-fetching and page replacement policy based on predictable access patterns
    Two mapping structures help us map addresses:
      Map tables take us from logical addresses to physical addresses.
      Translation tables take us from physical addresses to in-memory addresses (where applicable).
    Swizzling helps keep track of where in memory
  • 40.
    Even More Summary
    Catalog relations store information about relations, indexes and views: information common to all records in a collection.
  • 41.
    CS 542 – Database Management Systems
    Data Modeling
  • 42.
    Data Modeling Techniques
    Entity-Relationship Modeling
      E/R diagrams allow us to sketch database schema designs.
      Designs are pictures called entity-relationship diagrams.
      Weak entity sets: skipping; please read in the book if interested, not on the exam.
      Converting E/R diagrams to relations
    Unified Modeling Language (UML): skipping; please read in the book if interested, not on the exam.
    Object Definition Language (ODL): skipping; please read in the book if interested, not on the exam.
  • 43.
    Framework for E/R
    Design is a serious business.
    The “boss” (or customer) knows they want a database, but they don’t know what they want in it.
    Sketching the key components is an efficient way to develop a working database.
  • 44.
    Entity Sets
    Entity = “thing” or object.
    Entity set = collection of similar entities.
      Similar to a class in object-oriented languages.
    Attribute = property of (the entities of) an entity set.
      Attributes are simple values, e.g., integers or character strings, not structs, sets, etc.
    In an entity-relationship diagram:
      Entity set = rectangle.
      Attribute = oval, with a line to the rectangle representing its entity set.
  • 45.
    Example: Beer Manufacturers
    Entity set Beers has two attributes, name and manf (manufacturer).
    Each Beers entity has values for these two attributes, e.g., (Bud, Anheuser-Busch).
    [Figure: entity set Beers with attributes name and manf.]
  • 46.
    Relationships
    A relationship connects two or more entity sets.
    It is represented by a diamond, with lines to each of the entity sets involved.
    [Figure: E/R diagram with entity sets Beers (name, manf), Bars (name, addr, license) and Drinkers (name, addr), connected by the relationships Sells (bars sell some beers), Likes (drinkers like some beers) and Frequents (drinkers frequent some bars). Note: license = beer, full, none.]
  • 47.
    Relationship Set
    The current “value” of an entity set is the set of entities that belong to it.
      Example: the set of all bars in our database.
    The “value” of a relationship is a relationship set, a set of tuples with one component for each related entity set.
    For the relationship Sells, we might have a relationship set like:
    [Table: (bar, beer) pairs.]
    Multiway Relationships
    Sometimes, we need a relationship that connects more than two entity sets.
    Suppose that drinkers will only drink certain beers at certain bars.
      Our three binary relationships Likes, Sells, and Frequents do not allow us to make this distinction.
      But a 3-way relationship would.
  • 48.
  • 49.
    A Typical Relationship Set
    Each row of a relationship set typically consists of foreign keys to other tables.
  • 50.
    Many-Many Relationships
    Focus: binary relationships, such as Sells between Bars and Beers.
    In a many-many relationship, an entity of either set can be connected to many entities of the other set.
      E.g., a bar sells many beers; a beer is sold by many bars.
  • 51.
    Many-One Relationships
    Some binary relationships are many-one from one entity set to another.
      Each entity of the first set is connected to at most one entity of the second set.
      But an entity of the second set can be connected to zero, one, or many entities of the first set.
    Example: FavBeer (Drinkers → Beers) is many-one.
      A drinker has at most one favBeer.
      A beer can be the favorite of any number of drinkers, including zero.
  • 52.
    One-One Relationships
    In a one-one relationship, each entity of either entity set is related to at most one entity of the other set.
    Example: relationship Best-seller between entity sets Manfs (manufacturer) and Beers.
      A beer cannot be made by more than one manufacturer.
      No manufacturer can have more than one best-seller (assume no ties).
  • 53.
    Representing “Multiplicity”
    Show a many-one relationship by an arrow entering the “one” side.
    Show a one-one relationship by arrows entering both entity sets.
    Rounded arrow = “exactly one,” i.e., each entity of the first set is related to exactly one entity of the target set.
  • 54.
    Example: Many-One Relationship
    [Figure: Drinkers and Beers connected by two relationships, Likes and Favorite.]
    Notice: two relationships connect the same entity sets, but are different.
  • 55.
    Example: One-One Relationship
    Consider Best-seller between Manfs and Beers.
    Some beers are not the best-seller of any manufacturer, so a rounded arrow to Manfs would be inappropriate.
    But a beer manufacturer has to have a best-seller; this is shown with a rounded arrow.
    [Figure: Manfs and Beers connected by Best-seller. A manufacturer has exactly one best-seller. A beer is the best-seller for 0 or 1 manufacturer.]
  • 56.
    Attributes on Relationships
    Sometimes it is useful to attach an attribute to a relationship.
    Think of this attribute as a property of tuples in the relationship set.
    [Figure: Bars and Beers connected by Sells, which carries the attribute price.]
    Price is a function of both the bar and the beer, not of one alone.
  • 57.
    Subclasses in E/R Diagrams
    Subclass = fewer entities, more properties.
    Example: ales are a kind of beer.
      Not every beer is an ale, but some are.
      In addition to all the properties (attributes and relationships) of beers, suppose ales also have the attribute color.
    Assume subclasses form a tree, i.e., no multiple inheritance.
    Isa triangles indicate the subclass relationship.
    [Figure: Beers (name, manf) with an isa triangle to subclass Ales (color).]
  • 58.
    E/R vs. Object-Oriented Subclasses
    In OO, objects are in one class only.
      Subclasses inherit from superclasses.
    In contrast, E/R entities have representatives in all subclasses to which they belong.
      Rule: if entity e is represented in a subclass, then e is represented in the superclass (and recursively up the tree).
    [Figure: Beers (name, manf) with an isa triangle to subclass Ales (color), with the example entity Pete’s Ale.]
  • 59.
    Designating Keys
    Show keys by underlining the attribute.
    [Figure: Beers (name, manf) with an isa triangle to subclass Ales (color).]
  • 60.
    Example: Good
    [Figure: Beers (name) connected by ManfBy to Manfs (name, addr).]
    This design gives the address of each manufacturer exactly once.
  • 61.
    Example: Bad
    [Figure: Beers (name, manf) connected by ManfBy to Manfs (name, addr).]
    This design states the manufacturer of a beer twice: as an attribute and as a related entity.
  • 62.
    Example: Bad
    [Figure: Beers (name, manf, manfAddr).]
    This design repeats the manufacturer’s address once for each beer and loses the address if there are temporarily no beers for a manufacturer.
  • 63.
    Example: Good
    [Figure: Beers (name) connected by ManfBy to Manfs (name, addr).]
    Manfs deserves to be an entity set because of the nonkey attribute addr.
    Beers deserves to be an entity set because it is the “many” of the many-one relationship ManfBy.
  • 64.
    Example: Bad
    [Figure: Beers (name) connected by ManfBy to Manfs (name).]
    Since the manufacturer is nothing but a name, and is not at the “many” end of any relationship, it should not be an entity set.
    From E/R Diagrams to Relations
    Entity set → relation.
      Attributes → attributes.
    Relationships → relations whose attributes are only:
      The keys of the connected entity sets.
      Attributes of the relationship itself.
  • 65.
    Entity Set → Relation
    [Figure: entity set Beers with attributes name and manf.]
    Relation: Beers(name, manf)
  • 66.
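    Relationship → Relation
    [Figure: Drinkers (name, addr) and Beers (name, manf), with the relationships Likes and Favorite between Drinkers and Beers, and Buddies (roles 1 and 2) and Married (roles husband and wife) on Drinkers.]
    Likes(drinker, beer)
    Favorite(drinker, beer)
    Buddies(name1, name2)
    Married(husband, wife)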
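As a concrete sketch of this conversion, the entity sets Drinkers and Beers and the relationship Likes can be declared as three tables. SQLite (through Python's standard `sqlite3` module) is used here only as a convenient engine; the column types, the key choices, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only on request
conn.executescript("""
    -- Entity sets become relations with the same attributes.
    CREATE TABLE Drinkers (name TEXT PRIMARY KEY, addr TEXT);
    CREATE TABLE Beers    (name TEXT PRIMARY KEY, manf TEXT);
    -- A relationship becomes a relation holding the keys of the connected sets.
    CREATE TABLE Likes (
        drinker TEXT REFERENCES Drinkers(name),
        beer    TEXT REFERENCES Beers(name),
        PRIMARY KEY (drinker, beer)
    );
""")
conn.execute("INSERT INTO Drinkers VALUES ('Sally', '123 Maple')")
conn.executemany("INSERT INTO Beers VALUES (?, ?)",
                 [("Bud", "Anheuser-Busch"), ("Miller", "Miller Brewing")])
conn.executemany("INSERT INTO Likes VALUES (?, ?)",
                 [("Sally", "Bud"), ("Sally", "Miller")])

liked = conn.execute(
    "SELECT beer FROM Likes WHERE drinker = ? ORDER BY beer", ("Sally",)
).fetchall()
```

Each row of `Likes` consists solely of foreign keys, so the database rejects a pair that names a drinker or beer that does not exist.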
  • 67.
    Combining Relations
    It is OK to combine into one relation:
      The relation for an entity set E
      The relations for many-one relationships of which E is the “many.”
    Example: Drinkers(name, addr) and Favorite(drinker, beer) combine to make Drinker1(name, addr, favBeer).
  • 68.
    Risk with Many-Many Relationships
    Combining Drinkers with Likes would be a mistake. It leads to redundancy, as:
      name    addr         beer
      Sally   123 Maple    Bud
      Sally   123 Maple    Miller
    Redundancy: the addr value is repeated in every row for the same drinker.
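The difference between the safe and the risky combination shows up with only a few rows. In this sketch the relations are plain Python structures, and the choice of Bud as Sally's favorite is an assumption made for the example.

```python
drinkers = {"Sally": "123 Maple"}                  # Drinkers(name, addr)
favorite = {"Sally": "Bud"}                        # many-one: at most one beer per drinker
likes = [("Sally", "Bud"), ("Sally", "Miller")]    # many-many

# Safe: folding a many-one relationship into Drinkers keeps one row per drinker.
drinker1 = [(name, addr, favorite.get(name)) for name, addr in drinkers.items()]

# Risky: folding in a many-many relationship repeats addr once per liked beer.
drinkers_likes = [(name, drinkers[name], beer) for name, beer in likes]
addr_copies = sum(1 for row in drinkers_likes if row[0] == "Sally")   # 2
```

`drinker1` stores Sally's address once, while `drinkers_likes` stores it twice, and the count grows with every additional beer she likes.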
  • 69.
    Subclasses: Three Approaches
    Object-oriented: one relation per subset of subclasses, with all relevant attributes.
    Use nulls: one relation; entities have NULL in attributes that don’t belong to them.
    E/R style: one relation for each subclass, containing:
      Key attribute(s).
      Attributes of that subclass.
  • 70.
  • 71.
    Object-Oriented
    [Figure: relations Beers and Ales, derived from Beers (name, manf) with isa subclass Ales (color).]
    Good for queries like “find the color of ales made by Pete’s.”
  • 72.
    E/R Style
    [Figure: relations Beers and Ales, derived from Beers (name, manf) with isa subclass Ales (color).]
    Good for queries like “find all beers, including ales, made by Pete’s.”
  • 73.
    Using NULLs
    [Figure: single relation Beers, derived from Beers (name, manf) with isa subclass Ales (color).]
    Saves space and does everything with one table.
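The three layouts can be compared side by side on the same two entities. This is an illustrative sketch: the function names and the sample rows are assumptions, and an entity counts as an ale exactly when it carries a `color` attribute.

```python
entities = [
    {"name": "Bud", "manf": "Anheuser-Busch"},                   # a beer that is not an ale
    {"name": "Summerbrew", "manf": "Pete's", "color": "dark"},   # an ale
]

def object_oriented(entities):
    """One relation per subset of subclasses; each entity appears in exactly one."""
    beers = [(e["name"], e["manf"]) for e in entities if "color" not in e]
    ales = [(e["name"], e["manf"], e["color"]) for e in entities if "color" in e]
    return {"Beers": beers, "Ales": ales}

def er_style(entities):
    """One relation per subclass: the key plus that subclass's own attributes."""
    beers = [(e["name"], e["manf"]) for e in entities]
    ales = [(e["name"], e["color"]) for e in entities if "color" in e]
    return {"Beers": beers, "Ales": ales}

def use_nulls(entities):
    """A single relation; None (NULL) where an attribute does not apply."""
    return {"Beers": [(e["name"], e["manf"], e.get("color")) for e in entities]}
```

In the object-oriented layout the color and manufacturer of an ale sit in one row, so "color of ales made by Pete's" needs no join. In the E/R layout every beer, ale or not, appears in `Beers`, so "all beers made by Pete's" touches a single relation.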
  • 74.
    Data Models for NoSQL Databases
    Class discussion at the next meeting: how would you represent a many-to-many relationship in
      Amazon SimpleDB?
      Cassandra?
      Google App Engine?
      MongoDB?
      Redis?
      Other?
    Inviting a 3-minute presentation (on 3/21) for 20 bonus points.
      Only one presentation per DB.
      Please volunteer by Tuesday noon if interested.
      I will let you know by Wednesday noon if you were selected.
  • 75.
    Summary
    Data modeling is an essential part of designing an application.
      It intersects business and technology.
    Essential elements:
      Entities
      Relationships
      Is-a relationships
      Multiplicity
    It has to be done with an eye toward the long term (but has to avoid analysis paralysis).
      Attributes can be added later, but entities and relationships are baked in at the beginning and very hard to change later.
      Pay particular attention to the multiplicity of relationships.
    It is best to separate modeling from “table design.”
      Needed for all databases, relational or not.
  • 76.
    Next Meetings
    March 21: Sort and Join Processing
      Sort: Chapter 15
      Join: Sections 16.1 – 16.4