CS 542 Controlling Database Integrity and Performance

  1. 1. CS 542 Database Management Systems<br />Controlling Database Integrity and Performance<br />J Singh <br />January 31, 2011<br />
  2. 2. Today’s Topics<br />Database Integrity<br />Primary Key Constraints – Prevent Duplicates<br />Foreign Key Constraints – Prevent Dangling References<br />Attribute Constraints – Prevent Inconsistent Attribute Values<br />Tuple Constraints – More vigilant checking of attribute values<br />Assertions – Paranoid integrity checking<br />Views<br />Performance Topics<br />Indexes<br />Discussion of presentation topic proposals<br />
  3. 3. Primary Key Constraints<br />What are Primary Keys good for?<br />Uniquely identify the subject of each tuple<br />Ensure that there are no duplicates<br />Cannot be null – that would imply a NULL subject.<br />A table may not have more than one primary key<br />A Primary Key may consist of one or more columns<br />Multiple Unique keys are OK<br />For Table R, <P1, P2, …, Pm> together constitute a primary key if for each tuple in R,<br /><P1, P2, …, Pm> are unique<br />P1, P2, …, Pm are non-null<br /><U1, U2, …, Um> together constitute a unique key if for each tuple in R,<br /><U1, U2, …, Um> are unique<br />But U1, U2, …, Um can be null<br />
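The difference between a primary key and a unique key can be checked with a small experiment. The sketch below is only illustrative: it assumes an in-memory SQLite database driven from Python's `sqlite3` module, and the table layout, the `Code2` column and the sample rows are made up for the demonstration. One caveat is SQLite-specific: for historical reasons it does not reject NULL in a non-integer primary key unless `NOT NULL` is spelled out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Country (
        Code  TEXT PRIMARY KEY NOT NULL,  -- SQLite needs the explicit NOT NULL
        Code2 TEXT UNIQUE                 -- unique key: NULLs are allowed
    )""")

def try_insert(code, code2):
    """Return 'ok', or the integrity error raised by the insert."""
    try:
        conn.execute("INSERT INTO Country VALUES (?, ?)", (code, code2))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

results = [
    try_insert("FIN", "FI"),   # first row
    try_insert("FIN", "XX"),   # duplicate primary key
    try_insert(None, "YY"),    # NULL primary key
    try_insert("SWE", "FI"),   # duplicate unique key
    try_insert("NOR", None),   # NULL in the unique key
    try_insert("DNK", None),   # a second NULL in the unique key
]
for line in results:
    print(line)
```

Under these assumptions only the first and the last two inserts succeed: NULLs never collide with each other in a unique key.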
  4. 4. Foreign Key Constraints (p1)<br />Main Idea: Prevent Dangling Tuples<br />Foreign Key<br />Key Reference<br />Foreign Key<br />Must point to a Key Reference<br />CREATE TABLE City (<br /> ::<br />CountryCode char(3)<br /> REFERENCES Country(Code)<br />)<br />Key Reference<br />Must be unique or primary key<br />Try: INSERT INTO city<br />(Name, CountryCode) VALUES ('xyzzy', 'XYZ');<br />Try: UPDATE city<br />SET CountryCode='XYZ' WHERE CountryCode='FIN';<br />Key reference must already exist before a referencing tuple can be added<br />
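The two "Try" statements can be run against a toy schema to see the rejection. This is a minimal sketch, assuming SQLite through Python's `sqlite3` module rather than the database used in class; the two-column tables and the Helsinki row are invented for the example. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE Country (Code CHAR(3) PRIMARY KEY, Name TEXT);
    CREATE TABLE City (
        Name        TEXT,
        CountryCode CHAR(3) REFERENCES Country(Code)
    );
    INSERT INTO Country VALUES ('FIN', 'Finland');
    INSERT INTO City VALUES ('Helsinki', 'FIN');
""")

def attempt(sql):
    """Run one statement; report whether the foreign key check let it through."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# 'XYZ' is not a key in Country, so both statements would create a dangling tuple.
insert_result = attempt(
    "INSERT INTO City (Name, CountryCode) VALUES ('xyzzy', 'XYZ')")
update_result = attempt(
    "UPDATE City SET CountryCode = 'XYZ' WHERE CountryCode = 'FIN'")
city_count = conn.execute("SELECT COUNT(*) FROM City").fetchone()[0]
print(insert_result, update_result, city_count, sep="\n")
```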
  5. 5. Foreign Key Constraints (p2)<br />Alternative methods of defining a foreign key<br />CREATE TABLE City (<br />CountryCode char(3) REFERENCES COUNTRY(Code), …)<br />CREATE TABLE City (<br />CountryCode char(3), …,<br /> [CONSTRAINT [ctyREFcntry]] FOREIGN KEY (CountryCode)<br /> REFERENCES COUNTRY(Code))<br />CREATE TABLE City (<br />CountryCode char(3), …)<br /> Then, later,<br /> ALTER TABLE City ADD [CONSTRAINT [ctyREFcntry]]<br /> FOREIGN KEY (CountryCode) REFERENCES COUNTRY(Code);<br />Notation: [] signifies optional<br />
  6. 6. Foreign Key Constraints (p3)<br />Foreign Key<br />Key Reference<br />Referential Integrity Options<br />Restrict (default)<br />Reject request<br />Cascade<br />Reflect changes back<br />Set Null<br />Set the foreign key to NULL<br />Changes to Key References<br />Try: DELETE FROM country<br /> WHERE code='FIN';<br />Try: UPDATE country<br /> SET Code='XYZ' <br /> WHERE Code='FIN';<br />
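The three options can be compared side by side by deleting a referenced country under each one. The following is a sketch under assumptions: SQLite via Python's `sqlite3`, a stripped-down schema, and a hypothetical helper `delete_parent` that rebuilds the tables with the requested `ON DELETE` action.

```python
import sqlite3

def delete_parent(action):
    """Delete country 'FIN' while a city references it under the given ON DELETE action."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
        CREATE TABLE Country (Code CHAR(3) PRIMARY KEY);
        CREATE TABLE City (
            Name        TEXT,
            CountryCode CHAR(3) REFERENCES Country(Code) ON DELETE {action}
        );
        INSERT INTO Country VALUES ('FIN');
        INSERT INTO City VALUES ('Helsinki', 'FIN');
    """)
    try:
        conn.execute("DELETE FROM Country WHERE Code = 'FIN'")
    except sqlite3.IntegrityError:
        return "rejected"
    # The delete went through: show what happened to the referencing tuple.
    return conn.execute("SELECT Name, CountryCode FROM City").fetchall()

outcomes = {a: delete_parent(a) for a in ("RESTRICT", "CASCADE", "SET NULL")}
for action, outcome in outcomes.items():
    print(action, "->", outcome)
```

With RESTRICT the delete is refused, with CASCADE the city disappears along with its country, and with SET NULL the city survives with a NULL `CountryCode`.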
  7. 7. Foreign Key Constraints (p4)<br />Chicken and Egg definitions<br />CREATE TABLE chicken (<br />cID INT PRIMARY KEY, <br />eID INT <br /> REFERENCES egg(eID));<br />CREATE TABLE egg(<br />eID INT PRIMARY KEY,<br />cID INT<br /> REFERENCES chicken(cID));<br />Consistently fails<br />Can’t define a foreign key to a table before it has been defined<br />Solution<br />Define the tables w/o constraints<br />CREATE TABLE chicken(<br />cID INT PRIMARY KEY,<br />eID INT); <br />CREATE TABLE egg(<br />eID INT PRIMARY KEY,<br />cID INT);<br />And then add foreign keys<br />ALTER TABLE chicken <br /> ADD CONSTRAINT c_e<br /> FOREIGN KEY (eID)<br /> REFERENCES egg(eID);<br />ALTER TABLE egg <br /> ADD CONSTRAINT e_c<br /> FOREIGN KEY (cID)<br /> REFERENCES chicken(cID);<br />
  8. 8. Foreign Key Constraints (p5)<br />Chicken and Egg insertion<br />INSERT INTO chicken<br /> VALUES(1, 1001);<br />INSERT INTO egg <br /> VALUES(1001, 1);<br />Still consistently fails<br />Need a way to postpone constraint checking<br />How long to postpone?<br />Until transaction commit <br />Solution<br />Define the tables with deferred constraint-checking<br />ALTER TABLE chicken<br /> ADD CONSTRAINT c_e<br /> FOREIGN KEY (eID) <br /> REFERENCES egg(eID)<br /> INITIALLY DEFERRED DEFERRABLE;<br />ALTER TABLE egg <br /> ADD CONSTRAINT e_c<br /> FOREIGN KEY (cID)<br /> REFERENCES chicken(cID)<br /> INITIALLY DEFERRED DEFERRABLE;<br />And then<br />INSERT INTO chicken VALUES(1, 1001);<br />INSERT INTO egg VALUES(1001, 1);<br />COMMIT;<br />
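Deferred checking can be tried out in SQLite, with two departures from the slide that should be read as assumptions of this sketch: SQLite has no `ALTER TABLE ... ADD CONSTRAINT`, so the constraints are declared inside `CREATE TABLE`, and SQLite tolerates a reference to a table that does not exist yet, so the definition order is not a problem there. The second transaction is an added illustration of what happens when the matching egg never arrives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE chicken (
        cID INT PRIMARY KEY,
        eID INT REFERENCES egg(eID) DEFERRABLE INITIALLY DEFERRED
    );
    CREATE TABLE egg (
        eID INT PRIMARY KEY,
        cID INT REFERENCES chicken(cID) DEFERRABLE INITIALLY DEFERRED
    );
""")

# Both inserts share one transaction; the checks run only at commit.
with conn:
    conn.execute("INSERT INTO chicken VALUES (1, 1001)")
    conn.execute("INSERT INTO egg VALUES (1001, 1)")
pairs = conn.execute(
    "SELECT c.cID, e.eID FROM chicken c JOIN egg e ON c.eID = e.eID").fetchall()

# A chicken whose egg is never inserted is caught when the transaction commits.
try:
    with conn:
        conn.execute("INSERT INTO chicken VALUES (2, 2002)")
    dangling = "committed"
except sqlite3.IntegrityError:
    conn.rollback()  # make sure the failed transaction is closed
    dangling = "rejected at commit"

chicken_count = conn.execute("SELECT COUNT(*) FROM chicken").fetchone()[0]
print(pairs, dangling, chicken_count)
```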
  9. 9. Attribute-Based Constraints<br />NOT NULL<br />The most common<br />Reasonability Constraints<br />Validate incoming data? e.g.,<br />Population Density < 30000<br />Specification:<br />Population INT(11) NOT NULL<br /> CHECK (Population <= 30000 * SurfaceArea),<br />The condition in CHECK(cond) can take any value that a condition in WHERE(cond) can take<br />Including subqueries<br />The attribute constraint is checked when assigned<br />Can be violated underneath as long as it is not re-evaluated<br />For example, if we update SurfaceArea, the violation won’t be flagged<br />Not implemented in all databases, e.g., MySQL<br />
  10. 10. Tuple-Based Constraints<br />Validate the entire tuple whenever anything in that tuple is updated<br />More integrity enforcement than with attribute-based constraints e.g.,<br />Population Density <= 30000<br />Specification:<br />Population INT(11) NOT NULL,<br />CHECK (Population <= 30000 * SurfaceArea),<br />The condition in CHECK(cond) can take any value that a condition in WHERE(cond) can take<br />Including subqueries<br />The tuple constraint is checked whenever the tuple is updated<br />If we update SurfaceArea, the violation will be flagged<br />But the violation of<br />CHECK (Population > (<br /> SELECT SUM(Population)<br /> FROM City WHERE City.CountryCode = Code))<br />which specifies a subquery involving another table, will not be flagged<br />Not implemented in all databases, e.g., MySQL<br />
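The SurfaceArea case can be reproduced with a table-level `CHECK`. This is a sketch assuming SQLite via Python's `sqlite3`, with a cut-down `Country` table and invented figures. SQLite does not accept subqueries inside `CHECK`, so only the single-table condition is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Country (
        Code        CHAR(3) PRIMARY KEY,
        Population  INT  NOT NULL,
        SurfaceArea REAL NOT NULL,
        CHECK (Population <= 30000 * SurfaceArea)  -- tuple-based constraint
    )""")
conn.execute("INSERT INTO Country VALUES ('FIN', 5000000, 338145.0)")

# Shrinking the area pushes the density over the limit; the whole tuple is re-checked.
try:
    conn.execute("UPDATE Country SET SurfaceArea = 100.0 WHERE Code = 'FIN'")
    shrink_result = "ok"
except sqlite3.IntegrityError as exc:
    shrink_result = f"rejected: {exc}"

area = conn.execute(
    "SELECT SurfaceArea FROM Country WHERE Code = 'FIN'").fetchone()[0]
print(shrink_result, area)
```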
  11. 11. Assertions<br />Validate the entire database whenever anything in the database is updated<br />Part of the database, not any specific table<br />Specification: Table-like<br />CREATE ASSERTION CountryPop CHECK (<br /> NOT EXISTS<br /> (SELECT * FROM Country <br /> WHERE Population < <br /> (SELECT SUM(Population)<br /> FROM City WHERE City.CountryCode = Code)))<br />Difficult to implement efficiently<br />Often not implemented<br />I don’t know of any implementations<br />Can be implemented for specific cases using Triggers, see Section 7.5<br />
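One specific case of the `CountryPop` assertion can be approximated with a trigger. The sketch below assumes SQLite via Python's `sqlite3`; the trigger name `CountryPopOnCityInsert` and the sample rows are hypothetical, and the trigger guards only inserts into `City`. Updates to either table would need triggers of their own, which is part of why general assertions are hard to implement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Country (Code CHAR(3) PRIMARY KEY, Population INT);
    CREATE TABLE City (Name TEXT, CountryCode CHAR(3), Population INT);

    -- Reject a new city if the country's cities would outnumber the country.
    CREATE TRIGGER CountryPopOnCityInsert
    BEFORE INSERT ON City
    WHEN (SELECT Population FROM Country WHERE Code = NEW.CountryCode)
         < NEW.Population + (SELECT COALESCE(SUM(Population), 0)
                             FROM City WHERE CountryCode = NEW.CountryCode)
    BEGIN
        SELECT RAISE(ABORT, 'CountryPop violated');
    END;

    INSERT INTO Country VALUES ('FIN', 5000000);
    INSERT INTO City VALUES ('Helsinki', 'FIN', 600000);
""")

try:
    conn.execute("INSERT INTO City VALUES ('Atlantis', 'FIN', 9000000)")
    big_city = "ok"
except sqlite3.DatabaseError as exc:
    big_city = f"rejected: {exc}"

city_count = conn.execute("SELECT COUNT(*) FROM City").fetchone()[0]
print(big_city, city_count)
```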
  12. 12. Views<br />Also called Virtual Views<br />Don’t actually exist in the database but behave as if they do<br />Can be subsets of the data or joins – actually, arbitrary queries<br />Subset example,<br />CREATE VIEW ct AS SELECT c.Name AS nm, c.countrycode AS cntry<br />FROM city c WHERE population > 0<br />Join example<br />CREATE VIEW CityLanguage AS <br /> SELECT city.name, city.countrycode, lang.language AS Language <br /> FROM city, countrylanguage AS lang<br /> WHERE city.countrycode = lang.countrycode<br /> AND lang.isOfficial = 'T';<br />
  13. 13. Operations on Views (p1)<br />SELECT<br /> SELECT * FROM CityLanguage WHERE Language='Dutch';<br />Shouldn’t ‘temporarily’ create the table and SELECT from it.<br />Should use the definition of CityLanguage to make a query, i.e.,<br />SELECT * <br /> FROM <br /> (SELECT …blabla…<br /> FROM city, countrylanguage as lang<br /> WHERE city.countrycode = lang.countrycode<br /> AND lang.isOfficial = 'T')<br /> WHERE Language='Dutch';<br />
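That a query on the view and the expanded query return the same rows can be confirmed directly. The sketch assumes SQLite via Python's `sqlite3` and a handful of invented rows; the subquery's select list, elided on the slide, is filled in here from the `CityLanguage` definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city (name TEXT, countrycode CHAR(3));
    CREATE TABLE countrylanguage (countrycode CHAR(3), language TEXT, isOfficial CHAR(1));
    INSERT INTO city VALUES ('Amsterdam', 'NLD'), ('Helsinki', 'FIN');
    INSERT INTO countrylanguage VALUES
        ('NLD', 'Dutch', 'T'), ('NLD', 'Arabic', 'F'), ('FIN', 'Finnish', 'T');

    CREATE VIEW CityLanguage AS
        SELECT city.name, city.countrycode, lang.language AS Language
        FROM city, countrylanguage AS lang
        WHERE city.countrycode = lang.countrycode AND lang.isOfficial = 'T';
""")

# Query written against the view.
via_view = conn.execute(
    "SELECT * FROM CityLanguage WHERE Language = 'Dutch'").fetchall()

# The same query with the view's definition substituted in by hand.
expanded = conn.execute("""
    SELECT * FROM
        (SELECT city.name, city.countrycode, lang.language AS Language
         FROM city, countrylanguage AS lang
         WHERE city.countrycode = lang.countrycode AND lang.isOfficial = 'T')
    WHERE Language = 'Dutch'""").fetchall()

print(via_view, expanded)
```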
  14. 14. Operations on Views (p2)<br />UPDATE and INSERT are not always possible; the exceptions:<br />Can sometimes be implemented using INSTEAD OF triggers<br />Modifications are permitted when the view is derived from a single table R and<br />The WHERE clause does not involve R in a Subquery<br />The FROM clause can only consist of one occurrence of R<br />The values of all attributes not specified in the view definition can be ‘manufactured’ by the database<br />Example. For the view ct<br />CREATE VIEW ct AS SELECT c.Name AS nm, c.countrycode AS cntry<br />FROM city c WHERE population > 0<br /> the query<br />INSERT INTO ct (nm, cntry) VALUES ('FirSPA', 'FIN') <br /> can be automatically rewritten as <br />INSERT INTO CITY (Name, CountryCode) VALUES ('FirSPA', 'FIN')<br />
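Where a database does not rewrite the insert on its own, an `INSTEAD OF` trigger can spell the rewrite out. SQLite is such a case, since its views are read-only. The sketch below assumes SQLite via Python's `sqlite3`; the trigger name `ct_insert` and the `DEFAULT 1` on `Population` are assumptions, the default standing in for the value the database has to manufacture so the new row satisfies `Population > 0`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city (Name TEXT, CountryCode CHAR(3), Population INT DEFAULT 1);
    CREATE VIEW ct AS
        SELECT c.Name AS nm, c.CountryCode AS cntry FROM city c WHERE Population > 0;

    -- Redirect inserts on the view to the underlying table.
    CREATE TRIGGER ct_insert INSTEAD OF INSERT ON ct
    BEGIN
        INSERT INTO city (Name, CountryCode) VALUES (NEW.nm, NEW.cntry);
    END;
""")

conn.execute("INSERT INTO ct (nm, cntry) VALUES ('FirSPA', 'FIN')")
base_rows = conn.execute("SELECT Name, CountryCode, Population FROM city").fetchall()
view_rows = conn.execute("SELECT nm, cntry FROM ct").fetchall()
print(base_rows, view_rows)
```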
  15. 15. Top-Down Datalog Recursion Revisited<br />IDB’s are conceptualized (and implemented) as Views<br />for IDB predicate p(x,y, …)<br /> FOR EACH subgoal of p DO<br /> IF subgoal is IDB, recursive call;<br /> IF subgoal is EDB, look up<br />
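The loop above can be made concrete for binary predicates. The following is only a sketch: the `edge`/`path` program, the rule encoding as lists of chained subgoals, and the function name `solve` are all illustrative assumptions, and the naive recursion terminates only because the sample edge relation is acyclic.

```python
# EDB: stored facts, here a small acyclic edge relation (illustrative data).
EDB = {"edge": {("a", "b"), ("b", "c"), ("c", "d")}}

# IDB: each rule body is a chain of binary subgoals.
RULES = {
    "path": [
        ["edge"],          # path(x, y) :- edge(x, y).
        ["edge", "path"],  # path(x, y) :- edge(x, z), path(z, y).
    ]
}

def solve(pred, x):
    """Return every y such that pred(x, y) holds, evaluated top-down."""
    if pred in EDB:                      # EDB subgoal: look up
        return {b for (a, b) in EDB[pred] if a == x}
    answers = set()
    for body in RULES[pred]:             # IDB subgoal: expand each rule
        frontier = {x}
        for subgoal in body:             # recursive call per subgoal
            frontier = {y for z in frontier for y in solve(subgoal, z)}
        answers |= frontier
    return answers

reachable = solve("path", "a")
print(sorted(reachable))
```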
  16. 16. Indexes<br />Main Idea: Data Structures for Fast Search<br />Motivation:<br />Preventing the need for linear search through a big table<br />Example query: <br />SELECT * FROM City WHERE CountryCode = 'FIN';<br />Another: <br />SELECT * FROM City <br /> WHERE Population > (0.4 * (<br /> SELECT Population FROM Country <br /> WHERE CountryCode = Code));<br />Expected time for first example: O(n). For the second, O(n²)<br />Declaration<br />CREATE INDEX CityIndex ON City(CountryCode);<br />CREATE INDEX CityPopIndex ON City(Population);<br />CREATE INDEX CountryPopIndex ON Country(Population);<br />
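Whether a query still scans the table after an index is declared can be read off the query plan. The sketch assumes SQLite via Python's `sqlite3`, where `EXPLAIN QUERY PLAN` reports the strategy; the helper `plan` is hypothetical and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE City (Name TEXT, CountryCode CHAR(3), Population INT)")

def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text

query = "SELECT * FROM City WHERE CountryCode = 'FIN'"
before = plan(query)   # no index yet: a full scan of City
conn.execute("CREATE INDEX CityIndex ON City(CountryCode)")
after = plan(query)    # the planner can now search through CityIndex

print("before:", before)
print("after: ", after)
```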
  17. 17. Selection of Indexes (p1)<br />Why not create an index for every attribute?<br />Useful indexes, and not so useful ones<br />Primary key?<br />Unique key?<br />From previous examples, <br />CityIndex?<br />CityPopIndex?<br />CountryPopIndex?<br />
  18. 18. Selection of Indexes (p2)<br />The Mantra:<br />Don’t define indexes too early: know your workload first<br />Be as empirical as is practical<br />The Greedy approach to index selection:<br />Start with no indexes<br />Evaluate candidate indexes, choose the one potentially most effective<br />Repeat<br />Query execution will take advantage of defined indexes<br />
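The greedy loop can be sketched in a few lines. Everything here beyond the loop itself is an assumption made for illustration: the function name `greedy_index_selection`, the caller-supplied `workload_cost` function, and the toy cost model with made-up numbers. In practice the cost would come from measuring or estimating the real workload.

```python
def greedy_index_selection(candidates, workload_cost):
    """Repeatedly add the candidate index that lowers workload cost the most."""
    chosen = []                                  # start with no indexes
    best_cost = workload_cost(chosen)
    while True:
        trials = {c: workload_cost(chosen + [c])
                  for c in candidates if c not in chosen}
        if not trials:
            break
        best = min(trials, key=trials.get)       # potentially most effective
        if trials[best] >= best_cost:            # no candidate helps any more
            break
        chosen.append(best)
        best_cost = trials[best]
    return chosen, best_cost

# Toy cost model (made-up numbers): an index saves query time
# but every index adds a fixed update overhead.
SAVINGS = {"CityIndex": 50, "CityPopIndex": 20, "CountryPopIndex": 2}
UPDATE_OVERHEAD = 5

def toy_cost(indexes):
    return 100 - sum(SAVINGS[i] for i in indexes) + UPDATE_OVERHEAD * len(indexes)

chosen, cost = greedy_index_selection(list(SAVINGS), toy_cost)
print(chosen, cost)
```

In this toy model `CountryPopIndex` is never chosen, because its update overhead outweighs what it saves.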
  19. 19. CS 542 Database Management Systems<br />Report Proposals<br />J Singh <br />January 31, 2011<br />
  20. 20. Next meeting<br />February 7<br />Index Structures, Chapter 14<br />
