CS 542 Database Management Systems Controlling Database Integrity and Performance J Singh  January 31, 2011
Today’s Topics
  - Database Integrity
      - Primary Key Constraints – Prevent Duplicates
      - Foreign Key Constraints – Prevent Dangling References
      - Attribute Constraints – Prevent Inconsistent Attribute Values
      - Tuple Constraints – More vigilant checking of attribute values
      - Assertions – Paranoid integrity checking
  - Views
  - Performance Topics
      - Indexes
  - Discussion of presentation topic proposals
Primary Key Constraints
  - What are Primary Keys good for?
      - Uniquely identify the subject of each tuple
      - Ensure that there are no duplicates
      - Cannot be null – that would imply a NULL subject
  - A table may not have more than one primary key
      - A Primary Key may consist of one or more columns
      - Multiple Unique keys are OK
  - For Table R, <P1, P2, …, Pm> together constitute a primary key if, for each tuple in R,
      - <P1, P2, …, Pm> are unique
      - P1, P2, …, Pm are non-null
  - <U1, U2, …, Um> together constitute a unique key if, for each tuple in R,
      - <U1, U2, …, Um> are unique
      - But U1, U2, …, Um can be null
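The difference between a primary key and a unique key can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module and a cut-down, hypothetical Country table (the columns Code and Code2 are chosen for illustration). One SQLite-specific caveat: SQLite accepts NULL in a non-integer PRIMARY KEY column unless NOT NULL is also declared, so the sketch declares it explicitly.

```python
import sqlite3

def make_key_db():
    conn = sqlite3.connect(":memory:")
    # Code is the primary key; Code2 is a unique key that may be NULL.
    conn.execute(
        "CREATE TABLE Country ("
        " Code TEXT NOT NULL PRIMARY KEY,"
        " Code2 TEXT UNIQUE)"
    )
    conn.execute("INSERT INTO Country VALUES ('FIN', NULL)")
    conn.execute("INSERT INTO Country VALUES ('SWE', NULL)")  # a second NULL is fine
    conn.execute("INSERT INTO Country VALUES ('NOR', 'NO')")
    return conn

def rejected(conn, row):
    """Return True if the DBMS refuses to insert the given (Code, Code2) row."""
    try:
        conn.execute("INSERT INTO Country VALUES (?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True

conn = make_key_db()
print(rejected(conn, ("FIN", "FI")))  # duplicate primary key
print(rejected(conn, (None, "XX")))   # NULL primary key
print(rejected(conn, ("DNK", "NO")))  # duplicate non-NULL unique key
```

All three attempts should be refused, while the two rows with a NULL Code2 coexist.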
Foreign Key Constraints (p1)
  - Main Idea: Prevent Dangling Tuples
  - Foreign Key
      - Must point to a Key Reference
          CREATE TABLE City (
            ::
            CountryCode char(3) REFERENCES Country(Code)
          )
  - Key Reference
      - Must be unique or primary key
  - Try: INSERT INTO city (Name, CountryCode) VALUES ('xyzzy', 'XYZ');
  - Try: UPDATE city SET CountryCode='XYZ' WHERE CountryCode='FIN';
  - Key reference must already exist before a referencing tuple can be added
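The two "Try" statements can be run against a small stand-in schema. This is a sketch under assumptions: it uses Python's sqlite3 module instead of the course database, the tables are reduced to a few hypothetical columns, and SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

def open_world():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.execute("CREATE TABLE Country (Code TEXT NOT NULL PRIMARY KEY, Name TEXT)")
    conn.execute(
        "CREATE TABLE City (Name TEXT, CountryCode TEXT REFERENCES Country(Code))"
    )
    conn.execute("INSERT INTO Country VALUES ('FIN', 'Finland')")
    conn.execute("INSERT INTO City VALUES ('Helsinki', 'FIN')")
    return conn

def is_rejected(conn, sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

conn = open_world()
# Both statements would leave a City row pointing at a non-existent Country.
print(is_rejected(conn, "INSERT INTO City (Name, CountryCode) VALUES ('xyzzy', 'XYZ')"))
print(is_rejected(conn, "UPDATE City SET CountryCode='XYZ' WHERE CountryCode='FIN'"))
```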
Foreign Key Constraints (p2)
  - Alternative methods of defining a foreign key
      CREATE TABLE City ( CountryCode char(3) REFERENCES COUNTRY(Code), …)
      CREATE TABLE City ( CountryCode char(3), …,
        [CONSTRAINT [ctyREFcntry]] FOREIGN KEY (CountryCode)
          REFERENCES COUNTRY(Code))
      CREATE TABLE City ( CountryCode char(3), …)
    Then, later,
      ALTER TABLE City ADD [CONSTRAINT [ctyREFcntry]]
        FOREIGN KEY (CountryCode) REFERENCES COUNTRY(Code);
  - Notation: [] signifies optional
Foreign Key Constraints (p3)
  - Referential Integrity Options
      - Restrict (default): reject the request
      - Cascade: propagate the change to the referencing tuples
      - Set Null: set the foreign key to NULL
  - Changes to Key References
      - Try: DELETE FROM country WHERE code='FIN';
      - Try: UPDATE country SET Code='XYZ' WHERE Code='FIN';
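The three referential integrity options can be compared side by side. The sketch below is an illustration only: it assumes Python's sqlite3 module, a hypothetical two-table schema, and a helper name (`try_delete`) that does not come from the slides. It deletes the referenced country under each option and reports what happens to the referencing city.

```python
import sqlite3

def try_delete(action):
    """Delete the referenced Country row under the given ON DELETE action."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE Country (Code TEXT NOT NULL PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE City (Name TEXT, CountryCode TEXT "
        "REFERENCES Country(Code) ON DELETE " + action + ")"
    )
    conn.execute("INSERT INTO Country VALUES ('FIN')")
    conn.execute("INSERT INTO City VALUES ('Helsinki', 'FIN')")
    try:
        conn.execute("DELETE FROM Country WHERE Code = 'FIN'")
    except sqlite3.IntegrityError:
        return "rejected"
    # The delete went through: show what is left in the referencing table.
    return conn.execute("SELECT Name, CountryCode FROM City").fetchall()

for action in ("RESTRICT", "CASCADE", "SET NULL"):
    print(action, "->", try_delete(action))
```

Under RESTRICT the delete is refused, under CASCADE the city row disappears with its country, and under SET NULL the city survives with a NULL CountryCode.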
Foreign Key Constraints (p4)
  - Chicken and Egg definitions
      CREATE TABLE chicken (
        cID INT PRIMARY KEY,
        eID INT REFERENCES egg(eID));
      CREATE TABLE egg (
        eID INT PRIMARY KEY,
        cID INT REFERENCES chicken(cID));
  - Consistently fails
      - Can’t define a foreign key to a table before it has been defined
  - Solution
      - Define the tables w/o constraints
          CREATE TABLE chicken (cID INT PRIMARY KEY, eID INT);
          CREATE TABLE egg (eID INT PRIMARY KEY, cID INT);
      - And then add foreign keys
          ALTER TABLE chicken ADD CONSTRAINT c_e
            FOREIGN KEY (eID) REFERENCES egg(eID);
          ALTER TABLE egg ADD CONSTRAINT e_c
            FOREIGN KEY (cID) REFERENCES chicken(cID);
Foreign Key Constraints (p5)
  - Chicken and Egg insertion
      INSERT INTO chicken VALUES(1, 1001);
      INSERT INTO egg VALUES(1001, 1);
  - Still consistently fails
      - Need a way to postpone constraint checking
      - How long to postpone? Until transaction commit
  - Solution
      - Define the tables with deferred constraint-checking
          ALTER TABLE chicken ADD CONSTRAINT c_e
            FOREIGN KEY (eID) REFERENCES egg(eID)
            INITIALLY DEFERRED DEFERRABLE;
          ALTER TABLE egg ADD CONSTRAINT e_c
            FOREIGN KEY (cID) REFERENCES chicken(cID)
            INITIALLY DEFERRED DEFERRABLE;
      - And then
          INSERT INTO chicken VALUES(1, 1001);
          INSERT INTO egg VALUES(1001, 1);
          COMMIT;
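The effect of deferring the check until commit can be reproduced with a short sketch, assuming Python's sqlite3 module. Two SQLite-specific differences from the slides are worth noting: SQLite does not verify that the referenced table exists when a table is created, so the mutual references can be declared directly in CREATE TABLE, and it has no ALTER TABLE ... ADD CONSTRAINT. The helper name `insert_pair` is hypothetical.

```python
import sqlite3

def insert_pair(fk_mode):
    """Insert a chicken and its egg in one transaction; fk_mode is appended to each foreign key."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # transactions managed by hand
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE chicken (cID INTEGER PRIMARY KEY, "
        "eID INTEGER REFERENCES egg(eID) " + fk_mode + ")"
    )
    conn.execute(
        "CREATE TABLE egg (eID INTEGER PRIMARY KEY, "
        "cID INTEGER REFERENCES chicken(cID) " + fk_mode + ")"
    )
    conn.execute("BEGIN")
    try:
        conn.execute("INSERT INTO chicken VALUES (1, 1001)")  # egg 1001 does not exist yet
        conn.execute("INSERT INTO egg VALUES (1001, 1)")
        conn.execute("COMMIT")  # deferred constraints are checked here
        return "committed"
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return "rejected"

print(insert_pair(""))                               # immediate checking
print(insert_pair("DEFERRABLE INITIALLY DEFERRED"))  # checking postponed to COMMIT
```

With immediate checking the first INSERT already fails; with deferred checking both rows are in place by the time the constraints are evaluated.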
Attribute-Based Constraints
  - NOT NULL
      - The most common
  - Reasonability Constraints
      - Validate incoming data? e.g., Population Density <= 30000
      - Specification:
          Population INT(11) NOT NULL
            CHECK (Population <= 30000 * SurfaceArea),
  - The condition in CHECK(cond) can be anything that a condition in WHERE(cond) can be
      - Including subqueries
  - The attribute constraint is checked when the attribute is assigned
      - Can be violated underneath as long as it is not re-evaluated
      - For example, if we update SurfaceArea, the violation won’t be flagged
  - Not implemented in all databases, e.g., MySQL (versions before 8.0.16 parse CHECK but do not enforce it)
Tuple-Based Constraints
  - Validate the entire tuple whenever anything in that tuple is updated
      - More integrity enforcement than with attribute-based constraints
      - e.g., Population Density <= 30000
      - Specification:
          Population INT(11) NOT NULL,
          CHECK (Population <= 30000 * SurfaceArea),
  - The condition in CHECK(cond) can be anything that a condition in WHERE(cond) can be
      - Including subqueries
  - The tuple constraint is checked when the tuple is inserted or updated
      - If we update SurfaceArea, the violation will be flagged
      - But the violation of
          CHECK (Population > (
            SELECT SUM(Population)
            FROM City WHERE City.CountryCode = Code))
        which specifies a subquery involving another table, will not be flagged when that other table (City) is modified
  - Not implemented in all databases, e.g., MySQL
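A tuple-based check can be exercised as follows. This is a sketch under assumptions: it uses Python's sqlite3 module and a reduced, hypothetical Country table. SQLite re-evaluates CHECK constraints whenever a row is inserted or updated, so it behaves like the tuple-based form described above; it does not allow subqueries inside CHECK, so the cross-table case cannot be shown this way.

```python
import sqlite3

def make_country():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Country ("
        " Code TEXT NOT NULL PRIMARY KEY,"
        " SurfaceArea REAL NOT NULL,"
        " Population INTEGER NOT NULL,"
        " CHECK (Population <= 30000 * SurfaceArea))"  # density limit over two attributes
    )
    conn.execute("INSERT INTO Country VALUES ('FIN', 338145.0, 5171300)")
    return conn

def violates(conn, sql):
    """Return True if the statement is refused by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

conn = make_country()
# Shrinking the area pushes the density over the limit, and the update is flagged
# even though Population itself was not assigned.
print(violates(conn, "UPDATE Country SET SurfaceArea = 1.0 WHERE Code = 'FIN'"))
print(violates(conn, "UPDATE Country SET Population = 100 WHERE Code = 'FIN'"))
```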
Assertions
  - Validate the entire database whenever anything in the database is updated
  - Part of the database, not any specific table
  - Specification: Table-like
      CREATE ASSERTION CountryPop CHECK (
        NOT EXISTS
          (SELECT * FROM Country
            WHERE Population <
              (SELECT SUM(Population)
               FROM City WHERE City.CountryCode = Code)))
  - Difficult to implement efficiently
      - Often not implemented
      - I don’t know of any implementations
      - Can be implemented for specific cases using Triggers, see Section 7.5
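As an illustration of the trigger route for one specific case, the sketch below emulates CountryPop for insertions into City only. It assumes Python's sqlite3 module (SQLite has no CREATE ASSERTION); the trigger name and the reduced schema are hypothetical. A fuller emulation would also need triggers for updates to City and for updates to Country.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Country (Code TEXT NOT NULL PRIMARY KEY, Population INTEGER NOT NULL);
CREATE TABLE City (Name TEXT, CountryCode TEXT, Population INTEGER NOT NULL);

-- Fires after each City insert; aborts the statement if the country's
-- population is now smaller than the sum of its cities' populations.
CREATE TRIGGER CountryPop_city_insert
AFTER INSERT ON City
WHEN (SELECT Population FROM Country WHERE Code = NEW.CountryCode)
     < (SELECT SUM(Population) FROM City WHERE CountryCode = NEW.CountryCode)
BEGIN
    SELECT RAISE(ABORT, 'CountryPop violated');
END;
"""

def make_assertion_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Country VALUES ('FIN', 1000)")
    return conn

def add_city(conn, name, code, population):
    """Return True if the city was inserted, False if the trigger refused it."""
    try:
        conn.execute("INSERT INTO City VALUES (?, ?, ?)", (name, code, population))
        return True
    except sqlite3.DatabaseError:
        return False

conn = make_assertion_db()
print(add_city(conn, "Helsinki", "FIN", 600))  # cities total 600, within 1000
print(add_city(conn, "Espoo", "FIN", 500))     # would bring the total to 1100
```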
Views
  - Also called Virtual Views
      - Don’t actually exist in the database but behave as if they do
      - Can be subsets of the data or joins – actually, arbitrary queries
  - Subset example
      CREATE VIEW ct AS
        SELECT c.Name AS nm, c.countrycode AS cntry
        FROM city c WHERE population > 0
  - Join example
      CREATE VIEW CityLanguage AS
        SELECT city.name, city.countrycode, lang.language AS Language
        FROM city, countrylanguage AS lang
        WHERE city.countrycode = lang.countrycode
          AND lang.isOfficial = 'T';
Operations on Views (p1)
  - SELECT
      SELECT * FROM CityLanguage WHERE Language='Dutch';
  - Shouldn’t ‘temporarily’ create the table and SELECT from it.
  - Should use the definition of CityLanguage to make a query, i.e.,
      SELECT *
        FROM
          (SELECT …blabla…
           FROM city, countrylanguage AS lang
           WHERE city.countrycode = lang.countrycode
             AND lang.isOfficial = 'T')
        WHERE Language='Dutch';
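That a view is a stored query rather than a stored table can be observed directly. The sketch below assumes Python's sqlite3 module and a cut-down version of the city and countrylanguage tables with made-up sample rows. A row added to the base table after the view was created shows up in the next query on the view.

```python
import sqlite3

def make_view_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE city (name TEXT, countrycode TEXT);
        CREATE TABLE countrylanguage (countrycode TEXT, language TEXT, isOfficial TEXT);
        CREATE VIEW CityLanguage AS
            SELECT city.name, city.countrycode, lang.language AS Language
            FROM city, countrylanguage AS lang
            WHERE city.countrycode = lang.countrycode
              AND lang.isOfficial = 'T';
        INSERT INTO countrylanguage VALUES ('NLD', 'Dutch', 'T');
        INSERT INTO countrylanguage VALUES ('NLD', 'Fries', 'F');
        INSERT INTO city VALUES ('Amsterdam', 'NLD');
    """)
    return conn

def dutch_cities(conn):
    """Query the view; the DBMS expands its definition at query time."""
    rows = conn.execute(
        "SELECT name FROM CityLanguage WHERE Language = 'Dutch' ORDER BY name"
    )
    return [r[0] for r in rows]

conn = make_view_db()
print(dutch_cities(conn))
conn.execute("INSERT INTO city VALUES ('Rotterdam', 'NLD')")  # change the base table only
print(dutch_cities(conn))  # the view reflects the new row without being refreshed
```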
Operations on Views (p2)
  - UPDATE, INSERT not always possible, except
      - Can sometimes be implemented using INSTEAD OF triggers
      - Modifications are permitted when the view is derived from a single table R and
          - The WHERE clause does not involve R in a subquery
          - The FROM clause consists of only one occurrence of R
          - The values of all attributes not specified in the view definition can be ‘manufactured’ by the database
  - Example. For the view ct
      CREATE VIEW ct AS
        SELECT c.Name AS nm, c.countrycode AS cntry
        FROM city c WHERE population > 0
    the query
      INSERT INTO ct (nm, cntry) VALUES ('FirSPA', 'FIN')
    can be automatically rewritten as
      INSERT INTO CITY (Name, CountryCode) VALUES ('FirSPA', 'FIN')
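Where the DBMS does not rewrite the insert by itself, an INSTEAD OF trigger can perform the same rewrite. The sketch below assumes Python's sqlite3 module (SQLite views are read-only unless such a trigger exists); the trigger name `ct_insert`, the ID column, and the default of 0 for Population are assumptions made for illustration.

```python
import sqlite3

def make_ct_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE city (
            ID INTEGER PRIMARY KEY,                 -- 'manufactured' by the database
            Name TEXT NOT NULL,
            CountryCode TEXT NOT NULL,
            Population INTEGER NOT NULL DEFAULT 0   -- 'manufactured' default
        );
        CREATE VIEW ct AS
            SELECT c.Name AS nm, c.CountryCode AS cntry
            FROM city c WHERE Population > 0;
        -- Rewrite an insert on the view into an insert on the base table.
        CREATE TRIGGER ct_insert INSTEAD OF INSERT ON ct
        BEGIN
            INSERT INTO city (Name, CountryCode) VALUES (NEW.nm, NEW.cntry);
        END;
    """)
    return conn

conn = make_ct_db()
conn.execute("INSERT INTO ct (nm, cntry) VALUES ('FirSPA', 'FIN')")
print(conn.execute("SELECT Name, CountryCode, Population FROM city").fetchall())
print(conn.execute("SELECT * FROM ct").fetchall())
```

With these assumed defaults the new row lands in city with Population 0, so it does not satisfy the view's own WHERE clause and is not visible through ct. This is a side effect to keep in mind when the database manufactures the missing attribute values.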
Top-Down Datalog Recursion Revisited
  - IDB’s are conceptualized (and implemented) as Views
      for IDB predicate p(x, y, …)
        FOR EACH subgoal of p DO
          IF subgoal is IDB, recursive call;
          IF subgoal is EDB, look up
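The recursion scheme can be illustrated with a small Python sketch. The predicates are assumptions chosen for the example, not taken from the slides: an EDB relation edge and an IDB predicate path defined by path(x,y) :- edge(x,y) and path(x,y) :- edge(x,z), path(z,y). The `visiting` guard is an addition that keeps the naive top-down evaluation from looping on cyclic data.

```python
# EDB: facts stored in the database (hypothetical sample data).
EDGE = {("a", "b"), ("b", "c"), ("c", "d")}

def edge(x, y):
    """EDB subgoal: a plain lookup."""
    return (x, y) in EDGE

def path(x, y, visiting=frozenset()):
    """IDB predicate, evaluated top-down.

    path(x, y) :- edge(x, y).
    path(x, y) :- edge(x, z), path(z, y).
    """
    if edge(x, y):                 # rule 1: EDB subgoal, look up
        return True
    if x in visiting:              # already expanding x: stop to avoid a cycle
        return False
    for (u, z) in EDGE:            # rule 2: EDB subgoal edge(x, z), look up
        if u == x and path(z, y, visiting | {x}):  # IDB subgoal, recursive call
            return True
    return False

print(path("a", "d"))
print(path("d", "a"))
```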
Indexes
  - Main Idea: Data Structures for Fast Search
  - Motivation: Preventing the need for linear search through a big table
      - Example query:
          SELECT * FROM City WHERE CountryCode = 'FIN';
      - Another:
          SELECT * FROM City
            WHERE Population > (0.4 * (
              SELECT Population FROM Country
              WHERE CountryCode = Code));
      - Expected time without indexes for the first example: O(n). For the second, O(n^2)
  - Declaration
      CREATE INDEX CityIndex ON City(CountryCode);
      CREATE INDEX CityPopIndex ON City(Population);
      CREATE INDEX CountryPopIndex ON Country(Population);
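Whether the optimizer actually uses an index can be checked by asking for the query plan before and after the index is declared. The sketch below assumes Python's sqlite3 module and SQLite's `EXPLAIN QUERY PLAN`; other systems expose the same information through their own EXPLAIN variants, and the exact wording of the plan text differs between SQLite versions. The City table is reduced to three hypothetical columns.

```python
import sqlite3

def plan(conn, sql):
    """Return the human-readable plan text (last column of each plan row)."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

def compare_plans():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE City (Name TEXT, CountryCode TEXT, Population INTEGER)")
    query = "SELECT * FROM City WHERE CountryCode = 'FIN'"
    before = plan(conn, query)  # no index yet: the whole table is scanned
    conn.execute("CREATE INDEX CityIndex ON City(CountryCode)")
    after = plan(conn, query)   # the plan should now mention CityIndex
    return before, after

before, after = compare_plans()
print("before:", before)
print("after: ", after)
```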
Selection of Indexes (p1) Why not create an index for every attribute? Useful indexes, and not so useful ones Primary key? Unique key? From previous examples,  CityIndex? CityPopIndex? CountryPopIndex?
Selection of Indexes (p2)
  - The Mantra:
      - Don’t define indexes too early: know your workload first
      - Be as empirical as is practical
  - The Greedy approach to index selection:
      - Start with no indexes
      - Evaluate candidate indexes, choose the one potentially most effective
      - Repeat
  - Query execution will take advantage of defined indexes
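The greedy loop can be sketched in a few lines of Python. Everything numeric here is an assumption: `toy_cost` is a made-up cost model (query frequencies, a cost of 100 for an unindexed query, 10 for an indexed one, and 20 per index for maintenance). In practice the cost would come from measuring or estimating the real workload, in keeping with the advice to be empirical.

```python
def greedy_index_selection(candidates, workload_cost):
    """Repeatedly add the candidate index that lowers the workload cost the most."""
    chosen = []                      # start with no indexes
    remaining = list(candidates)
    current = workload_cost(chosen)
    while remaining:
        # Evaluate every remaining candidate on top of what is already chosen.
        best = min(remaining, key=lambda idx: workload_cost(chosen + [idx]))
        best_cost = workload_cost(chosen + [best])
        if best_cost >= current:     # no candidate helps any more: stop
            break
        chosen.append(best)
        remaining.remove(best)
        current = best_cost
    return chosen, current

def toy_cost(indexes):
    """Hypothetical cost model, for illustration only."""
    # How many workload queries would benefit from each index (assumed numbers).
    frequency = {"CityIndex": 5, "CityPopIndex": 1, "CountryPopIndex": 0}
    cost = 20 * len(indexes)         # assumed maintenance cost per index
    for idx, freq in frequency.items():
        cost += freq * (10 if idx in indexes else 100)
    return cost

chosen, cost = greedy_index_selection(
    ["CityIndex", "CityPopIndex", "CountryPopIndex"], toy_cost
)
print(chosen, cost)
```

Under this toy model the loop picks CityIndex first, then CityPopIndex, and stops because CountryPopIndex would add maintenance cost without speeding up any query.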
CS 542 Database Management Systems Report Proposals J Singh  January 31, 2011
Report Proposals – General Observations
  - Simply Impressive!
  - Corrective Themes
      - When in doubt, prefer depth over breadth
      - Tilt the balance toward obtaining and working with real data
      - Focus on your contributions
      - Separate the report from the project
          - If your intent in the project is to do a significant piece of development, make the report about the design
          - Go light on implementation; toy application is good to get your feet wet but leave the heavy lifting for the project
      - For big papers, don’t try to swallow it whole. Take a piece and focus on that.
Next meeting February 7 Index Structures, Chapter 14

Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 

CS 542 Database Index Structures

  • 1. CS 542 Database Management Systems Controlling Database Integrity and Performance J Singh January 31, 2011
  • 2. Today’s Topics Database Integrity Primary Key Constraints – Prevent Duplicates Foreign Key Constraints – Prevent Dangling References Attribute Constraints – Prevent Inconsistent Attribute Values Tuple Constraints – More vigilant checking of attribute values Assertions – Paranoid integrity checking Views Performance Topics Indexes Discussion of presentation topic proposals
  • 3. Primary Key Constraints. What are primary keys good for? They uniquely identify the subject of each tuple and ensure that there are no duplicates. They cannot be null, since that would imply a NULL subject. A table may not have more than one primary key, but a primary key may consist of one or more columns. Multiple unique keys are OK. For table R, <P1, P2, …, Pm> together constitute a primary key if, for each tuple in R, <P1, P2, …, Pm> is unique and P1, P2, …, Pm are non-null. <U1, U2, …, Um> together constitute a unique key if, for each tuple in R, <U1, U2, …, Um> is unique, but U1, U2, …, Um can be null.
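The difference between a primary key and a unique key can be tried out with a minimal sketch like the one below. It assumes SQLite through Python's standard sqlite3 module rather than the database used in the lecture, and the Code2 column and sample rows are illustrative. NOT NULL is spelled out on the primary key because SQLite does not imply it for non-integer primary keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Country (
        Code  TEXT NOT NULL PRIMARY KEY,  -- primary key: unique and non-null
        Code2 TEXT UNIQUE,                -- unique key: unique, but NULL is allowed
        Name  TEXT NOT NULL
    )
    """
)

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Country (Code, Code2, Name) VALUES (?, ?, ?)", row
            )
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "first row": try_insert(("FIN", "FI", "Finland")),
    "duplicate primary key": try_insert(("FIN", "SF", "Suomi")),
    "NULL primary key": try_insert((None, "XX", "Nowhere")),
    "NULL unique key": try_insert(("AAA", None, "First")),
    "second NULL unique key": try_insert(("BBB", None, "Second")),
    "duplicate unique key": try_insert(("CCC", "FI", "Copy")),
}
```

In this sketch only the duplicate primary key, the NULL primary key, and the duplicate unique key are rejected; two rows with a NULL unique key coexist. Some systems treat NULLs in unique keys differently, so the last point is worth checking on the target database.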
  • 4. Foreign Key Constraints (p1). Main idea: prevent dangling tuples. A foreign key must point to a key reference: CREATE TABLE City ( :: CountryCode char(3) REFERENCES Country(Code) ). The key reference must be a unique key or a primary key. Try: INSERT INTO city (Name, CountryCode) VALUES ('xyzzy', 'XYZ'); Try: UPDATE city SET CountryCode='XYZ' WHERE CountryCode='FIN'; The key reference must already exist before a referencing tuple can be added.
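The two "Try" statements can be run as a small experiment. This is a sketch that assumes SQLite via Python's sqlite3 module, with a cut-down City and Country schema and sample rows invented for illustration. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE Country (Code TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE City (
        Name        TEXT,
        CountryCode TEXT REFERENCES Country(Code)
    );
    INSERT INTO Country VALUES ('FIN', 'Finland');
    INSERT INTO City VALUES ('Helsinki', 'FIN');
    """
)

def rejected(sql):
    """Return True if the statement is refused by a constraint."""
    try:
        with conn:
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Both statements from the slide would create a dangling reference to 'XYZ'.
dangling_insert = rejected(
    "INSERT INTO City (Name, CountryCode) VALUES ('xyzzy', 'XYZ')"
)
dangling_update = rejected(
    "UPDATE City SET CountryCode = 'XYZ' WHERE CountryCode = 'FIN'"
)
# A reference to an existing country is accepted.
valid_insert = rejected(
    "INSERT INTO City (Name, CountryCode) VALUES ('Turku', 'FIN')"
)
```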
  • 5. Foreign Key Constraints (p2). Alternative methods of defining a foreign key: (1) CREATE TABLE City ( CountryCode char(3) REFERENCES COUNTRY(Code), …) (2) CREATE TABLE City ( CountryCode char(3), …, [CONSTRAINT [ctyREFcntry]] FOREIGN KEY (CountryCode) REFERENCES COUNTRY(Code)) (3) CREATE TABLE City ( CountryCode char(3), …) then, later, ALTER TABLE City ADD [CONSTRAINT [ctyREFcntry]] FOREIGN KEY (CountryCode) REFERENCES COUNTRY(Code); Notation: [] signifies optional.
  • 6. Foreign Key Constraints (p3). Referential integrity options for changes to key references: Restrict (default): reject the request. Cascade: propagate the change to the referencing tuples. Set Null: set the foreign key to NULL. Try: DELETE FROM country WHERE Code='FIN'; Try: UPDATE country SET Code='XYZ' WHERE Code='FIN';
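The three options can be compared side by side. The sketch below assumes SQLite via Python's sqlite3 module and a minimal two-table schema made up for the purpose; it applies the DELETE from the slide under each ON DELETE action and reports what remains in City.

```python
import sqlite3

def delete_country(action):
    """Delete a referenced Country row under the given ON DELETE action and
    return the rows left in City afterwards."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        f"""
        CREATE TABLE Country (Code TEXT PRIMARY KEY);
        CREATE TABLE City (
            Name        TEXT,
            CountryCode TEXT REFERENCES Country(Code) ON DELETE {action}
        );
        INSERT INTO Country VALUES ('FIN');
        INSERT INTO City VALUES ('Helsinki', 'FIN');
        """
    )
    try:
        with conn:
            conn.execute("DELETE FROM Country WHERE Code = 'FIN'")
    except sqlite3.IntegrityError:
        pass  # RESTRICT: the request is rejected and nothing changes
    rows = conn.execute("SELECT Name, CountryCode FROM City").fetchall()
    conn.close()
    return rows

outcomes = {
    action: delete_country(action)
    for action in ("RESTRICT", "CASCADE", "SET NULL")
}
```

Under RESTRICT the city row survives unchanged, under CASCADE it is deleted along with the country, and under SET NULL it survives with a NULL CountryCode.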
  • 7. Foreign Key Constraints (p4). Chicken-and-egg definitions: CREATE TABLE chicken ( cID INT PRIMARY KEY, eID INT REFERENCES egg(eID)); CREATE TABLE egg( eID INT PRIMARY KEY, cID INT REFERENCES chicken(cID)); This consistently fails: a foreign key cannot reference a table that has not been defined yet. Solution: define the tables without constraints, CREATE TABLE chicken( cID INT PRIMARY KEY, eID INT); CREATE TABLE egg( eID INT PRIMARY KEY, cID INT); and then add the foreign keys: ALTER TABLE chicken ADD CONSTRAINT c_e FOREIGN KEY (eID) REFERENCES egg(eID); ALTER TABLE egg ADD CONSTRAINT e_c FOREIGN KEY (cID) REFERENCES chicken(cID);
  • 8. Foreign Key Constraints (p5). Chicken-and-egg insertion: INSERT INTO chicken VALUES(1, 1001); INSERT INTO egg VALUES(1001, 1); This still consistently fails, so we need a way to postpone constraint checking. How long to postpone? Until transaction commit. Solution: define the tables with deferred constraint checking: ALTER TABLE chicken ADD CONSTRAINT c_e FOREIGN KEY (eID) REFERENCES egg(eID) INITIALLY DEFERRED DEFERRABLE; ALTER TABLE egg ADD CONSTRAINT e_c FOREIGN KEY (cID) REFERENCES chicken(cID) INITIALLY DEFERRED DEFERRABLE; And then: INSERT INTO chicken VALUES(1, 1001); INSERT INTO egg VALUES(1001, 1); COMMIT;
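Deferred checking can be observed in a short sketch. It assumes SQLite via Python's sqlite3 module, which differs from the slides in two ways: SQLite has no ALTER TABLE ... ADD CONSTRAINT, so the constraints are declared inside CREATE TABLE, and SQLite accepts a forward reference to a table that does not exist yet. The second transaction is an added illustration, not part of the slides.

```python
import sqlite3

# isolation_level=None: transactions are controlled with explicit BEGIN/COMMIT
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE chicken (cID INT PRIMARY KEY, "
    "eID INT REFERENCES egg(eID) DEFERRABLE INITIALLY DEFERRED)"
)
conn.execute(
    "CREATE TABLE egg (eID INT PRIMARY KEY, "
    "cID INT REFERENCES chicken(cID) DEFERRABLE INITIALLY DEFERRED)"
)

# Both rows go in within one transaction; checking waits until COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO chicken VALUES (1, 1001)")  # egg 1001 does not exist yet
conn.execute("INSERT INTO egg VALUES (1001, 1)")
conn.execute("COMMIT")  # both references resolve, so the commit succeeds

# A transaction that still has a dangling reference fails at COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO chicken VALUES (2, 2002)")
try:
    conn.execute("COMMIT")
    dangling_commit_failed = False
except sqlite3.IntegrityError:
    dangling_commit_failed = True
    conn.execute("ROLLBACK")

chickens = conn.execute("SELECT cID, eID FROM chicken").fetchall()
eggs = conn.execute("SELECT eID, cID FROM egg").fetchall()
```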
  • 9. Attribute-Based Constraints. NOT NULL is the most common. Reasonability constraints validate incoming data, e.g., population density <= 30000. Specification: Population INT(11) NOT NULL CHECK (Population <= 30000 * SurfaceArea), The condition in CHECK(cond) can be anything that a condition in WHERE(cond) can be, including subqueries. The attribute constraint is checked only when the attribute is assigned, so it can be violated underneath as long as it is not re-evaluated. For example, if we update SurfaceArea, the violation won't be flagged. Not implemented in all databases, e.g., MySQL before version 8.0.16, which parsed CHECK clauses but ignored them.
  • 10. Tuple-Based Constraints. Validate the entire tuple whenever anything in that tuple is updated. More integrity enforcement than with attribute-based constraints, e.g., population density <= 30000. Specification: Population INT(11) NOT NULL, CHECK (Population <= 30000 * SurfaceArea), The condition in CHECK(cond) can be anything that a condition in WHERE(cond) can be, including subqueries. The tuple constraint is checked whenever the tuple is inserted or updated: if we update SurfaceArea, the violation will be flagged. But a violation of CHECK (Population > ( SELECT SUM(Population) FROM City WHERE City.CountryCode = Code)), which specifies a subquery involving another table, will not be flagged when that other table changes. Not implemented in all databases, e.g., MySQL before version 8.0.16.
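A tuple-based CHECK can be exercised as follows. This is a sketch assuming SQLite via Python's sqlite3 module; the Country table is reduced to three columns and the Finland figures are illustrative sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Country (
        Code        TEXT PRIMARY KEY,
        SurfaceArea REAL NOT NULL,
        Population  INTEGER NOT NULL,
        CHECK (Population <= 30000 * SurfaceArea)  -- tuple-based constraint
    )
    """
)
conn.execute("INSERT INTO Country VALUES ('FIN', 338145.0, 5171300)")
conn.commit()

def violates(sql):
    """Return True if the statement is rejected by the CHECK constraint."""
    try:
        with conn:
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Shrinking the area pushes the density over the limit, so the update is flagged
# even though Population itself was not touched.
shrink_area = violates("UPDATE Country SET SurfaceArea = 100 WHERE Code = 'FIN'")
# A small population change keeps the tuple within the limit.
grow_population = violates(
    "UPDATE Country SET Population = Population + 1 WHERE Code = 'FIN'"
)
```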
  • 11. Assertions. Validate the entire database whenever anything in the database is updated. Part of the database, not of any specific table. Specification (table-like): CREATE ASSERTION CountryPop CHECK ( NOT EXISTS (SELECT * FROM Country WHERE Population < (SELECT SUM(Population) FROM City WHERE City.CountryCode = Code))) Difficult to implement efficiently and often not implemented; I don't know of any implementations. Can be implemented for specific cases using triggers, see Section 7.5.
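One specific case of the CountryPop assertion can be approximated with a trigger. The sketch below assumes SQLite via Python's sqlite3 module; the trigger name CityPopInsert, the reduced schema, and the sample numbers are hypothetical. It covers only inserts into City, so it is not a full substitute for the assertion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Country (Code TEXT PRIMARY KEY, Population INTEGER NOT NULL);
    CREATE TABLE City (Name TEXT, CountryCode TEXT, Population INTEGER NOT NULL);

    -- Covers one case only: inserting a city. Updates to City or Country
    -- would need triggers of their own.
    CREATE TRIGGER CityPopInsert AFTER INSERT ON City
    WHEN (SELECT Population FROM Country WHERE Code = NEW.CountryCode)
         < (SELECT SUM(Population) FROM City WHERE CountryCode = NEW.CountryCode)
    BEGIN
        SELECT RAISE(ABORT, 'CountryPop violated');
    END;

    INSERT INTO Country VALUES ('FIN', 1000);
    INSERT INTO City VALUES ('Helsinki', 'FIN', 600);
    """
)

def try_insert_city(row):
    """Return True if the city is accepted, False if the trigger aborts it."""
    try:
        with conn:
            conn.execute("INSERT INTO City VALUES (?, ?, ?)", row)
        return True
    except sqlite3.DatabaseError:
        return False

espoo_ok = try_insert_city(("Espoo", "FIN", 300))      # city total 900, within 1000
tampere_ok = try_insert_city(("Tampere", "FIN", 200))  # city total would be 1100
city_total = conn.execute("SELECT SUM(Population) FROM City").fetchone()[0]
```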
  • 12. Views. Also called virtual views. They don't actually exist in the database but behave as if they do. Can be subsets of the data or joins; actually, arbitrary queries. Subset example: CREATE VIEW ct AS SELECT c.Name AS nm, c.countrycode AS cntry FROM city c WHERE population > 0; Join example: CREATE VIEW CityLanguage AS SELECT city.name, city.countrycode, lang.language AS Language FROM city, countrylanguage AS lang WHERE city.countrycode = lang.countrycode AND lang.isOfficial = 'T';
  • 13. Operations on Views (p1). SELECT: SELECT * FROM CityLanguage WHERE Language='Dutch'; The database shouldn't 'temporarily' create the table and SELECT from it. It should use the definition of CityLanguage to make a query, i.e., SELECT * FROM (SELECT …blabla… FROM city, countrylanguage AS lang WHERE city.countrycode = lang.countrycode AND lang.isOfficial = 'T') WHERE Language='Dutch';
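The equivalence between querying the view and querying its expanded definition can be checked directly. This sketch assumes SQLite via Python's sqlite3 module, uses the CityLanguage definition from the Views slide for the inner query, and fills the tables with a few made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE city (name TEXT, countrycode TEXT);
    CREATE TABLE countrylanguage (countrycode TEXT, language TEXT, isOfficial TEXT);
    INSERT INTO city VALUES ('Amsterdam', 'NLD'), ('Helsinki', 'FIN');
    INSERT INTO countrylanguage VALUES
        ('NLD', 'Dutch', 'T'), ('NLD', 'Fries', 'F'), ('FIN', 'Finnish', 'T');

    CREATE VIEW CityLanguage AS
        SELECT city.name, city.countrycode, lang.language AS Language
        FROM city, countrylanguage AS lang
        WHERE city.countrycode = lang.countrycode AND lang.isOfficial = 'T';
    """
)

# Query written against the view.
via_view = conn.execute(
    "SELECT * FROM CityLanguage WHERE Language = 'Dutch'"
).fetchall()

# The same query with the view name replaced by its definition.
expanded = conn.execute(
    """
    SELECT * FROM (
        SELECT city.name, city.countrycode, lang.language AS Language
        FROM city, countrylanguage AS lang
        WHERE city.countrycode = lang.countrycode AND lang.isOfficial = 'T'
    ) WHERE Language = 'Dutch'
    """
).fetchall()
```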
  • 14. Operations on Views (p2). UPDATE and INSERT are not always possible, though they can sometimes be implemented using INSTEAD OF triggers. Modifications are permitted when the view is derived from a single table R and: the WHERE clause does not involve R in a subquery; the FROM clause consists of only one occurrence of R; the values of all attributes not specified in the view definition can be 'manufactured' by the database. Example: for the view ct, CREATE VIEW ct AS SELECT c.Name AS nm, c.countrycode AS cntry FROM city c WHERE population > 0, the query INSERT INTO ct (nm, cntry) VALUES ('FirSPA', 'FIN') can be automatically rewritten as INSERT INTO CITY (Name, CountryCode) VALUES ('FirSPA', 'FIN').
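The rewrite of an INSERT on ct into an INSERT on city can be sketched with an INSTEAD OF trigger. The example assumes SQLite via Python's sqlite3 module, where views are read-only unless such a trigger exists; the trigger name ct_insert, the ID column, and the default of 0 for Population are assumptions standing in for the values the database 'manufactures'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE city (
        ID          INTEGER PRIMARY KEY,         -- generated by the database
        Name        TEXT NOT NULL,
        CountryCode TEXT NOT NULL,
        Population  INTEGER NOT NULL DEFAULT 0   -- default supplied by the database
    );
    CREATE VIEW ct AS
        SELECT c.Name AS nm, c.CountryCode AS cntry
        FROM city c WHERE Population > 0;

    -- Rewrites an insert on the view into an insert on the base table.
    CREATE TRIGGER ct_insert INSTEAD OF INSERT ON ct
    BEGIN
        INSERT INTO city (Name, CountryCode) VALUES (NEW.nm, NEW.cntry);
    END;
    """
)

with conn:
    conn.execute("INSERT INTO ct (nm, cntry) VALUES ('FirSPA', 'FIN')")

base_rows = conn.execute("SELECT Name, CountryCode, Population FROM city").fetchall()
view_rows = conn.execute("SELECT nm, cntry FROM ct").fetchall()
```

With these assumptions the new row lands in city with Population 0, so it does not satisfy the view's own WHERE clause and is not visible through ct. That is a consequence of the default chosen here, and a reminder that manufactured values decide whether an inserted row shows up in the view.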
  • 15. Top-Down Datalog Recursion Revisited IDB’s are conceptualized (and implemented) as Views for IDB predicate p(x,y, …) FOR EACH subgoal of p DO IF subgoal is IDB, recursive call; IF subgoal is EDB, look up
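The loop on this slide can be made concrete with a tiny top-down evaluator. This is a sketch in plain Python for a hypothetical ancestor predicate over a made-up parent relation; it assumes the stored facts are acyclic.

```python
# EDB: stored facts, here a parent relation (hypothetical sample data).
PARENT = {("ann", "bob"), ("bob", "cat"), ("cat", "dan")}

def parent(x):
    """EDB subgoal: a plain lookup in the stored relation."""
    return {child for (p, child) in PARENT if p == x}

def ancestor(x):
    """IDB predicate, evaluated top-down:
         ancestor(X, Y) :- parent(X, Y).
         ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).
    Assumes the parent relation is acyclic; cyclic data would need
    bookkeeping of goals already in progress to guarantee termination."""
    found = set(parent(x))      # rule 1: the subgoal is EDB, so look it up
    for z in parent(x):         # rule 2: EDB lookup for parent(X, Z), then
        found |= ancestor(z)    # a recursive call for the IDB subgoal
    return found

descendants_of_ann = ancestor("ann")
descendants_of_dan = ancestor("dan")
```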
  • 16. Indexes. Main idea: data structures for fast search. Motivation: avoiding a linear search through a big table. Example query: SELECT * FROM City WHERE CountryCode = 'FIN'; Another: SELECT * FROM City WHERE Population > (0.4 * ( SELECT Population FROM Country WHERE CountryCode = Code)); Expected time without indexes for the first example: O(n). For the second, O(n^2). Declaration: CREATE INDEX CityIndex ON City(CountryCode); CREATE INDEX CityPopIndex ON City(Population); CREATE INDEX CountryPopIndex ON Country(Population);
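Whether the first example query stops being a linear search once CityIndex exists can be checked by asking the database for its plan. The sketch assumes SQLite via Python's sqlite3 module and its EXPLAIN QUERY PLAN statement, with synthetic rows; the exact wording of the plan text varies between SQLite versions, so only keywords are inspected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE City (Name TEXT, CountryCode TEXT, Population INTEGER)")
conn.executemany(
    "INSERT INTO City VALUES (?, ?, ?)",
    [(f"city{i}", "FIN" if i % 100 == 0 else "XXX", i) for i in range(10_000)],
)

QUERY = "SELECT * FROM City WHERE CountryCode = 'FIN'"

def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text

plan_before = plan(QUERY)   # expected to describe a full scan of City
conn.execute("CREATE INDEX CityIndex ON City(CountryCode)")
plan_after = plan(QUERY)    # expected to describe a search using CityIndex

matches = len(conn.execute(QUERY).fetchall())
```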
  • 17. Selection of Indexes (p1) Why not create an index for every attribute? Useful indexes, and not so useful ones Primary key? Unique key? From previous examples, CityIndex? CityPopIndex? CountryPopIndex?
  • 18. Selection of Indexes (p2) The Mantra: Don’t define indexes too early: know your workload first Be as empirical as is practical The Greedy approach to index selection: Start with no indexes Evaluate candidate indexes, choose the one potentially most effective Repeat Query execution will take advantage of defined indexes
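The greedy approach can be sketched in a few lines. Everything numeric here is an assumption made for illustration: the table size, the workload frequencies, and the toy cost model (log2(n) for an indexed lookup, n for a scan, and log2(n) per index for each insert). A real selection would measure the actual workload, in keeping with the advice to be empirical.

```python
import math

N_ROWS = 1_000_000  # hypothetical table size

# Hypothetical workload: (kind, column, executions per day).
WORKLOAD = [
    ("select", "CountryCode", 500),
    ("select", "Population", 20),
    ("select", "Name", 1),
    ("insert", None, 100_000),
]
CANDIDATES = ["CountryCode", "Population", "Name"]

def workload_cost(indexes):
    """Toy cost model (an assumption, not a real optimizer): an indexed lookup
    costs log2(n), an unindexed one scans n rows, and every insert pays
    log2(n) for each index it has to maintain."""
    lookup = math.log2(N_ROWS)
    total = 0.0
    for kind, column, freq in WORKLOAD:
        if kind == "select":
            total += freq * (lookup if column in indexes else N_ROWS)
        else:
            total += freq * (1 + lookup * len(indexes))
    return total

def greedy_select(candidates):
    """Start with no indexes; repeatedly add the candidate that lowers the
    workload cost the most; stop when no candidate helps."""
    chosen = []
    while True:
        base = workload_cost(set(chosen))
        gains = {
            c: base - workload_cost(set(chosen) | {c})
            for c in candidates
            if c not in chosen
        }
        if not gains:
            break
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break
        chosen.append(best)
    return chosen

selected = greedy_select(CANDIDATES)
```

Under these made-up numbers the index on Name is never selected, because maintaining it on every insert costs more than the single daily lookup it would speed up.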
  • 19. CS 542 Database Management Systems Report Proposals J Singh January 31, 2011
  • 20. Report Proposals – General Observations Simply Impressive! Corrective Themes When in doubt, prefer depth over breadth Tilt the balance toward obtaining and working with real data Focus on your contributions Separate the report from the project If your intent in the project is to do a significant piece of development, make the report about the design Go light on implementation; toy application is good to get your feet wet but leave the heavy lifting for the project For big papers, don’t try to swallow it whole. Take a piece and focus on that.
  • 21. Next meeting February 7 Index Structures, Chapter 14