Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

1,465 views

Published on

Toon Koppelaars presentation on Oracle database constraints

Published in: Software
  • Be the first to comment

Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

  1. 1. Why SQL DBMS’s Still Lack Full Declarative Constraint Support (A polite excuse) Toon Koppelaars
  2. 2. Who am I? •A database guy, relational dude –Developer (with DBA experience) •Oracle technology user since 1987 •Co-author of this book  •Today’s talk = last chapter
  3. 3. Agenda •SQL Assertions? –A few examples •Validation Execution Models –More efficient along the way •Serializability – Probably the biggest problem •Conclusions
  4. 4. SQL Assertions? •Examples 1.There is at most one president 2.One cannot manage more than two departments 3.Department with president and/or manager should have an administrator  Data integrity constraints –Just like: CHECK, PK, UK, FK –‘all the other ones’ They constrain data allowed in our tables They constrain data allowed in our tables
  5. 5. Syntax? •Create assertion [some_name] as check([some_SQL_expression]); •For over 25 years part of Ansi/Iso SQL standard
  6. 6. Example 1 •create assertion at_most_one_president as check (1 => (select count(*) from EMP e where e.JOB = ‘PRESIDENT’ ) ); •Task of DBMS: make sure it’s true at all times - Closed SQL expression (no free variables) - Evaluates to true or false
  7. 7. Example 2 •create assertion cannot_manage_more_than_2 as check (not exists (select ‘x’ from (select d.MGR ,count(*) as cnt from DEPT d group by d.MGR) where cnt > 2 ) );
  8. 8. Example 3 •create assertion admin_in_dept_with_vip as check (not exists (select ‘x’ from (select distinct e.DEPTNO from EMP e) e where exists (select ‘y’ from EMP e1 where e1.DEPTNO = e.DEPTNO and e1.JOB in (‘P’,’M’)) and not exists (select ‘z’ from EMP e2 where e2.DEPTNO = e.DEPTNO and e2.JOB = ‘ADMIN’) ) );
  9. 9. •Can this be the killer feature of OracleXIII?
  10. 10. Imagine... •If DBMS vendor would support these –How much less lines of code one would need to write in application development –How much less bugs this results into –How easy we could accomodate change requested by the business –How data quality would improve
  11. 11. Point to be made •SQL assertions ‘cover’ all the other declarative constraints available –CHECK  can be written as assertion –PK/UK  can be written as assertion –FK  can be written as assertion •We just have shorthands for these, since these are so common in every database design
  12. 12. Check writen as assertion •create assertion hire_only_18_plus as check (not exists (select ‘x’ from EMP e where e.HIRED - e.BORN < 18*365 ) ); CHECK((HIRED – BORN) >= 18*365)
  13. 13. PK/UK written as assertion •create assertion empno_is_unique as check (not exists (select ‘x’ from EMP e1 ,EMP e2 where e1.EMPNO = e2.EMPNO and e1.rowid != e2.rowid ) ); PRIMARY KEY (EMPNO)
  14. 14. FK written as assertion •create assertion work_in_known_dept as check (not exists (select ‘x’ from EMP e where not exists (select ‘y’ from DEPT d where d.deptno = e.deptno ) ) ); FOREIGN KEY (DEPTNO) REFERENCES DEPT(DEPTNO)
  15. 15. Scientific Problem •Why is CREATE ASSERTION not supported? •Think about this: –The DBMS has our assertion expression but: •Would you accept full evaluation of given expression for every insert/update/delete? –No. •Performance would be horrible in real world db’s •You require a much more sophisticated execution model in the real world
  16. 16. Scientific Problem •The challenge is: –Given: arbitrary complex assertion expression + –Given: arbitrary complex SQL statement  –What is the minimum required check that needs to be executed by the DBMS for the assertion to remain true? •Can be ‘nothing’ if assertion is immune for the given DML
  17. 17. For instance... •Department that employs president or manager should also employ administrator –Obviously only when DML operates on EMP 1.Insert into emp values(100,’Smith’,’SALESREP’ ,’1/1/80’,’20/11/07’,7000,20); 2.Insert into emp values(:b0,:b1,:b2 ,:b3,:b4,:b5,:b6); 3.Insert into emp (select * from emp_loaded); 4.Update emp set msal=1.05*msal; 5.Delete from emp where empno=101;
  18. 18. Execution Models •Using triggers: –EM1: always –EM2: involved tables –EM3: involved columns –EM4: table polarities –EM5: involved literals (TE) –EM6: TE + minimal check
  19. 19. Execution Model 1 •Evaluate every assertion on every DML statement –Evaluating every <boolean expression with subqueries> –On every DML statement •In after-statement table trigger –WOULD BE *VERY* INEFFICIENT •Let’s quickly forget this “EM1”
  20. 20. Execution Model 2 •Only evaluate the assertions that involve the table that is being “DML-led” –Finding involved tables by parsing the assertion expression –Per assertion 100% code generation of a (insert/update/delete) “after statement” table trigger •Select from dual where [expression] •Found?  OK •Not found  raise_application_error
  21. 21. Execution Model 2 •Using examples 2 and 3: –Admin_in_dept_with_vip  EMP table –Cannot_manage_more_than_2  DEPT table
  22. 22. Execution Model 2 create trigger EMP_AIUDS_EMP01 after insert or update or delete on EMP declare pl_dummy varchar(40); begin -- select 'Constraint EMP01 is satisfied' into pl_dummy from DUAL where not exists(select ‘a violation’ from (select distinct deptno from EMP) d where exists(select e2.* from EMP e2 where e2.DEPTNO = d.DEPTNO and e2.JOB in ('PRESIDENT','MANAGER')) and not exists(select e3.* from EMP e3 where e3.DEPTNO = d.DEPTNO and e3.JOB = 'ADMIN')); -- exception when no_data_found then raise_application_error(-20999,'Constraint EMP01 is violated.'); end; Assertion predicate
  23. 23. Execution Model 2 create trigger DEPT_AIUDS_DEPT01 after insert or update or delete on DEPT declare pl_dummy varchar(1); begin -- select ‘a' into pl_dummy from DUAL where not exists (select ‘a violation’ from (select d.MGR ,count(*) as cnt from DEPT d group by d.MGR) where cnt > 2)); -- exception when no_data_found then raise_application_error(-20999,'Constraint DEPT01 is violated.'); end; EM2 could be supported declaratively
  24. 24. Execution Model 2 •Inefficiencies: –EMP01 and DEPT01 are checked when updating columns that are not involved, for instance: •Updating EMP.ENAME •Updating DEPT.LOC
  25. 25. Execution Model 3 •For inserts + deletes: EM3 == EM2 •For updates: only evaluate assertions that involve *columns* being changed –Simple parse will find columns –Assumes ‘clean specification’ •Create trigger syntax allows specification of columns that are being changed
  26. 26. Execution Model 3 create trigger EMP_AUS_EMP01 after update of DEPTNO,JOB on EMP declare pl_dummy varchar(40); Begin -- select 'Constraint EMP01 is satisfied' into pl_dummy from DUAL where not exists(select ‘department in violation’ from (select distinct deptno from EMP) d where exists(select e2.* from EMP e2 where e2.DEPTNO = d.DEPTNO and e2.JOB in ('PRESIDENT','MANAGER')) and not exists(select e3.* from EMP e3 where e3.DEPTNO = d.DEPTNO and e3.JOB = 'ADMIN')); -- exception when no_data_found then raise_application_error(-20999,'Constraint EMP01 is violated.'); end;
  27. 27. Execution Model 3 create trigger DEPT_AIUDS_DEPT01 after update of MGR on DEPT declare pl_dummy varchar(40); begin -- select ‘a' into pl_dummy from DUAL where not exists (select ‘x’ from (select d.MGR ,count(*) as cnt from DEPT d group by d.MGR) where cnt > 2)); -- exception when no_data_found then raise_application_error(-20999,'Constraint DEPT01 is violated.'); end; EM3 could be supported declaratively
  28. 28. Execution Model 3 •Inefficiencies: –Sometimes inserts (e)or deletes can never violate a constraint –For Cannot_manage_more_than_2, deleting a department does not require re-validation –For Admin_in_dept_with_vip, both inserts and deletes do require re- validation
  29. 29. Execution Model 4 •For updates: EM4 = EM3 •For deletes and inserts EM4 drops unnecessary delete (e)or insert table triggers –Polarity of a table for a given constraint •Positive: inserts can violate •Negative: deletes can violate •Neutral: both can violate •Undefined: table is not involved Polarity of table for given constraint can be computed via special parsing
  30. 30. Execution Model 4 •EM4 maintains: –All EM3 triggers, except for one: •Drops: –Cannot_manage_more_than_2 delete trigger EM4 could be supported declaratively
  31. 31. Execution Model 4 •Inefficiencies: –Admin_in_dept_with_vip: eg. inserting SALESMAN, or deleting TRAINER does not require re-validation •Start involving literals mentioned in assertions •If assertion does not have literals then next EM is same as EM4 –Cannot_manage_more_than_2 has no literals
  32. 32. Execution Model 5 •How do we see: –Salesman is inserted? Trainer deleted? •Could parse the DML statement –But does not always work due to absence of literals •Make use of column values of affected rows –Requires “Transition Effect” (TE) of a DML statement •Common concept (aka. “delta”-tables) •Inserted_rows, Updated_rows, Deleted_rows –Maintaining TE is straightforward (see book) •EM5: Only check the assertion when a property holds in the TE
  33. 33. Execution Model 5 •TE property for Admin_in_dept_with_vip: 1.Inserted_rows holds a president or a manager or, 2.Deleted_rows holds an admininstrator 3.Updated rows shows that ...
  34. 34. Execution Model 5 create trigger EMP_AIS_EMP01 after insert on EMP declare pl_dummy varchar(40); begin -- If this returns no rows, then EMP01 cannot be violated. select 'EMP01 must be validated' into pl_dummy from DUAL where exists (select 'A president or manager has just been inserted' from inserted_rows where JOB in ('PRESIDENT','MANAGER')); -- begin -- <same trigger code as EM4> -- end; exception when no_data_found then -- No need to validate EMP01. null; -- end;
  35. 35. Execution Model 5 create trigger EMP_ADS_EMP01 after delete on EMP declare pl_dummy varchar(40); begin -- If this returns no rows, then EMP01 cannot be violated. select 'EMP01 must be validated' into pl_dummy from DUAL where exists (select 'An administrator has just been deleted' from deleted_rows where JOB = 'ADMIN'); -- begin -- <same trigger code as EM4> -- end; exception when no_data_found then -- No need to validate EMP01. null; -- end;
  36. 36. Execution Model 5 •Update TE-property for EMP01: select 'EMP01 is in need of validation' from DUAL where exists (select 'Some department just won a president/ manager or just lost an administrator' from updated_rows where (n_job in ('PRESIDENT','MANAGER') and o_job not in ('PRESIDENT','MANAGER') or (o_job='ADMIN' and n_job<>'ADMIN') or (o_deptno<>n_deptno and (o_job='ADMIN' or n_job in ('PRESIDENT','MANAGER'))) •Can be deduced from insert + delete properties •EM5 fully declarative too? –Here it gets complex...
  37. 37. Execution Model 5 •Inefficiencies: –Admin_in_dept_with_vip: triggers validate all departments •Unacceptable in real-world databases •Only some require re-validation –Cannot_manage_more_than_2: triggers validate all department managers •Unacceptable in real-world databases •Only some require re-validation
  38. 38. Execution Model 6 •On TE-property + optimized validation query –Use the TE-query to find: •Which deptno-values require re-validation •Which mgr-values require re-validation –Then use these values in the assertion- expression
  39. 39. create trigger EMP_AIS_EMP01 after insert on EMP declare pl_dummy varchar(40); begin -- for r in (select distinct deptno from inserted_rows where JOB in ('PRESIDENT','MANAGER')); loop begin -- Note: this now uses r.deptno value from preceeding TE-query. select 'Constraint EMP01 is satisfied' into pl_dummy from DUAL where not exists(select ‘department in violation’ from (select distinct deptno from EMP where deptno = r.deptno) d where exists(select e2.* from EMP e2 where e2.DEPTNO = d.DEPTNO and e2.JOB in ('PRESIDENT','MANAGER')) and not exists(select e3.* from EMP e3 where e3.DEPTNO = d.DEPTNO and e3.JOB = 'ADMIN')); -- exception when no_data_found then -- raise_application_error(-20999, 'Constraint EMP01 is violated for department '||to_char(r.deptno)||'.'); -- end; end loop; end;
  40. 40. Execution Model 6 •This requires: –Detecting that the ASSERTION can be (re)written as a universal quantification –Can sometimes be done in multiple ways •Which to choose? –EM6 fully declarative? •Complexity introduced in EM5 further increases •Remember: given any arbitrary complex assertion + dml-statement
  41. 41. Still not there yet... •Then there is something else too... –Which is often overseen by database professionals –And, which is neglected in every research paper (I’ve read...) that deals with generating constraint validation code  Serializability
  42. 42. Concurrent Transactions •Deptno 13 has two admins and one manager –TX1 deletes an admin from 13 •Does not yet commit –TX2 deletes the other admin 13 •Commits –TX1 commits •Constraint is violated for deptno 13! –TX1 and TX2 must be serialized
  43. 43. Concurrent Transactions •Note: this is *not* about locking rows of data, but rather: locking a constraint No two TX’s can validate at same time •We can use DBMS_LOCK to serialize these transactions –See book for example code •Again complexity further increases
  44. 44. Concurrent Transactions •Concurrency impact of acquiring rule locks –EM1: One TX at a time –EM2: One TX per table at a time –EM3, EM4, EM5 slowly relaxes •Up to EM5: not acceptable –EM6: Only if two TX’s actually validate *and* involve same deptno (EMP01 assertion) •Acceptable
  45. 45. Another complicating factor •Deferrabilty... –Involves temporarily storing violation cases for re-evaluation at commit time –More comments on that in chapter 11 of the book
  46. 46. The Polite Excuse •Inefficient EM’s –Could be supported, but: •Are unacceptable wgt. Performance & transaction concurrency •Efficient EM’s –Aligned with business requirements •Performance and TX concurrency But, –Need more research to determine if they could be declaratively supported
  47. 47. Questions? •For slidedeck email me at toon@rulegen.com •Twitter @ToonKoppelaars

×