SlideShare a Scribd company logo
1 of 250
Download to read offline
SQL Antipatterns
                           Strike Back
                             Bill Karwin

                                  1
Monday, April 20, 2009                      1
Antipattern Categories
                          Database Design                 Database Creation
                           Antipatterns                     Antipatterns
                                                          CREATE TABLE BugsProducts (
                                                             bug_id INTEGER REFERENCES Bugs,
                                                             product VARCHAR(100) REFERENCES Products,
                                                             PRIMARY KEY (bug_id, product)
                                                          );




                                 Query                           Application
                               Antipatterns                      Antipatterns
                         SELECT b.product, COUNT(*)       $dbHandle = new PDO(‘mysql:dbname=test’);
                         FROM BugsProducts AS b           $stmt = $dbHandle->prepare($sql);
                         GROUP BY b.product;              $result = $stmt->fetchAll();




                                                      2
Monday, April 20, 2009                                                                                   2
Antipattern Categories
                                                          Database Creation
                          Database Design
                                                            Antipatterns
                           Antipatterns
                                                          CREATE TABLE BugsProducts (
                                                             bug_id INTEGER REFERENCES Bugs,
                                                             product VARCHAR(100) REFERENCES Products,
                                                             PRIMARY KEY (bug_id, product)
                                                          );




                                                                 Application
                                 Query
                                                                 Antipatterns
                               Antipatterns
                         SELECT b.product, COUNT(*)       $dbHandle = new PDO(‘mysql:dbname=test’);
                         FROM BugsProducts AS b           $stmt = $dbHandle->prepare($sql);
                         GROUP BY b.product;              $result = $stmt->fetchAll();




                                                      3
Monday, April 20, 2009                                                                                   3
Database Design Antipatterns


                     1. Metadata Tribbles
                     2. Entity-Attribute-Value
                     3. Polymorphic Associations
                     4. Naive Trees



                                         4
Monday, April 20, 2009                              4
Metadata Tribbles
                         I want these things off the ship. I don’t care if it takes
                            every last man we’ve got, I want them off the ship.
                                                               — James T. Kirk



                                                     5
Monday, April 20, 2009                                                                5
Metadata Tribbles



                     • Objective:       improve performance
                         of a very large table.




                                           6
Monday, April 20, 2009                                        6
Metadata Tribbles


                     • Antipattern:        separate into many tables
                         with similar structure
                         •   Separate tables per distinct value in attribute

                         •   e.g., per year, per month, per user,
                             per postal code, etc.




                                                  7
Monday, April 20, 2009                                                         7
Metadata Tribbles

                     • Must create a new table for each new value
                         CREATE TABLE Bugs_2005 ( . . . );
                         CREATE TABLE Bugs_2006 ( . . . );
                         CREATE TABLE Bugs_2007 ( . . . );
                         CREATE TABLE Bugs_2008 ( . . . );
                                                              mixing data
                         ...
                                                             with metadata




                                                 8
Monday, April 20, 2009                                                       8
Metadata Tribbles

                     • Automatic primary keys cause conflicts:
                         CREATE TABLE Bugs_2005 (bug_id SERIAL . . . );
                         CREATE TABLE Bugs_2006 (bug_id SERIAL . . . );
                         CREATE TABLE Bugs_2007 (bug_id SERIAL . . . );
                         CREATE TABLE Bugs_2008 (bug_id SERIAL . . . );
                         ...
                                          same values allocated
                                            in multiple tables



                                                    9
Monday, April 20, 2009                                                    9
Metadata Tribbles

                     • Difficult to query across tables
                         SELECT b.status, COUNT(*) AS count_per_status
                         FROM (
                         
   SELECT * FROM Bugs_2009
                         
    UNION
                         
   SELECT * FROM Bugs_2008
                         
    UNION
                         
   SELECT * FROM Bugs_2007
                         
    UNION
                         
   SELECT * FROM Bugs_2006 ) AS b
                         GROUP BY b.status;



                                               10
Monday, April 20, 2009                                                   10
Metadata Tribbles


                     • Table structures are not kept in sync
                             ALTER TABLE Bugs_2009
                             
 ADD COLUMN hours NUMERIC;

                         •   Prior tables don’t contain new column

                         •   Dissimilar tables can’t be combined with UNION




                                                 11
Monday, April 20, 2009                                                        11
Metadata Tribbles
                     • Solution #1:            use horizontal partitioning
                         •   Physically split, while logically whole

                         •   MySQL 5.1 supports partitioning

                                                                BUGS
                                                                 (2007)


                                                                       BUGS
                             BUGS                                      (2008)


                                                                           BUGS
                                                                            (2009)


                                                   12
Monday, April 20, 2009                                                               12
Metadata Tribbles
                     • Solution #2:         use vertical partitioning
                         •   Move bulky and seldom-used columns to a
                             second table in one-to-one relationship

                                            ( 1 .. 1 )
                         PRODUCTS                          INSTALLERS




                                                13
Monday, April 20, 2009                                                  13
Metadata Tribbles


                     • Columns can also be tribbles:
                         CREATE TABLE Bugs (
                         
   bug_id

 
 SERIAL PRIMARY KEY,
                         
   ...
                         
   product_id1
 BIGINT,
                         
   product_id2
 BIGINT,
                         
   product_id3
 BIGINT
                         );




                                             14
Monday, April 20, 2009                                        14
Metadata Tribbles
                     • Solution #3:       add a dependent table
                         CREATE TABLE BugsProducts (
                         
   bug_id

 
 BIGINT REFERENCES bugs,
                         
   product_id
 BIGINT REFERENCES products,
                         
   PRIMARY KEY (bug_id, product_id)
                         );



                                           (1 .. *)     BUGS
                                 BUGS                 PRODUCTS




                                              15
Monday, April 20, 2009                                                 15
Entity-Attribute-Value
                                  If you try and take a cat apart to see how it works,
                         the first thing you have on your hands is a non-working cat.
                                                                — Richard Dawkins



                                                   16
Monday, April 20, 2009                                                                   16
Entity-Attribute-Value
                     • Objective: make a table with a variable
                           set of attributes
                  bug_id      bug_type   priority   description     severity       sponsor

                                                    crashes when      loss of
                    1234        BUG        high
                                                        saving     functionality


                    3456      FEATURE      low      support XML                    Acme Corp.




                                                     17
Monday, April 20, 2009                                                                          17
Entity-Attribute-Value

                • Antipattern: store all attributes in a
                         second table, one attribute per row
                         CREATE TABLE eav (
                         
 bug_id

 
 BIGINT NOT NULL,
                         
 attr_name

 VARCHAR(20) NOT NULL,
                         
 attr_value

 VARCHAR(100),
                         
 PRIMARY KEY (bug_id, attr_name),
                         
 FOREIGN KEY (bug_id) REFERENCES Bugs(bug_id)
                         );
                                            mixing data
                                           with metadata



                                                18
Monday, April 20, 2009                                                    18
Entity-Attribute-Value


                         bug_id   attr_name           attr_value
                          1234     priority                high
                          1234    description      crashes when saving
                          1234     severity        loss of functionality
                          3456     priority                low
                          3456    description         support XML
                          3456     sponsor            Acme Corp.



                                              19
Monday, April 20, 2009                                                     19
Entity-Attribute-Value
                     • Difficult to rely on attribute names

                         bug_id   attr_name        attr_value
                          1234      created        2008-04-01
                          3456    created_date     2008-04-01




                                              20
Monday, April 20, 2009                                          20
Entity-Attribute-Value
                     • Difficult to enforce data type integrity

                         bug_id   attr_name        attr_value
                          1234    created_date      2008-02-31
                          3456    created_date        banana




                                            21
Monday, April 20, 2009                                           21
Entity-Attribute-Value

                     • Difficult to enforce mandatory attributes
                         (i.e. NOT NULL)
                         •   SQL constraints apply to columns, not rows

                         •   No way to declare that a row must exist with a
                             certain attr_name value (‘created_date’)

                         •   Maybe create a trigger on INSERT for bugs?



                                                22
Monday, April 20, 2009                                                        22
Entity-Attribute-Value
                     • Difficult to enforce referential integrity for
                         attribute values
                         bug_id        attr_name           attr_value
                             1234         priority             new
                             3456         priority             fixed
                             5678         priority            banana


                         •    Constraints apply to all rows in the column, not
                              selected rows depending on value in attr_name


                                                     23
Monday, April 20, 2009                                                           23
Entity-Attribute-Value
                • Difficult to reconstruct a row of attributes:
                         SELECT b.bug_id,                                            need one JOIN
                         
   e1.attr_value AS created_date,                           per attribute
                         
   e2.attr_value AS priority,
                         
   e3.attr_value AS description,
                         
   e4.attr_value AS status,
                         
   e5.attr_value AS reported_by
                         FROM Bugs b
                         LEFT JOIN eav e1 ON (b.bug_id = e1.bug_id AND e1.attr_name = ‘created_date’)
                         LEFT JOIN eav e2 ON (b.bug_id = e2.bug_id AND e2.attr_name = ‘priority’)
                         LEFT JOIN eav e3 ON (b.bug_id = e3.bug_id AND e3.attr_name = ‘description’)
                         LEFT JOIN eav e4 ON (b.bug_id = e4.bug_id AND e4.attr_name = ‘status’)
                         LEFT JOIN eav e5 ON (b.bug_id = e5.bug_id AND e5.attr_name = ‘reported_by’);


                          bug_id created_date priority       description         status   reported_by
                           1234     2008-04-01    high    Crashes when I save.   NEW          Bill




                                                             24
Monday, April 20, 2009                                                                                  24
Entity-Attribute-Value


                     • Solution:         use metadata for metadata
                         •   Define attributes in columns

                         •   ALTER TABLE to add attribute columns

                         •   Define related tables for related types




                                                 25
Monday, April 20, 2009                                                25
Entity-Attribute-Value
                     • Solution #1:                Single Table Inheritance
                         •   One table with many columns

                         •   Columns are NULL when inapplicable
                         CREATE TABLE Issues (
                         
    issue_id
 
   
    SERIAL PRIMARY KEY,
                         
    created_date

     DATE NOT NULL,
                         
    priority
 
   
    VARCHAR(20),
                         
    description
 
     TEXT,
                         
    issue_type
 
      CHAR(1) CHECK (issue_type IN (‘B’, ‘F’)),
                         
    bug_severity
 
    VARCHAR(20),
                         
    feature_sponsor
   VARCHAR(100)
                         );




                                                        26
Monday, April 20, 2009                                                                       26
Entity-Attribute-Value
                     • Solution #2:           Concrete Table Inheritance
                         •   Define similar tables for similar types

                         •   Duplicate common columns in each table

           CREATE TABLE Bugs (                        CREATE TABLE Features (
           
  bug_id
 
    SERIAL PRIMARY KEY,        
  bug_id
 
    SERIAL PRIMARY KEY,
           
  created_date
DATE NOT NULL,             
  created_date
DATE NOT NULL,
           
  priority
 
  VARCHAR(20),               
  priority
 
  VARCHAR(20),
           
  description
 TEXT,                      
  description
 TEXT,
           
  severity
 
  VARCHAR(20)                
  sponsor
 
   VARCHAR(100)
           );                                         );




                                                 27
Monday, April 20, 2009                                                                      27
Entity-Attribute-Value
                     • Solution #2:         Concrete Table Inheritance
                         •   Use UNION to search both tables:
                             SELECT * FROM (
                             
 SELECT issue_id, description FROM Bugs
                             
 UNION ALL
                             
 SELECT issue_id, description FROM Features
                             ) unified_table
                             WHERE description LIKE ...




                                               28
Monday, April 20, 2009                                                      28
Entity-Attribute-Value
                     • Solution #3:              Class Table Inheritance
                         •   Common columns in base table
                         •   Subtype-specific columns in subtype tables
             CREATE TABLE Bugs (                         CREATE TABLE Features (
             
  issue_id
 BIGINT PRIMARY KEY,            
  issue_id
 BIGINT PRIMARY KEY,
             
  severity
 VARCHAR(20),                   
  sponsor
 VARCHAR(100),
             
  FOREIGN KEY (issue_id)                   
  FOREIGN KEY (issue_id)
             
    REFERENCES Issues (issue_id)           
    REFERENCES Issues (issue_id)
             );                                          );

                                   CREATE TABLE Issues (
                                   
  issue_id
 
  SERIAL PRIMARY KEY,
                                   
  created_date
DATE NOT NULL
                                   
  priority
 
  VARCHAR(20),
                                   
  description
 TEXT
                                   );
                                                    29
Monday, April 20, 2009                                                                       29
Entity-Attribute-Value
                     • Solution #3:          Class Table Inheritance
                         •   Easy to query common columns:
                             SELECT * FROM Issues
                             WHERE description LIKE ...

                         •   Easy to query one subtype at a time:
                             SELECT * FROM Issues
                             JOIN Bugs USING (issue_id);




                                                30
Monday, April 20, 2009                                                 30
Entity-Attribute-Value

                     • Appropriate usage of EAV:
                         •   If attributes must be fully flexible and dynamic

                         •   You must enforce constraints in application code

                         •   Don’t try to fetch one object in a single row

                         •   Consider non-relational solutions
                             for semi-structured data, e.g. RDF/XML



                                                 31
Monday, April 20, 2009                                                          31
Polymorphic Associations
                         Of course, some people do go both ways.
                                             — The Scarecrow




                                           32
Monday, April 20, 2009                                             32
Polymorphic Assocations
                • Objective: reference multiple parents
                                                  BUGS


                         COMMENTS


                                                FEATURES


                                     33
Monday, April 20, 2009                                     33
Polymorphic Assocations
                • Can’t make a FOREIGN KEY constraint
                         reference two tables:
                         CREATE TABLE Comments (
                         
 comment_id
 SERIAL PRIMARY KEY,
                         
 comment
 
 TEXT NOT NULL,
                         
 issue_type
 VARCHAR(15) CHECK
                                     
                         
 
 (issue_type IN (‘Bugs’, ‘Features’)),
                         
 issue_id
 
 BIGINT NOT NULL,
                         
 FOREIGN KEY issue_id REFERENCES
                         );
                                           you need this to be
                                            Bugs or Features



                                                       34
Monday, April 20, 2009                                               34
Polymorphic Assocations
                • Instead, you have to define table with no
                         FOREIGN KEY or referential integrity:
                         CREATE TABLE Comments (
                         
 comment_id
 SERIAL PRIMARY KEY,
                         
 comment
 
 TEXT NOT NULL,
                         
 issue_type
 VARCHAR(15) CHECK
                                     
                         
 
 (issue_type IN (‘Bugs’, ‘Features’)),
                         
 issue_id
 
 BIGINT NOT NULL
                         );




                                                   35
Monday, April 20, 2009                                               35
Polymorphic Assocations
                                comment
                                        comment issue_type issue_id
                                   id
             issue_id                                                                     issue_id
                                  6789        “It crashes”       Bugs       1234
                   ...                                                                         ...
                                  9876        “Great idea!”     Features    2345
                 1234                                                                       2345
                                                   Comments
                Bugs                                                                     Features



                                                  Query result:
                                                                    c.         b.          f.
                   comment_id comment issue_type
                                                                 issue_id   issue_id   issue_id
                         6789     “It crashes”       Bug           1234      1234       NULL
                                    “Great
                         9876                      Feature         2345      NULL       2345
                                     idea!”


                                                           36
Monday, April 20, 2009                                                                               36
Polymorphic Assocations
                • You can’t use a different table for each row.
                         You must name all tables explicitly.
                         SELECT * FROM Comments
                         JOIN
 
 
 
 USING (issue_id);

                                                    you need this to be
                                                     Bugs or Features




                                               37
Monday, April 20, 2009                                                    37
Polymorphic Assocations
                • Instead, join to each parent table:
                         SELECT *
                         FROM Comments c
                         LEFT JOIN Bugs b
 ON (c.issue_type = ‘Bugs’
                         
 AND c.issue_id = b.issue_id)
                         LEFT JOIN Features f ON (c.issue_type = ‘Features’
                         
 AND c.issue_id = f.issue_id);


                                                 you have to get
                                                these strings right




                                                    38
Monday, April 20, 2009                                                        38
Polymorphic Assocations
                • Solution #1:                 exclusive arcs
                         CREATE TABLE Comments (
                         
 comment_id
 SERIAL PRIMARY KEY,
                         
 comment
 
 TEXT NOT NULL,
                         
 bug_id

 
 BIGINT,                both columns are nullable;
                         
 feature_id

 BIGINT,             exactly one must be non-null
                         
 FOREIGN KEY bug_id
                         
 
 REFERENCES Bugs(bug_id)
                         
 FOREIGN KEY feature_id
                         
 
 REFERENCES Features(feature_id)
                         );




                                                    39
Monday, April 20, 2009                                                                     39
Polymorphic Assocations
                • Solution #1:             exclusive arcs
                         • Referential integrity is enforced
                         • But hard to make sure exactly one is non-null
                         • Queries are easier:
                           SELECT * FROM Comments c
                           LEFT JOIN Bugs b
 USING (bug_id)
                           LEFT JOIN Features f USING (feature_id);




                                                40
Monday, April 20, 2009                                                     40
Polymorphic Assocations
                • Solution #2:   reverse the relationship

                                 BUGS
                                                     BUGS
                               COMMENTS


                   COMMENTS



                                FEATURES
                                                    FEATURES
                               COMMENTS



                                     41
Monday, April 20, 2009                                         41
Polymorphic Assocations
                • Solution #2:                 reverse the relationship
                         CREATE TABLE BugsComments (
                         
  comment_id
 BIGINT NOT NULL,
                         
  bug_id
 
   BIGINT NOT NULL,
                         
  PRIMARY KEY (comment_id),
                         
  FOREIGN KEY (comment_id) REFERENCES Comments(comment_id),
                         
  FOREIGN KEY (bug_id) REFERENCES Bugs(bug_id)
                         );

                         CREATE TABLE FeaturesComments (
                         
  comment_id
 BIGINT NOT NULL,
                         
  feature_id
 BIGINT NOT NULL,
                         
  PRIMARY KEY (comment_id),
                         
  FOREIGN KEY (comment_id) REFERENCES Comments(comment_id),
                         
  FOREIGN KEY (feature_id) REFERENCES Features(feature_id)
                         );



                                                     42
Monday, April 20, 2009                                                                  42
Polymorphic Assocations
                • Solution #2:                   reverse the relationship
                         •   Referential integrity is enforced
                         •   Query comments for a given bug:
                             SELECT * FROM BugsComments b
                             JOIN Comments c USING (comment_id)
                             WHERE b.bug_id = 1234;

                         •   Query bug/feature for a given comment:
                             SELECT * FROM Comments
                             LEFT JOIN (BugsComments JOIN Bugs USING (bug_id))
                             
 USING (comment_id)
                             LEFT JOIN (FeaturesComments JOIN Features USING (feature_id))
                             
 USING (comment_id)
                             WHERE comment_id = 9876;



                                                       43
Monday, April 20, 2009                                                                       43
Polymorphic Assocations
                • Solution #3:   use a base parent table

                     BUGS         ISSUES              FEATURES




                                 COMMENTS



                                     44
Monday, April 20, 2009                                           44
Polymorphic Assocations
                • Solution #3:                   use a base parent table
                         CREATE TABLE Issues (
                         
  issue_id
 
  SERIAL PRIMARY KEY
                         );
                         CREATE TABLE Bugs (
                         
  issue_id
 
 BIGINT PRIMARY KEY,
                         
  ...
                         
  FOREIGN KEY (issue_id) REFERENCES Issues(issue_id)
                         );
                         CREATE TABLE Comments (
                         
  comment_id
 SERIAL PRIMARY KEY,
                         
  comment
    TEXT NOT NULL,
                         
  issue_id
 
 BIGINT NOT NULL,
                         
  FOREIGN KEY (issue_id) REFRENCES Issues(issue_id)
                         );



                                                        45
Monday, April 20, 2009                                                           45
Polymorphic Assocations
                • Solution #3:              use a base parent table
                         • Referential integrity is enforced
                         • Queries are easier:
                            SELECT * FROM Comments
                            JOIN Issues USING (issue_id)
                            LEFT JOIN Bugs USING (issue_id)
                            LEFT JOIN Features USING (issue_id);




                                                 46
Monday, April 20, 2009                                                46
Polymorphic Assocations
                • Enforcing disjoint subtypes:
                         CREATE TABLE Issues (
                         
  issue_id
 
  SERIAL PRIMARY KEY,
                         
  issue_type
 CHAR(1) NOT NULL CHECK (issue_type IN (‘B’, ‘F’)),
                         
  UNIQUE KEY (issue_id, issue_type)
                         );

                                                                            referential integrity
                         CREATE TABLE Bugs (
                         
  issue_id
 
 BIGINT PRIMARY KEY,
                         
  issue_type
 CHAR(1) NOT NULL CHECK (issue_type = ‘B’),
                         
  ...
                         
  FOREIGN KEY (issue_id, issue_type)
                         
  
    REFERENCES Issues(issue_id, issue_type)
                         );



                                                        47
Monday, April 20, 2009                                                                              47
Naive Trees



                              48
Monday, April 20, 2009                 48
Naive Trees


                • Objective:               store/query hierarchical data
                         •   Categories/subcategories

                         •   Bill of materials

                         •   Threaded discussions




                                                 49
Monday, April 20, 2009                                                     49
Naive Trees
                                                       (1) Fran:
                                                    What’s the cause
                                                     of this bug?


                                                                      (4) Kukla:
                                         (2) Ollie:
                                                                     We need to
                                     I think it’s a null
                                                                   check valid input.
                                          pointer.


                              (3) Fran:                                               (6) Fran:
                                                               (5) Ollie:
                         No, I checked for                                       Yes, please add a
                                                           Yes, that’s a bug.
                                that.                                                  check.



                                                                                    (7) Kukla:
                                                                                   That fixed it.



                                                            50
Monday, April 20, 2009                                                                               50
Naive Trees
                • Adjacency List
                         •   Naive solution nearly everyone uses
                         •   Each entry in the tree knows immediate parent
                         comment_id parent_id author            comment
                              1       NULL     Fran    What’s the cause of this bug?
                              2         1      Ollie     I think it’s a null pointer.
                              3         2      Fran       No, I checked for that.
                              4         1      Kukla   We need to check valid input.
                              5         4      Ollie         Yes, that’s a bug.
                              6         4      Fran       Yes, please add a check
                              7         6      Kukla           That fixed it.


                                                  51
Monday, April 20, 2009                                                                  51
Naive Trees
                • Adjacency List
                         •   Easy to inserting a new comment:
                             INSERT INTO Comments (parent_id, author, comment)
                             
 VALUES (7, ‘Kukla’, ‘Thanks!’);

                         •   Easy to move a subtree to a new position:
                             UPDATE Comments SET parent_id = 3
                             WHERE comment_id = 6;




                                                 52
Monday, April 20, 2009                                                           52
Naive Trees
                • Adjacency List
                         •   Querying a node’s immediate children is easy:
                             SELECT * FROM Comments c1
                             LEFT JOIN Comments c2
                              ON (c2.parent_id = c1.comment_id);

                         •   Querying a node’s immediate parent is easy:
                             SELECT * FROM Comments c1
                             JOIN Comments c2
                              ON (c1.parent_id = c2.comment_id);


                                                 53
Monday, April 20, 2009                                                       53
Naive Trees
                • Adjacency List
                         •   Hard to query all descendants in a deep tree:
                             SELECT * FROM Comments c1
                             LEFT JOIN Comments c2 ON (c2.parent_id = c1.comment_id)
                             LEFT JOIN Comments c3 ON (c3.parent_id = c2.comment_id)
                             LEFT JOIN Comments c4 ON (c4.parent_id = c3.comment_id)
                             LEFT JOIN Comments c5 ON (c5.parent_id = c4.comment_id)
                             LEFT JOIN Comments c6 ON (c6.parent_id = c5.comment_id)
                             LEFT JOIN Comments c7 ON (c7.parent_id = c6.comment_id)
                             LEFT JOIN Comments c8 ON (c8.parent_id = c7.comment_id)
                             LEFT JOIN Comments c9 ON (c9.parent_id = c8.comment_id)
                             LEFT JOIN Comments c10 ON (c10.parent_id = c9.comment_id)
                             ...
                                              it still doesn’t support
                                                  unlimited depth!


                                                        54
Monday, April 20, 2009                                                                   54
Naive Trees
                • Solution #1:                   Path Enumeration
                         •   Store chain of ancestors as a string in each node
                         •   Good for breadcrumbs, or sorting by hierarchy

                         comment_id    path       author            comment
                              1       1/           Fran    What’s the cause of this bug?
                              2       1/2/         Ollie     I think it’s a null pointer.
                              3       1/2/3/       Fran       No, I checked for that.
                              4       1/4/         Kukla   We need to check valid input.
                              5       1/4/5/       Ollie         Yes, that’s a bug.
                              6       1/4/6/       Fran       Yes, please add a check
                              7       1/4/6/7/     Kukla           That fixed it.


                                                      55
Monday, April 20, 2009                                                                      55
Naive Trees
                • Solution #1:               Path Enumeration
                         •   Easy to query all ancestors of comment #7:
                             SELECT * FROM Comments
                             WHERE ‘1/4/6/7/’ LIKE path || ‘%’;

                         •   Easy to query all descendants of comment #4:
                             SELECT * FROM Comments
                             WHERE path LIKE ‘1/4/%’;




                                                  56
Monday, April 20, 2009                                                      56
Naive Trees
                • Solution #1:                Path Enumeration
                         •   Easy to add child of comment 7:
                             INSERT INTO Comments (author, comment)
                             VALUES (‘Ollie’, ‘Good job!’);
                             SELECT path FROM Comments
                             WHERE comment_id = 7;
                             UPDATE Comments
                             SET path = $parent_path || LAST_INSERT_ID() || ‘/’
                             WHERE comment_id = LAST_INSERT_ID();



                                                   57
Monday, April 20, 2009                                                            57
Naive Trees
                • Solution #2:               Nested Sets
                         •   Each comment encodes its descendants
                             using two numbers:

                         •   A comment’s right number is less than all the
                             numbers used by the comment’s descendants.

                         •   A comment’s left number is greater than all the
                             numbers used by the comment’s descendants.




                                                  58
Monday, April 20, 2009                                                         58
Naive Trees
                • Solution #2:                          Nested Sets
                                                                (1) Fran:
                                                             What’s the cause
                                                              of this bug?
                                                         1                       14
                                               (2) Ollie:                         (4) Kukla:
                                           I think it’s a null                 We need to check
                                                pointer.                         valid input.
                                       2                         5         6                    13

                                 (3) Fran:                                                          (6) Fran:
                                                                         (5) Ollie:
                             No, I checked for                                                 Yes, please add a
                                                                     Yes, that’s a bug.
                                    that.                                                            check.
                         3                       4               7                    8    9                       12

                                                                                                 (7) Kukla:
                                                                                                That fixed it.

                                                                                          10                       11


                                                                      59
Monday, April 20, 2009                                                                                                  59
Naive Trees
                • Solution #2:                 Nested Sets

             comment_id      nsleft        nsright    author               comment
                         1     1              14          Fran    What’s the cause of this bug?
                         2     2               5          Ollie     I think it’s a null pointer.
                         3     3               4          Fran       No, I checked for that.
                         4     6              13          Kukla   We need to check valid input.
                         5     7               8          Ollie         Yes, that’s a bug.
                         6     9              12          Fran       Yes, please add a check
                         7    10              11          Kukla           That fixed it.

                                   these are not
                                    foreign keys

                                                     60
Monday, April 20, 2009                                                                             60
Naive Trees
                • Solution #2:               Nested Sets
                         •   Easy to query all ancestors of comment #7:
                             SELECT * FROM Comments child
                             JOIN Comments ancestor
                              ON (child.left BETWEEN
 ancestor.nsleft
                             
 
 
 
 
 
 AND
 ancestor.nsright)
                             WHERE child.comment_id = 7;




                                                  61
Monday, April 20, 2009                                                    61
Naive Trees
                • Solution #2:                          Nested Sets
                                                                                                                ancestors
                                                                (1) Fran:
                                                             What’s the cause
                                                              of this bug?
                                                         1                       14
                                               (2) Ollie:                         (4) Kukla:
                                           I think it’s a null                 We need to check
                                                pointer.                         valid input.
                                                                                                                            child
                                       2                         5         6                    13

                                 (3) Fran:                                                          (6) Fran:
                                                                         (5) Ollie:
                             No, I checked for                                                 Yes, please add a
                                                                     Yes, that’s a bug.
                                    that.                                                            check.
                         3                       4               7                    8    9                       12

                                                                                                 (7) Kukla:
                                                                                                That fixed it.

                                                                                          10                       11


                                                                      62
Monday, April 20, 2009                                                                                                              62
Naive Trees
                • Solution #2:              Nested Sets
                         •   Easy to query all descendants of comment #4:
                             SELECT * FROM Comments parent
                             JOIN Comments descendant
                              ON (descendant.left BETWEEN
 parent.nsleft
                             
 
 
 
 
 
 
 
 AND
 parent.nsright)
                             WHERE parent.comment_id = 4;




                                                 63
Monday, April 20, 2009                                                      63
Naive Trees
                • Solution #2:                          Nested Sets
                                                                                                                parent
                                                                (1) Fran:
                                                             What’s the cause
                                                              of this bug?
                                                         1                       14
                                                                                                                         descendants
                                               (2) Ollie:                         (4) Kukla:
                                           I think it’s a null                 We need to check
                                                pointer.                         valid input.
                                       2                         5         6                    13

                                 (3) Fran:                                                          (6) Fran:
                                                                         (5) Ollie:
                             No, I checked for                                                 Yes, please add a
                                                                     Yes, that’s a bug.
                                    that.                                                            check.
                         3                       4               7                    8    9                       12

                                                                                                 (7) Kukla:
                                                                                                That fixed it.

                                                                                          10                       11


                                                                      64
Monday, April 20, 2009                                                                                                                 64
Naive Trees
                • Solution #2:                     Nested Sets
                         •   Hard to insert a new child of comment #5:
                             UPDATE Comment
                             SET
 nsleft = CASE WHEN nsleft >= 8 THEN nsleft+2 ELSE nsleft END,
                             
    nsright = nsright+2
                             WHERE nsright >= 7;

                             INSERT INTO Comment (nsleft, nsright, author, comment)
                              VALUES (8, 9, 'Fran', 'I agree!');

                         •   Recalculate left values for all nodes to the right of
                             the new child. Recalculate right values for all
                             nodes above and to the right.


                                                          65
Monday, April 20, 2009                                                                            65
Naive Trees
                • Solution #2:                          Nested Sets
                                                                (1) Fran:
                                                             What’s the cause
                                                              of this bug?
                                                         1                         16
                                                                                   14
                                               (2) Ollie:                           (4) Kukla:
                                           I think it’s a null                   We need to check
                                                pointer.                           valid input.
                                       2                         5           6                  15
                                                                                                13

                                 (3) Fran:                                                          (6) Fran:
                                                                         (5) Ollie:
                             No, I checked for                                                 Yes, please add a
                                                                     Yes, that’s a bug.
                                    that.                                                            check.
                         3                       4               7                       89
                                                                                        10 11                      14
                                                                                                                   12

                                                                           (8) Fran:             (7) Kukla:
                                                                            I agree!            That fixed it.

                                                                 8                      9 10
                                                                                          12                       13
                                                                                                                   11


                                                                      66
Monday, April 20, 2009                                                                                                  66
Naive Trees
                • Solution #2:                    Nested Sets
                         •   Hard to query the parent of comment #6:
                             SELECT parent.* FROM Comments AS c
                             JOIN Comments AS parent
                              ON (c.nsleft BETWEEN parent.nsleft AND parent.nsright)
                             LEFT OUTER JOIN Comments AS in_between
                              ON (c.nsleft BETWEEN in_between.nsleft AND in_between.nsright
                               AND in_between.nsleft BETWEEN parent.nsleft AND parent.nsright)
                             WHERE c.comment_id = 6 AND in_between.comment_id IS NULL;

                         •   Parent of #6 is an ancestor who has no
                             descendant who is also an ancestor of #6.

                         •   Querying a child is a similar problem.

                                                         67
Monday, April 20, 2009                                                                           67
Naive Trees
                • Solution #3:                  Closure Table
                         •   Store every path from ancestors to descendants

                         •   Requires an additional table:
                             CREATE TABLE TreePaths (
                             
  ancestor

   BIGINT NOT NULL,
                             
  descendant
 BIGINT NOT NULL,
                             
  PRIMARY KEY (ancestor, descendant),
                             
  FOREIGN KEY(ancestor) REFERENCES Comments(comment_id),
                             
  FOREIGN KEY(descendant) REFERENCES Comments(comment_id),
                             );




                                                      68
Monday, April 20, 2009                                                                     68
Naive Trees
                • Solution #3:                      Closure Table
                                                           (1) Fran:
                                                        What’s the cause
                                                         of this bug?



                                           (2) Ollie:                     (4) Kukla:
                                       I think it’s a null             We need to check
                                            pointer.                     valid input.



                             (3) Fran:                                                    (6) Fran:
                                                                  (5) Ollie:
                         No, I checked for                                           Yes, please add a
                                                              Yes, that’s a bug.
                                that.                                                      check.




                                                                                       (7) Kukla:
                                                                                      That fixed it.




                                                               69
Monday, April 20, 2009                                                                                   69
Naive Trees
                • Solution #3:
                                                                              ancestor descendant
                                                       Closure Table             1         1
                                                                                 1         2
                         comment_id author            comment
                                                                                 1         3
                             1       Fran    What’s the cause of this bug?
                                                                                 1         4
                             2       Ollie     I think it’s a null pointer.
                                                                                 1         5
                             3       Fran       No, I checked for that.
                                                                                 1         6
                             4       Kukla   We need to check valid input.
                                                                                 1         7
                             5       Ollie         Yes, that’s a bug.
                                                                                 2         2
                             6       Fran       Yes, please add a check
                                                                                 2         3
                             7       Kukla           That fixed it.
                                                                                 3         3
                                                                                 4         4
                                       requires O(n²)                            4         5
                                        rows at most                             4         6
                                                                                 4         7
                                        but far fewer
                                                                                 5         5
                                         in practice
                                                                                 6         6
                                                                                 6         7
                                                                                 7         7
                                                                   70
Monday, April 20, 2009                                                                              70
Naive Trees
                • Solution #3:             Closure Table
                         •   Easy to query descendants of comment #4:
                             SELECT c.* FROM Comments c
                             JOIN TreePaths t
                              ON (c.comment_id = t.descendant)
                             WHERE t.ancestor = 4;




                                                71
Monday, April 20, 2009                                                  71
Naive Trees
                • Solution #3:             Closure Table
                         •   Easy to query ancestors of comment #6:
                             SELECT c.* FROM Comments c
                             JOIN TreePaths t
                              ON (c.comment_id = t.ancestor)
                             WHERE t.descendant = 6;




                                                72
Monday, April 20, 2009                                                72
Naive Trees
                • Solution #3:              Closure Table
                         •   Easy to insert a new child of comment #5:
                                                                   generates comment #8
                             INSERT INTO Comments ...
                             INSERT INTO TreePaths (ancestor, descendant)
                             
 VALUES (8, 8);
                             INSERT INTO TreePaths (ancestor, descendant)
                             
 SELECT ancestor, 8 FROM TreePaths
                             
 WHERE descendant = 5;



                                                 73
Monday, April 20, 2009                                                                    73
Naive Trees
                • Solution #3:              Closure Table
                         •   Easy to delete a child comment #7:
                             DELETE FROM TreePaths
                             WHERE descendant = 7;

                         •   Even easier with ON DELETE CASCADE




                                                 74
Monday, April 20, 2009                                            74
Naive Trees
                • Solution #3:              Closure Table
                         •   Easy to delete the subtree under comment #4:
                             DELETE FROM TreePaths WHERE descendant IN
                             
 (SELECT descendant FROM TreePaths
                             
 WHERE ancestor = 4);

                         •   For MySQL, use multi-table DELETE:
                             DELETE p FROM TreePaths p
                             JOIN TreePaths a USING (descendant)
                             WHERE a.ancestor = 4;


                                                75
Monday, April 20, 2009                                                      75
Naive Trees
                • Solution #3:
                                                                ancestor descendant   depth
                                            Closure Table          1         1          0
                                                                   1         2          1

                         •   Add a depth column to make it         1         3          2

                             easier to query immediate parent      1         4          1
                                                                   1         5          2
                             or child:                             1         6          2
                                                                   1         7          3
                         SELECT c.* FROM Comments c                2         2          0

                         JOIN TreePaths t                          2         3          1

                          ON (c.comment_id = t.descendant)         3         3          0
                                                                   4         4          0
                         WHERE t.ancestor = 4                      4         5          1

                          AND t.depth = 1;                         4         6          1
                                                                   4         7          2
                                                                   5         5          0
                                                                   6         6          0
                                                                   6         7          1
                                                                   7         7          0
                                                 76
Monday, April 20, 2009                                                                        76
Naive Trees
                • Summary of Designs:
                          Number of    Query         Query    Modify   Referential
              Design
                           Tables      Child        Subtree    Tree     Integrity
         Adjacency List       1         Easy         Hard      Easy        Yes
             Path
                              1         Easy         Easy      Hard        No
          Enumeration

           Nested Sets        1        Hard          Easy      Hard        No

          Closure Table       2         Easy         Easy      Easy        Yes




                                               77
Monday, April 20, 2009                                                               77
Antipattern Categories
                         Database Design                   Database Creation
                          Antipatterns                       Antipatterns
                                                           CREATE TABLE BugsProducts (
                                                              bug_id INTEGER REFERENCES Bugs,
                                                              product VARCHAR(100) REFERENCES Products,
                                                              PRIMARY KEY (bug_id, product)
                                                           );




                                 Query                            Application
                               Antipatterns                       Antipatterns
                         SELECT b.product, COUNT(*)        $dbHandle = new PDO(‘mysql:dbname=test’);
                         FROM BugsProducts AS b            $stmt = $dbHandle->prepare($sql);
                         GROUP BY b.product;               $result = $stmt->fetchAll();




                                                      78
Monday, April 20, 2009                                                                                    78
Database Creation Antipatterns


                     5. ENUM Antipattern
                     6. Rounding Errors
                     7. Indexes Are Magical




                                          79
Monday, April 20, 2009                            79
ENUM Antipattern



                                80
Monday, April 20, 2009                      80
ENUM Antipattern


                     • Objective:      restrict a column
                         to a fixed set of values
                                                     OK
                         INSERT INTO bugs (status)
                         
   VALUES (‘new’)
                         INSERT INTO bugs (status)
                         
   VALUES (‘banana’)
                                                     FAIL



                                               81
Monday, April 20, 2009                                      81
ENUM Antipattern


                     • Antipattern:       use ENUM data type,
                         when the set of values may change
                         CREATE TABLE Bugs (
                         
  ...
                         
  status
 
 ENUM(‘new’, ‘open’, ‘fixed’)
                         );




                                                82
Monday, April 20, 2009                                              82
ENUM Antipattern


                     • Changing the set of values is a metadata
                         alteration
                     • You must know the current set of values
                         ALTER TABLE Bugs MODIFY COLUMN
                         
   status ENUM(‘new’, ‘open’, ‘fixed’, ‘duplicate’);




                                                   83
Monday, April 20, 2009                                                          83
ENUM Antipattern

                     • Difficult to get a list of possible values
                         SELECT column_type
                         FROM information_schema.columns
                         WHERE table_schema = ‘bugtracker_schema’
                          AND table_name = ‘Bugs’
                          AND column_name = ‘status’;

                     • Returns a LONGTEXT you must parse:
                         “ENUM(‘new’, ‘open’, ‘fixed’)”



                                                  84
Monday, April 20, 2009                                              84
ENUM Antipattern


                     • Solution:        use ENUM only if values are
                         set in stone
                         CREATE TABLE Bugs (
                         
  ...
                         
  bug_type
 
 ENUM(‘defect’, ‘feature’)
                         );




                                               85
Monday, April 20, 2009                                                85
ENUM Antipattern
                     • Use a lookup table if values may change
                         CREATE TABLE BugStatus (
                         
  status
 VARCHAR(10) PRIMARY KEY
                         );
                         INSERT INTO BugStatus (status)
                          VALUES (‘NEW’), (‘OPEN’), (‘FIXED’);


                                       BUGS           BUGSTATUS




                                                86
Monday, April 20, 2009                                            86
ENUM Antipattern


                     • Adding/removing a value is a data
                         operation, not a metadata operation
                     • You don’t need to know the current values
                         INSERT INTO BugStatus (status)
                          VALUES (‘DUPLICATE’);




                                                87
Monday, April 20, 2009                                             87
ENUM Antipattern

                     • Use an attribute to retire values, not
                         DELETE
                         CREATE TABLE BugStatus (
                         
   status
 
 VARCHAR(10) PRIMARY KEY,
                         
   active
 
 TINYINT NOT NULL DEFAULT 1
                         );
                         UPDATE BugStatus
                         SET active = 0
                         WHERE status = ‘DUPLICATE’;



                                              88
Monday, April 20, 2009                                              88
Rounding Errors
                          10.0 times 0.1 is hardly ever 1.0.
                                      — Brian Kernighan




                                          89
Monday, April 20, 2009                                         89
Rounding Errors



                     • Objective:         store real numbers exactly
                         •   Especially money

                         •   Work estimate hours




                                                90
Monday, April 20, 2009                                                 90
Rounding Errors


                     • Antipattern:          use FLOAT data type
                         ALTER TABLE Bugs
                          ADD COLUMN hours FLOAT;
                         INSERT INTO Bugs (bug_id, hours)
                          VALUES (1234, 3.3);




                                               91
Monday, April 20, 2009                                             91
Rounding Errors

                     • FLOAT is inexact
                         SELECT hours FROM Bugs
                         WHERE bug_id = 1234;

                         ‣ 3.3
                         SELECT hours * 1000000000 FROM Bugs
                         WHERE bug_id = 1234;

                         ‣ 3299999952.3163


                                              92
Monday, April 20, 2009                                         92
Rounding Errors



                     • Inexact decimals                      assuming infinite precision
                         •   1/3 + 1/3 + 1/3 = 1.0

                         •   0.333 + 0.333 + 0.333 = 0.999            finite precision




                                                     93
Monday, April 20, 2009                                                                    93
Rounding Errors


                     • IEEE 754 standard for representing
                         floating-point numbers in base-2
                         •   Some numbers round off, aren’t stored exactly

                         •   Comparisons to original value fail
                             SELECT * FROM Bugs
                             WHERE hours = 3.3;       comparison
                                                         fails




                                                94
Monday, April 20, 2009                                                       94
Rounding Errors

                     • Solution: use NUMERIC data type
                         ALTER TABLE Bugs
                          ADD COLUMN hours NUMERIC(9,2)
                         INSERT INTO bugs (bug_id, hours)
                          VALUES (1234, 3.3);
                         SELECT * FROM Bugs             comparison
                         WHERE hours = 3.3;              succeeds




                                                95
Monday, April 20, 2009                                               95
Indexes are Magical
                            Whenever any result is sought, the question will then
                         arise — by what course of calculation can these results
                               be arrived at by the machine in the shortest time?
                                                           — Charles Babbage

                                                    96
Monday, April 20, 2009                                                              96
Indexes are Magical



                • Objective: execute queries with optimal
                         performance




                                       97
Monday, April 20, 2009                                      97
Indexes are Magical


                • Antipatterns:
                         •   Creating indexes blindly

                         •   Executing non-indexable queries

                         •   Rejecting indexes because of their overhead




                                                  98
Monday, April 20, 2009                                                     98
Indexes are Magical
                • Creating indexes blindly:
                         CREATE TABLE Bugs (
                         
 bug_id

 
 SERIAL PRIMARY KEY,
                         
 date_reported DATE NOT NULL,
                         
 summary
 
 VARCHAR(80) NOT NULL,
                         
 status
 
 
 VARCHAR(10) NOT NULL, redundant index
                         
 hours
 
 
 NUMERIC(9,2),
                                                                    bulky index
                         
 INDEX (bug_id),
                         
 INDEX (summary),                      unnecessary index
                         
 INDEX (hours),                           unnecessary
                         
 INDEX (bug_id, date_reported, status)   covering index
                         );


                                                   99
Monday, April 20, 2009                                                               99
Indexes are Magical
                • Executing non-indexable queries:
                         •   SELECT * FROM Bugs
                             WHERE description LIKE ‘%crash%’;
                                                                    non-leftmost

                         •                                          string match
                             SELECT * FROM Bugs
                             WHERE MONTH(date_reported) = 4;          function applied
                                                                         to column

                         •   SELECT * FROM Bugs
                             WHERE last_name = “...” OR first_name = “...”;

                         •
                                                                  no index spans
                             SELECT * FROM Accounts                   all rows
                             ORDER BY first_name, last_name;
                                                                    non-leftmost
                                                                 composite key match

                                                100
Monday, April 20, 2009                                                                   100
Indexes are Magical
                • Telephone book analogy
                         •   Easy to search for Dean Thomas:     uses index
                                                                  to match
                             SELECT * FROM TelephoneBook
                             WHERE full_name LIKE ‘Thomas, %’;

                         •   Hard to search for Thomas Riddle:   requires full
                                                                  table scan
                             SELECT * FROM TelephoneBook
                             WHERE full_name LIKE ‘%, Thomas’;




                                                 101
Monday, April 20, 2009                                                           101
Indexes are Magical
                • Rejecting indexes because of their overhead:
                                                                     Query w/ Index: O(log n)
                   80
                                                                     Update Index: O(log n)
                                                                     Query w/o Index: O(n)
                   60


                                                                           the benefit quickly
                   40
                                                                         justifies the overhead

                   20


                     0
                         0 100 200 300 400 500 600 700 800 9001000



                                                       102
Monday, April 20, 2009                                                                           102
Indexes are Magical
                • Solution: “MENTOR” your indexes
                         Measure
                         Explain
                         Nominate
                         Test
                         Optimize
                         Repair

                                     103
Monday, April 20, 2009                              103
Indexes are Magical
                • Solution: “MENTOR” your indexes
                                      • Profile your application.
                         Measure
                                      • Focus on the most costly
                         Explain       SQL queries:
                         Nominate      • Longest-running.
                                       • Most frequently run.
                         Test
                                       • Blockers, lockers, and
                         Optimize          deadlocks.

                         Repair

                                     104
Monday, April 20, 2009                                             104
Indexes are Magical
                • Solution: “MENTOR” your indexes
                                      • Analyze the optimization
                         Measure
                                        plan of costly queries,
                         Explain        e.g. MySQL’s EXPLAIN
                                      • Identify tables that aren’t
                         Nominate
                                        using indexes:
                         Test          • Temporary table
                         Optimize      • Filesort
                         Repair

                                     105
Monday, April 20, 2009                                                105
Indexes are Magical
                • Solution: “MENTOR” your indexes
                                     • Could an index improve
                         Measure
                                      access to these tables?
                         Explain      • ORDER BY criteria
                         Nominate     • MIN() / MAX()
                                      • WHERE conditions
                         Test
                                     • Which column(s) need
                         Optimize     indexes?
                         Repair

                                    106
Monday, April 20, 2009                                          106
Indexes are Magical
                • Solution: “MENTOR” your indexes
                                      • After creating index,
                         Measure
                                        measure again.
                         Explain      • Confirm the new index
                         Nominate       made a difference.
                                      • Impress your boss!
                         Test
                                       “The new index gave a 27%
                         Optimize      performance improvement!”

                         Repair

                                     107
Monday, April 20, 2009                                             107
Indexes are Magical
                • Solution: “MENTOR” your indexes
                                     • Indexes are compact,
                         Measure
                                       frequently-used data.
                         Explain     • Try to cache indexes in
                         Nominate      memory:
                                      • MyISAM: key_buffer_size,
                         Test             LOAD INDEX INTO CACHE
                                      • InnoDB: innodb_buffer_pool_size
                         Optimize
                         Repair

                                    108
Monday, April 20, 2009                                                    108
Indexes are Magical
                • Solution: “MENTOR” your indexes
                                      • Indexes require periodic
                         Measure
                                        maintenance.
                         Explain      • Like a filesystem requires
                         Nominate       periodic defragmentation.
                                      • Analyze or rebuild
                         Test           indexes, e.g. in MySQL:
                         Optimize      • ANALYZE TABLE
                                       • OPTIMIZE TABLE
                         Repair

                                     109
Monday, April 20, 2009                                              109
Indexes are Magical
                • Solution: “MENTOR” your indexes
                         • Sounds like the name                    BY THE AUTHOR OF THE BESTSELLER
                                                            WHY DOESN’T MY CACHE GET ANY HITS?

                           of a “self-help” book!         William K. Karwin
                                                          MENTOR
                                                            YOUR
                                                          INDEXES
                                                            How to Break the
                                just kidding!             Bad Performance Habits
                              please don’t ask                  That Make
                            when it’s coming out!
                                                              You Miserable



                                                    110
Monday, April 20, 2009                                                                               110
Antipattern Categories
                          Database Design                   Database Creation
                           Antipatterns                       Antipatterns
                                                            CREATE TABLE BugsProducts (
                                                               bug_id INTEGER REFERENCES Bugs,
                                                               product VARCHAR(100) REFERENCES Products,
                                                               PRIMARY KEY (bug_id, product)
                                                            );




                                                                   Application
                                 Query
                                                                   Antipatterns
                               Antipatterns
                         SELECT b.product, COUNT(*)         $dbHandle = new PDO(‘mysql:dbname=test’);
                         FROM BugsProducts AS b             $stmt = $dbHandle->prepare($sql);
                         GROUP BY b.product;                $result = $stmt->fetchAll();




                                                      111
Monday, April 20, 2009                                                                                     111
Query Antipatterns

                     8. NULL antipatterns
                     9. Ambiguous Groups
                     10. Random Order
                     11. JOIN antipattern
                     12. Goldberg Machine



                                            112
Monday, April 20, 2009                            112
NULL Antipatterns
                    As we know, there are known knowns; there are things we know
                  we know. We also know there are known unknowns; that is to say
                 we know there are some things we do not know. But there are also
                     unknown unknowns — the ones we don't know we don't know.
                                                         — Donald Rumsfeld

                                               113
Monday, April 20, 2009                                                              113
NULL Antipatterns



                • Objective: handle “missing” values,
                         store them as missing, and support them in
                         queries.




                                             114
Monday, April 20, 2009                                                114
NULL Antipatterns



                • Antipatterns:
                         •   Use NULL as an ordinary value

                         •   Use an ordinary value as NULL




                                                115
Monday, April 20, 2009                                       115
NULL Antipatterns


                • Using NULL in most expressions results in
                         an unknown value.
                                                           NULL is not zero
                         SELECT NULL + 10;
                         SELECT ‘Bill’ || NULL;         NULL is not an empty string


                         SELECT FALSE OR NULL;            NULL is not FALSE




                                                  116
Monday, April 20, 2009                                                                116
NULL Antipatterns


                • The opposite of unknown is still unknown.
                         SELECT * FROM Bugs
                                                      which query returns bugs
                         WHERE assigned_to = 123;     that are not yet assigned?
                                                          neither query!
                         SELECT * FROM Bugs
                         WHERE NOT (assigned_to = 123);




                                             117
Monday, April 20, 2009                                                             117
NULL Antipatterns



                • Choosing an ordinary value in lieu of NULL:
                         UPDATE Bugs SET assigned_to = -1
                         WHERE assigned_to IS NULL;
                                                            assigned_to is a foreign key
                                                             so this value doesn’t work




                                              118
Monday, April 20, 2009                                                                     118
NULL Antipatterns


                • Choosing an ordinary value in lieu of NULL:
                         UPDATE Bugs SET hours = -1
                         WHERE hours IS NULL;              this makes SUM()
                                                                inaccurate
                         SELECT SUM(hours)
                         FROM Bugs                       special-case code
                         WHERE status = ‘OPEN’        you were trying to avoid
                         AND hours <> -1;               by prohibiting NULL




                                             119
Monday, April 20, 2009                                                           119
NULL Antipatterns


                • Choosing an ordinary value in lieu of NULL:
                         •   Any given value may be significant in a column

                         •   Every column needs a different value

                         •   You need to remember or document the value
                             used for “missing” on a case-by-case basis




                                                 120
Monday, April 20, 2009                                                       120
NULL Antipatterns


                • Solution: use NULL appropriately
                         •   NULL signifies “missing” or “inapplicable”

                         •   Works for every data type

                         •   Already standard and well-understood




                                                 121
Monday, April 20, 2009                                                   121
NULL Antipatterns
                • Understanding NULL in expressions
                           Expression         Expected     Actual
                             NULL = 0               TRUE   Unknown
                           NULL = 12345         FALSE      Unknown
                          NULL <> 12345             TRUE   Unknown
                           NULL + 12345         12345      Unknown
                           NULL || ‘string’     string’    Unknown
                           NULL = NULL              TRUE   Unknown
                          NULL <> NULL          FALSE      Unknown




                                              122
Monday, April 20, 2009                                               122
NULL Antipatterns
                • Understanding NULL in boolean expressions
                           Expression      Expected     Actual
                          NULL AND TRUE      FALSE      Unknown
                          NULL AND FALSE     FALSE       FALSE
                          NULL OR FALSE      FALSE      Unknown
                          NULL OR TRUE           TRUE    TRUE
                           NOT (NULL)            TRUE   Unknown




                                           123
Monday, April 20, 2009                                            123
NULL Antipatterns
                • SQL supports IS NULL predicate that
                         returns true or false, never unknown:
                         SELECT * FROM Bugs
                         WHERE assigned_to IS NULL;
                         SELECT * FROM Bugs
                         WHERE assigned_to IS NOT NULL;




                                              124
Monday, April 20, 2009                                           124
NULL Antipatterns
                • SQL-99 supports IS DISTINCT FROM
                         predicate that returns true or false:
                         SELECT * FROM Bugs
                         WHERE assigned_to IS DISTINCT FROM 123;
                         SELECT * FROM Bugs
                         WHERE assigned_to IS NOT DISTINCT FROM 123;
                         SELECT * FROM Bugs                MySQL operator works like
                         WHERE assigned_to <=> 123;        IS NOT DISTINCT FROM




                                              125
Monday, April 20, 2009                                                                 125
NULL Antipatterns
                • Change NULL to ordinary value on demand
                         with COALESCE():
                           SELECT COALESCE(
                            first_name || ‘ ’ || middle_initial || ‘ ’ || last_name,
                            first_name || ‘ ’ || last_name) AS full_name
                           FROM Accounts;
                         • Also called NVL() or ISNULL()
                           in some database brands.




                                                  126
Monday, April 20, 2009                                                                126
Ambiguous Groups
                         Please accept my resignation. I don’t want to belong
                                to any club that will accept me as a member.
                                                           — Groucho Marx



                                                 127
Monday, April 20, 2009                                                          127
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back
Sql Antipatterns Strike Back

More Related Content

What's hot

Etsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureDan McKinley
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 
Collectors in the Wild
Collectors in the WildCollectors in the Wild
Collectors in the WildJosé Paumard
 
Trees In The Database - Advanced data structures
Trees In The Database - Advanced data structuresTrees In The Database - Advanced data structures
Trees In The Database - Advanced data structuresLorenzo Alberton
 
MySQL Index Cookbook
MySQL Index CookbookMySQL Index Cookbook
MySQL Index CookbookMYXPLAIN
 
Capabilities for Resources and Effects
Capabilities for Resources and EffectsCapabilities for Resources and Effects
Capabilities for Resources and EffectsMartin Odersky
 
Hierarchical data models in Relational Databases
Hierarchical data models in Relational DatabasesHierarchical data models in Relational Databases
Hierarchical data models in Relational Databasesnavicorevn
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcachedJurriaan Persyn
 
Better than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseBetter than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseAltinity Ltd
 
OLTP+OLAP=HTAP
 OLTP+OLAP=HTAP OLTP+OLAP=HTAP
OLTP+OLAP=HTAPEDB
 
The MySQL Query Optimizer Explained Through Optimizer Trace
The MySQL Query Optimizer Explained Through Optimizer TraceThe MySQL Query Optimizer Explained Through Optimizer Trace
The MySQL Query Optimizer Explained Through Optimizer Traceoysteing
 
PostgreSQL and JDBC: striving for high performance
PostgreSQL and JDBC: striving for high performancePostgreSQL and JDBC: striving for high performance
PostgreSQL and JDBC: striving for high performanceVladimir Sitnikov
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevAltinity Ltd
 
Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache CalciteJordan Halterman
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Boost Performance With My S Q L 51 Partitions
Boost Performance With  My S Q L 51 PartitionsBoost Performance With  My S Q L 51 Partitions
Boost Performance With My S Q L 51 PartitionsPerconaPerformance
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfAltinity Ltd
 
Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Julian Hyde
 

What's hot (20)

Etsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds Architecture
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Collectors in the Wild
Collectors in the WildCollectors in the Wild
Collectors in the Wild
 
Sql query patterns, optimized
Sql query patterns, optimizedSql query patterns, optimized
Sql query patterns, optimized
 
Trees In The Database - Advanced data structures
Trees In The Database - Advanced data structuresTrees In The Database - Advanced data structures
Trees In The Database - Advanced data structures
 
MySQL Index Cookbook
MySQL Index CookbookMySQL Index Cookbook
MySQL Index Cookbook
 
Capabilities for Resources and Effects
Capabilities for Resources and EffectsCapabilities for Resources and Effects
Capabilities for Resources and Effects
 
Hierarchical data models in Relational Databases
Hierarchical data models in Relational DatabasesHierarchical data models in Relational Databases
Hierarchical data models in Relational Databases
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Better than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseBetter than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouse
 
OLTP+OLAP=HTAP
 OLTP+OLAP=HTAP OLTP+OLAP=HTAP
OLTP+OLAP=HTAP
 
The MySQL Query Optimizer Explained Through Optimizer Trace
The MySQL Query Optimizer Explained Through Optimizer TraceThe MySQL Query Optimizer Explained Through Optimizer Trace
The MySQL Query Optimizer Explained Through Optimizer Trace
 
PostgreSQL and JDBC: striving for high performance
PostgreSQL and JDBC: striving for high performancePostgreSQL and JDBC: striving for high performance
PostgreSQL and JDBC: striving for high performance
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
 
Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache Calcite
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Boost Performance With My S Q L 51 Partitions
Boost Performance With  My S Q L 51 PartitionsBoost Performance With  My S Q L 51 Partitions
Boost Performance With My S Q L 51 Partitions
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
 
Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!
 

More from Karwin Software Solutions LLC (12)

Load Data Fast!
Load Data Fast!Load Data Fast!
Load Data Fast!
 
InnoDB Locking Explained with Stick Figures
InnoDB Locking Explained with Stick FiguresInnoDB Locking Explained with Stick Figures
InnoDB Locking Explained with Stick Figures
 
SQL Outer Joins for Fun and Profit
SQL Outer Joins for Fun and ProfitSQL Outer Joins for Fun and Profit
SQL Outer Joins for Fun and Profit
 
Survey of Percona Toolkit
Survey of Percona ToolkitSurvey of Percona Toolkit
Survey of Percona Toolkit
 
How to Design Indexes, Really
How to Design Indexes, ReallyHow to Design Indexes, Really
How to Design Indexes, Really
 
Schemadoc
SchemadocSchemadoc
Schemadoc
 
Percona toolkit
Percona toolkitPercona toolkit
Percona toolkit
 
MySQL 5.5 Guide to InnoDB Status
MySQL 5.5 Guide to InnoDB StatusMySQL 5.5 Guide to InnoDB Status
MySQL 5.5 Guide to InnoDB Status
 
Requirements the Last Bottleneck
Requirements the Last BottleneckRequirements the Last Bottleneck
Requirements the Last Bottleneck
 
Mentor Your Indexes
Mentor Your IndexesMentor Your Indexes
Mentor Your Indexes
 
Sql Injection Myths and Fallacies
Sql Injection Myths and FallaciesSql Injection Myths and Fallacies
Sql Injection Myths and Fallacies
 
Full Text Search In PostgreSQL
Full Text Search In PostgreSQLFull Text Search In PostgreSQL
Full Text Search In PostgreSQL
 

Recently uploaded

Water manjdjagement in problem soil.pptx
Water manjdjagement in problem soil.pptxWater manjdjagement in problem soil.pptx
Water manjdjagement in problem soil.pptxFaizanAhmad697379
 
Pushpa 2 Release Date not Postponed According to the Makers.pdf
Pushpa 2 Release Date not Postponed According to the Makers.pdfPushpa 2 Release Date not Postponed According to the Makers.pdf
Pushpa 2 Release Date not Postponed According to the Makers.pdfkigaya33
 
batwheels_01batwheels_01batwheels_01batwheels_01
batwheels_01batwheels_01batwheels_01batwheels_01batwheels_01batwheels_01batwheels_01batwheels_01
batwheels_01batwheels_01batwheels_01batwheels_01Patricia Pham
 
一比一原版(Westminster毕业证)威斯敏斯特大学毕业证成绩单
一比一原版(Westminster毕业证)威斯敏斯特大学毕业证成绩单一比一原版(Westminster毕业证)威斯敏斯特大学毕业证成绩单
一比一原版(Westminster毕业证)威斯敏斯特大学毕业证成绩单utykdaq
 
PPT aviator (A small guide on spinmatch).pptx
PPT aviator (A small guide on spinmatch).pptxPPT aviator (A small guide on spinmatch).pptx
PPT aviator (A small guide on spinmatch).pptxspinmatch13
 
Driving & Racing Games Online for Free_ Explore the Excitement with ATM HTML ...
Driving & Racing Games Online for Free_ Explore the Excitement with ATM HTML ...Driving & Racing Games Online for Free_ Explore the Excitement with ATM HTML ...
Driving & Racing Games Online for Free_ Explore the Excitement with ATM HTML ...ATM HTML Games
 
A KING’S HEART THE STORY OF TSAR BORIS III (Drama) (Feature Film Project in D...
A KING’S HEART THE STORY OF TSAR BORIS III (Drama) (Feature Film Project in D...A KING’S HEART THE STORY OF TSAR BORIS III (Drama) (Feature Film Project in D...
A KING’S HEART THE STORY OF TSAR BORIS III (Drama) (Feature Film Project in D...Art Feeling Films
 
一比一原版(OSU毕业证)俄亥俄州立大学哥伦布分校毕业证成绩单
一比一原版(OSU毕业证)俄亥俄州立大学哥伦布分校毕业证成绩单一比一原版(OSU毕业证)俄亥俄州立大学哥伦布分校毕业证成绩单
一比一原版(OSU毕业证)俄亥俄州立大学哥伦布分校毕业证成绩单utykdaq
 
Top Best IPTV Providers in the UK for 2024.pdf
Top Best IPTV Providers in the UK for 2024.pdfTop Best IPTV Providers in the UK for 2024.pdf
Top Best IPTV Providers in the UK for 2024.pdfXtreame HDTV
 
一比一原版(UCL毕业证)伦敦大学学院毕业证成绩单
一比一原版(UCL毕业证)伦敦大学学院毕业证成绩单一比一原版(UCL毕业证)伦敦大学学院毕业证成绩单
一比一原版(UCL毕业证)伦敦大学学院毕业证成绩单utykdaq
 
Vector Methods.pptxjjjjjjjjjjjjjjjjjjjjjj
Vector Methods.pptxjjjjjjjjjjjjjjjjjjjjjjVector Methods.pptxjjjjjjjjjjjjjjjjjjjjjj
Vector Methods.pptxjjjjjjjjjjjjjjjjjjjjjjjoshuaclack73
 
Q4 WEEK 1 JUDGE THE RELEVANCE AND WORTH OF IDEAS.pptx
Q4 WEEK 1 JUDGE THE RELEVANCE AND WORTH OF IDEAS.pptxQ4 WEEK 1 JUDGE THE RELEVANCE AND WORTH OF IDEAS.pptx
Q4 WEEK 1 JUDGE THE RELEVANCE AND WORTH OF IDEAS.pptxramprakash8457
 
一比一原版(UAL毕业证)伦敦艺术大学毕业证成绩单
一比一原版(UAL毕业证)伦敦艺术大学毕业证成绩单一比一原版(UAL毕业证)伦敦艺术大学毕业证成绩单
一比一原版(UAL毕业证)伦敦艺术大学毕业证成绩单utykdaq
 
plantillas-powerpoint-hello-kitty.pptx.n
plantillas-powerpoint-hello-kitty.pptx.nplantillas-powerpoint-hello-kitty.pptx.n
plantillas-powerpoint-hello-kitty.pptx.nMayu Eliz
 
"My Silence, My Grave: The Making Of" Booklet
"My Silence, My Grave: The Making Of" Booklet"My Silence, My Grave: The Making Of" Booklet
"My Silence, My Grave: The Making Of" BookletDontLarry
 
Bromazolam CAS 71368-80-4 high quality opiates, Safe transportation, 99% pure
Bromazolam CAS 71368-80-4 high quality opiates, Safe transportation, 99% pureBromazolam CAS 71368-80-4 high quality opiates, Safe transportation, 99% pure
Bromazolam CAS 71368-80-4 high quality opiates, Safe transportation, 99% pureamy56318795
 
Get the Best Online Lotus ID from the Top Betting ID Provider.
Get the Best Online Lotus ID from the Top Betting ID Provider.Get the Best Online Lotus ID from the Top Betting ID Provider.
Get the Best Online Lotus ID from the Top Betting ID Provider.virat777exch786
 
一比一原版(SDSU毕业证)圣地亚哥州立大学毕业证成绩单
一比一原版(SDSU毕业证)圣地亚哥州立大学毕业证成绩单一比一原版(SDSU毕业证)圣地亚哥州立大学毕业证成绩单
一比一原版(SDSU毕业证)圣地亚哥州立大学毕业证成绩单utykdaq
 

Recently uploaded (19)

Water manjdjagement in problem soil.pptx
Water manjdjagement in problem soil.pptxWater manjdjagement in problem soil.pptx
Water manjdjagement in problem soil.pptx
 
kiff2
kiff2kiff2
kiff2
 
Pushpa 2 Release Date not Postponed According to the Makers.pdf
Pushpa 2 Release Date not Postponed According to the Makers.pdfPushpa 2 Release Date not Postponed According to the Makers.pdf
Pushpa 2 Release Date not Postponed According to the Makers.pdf
 
batwheels_01batwheels_01batwheels_01batwheels_01
batwheels_01batwheels_01batwheels_01batwheels_01batwheels_01batwheels_01batwheels_01batwheels_01
batwheels_01batwheels_01batwheels_01batwheels_01
 
一比一原版(Westminster毕业证)威斯敏斯特大学毕业证成绩单
一比一原版(Westminster毕业证)威斯敏斯特大学毕业证成绩单一比一原版(Westminster毕业证)威斯敏斯特大学毕业证成绩单
一比一原版(Westminster毕业证)威斯敏斯特大学毕业证成绩单
 
PPT aviator (A small guide on spinmatch).pptx
PPT aviator (A small guide on spinmatch).pptxPPT aviator (A small guide on spinmatch).pptx
PPT aviator (A small guide on spinmatch).pptx
 
Driving & Racing Games Online for Free_ Explore the Excitement with ATM HTML ...
Driving & Racing Games Online for Free_ Explore the Excitement with ATM HTML ...Driving & Racing Games Online for Free_ Explore the Excitement with ATM HTML ...
Driving & Racing Games Online for Free_ Explore the Excitement with ATM HTML ...
 
A KING’S HEART THE STORY OF TSAR BORIS III (Drama) (Feature Film Project in D...
A KING’S HEART THE STORY OF TSAR BORIS III (Drama) (Feature Film Project in D...A KING’S HEART THE STORY OF TSAR BORIS III (Drama) (Feature Film Project in D...
A KING’S HEART THE STORY OF TSAR BORIS III (Drama) (Feature Film Project in D...
 
一比一原版(OSU毕业证)俄亥俄州立大学哥伦布分校毕业证成绩单
一比一原版(OSU毕业证)俄亥俄州立大学哥伦布分校毕业证成绩单一比一原版(OSU毕业证)俄亥俄州立大学哥伦布分校毕业证成绩单
一比一原版(OSU毕业证)俄亥俄州立大学哥伦布分校毕业证成绩单
 
Top Best IPTV Providers in the UK for 2024.pdf
Top Best IPTV Providers in the UK for 2024.pdfTop Best IPTV Providers in the UK for 2024.pdf
Top Best IPTV Providers in the UK for 2024.pdf
 
一比一原版(UCL毕业证)伦敦大学学院毕业证成绩单
一比一原版(UCL毕业证)伦敦大学学院毕业证成绩单一比一原版(UCL毕业证)伦敦大学学院毕业证成绩单
一比一原版(UCL毕业证)伦敦大学学院毕业证成绩单
 
Vector Methods.pptxjjjjjjjjjjjjjjjjjjjjjj
Vector Methods.pptxjjjjjjjjjjjjjjjjjjjjjjVector Methods.pptxjjjjjjjjjjjjjjjjjjjjjj
Vector Methods.pptxjjjjjjjjjjjjjjjjjjjjjj
 
Q4 WEEK 1 JUDGE THE RELEVANCE AND WORTH OF IDEAS.pptx
Q4 WEEK 1 JUDGE THE RELEVANCE AND WORTH OF IDEAS.pptxQ4 WEEK 1 JUDGE THE RELEVANCE AND WORTH OF IDEAS.pptx
Q4 WEEK 1 JUDGE THE RELEVANCE AND WORTH OF IDEAS.pptx
 
一比一原版(UAL毕业证)伦敦艺术大学毕业证成绩单
一比一原版(UAL毕业证)伦敦艺术大学毕业证成绩单一比一原版(UAL毕业证)伦敦艺术大学毕业证成绩单
一比一原版(UAL毕业证)伦敦艺术大学毕业证成绩单
 
plantillas-powerpoint-hello-kitty.pptx.n
plantillas-powerpoint-hello-kitty.pptx.nplantillas-powerpoint-hello-kitty.pptx.n
plantillas-powerpoint-hello-kitty.pptx.n
 
"My Silence, My Grave: The Making Of" Booklet
"My Silence, My Grave: The Making Of" Booklet"My Silence, My Grave: The Making Of" Booklet
"My Silence, My Grave: The Making Of" Booklet
 
Bromazolam CAS 71368-80-4 high quality opiates, Safe transportation, 99% pure
Bromazolam CAS 71368-80-4 high quality opiates, Safe transportation, 99% pureBromazolam CAS 71368-80-4 high quality opiates, Safe transportation, 99% pure
Bromazolam CAS 71368-80-4 high quality opiates, Safe transportation, 99% pure
 
Get the Best Online Lotus ID from the Top Betting ID Provider.
Get the Best Online Lotus ID from the Top Betting ID Provider.Get the Best Online Lotus ID from the Top Betting ID Provider.
Get the Best Online Lotus ID from the Top Betting ID Provider.
 
一比一原版(SDSU毕业证)圣地亚哥州立大学毕业证成绩单
一比一原版(SDSU毕业证)圣地亚哥州立大学毕业证成绩单一比一原版(SDSU毕业证)圣地亚哥州立大学毕业证成绩单
一比一原版(SDSU毕业证)圣地亚哥州立大学毕业证成绩单
 

Sql Antipatterns Strike Back

  • 1. SQL Antipatterns Strike Back Bill Karwin 1 Monday, April 20, 2009 1
  • 2. Antipattern Categories Database Design Database Creation Antipatterns Antipatterns CREATE TABLE BugsProducts ( bug_id INTEGER REFERENCES Bugs, product VARCHAR(100) REFERENCES Products, PRIMARY KEY (bug_id, product) ); Query Application Antipatterns Antipatterns SELECT b.product, COUNT(*) $dbHandle = new PDO(‘mysql:dbname=test’); FROM BugsProducts AS b $stmt = $dbHandle->prepare($sql); GROUP BY b.product; $result = $stmt->fetchAll(); 2 Monday, April 20, 2009 2
  • 3. Antipattern Categories Database Creation Database Design Antipatterns Antipatterns CREATE TABLE BugsProducts ( bug_id INTEGER REFERENCES Bugs, product VARCHAR(100) REFERENCES Products, PRIMARY KEY (bug_id, product) ); Application Query Antipatterns Antipatterns SELECT b.product, COUNT(*) $dbHandle = new PDO(‘mysql:dbname=test’); FROM BugsProducts AS b $stmt = $dbHandle->prepare($sql); GROUP BY b.product; $result = $stmt->fetchAll(); 3 Monday, April 20, 2009 3
  • 4. Database Design Antipatterns 1. Metadata Tribbles 2. Entity-Attribute-Value 3. Polymorphic Associations 4. Naive Trees 4 Monday, April 20, 2009 4
  • 5. Metadata Tribbles I want these things off the ship. I don’t care if it takes every last man we’ve got, I want them off the ship. — James T. Kirk 5 Monday, April 20, 2009 5
  • 6. Metadata Tribbles • Objective: improve performance of a very large table. 6 Monday, April 20, 2009 6
  • 7. Metadata Tribbles • Antipattern: separate into many tables with similar structure • Separate tables per distinct value in attribute • e.g., per year, per month, per user, per postal code, etc. 7 Monday, April 20, 2009 7
  • 8. Metadata Tribbles • Must create a new table for each new value CREATE TABLE Bugs_2005 ( . . . ); CREATE TABLE Bugs_2006 ( . . . ); CREATE TABLE Bugs_2007 ( . . . ); CREATE TABLE Bugs_2008 ( . . . ); mixing data ... with metadata 8 Monday, April 20, 2009 8
  • 9. Metadata Tribbles • Automatic primary keys cause conflicts: CREATE TABLE Bugs_2005 (bug_id SERIAL . . . ); CREATE TABLE Bugs_2006 (bug_id SERIAL . . . ); CREATE TABLE Bugs_2007 (bug_id SERIAL . . . ); CREATE TABLE Bugs_2008 (bug_id SERIAL . . . ); ... same values allocated in multiple tables 9 Monday, April 20, 2009 9
  • 10. Metadata Tribbles • Difficult to query across tables SELECT b.status, COUNT(*) AS count_per_status FROM ( SELECT * FROM Bugs_2009 UNION SELECT * FROM Bugs_2008 UNION SELECT * FROM Bugs_2007 UNION SELECT * FROM Bugs_2006 ) AS b GROUP BY b.status; 10 Monday, April 20, 2009 10
  • 11. Metadata Tribbles • Table structures are not kept in sync ALTER TABLE Bugs_2009 ADD COLUMN hours NUMERIC; • Prior tables don’t contain new column • Dissimilar tables can’t be combined with UNION 11 Monday, April 20, 2009 11
  • 12. Metadata Tribbles • Solution #1: use horizontal partitioning • Physically split, while logically whole • MySQL 5.1 supports partitioning BUGS (2007) BUGS BUGS (2008) BUGS (2009) 12 Monday, April 20, 2009 12
  • 13. Metadata Tribbles • Solution #2: use vertical partitioning • Move bulky and seldom-used columns to a second table in one-to-one relationship ( 1 .. 1 ) PRODUCTS INSTALLERS 13 Monday, April 20, 2009 13
  • 14. Metadata Tribbles • Columns can also be tribbles: CREATE TABLE Bugs ( bug_id SERIAL PRIMARY KEY, ... product_id1 BIGINT, product_id2 BIGINT, product_id3 BIGINT ); 14 Monday, April 20, 2009 14
  • 15. Metadata Tribbles • Solution #3: add a dependent table CREATE TABLE BugsProducts ( bug_id BIGINT REFERENCES bugs, product_id BIGINT REFERENCES products, PRIMARY KEY (bug_id, product_id) ); (1 .. *) BUGS BUGS PRODUCTS 15 Monday, April 20, 2009 15
  • 16. Entity-Attribute-Value If you try and take a cat apart to see how it works, the first thing you have on your hands is a non-working cat. — Richard Dawkins 16 Monday, April 20, 2009 16
  • 17. Entity-Attribute-Value • Objective: make a table with a variable set of attributes bug_id bug_type priority description severity sponsor crashes when loss of 1234 BUG high saving functionality 3456 FEATURE low support XML Acme Corp. 17 Monday, April 20, 2009 17
  • 18. Entity-Attribute-Value • Antipattern: store all attributes in a second table, one attribute per row CREATE TABLE eav ( bug_id BIGINT NOT NULL, attr_name VARCHAR(20) NOT NULL, attr_value VARCHAR(100), PRIMARY KEY (bug_id, attr_name), FOREIGN KEY (bug_id) REFERENCES Bugs(bug_id) ); mixing data with metadata 18 Monday, April 20, 2009 18
  • 19. Entity-Attribute-Value bug_id attr_name attr_value 1234 priority high 1234 description crashes when saving 1234 severity loss of functionality 3456 priority low 3456 description support XML 3456 sponsor Acme Corp. 19 Monday, April 20, 2009 19
  • 20. Entity-Attribute-Value • Difficult to rely on attribute names bug_id attr_name attr_value 1234 created 2008-04-01 3456 created_date 2008-04-01 20 Monday, April 20, 2009 20
  • 21. Entity-Attribute-Value • Difficult to enforce data type integrity bug_id attr_name attr_value 1234 created_date 2008-02-31 3456 created_date banana 21 Monday, April 20, 2009 21
  • 22. Entity-Attribute-Value • Difficult to enforce mandatory attributes (i.e. NOT NULL) • SQL constraints apply to columns, not rows • No way to declare that a row must exist with a certain attr_name value (‘created_date’) • Maybe create a trigger on INSERT for bugs? 22 Monday, April 20, 2009 22
  • 23. Entity-Attribute-Value • Difficult to enforce referential integrity for attribute values bug_id attr_name attr_value 1234 priority new 3456 priority fixed 5678 priority banana • Constraints apply to all rows in the column, not selected rows depending on value in attr_name 23 Monday, April 20, 2009 23
  • 24. Entity-Attribute-Value • Difficult to reconstruct a row of attributes: SELECT b.bug_id, need one JOIN e1.attr_value AS created_date, per attribute e2.attr_value AS priority, e3.attr_value AS description, e4.attr_value AS status, e5.attr_value AS reported_by FROM Bugs b LEFT JOIN eav e1 ON (b.bug_id = e1.bug_id AND e1.attr_name = ‘created_date’) LEFT JOIN eav e2 ON (b.bug_id = e2.bug_id AND e2.attr_name = ‘priority’) LEFT JOIN eav e3 ON (b.bug_id = e3.bug_id AND e3.attr_name = ‘description’) LEFT JOIN eav e4 ON (b.bug_id = e4.bug_id AND e4.attr_name = ‘status’) LEFT JOIN eav e5 ON (b.bug_id = e5.bug_id AND e5.attr_name = ‘reported_by’); bug_id created_date priority description status reported_by 1234 2008-04-01 high Crashes when I save. NEW Bill 24 Monday, April 20, 2009 24
  • 25. Entity-Attribute-Value • Solution: use metadata for metadata • Define attributes in columns • ALTER TABLE to add attribute columns • Define related tables for related types 25 Monday, April 20, 2009 25
  • 26. Entity-Attribute-Value • Solution #1: Single Table Inheritance • One table with many columns • Columns are NULL when inapplicable CREATE TABLE Issues ( issue_id SERIAL PRIMARY KEY, created_date DATE NOT NULL, priority VARCHAR(20), description TEXT, issue_type CHAR(1) CHECK (issue_type IN (‘B’, ‘F’)), bug_severity VARCHAR(20), feature_sponsor VARCHAR(100) ); 26 Monday, April 20, 2009 26
  • 27. Entity-Attribute-Value • Solution #2: Concrete Table Inheritance • Define similar tables for similar types • Duplicate common columns in each table CREATE TABLE Bugs ( CREATE TABLE Features ( bug_id SERIAL PRIMARY KEY, bug_id SERIAL PRIMARY KEY, created_date DATE NOT NULL, created_date DATE NOT NULL, priority VARCHAR(20), priority VARCHAR(20), description TEXT, description TEXT, severity VARCHAR(20) sponsor VARCHAR(100) ); ); 27 Monday, April 20, 2009 27
  • 28. Entity-Attribute-Value • Solution #2: Concrete Table Inheritance • Use UNION to search both tables: SELECT * FROM ( SELECT issue_id, description FROM Bugs UNION ALL SELECT issue_id, description FROM Features ) unified_table WHERE description LIKE ... 28 Monday, April 20, 2009 28
  • 29. Entity-Attribute-Value • Solution #3: Class Table Inheritance • Common columns in base table • Subtype-specific columns in subtype tables CREATE TABLE Bugs ( CREATE TABLE Features ( issue_id BIGINT PRIMARY KEY, issue_id BIGINT PRIMARY KEY, severity VARCHAR(20), sponsor VARCHAR(100), FOREIGN KEY (issue_id) FOREIGN KEY (issue_id) REFERENCES Issues (issue_id) REFERENCES Issues (issue_id) ); ); CREATE TABLE Issues ( issue_id SERIAL PRIMARY KEY, created_date DATE NOT NULL priority VARCHAR(20), description TEXT ); 29 Monday, April 20, 2009 29
  • 30. Entity-Attribute-Value • Solution #3: Class Table Inheritance • Easy to query common columns: SELECT * FROM Issues WHERE description LIKE ... • Easy to query one subtype at a time: SELECT * FROM Issues JOIN Bugs USING (issue_id); 30 Monday, April 20, 2009 30
  • 31. Entity-Attribute-Value • Appropriate usage of EAV: • If attributes must be fully flexible and dynamic • You must enforce constraints in application code • Don’t try to fetch one object in a single row • Consider non-relational solutions for semi-structured data, e.g. RDF/XML 31 Monday, April 20, 2009 31
  • 32. Polymorphic Associations Of course, some people do go both ways. — The Scarecrow 32 Monday, April 20, 2009 32
  • 33. Polymorphic Assocations • Objective: reference multiple parents BUGS COMMENTS FEATURES 33 Monday, April 20, 2009 33
  • 34. Polymorphic Assocations • Can’t make a FOREIGN KEY constraint reference two tables: CREATE TABLE Comments ( comment_id SERIAL PRIMARY KEY, comment TEXT NOT NULL, issue_type VARCHAR(15) CHECK (issue_type IN (‘Bugs’, ‘Features’)), issue_id BIGINT NOT NULL, FOREIGN KEY issue_id REFERENCES ); you need this to be Bugs or Features 34 Monday, April 20, 2009 34
  • 35. Polymorphic Assocations • Instead, you have to define table with no FOREIGN KEY or referential integrity: CREATE TABLE Comments ( comment_id SERIAL PRIMARY KEY, comment TEXT NOT NULL, issue_type VARCHAR(15) CHECK (issue_type IN (‘Bugs’, ‘Features’)), issue_id BIGINT NOT NULL ); 35 Monday, April 20, 2009 35
  • 36. Polymorphic Assocations comment comment issue_type issue_id id issue_id issue_id 6789 “It crashes” Bugs 1234 ... ... 9876 “Great idea!” Features 2345 1234 2345 Comments Bugs Features Query result: c. b. f. comment_id comment issue_type issue_id issue_id issue_id 6789 “It crashes” Bug 1234 1234 NULL “Great 9876 Feature 2345 NULL 2345 idea!” 36 Monday, April 20, 2009 36
  • 37. Polymorphic Assocations • You can’t use a different table for each row. You must name all tables explicitly. SELECT * FROM Comments JOIN USING (issue_id); you need this to be Bugs or Features 37 Monday, April 20, 2009 37
  • 38. Polymorphic Assocations • Instead, join to each parent table: SELECT * FROM Comments c LEFT JOIN Bugs b ON (c.issue_type = ‘Bugs’ AND c.issue_id = b.issue_id) LEFT JOIN Features f ON (c.issue_type = ‘Features’ AND c.issue_id = f.issue_id); you have to get these strings right 38 Monday, April 20, 2009 38
  • 39. Polymorphic Assocations • Solution #1: exclusive arcs CREATE TABLE Comments ( comment_id SERIAL PRIMARY KEY, comment TEXT NOT NULL, bug_id BIGINT, both columns are nullable; feature_id BIGINT, exactly one must be non-null FOREIGN KEY bug_id REFERENCES Bugs(bug_id) FOREIGN KEY feature_id REFERENCES Features(feature_id) ); 39 Monday, April 20, 2009 39
  • 40. Polymorphic Assocations • Solution #1: exclusive arcs • Referential integrity is enforced • But hard to make sure exactly one is non-null • Queries are easier: SELECT * FROM Comments c LEFT JOIN Bugs b USING (bug_id) LEFT JOIN Features f USING (feature_id); 40 Monday, April 20, 2009 40
  • 41. Polymorphic Assocations • Solution #2: reverse the relationship BUGS BUGS COMMENTS COMMENTS FEATURES FEATURES COMMENTS 41 Monday, April 20, 2009 41
  • 42. Polymorphic Assocations • Solution #2: reverse the relationship CREATE TABLE BugsComments ( comment_id BIGINT NOT NULL, bug_id BIGINT NOT NULL, PRIMARY KEY (comment_id), FOREIGN KEY (comment_id) REFERENCES Comments(comment_id), FOREIGN KEY (bug_id) REFERENCES Bugs(bug_id) ); CREATE TABLE FeaturesComments ( comment_id BIGINT NOT NULL, feature_id BIGINT NOT NULL, PRIMARY KEY (comment_id), FOREIGN KEY (comment_id) REFERENCES Comments(comment_id), FOREIGN KEY (feature_id) REFERENCES Features(feature_id) ); 42 Monday, April 20, 2009 42
  • 43. Polymorphic Assocations • Solution #2: reverse the relationship • Referential integrity is enforced • Query comments for a given bug: SELECT * FROM BugsComments b JOIN Comments c USING (comment_id) WHERE b.bug_id = 1234; • Query bug/feature for a given comment: SELECT * FROM Comments LEFT JOIN (BugsComments JOIN Bugs USING (bug_id)) USING (comment_id) LEFT JOIN (FeaturesComments JOIN Features USING (feature_id)) USING (comment_id) WHERE comment_id = 9876; 43 Monday, April 20, 2009 43
  • 44. Polymorphic Assocations • Solution #3: use a base parent table BUGS ISSUES FEATURES COMMENTS 44 Monday, April 20, 2009 44
  • 45. Polymorphic Assocations • Solution #3: use a base parent table CREATE TABLE Issues ( issue_id SERIAL PRIMARY KEY ); CREATE TABLE Bugs ( issue_id BIGINT PRIMARY KEY, ... FOREIGN KEY (issue_id) REFERENCES Issues(issue_id) ); CREATE TABLE Comments ( comment_id SERIAL PRIMARY KEY, comment TEXT NOT NULL, issue_id BIGINT NOT NULL, FOREIGN KEY (issue_id) REFRENCES Issues(issue_id) ); 45 Monday, April 20, 2009 45
  • 46. Polymorphic Assocations • Solution #3: use a base parent table • Referential integrity is enforced • Queries are easier: SELECT * FROM Comments JOIN Issues USING (issue_id) LEFT JOIN Bugs USING (issue_id) LEFT JOIN Features USING (issue_id); 46 Monday, April 20, 2009 46
  • 47. Polymorphic Assocations • Enforcing disjoint subtypes: CREATE TABLE Issues ( issue_id SERIAL PRIMARY KEY, issue_type CHAR(1) NOT NULL CHECK (issue_type IN (‘B’, ‘F’)), UNIQUE KEY (issue_id, issue_type) ); referential integrity CREATE TABLE Bugs ( issue_id BIGINT PRIMARY KEY, issue_type CHAR(1) NOT NULL CHECK (issue_type = ‘B’), ... FOREIGN KEY (issue_id, issue_type) REFERENCES Issues(issue_id, issue_type) ); 47 Monday, April 20, 2009 47
  • 48. Naive Trees 48 Monday, April 20, 2009 48
  • 49. Naive Trees • Objective: store/query hierarchical data • Categories/subcategories • Bill of materials • Threaded discussions 49 Monday, April 20, 2009 49
  • 50. Naive Trees (1) Fran: What’s the cause of this bug? (4) Kukla: (2) Ollie: We need to I think it’s a null check valid input. pointer. (3) Fran: (6) Fran: (5) Ollie: No, I checked for Yes, please add a Yes, that’s a bug. that. check. (7) Kukla: That fixed it. 50 Monday, April 20, 2009 50
  • 51. Naive Trees • Adjacency List • Naive solution nearly everyone uses • Each entry in the tree knows immediate parent comment_id parent_id author comment 1 NULL Fran What’s the cause of this bug? 2 1 Ollie I think it’s a null pointer. 3 2 Fran No, I checked for that. 4 1 Kukla We need to check valid input. 5 4 Ollie Yes, that’s a bug. 6 4 Fran Yes, please add a check 7 6 Kukla That fixed it. 51 Monday, April 20, 2009 51
  • 52. Naive Trees • Adjacency List • Easy to inserting a new comment: INSERT INTO Comments (parent_id, author, comment) VALUES (7, ‘Kukla’, ‘Thanks!’); • Easy to move a subtree to a new position: UPDATE Comments SET parent_id = 3 WHERE comment_id = 6; 52 Monday, April 20, 2009 52
  • 53. Naive Trees • Adjacency List • Querying a node’s immediate children is easy: SELECT * FROM Comments c1 LEFT JOIN Comments c2 ON (c2.parent_id = c1.comment_id); • Querying a node’s immediate parent is easy: SELECT * FROM Comments c1 JOIN Comments c2 ON (c1.parent_id = c2.comment_id); 53 Monday, April 20, 2009 53
  • 54. Naive Trees • Adjacency List • Hard to query all descendants in a deep tree: SELECT * FROM Comments c1 LEFT JOIN Comments c2 ON (c2.parent_id = c1.comment_id) LEFT JOIN Comments c3 ON (c3.parent_id = c2.comment_id) LEFT JOIN Comments c4 ON (c4.parent_id = c3.comment_id) LEFT JOIN Comments c5 ON (c5.parent_id = c4.comment_id) LEFT JOIN Comments c6 ON (c6.parent_id = c5.comment_id) LEFT JOIN Comments c7 ON (c7.parent_id = c6.comment_id) LEFT JOIN Comments c8 ON (c8.parent_id = c7.comment_id) LEFT JOIN Comments c9 ON (c9.parent_id = c8.comment_id) LEFT JOIN Comments c10 ON (c10.parent_id = c9.comment_id) ... it still doesn’t support unlimited depth! 54 Monday, April 20, 2009 54
  • 55. Naive Trees • Solution #1: Path Enumeration • Store chain of ancestors as a string in each node • Good for breadcrumbs, or sorting by hierarchy comment_id path author comment 1 1/ Fran What’s the cause of this bug? 2 1/2/ Ollie I think it’s a null pointer. 3 1/2/3/ Fran No, I checked for that. 4 1/4/ Kukla We need to check valid input. 5 1/4/5/ Ollie Yes, that’s a bug. 6 1/4/6/ Fran Yes, please add a check 7 1/4/6/7/ Kukla That fixed it. 55 Monday, April 20, 2009 55
  • 56. Naive Trees • Solution #1: Path Enumeration • Easy to query all ancestors of comment #7: SELECT * FROM Comments WHERE ‘1/4/6/7/’ LIKE path || ‘%’; • Easy to query all descendants of comment #4: SELECT * FROM Comments WHERE path LIKE ‘1/4/%’; 56 Monday, April 20, 2009 56
  • 57. Naive Trees • Solution #1: Path Enumeration • Easy to add child of comment 7: INSERT INTO Comments (author, comment) VALUES (‘Ollie’, ‘Good job!’); SELECT path FROM Comments WHERE comment_id = 7; UPDATE Comments SET path = $parent_path || LAST_INSERT_ID() || ‘/’ WHERE comment_id = LAST_INSERT_ID(); 57 Monday, April 20, 2009 57
  • 58. Naive Trees • Solution #2: Nested Sets • Each comment encodes its descendants using two numbers: • A comment’s right number is less than all the numbers used by the comment’s descendants. • A comment’s left number is greater than all the numbers used by the comment’s descendants. 58 Monday, April 20, 2009 58
  • 59. Naive Trees • Solution #2: Nested Sets (1) Fran: What’s the cause of this bug? 1 14 (2) Ollie: (4) Kukla: I think it’s a null We need to check pointer. valid input. 2 5 6 13 (3) Fran: (6) Fran: (5) Ollie: No, I checked for Yes, please add a Yes, that’s a bug. that. check. 3 4 7 8 9 12 (7) Kukla: That fixed it. 10 11 59 Monday, April 20, 2009 59
  • 60. Naive Trees • Solution #2: Nested Sets comment_id nsleft nsright author comment 1 1 14 Fran What’s the cause of this bug? 2 2 5 Ollie I think it’s a null pointer. 3 3 4 Fran No, I checked for that. 4 6 13 Kukla We need to check valid input. 5 7 8 Ollie Yes, that’s a bug. 6 9 12 Fran Yes, please add a check 7 10 11 Kukla That fixed it. these are not foreign keys 60 Monday, April 20, 2009 60
  • 61. Naive Trees • Solution #2: Nested Sets • Easy to query all ancestors of comment #7: SELECT * FROM Comments child JOIN Comments ancestor ON (child.left BETWEEN ancestor.nsleft AND ancestor.nsright) WHERE child.comment_id = 7; 61 Monday, April 20, 2009 61
  • 62. Naive Trees • Solution #2: Nested Sets ancestors (1) Fran: What’s the cause of this bug? 1 14 (2) Ollie: (4) Kukla: I think it’s a null We need to check pointer. valid input. child 2 5 6 13 (3) Fran: (6) Fran: (5) Ollie: No, I checked for Yes, please add a Yes, that’s a bug. that. check. 3 4 7 8 9 12 (7) Kukla: That fixed it. 10 11 62 Monday, April 20, 2009 62
  • 63. Naive Trees • Solution #2: Nested Sets • Easy to query all descendants of comment #4: SELECT * FROM Comments parent JOIN Comments descendant ON (descendant.left BETWEEN parent.nsleft AND parent.nsright) WHERE parent.comment_id = 4; 63 Monday, April 20, 2009 63
  • 64. Naive Trees • Solution #2: Nested Sets parent (1) Fran: What’s the cause of this bug? 1 14 descendants (2) Ollie: (4) Kukla: I think it’s a null We need to check pointer. valid input. 2 5 6 13 (3) Fran: (6) Fran: (5) Ollie: No, I checked for Yes, please add a Yes, that’s a bug. that. check. 3 4 7 8 9 12 (7) Kukla: That fixed it. 10 11 64 Monday, April 20, 2009 64
  • 65. Naive Trees • Solution #2: Nested Sets • Hard to insert a new child of comment #5: UPDATE Comment SET nsleft = CASE WHEN nsleft >= 8 THEN nsleft+2 ELSE nsleft END, nsright = nsright+2 WHERE nsright >= 7; INSERT INTO Comment (nsleft, nsright, author, comment) VALUES (8, 9, 'Fran', 'I agree!'); • Recalculate left values for all nodes to the right of the new child. Recalculate right values for all nodes above and to the right. 65 Monday, April 20, 2009 65
  • 66. Naive Trees • Solution #2: Nested Sets (1) Fran: What’s the cause of this bug? 1 16 14 (2) Ollie: (4) Kukla: I think it’s a null We need to check pointer. valid input. 2 5 6 15 13 (3) Fran: (6) Fran: (5) Ollie: No, I checked for Yes, please add a Yes, that’s a bug. that. check. 3 4 7 89 10 11 14 12 (8) Fran: (7) Kukla: I agree! That fixed it. 8 9 10 12 13 11 66 Monday, April 20, 2009 66
  • 67. Naive Trees • Solution #2: Nested Sets • Hard to query the parent of comment #6: SELECT parent.* FROM Comments AS c JOIN Comments AS parent ON (c.nsleft BETWEEN parent.nsleft AND parent.nsright) LEFT OUTER JOIN Comments AS in_between ON (c.nsleft BETWEEN in_between.nsleft AND in_between.nsright AND in_between.nsleft BETWEEN parent.nsleft AND parent.nsright) WHERE c.comment_id = 6 AND in_between.comment_id IS NULL; • Parent of #6 is an ancestor who has no descendant who is also an ancestor of #6. • Querying a child is a similar problem. 67 Monday, April 20, 2009 67
  • 68. Naive Trees • Solution #3: Closure Table • Store every path from ancestors to descendants • Requires an additional table: CREATE TABLE TreePaths ( ancestor BIGINT NOT NULL, descendant BIGINT NOT NULL, PRIMARY KEY (ancestor, descendant), FOREIGN KEY(ancestor) REFERENCES Comments(comment_id), FOREIGN KEY(descendant) REFERENCES Comments(comment_id), ); 68 Monday, April 20, 2009 68
  • 69. Naive Trees • Solution #3: Closure Table (1) Fran: What’s the cause of this bug? (2) Ollie: (4) Kukla: I think it’s a null We need to check pointer. valid input. (3) Fran: (6) Fran: (5) Ollie: No, I checked for Yes, please add a Yes, that’s a bug. that. check. (7) Kukla: That fixed it. 69 Monday, April 20, 2009 69
  • 70. Naive Trees • Solution #3: ancestor descendant Closure Table 1 1 1 2 comment_id author comment 1 3 1 Fran What’s the cause of this bug? 1 4 2 Ollie I think it’s a null pointer. 1 5 3 Fran No, I checked for that. 1 6 4 Kukla We need to check valid input. 1 7 5 Ollie Yes, that’s a bug. 2 2 6 Fran Yes, please add a check 2 3 7 Kukla That fixed it. 3 3 4 4 requires O(n²) 4 5 rows at most 4 6 4 7 but far fewer 5 5 in practice 6 6 6 7 7 7 70 Monday, April 20, 2009 70
  • 71. Naive Trees • Solution #3: Closure Table • Easy to query descendants of comment #4: SELECT c.* FROM Comments c JOIN TreePaths t ON (c.comment_id = t.descendant) WHERE t.ancestor = 4; 71 Monday, April 20, 2009 71
  • 72. Naive Trees • Solution #3: Closure Table • Easy to query ancestors of comment #6: SELECT c.* FROM Comments c JOIN TreePaths t ON (c.comment_id = t.ancestor) WHERE t.descendant = 6; 72 Monday, April 20, 2009 72
  • 73. Naive Trees • Solution #3: Closure Table • Easy to insert a new child of comment #5: generates comment #8 INSERT INTO Comments ... INSERT INTO TreePaths (ancestor, descendant) VALUES (8, 8); INSERT INTO TreePaths (ancestor, descendant) SELECT ancestor, 8 FROM TreePaths WHERE descendant = 5; 73 Monday, April 20, 2009 73
  • 74. Naive Trees • Solution #3: Closure Table • Easy to delete a child comment #7: DELETE FROM TreePaths WHERE descendant = 7; • Even easier with ON DELETE CASCADE 74 Monday, April 20, 2009 74
  • 75. Naive Trees • Solution #3: Closure Table • Easy to delete the subtree under comment #4: DELETE FROM TreePaths WHERE descendant IN (SELECT descendant FROM TreePaths WHERE ancestor = 4); • For MySQL, use multi-table DELETE: DELETE p FROM TreePaths p JOIN TreePaths a USING (descendant) WHERE a.ancestor = 4; 75 Monday, April 20, 2009 75
  • 76. Naive Trees • Solution #3: ancestor descendant depth Closure Table 1 1 0 1 2 1 • Add a depth column to make it 1 3 2 easier to query immediate parent 1 4 1 1 5 2 or child: 1 6 2 1 7 3 SELECT c.* FROM Comments c 2 2 0 JOIN TreePaths t 2 3 1 ON (c.comment_id = t.descendant) 3 3 0 4 4 0 WHERE t.ancestor = 4 4 5 1 AND t.depth = 1; 4 6 1 4 7 2 5 5 0 6 6 0 6 7 1 7 7 0 76 Monday, April 20, 2009 76
  • 77. Naive Trees • Summary of Designs: Number of Query Query Modify Referential Design Tables Child Subtree Tree Integrity Adjacency List 1 Easy Hard Easy Yes Path 1 Easy Easy Hard No Enumeration Nested Sets 1 Hard Easy Hard No Closure Table 2 Easy Easy Easy Yes 77 Monday, April 20, 2009 77
  • 78. Antipattern Categories Database Design Database Creation Antipatterns Antipatterns CREATE TABLE BugsProducts ( bug_id INTEGER REFERENCES Bugs, product VARCHAR(100) REFERENCES Products, PRIMARY KEY (bug_id, product) ); Query Application Antipatterns Antipatterns SELECT b.product, COUNT(*) $dbHandle = new PDO(‘mysql:dbname=test’); FROM BugsProducts AS b $stmt = $dbHandle->prepare($sql); GROUP BY b.product; $result = $stmt->fetchAll(); 78 Monday, April 20, 2009 78
  • 79. Database Creation Antipatterns 5. ENUM Antipattern 6. Rounding Errors 7. Indexes Are Magical 79 Monday, April 20, 2009 79
  • 80. ENUM Antipattern 80 Monday, April 20, 2009 80
  • 81. ENUM Antipattern • Objective: restrict a column to a fixed set of values OK INSERT INTO bugs (status) VALUES (‘new’) INSERT INTO bugs (status) VALUES (‘banana’) FAIL 81 Monday, April 20, 2009 81
  • 82. ENUM Antipattern • Antipattern: use ENUM data type, when the set of values may change CREATE TABLE Bugs ( ... status ENUM(‘new’, ‘open’, ‘fixed’) ); 82 Monday, April 20, 2009 82
  • 83. ENUM Antipattern • Changing the set of values is a metadata alteration • You must know the current set of values ALTER TABLE Bugs MODIFY COLUMN status ENUM(‘new’, ‘open’, ‘fixed’, ‘duplicate’); 83 Monday, April 20, 2009 83
  • 84. ENUM Antipattern • Difficult to get a list of possible values SELECT column_type FROM information_schema.columns WHERE table_schema = ‘bugtracker_schema’ AND table_name = ‘Bugs’ AND column_name = ‘status’; • Returns a LONGTEXT you must parse: “ENUM(‘new’, ‘open’, ‘fixed’)” 84 Monday, April 20, 2009 84
  • 85. ENUM Antipattern • Solution: use ENUM only if values are set in stone CREATE TABLE Bugs ( ... bug_type ENUM(‘defect’, ‘feature’) ); 85 Monday, April 20, 2009 85
  • 86. ENUM Antipattern • Use a lookup table if values may change CREATE TABLE BugStatus ( status VARCHAR(10) PRIMARY KEY ); INSERT INTO BugStatus (status) VALUES (‘NEW’), (‘OPEN’), (‘FIXED’); BUGS BUGSTATUS 86 Monday, April 20, 2009 86
  • 87. ENUM Antipattern • Adding/removing a value is a data operation, not a metadata operation • You don’t need to know the current values INSERT INTO BugStatus (status) VALUES (‘DUPLICATE’); 87 Monday, April 20, 2009 87
  • 88. ENUM Antipattern • Use an attribute to retire values, not DELETE CREATE TABLE BugStatus ( status VARCHAR(10) PRIMARY KEY, active TINYINT NOT NULL DEFAULT 1 ); UPDATE BugStatus SET active = 0 WHERE status = ‘DUPLICATE’; 88 Monday, April 20, 2009 88
  • 89. Rounding Errors 10.0 times 0.1 is hardly ever 1.0. — Brian Kernighan 89 Monday, April 20, 2009 89
  • 90. Rounding Errors • Objective: store real numbers exactly • Especially money • Work estimate hours 90 Monday, April 20, 2009 90
  • 91. Rounding Errors • Antipattern: use FLOAT data type ALTER TABLE Bugs ADD COLUMN hours FLOAT; INSERT INTO Bugs (bug_id, hours) VALUES (1234, 3.3); 91 Monday, April 20, 2009 91
  • 92. Rounding Errors • FLOAT is inexact SELECT hours FROM Bugs WHERE bug_id = 1234; ‣ 3.3 SELECT hours * 1000000000 FROM Bugs WHERE bug_id = 1234; ‣ 3299999952.3163 92 Monday, April 20, 2009 92
  • 93. Rounding Errors • Inexact decimals assuming infinite precision • 1/3 + 1/3 + 1/3 = 1.0 • 0.333 + 0.333 + 0.333 = 0.999 finite precision 93 Monday, April 20, 2009 93
  • 94. Rounding Errors • IEEE 754 standard for representing floating-point numbers in base-2 • Some numbers round off, aren’t stored exactly • Comparisons to original value fail SELECT * FROM Bugs WHERE hours = 3.3; comparison fails 94 Monday, April 20, 2009 94
  • 95. Rounding Errors • Solution: use NUMERIC data type ALTER TABLE Bugs ADD COLUMN hours NUMERIC(9,2) INSERT INTO bugs (bug_id, hours) VALUES (1234, 3.3); SELECT * FROM Bugs comparison WHERE hours = 3.3; succeeds 95 Monday, April 20, 2009 95
  • 96. Indexes are Magical Whenever any result is sought, the question will then arise — by what course of calculation can these results be arrived at by the machine in the shortest time? — Charles Babbage 96 Monday, April 20, 2009 96
  • 97. Indexes are Magical • Objective: execute queries with optimal performance 97 Monday, April 20, 2009 97
  • 98. Indexes are Magical • Antipatterns: • Creating indexes blindly • Executing non-indexable queries • Rejecting indexes because of their overhead 98 Monday, April 20, 2009 98
  • 99. Indexes are Magical • Creating indexes blindly: CREATE TABLE Bugs ( bug_id SERIAL PRIMARY KEY, date_reported DATE NOT NULL, summary VARCHAR(80) NOT NULL, status VARCHAR(10) NOT NULL, redundant index hours NUMERIC(9,2), bulky index INDEX (bug_id), INDEX (summary), unnecessary index INDEX (hours), unnecessary INDEX (bug_id, date_reported, status) covering index ); 99 Monday, April 20, 2009 99
  • 100. Indexes are Magical • Executing non-indexable queries: • SELECT * FROM Bugs WHERE description LIKE ‘%crash%’; non-leftmost • string match SELECT * FROM Bugs WHERE MONTH(date_reported) = 4; function applied to column • SELECT * FROM Bugs WHERE last_name = “...” OR first_name = “...”; • no index spans SELECT * FROM Accounts all rows ORDER BY first_name, last_name; non-leftmost composite key match 100 Monday, April 20, 2009 100
  • 101. Indexes are Magical • Telephone book analogy • Easy to search for Dean Thomas: uses index to match SELECT * FROM TelephoneBook WHERE full_name LIKE ‘Thomas, %’; • Hard to search for Thomas Riddle: requires full table scan SELECT * FROM TelephoneBook WHERE full_name LIKE ‘%, Thomas’; 101 Monday, April 20, 2009 101
  • 102. Indexes are Magical • Rejecting indexes because of their overhead: Query w/ Index: O(log n) 80 Update Index: O(log n) Query w/o Index: O(n) 60 the benefit quickly 40 justifies the overhead 20 0 0 100 200 300 400 500 600 700 800 9001000 102 Monday, April 20, 2009 102
  • 103. Indexes are Magical • Solution: “MENTOR” your indexes Measure Explain Nominate Test Optimize Repair 103 Monday, April 20, 2009 103
  • 104. Indexes are Magical • Solution: “MENTOR” your indexes • Profile your application. Measure • Focus on the most costly Explain SQL queries: Nominate • Longest-running. • Most frequently run. Test • Blockers, lockers, and Optimize deadlocks. Repair 104 Monday, April 20, 2009 104
  • 105. Indexes are Magical • Solution: “MENTOR” your indexes • Analyze the optimization Measure plan of costly queries, Explain e.g. MySQL’s EXPLAIN • Identify tables that aren’t Nominate using indexes: Test • Temporary table Optimize • Filesort Repair 105 Monday, April 20, 2009 105
  • 106. Indexes are Magical • Solution: “MENTOR” your indexes • Could an index improve Measure access to these tables? Explain • ORDER BY criteria Nominate • MIN() / MAX() • WHERE conditions Test • Which column(s) need Optimize indexes? Repair 106 Monday, April 20, 2009 106
  • 107. Indexes are Magical • Solution: “MENTOR” your indexes • After creating index, Measure measure again. Explain • Confirm the new index Nominate made a difference. • Impress your boss! Test “The new index gave a 27% Optimize performance improvement!” Repair 107 Monday, April 20, 2009 107
  • 108. Indexes are Magical • Solution: “MENTOR” your indexes • Indexes are compact, Measure frequently-used data. Explain • Try to cache indexes in Nominate memory: • MyISAM: key_buffer_size, Test LOAD INDEX INTO CACHE • InnoDB: innodb_buffer_pool_size Optimize Repair 108 Monday, April 20, 2009 108
  • 109. Indexes are Magical • Solution: “MENTOR” your indexes • Indexes require periodic Measure maintenance. Explain • Like a filesystem requires Nominate periodic defragmentation. • Analyze or rebuild Test indexes, e.g. in MySQL: Optimize • ANALYZE TABLE • OPTIMIZE TABLE Repair 109 Monday, April 20, 2009 109
  • 110. Indexes are Magical • Solution: “MENTOR” your indexes • Sounds like the name BY THE AUTHOR OF THE BESTSELLER WHY DOESN’T MY CACHE GET ANY HITS? of a “self-help” book! William K. Karwin MENTOR YOUR INDEXES How to Break the just kidding! Bad Performance Habits please don’t ask That Make when it’s coming out! You Miserable 110 Monday, April 20, 2009 110
  • 111. Antipattern Categories Database Design Database Creation Antipatterns Antipatterns CREATE TABLE BugsProducts ( bug_id INTEGER REFERENCES Bugs, product VARCHAR(100) REFERENCES Products, PRIMARY KEY (bug_id, product) ); Application Query Antipatterns Antipatterns SELECT b.product, COUNT(*) $dbHandle = new PDO(‘mysql:dbname=test’); FROM BugsProducts AS b $stmt = $dbHandle->prepare($sql); GROUP BY b.product; $result = $stmt->fetchAll(); 111 Monday, April 20, 2009 111
  • 112. Query Antipatterns 8. NULL antipatterns 9. Ambiguous Groups 10. Random Order 11. JOIN antipattern 12. Goldberg Machine 112 Monday, April 20, 2009 112
  • 113. NULL Antipatterns As we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns — the ones we don't know we don't know. — Donald Rumsfeld 113 Monday, April 20, 2009 113
  • 114. NULL Antipatterns • Objective: handle “missing” values, store them as missing, and support them in queries. 114 Monday, April 20, 2009 114
  • 115. NULL Antipatterns • Antipatterns: • Use NULL as an ordinary value • Use an ordinary value as NULL 115 Monday, April 20, 2009 115
  • 116. NULL Antipatterns • Using NULL in most expressions results in an unknown value. NULL is not zero SELECT NULL + 10; SELECT ‘Bill’ || NULL; NULL is not an empty string SELECT FALSE OR NULL; NULL is not FALSE 116 Monday, April 20, 2009 116
  • 117. NULL Antipatterns • The opposite of unknown is still unknown. SELECT * FROM Bugs which query returns bugs WHERE assigned_to = 123; that are not yet assigned? neither query! SELECT * FROM Bugs WHERE NOT (assigned_to = 123); 117 Monday, April 20, 2009 117
  • 118. NULL Antipatterns • Choosing an ordinary value in lieu of NULL: UPDATE Bugs SET assigned_to = -1 WHERE assigned_to IS NULL; assigned_to is a foreign key so this value doesn’t work 118 Monday, April 20, 2009 118
  • 119. NULL Antipatterns • Choosing an ordinary value in lieu of NULL: UPDATE Bugs SET hours = -1 WHERE hours IS NULL; this makes SUM() inaccurate SELECT SUM(hours) FROM Bugs special-case code WHERE status = ‘OPEN’ you were trying to avoid AND hours <> -1; by prohibiting NULL 119 Monday, April 20, 2009 119
  • 120. NULL Antipatterns • Choosing an ordinary value in lieu of NULL: • Any given value may be significant in a column • Every column needs a different value • You need to remember or document the value used for “missing” on a case-by-case basis 120 Monday, April 20, 2009 120
  • 121. NULL Antipatterns • Solution: use NULL appropriately • NULL signifies “missing” or “inapplicable” • Works for every data type • Already standard and well-understood 121 Monday, April 20, 2009 121
  • 122. NULL Antipatterns • Understanding NULL in expressions Expression Expected Actual NULL = 0 TRUE Unknown NULL = 12345 FALSE Unknown NULL <> 12345 TRUE Unknown NULL + 12345 12345 Unknown NULL || ‘string’ string’ Unknown NULL = NULL TRUE Unknown NULL <> NULL FALSE Unknown 122 Monday, April 20, 2009 122
  • 123. NULL Antipatterns • Understanding NULL in boolean expressions Expression Expected Actual NULL AND TRUE FALSE Unknown NULL AND FALSE FALSE FALSE NULL OR FALSE FALSE Unknown NULL OR TRUE TRUE TRUE NOT (NULL) TRUE Unknown 123 Monday, April 20, 2009 123
  • 124. NULL Antipatterns • SQL supports IS NULL predicate that returns true or false, never unknown: SELECT * FROM Bugs WHERE assigned_to IS NULL; SELECT * FROM Bugs WHERE assigned_to IS NOT NULL; 124 Monday, April 20, 2009 124
  • 125. NULL Antipatterns • SQL-99 supports IS DISTINCT FROM predicate that returns true or false: SELECT * FROM Bugs WHERE assigned_to IS DISTINCT FROM 123; SELECT * FROM Bugs WHERE assigned_to IS NOT DISTINCT FROM 123; SELECT * FROM Bugs MySQL operator works like WHERE assigned_to <=> 123; IS NOT DISTINCT FROM 125 Monday, April 20, 2009 125
  • 126. NULL Antipatterns • Change NULL to ordinary value on demand with COALESCE(): SELECT COALESCE( first_name || ‘ ’ || middle_initial || ‘ ’ || last_name, first_name || ‘ ’ || last_name) AS full_name FROM Accounts; • Also called NVL() or ISNULL() in some database brands. 126 Monday, April 20, 2009 126
  • 127. Ambiguous Groups Please accept my resignation. I don’t want to belong to any club that will accept me as a member. — Groucho Marx 127 Monday, April 20, 2009 127