Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
SQL Joinsand Query OptimizationHow it worksby Brian Gallagherwww.BrianGallagher.com         G
WhyWh JOIN? Combine data from multiple tables Reduces the number of queries needed Moves processing burden to database ...
Types of JOINsT      f JOIN   INNER        the most common, return all rows with matching records                       ...
Sample T t DataS   l Test D t   Sample Customer and Job records
INNER JOIN
LEFT (OUTER) JOIN     (     )
One-To-Many LEFT JOIN          y
RIGHT (OUTER) JOIN      (     )
FULL (OUTER) JOIN     (     )
FULL (OUTER) JOIN FULL JOIN is not supported in MS-Access Can be emulated using UNION queries   They    Th are rarely u...
CROSS JOINsThe Join you never use…except every time
Making BigM ki it Bi   A CROSS JOIN combines every row on the left    table with every row in the right table   Resultin...
JoinsJ i each record t th other t bl       h      d to the th table
CROSS JOIN’s resulting set of all        JOIN srecord combinations
How CH   Cross J i M k I          Joins Make Inner J i                           Joins   Tables are joined using a CROSS ...
Using “ON” criteria to select records
One-to-Many CROSS JOIN          y
One to Many: Cross Join to Inner Join
Unpredictable RU    di t bl Record O d                  d Orders   Databases are not required to return records in    any...
IndexesUnique, clustered, etc.
WhatWh t are Indexes         I d Indexes are a pre-sorted list of database  field values This list speeds up queries A L...
Indexes and P fI d       d Performance   Indexes GREATLY speed up queries on the    fields being indexed   Indexes SLIGH...
Types of IndexesT      fI d   Simple Index     Indexesthe field specified     Can have multiple simple indexes     Spe...
Simple IndexesSi l I d   An Index contains a list of all field values    and pointers to the record with that value      ...
How Indexes help SELECTs               p
Composite I dC     it Indexes   Useful only when you will be querying multiple    fields simultaneously   Most selective...
Composite Index   p
Searching I dS    hi Indexes There are high performance algorithms for              high-performance  searching pre-sorte...
Design StrategyMake it workMake it fastMake it pretty        p y
Focus on what’s neededF         h t’     d d   Make your query as specific as possible     Makes  joins simpler (and the...
Maximum S l ti it / SM i     Selectivity Specificity                        ifi it If differents parts of the WHERE claus...
Optimizing QueriesFaster is Better
Horizontal P titi iH i    t l Partitioning Avoid extremely “wide” records (records  with lots of fields)                 ...
Add I d    Indexes   Make sure all fields searched on regularly    are indexed
Define Clustered IndexD fi Cl t d I d The most-likely criteria to be sorted on  should be defined as your clustered index...
Normalize D tN    li Data No redundant data Link related data Don’t go crazy
Denormalize D tD      li Data   If joining more 4 tables or more in a single    q y,    query, consider de-normalizing da...
Size records to fit database pagesize   SQL Server 7.0, 2000, and 2005 data pages are    8K (8192 bytes) in size.     Of...
Use Primary KU aPi       Key   Make sure you have a primary key defined   Ideally, it should be a single unique field   ...
UseU an IDENTITY C l              Column   If there is:     no Primary Key on the table     no unique index on the tabl...
Move TEXT, NTEXT, and IMAGEdata into table These types normally stored outside the  table (uses a pointer to the data in ...
Consider R li ti i advanceC   id Replication in d   If Replication is to be used, factor the    decision into the origina...
Use Built-in Referential I t itU B ilt i R f       ti l Integrity Use foreign keys and validation  constraints built into...
Re-test hR t t when changing servers            h   i   Retest and rebenchmark applications when    moving to a new serve...
Constraints are F tC   t i t       Fast   Constraints on fields are faster than     Triggers     Rules     Defaults
Don’t duplicate ff tD ’t d li t effort   Don’t check for the same thing twice     (duh, but it happens)     Don’t use a...
LimitLi it records t b JOIN d           d to be JOINed   Use WHERE clause to minimize rows to    be JOINed.     Particul...
Index F i KI d Foreign Keys Fields in a table that are a foreign key are  not indexed automatically j                    ...
Minimize DuplicateMi i i D li t JOIN fi ld d t                   field data   JOINs are slower when there are few    diff...
JOIN on unique indexed fi ld          i    i d   d fields   JOINs will perform fastest when joining    indexed fields wit...
JOIN Numeric Fi ld     N    i Fields   JOINs on numeric fields perform much    faster than JOINs on other datatype fields...
JOIN th exact same d t t     the    t      datatypes   JOINs should be to the exact same    datatype for best p         y...
UseU ANSI JOIN syntax               t   Improves readability   Less likely to cause programmer errors   No Aliases:    ...
Use Table AliasesU T bl Ali   Shortens code   Makes it easier to follow, especially with long queries   Identifies whic...
Don’tD ’t use SELECT *   Requires additional parsing on the server to    extract field names   Returns unneccesary data ...
Store i separate fil in filSt    in      t files i filegroup For very large joins placing tables to be  j  joined in sepa...
Don’tD ’t use CROSS JOINs               JOIN Unless actually needed (rarely) do not use  a CROSS JOIN (returns all combin...
JOINsJOIN vs. Subquery         S b   Depending on the specifics, either could be faster. Write    both and test performan...
Avoid Subqueries unless neededA id S b     i     l       d d Avoid subqueries unless actually required Most subqueries c...
Use an Indexed View(Enterprise 2000 and later only) An Indexed View maintains an updated  record of how the tables are jo...
Use Database’s Performance    Database sOptimization Tools Most major databases have methods for  monitoring and optimizi...
AvoidA id DISTINCT when possible               h       ibl   DISTINCT clauses are frequently (and usually    unintentiona...
The End               (for now)Google: Query Optimizationfor lots more information
Upcoming SlideShare
Loading in …5
×

SQL Joins and Query Optimization

23,002 views

Published on

A visual explanation of how to join database tables and tips on how to make your queries run faster, better, stronger.

Published in: Technology
  • Dating direct: ♥♥♥ http://bit.ly/2Qu6Caa ♥♥♥
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • Dating for everyone is here: ♥♥♥ http://bit.ly/2Qu6Caa ♥♥♥
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • DOWNLOAD THIS BOOKS INTO AVAILABLE FORMAT (2019 Update) ......................................................................................................................... ......................................................................................................................... Download Full PDF EBOOK here { https://soo.gd/irt2 } ......................................................................................................................... Download Full EPUB Ebook here { https://soo.gd/irt2 } ......................................................................................................................... Download Full doc Ebook here { https://soo.gd/irt2 } ......................................................................................................................... Download PDF EBOOK here { https://soo.gd/irt2 } ......................................................................................................................... Download EPUB Ebook here { https://soo.gd/irt2 } ......................................................................................................................... Download doc Ebook here { https://soo.gd/irt2 } ......................................................................................................................... ......................................................................................................................... ................................................................................................................................... eBook is an electronic version of a traditional print book THIS can be read by using a personal computer or by using an eBook reader. (An eBook reader can be a software application for use on a computer such as Microsoft's free Reader application, or a book-sized computer THIS is used solely as a reading device such as Nuvomedia's Rocket eBook.) Users can purchase an eBook on diskette or CD, but the most popular method of getting an eBook is to purchase a downloadable file of the eBook (or other reading material) from a Web site (such as Barnes and Noble) to be read from the user's computer or reading device. Generally, an eBook can be downloaded in five minutes or less ......................................................................................................................... .............. Browse by Genre Available eBooks .............................................................................................................................. Art, Biography, Business, Chick Lit, Children's, Christian, Classics, Comics, Contemporary, Cookbooks, Manga, Memoir, Music, Mystery, Non Fiction, Paranormal, Philosophy, Poetry, Psychology, Religion, Romance, Science, Science Fiction, Self Help, Suspense, Spirituality, Sports, Thriller, Travel, Young Adult, Crime, Ebooks, Fantasy, Fiction, Graphic Novels, Historical Fiction, History, Horror, Humor And Comedy, ......................................................................................................................... ......................................................................................................................... .....BEST SELLER FOR EBOOK RECOMMEND............................................................. ......................................................................................................................... Blowout: Corrupted Democracy, Rogue State Russia, and the Richest, Most Destructive Industry on Earth,-- The Ride of a Lifetime: Lessons Learned from 15 Years as CEO of the Walt Disney Company,-- Call Sign Chaos: Learning to Lead,-- StrengthsFinder 2.0,-- Stillness Is the Key,-- She Said: Breaking the Sexual Harassment Story THIS Helped Ignite a Movement,-- Atomic Habits: An Easy & Proven Way to Build Good Habits & Break Bad Ones,-- Everything Is Figureoutable,-- What It Takes: Lessons in the Pursuit of Excellence,-- Rich Dad Poor Dad: What the Rich Teach Their Kids About Money THIS the Poor and Middle Class Do Not!,-- The Total Money Makeover: Classic Edition: A Proven Plan for Financial Fitness,-- Shut Up and Listen!: Hard Business Truths THIS Will Help You Succeed, ......................................................................................................................... .........................................................................................................................
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • Copas Url to Read eBook === http://bestadaododadj.justdied.com/2897674342-echec-les-archer-d-avalon-t2.html
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • Get HERE to Download This eBook === http://bestadaododadj.justdied.com/2890009122-jardins-d-ombre-du-jardinier-paresseux-t1.html
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

SQL Joins and Query Optimization

  1. 1. SQL Joinsand Query OptimizationHow it worksby Brian Gallagherwww.BrianGallagher.com G
  2. 2. WhyWh JOIN? Combine data from multiple tables Reduces the number of queries needed Moves processing burden to database server Improves data integrity (over combining data in d t i program code) d )
  3. 3. Types of JOINsT f JOIN INNER  the most common, return all rows with matching records , g  SELECT * FROM T1 INNER JOIN T2 ON T1.fld1 = T2.fld2 LEFT or LEFT OUTER  return all rows on the left (first) table and right (second)  If no matching record on the right side, NULL-values for each field are returned  SELECT * FROM T1 LEFT OUTER JOIN T2 ON T1 fld1 = T2 fld2 T1.fld1 T2.fld2 FULL or FULL OUTER  return all rows on the left (first) table and right (second)  If no matching record on the left or right side, NULL-values for each field are returned  S SELECT * FROM T1 FULL OU C O U OUTER JO JOIN T2 ON T1.fld1 = T2.fld2 O . d . d CROSS  the least common, return all possible row combinations  SELECT * FROM T1, T2 RIGHT or RIGHT OUTER  Don’t use them without a very good reason.  They do not add any functionality over a LEFT JOIN and make code more confusing  Works the same as the LEFT and LEFT OUTER JOINs, but the second table is the one which all rows are returned from. These two queries are functionally identical:  SELECT * FROM T1 LEFT JOIN T2 ON T1.fld1 = T2.fld2  SELECT * FROM T2 RIGHT JOIN T1 ON T1.fld1 = T2.fld2
  4. 4. Sample T t DataS l Test D t Sample Customer and Job records
  5. 5. INNER JOIN
  6. 6. LEFT (OUTER) JOIN ( )
  7. 7. One-To-Many LEFT JOIN y
  8. 8. RIGHT (OUTER) JOIN ( )
  9. 9. FULL (OUTER) JOIN ( )
  10. 10. FULL (OUTER) JOIN FULL JOIN is not supported in MS-Access Can be emulated using UNION queries They Th are rarely used l d
  11. 11. CROSS JOINsThe Join you never use…except every time
  12. 12. Making BigM ki it Bi A CROSS JOIN combines every row on the left table with every row in the right table Resulting recordset will b th t t l of b th row R lti d t ill be the total f both counts multiplied together  Leftrow has 10 000 records 10,000  Right row has 30,000 records  Resulting recordset has 300,000,000 records If no JOIN condition is specified, a CROSS JOIN will be executed
  13. 13. JoinsJ i each record t th other t bl h d to the th table
  14. 14. CROSS JOIN’s resulting set of all JOIN srecord combinations
  15. 15. How CH Cross J i M k I Joins Make Inner J i Joins Tables are joined using a CROSS JOIN JOIN criteria is evaluated, removing all records which d ’t match th “ON” condition hi h don’t t h the diti SELECT * FROM Customers C INNER JOIN Job J ON C.CustID = J.CustID The remaining rows are the recordset returned for the INNER JOIN
  16. 16. Using “ON” criteria to select records
  17. 17. One-to-Many CROSS JOIN y
  18. 18. One to Many: Cross Join to Inner Join
  19. 19. Unpredictable RU di t bl Record O d d Orders Databases are not required to return records in any particular order Records may be returned by the order they are stored on disk or may be unordered, depending on how the query engine handles data If you need data in a particular order, use an dd t i ti l d ORDER BY clause Some databases may return data presorted by Primary key or Clustered index  DO NOT DEPEND on this behavior  It is not reliably portable across different databases  Do not make a program rely on a behavior not specified by the program – bad programming style
  20. 20. IndexesUnique, clustered, etc.
  21. 21. WhatWh t are Indexes I d Indexes are a pre-sorted list of database field values This list speeds up queries A LOT  An index is much smaller than the full recordset  An index tracks only the fields specified when created
  22. 22. Indexes and P fI d d Performance Indexes GREATLY speed up queries on the fields being indexed Indexes SLIGHTLY slow d I d l down INSERTS INSERTS, UPDATES and DELETES  The indexes need to be updated anytime a record changes  This adds a small bit of overhead to the system The increase in query speed usually greatly offsets the slight extra load on INSERTS, UPDATES and DELETES  Most database use is typically reading (instead of writing)
  23. 23. Types of IndexesT fI d Simple Index  Indexesthe field specified  Can have multiple simple indexes  Speeds up queries using the indexed field Composite Index  Indexescombinations of fields  Can have multiple composite indexes  Speeds up queries using the specified combination of fi ld bi ti f fields
  24. 24. Simple IndexesSi l I d An Index contains a list of all field values and pointers to the record with that value p
  25. 25. How Indexes help SELECTs p
  26. 26. Composite I dC it Indexes Useful only when you will be querying multiple fields simultaneously Most selective (resulting in the fewest records matching) fields should be listed first Slight performance gain over multiple Simple Indexes when used correctly No performance gain (possibly even a performance loss) when used incorrectly.
  27. 27. Composite Index p
  28. 28. Searching I dS hi Indexes There are high performance algorithms for high-performance searching pre-sorted dataI d Index data stored i t d t t d internally as bi ll binary t trees (typically) Searching through binary tree much faster than reading through table
  29. 29. Design StrategyMake it workMake it fastMake it pretty p y
  30. 30. Focus on what’s neededF h t’ d d Make your query as specific as possible  Makes joins simpler (and therefore faster)  Returns no unwanted data  Increases response time  Reduces network and server load Put as much as possible in the WHERE clause J i your fi ld properly Join fields l
  31. 31. Maximum S l ti it / SM i Selectivity Specificity ifi it If differents parts of the WHERE clause will result in smaller data sets than other ones, list them the ones that will most g greatly reduce the data first y List the others in the same order
  32. 32. Optimizing QueriesFaster is Better
  33. 33. Horizontal P titi iH i t l Partitioning Avoid extremely “wide” records (records with lots of fields) ) Split data into two tables with a common ID field All the record data is rarely used in all queries, queries so it reduces traffic and speeds up processing and disk reads
  34. 34. Add I d Indexes Make sure all fields searched on regularly are indexed
  35. 35. Define Clustered IndexD fi Cl t d I d The most-likely criteria to be sorted on should be defined as your clustered index y Clustered index defines the order the records will physically be stored on disk
  36. 36. Normalize D tN li Data No redundant data Link related data Don’t go crazy
  37. 37. Denormalize D tD li Data If joining more 4 tables or more in a single q y, query, consider de-normalizing data g (combining data from external tables back into the main table))  Can return faster query results  Risks include having to manage duplicate data in database and larger disk usage
  38. 38. Size records to fit database pagesize SQL Server 7.0, 2000, and 2005 data pages are 8K (8192 bytes) in size.  Of the th 8192 available b t i each d t page, only il bl bytes in h data l 8060 bytes are used to store a row. The rest is used for overhead.  If you have a row that was 4031 bytes long, then only one row could fit into a data page, and the remaining 4029 bytes left in the page would be empty y p g py  Making each record 1 byte shorter would halve the disk access required to read the table
  39. 39. Use Primary KU aPi Key Make sure you have a primary key defined Ideally, it should be a single unique field  You can define composite keys, but they are generally not needed if the database is designed p p y properly Most tables should have a unique TableNameID field  This allows you to identify any row by a single unique value
  40. 40. UseU an IDENTITY C l Column If there is:  no Primary Key on the table  no unique index on the table Then  Add an IDENTITY (unique value) column to the table  Optionally, index the IDENTITY column if you will query it regularly
  41. 41. Move TEXT, NTEXT, and IMAGEdata into table These types normally stored outside the table (uses a pointer to the data in the ( p field) If these types will be searched frequently frequently, consider moving them into the database table s table’s storage with: sp_tableoption tablename, text in row, on or sp_tableoption tablename, text in row, size
  42. 42. Consider R li ti i advanceC id Replication in d If Replication is to be used, factor the decision into the original design of the g g database
  43. 43. Use Built-in Referential I t itU B ilt i R f ti l Integrity Use foreign keys and validation constraints built into the database Don’t manage it in the application Benefits:  Faster execution  Can’t mess it up with application errors
  44. 44. Re-test hR t t when changing servers h i Retest and rebenchmark applications when moving to a new server, such as:  Development  Staging  Production Problems often caused b P bl ft d by:  More rows in test data on servers “closer” to production  Server configurations different  Tables not indexed the same way
  45. 45. Constraints are F tC t i t Fast Constraints on fields are faster than  Triggers  Rules  Defaults
  46. 46. Don’t duplicate ff tD ’t d li t effort Don’t check for the same thing twice  (duh, but it happens)  Don’t use a trigger and a constraint to do the same thing g  Same for constraints and defaults  Same for constraints and rules
  47. 47. LimitLi it records t b JOIN d d to be JOINed Use WHERE clause to minimize rows to be JOINed.  Particularly in the OUTER table of an OUTER JOIN
  48. 48. Index F i KI d Foreign Keys Fields in a table that are a foreign key are not indexed automatically j y just because they are a foreign key. Add these indexes manually
  49. 49. Minimize DuplicateMi i i D li t JOIN fi ld d t field data JOINs are slower when there are few different keys in the j y joining table g
  50. 50. JOIN on unique indexed fi ld i i d d fields JOINs will perform fastest when joining indexed fields with no duplicate data p
  51. 51. JOIN Numeric Fi ld N i Fields JOINs on numeric fields perform much faster than JOINs on other datatype fields. yp
  52. 52. JOIN th exact same d t t the t datatypes JOINs should be to the exact same datatype for best p yp performance  Same type of field  Same field length  Same encoding (ASCII vs. Unicode, etc)
  53. 53. UseU ANSI JOIN syntax t Improves readability Less likely to cause programmer errors No Aliases: SELECT fname, lname, department FROM names INNER JOIN departments ON names.employeeid names employeeid = departments employeeid departments.employeeid Microsoft syntax example: SELECT fname, lname, department FROM names departments names, WHERE names.employeeid = departments.employeeid Code is more portable between databases
  54. 54. Use Table AliasesU T bl Ali Shortens code Makes it easier to follow, especially with long queries Identifies which table each field is coming from No table aliases: SELECT fname, lname, department FROM names INNER JOIN departments ON names.employeeid = departments.employeeid Microsoft syntax example: SELECT N.fname, N.lname, D.department FROM names N INNER JOIN departments D ON N.employeeid = D.employeeid
  55. 55. Don’tD ’t use SELECT * Requires additional parsing on the server to extract field names Returns unneccesary data (unless you are actually using every field) Returns duplicate data on JOINs (do you really need two copies of the RecordID field?) Can cause errors (in some databases) if JOINed tables have fields with the same name
  56. 56. Store i separate fil in filSt in t files i filegroup For very large joins placing tables to be j joined in separate p y p physical files within the same filegroup can improve performance SQL Server can spawn separate threads for processing each file
  57. 57. Don’tD ’t use CROSS JOINs JOIN Unless actually needed (rarely) do not use a CROSS JOIN (returns all combinations ( of records on both sides of JOIN) People sometimes will do a CROSS JOIN and then use DISTINCT and GROUP BY to eliminate all the duplication  Don’t do that.
  58. 58. JOINsJOIN vs. Subquery S b Depending on the specifics, either could be faster. Write both and test performance to be sure which is best. JOIN: SELECT a.* FROM Table1 a INNER JOIN Table2 b ON a.Table1ID = b.Table1ID WHERE b.Table2ID = SomeValue Subquery: SELECT a.* FROM Table1 a WHERE Table1ID IN a. (SELECT Table1ID FROM Table2 WHERE Table2ID = SomeValue)
  59. 59. Avoid Subqueries unless neededA id S b i l d d Avoid subqueries unless actually required Most subqueries can be expressed as JOINs Subqueries are generally harder to read and understand (for humans) and, therefore, therefore maintain
  60. 60. Use an Indexed View(Enterprise 2000 and later only) An Indexed View maintains an updated record of how the tables are joined via a j clustered index This slows INSERTs, UPDATEs and INSERTs DELETEs a bit, so consider the tradeoff for the faster queries
  61. 61. Use Database’s Performance Database sOptimization Tools Most major databases have methods for monitoring and optimizing database g p g performance Google “databaseName optimizing” for databaseName optimizing tons of links
  62. 62. AvoidA id DISTINCT when possible h ibl DISTINCT clauses are frequently (and usually unintentionally) used to hide an incorrect JOIN Properly normalized database will not frequently need DISTINCT clauses Look for incorrect JOINs or create more explicit WHERE clauses to avoid needing DISTINCT Of course, use it when appropriate
  63. 63. The End (for now)Google: Query Optimizationfor lots more information

×