Robert bright
Transcript

  • 1. Ten Tips for Writing Efficient SQL
By Robert Bright

Abstract
As a Web Developer at the Ontario Universities' Application Centre (OUAC), I worked a lot with SQL and database programming. I learned several techniques for writing more efficient SQL statements. The intention of this presentation is to share those techniques so that future co-op students can benefit from this knowledge.

Information about the Employer
The Ontario Universities' Application Centre (OUAC), located in Guelph, Ontario, Canada, is a central bureau whose key function is the processing of applications for admission to the province's universities.

Job Description
I worked at OUAC as a Web Developer. I developed web pages to improve the usability of the Ontario University application process. I spent the majority of my time creating two internal systems. The first allowed employees of OUAC to modify the contents of the help files without needing any programming or HTML skills. The second created lists of the degree programs available at universities, so users could see everywhere a program is taught at once instead of having to search every university. I developed all the web sites and systems with HTML, JavaScript, SQL, and the IBM scripting language Net.Data.

Purpose of this Report
On my co-op at OUAC I worked intensively with databases and SQL queries. I learned several techniques to improve the speed and efficiency of the queries. The intention of this report is to share this knowledge so that current and future co-op students will know how to write better SQL statements.
Each technique was tested by running both the original query and the improved query ten times each. I recorded the average time of each query to show the speed increase from using the more efficient query.

Tip #1: Use Column Names Instead of * in a SELECT Statement
If you are selecting only a few columns from a table, there is no need to use SELECT *. Though this is easier to write, it costs the database more time to complete the query. By selecting only the columns you need, you reduce the size of the result table and in turn increase the speed of the query.
Example: While creating a tool that modified the help pages dynamically at OUAC, I needed to get each file's information from the database. By replacing the * in my query with the column names, I increased the speed of the query.
Before: SELECT * FROM buma.helpfiles
After: SELECT heshnbr, hemenbr, hename, hetitle, hecontent, hefield1, hefield2 FROM buma.helpfiles

Tip #2: Use EXISTS instead of DISTINCT
The DISTINCT keyword works by selecting all the columns in the table and then parsing out any duplicates. If you use a sub query with the EXISTS keyword instead, you can avoid having to return an entire table.
Example: While creating a tool that modified the help pages dynamically at OUAC, I needed to find which Universities had help files associated with them. By using an EXISTS sub query instead of DISTINCT, I increased the efficiency of this query.
Before: SELECT DISTINCT hetitle, hename FROM buma.helpfiles h , buma.merchant m WHERE m.merfnbr = h.hemenbr
After: SELECT hetitle, hename FROM buma.helpfiles h WHERE EXISTS (SELECT m.merfnbr FROM buma.merchant m)

Tip #3: Use OR instead of UNION on the same table
When selecting data from a single table with a logical or, it is easier to follow the process of the query by using a UNION. This method is inefficient because it requires an unnecessary intermediate table. Joining the two conditions through an OR eliminates the extra query and the intermediate table.
Example: While creating a tool that modified the help pages dynamically at OUAC, I needed to find a specific file that belonged to a University. I was tempted to use a UNION to find the exact data, but an OR proved to be more efficient.
Before: SELECT hemenbr, hename FROM buma.helpfiles WHERE hemenbr = 5 UNION SELECT hemenbr, hename FROM buma.helpfiles WHERE hename = 'help_address.html'
After: SELECT DISTINCT hemenbr, hename FROM buma.helpfiles WHERE hemenbr = 5 OR hename = 'help_address.html'

Tip #4: Use EXISTS instead of LEFT JOIN
The LEFT JOIN merges the outer query with the inner query and keeps the extra rows from the outer table. The same result can be obtained by using an EXISTS sub query. This eliminates the need to compare two tables, as the inner query acts as a filter when the outer query executes.
Example: While creating a tool that modified the help pages dynamically at OUAC, I needed to find which Universities had help files associated with them. By using an EXISTS sub query instead of LEFT JOIN, I increased the efficiency of this query by avoiding a table comparison.
Before: SELECT merfnbr, mestname FROM buma.merchant LEFT JOIN buma.helpfiles ON merfnbr=hemenbr
After: SELECT merfnbr, mestname FROM buma.merchant WHERE EXISTS (SELECT * FROM buma.helpfiles where merfnbr = hemenbr)

Tip #5: Use BETWEEN instead of IN
The BETWEEN keyword is very useful for filtering values in a specific range. It is much faster than typing each value in the range into an IN.
Example: While at OUAC I built a small webpage that displayed all possible degrees and their information. Each degree belonged to a grouped category. In the database the category numbers were in a specific range, so I was able to benefit from using a BETWEEN instead of listing each value inside an IN.
Before: SELECT crpcgnbr FROM cgryrel WHERE crpcgnbr IN (508858, 508859, 508860, 508861, 508862, 508863, 508864)
After: SELECT crpcgnbr FROM cgryrel WHERE crpcgnbr BETWEEN 508858 and 508864

Tip #6: Minimize the number of sub queries
Each time a sub query is performed, a new result table must be created and then merged with the outer table. This takes a long time on a database, so it is important to minimize the number of sub queries to speed up the results.
Example: The degree listing program I made at OUAC was based on a very redundant database. All the relationships were put into one of two tables, so sorting out the information was very difficult. The only method to get the data was to use several sub queries. Simply removing one unnecessary sub query from this statement increased the speed significantly.
Before: select cgsdesc, cgrfnbr from category where cgoid=degree and cgrfnbr IN (select cpprnbr from cgprrel where cpprnbr IN (select cpcgnbr from cgprrel where cpprnbr IN (select prrfnbr from product where prrfnbr IN (select cpprnbr from cgprrel where cpcgnbr IN (select cgrfnbr from category where cgoid IS NULL)) and prrfnbr IN (select cpprnbr from cgprrel where cpcgnbr = 190200))))
After: select cgsdesc, cgrfnbr from category where cgoid=degree and cgrfnbr IN (select cpprnbr from cgprrel where cpprnbr IN (select cpcgnbr from cgprrel where cpprnbr IN (select prrfnbr from product where prrfnbr IN (select cpprnbr from cgprrel where cpcgnbr = 572191) and prrfnbr IN (select cpprnbr from cgprrel where cpcgnbr = 190200))))

Tip #7: Use IN instead of EXISTS
A simple trick to increase the speed of an EXISTS sub query is to replace it with IN. The IN method is faster than EXISTS because it doesn't check unnecessary rows in the comparison.
Example: One of the options for the degree listing program I wrote at OUAC was to list all the available degrees at a specific University. So if I were checking for U of Guelph, I would look for all the degrees that were associated with the university number 149. By replacing the EXISTS in the sub query with an IN, I made the query more efficient.
Before: select cgrfnbr from category where EXISTS (select cpcgnbr from cgprrel where cpprnbr = 149 )
After: select cgrfnbr from category where cgrfnbr IN (select cpcgnbr from cgprrel where cpprnbr = 149 )

Tip #8: Avoid including a HAVING clause in SELECT statements
A HAVING clause is unnecessary when its condition does not involve an aggregate. It works by going through the final result table of the query and parsing out the rows that don't meet the HAVING condition. Instead, you can put the condition inside the query with a WHERE clause. The condition is then applied while the result table is created, which eliminates having to go back through the results a second time.
Example: In the help file tool I created at OUAC, I had to select all the University numbers except for the one that belonged to the test case. I could cut out that row with a HAVING clause at the end of the statement, but a WHERE proved to be more efficient.
Before: select merfnbr from merchant group by merfnbr having merfnbr!=2
After: select merfnbr from merchant where merfnbr!=2 group by merfnbr

Tip #9: Select all your data at once
Each time a query is performed there is the overhead cost of having to open a connection to the database. Having many separate queries that select data from the same table is very inefficient, since each query adds its overhead cost to the execution time. Putting all these queries into one reduces the overhead cost significantly.
Example: When creating the help file tool at OUAC, I needed to retrieve lots of data on each file: the file name, the content, the associated University, etc. Having these selections as different queries proved to be very inefficient, so I put them together into one statement.
Before: select hetitle, hename from helpfiles where heshnbr=24; select hecontent, hemenbr from helpfiles where heshnbr=24;
After: select hetitle, hename, hecontent, hemenbr from helpfiles where heshnbr=24;

Tip #10: Remove any redundant mathematics
There will be times when you perform mathematics within an SQL statement. It can be a drag on performance if written improperly, because the query recalculates the math for each row it examines. Eliminating any unnecessary math in the statement will make it perform faster.
Example: The degree listing program I created at OUAC has the option to show a specific range of Universities based on their reference numbers. It was easier to show the users a single digit list and then add 3000 to get the reference number. But having the addition inside the query was inefficient, so I performed the math outside it.
Before: SELECT merfnbr FROM buma.merchant WHERE merfnbr + 3000 < 5000;
After: SELECT merfnbr FROM buma.merchant WHERE merfnbr < 2000;

[Charts, one per tip: query time in ms before and after, with the time reduction. Reductions shown, in transcript order: 36%, 17%, 23%, 26%, 59%, 32%, 34%, 11%, 48%, 41%.]

Summary
The purpose of this report was to share the knowledge I gained about writing efficient SQL from my co-op as a web developer at OUAC. Increasing the speed of queries is very important in web development, as web pages are viewed thousands of times per day, and therefore a simple increase in the speed of a SQL query can make web pages load noticeably faster.
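
The test procedure described in the report (run the original and the improved query ten times each and average the times) can be sketched roughly as follows. This is only an illustration, not the author's actual harness: it uses Python's built-in sqlite3 module rather than the database used at OUAC, the table contents are made up, the `buma.` schema prefix is dropped, and the helper names `build_demo_db`, `average_ms`, and `compare` are hypothetical. The two queries are the Before/After pair from the redundant-mathematics tip.

```python
import sqlite3
import time

# Before/After pair from the redundant-mathematics tip (schema prefix dropped).
BEFORE = "SELECT merfnbr FROM merchant WHERE merfnbr + 3000 < 5000"
AFTER = "SELECT merfnbr FROM merchant WHERE merfnbr < 2000"


def build_demo_db(row_count=10_000):
    """Create an in-memory stand-in for the merchant table.

    Assumption: the real table contents are not given in the report,
    so the rows here (1..row_count) are invented for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE merchant (merfnbr INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO merchant (merfnbr) VALUES (?)",
        ((n,) for n in range(1, row_count + 1)),
    )
    conn.commit()
    return conn


def average_ms(conn, sql, runs=10):
    """Run sql `runs` times; return (average milliseconds, rows of last run)."""
    total = 0.0
    rows = []
    for _ in range(runs):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        total += time.perf_counter() - start
    return total / runs * 1000.0, rows


def compare(conn, before_sql=BEFORE, after_sql=AFTER, runs=10):
    """Time both queries and report whether they return the same rows."""
    before_ms, before_rows = average_ms(conn, before_sql, runs)
    after_ms, after_rows = average_ms(conn, after_sql, runs)
    return {
        "same_rows": sorted(before_rows) == sorted(after_rows),
        "before_ms": before_ms,
        "after_ms": after_ms,
    }


if __name__ == "__main__":
    print(compare(build_demo_db()))
```

Checking `same_rows` before reading the timings is a deliberate choice: a rewrite only counts as an optimization if it returns the same rows as the original. Timings from an in-memory SQLite table will not match the figures in the report, which came from a different database and data set.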
