SQL206 SQL Median

How to calculate the median in SQL Server.

  1. Median
     SQL Programming
     Median query: how to calculate the median
     Parts Median
  2. Notes on Median Slides
     • These slides will be part of our upcoming intermediate and/or advanced SQL queries course.
     • The basic concept of using TOP was found on a tek-tips SQL forum.
     • At this time we are using Chris Date’s famous parts table. We will add versions for the bookstore database as well.
     • This script has been tested with SQL Server only at this time.
  3. Contact Information
     P.O. Box 6142
     Laguna Niguel, CA 92607
     949-489-1472
     http://www.d2associates.com
     [email_address]
     Copyright 2001-2011. All rights reserved.
  4. Median Resources
     • SQL scripts will be found on box.net at
        • http://tinyurl.com/SQLScripts
     • Slides can be viewed on SlideShare…
        • http://www.slideshare.net/OCDatabases
     • Follow up questions?
        • [email_address]
  5. Assumptions
     • It is assumed that the student knows how to create a database and, if required, how to put it in use.
     • Those statements are not covered in these slides.
  6. Business Case
     • SQL has a function, AVG, which returns the average (arithmetic mean). It does not have one for the median.
     • These slides show how to calculate the median of a dataset.
        • The median is the value in a series above which 50% of the values lie and below which the other 50% lie.
        • If the series has an even number of values, the median is the average of the two innermost values.
     • The median has many uses. One common use is in real estate, where the median may give a better feel for the typical prices paid.
  7. Approach
     • We will calculate the median by using an SQL select of the top 50 percent of a dataset.
     • This will be done twice: once to obtain the record 50% of the way down from the top, and again to find the record 50% of the way up from the bottom.
        • If there is an odd number of records (including 1), the same row will be retrieved twice, which is fine.
     • We will then average the two values returned.
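     A minimal sketch of the two halves described in the approach, run as separate queries so each one can be inspected on its own. It assumes the Parts table and sample rows from the Create Table and Load Data slides and SQL Server's TOP ... PERCENT syntax; it is an illustration, not part of the original script.

     ```sql
     -- Lower half: the smallest 50 percent of weights, in ascending order.
     -- The last row returned is the value 50% of the way down from the top.
     SELECT TOP 50 PERCENT part_wgt
     FROM Parts
     ORDER BY part_wgt ASC;

     -- Upper half: the largest 50 percent of weights, in descending order.
     -- The last row returned is the value 50% of the way up from the bottom.
     SELECT TOP 50 PERCENT part_wgt
     FROM Parts
     ORDER BY part_wgt DESC;
     ```

     With the six sample rows, each query returns three rows (12, 12, 12 and 19, 17, 14). SQL Server rounds a fractional row count up, so with an odd number of rows both halves include the middle row.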
  8. Create Table
     • We will use Chris Date’s famous parts table.

     CREATE TABLE Parts
     ( part_nbr   VARCHAR(5)  NOT NULL PRIMARY KEY
     , part_name  VARCHAR(50) NOT NULL
     , part_color VARCHAR(50) NOT NULL
     , part_wgt   INTEGER     NOT NULL
     , city_name  VARCHAR(50) NOT NULL
     );
  9. Load Data
     • Load the following data and/or experiment with your own values…

     INSERT INTO Parts (part_nbr, part_name, part_color, part_wgt, city_name)
     VALUES ('p1', 'Nut',   'Red',   12, 'London')
          , ('p2', 'Bolt',  'Green', 17, 'Paris')
          , ('p3', 'Cam',   'Blue',  12, 'Paris')
          , ('p4', 'Screw', 'Red',   14, 'London')
          , ('p5', 'Cam',   'Blue',  12, 'Paris')
          , ('p6', 'Cog',   'Red',   19, 'London')
     ;
  10. Calculate the median
     • Union the results of the two SELECT TOP queries. Then average the two results.

     SELECT AVG(wgt) AS median
     FROM ( SELECT MAX(part_wgt) AS wgt
            FROM (SELECT TOP 50 PERCENT * FROM Parts ORDER BY part_wgt ASC) a
            UNION
            SELECT MIN(part_wgt)
            FROM (SELECT TOP 50 PERCENT * FROM Parts ORDER BY part_wgt DESC) d
          ) u;
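     One caveat worth testing: part_wgt is an INTEGER, and in SQL Server AVG over an integer column returns an integer, so two middle values such as 12 and 15 would average to 13 rather than 13.5. The sketch below is a variation on the slide's query, not the original script; the cast to DECIMAL(10, 2) is an assumption about the precision wanted.

     ```sql
     -- Variation on the median query: cast before averaging so that
     -- the mean of the two middle values keeps its fractional part.
     SELECT AVG(CAST(wgt AS DECIMAL(10, 2))) AS median
     FROM (
         -- Largest value in the lower half
         SELECT MAX(part_wgt) AS wgt
         FROM (SELECT TOP 50 PERCENT * FROM Parts ORDER BY part_wgt ASC) AS a
         UNION
         -- Smallest value in the upper half
         SELECT MIN(part_wgt)
         FROM (SELECT TOP 50 PERCENT * FROM Parts ORDER BY part_wgt DESC) AS d
     ) AS u;
     ```

     With the sample rows the two middle values are 12 and 14, so both versions agree on a median of 13; the cast only matters when the two middle values sum to an odd number.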
  11. Results
  12. Explanation
     • Use a subquery to select the top 50 percent of the dataset in ascending order (the lower half).
     • Use a named outer query (table expression) to select the last, and therefore largest, value from this list. Assign a column alias to the MAX value.
     • Use a subquery to select the top 50 percent of the dataset in descending order (the upper half).
     • Use a named outer query (table expression) to select the smallest value from this list with MIN.
     • Union the results of the two named queries into another named query.
     • Select from this named query. Average the values in the union and assign a new column alias of median.
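     The same steps can also be sketched with common table expressions, which give each step in the explanation a name. The CTE names (lower_half, upper_half, middle_values) are hypothetical labels chosen for this illustration; the logic is intended to match the slide's query, and it assumes the Parts table loaded earlier.

     ```sql
     WITH lower_half AS (
         -- Top 50 percent in ascending order: the smaller weights
         SELECT TOP 50 PERCENT part_wgt
         FROM Parts
         ORDER BY part_wgt ASC
     ),
     upper_half AS (
         -- Top 50 percent in descending order: the larger weights
         SELECT TOP 50 PERCENT part_wgt
         FROM Parts
         ORDER BY part_wgt DESC
     ),
     middle_values AS (
         -- Largest value of the lower half and smallest of the upper half
         SELECT MAX(part_wgt) AS wgt FROM lower_half
         UNION
         SELECT MIN(part_wgt) FROM upper_half
     )
     -- Average the one or two middle values
     SELECT AVG(wgt) AS median
     FROM middle_values;
     ```

     With the sample rows, lower_half ends at 12 and upper_half ends at 14, so the query should return 13.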
