SQL206 SQL Median

How to calculate the median in SQL Server.

  1. Median. SQL Programming. Median query: how to calculate the median. Parts Median
  2. Notes on Median Slides
     • These slides will be part of our upcoming intermediate or possibly advanced SQL queries course.
     • The basic concept of using TOP was found on a tek-tips SQL forum.
     • At this time we are using Chris Date’s famous parts table. We will add versions for the bookstore database as well.
     • This script has been tested with SQL Server only at this time.
  3. Contact Information
     P.O. Box 6142, Laguna Niguel, CA 92607
     949-489-1472
     http://www.d2associates.com
     [email_address]
     Copyright 2001-2011. All rights reserved.
  4. Median Resources
     • SQL scripts will be found on box.net at http://tinyurl.com/SQLScripts
     • Slides can be viewed on SlideShare… http://www.slideshare.net/OCDatabases
     • Follow up questions? [email_address]
  5. Assumptions
     • The student is assumed to know how to create a database and, if required, how to put it in use.
     • The statements for doing so are not covered in these slides.
  6. Business Case
     • SQL has a function, AVG, which returns the average (arithmetic mean). SQL Server has no corresponding MEDIAN function.
     • These slides show how to calculate the median of a dataset.
       • The median is the value in a sorted series above which lie 50% of the values and below which lie the other 50%.
       • If there is an even number of values in the series, the median is the average of the two middle values.
     • The median has many uses. One common use is in real estate, where the median may give a better feel for the typical price paid because it is less affected by a few unusually high or low sales.
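
To make the difference between mean and median concrete, here is a minimal sketch using made-up sale prices. The `sales` derived table and its values are hypothetical and do not come from the slides; the VALUES table constructor assumes SQL Server 2008 or later.

```sql
-- Hypothetical sale prices in thousands; one outlier at 1500.
SELECT AVG(1.0 * price) AS mean_price   -- 1.0 * avoids integer averaging
FROM (VALUES (200), (210), (220), (230), (1500)) AS sales(price);
-- The mean is 472, although four of the five sales are 230 or less.
-- The median (the middle value of the sorted list) is 220.
```
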
  7. Approach
     • We will calculate the median by using an SQL SELECT of the top 50 percent of a dataset.
     • This will be done twice: once to obtain the record 50% of the way down from the top, and again to find the record 50% of the way up from the bottom.
       • If there is an odd number of records (including 1), the same row will be retrieved twice, which is fine. SQL Server rounds a fractional row count up, so both halves include the middle row.
     • We will then average the two values returned.
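
The odd-count case can be checked with a small sketch. The CTE `t` and its five values are hypothetical, chosen only for illustration; the query assumes SQL Server 2008 or later.

```sql
-- Five values, so TOP 50 PERCENT asks for 2.5 rows, which SQL Server rounds up to 3.
WITH t(v) AS (
    SELECT v FROM (VALUES (1), (2), (3), (4), (5)) AS x(v)
)
SELECT
    -- Lower half is 1, 2, 3; its largest value is 3.
    (SELECT MAX(v) FROM (SELECT TOP 50 PERCENT v FROM t ORDER BY v ASC) AS a)  AS from_top,
    -- Upper half is 5, 4, 3; its smallest value is 3.
    (SELECT MIN(v) FROM (SELECT TOP 50 PERCENT v FROM t ORDER BY v DESC) AS d) AS from_bottom;
-- Both columns return 3, the middle value, so averaging them still gives 3.
```
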
  8. Create Table
     • We will use Chris Date’s famous parts table.

         CREATE TABLE Parts (
             part_nbr   VARCHAR(5)  NOT NULL PRIMARY KEY,
             part_name  VARCHAR(50) NOT NULL,
             part_color VARCHAR(50) NOT NULL,
             part_wgt   INTEGER     NOT NULL,
             city_name  VARCHAR(50) NOT NULL
         );
  9. Load Data
     • Load the following data and/or experiment with your own values…

         INSERT INTO Parts (part_nbr, part_name, part_color, part_wgt, city_name)
         VALUES ('p1', 'Nut',   'Red',   12, 'London'),
                ('p2', 'Bolt',  'Green', 17, 'Paris'),
                ('p3', 'Cam',   'Blue',  12, 'Paris'),
                ('p4', 'Screw', 'Red',   14, 'London'),
                ('p5', 'Cam',   'Blue',  12, 'Paris'),
                ('p6', 'Cog',   'Red',   19, 'London');
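
Before calculating anything, it can help to list the loaded weights in order. This check query is a suggested addition, not part of the original script:

```sql
-- List the parts by weight to see the sorted series the median is taken from.
SELECT part_nbr, part_wgt
FROM Parts
ORDER BY part_wgt, part_nbr;
-- Weights in order: 12, 12, 12, 14, 17, 19 (six rows).
-- The two middle values are 12 and 14, so the median should be 13.
```
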
  10. Calculate the median
      • Union the result of the two select tops. Then average the two results.

          SELECT AVG(wgt) AS median
          FROM (
              -- Largest weight in the lower half.
              SELECT MAX(part_wgt) AS wgt
              FROM (SELECT TOP 50 PERCENT * FROM Parts ORDER BY part_wgt ASC) a
              UNION
              -- Smallest weight in the upper half.
              SELECT MIN(part_wgt)
              FROM (SELECT TOP 50 PERCENT * FROM Parts ORDER BY part_wgt DESC) d
          ) u;
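
One thing to watch: part_wgt is an INTEGER column, and in SQL Server AVG over integers returns an integer, so a median such as 13.5 would come back as 13. The sample data happens to give a whole number. A possible variant, offered as a sketch rather than the author's script, multiplies by 1.0 to force decimal arithmetic:

```sql
-- Same query, but averaging a decimal expression so a fractional median survives.
SELECT AVG(1.0 * wgt) AS median
FROM (
    SELECT MAX(part_wgt) AS wgt
    FROM (SELECT TOP 50 PERCENT * FROM Parts ORDER BY part_wgt ASC) AS a
    UNION
    SELECT MIN(part_wgt)
    FROM (SELECT TOP 50 PERCENT * FROM Parts ORDER BY part_wgt DESC) AS d
) AS u;
```

Because UNION removes duplicates, two equal middle values collapse to one row; the average is unchanged either way.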
  11. Results
  12. Explanation
      • Use a subquery to select the top 50 percent of the dataset in ascending order (the lower half).
      • Use a named outer query (table expression) to select the bottom value from this list, which is its maximum. Assign a column alias to the max(value).
      • Use a subquery to select the top 50 percent of the dataset in descending order (the upper half).
      • Use a named outer query (table expression) to select the bottom value from this list, which is its minimum.
      • Union the results of the two named queries into another named query.
      • Select from this named query. Average the values in the union and assign a new column alias of median.
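
The steps above can be run one at a time against the sample data to see what each piece returns. This breakdown is a sketch that assumes the Parts table has been loaded exactly as shown on the Load Data slide:

```sql
-- Step 1: lower half, ascending. Returns weights 12, 12, 12.
SELECT TOP 50 PERCENT part_nbr, part_wgt FROM Parts ORDER BY part_wgt ASC;

-- Step 2: largest value of the lower half. Returns 12.
SELECT MAX(part_wgt) AS wgt
FROM (SELECT TOP 50 PERCENT * FROM Parts ORDER BY part_wgt ASC) AS a;

-- Step 3: upper half, descending. Returns weights 19, 17, 14.
SELECT TOP 50 PERCENT part_nbr, part_wgt FROM Parts ORDER BY part_wgt DESC;

-- Step 4: smallest value of the upper half. Returns 14.
SELECT MIN(part_wgt) AS wgt
FROM (SELECT TOP 50 PERCENT * FROM Parts ORDER BY part_wgt DESC) AS d;

-- Steps 5 and 6: the union holds 12 and 14; their average, 13, is the median.
```
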