Sampling Data in T-SQL


Explanation and examples of sampling SQL tables via random or systematic methods with emphasis on NCQA/HEDIS methodology

Published in: Technology

One of the great benefits of databases is that numerical analysis can be done against the entire population. Trends and behaviors can be analyzed just as easily against ten million records as against one hundred. That is, assuming that all of the necessary information is contained within the data set. Unfortunately, that isn’t always the case; sometimes it is necessary to conduct manual research for additional information and report against the results.

The goal of sampling is to select the smallest number of records that reliably represents the characteristics of the whole. The statistical process of determining the sample size N will not be addressed here; it is assumed that you know N and are only curious about how to select that many records in a way that guarantees a reasonable level of randomness.

One of the inspirations for documenting the process of sampling SQL data was the NCQA/HEDIS Systematic Sample Methodology. NCQA is the National Committee for Quality Assurance, which runs a research program called the Healthcare Effectiveness Data and Information Set (HEDIS). The majority of HEDIS measures involve analyzing the entire population within a health plan (for example, women between specific ages) and counting the number who have received a specific procedure (for example, a breast cancer screening). However, some measures require data not currently in the medical or claims data, and these use a hybrid approach: identify the eligible population, then sample that population to identify specific members whose medical records will be examined manually to determine how each person’s results will be evaluated.

But before jumping into the NCQA/HEDIS sampling methodology, let’s examine some simpler alternatives. Appendix A contains a T-SQL script to create and populate a sample table called Individuals within a database named Demonstration. You are welcome to create the same table and run the example queries to make this more of a hands-on experience. Likewise, the techniques shown here can be used in any number of situations beyond HEDIS measures. On a side note: the names used are drawn from the staff list of the NPR radio show Car Talk.

Example 1: Selecting all records

The SQL query:

    Select I.IdNo, I.LName, I.FName, I.BirthDate
    From Individuals I

will produce a result set of 137 rows, typically in data entry order (SQL Server does not guarantee any particular order without an Order By clause).
Note that some records are duplicated and that the results are shown in data entry order. Selecting records in data entry or natural order is not sufficient for sampling; there are too many biases built in. Thus, assuming that we want the sample size N to be 25, the Top 25 constraint in the query below will return just 25 records, but the wrong ones from a statistical standpoint.

    Select Top 25 I.IdNo, I.LName, I.FName, I.BirthDate
    From Individuals I

The records can be randomized by using the NewId() function to put the returned rows in random order. By adding the clause Order By NewId() to the statement you will see a different set of records returned every time you run the query.

    Select Top 25 percent I.IdNo, I.LName, I.FName, I.BirthDate
    From Individuals I
    Order By NewId()

Note that by adding percent to the Top 25 clause you will return a percentage of the total table: in this case, 35 rows out of the 137 total (25 percent of 137 is 34.25, which Top rounds up to the next whole row).
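When an exact sample size is wanted rather than a percentage, the row count can be passed to Top as a variable. The following is a minimal sketch, assuming the Individuals table from Appendix A; the variable @SampleSize is hypothetical and is not part of the original script.

```sql
-- Sketch only: @SampleSize is a hypothetical variable, not part of the original script.
Declare @SampleSize Int = 25;

-- Top (...) with parentheses accepts a variable; Order By NewId() shuffles the rows,
-- so a different set of @SampleSize records comes back on each execution.
Select Top (@SampleSize)
       I.IdNo, I.LName, I.FName, I.BirthDate
From dbo.Individuals I
Order By NewId();
```

Because the ordering is regenerated on every run, the result cannot be reproduced later; if the sample must be auditable, store the selected IdNo values when the query is run.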
This is the simplest way to extract a random sample of a specific size from a data set. Unfortunately, not all statisticians will accept a purely random sample. Whether this bias is the result of pre-computer era difficulties in generating random numbers and applying them to a population, or it represents a legitimate concern about the distribution of the data, many situations will require a periodic or systematic sample of the data. Until the abacus and quill pen generation that creates these specifications finally dies off, systematic sampling will remain a mandatory skill.

The simplest and most common (but not in HEDIS) systematic sampling method is to select a starting record and then every nth record from that record on. The interval n is found by dividing the total population or record count (137 in our example) by the sample size N (25), which gives 5.48 and is rounded down to 5. A simple way to skip a set number of records is with modulus. The mathematical function modulus (also just mod) gives the remainder of one number divided by another. For example, 6 mod 5 is 1, while 15 mod 5 is 0. In T-SQL the modulus operator is “%”. The Individuals table has a sequential, system-assigned field called IdNo, so selecting every 5th record can be achieved by adding the constraint IdNo % 5 = 0; to start after a given record (say 11), add the constraint IdNo > 11. To change the offset, add some starting value to IdNo.

The query

    Select I.IdNo, I.LName, I.FName, I.BirthDate
    From Individuals I
    Where (IdNo + 2) % 5 = 0
      And I.IdNo > 11

will return exactly 25 records, with a three-position offset from mod 5 (IdNo 13, 18, 23, and so on through 133).

Omitting the IdNo > 11 condition will return 27 rows, but you can apply a Top 25 constraint to return the exact N desired. This method does a very good job of returning every nth record as determined by the IdNo field.
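The interval does not have to be hard-coded. The following is a minimal sketch, assuming the Individuals table from Appendix A, that derives the interval from the current row count; the variables @SampleSize, @Population, and @Interval are hypothetical names introduced for illustration.

```sql
-- Sketch only: the variable names are hypothetical and not part of the original script.
Declare @SampleSize Int = 25;
Declare @Population Int = (Select Count(*) From dbo.Individuals);  -- 137 for the sample table
Declare @Interval   Int = @Population / @SampleSize;               -- integer division: 137 / 25 = 5

-- Guard: an interval of 0 (population smaller than the sample) would make % fail.
If @Interval > 0
    Select Top (@SampleSize)
           I.IdNo, I.LName, I.FName, I.BirthDate
    From dbo.Individuals I
    Where (I.IdNo + 2) % @Interval = 0   -- same three-position offset as the query above
    Order By I.IdNo;
```

Like the query above, it returns every nth record as determined by IdNo.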
However, if the table has significant deletions, and the deleted records follow any kind of pattern, the exact periodicity of the sequence can be jeopardized.

The Row_Number() function in T-SQL allows you to assign a new sequential number to each row when the query is executed. The same mod function can be used with the new row number with a much more
regular segmentation. To jump ahead a bit, the NCQA/HEDIS systematic sampling methodology requires that the population be ordered by last name, first name, and birth date (in descending order one year, and ascending the next), and since Row_Number() requires a sort, the following query orders and numbers the full data set per that requirement.

The query

    Select ROW_NUMBER() Over (Order By I.LName, I.FName, I.BirthDate Asc) RowNum,
           I.IdNo, I.LName, I.FName, I.BirthDate
    From Individuals I

returns the full data set, numbered in that sort order.

From here it is trivial to apply the previous mod example to the RowNum field and return a reasonable periodic sample. Unfortunately, the NCQA/HEDIS systematic sampling methodology does not use a set interval. Instead, each ith member (in this example, the second through 25th records returned) has a specific calculation.

The first record is called START and is determined by multiplying a random number supplied by NCQA by the eligible members (EM) divided by the final sample size (FSS). In the calculations EM/FSS is referred to as N (an interval, not to be confused with the sample size N used earlier). For this exercise assume that START is equal to 2.

The calculation for each member of the sample is

    ith member = START + [(i-1) x N]

In T-SQL a common table expression (CTE) can be used to determine the row to be selected as follows.

    ;With Row_List_CTE (RowNum) As
    (
        Select ROW_NUMBER() Over (Order By I.LName, I.FName, I.BirthDate Asc)
        From
            Individuals I
    )
    Select Top 25
           RowNum
         , RowNum - 1 SelRec
         , 137.0/25.0 NR
         , (RowNum - 1) * (137.0/25.0) RN1
         , CAST(ROUND(2 + (RowNum - 1) * (137.0/25.0), 0) As Int) Final
    From Row_List_CTE
    Order By RowNum

The first line defines the CTE Row_List_CTE with a single column called RowNum. The Select clause following the As populates the CTE with rows 1 through 137.

The next Select statement uses the CTE to demonstrate each step in the calculation. The Order By RowNum makes sure that Top 25 keeps rows 1 through 25; without it SQL Server is free to return any 25 rows. RowNum is the original row number and plays the role of i. SelRec is the (i-1) part of the calculation. NR is N in the calculation, the result of EM/FSS (137/25) or 5.48. RN1 shows the core of the calculation, [(i-1) x N]. The Final field adds the START value of 2 to RN1 and rounds the result to the nearest integer. This is the row number of the record to be used in the sample. This approach selects an interval of 5 or 6 rows between each selected record, which does, grudgingly, improve the quality of the sample by varying the interval.

To show the basic elements of name, birth date, and age in the final query, the same CTE can be used to select the demographics and determine the sample.

    Declare @EvalDt Date
    Declare @N Decimal(9,3)
    Set @EvalDt = '12-31-2012'
    Set @N = 137.0/25.0

    ;With Ind_List_CTE (RowNum, IdNo, LName, FName, BirthDate) As
    (
        Select ROW_NUMBER() Over (Order By I.LName, I.FName, I.BirthDate Asc) RowNum,
               I.IdNo,
               I.LName, I.FName, I.BirthDate
        From Individuals I
    )
    Select RowNum
         , IdNo
         , FName
         , LName
         , BirthDate
         , DATEDIFF(Hour, BirthDate, @EvalDt)/8766 As Age
    From Ind_List_CTE
    Where RowNum In
    (
        Select Top 25
               --CAST(ROUND(2+(RowNum-1) * @N,0) As Int)
               Floor(ROUND(2 + (RowNum - 1) * @N, 0))
        From Ind_List_CTE
        Order By RowNum
    )

In order to make the query more usable in the future, key variables are declared in the first section. Because the typical HEDIS measure wants the age of the member at the conclusion of the evaluation year, the variable @EvalDt is defined and set to 12-31-2012. Age is then determined by calculating the difference in hours between the birth date and the evaluation date and dividing that by 8766, the average number of hours in a year (365.25 x 24).

Rather than use the Cast function from the previous example, the Floor function is used to calculate the row number to be selected for the sample. This calculation is then used in the Where clause to filter the records for the sample. The Order By RowNum in the subquery ensures that Top 25 works from rows 1 through 25.
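The hard-coded 137, 25, and 2 in the final query can be lifted into variables. The following is a minimal sketch of that idea, not the NCQA specification itself: the variables @EM, @FSS, and @Start are hypothetical names, START is still simply assumed to be 2, and the extra IdNo sort key is an assumption added here only so that identical name and birth date rows are numbered the same way each time the CTE is evaluated.

```sql
-- Sketch only: @EM, @FSS and @Start are hypothetical names for the quantities in the text.
Declare @EvalDt Date = '12-31-2012';                                 -- end of the evaluation year
Declare @FSS    Int  = 25;                                           -- final sample size
Declare @EM     Int  = (Select Count(*) From dbo.Individuals);       -- eligible members (137 here)
Declare @Start  Int  = 2;                                            -- START, assumed as in the text
Declare @N      Decimal(9,3) = Cast(@EM As Decimal(9,3)) / @FSS;     -- EM/FSS = 5.480

;With Ind_List_CTE (RowNum, IdNo, LName, FName, BirthDate) As
(
    -- IdNo is appended as a tiebreaker (an assumption, not part of the stated ordering)
    Select ROW_NUMBER() Over (Order By I.LName, I.FName, I.BirthDate Asc, I.IdNo),
           I.IdNo, I.LName, I.FName, I.BirthDate
    From dbo.Individuals I
)
Select C.RowNum, C.IdNo, C.FName, C.LName, C.BirthDate,
       DATEDIFF(Hour, C.BirthDate, @EvalDt) / 8766 As Age
From Ind_List_CTE C
Where C.RowNum In
(
    -- i runs from 1 to @FSS; each i maps to row START + (i-1) x N, rounded
    Select Cast(Round(@Start + (S.RowNum - 1) * @N, 0) As Int)
    From Ind_List_CTE S
    Where S.RowNum <= @FSS
)
Order By C.RowNum;
```

With 137 eligible members, a sample size of 25, and START equal to 2, the first selected row numbers work out to 2, 7, 13, 18, and 24, which shows the alternating interval of 5 or 6 rows described above.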
The resulting data set follows the NCQA/HEDIS technical specifications exactly and will return the same results from the same source data every time.

One consequence of this approach is that identically named parents and children are unlikely to appear together in the same sample. However, this will likely exclude duplicates as well, so it might be considered an advantage. Feel free to experiment with the sample data to determine which records are duplicates and which may just represent multiple generations.

Likewise, feel free to adapt the examples presented here to your circumstances, either for a random sample or a truly systematic sample, for a HEDIS measure or some other application.

To work through these examples, first copy and paste Appendix A into SQL Server Management Studio and run the script to create and populate the table. Use Appendix B to create a comprehensive set of queries. Move the comment marker to activate/deactivate query sections.
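As one illustration of adapting the examples, the fixed-interval modulus filter can be applied to the Row_Number() sequence instead of IdNo, so that gaps left by deleted records no longer disturb the interval. This is a minimal sketch, assuming the Individuals table from Appendix A; the CTE name Ordered_CTE is hypothetical.

```sql
-- Sketch only: Ordered_CTE is a hypothetical name, not part of the original script.
;With Ordered_CTE (RowNum, IdNo, LName, FName, BirthDate) As
(
    Select ROW_NUMBER() Over (Order By I.LName, I.FName, I.BirthDate Asc),
           I.IdNo, I.LName, I.FName, I.BirthDate
    From dbo.Individuals I
)
Select Top (25)
       RowNum, IdNo, LName, FName, BirthDate
From Ordered_CTE
Where (RowNum + 2) % 5 = 0     -- every 5th row of the sorted set: 3, 8, 13, ...
Order By RowNum;
```

Because RowNum is assigned at query time, it is always a gap-free sequence regardless of which IdNo values have been deleted.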
Appendix A
Creating the sample data.

/* Create the sample table Individuals and populate it */
USE Demonstration
GO
/****** Object: Table [dbo].[Individuals] Script Date: 02/26/2013 10:34:23 ******/
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Individuals]') AND type in (N'U'))
DROP TABLE [dbo].[Individuals]
GO
/****** Object: Table [dbo].[Individuals] Script Date: 02/26/2013 10:34:24 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Individuals](
    [IdNo] [int] IDENTITY(1,1) NOT NULL,
    [FName] [varchar](50) NULL,
    [LName] [varchar](50) NULL,
    [BirthDate] [date] NULL,
    [Sex] [varchar](10) NULL,
    [Address] [varchar](100) NULL,
    [City] [varchar](50) NULL,
    [St] [varchar](10) NULL,
    [Zip] [varchar](10) NULL,
    [Phone] [varchar](25) NULL,
    [Fee] [money] NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
Set Nocount on
Go
-- Dates are written as m/d/yyyy, which assumes a session that reads dates as month/day/year.
Insert into dbo.Individuals values ('Imelda','Czechs','8/20/2013','M','121 Lasting Light Way','Buck County Village','PA','13432','(345) 148-4523 x123',38.63)
Insert into dbo.Individuals values ('Imelda','Czechs','8/20/2013','M','121 Lasting Light Way','Buck County Village','PA','13432','(345) 148-4523 x123',38.63)
Insert into dbo.Individuals values ('DouseAnne','Burnham','12/15/1935','Sex','2345 West Maine','Anytown','IL','60604','808/445-5934',9.25)
Insert into dbo.Individuals values ('Sue','Flockey','8/4/1981','M','2012 S Michigan Ave','Chicago','IL','60600','312/668-5531',71.25)
Insert into dbo.Individuals values ('Dasha','Chekhov','9/24/1984','F','2132 S Michigan','Chicago','IL','60601','312/134-7467',63.04)
Insert into dbo.Individuals values ('Vishnu','Payup','4/4/1960','M','4022 N Damen','Chicago','IL','60612','708/205-1234x123',74.67)
Insert into dbo.Individuals values ('Bjorn A.','PayneDiaz','7/16/1960','F','4515 N Damen','Chicago','IL','60612','(312) 321-5678',62.37)
Insert into dbo.Individuals values ('Wilma','Butfit','11/28/1988','F','4523 N Paulina','Chicago','IL','60611','312/819-3891',43.03)
Insert into dbo.Individuals values ('Carmine','Dioxide','9/13/1981','M','4533 N Paulina','Chicago','IL','60606','312/222-9266',73.02)
Insert into dbo.Individuals values ('UlandaHugh','Lucky','6/20/1976','F','5433 West Ave','Chicago','IL','60601','(900)851-3471',7.05)
Insert into dbo.Individuals values ('Will','PriceRandomly','10/2/1970','F','695 N. Clinton','Chicago','IL','60601','(312) 390-6886 x1212',35.29)
Insert into dbo.Individuals values ('Rush','Inuit','8/23/1957','F','7979 W. Fullerton','Chicago','IL','60607','312/677-6019',29.48)
Insert into dbo.Individuals values ('Lou','Segusi','4/19/1957','F','7981 W. Fullerton','Chicago','IL','60607','312/244-4610',9.03)
Insert into dbo.Individuals values ('Turner','Luce','1/12/1988','M','77 Sunset Strip','Hollywood','CA','90211','114/219-4103',6.47)
Insert into dbo.Individuals values ('Everett','Possum','6/30/1994','M','123 Sesame St','Lansing','IL','60645','514/196-4755',65.03)
Insert into dbo.Individuals values ('Bud','Uronner','1/24/1963','M','640 Kay Drive','Palo Alto','CA','90909','537/178-3081',23.48)
Insert into dbo.Individuals values ('Stu','Earley','1/4/1942','F','234 South Willintonm Apt 3C','Townton','NJ','04323','174/697-1209',15.25)
Insert into dbo.Individuals values ('Amadeus O.','Early','9/6/2001','M','1131 N. Devon Av','Chicago','IL','60630','174/697-1209',50.40)
Insert into dbo.Individuals values ('Viola','Fuss','10/25/2009','M','4200 Peake Lane','Portsmouth','RI','23703','(312)222-4343',82.83)
Insert into dbo.Individuals values ('Phyllis','Steen','2/21/1959','M','611 N. Devon','Chicago','IL','60630','(312)239-4343',1.81)
Insert into dbo.Individuals values ('Dot','Snice','10/4/2000','F','7311 Quick Avenue','River Forest','IL','60630','(312)222-4343',46.58)
Insert into dbo.Individuals values ('Luciano','Pavearoadi','12/4/1960','','414 Linden Avenue','Chicago','IL','60630','(312)239-4343',60.43)
Insert into dbo.Individuals values ('Lois','Steem','9/6/1981','M','629 S. Ridgeland Ave.','Chicago','IL','60630','(312)222-4343',30.75)
Insert into dbo.Individuals values ('Kurt','Reply','12/4/1965','F','827 N. Marion Street','Chicago','IL','60630','(312)222-4343',23.63)
Insert into dbo.Individuals values ('Hugo','Gurll','8/30/1952','F','1123 Fair Oaks','Chicago','IL','60630','(312)222-4343',80.40)
Insert into dbo.Individuals values ('Gladys','Radio','2/11/1962','M','100 N. Elmwood Ave.','Chicago','IL','60630','(312)222-4343',98.55)
Insert into dbo.Individuals values ('Kent C.','Detrees','4/30/2000','M','7708 Monroe','Forest Park','IL','60130','(708)692-4343',5.50)
Insert into dbo.Individuals values ('Joaquin','dePlanque','1/20/1975','M','825 Forest Avenue','Chicago','IL','60630','(312)222-4343',27.52)
Insert into dbo.Individuals values ('Lisa','Carr','3/20/1953','F','401 Linden Avenue','Chicago','IL','60630','(312)222-4343',77.95)
Insert into dbo.Individuals values ('Orson','Buggy','8/22/1970','F','165 N. Kenilworth, #6G','Chicago','IL','60630','(312)222-4343',50.55)
Insert into dbo.Individuals values ('Nomar','Wheaton','9/5/1940','M','606 S. Scoville','Chicago','IL','60630','(312)222-4343',24.81)
Insert into dbo.Individuals values ('Janet','Torino','9/11/1936','F','115 S. Harvey','Chicago','IL','60630','(312)222-4343',28.03)
Insert into dbo.Individuals values ('Hubert H.','HumveeII','11/14/1949','M','1025 Randolph','Chicago','IL','60630','(312)222-4343',73.87)
Insert into dbo.Individuals values ('Hugh','Wake','1/24/2008','M','112 N. Kenilworth','Chicago','IL','60630','(312)222-4343',33.37)
Insert into dbo.Individuals values ('Adam','Illion','6/12/1967','M','110 N. Taylor','Chicago','IL','60630','(773) 232-1212',63.02)
Insert into dbo.Individuals values ('Luke','Warm','2/26/1966','F','4917 W. Midway Park, Apt C','Chicago','IL','60644','(312)222-4343',76.40)
Insert into dbo.Individuals values ('Joaquin','Matilda','8/17/1946','M','1001 S. Devon Ave','Chicago','IL','60630','(312)222-4343',4.12)
Insert into dbo.Individuals values ('James','Bondo','8/30/1984','M','447 N. Kenilworth Ave.','Chicago','IL','60630','(312)222-4343',59.37)
Insert into dbo.Individuals values ('Rusty','Steele','10/7/1980','F','603 Edgewood Place','River Forest','IL','60540','(312)222-4343',52.02)
Insert into dbo.Individuals values ('Megan','Model','12/13/1974','F','116 S. Devon','Chicago','IL','60630','(312)222-4343',19.08)
Insert into dbo.Individuals values ('Fitz','Matush','5/6/1951','M','1032 N. Devon Ave','Chicago','IL','60630','(312)222-4343',61.14)
Insert into dbo.Individuals values ('Mischa','Turnov','7/7/1975','M','1838 Woodland Ave.','Western Springs','IL','60559','(312)222-4343',15.07)
Insert into dbo.Individuals values ('Nadia','Geddit','4/28/1947','M','1000 Home Ave.','Chicago','IL','60630','(312)222-4343',84.60)
Insert into dbo.Individuals values ('Freida','Gogh','3/14/2007','M','7301 Ibsen','Chicago','IL','60631','(312)222-4343',27.92)
Insert into dbo.Individuals values ('Frieda','Wander','11/14/1971','M','15 E. Jackson','Chicago','IL','60604','(312)222-4343',51.54)
Insert into dbo.Individuals values ('Sasha','Noyes','7/14/1963','F','428 S. East Ave.','Chicago','IL','60630','(312)222-4343',80.72)
Insert into dbo.Individuals values ('Ed','Amame','2/24/1954','F','746 N. Lombard Ave.','Chicago','IL','60630','(312)222-4343',44.23)
Insert into dbo.Individuals values ('Vera Lee','Isay','10/10/1954','M','1028 Gunderson Ave.','Chicago','IL','60630','(312)222-4343',24.85)
Insert into dbo.Individuals values ('Juan','Anatou','3/2/2005','F','1028 Gunderson Ave.','Chicago','IL','60630','(312)222-4343',13.79)
Insert into dbo.Individuals values ('I.','ShelbyReleased','12/4/1952','M','1126 Hayes','Chicago','IL','60630','',25.58)
Insert into dbo.Individuals values ('Tilda','Plierslip','3/2/1981','M','108 Bishop Quarter Lane','Chicago','IL','60630','(312)222-4343',76.78)
Insert into dbo.Individuals values ('Odessa','PaigeTurner','10/27/1982','F','152 N. Scoville','Chicago','IL','60630','(312)222-4343',85.12)
Insert into dbo.Individuals values ('Hadley','Newham','12/22/2012','F','433 S. Ridgeland Ave.','Chicago','IL','60630','(312)222-4343',31.11)
Insert into dbo.Individuals values ('Menachem','Down','7/30/1967','M','633 N. Marion','Chicago','IL','60630','(312)222-4343',16.88)
Insert into dbo.Individuals values ('Eureka','Garlic','10/8/1943','M','1031 S. Gunderson','Chicago','IL','60630','(312)222-4343',62.59)
Insert into dbo.Individuals values ('Isaiah','Olchap','10/9/1950','M','743 S. Gunderson','Chicago','IL','60630','(312)222-4343',34.92)
Insert into dbo.Individuals values ('Laura','Biden','5/11/2012','F','647 Woodbine','Chicago','IL','60630','(312)222-4343',99.88)
Insert into dbo.Individuals values ('Hugo','First','1/3/2005','F','100 Forest Place','Chicago','IL','60630','(312)222-4343',81.48)
Insert into dbo.Individuals values ('Angus','MacCoatup','8/18/2001','F','425 Washington Blvd. #1','Chicago','IL','60630','(312)222-4343',97.90)
Insert into dbo.Individuals values ('Phillip','Airtime','10/26/1955','M','831 N. Grove','Chicago','IL','60630','(312)222-4343',67.01)
Insert into dbo.Individuals values ('Bruno','Moore','1/9/1981','M','13 E. Lake Street','Northlake','IL','60164','(312)222-4343',73.63)
Insert into dbo.Individuals values ('Carlos','Antenna','7/15/1991','M','151 Lemoyne Parkway','Chicago','IL','60630','(312)222-4343',76.83)
Insert into dbo.Individuals values ('Euripedes','Ibreakayourface','2/24/1975','F','127 S. Home Ave.','Chicago','IL','60630','(312)222-4343',56.94)
Insert into dbo.Individuals values ('Sam','Boney','6/13/1955','F','911 Lathrop','River Forest','IL','60630','(312)222-4343',3.92)
Insert into dbo.Individuals values ('Barbara','Seville','9/21/1945','F','128 S. Austin','Chicago','IL','60630','(312)222-4343',40.32)
Insert into dbo.Individuals values ('Horatio','Algebra','2/23/1958','M','641 N. Marion St.','Chicago','IL','60630','(312)222-4343',38.21)
Insert into dbo.Individuals values ('Amos','Reid','6/22/1943','M','416 Harrison St.','Chicago','IL','60630','(312)222-4343',95.53)
Insert into dbo.Individuals values ('Ira','Caull','10/10/1956','M','721 Ontario #204','Chicago','IL','60630','(312)222-4343',27.43)
Insert into dbo.Individuals values ('Victor','Analysis','11/5/2001','F','1656 W. Estes Ave.','Chicago','IL','60645','(312)222-4343',6.05)
Insert into dbo.Individuals values ('Art','Majors','10/7/1968','M','746 Clinton Place','River Forest','IL','60630','(312)222-4343',24.95)
Insert into dbo.Individuals values ('Bernadette','Bridge','7/14/1993','M','426 S. Elmwood Ave.','Chicago','IL','60630','(312)222-4343',96.87)
Insert into dbo.Individuals values ('Wayne','Back','9/9/1993','M','1136 S. Scoville Ave.','Chicago','IL','60630','(312)222-4343',59.72)
Insert into dbo.Individuals values ('Juan','Menudo','4/15/1993','M','117 S. Euclid Ave.','Chicago','IL','60630','(312)222-4343',44.15)
Insert into dbo.Individuals values ('Jacques','Hughes','10/1/1966','F','1021 N. Elmwood Ave.','Chicago','IL','60630','(312)222-4343',25.14)
Insert into dbo.Individuals values ('Yessir','Itsaflat','5/16/1955','F','11050 Westminster','Westchester','IL','60154','(312)222-4343',39.31)
Insert into dbo.Individuals values ('Al','Lowetta','6/27/1941','M','936 Chicago Ave.','Chicago','IL','60630','(312)222-4343',15.16)
Insert into dbo.Individuals values ('Saul','Wellingood','9/21/1984','M','124 S. Elmwood','Chicago','IL','60630','(312)222-4343',42.79)
Insert into dbo.Individuals values ('Jillian','Here','2/13/1947','M','124 S. Elmwood Ave.','Chicago','IL','60630','(312)222-4343',14.97)
Insert into dbo.Individuals values ('Colette','ODay','12/28/1971','M','1125 Linden','Chicago','IL','60630','(312)222-4343',81.40)
Insert into dbo.Individuals values ('Hugh','Jass','4/13/1992','F','141 S. Taylor Ave.','Chicago','IL','60630','(312)222-4343',35.32)
Insert into dbo.Individuals values ('Gladys','Overwith','10/6/1942','F','1000 N. Harvey','Chicago','IL','60630','(312)222-4343',50.18)
Insert into dbo.Individuals values ('George','Stayontopothis','4/19/1988','F','1500 Monroe','River Forest','IL','60630','(312)222-4343',97.75)
Insert into dbo.Individuals values ('Ophelia','Paine','9/9/1997','M','111 N. Elmwood','Chicago','IL','60630','(312)222-4343',44.48)
Insert into dbo.Individuals values ('Xavier','Breath','12/2/2002','F','119 S. Harvey Ave.','Chicago','IL','60630','(312)222-4343',22.64)
Insert into dbo.Individuals values ('Levon','Hold','1/18/1980','F','147 Harrison','Chicago','IL','60630','(312)222-4343',7.16)
Insert into dbo.Individuals values ('Billy','Aiken','3/15/1965','F','1200 Linden Ave.','Chicago','IL','60630','(312)222-4343',19.58)
Insert into dbo.Individuals values ('C.','Boynton Glick','6/25/1942','M','114 Lake','Chicago','IL','60630','(312)222-4343',41.97)
Insert into dbo.Individuals values ('Philip','Harmonic','1/12/1985','M','134 Gale Ave.','River Forest','IL','60630','(312)222-4343',61.82)
Insert into dbo.Individuals values ('Yvonne','Apeesamey','5/31/1957','M','1047 Wenonah','Chicago','IL','60630','(312)222-4343',98.38)
Insert into dbo.Individuals values ('Eileen','Tudor-Wright','6/24/2012','M','415 N. Elmwood Ave.','Chicago','IL','60630','(312)222-4343',82.41)
Insert into dbo.Individuals values ('Nadia','Belimi','10/24/1993','M','129 S. Ridgeland','Chicago','IL','60630','(312)222-4343',62.28)
Insert into dbo.Individuals values ('Dustin','Dubree','6/18/1977','F','152554th Ave.','Phoenix','IL','60426','(312)222-4343',13.63)
Insert into dbo.Individuals values ('Evan','Elpus','7/8/1956','F','122 N. Ridgeland','Chicago','IL','60630','(312)222-4343',2.75)
Insert into dbo.Individuals values ('Cody','Pendant','8/8/2013','F','120 S. Taylor Ave.','Chicago','IL','60630','(312)222-4343',92.84)
Insert into dbo.Individuals values ('Pat','Pending','4/10/2010','M','125 S. Elmwood','Chicago','IL','60630','(312)222-4343',6.47)
Insert into dbo.Individuals values ('Hugh','Lyon Sack','12/1/1974','F','636 Linden Ave.','Chicago','Il','60630','(312)222-4343',69.50)
Insert into dbo.Individuals values ('Drew A.','Blank','2/26/2000','M','116 S. Scoville Ave.','Chicago','IL','60630','(312)222-4343',15.34)
Insert into dbo.Individuals values ('Lauren','Order','7/31/1936','M','167 N. Ridgeland','Our Fair City','MA','10101','(312)222-4343',71.99)
Insert into dbo.Individuals values ('Rex','Galore','4/20/1965','M','623 N. Euclid','Chicago','IL','60630','(312)222-4343',95.65)
Insert into dbo.Individuals values ('Haywood','Jabuzoff','3/18/2006','F','720 S. Harvey','Chicago','IL','60630','(312)222-4343',49.34)
Insert into dbo.Individuals values ('Justin','Volk V','10/2/1979','F','938 Norht Blvd., #205','Chicago','IL','60630','(312)222-4343',96.82)
Insert into dbo.Individuals values ('HeronimusB.','Blind','8/28/1973','M','1126 Edmer Ave.','Chicago','IL','60630','(312)222-4343',32.02)
Insert into dbo.Individuals values ('Donnatella','DiCoppas','1/5/1998','M','635 Fairoaks','Chicago','IL','60630','(312)222-4343',92.51)
Insert into dbo.Individuals values ('Gil T.','Azell','4/29/1950','M','412 Randolph St.','Chicago','IL','60630','(312)222-4343',28.18)
Insert into dbo.Individuals values ('Major','Error','9/19/1991','M','124 S. Devon','Chicago','IL','60630','(312)222-4343',83.65)
Insert into dbo.Individuals values ('Ginger','Vitis','8/5/1964','F','904 Forest Ave.','Chicago','IL','60630','(312)222-4343',99.68)
Insert into dbo.Individuals values ('Don','Pickett','1/29/1993','M','1020 Clinton Ave.','Chicago','IL','60630','(312)222-4343',91.54)
Insert into dbo.Individuals values ('Ike','Arumba','5/16/1956','M','1112 N. Elmwood Ave','Chicago','IL','60630','(312)222-4343',40.02)
Insert into dbo.Individuals values ('Tyra','Meesu','7/23/1973','F','P.O. BOX 770','Chicago','IL','60630','(312)222-4343',21.45)
Insert into dbo.Individuals values ('Bill','Shredder','12/22/1995','F','110 W. Madison Ave. #2F','Chicago','IL','60630','(312)222-4343',24.35)
Insert into dbo.Individuals values ('Dot','Matrix','4/19/1969','F','933 Jackson Ave.','River Forest','IL','60630','(312)222-4343',31.08)
Insert into dbo.Individuals values ('Fred','Knott','5/7/1989','M','121 Home Ave.','Chicago','IL','60630','(312)222-4343',23.69)
Insert into dbo.Individuals values ('Marianna','Trench','12/27/1965','M','141 S. Scoville Ave.','Chicago','IL','60630','(312)222-4343',4.01)
Insert into dbo.Individuals values ('Anita','Hammer','6/23/1980','M','1231 Belleforte','Chicago','IL','60630','(312)222-4343',26.66)
Insert into dbo.Individuals values ('Upton','Leftus','9/23/1987','F','126 N. Ridgeland','Chicago','IL','60630','(312)222-4343',73.82)
Insert into dbo.Individuals values ('AmandaB.','Reckondwyth','8/25/1936','F','1132 N. Ridgeland','Chicago','IL','60630','(312)222-4343',20.11)
Insert into dbo.Individuals values ('Nomar','Winter','6/24/1948','F','800 Gunderson Ave.','Chicago','IL','60630','(312)222-4343',74.19)
Insert into dbo.Individuals values ('Iona','Heap','9/14/1999','M','424 S. Austin Blvd. #3','Chicago','IL','60630','(312)222-4343',26.92)
Insert into dbo.Individuals values ('Lucinda','Boltz','7/31/2007','F','170 N. Cuyler','Chicago','IL','60630','(312)222-4343',93.56)
Insert into dbo.Individuals values ('Kay','Sera','7/29/1976','M','283 Pleasent Valley Rd','Westville','OH','34534','',1.72)
Insert into dbo.Individuals values ('Juan','Moorehouse','8/13/1967','F','234 Coldwater','Minneapolis','MN','57564','',68.11)
Insert into dbo.Individuals values ('Rose','Hips','8/15/1983','F','121 Temona Dr','Pleasent Hills','PA','50143','',96.28)
Insert into dbo.Individuals values ('Isabelle','Ringing','9/2/1936','M','350 N. Orleans, #892','Chicago','IL','60654-','',99.67)
Insert into dbo.Individuals values ('Maury','Missions','2/6/1984','M','5411 W Fullerton Ave','Chicago','IL','60639-1482','',40.95)
Insert into dbo.Individuals values ('Oscar','Ruitt','4/11/1940','M','2141 South Tan Court','Chicago','IL','60616-','',23.96)
Insert into dbo.Individuals values ('Lois','Bidder','10/27/2012','F','1400 W Augusta Blvd','Chicago','IL','60622-3939','',52.40)
Insert into dbo.Individuals values ('Donatella','Debois','6/1/1982','M','25 E Washington St Fl 16','Chicago','IL','60602-1708','',2.93)
Insert into dbo.Individuals values ('Eamon','Lowe','7/21/1990','M','1515 W Monroe St','Chicago','IL','60607-2497','',17.38)
Insert into dbo.Individuals values ('Linus','Scrimmage','1/19/2012','F','923 N. Robinson, Suite 400','Oklahoma City','OK','73102-2203','',49.70)
Insert into dbo.Individuals values ('Holly','Unlikely','10/16/1961','F','3 First National Plaz','Chicago','IL','60602-','',67.60)
Insert into dbo.Individuals values ('Eileen','Yorway','6/30/1992','M','2448 W Grace St','Chicago','IL','60618-4719','',8.05)
Insert into dbo.Individuals values ('Lee','Eyeapoka','12/29/1997','F','100 W Randolph St','Chicago','IL','60601-3108','',84.23)
Insert into dbo.Individuals values ('Donatello','Nobatti','7/19/1999','M','401 S Clinton St','Chicago','IL','60607-','',3.02)
Insert into dbo.Individuals values ('Ewell','Rudy Day','9/27/1980','F','500 N Peshtigo Ct','Chicago','IL','60611-4309','',37.42)
Insert into dbo.Individuals values ('Sumner','Reruns','8/14/1990','M','401 S Clinton St','Chicago','IL','60607-','',33.97)
Insert into dbo.Individuals values ('Holly','Unlikely','4/2/1986','M','77 W Jackson Blvd','Chicago','IL','60604-3511','',9.13)
Insert into dbo.Individuals values ('Ophelia','Self','1/13/1945','F','4554 N. Broadway, St. 301','Chicago','IL','60640-','',81.84)
-- EOF Add_Individuals.sql
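After running the script, a quick check can confirm that the table loaded as expected and show which name and birth date combinations occur more than once. This is a minimal sketch, assuming the table was created exactly as above; the column aliases TotalRows and Copies are hypothetical.

```sql
-- Sketch only: the aliases TotalRows and Copies are illustrative.
-- The row count should match the 137 rows referred to in the text.
Select Count(*) As TotalRows
From dbo.Individuals;

-- Rows that share last name, first name and birth date are true duplicates;
-- rows that share only the names may simply be different generations.
Select I.LName, I.FName, I.BirthDate, Count(*) As Copies
From dbo.Individuals I
Group By I.LName, I.FName, I.BirthDate
Having Count(*) > 1
Order By I.LName, I.FName, I.BirthDate;
```

Dropping BirthDate from the Group By and Order By lists widens the check to identically named individuals, which is the situation the parent and child remark in the text refers to.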
Appendix B
Queries

-- Samples.sql
/* Sample methods to select a sample set from a larger table */
Use Demonstration
Go

-- View the entire data set
/*
-- Shows entire table contents in data entry order
Select I.IdNo, I.LName, I.FName, I.BirthDate
From Individuals I
*/

/* Shows the first 25 records in the order that they were entered. Not a good approach
   to retrieving a trustworthy sample. Includes duplicate rows 1 and 2
Select Top 25 I.IdNo, I.LName, I.FName, I.BirthDate
From Individuals I
*/

/* Shows 25 randomly selected records. This is the preferred method for a truly random
   sample. Each time this query runs it will select a different set of records.
   Add "Percent" after "Top 25" to select a percentage of the full data set
Select Top 25 percent I.IdNo, I.LName, I.FName, I.BirthDate
From Individuals I
Order By NewId()
*/

/* Use the modulus to select every nth record based on IdNo. Avoid any early record
   bias by starting at a higher IdNo
Select I.IdNo, I.LName, I.FName, I.BirthDate
From Individuals I
Where (IdNo + 2) % 5 = 0
  And I.IdNo > 11
*/

/* The NCQA Systematic Selection standard calls for the dataset to be sorted by last name,
   first name and birthday and for this set to be numbered consecutively. This query
   will be used in the following examples as a common table expression. It is shown
   here to demonstrate the sorting functionality of Row_Number directly.
Select ROW_NUMBER() Over (Order By I.LName, I.FName, I.BirthDate Asc) RowNum,
       I.IdNo, I.LName, I.FName, I.BirthDate
From Individuals I
*/

/* Systematic selection of records per NCQA only requires the row number for the actual
   selection. Once completed, the results can be joined back to the original dataset
   for additional records
;With Ind_List_CTE (RowNum, IdNo, LName, FName, BirthDate) As
(
    Select ROW_NUMBER() Over (Order By I.LName, I.FName, I.BirthDate Asc) RowNum,
           I.IdNo, I.LName, I.FName, I.BirthDate
    From Individuals I
)
Select * from Ind_List_CTE
*/

/* For the systematic sample per HEDIS/NCQA Technical Specifications.
   Assume a starting value of 2
;With Row_List_CTE (RowNum) As
(
    Select ROW_NUMBER() Over (Order By I.LName, I.FName, I.BirthDate Asc)
    From Individuals I
)
Select Top 25
       RowNum
     , RowNum - 1 SelRec
     , 137.0/25.0 NR
     , (RowNum - 1) * (137.0/25.0) RN1
     , CAST(ROUND(2 + (RowNum - 1) * (137.0/25.0), 0) As Int) Final
From Row_List_CTE
Order By RowNum
*/

/* Final Query */
Declare @EvalDt Date
Declare @N Decimal(9,3)
Set @EvalDt = '12-31-2012'
Set @N = 137.0/25.0

;With Ind_List_CTE (RowNum, IdNo, LName, FName, BirthDate) As
(
    Select ROW_NUMBER() Over (Order By I.LName, I.FName, I.BirthDate Asc) RowNum,
           I.IdNo, I.LName, I.FName, I.BirthDate
    From Individuals I
)
Select RowNum
     , IdNo
     , FName
     , LName
     , BirthDate
     , DATEDIFF(Hour, BirthDate, @EvalDt)/8766 As Age
From Ind_List_CTE
Where RowNum In
(
    Select Top 25
           --CAST(ROUND(2+(RowNum-1) * @N,0) As Int)
           Floor(ROUND(2 + (RowNum - 1) * @N, 0))
    From Ind_List_CTE
    Order By RowNum   -- keeps Top 25 on rows 1 through 25
)
-- EOF Samples.sql