Frequency Measures Prepared by: Dr. Alber Paules
Introduction to Frequency Distributions• In healthcare, we deal with vast quantities of clinical data. Since it is very difficult to look at data in raw form, data are summarized into frequency distributions.• A frequency distribution shows the values that a variable can take and the number of observations associated with each value.
Ungrouped Frequency Distribution
Frequency Distribution for Patient LOS
LOS    Frequency
1      2
2      6
3      6
4      5
5      11
6      6
7      8
8      5
9      3
10     1
11     2
12     3
Grouped Frequency Distribution
Frequency Distribution for Patient LOS
Class Interval    Frequency
1-2               8
3-4               11
5-6               17
7-8               13
9-10              4
11-12             5
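The step from the ungrouped to the grouped table can be sketched in Python. This is only an illustration: the function name `group_frequencies` and its `width` and `start` parameters are assumptions, not part of the slides.

```python
from collections import Counter

# Ungrouped distribution from the slide: LOS in days -> number of patients.
ungrouped = {1: 2, 2: 6, 3: 6, 4: 5, 5: 11, 6: 6,
             7: 8, 8: 5, 9: 3, 10: 1, 11: 2, 12: 3}

def group_frequencies(freq, width=2, start=1):
    """Collapse an ungrouped frequency table into equal-width class intervals."""
    grouped = Counter()
    for value, count in freq.items():
        # Lower limit of the class interval that contains this value.
        lower = start + ((value - start) // width) * width
        grouped[(lower, lower + width - 1)] += count
    return dict(sorted(grouped.items()))

grouped = group_frequencies(ungrouped)
for (low, high), count in grouped.items():
    print(f"{low}-{high}: {count}")
```

Each class interval simply sums the frequencies of the LOS values it covers, so the total number of patients is unchanged by grouping.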
The Variable• A Variable is a characteristic or property that may take on different values.• Height, weight, gender, and third-party payer are examples of variables.
The Variable
Variables can be classified into:
1. A Quantitative variable: measures outcomes that are expressed numerically. Examples include patient’s age, LOS, weight, height.
2. A Qualitative variable: consists of outcomes that cannot be expressed numerically without modification/coding. Examples include patient satisfaction (very satisfied, satisfied, neutral, dissatisfied, very dissatisfied), and evaluation of a departmental performance (poor, average, good).
The Variable
Quantitative variables can be either:
i. Discrete variables: can assume only certain values, and there are usually gaps between consecutive values. For example, the number of beds in a hospital can take only integer values (e.g., 92, 95); thus, there is a “gap” between possible values, and a hospital cannot have 93.56 beds. Typically, discrete variables result from counting.
ii. Continuous variables: here, observations can assume any value within a specific range. Body weight is an example; it can take decimal values (e.g., 80.7 kg). Typically, continuous variables result from measuring.
Scales/Levels of Measurement
(arranged from the lowest to the highest level of measurement)
1. Nominal scale (used with qualitative variables)
2. Ordinal scale (used with qualitative variables)
3. Scale for metric variables (used with quantitative variables)
- The level of measurement of the data often dictates the calculations that can be done to summarize and present the data. It also determines the statistical tests that should be performed with this data.
Scales/Levels of Measurement
1. Nominal scale:
• Here, observations of a qualitative variable may only be classified and counted.
• Measures are organized into categories; there is no recognition of order within these categories. Examples of categories on the nominal scale of measurement are gender, nationality, and the six colors of M&M’s milk chocolate candies.
Scales/Levels of Measurement
Nominal scale:
Gender    Number
Male      6
Female    4
TOTAL     10
Here, we classified patients by the gender attribute and counted them.
Scales/Levels of Measurement
2. Ordinal scale:
• Here, observations of a qualitative variable may be classified, ranked, and counted.
• Here, categories have an order.
• Examples of ordinal variables are the ordered adjectives describing patient satisfaction; numbers may be assigned to represent the ordering of the categories (e.g., the Likert-type scale that has five points from “strongly disagree” to “strongly agree”).
Scales/Levels of Measurement
Ordinal scale:
Degree of Agreement    Frequency
S. Agree = 5           10
Agree = 4              8
Neutral = 3            3
Disagree = 2           2
S. Disagree = 1        1
Here, we classified the degree of agreement on a ranked scale and counted the responses.
Scales/Levels of Measurement
2. Ordinal scale:
• Here, the order of numbers (e.g., 1-5) is meaningful, and the number can be treated as a weight/score and can yield a mean.
• However, the intervals between successive categories are not equal; so, we cannot say that a patient who gives a score of 4 (agree) agrees twice as much as a patient who gives a score of 2 (disagree). We cannot determine the magnitude of the difference between categories.
Scales/Levels of Measurement
3. Scales for Metric variables:
Interval scale:
– Here, the intervals between successive values are equal.
– There is a defined unit of measure, values can be ranked, and there is a meaningful difference between values; but there is no true zero point and no ratios between values (although there are ratios between intervals).
Scales/Levels of Measurement
3. Scales for Metric variables:
Interval scale:
– Time of day is measured on an interval scale; we cannot say that 10:00 AM is twice 5:00 AM, but we can say that the interval between 0:00 AM (midnight) and 10:00 AM is twice as long as the interval between 0:00 AM and 5:00 AM.
– We cannot say that 0:00 AM means absence of time.
Scales/Levels of Measurement
3. Scales for Metric variables:
Ratio scale:
– This is the highest level of measurement.
– The intervals between successive values are equal.
– There is a real zero point, and ratios between values are meaningful. For weight, 0 kg means no weight. On the length scale, we can say "Mary is twice as tall as Jill", i.e., the ratio of two numbers is meaningful.
Scales/Levels of Measurement
N.B.
• The scale of measurement you use depends on the method of measurement.
Scales/Levels of Measurement
Ratio scale:
Students    Score (out of 10)
A           8
B           4
C           10
D           3
E           6
Here, we scored the students (counted the right answers for each student).
Scales/Levels of Measurement
Nominal scale:
Success status    Frequency
Failed            3
Passed            2
Here, we classified the students and counted them.
Scales/Levels of Measurement
Ordinal scale:
Success status    Frequency
Excellent         1
Good              2
Fair              2
Here, we classified the students on a ranked scale and counted them.
Measures of Central Tendency and Variability• Two main types of measures are used to describe frequency distributions: measures of central tendency and measures of variability.• Measures of central tendency focus on the typical value of a data set, while measures of variability measure dispersion around the typical value of a data set.
Measures of Central Tendency• Measures of central tendency summarize the typical value of a variable.• There are three major measures of central tendency: 1. Mode: appropriate for nominal and metric data 2. Median: appropriate for ordinal and metric data 3. Mean: appropriate for metric data
Measures of Central Tendency
Mode
– The mode is the simplest measure of central tendency.
– Mode is defined as the most frequently occurring observation for metric data or the most frequently occurring category for nominal data.
– It is the only measure of central tendency that is appropriate for nominal data, because an average cannot be taken from data that are placed in categories (averages only work when a variable has a unit of measurement).
Measures of Central Tendency
Mode
– The mode offers several advantages: for example, it is not sensitive to extreme values, and it is easy to communicate and explain to others.
– On the other hand, the mode does not provide information about the entire frequency distribution; it tells us only the most frequently occurring value and, unlike the mean, ignores all other values.
Measures of Central Tendency
Median
– When categories of a variable are ordered, the measure of central tendency should take order into account. The median does so by finding the value of the variable that corresponds to the middle case.
– The median is the midpoint of the frequency distribution (above which 50% of the cases fall and below which 50% of the cases fall); it is appropriate for both ordinal and metric data.
– It is usually preferred for metric data sets that contain outliers, which would distort the mean.
Measures of Central Tendency
Median
– If there is an odd number of observations, the median is the middle number.
– If there is an even number of observations, the median is the average of the two middle observations, i.e., the midpoint between them. If the two middle observations take on the same value, the median is that value.
– There are several advantages of using the median:
1. It is relatively easy to obtain,
2. It is not influenced by extreme values.
Measures of Central Tendency
Median
The following example illustrates one common use of the median: assume that a faculty has 50 available scholarships and has to choose 50 scholars among the 101 graduates who applied. One way to choose the best 50 is to arrange the applicants in ascending order of their scores and choose the 50 who rank above the applicant with the median score (the 50th percentile score).
Measures of Central Tendency
Mean
– Symbolized by X̄ (X-bar), it is the arithmetic average of the values of the variable; it is appropriate for metric data.
– The mode and the median can be computed on metric data, but they do not take full advantage of the numeric data in the frequency distribution.
– It is calculated by dividing the sum of the observed values by the total number of observations in the distribution.
Measures of Central Tendency
Mean
– There are two disadvantages associated with the mean:
• First, the mean can take a fractional value even when the variable itself can take only integer values (e.g., 16.7 days, which means that the number of days is between 16 and 17).
• Second, the mean is sensitive to extreme values, i.e., it is strongly influenced by outliers, which can produce a misleading value.
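As a rough illustration, the three measures can be computed for the patient LOS frequency distribution shown earlier, using Python's standard `statistics` module. Expanding the frequency table into a list of individual observations is a convenience chosen here, not a step prescribed by the slides.

```python
import statistics

# Patient LOS frequency distribution from the earlier slide (days -> patients).
los_freq = {1: 2, 2: 6, 3: 6, 4: 5, 5: 11, 6: 6,
            7: 8, 8: 5, 9: 3, 10: 1, 11: 2, 12: 3}

# Expand the table into one entry per patient, in ascending order.
observations = [los for los, count in sorted(los_freq.items()) for _ in range(count)]

mode = statistics.mode(observations)      # most frequently occurring LOS
median = statistics.median(observations)  # midpoint of the ordered observations
mean = statistics.mean(observations)      # sum of values / number of observations

print(f"n={len(observations)}, mode={mode}, median={median}, mean={mean:.2f}")
```

The mean comes out as a fractional number of days even though every LOS is an integer, which is the first disadvantage noted above.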
Measures of Central Tendency
Weighted Mean
– Often in the healthcare setting, we have separate samples (e.g., for different time intervals) with a separate mean for each, and the samples may be of different sizes.
– The weighted mean takes into account the difference in the sizes of the samples and is therefore more precise.
Measures of Central Tendency
Weighted Mean
Month    Discharges    ALOS
Jan      974           4.46
Feb      763           5.20
Mar      574           3.21
Average of the Means = (4.46 + 5.20 + 3.21) / 3 = 4.29
Weighted Mean = [974(4.46) + 763(5.20) + 574(3.21)] / 2311 = 4.39 (more precise)
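The two calculations can be checked with a short sketch. The variable names are illustrative; the figures are the monthly discharges and ALOS values from the table.

```python
# Monthly discharges and average length of stay (ALOS) for Jan, Feb, Mar.
discharges = [974, 763, 574]
alos = [4.46, 5.20, 3.21]

# Simple average of the three monthly means (ignores sample sizes).
average_of_means = sum(alos) / len(alos)

# Weighted mean: each monthly mean is weighted by its number of discharges.
weighted_mean = sum(n * m for n, m in zip(discharges, alos)) / sum(discharges)

print(f"Average of the means: {average_of_means:.2f}")
print(f"Weighted mean:        {weighted_mean:.2f}")
```

The weighted mean is pulled toward the January and February values because those months contribute more discharges.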
Measures of Central Tendency
N.B.
– If a data set has a normal (symmetrical) distribution, the three measures of central tendency coincide at the same point (i.e., mean = mode = median).
Measures of Variability• Measures of variability tell the spread of the frequency distribution (i.e.) how widely the observations are spread out around the measure of central tendency.• The most commonly used measures of spread are the variance and the standard deviation.• Measures of spread increase in value with greater variation on the variable. Measures of spread equal zero when there is no variation.
Measures of Variability
Range
– The simplest measure of spread is the range. It is simply the difference between the largest and the smallest values in a frequency distribution: Range = Xmax – Xmin
– The range is easy to calculate, but it is affected by extreme values; only the two most extreme scores affect its value, so it is not sensitive to other values in the distribution. The range is also dependent on the sample size: in general, the larger the sample size, the greater the range.
Measures of Variability
Range
– Two frequency distributions may have the same range, but the observations may differ greatly in variability. For example, consider the following two frequency distributions:
Distribution 1: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Distribution 2: 1, 1.5, 3, 3.5, 3.7, 7, 8, 8.2, 10, 10
Measures of Variability
Range (continued from previous slide)
– The range for both distributions is 9. But if we compare the two distributions, we see that there is more variation in distribution 2 than in distribution 1.
– This is confirmed when the variance for each distribution is calculated: the variances for distributions 1 and 2 are 9.17 and 11.77, respectively (standard deviations 3.03 and 3.43).
Measures of Variability
Variance
– The variance (s²) is the average of the squared deviations from the mean. Each deviation is calculated by subtracting the mean of the frequency distribution from a value in the distribution; the deviation is then squared. The squared deviations are summed and divided by (n - 1).
Measures of Variability
Variance
\[ s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]
The interpretation of the variance is not easy at the descriptive level because the original units of measure are squared.
Measures of Variability
Standard Deviation
– The square root of the variance is the standard deviation (s). It measures variability in the same unit of measurement as the sample.
– The standard deviation is the most widely used measure of variation in descriptive statistics.
Measures of Variability
Standard Deviation
– Example: if the mean age of nursing home residents is 82 and the SD is 4.45, then, assuming ages are approximately normally distributed, about 68% of the residents are between the ages of 77.55 and 86.45 (the mean minus and plus one SD).
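A minimal sketch of the (n - 1) variance formula, applied to the two distributions from the range example, followed by the one-SD interval for the nursing home example. The helper name `sample_variance` is an assumption made for illustration.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

distribution_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
distribution_2 = [1, 1.5, 3, 3.5, 3.7, 7, 8, 8.2, 10, 10]

for name, values in (("Distribution 1", distribution_1),
                     ("Distribution 2", distribution_2)):
    variance = sample_variance(values)
    sd = math.sqrt(variance)  # standard deviation, in the original units
    value_range = max(values) - min(values)
    print(f"{name}: range={value_range}, variance={variance:.2f}, sd={sd:.2f}")

# One-SD interval around the mean for the nursing home example.
mean_age, sd_age = 82, 4.45
lower, upper = mean_age - sd_age, mean_age + sd_age
print(f"About 68% of residents between {lower:.2f} and {upper:.2f} years")
```

Both distributions share a range of 9, yet the second has the larger variance; the standard deviations come out near 3.03 and 3.43.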
Percentiles
– Percentiles are measures of ranking.
– If a student in a graduating class has the 85th percentile rank, this means that 85% of the students in the class are ranked lower than this student. Thus, the percentile rank for the individual graduating last would be 0%, because no classmates are ranked below. In a class of 200, the first will have a percentile rank of 99.5; this means that 99.5% of the class (199 students) are ranked lower.
– The individual with the “median” rank has the 50th percentile rank.
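The percentile rank described here can be sketched as the share of classmates ranked below a student. The function name and the convention that rank 1 is the top of the class are assumptions for illustration.

```python
def percentile_rank(rank_from_top, class_size):
    """Percentage of the class ranked below a student (rank 1 = top of class)."""
    ranked_below = class_size - rank_from_top
    return 100 * ranked_below / class_size

first_in_class = percentile_rank(1, 200)   # 199 of 200 classmates ranked lower
last_in_class = percentile_rank(200, 200)  # nobody ranked lower

print(first_in_class, last_in_class)
```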
Ratios, Proportions, and Rates
– Qualitative nominal variables often have only two possible categories, such as alive or dead, or male or female. Variables having only two possible categories are called dichotomous.
– The frequency measures used with dichotomous variables are ratios, proportions, and rates.
– The 3 measures are based on the same formula:
Ratio, proportion, rate = (x / y) × 10^n
Ratios, Proportions, and Rates
Ratios
– In a ratio, the values of a variable, such as sex (x = female, y = male), may be expressed so that x and y are completely independent of each other, or x may be included in the denominator.
– For example, the sex of patients discharged from a hospital could be compared in either of two ways:
Female/male or x/y
Female/(male + female) or x/(x+y)
– Both expressions are considered ratios (the 2nd type is called a proportion).
Ratios, Proportions, and Rates
Ratios
– For example, suppose that the female discharges from your hospital during July were 457 and the male discharges were 395; then the female-to-male ratio would be 457/395 or 1.16/1, i.e., there were 1.16 female discharges for every male discharge.
Ratios, Proportions, and Rates
Proportions
– A proportion is a particular type of ratio.
– A proportion is a ratio in which x is a portion of the whole, x+y.
– In the previous example, the proportion of female discharges during July would be 457/(457+395) = 457/852 or 0.54/1.00, i.e., the proportion of discharges that were female is 0.54.
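The July discharge example can be worked through in a few lines. This is a sketch with illustrative variable names; the counts are those given on the slides.

```python
female_discharges = 457
male_discharges = 395

# Ratio: x and y are independent groups (female to male).
female_to_male_ratio = female_discharges / male_discharges

# Proportion: x is part of the whole (x + y).
female_proportion = female_discharges / (female_discharges + male_discharges)

print(f"Female-to-male ratio: {female_to_male_ratio:.2f} to 1")
print(f"Proportion female:    {female_proportion:.2f}")
```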
Ratios, Proportions, and Rates
Rates
– In healthcare, rates are often used to measure an event over time and are sometimes used as performance improvement measures.
– The basic formula for a rate is:
Rate = (No. of cases or events occurring during a given time period / No. of cases or population at risk during the same time period) × 10^n
Ratios, Proportions, and Rates
Rates
– In inpatient facilities, there are many commonly computed rates.
– In computing the C-section (CS) rate, for example, we count the number of C-sections performed during a given period of time; this value is placed in the numerator. The number of cases or population at risk is the number of women who delivered during the same time period; this number is placed in the denominator.
– By convention, inpatient hospital rates are calculated as a rate per 100 cases and are expressed as a percentage.
Rates• Example: for the month of July, 23 C-sections were performed; during the same time period, 149 women delivered. What is the CS rate for the month of July? The rate would be (23/149) x 100 = 15.4%; thus, the CS rate for the month of July is 15.4%.
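The general rate formula and the C-section example can be sketched as follows. The function name `rate` and its `n` parameter (the exponent in 10^n) are naming choices made here for illustration.

```python
def rate(events, population_at_risk, n=2):
    """Events divided by the population at risk, scaled by 10**n.

    n=2 gives a rate per 100 (a percentage), the inpatient convention.
    """
    return events / population_at_risk * 10 ** n

# July: 23 C-sections among 149 women who delivered.
cs_rate_july = rate(23, 149)
print(f"C-section rate for July: {cs_rate_july:.1f}%")
```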
Inpatient Census
• Inpatient census refers to the number of hospital inpatients present at any one time.
• Because the census may change throughout the day as admissions and discharges occur, in most facilities the official count takes place at midnight (12:00 am). This count is referred to as the Daily Inpatient Census.
• The Daily Inpatient Census also includes any patient who was admitted and discharged on the same day (e.g., a patient who was admitted at 1 pm and died at 4 pm on the same day).
Sample Daily Inpatient Census Report, May 2
Number of patients in hospital at midnight, May 1                                 230
+ Number of patients admitted May 2                                               +35
- Number of patients discharged, including deaths, May 2                          -40
Number of patients in hospital at midnight, May 2                                 225
+ Number of patients both admitted and discharged on May 2, including deaths      +5
  (those patients are not present at the time of the census, but they utilized the hospital services during this day)
Daily Inpatient Census at midnight, May 2                                         230
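The arithmetic of the sample report can be sketched as a small function. The function and parameter names are hypothetical, and the sketch assumes, as the report layout suggests, that the admitted and discharged counts exclude patients who were both admitted and discharged on the same day.

```python
def daily_inpatient_census(previous_midnight, admitted, discharged, same_day):
    """Return (patients present at midnight, daily inpatient census).

    Assumes `admitted` and `discharged` exclude same-day patients, who are
    added back because they used hospital services during the day.
    """
    present_at_midnight = previous_midnight + admitted - discharged
    return present_at_midnight, present_at_midnight + same_day

present, census = daily_inpatient_census(230, 35, 40, 5)
print(f"Present at midnight: {present}, daily inpatient census: {census}")
```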
Inpatient Bed Occupancy Rate
• The Inpatient Bed Occupancy Rate is the percentage of official beds occupied by hospital inpatients for a given period of time.
• The numerator is the sum of the daily inpatient census counts during a certain period, while the denominator is the total number of bed count days (no. of beds multiplied by the no. of days during the same period).
• Formula = (Total of daily inpatient census for a given period / Total no. of inpatient bed count days for the same period) × 100
Inpatient Bed Occupancy Rate
• For example, if 200 patients occupied beds in a 280-bed hospital on May 2, the inpatient bed occupancy rate would be (200/280) x 100 = 71.4%
• If the total inpatient census for 7 days = 1729, the no. of beds (280) would be multiplied by the no. of days (7), and the inpatient bed occupancy rate for that week would be [1729/(280 x 7)] x 100 = 88.2%
• The inpatient bed occupancy rate may exceed 100%; this may occur during epidemics or disasters, when hospitals set up extra temporary beds that are not included in the official count.
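Both occupancy examples follow from the same formula, sketched below. The function name and the default of one day are assumptions for illustration.

```python
def bed_occupancy_rate(total_census, bed_count, days=1):
    """Total daily inpatient census over a period, as a percentage of bed count days."""
    bed_count_days = bed_count * days
    return total_census / bed_count_days * 100

single_day = bed_occupancy_rate(200, 280)         # May 2 only
one_week = bed_occupancy_rate(1729, 280, days=7)  # census summed over 7 days

print(f"May 2: {single_day:.1f}%, week: {one_week:.1f}%")
```

A result above 100 is possible when the census includes patients in temporary beds that are not part of the official bed count.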
Bed Turnover Rate
• Bed turnover rate informs us about the number of times each hospital bed changes occupants.
• Formula = Total no. of discharges, including deaths, for a given time period / Average bed count for the same period
Average bed count = sum of the daily no. of available beds divided by the no. of days
• For example, if hospital XYZ had 2060 discharges (including deaths) during April and the average bed count was 400, then the bed turnover rate = 2060/400 = 5.15 (this means that on average each hospital bed had about 5 occupants during April).
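A short sketch of the bed turnover calculation for the hospital XYZ example. The function name is illustrative only.

```python
def bed_turnover_rate(discharges_including_deaths, average_bed_count):
    """Average number of occupants per bed over the period."""
    return discharges_including_deaths / average_bed_count

april_turnover = bed_turnover_rate(2060, 400)
print(f"Bed turnover rate for April: {april_turnover:.2f}")
```

Rounded to a whole number, this is about 5 occupants per bed during the month.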
Length of Stay (LOS) Data
• LOS is calculated after the patient is discharged.
• It refers to the number of calendar days from the day of patient admission to the day of discharge. The day of discharge is not counted.
• It is calculated by subtracting the date of admission from the date of discharge. For example, the LOS of a patient admitted on May 12 and discharged on May 17 is 5 days (17 - 12 = 5).
• When a patient is admitted and discharged on the same day, the LOS = 1 day. The LOS is likewise 1 day for a patient who is admitted on one day and discharged the next day.
Length of Stay (LOS) Data
• When the LOS for all patients discharged (or died) for a given period of time is summed, the result is the total LOS.
Patient      LOS
1            5
2            3
3            1
4            8
5            10
Total LOS    27
Length of Stay (LOS) Data
• The total LOS divided by the number of patients discharged is the average LOS (ALOS).
• Formula = Total LOS for a given time period / Total no. of discharges (including deaths) for the same period
• For the previous example, 27/5 = 5.4 days
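The LOS rules and the ALOS formula can be sketched with Python's `datetime.date`. The function name is an assumption, and the year used in the dates is a placeholder, since the slides give only the month and day.

```python
from datetime import date

def length_of_stay(admitted, discharged):
    """Calendar days from admission to discharge; the discharge day is not counted.

    A patient admitted and discharged on the same day has an LOS of 1 day.
    """
    return max((discharged - admitted).days, 1)

# Placeholder year: the slides specify only May 12 and May 17.
los_example = length_of_stay(date(2024, 5, 12), date(2024, 5, 17))
same_day = length_of_stay(date(2024, 5, 12), date(2024, 5, 12))
next_day = length_of_stay(date(2024, 5, 12), date(2024, 5, 13))

# Total and average LOS for the five discharged patients in the table.
stays = [5, 3, 1, 8, 10]
total_los = sum(stays)
alos = total_los / len(stays)

print(los_example, same_day, next_day, total_los, alos)
```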
Hospital Inpatient Death (Mortality) Rate
• It is the basic indicator of mortality in a healthcare facility.
• Formula = (Total no. of inpatient deaths for a given time period / Total no. of discharges, including deaths, for the same period) × 100
• There are more specific mortality measures (e.g., maternal death rates).
Measures of Morbidity• Some commonly used measures to describe the presence of disease in a community or a specific location, such as a nursing home (long-term care facilities), are incidence and prevalence rates.
Incidence Rate
• The formula for calculating the incidence rate is:
Incidence rate = (Total no. of new cases of a specific disease during a given time interval / Total population at risk during the same time interval) × 10^n
• For 10^n, a value is selected so that the smallest rate calculated results in a whole number.
Prevalence Rate
• The formula for calculating the prevalence rate is:
Prevalence rate = (All new & preexisting cases of a specific disease during a given time interval / Total population during the same time period) × 10^n
Example: At Manor Nursing Home, 10 new cases of Klebsiella pneumoniae occurred in January. For the month of January there were a total of 17 cases of Klebsiella pneumoniae. The facility had 250 residents during January.
Incidence rate for the month of January = (10/250) x 100 = 4%
Prevalence rate for the month of January = (17/250) x 100 = 6.8%
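The Manor Nursing Home example can be reproduced with a short sketch. The variable names are illustrative; the rates are expressed per 100 residents, as in the example.

```python
residents = 250
new_cases = 10     # cases that first occurred in January
total_cases = 17   # new plus preexisting cases present in January

# Incidence counts only new cases; prevalence counts all existing cases.
incidence_rate = new_cases / residents * 100
prevalence_rate = total_cases / residents * 100

print(f"Incidence: {incidence_rate:.1f}%, prevalence: {prevalence_rate:.1f}%")
```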
Hospital Infection Rates
• The most common morbidity rates calculated for hospitals are related to hospital-acquired (nosocomial) infections.
• Examples of nosocomial infections include urinary tract infections, infections related to intravascular catheters, surgical wound infections, respiratory tract infections, etc.
• The hospital infection rate can be calculated for the entire hospital or for a specific unit in the hospital.
Hospital Infection Rates
• The formula for calculating the nosocomial infection rate is:
Nosocomial infection rate = (Total no. of hospital infections for a given time period / Total no. of discharges, including deaths, for the same period) × 100
• The formula for calculating the post-operative infection rate is:
Post-operative infection rate = (No. of infections in clean surgical cases for a given time period / Total no. of surgical operations for the same period) × 100
• A clean surgical case is one in which no infection existed prior to surgery.
Safety-Related Measures
• Incidence rates are used to monitor safety-related issues. The nosocomial infection rate is a safety-related measure.
• The numerator of the rate is the number of times the specific event occurred in the observed population. The denominator is the number of patient-days (patient-days are obtained by summing the LOS for all patients during a given time interval).
Safety-Related Measures
• The use of the number of hospital discharges as a denominator to calculate error and infection rates is crude and inaccurate. It is better to use "patient-days" as a denominator, since a patient with a 4-day admission would have on average twice the risk of exposure as one with a 2-day admission.
• For example, ventilator-associated pneumonia (VAP) rates in ICUs can be calculated per 1,000 ventilated patient-days rather than per total number of discharges from the ICUs.
Safety-Related Measures
• The use of unified denominators allows for valid comparisons and identification of real differences in frequency.
• For example, if ward "A" at one hospital has 6 falls in 182 patient-days (i.e., incidence rate = 33 per 1,000 patient-days), while ward "B" has 12 falls in 720 patient-days (i.e., incidence rate = 17 per 1,000 patient-days), then focusing on ward B because of its larger number of falls would divert attention from ward A, which has the higher rate of falls.
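The ward comparison can be sketched as follows. The function name is an assumption; the counts are those of the example.

```python
def incidence_per_1000_patient_days(events, patient_days):
    """Events per 1,000 patient-days of exposure."""
    return events / patient_days * 1000

ward_a = incidence_per_1000_patient_days(6, 182)
ward_b = incidence_per_1000_patient_days(12, 720)

print(f"Ward A: {ward_a:.1f} falls per 1,000 patient-days")
print(f"Ward B: {ward_b:.1f} falls per 1,000 patient-days")
```

Ward B records twice as many falls in absolute terms, but once exposure is taken into account ward A has roughly double the rate.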
Safety-Related Measures Risk Stratification and Subgroup Analysis• Heterogeneity of risk within the patient population or across the period of observation calls for risk stratification.• For example, risk stratification may involve sorting the population at risk by a common variable, such as age, gender, or admitting diagnosis. More precise stratification can help to identify the groups within a population at greatest risk.
Safety-Related Measures
Risk Stratification by Wound Class
Wound Class                Number of Infections    Number of Surgeries    Rate of infection (per 1,000 surgeries)
I. Clean                   3                       160                    19
II. Clean-Contaminated     11                      240                    46
III. Contaminated          13                      56                     232
IV. Dirty                  5                       12                     417
TOTAL                      32                      468                    68
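A sketch of how the stratified and overall infection rates in the table are obtained. The dictionary layout and function name are illustrative; rates are per 1,000 surgeries and rounded to the nearest whole number.

```python
# Wound class -> (number of infections, number of surgeries), from the table.
wound_classes = {
    "I. Clean": (3, 160),
    "II. Clean-Contaminated": (11, 240),
    "III. Contaminated": (13, 56),
    "IV. Dirty": (5, 12),
}

def rate_per_1000(infections, surgeries):
    return infections / surgeries * 1000

# One rate per stratum, plus the crude (unstratified) rate for comparison.
stratified = {name: rate_per_1000(i, s) for name, (i, s) in wound_classes.items()}
total_infections = sum(i for i, _ in wound_classes.values())
total_surgeries = sum(s for _, s in wound_classes.values())
overall = rate_per_1000(total_infections, total_surgeries)

for name, value in stratified.items():
    print(f"{name}: {value:.0f} per 1,000 surgeries")
print(f"TOTAL: {overall:.0f} per 1,000 surgeries")
```

The overall rate hides the large differences between strata, which is the reason for stratifying.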
Safety-Related Measures
Risk Stratification and Subgroup Analysis
• When medication errors are sorted by the steps in the medication process (i.e., prescribing, transcribing, dispensing, administration, and monitoring), this is not considered stratification (i.e., developing strata based on the risk for error); rather, it is a subgrouping of different types of errors under the broad heading of medication error.
• Here, each patient may be subject to the same error multiple times during the hospital stay, so it is better to use the total number of administered doses as the denominator.
Safety-Related Measures Risk Stratification and Subgroup Analysis• Even for a single type of error (e.g., administration), it is possible to perform a further subgrouping analysis as shown in the table in the next slide.
Safety-Related Measures
Errors While Administering Oral Medication, by Subgroup of Error
Type of Error             Number of Errors    Number of Doses    Incidence Rate (per 10,000 doses)
Wrong medication given    14                  14,284             9.8
Wrong dose                32                  14,284             22.4
Wrong patient             12                  14,284             8.4
Patient allergic          4                   14,284             2.8
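The subgroup rates use administered doses as the denominator. A brief sketch, with illustrative names, that recomputes each rate per 10,000 doses:

```python
DOSES_ADMINISTERED = 14_284

# Type of administration error -> number of errors, from the table.
errors = {
    "Wrong medication given": 14,
    "Wrong dose": 32,
    "Wrong patient": 12,
    "Patient allergic": 4,
}

# Incidence rate per 10,000 doses, rounded to one decimal place.
rates = {kind: round(count / DOSES_ADMINISTERED * 10_000, 1)
         for kind, count in errors.items()}

for kind, value in rates.items():
    print(f"{kind}: {value} per 10,000 doses")
```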
N.B.
For calculating rates with safety-related measures, we have 3 options for the denominator:
1. The total number of patients/population subject to the event (i.e., total number of discharges for the whole hospital or for one unit inside the hospital).
2. Patient-days (after summing the LOS of all patients subject to the event).
3. Events (after summing the total number of events, like performed surgeries and administered doses). Here, one patient may be subject to the same event more than once.
Criteria for the Proper Measurement Process
These criteria are:
1. Validity
2. Reliability
3. Sensitivity
4. Specificity
Validity
• Accuracy in measurement cannot happen without validity. The measuring instrument, whether a ruler, an IQ test, or a survey instrument, is considered valid if it measures what it is intended to measure and for the intended purpose.
• A ruler or scale is a direct measure. In healthcare, because we cannot measure quality with scales, quality is often assessed through indirect measures.
• For example, the number of inpatient admissions for long-term diabetes in a metropolitan area is actually a measure of the effectiveness of managing these patients in ambulatory settings; the more effective this management, the fewer hospitalizations of chronic DM cases. Here, inpatient admissions act as a proxy measure.
Reliability
• Error is integral to the measurement process, whether it is measurement of weight, height, or blood pressure. Even when measurement is made as accurately as the instrument allows and all procedures are followed, repeated measures do not always give exactly the same results.
• However, an instrument that is reliable will tend to have results that are consistent with each other over repeated trials. A measurement process is said to be reliable if repeated measurements over time of the same property or attribute give the same or approximately the same results.
• An example of an unreliable measuring device is a scale that gives widely different weights each time the same object is measured.
Sensitivity
• A measure is sensitive if it captures the true positive cases, even though it may also include false positives.
• Sensitive tests are used as rapid tests (e.g., for HCV); their positive results are then confirmed by specific tests, which take more time, cost more, and are more invasive.
• Used for screening. Usually performed through urine analysis or pinpricks.
Specificity
• A measure is specific if it identifies only the true positive cases (i.e., it excludes the false positives).
• For example, 1,000 people screened for HCV with an initial sensitive rapid test yielded 600 positives; those 600 are then tested with the ELISA test. 200 of the 600, for example, may turn out to be negative. Thus, the more expensive test is conducted on 600 people instead of 1,000.
• Specific tests are more expensive and more invasive (drawing blood samples).
• Specific tests are usually used for research purposes.