Looking at Data
Clinical Data Example
Descriptive Statistics
Types of Variables: Overview
Categorical: binary (2 categories), nominal (+ more categories), ordinal (+ order matters)
Quantitative: discrete (+ numerical), continuous (+ uninterrupted)
Categorical Variables
Quantitative Variables
Looking at Data
The first rule of statistics:  USE COMMON SENSE! 90% of the information is contained in the graph.
Frequency Plots (univariate)
Bar Chart
Bar Chart: categorical variables [Figure: bar chart with bars for “no” and “yes”]
Much easier to extract information from a bar chart than from a table! [Figure: Bar Chart for SI categories; x-axis: Shock Index Category (1 to 10); y-axis: Number of Patients]
Box plot and histograms: for continuous variables
Box Plot: Shock Index (axis: SI, Shock Index Units). Median (0.66); 25th percentile (0.55); 75th percentile (0.8); interquartile range (IQR) = 0.8 - 0.55 = 0.25; upper “whisker”: Q3 + 1.5 IQR = 0.8 + 1.5(0.25) = 1.175; lower whisker: minimum (or Q1 - 1.5 IQR); maximum (1.7); points beyond the whisker are plotted as outliers.
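The whisker arithmetic on this slide can be re-run in a few lines. A minimal sketch, using only the quartiles and maximum printed on the box plot; the variable names are illustrative and not from the slides.

```python
# Quartiles and maximum read off the Shock Index box plot slide.
q1 = 0.55
q3 = 0.80
maximum = 1.7

iqr = q3 - q1                   # interquartile range
upper_fence = q3 + 1.5 * iqr    # upper whisker limit
lower_fence = q1 - 1.5 * iqr    # lower whisker limit (the whisker stops at the minimum if that is larger)

# Any value past a fence is drawn as an individual outlier point.
is_outlier = maximum > upper_fence

print(f"IQR = {iqr:.3f}")
print(f"upper fence = {upper_fence:.3f}")
print(f"lower fence = {lower_fence:.3f}")
print(f"maximum {maximum} flagged as outlier: {is_outlier}")
```

The maximum of 1.7 sits well above the upper fence, which is why the plot shows separate outlier points above the whisker.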
Histogram of SI (x-axis: SI; y-axis: Percent; bins of size 0.1). Note the “right skew”.
100 bins (too much detail)
2 bins (too little detail)
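To see how the bin count changes what a histogram shows, here is a small sketch. The SI values below are made up for illustration (they are NOT the data behind the slides), and `histogram_counts` is a hypothetical helper, not a library function.

```python
def histogram_counts(values, n_bins, low, high):
    """Count how many values fall in each of n_bins equal-width bins on [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so that v == high lands in the last bin rather than past it.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical, made-up shock index values (NOT the real study data).
si = [0.42, 0.48, 0.53, 0.57, 0.61, 0.64, 0.68, 0.72, 0.77, 0.83, 0.94, 1.12, 1.68]

for n_bins in (2, 20, 100):
    counts = histogram_counts(si, n_bins, low=0.0, high=2.0)
    non_empty = sum(1 for c in counts if c > 0)
    print(f"{n_bins:>3} bins: {non_empty} non-empty, tallest bar = {max(counts)}")
```

With 2 bins nearly everything piles into one bar; with 100 bins most bars hold zero or one observation. Twenty bins on a 0 to 2 axis corresponds to the bin size of 0.1 used on the earlier slide.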
Box Plot: Shock Index (Shock Index Units). Also shows the “right skew”.
Box Plot: Age (Years). More symmetric. [Figure labels: minimum, 25th percentile, median, 75th percentile, maximum, interquartile range]
Histogram: Age (x-axis: AGE (Years); y-axis: Percent). Not skewed, but not bell-shaped either…
Some histograms from last year’s class (n=18) Starting with politics…
 
 
Feelings about math and writing…
Optimism…
Measures of central tendency
Central Tendency
In math shorthand:
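The formula that belongs here is not present in the text. Since the next slide is “Mean: example”, it was presumably the sample mean; a reconstruction under that assumption, with X̄ for the mean and n for the sample size:

```latex
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i
```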
Mean: example
Mean of age in Kline’s data [Figure: histogram of age; x-axis: AGE (Years); y-axis: Percent; the mean is the balancing point]
Mean of Pulmonary Embolism? (Binary variable?) 19.44% (181) 80.56% (750)
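One way to answer the question: code the binary variable as 0/1 and average it. A minimal sketch, assuming the 181 patients (19.44%) are the ones with PE and the 750 (80.56%) are the ones without; the slide text does not say which bar is which.

```python
# Counts from the slide; treating 181 as "PE present" is an assumption.
n_pe_yes = 181
n_pe_no = 750

# Code PE as 1 (present) or 0 (absent) for every patient.
pe = [1] * n_pe_yes + [0] * n_pe_no

mean_pe = sum(pe) / len(pe)
print(f"n = {len(pe)}, mean = {mean_pe:.4f}")
```

Under this coding, the mean of a binary variable is simply the proportion of patients coded 1.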
Mean [Figure: two dot plots on a 0 to 10 scale; Mean = 3 for the first, Mean = 4 for the second]
Central Tendency
Median: example
Median = (22+23)/2 = 22.5
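The data for this example are not listed on the slide itself. The sketch below assumes they are the same eight ages used in the sample standard deviation example later in the deck, which do give a median of 22.5.

```python
import statistics

# Assumed data: the eight ages from the sample standard deviation example.
ages = [17, 19, 21, 22, 23, 23, 23, 38]

ordered = sorted(ages)
n = len(ordered)
if n % 2 == 0:
    # Even n: average the two middle values.
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
else:
    # Odd n: take the single middle value.
    median = ordered[n // 2]

print(median)
print(statistics.median(ages))  # standard-library cross-check
```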
Median of age in Kline’s data [Figure: histogram of age; x-axis: AGE (Years); y-axis: Percent; 50% of mass on each side of the median]
Does PE have a median?
Median [Figure: two dot plots on a 0 to 10 scale; Median = 3 for both]
Central Tendency
Mode: example
Mode = 23 (occurs 3 times)
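A short sketch of finding the mode by counting, again assuming the eight ages from the sample standard deviation example are the data behind this slide.

```python
from collections import Counter

# Assumed data: the eight ages from the sample standard deviation example.
ages = [17, 19, 21, 22, 23, 23, 23, 38]

# Tally each value, then take the most frequent one.
counts = Counter(ages)
mode, frequency = counts.most_common(1)[0]

print(f"Mode = {mode} (occurs {frequency} times)")
```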
Mode of age in Kline’s data
Mode of PE?
Measures of Variation/Dispersion
Range
Range of age: 94 years - 15 years = 79 years [Figure: histogram of age; x-axis: AGE (Years); y-axis: Percent]
Range of PE?
Quartiles [Figure: Q1, Q2, and Q3 split the ordered data into four segments of 25% each]
Interquartile Range
Interquartile Range: age. Minimum = 15, Q1 = 35, median (Q2) = 49, Q3 = 65, maximum = 94; each of the four segments holds 25% of the data. Interquartile range = 65 – 35 = 30
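The five numbers on this slide are enough to repeat the box plot outlier check from the Shock Index slide. A minimal sketch; applying the 1.5 × IQR rule to age is an illustration, not something the slide states.

```python
# Five-number summary of age, as printed on the slide.
minimum, q1, median, q3, maximum = 15, 35, 49, 65, 94

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Outliers exist only if the extremes fall outside the fences.
has_outliers = minimum < lower_fence or maximum > upper_fence

print(f"IQR = {iqr}")
print(f"fences = ({lower_fence}, {upper_fence})")
print(f"any outliers: {has_outliers}")
```

Both extremes fall inside the fences, so under this rule the age box plot would show no separate outlier points.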
Variance
Why squared deviations?
Standard Deviation
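The formulas on these slides are missing from the text. Because the worked example that follows is titled “Sample Standard Deviation”, the sketch below assumes the sample versions (dividing by n - 1), with S for the standard deviation as on the comparison slide:

```latex
S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1},
\qquad
S = \sqrt{S^2}
```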
Calculation Example: Sample Standard Deviation. Age data (n = 8): 17 19 21 22 23 23 23 38. Mean = X̄ = 23.25
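The rest of the calculation is not preserved in the text, so here is a sketch of the remaining steps for these eight ages, assuming the usual sample formula with n - 1 in the denominator.

```python
import math
import statistics

ages = [17, 19, 21, 22, 23, 23, 23, 38]
n = len(ages)

mean = sum(ages) / n

# Squared distance of every observation from the mean.
squared_deviations = [(x - mean) ** 2 for x in ages]

# Sample variance divides by n - 1, not n.
variance = sum(squared_deviations) / (n - 1)
sd = math.sqrt(variance)

print(f"mean = {mean}")
print(f"sum of squared deviations = {sum(squared_deviations)}")
print(f"sample variance = {variance:.2f}")
print(f"sample SD = {sd:.2f}")
print(f"statistics.stdev cross-check = {statistics.stdev(ages):.2f}")
```

The single value of 38 contributes most of the sum of squared deviations, which shows how sensitive the standard deviation is to one extreme observation.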
Std. dev is a measure of the “average” scatter around the mean. [Figure: histogram of age; x-axis: AGE (Years); y-axis: Percent] Estimation method: if the distribution is bell shaped, the range is around 6 SD, so here a rough guess for SD is 79/6 ≈ 13.
Std. Deviation age
Std Dev of Shock Index. Std. dev is a measure of the “average” scatter around the mean. [Figure: histogram of SI; x-axis: SI; y-axis: Count] Estimation method: if the distribution is bell shaped, the range is around 6 SD, so here a rough guess for SD is 1.4/6 ≈ 0.23.
Std. Deviation SI
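The range/6 estimate used for age and SI fits in one line of code. A minimal sketch; `rough_sd` is a hypothetical helper name, and the shortcut is only reasonable when the distribution is roughly bell shaped.

```python
def rough_sd(value_range):
    """Rough SD guess for a bell-shaped distribution: range is about 6 SD."""
    return value_range / 6


age_guess = rough_sd(94 - 15)   # age range from the slides: 79 years
si_guess = rough_sd(1.4)        # SI range used on the slide

print(f"age: about {age_guess:.1f} years")
print(f"SI:  about {si_guess:.2f}")
```

The factor of 6 comes from the normal curve: almost all values lie within 3 standard deviations on either side of the mean.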
Std. Dev of binary variable, PE. Std. dev is a measure of the “average” scatter around the mean. [Figure: bar chart of PE; 19.44% and 80.56%]
Std. Deviation PE
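The computed value for PE is missing from the text. A sketch of how it could be obtained, assuming PE is coded 1 for the 181 patients with PE and 0 for the other 750 (the same assumption as for the mean):

```python
import math
import statistics

# Assumed coding: 1 = PE present (181 patients), 0 = PE absent (750 patients).
pe = [1] * 181 + [0] * 750

p = sum(pe) / len(pe)   # proportion with PE, which is also the mean

# For a 0/1 variable the population SD reduces to sqrt(p * (1 - p)).
sd_formula = math.sqrt(p * (1 - p))
sd_population = statistics.pstdev(pe)   # divides by n
sd_sample = statistics.stdev(pe)        # divides by n - 1

print(f"p = {p:.4f}")
print(f"sqrt(p(1-p)) = {sd_formula:.4f}")
print(f"population SD = {sd_population:.4f}")
print(f"sample SD = {sd_sample:.4f}")
```

With 931 patients the n versus n - 1 choice barely changes the result, so either version gives an SD of about 0.40.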
Comparing Standard Deviations [Figure: three dot plots on an 11 to 21 scale]
Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.926
Data C: Mean = 15.5, S = 4.570
Bienaymé-Chebyshev Rule
At least (1 - 1/1²) = 0% of the values fall within k=1 (μ ± 1σ)
At least (1 - 1/2²) = 75% of the values fall within k=2 (μ ± 2σ)
At least (1 - 1/3²) = 89% of the values fall within k=3 (μ ± 3σ)
Note use of σ (sigma) to represent “standard deviation.” Note use of μ (mu) to represent “mean”.
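The three percentages above all come from the same expression, 1 - 1/k². A minimal sketch that reproduces them; the function name is illustrative.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    # For k < 1 the expression goes negative, so the bound is just 0.
    return max(0.0, 1 - 1 / k ** 2)


for k in (1, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.0%}")
```

The bound holds for any distribution with a finite mean and standard deviation, which is why it is so much weaker than the figures for the normal curve.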
Symbol Clarification
**The beauty of the normal curve: No matter what μ and σ are, the area between μ-σ and μ+σ is about 68%; the area between μ-2σ and μ+2σ is about 95%; and the area between μ-3σ and μ+3σ is about 99.7%. Almost all values fall within 3 standard deviations.
68-95-99.7 Rule [Figure: normal curve; 68% of the data within 1 standard deviation of the mean, 95% within 2, 99.7% within 3]
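The three percentages can be checked numerically. A minimal sketch using the standard identity that, for a normal distribution, the area within k standard deviations of the mean is erf(k/√2); the function name is illustrative.

```python
import math


def normal_area_within(k):
    """Area under any normal curve between mu - k*sigma and mu + k*sigma."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {normal_area_within(k):.2%}")
```

The result depends only on k, not on μ or σ, which is the point of the rule.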
Summary of Symbols
Examples of bad graphics
What’s wrong with this graph? From: E. R. Tufte, The Visual Display of Quantitative Information. Graphics Press, Cheshire, Connecticut, 1983, p. 69
From: H. Wainer, Visual Revelations: Graphical Tales of Fate and Deception from Napoleon Bonaparte to Ross Perot, 1997, p. 29. Notice the X-axis
Correctly scaled X-axis…
Report of the Presidential Commission on the Space Shuttle Challenger Accident, 1986 (vol. 1, p. 145). The graph excludes the observations where no O-rings failed.
Smooth curve at least shows the trend toward failure at high and low temperatures…
Even better: graph all the data (including non-failures) using a logistic regression model. Tappin, L. (1994). "Analyzing data relating to the Challenger disaster". Mathematics Teacher, 87, 423-426
What’s wrong with this graph? From: E. R. Tufte, The Visual Display of Quantitative Information. Graphics Press, Cheshire, Connecticut, 1983, p. 74
 
What’s the message here? Diagraphics II, 1994
Diagraphics II, 1994
For more examples…
“Lying” with statistics
Example 1: projected statistics
Example 2: propagation of statistics
For example…
And…
Where did the statistics come from?
The 15%: Dummer GM, Rosen LW, Heusner WW, Roberts PJ, Counsilman JE. Pathogenic weight-control behaviors of young competitive swimmers. Physician Sportsmed. 1987; 15: 75-84.
The “to”: Rosen LW, McKeag DB, Hough DO, Curley VC. Pathogenic weight-control behaviors in female athletes. Physician Sportsmed. 1986; 14: 79-86.
The 62%: Rosen LW, Hough DO. Pathogenic weight-control behaviors of female college gymnasts. Physician Sportsmed. 1988; 16: 140-146.



Editor's Notes

  1. That's really what distinguishes these from discrete numerical
  2. What are some others?
  3. Does everybody know what I mean when I say percentiles? What is the median? Anyone?
  4. 1. Bin sizes may be altered. 2. How many people do you think are in bin 125-135? 3. Where do you think the center of the data are (what's your best guess at the average weight)? 4. On average, how far do you think a given woman is from 127 -- the center/mean?
  5. Balance the Bell Curve on a point. Where is the point of balance, average mass on each side.
  6. 1. Bin sizes may be altered. 2. How many people do you think are in bin 125-135? 3. Where do you think the center of the data are (what's your best guess at the average weight)? 4. On average, how far do you think a given woman is from 127 -- the center/mean?
  7. SAY: within 1 standard deviation either way of the mean; within 2 standard deviations either way of the mean; within 3 standard deviations either way of the mean. WORKS FOR ALL NORMAL CURVES NO MATTER HOW SKINNY OR FAT