Henry R. Kang (1/2010)
General Chemistry
Lecture 5
Statistical Data Analysis
Henry R. Kang (7/2008)
Outline
• Fundamental Statistics
• Accuracy and Precision
• Data Rejection
Accuracy & Precision
• Accuracy
  - Accuracy is a measure of the closeness of a measured quantity to the true value.
• Precision
  - Precision describes how closely two or more measurements of the same quantity agree with one another.
  - Precision is a measure of the agreement of replicate measurements.
Fundamental Statistics
Errors
• All measurements contain errors.
• Types of errors
  - Systematic errors
    - One-sided errors (either positive or negative)
    - Usually from a single source
    - Resulting data are consistently high or low
    - Results may be precise but inaccurate
    - Examples: the balance is incorrectly zeroed; an incorrect constant is used in the calculations.
  - Random errors
    - Occur randomly
    - Positive and negative deviations occur with equal frequency and size, giving a bell-shaped curve (Gaussian or normal distribution).
    - The source of the error is usually not known.
Gaussian Distribution
• The Gaussian distribution gives the distribution of data points with respect to the true value. It is a bell-shaped curve, as shown in the figure.
  - The closer a value is to the true value, the higher its probability.
[Figure: bell-shaped Gaussian curve. x-axis: Standard Deviation (−3 to 3); y-axis: Probability (0 to 0.4).]
Measuring Accuracy
• Percent error (usable only if the true value is known)
  $$\%\ \text{error} = \frac{|\text{true value} - \text{experimental value}|}{|\text{true value}|} \times 100$$
• Parts per thousand (ppt)
  $$\text{ppt} = \frac{|\text{true value} - \text{experimental value}|}{|\text{true value}|} \times 1000$$
• Parts per million (ppm)
  $$\text{ppm} = \frac{|\text{true value} - \text{experimental value}|}{|\text{true value}|} \times 10^{6}$$
• Unfortunately, the true value is often not known.
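The three accuracy measures differ only in their scale factor, so they can be sketched as one helper plus three thin wrappers. This is a minimal illustration, not part of the original slides; the function names and the sample numbers (a true value of 50.00 and a measurement of 49.90) are assumptions chosen for the example.

```python
def relative_error(true_value: float, experimental_value: float) -> float:
    """Return |true - experimental| / |true| as a plain fraction."""
    if true_value == 0:
        raise ValueError("relative error is undefined when the true value is zero")
    return abs(true_value - experimental_value) / abs(true_value)


def percent_error(true_value: float, experimental_value: float) -> float:
    """Relative error scaled to percent (x 100)."""
    return relative_error(true_value, experimental_value) * 100


def error_ppt(true_value: float, experimental_value: float) -> float:
    """Relative error scaled to parts per thousand (x 1000)."""
    return relative_error(true_value, experimental_value) * 1000


def error_ppm(true_value: float, experimental_value: float) -> float:
    """Relative error scaled to parts per million (x 10^6)."""
    return relative_error(true_value, experimental_value) * 1e6


if __name__ == "__main__":
    # Hypothetical numbers, used only to show the three scales side by side.
    true_value, measured = 50.00, 49.90
    print(f"percent error: {percent_error(true_value, measured):.3f} %")
    print(f"ppt:           {error_ppt(true_value, measured):.3f}")
    print(f"ppm:           {error_ppm(true_value, measured):.1f}")
```

All three functions need the true value, which is why they are of limited use when it is unknown.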
Measuring Precision
• Mean (or Average)
• Deviation and Absolute Deviation
• Absolute Average Deviation
• Relative Deviation
• Relative Average Deviation (RAD)
• Standard Deviation
• Relative Standard Deviation
Mean (Average)
• For multiple measurements of a given quantity, we have numerical values $x_1, x_2, x_3, \ldots, x_n$, where $n$ is the number of measurements.
• The sum is defined as
  $$\text{Sum} = x_1 + x_2 + x_3 + \cdots + x_n = \sum_{i=1}^{n} x_i$$
• The mean $x_{\text{avg}}$ is defined as
  $$x_{\text{avg}} = \frac{\text{Sum}}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Deviation & Absolute Deviations
• Deviation is the difference (or variation) of a single measurement, $x_i$, from the mean value, $x_{\text{avg}}$:
  $$d_i = x_i - x_{\text{avg}}, \quad i = 1, 2, 3, \ldots, n$$
• Absolute deviation is the magnitude of that difference, so it is never negative. The slides reuse the symbol $d_i$ for it:
  $$d_i = |x_i - x_{\text{avg}}|, \quad i = 1, 2, 3, \ldots, n$$
Absolute Average Deviation
• Absolute average deviation, $d_{\text{avg}}$, is the arithmetic mean of the individual absolute deviations, $d_i = |x_i - x_{\text{avg}}|$ for $i = 1, 2, 3, \ldots, n$:
  $$d_{\text{avg}} = \frac{\sum_{i=1}^{n} d_i}{n}$$
Relative Deviation
• Relative deviation, $D_i$, is the ratio of an individual absolute deviation, $d_i$, to the mean value, $x_{\text{avg}}$:
  $$D_i = \frac{d_i}{x_{\text{avg}}} = \frac{|x_i - x_{\text{avg}}|}{x_{\text{avg}}}, \quad i = 1, 2, 3, \ldots, n$$
Relative Average Deviation
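• Relative average deviation (RAD) is the absolute average deviation relative to the mean, $x_{\text{avg}}$, expressed here in parts per thousand (ppt):
  $$\text{RAD (ppt)} = \frac{d_{\text{avg}}}{x_{\text{avg}}} \times 1000$$
• A precision of 3 ppt or less is considered very good.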
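The chain from the mean to the RAD can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented for the example, and the four replicate volumes are hypothetical data, not values from the slides.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    if not values:
        raise ValueError("at least one measurement is required")
    return sum(values) / len(values)


def absolute_deviations(values):
    """Absolute deviation |x_i - x_avg| of every measurement."""
    x_avg = mean(values)
    return [abs(x - x_avg) for x in values]


def absolute_average_deviation(values):
    """Arithmetic mean of the individual absolute deviations."""
    return mean(absolute_deviations(values))


def relative_average_deviation_ppt(values):
    """RAD in parts per thousand: (d_avg / x_avg) x 1000."""
    return absolute_average_deviation(values) / mean(values) * 1000


if __name__ == "__main__":
    # Hypothetical replicate volumes in mL, for illustration only.
    volumes = [25.02, 25.06, 24.98, 25.04]
    print(f"mean  = {mean(volumes):.3f}")
    print(f"d_avg = {absolute_average_deviation(volumes):.3f}")
    print(f"RAD   = {relative_average_deviation_ppt(volumes):.2f} ppt")
```

For this made-up data set the RAD comes out at about 1 ppt, which would fall inside the "3 ppt or less" guideline.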
Standard Deviation
• Standard deviation (σ) is useful in estimating how data points are distributed when they follow a Gaussian distribution (a bell-shaped curve).
  - ($x_{\text{avg}}$ ± σ) incorporates 68.3% of the data points.
  - ($x_{\text{avg}}$ ± 3σ) incorporates 99.7% of the data points.
  - The smaller the σ, the smaller the spread of the data points.
• With the deviations $d_i = x_i - x_{\text{avg}}$ for $i = 1, 2, 3, \ldots, n$, the standard deviation is
  $$\sigma = \sqrt{\frac{\sum_{i=1}^{n} d_i^{2}}{n - 1}} = \sqrt{\frac{d_1^{2} + d_2^{2} + d_3^{2} + \cdots + d_n^{2}}{n - 1}}$$
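A direct transcription of the n − 1 formula can be checked against Python's standard library, whose `statistics.stdev` also divides by n − 1. The function name and the eight sample values below are assumptions made for this sketch.

```python
import math
import statistics


def sample_standard_deviation(values):
    """Square root of the sum of squared deviations divided by (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two measurements are required")
    x_avg = sum(values) / n
    sum_squared_deviations = sum((x - x_avg) ** 2 for x in values)
    return math.sqrt(sum_squared_deviations / (n - 1))


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    data = [2, 4, 4, 4, 5, 5, 7, 9]
    print(f"hand-written formula: {sample_standard_deviation(data):.5f}")
    print(f"statistics.stdev:     {statistics.stdev(data):.5f}")
```

The two printed values should agree, which is a quick way to confirm that the formula was typed in correctly.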
Relative Standard Deviation
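• Relative standard deviation ($\sigma_r$) is the standard deviation relative to the mean value.
• With the deviations $d_i = x_i - x_{\text{avg}}$ for $i = 1, 2, 3, \ldots, n$, the relative deviations $D_i = d_i / x_{\text{avg}}$, and $n$ the number of measurements,
  $$\sigma_r = \sqrt{\frac{\sum_{i=1}^{n} (d_i / x_{\text{avg}})^{2}}{n - 1}} = \sqrt{\frac{D_1^{2} + D_2^{2} + D_3^{2} + \cdots + D_n^{2}}{n - 1}}$$
  or, in parts per thousand,
  $$\sigma_r\ \text{(ppt)} = \frac{\sigma}{x_{\text{avg}}} \times 1000$$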
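The two expressions for the relative standard deviation are the same quantity. As a short check, assuming a positive mean so that it can be pulled out of the square root unchanged:

```latex
\sigma_r
  = \sqrt{\frac{\sum_{i=1}^{n} \left( d_i / x_{\text{avg}} \right)^{2}}{n - 1}}
  = \sqrt{\frac{1}{x_{\text{avg}}^{2}} \cdot \frac{\sum_{i=1}^{n} d_i^{2}}{n - 1}}
  = \frac{1}{x_{\text{avg}}} \sqrt{\frac{\sum_{i=1}^{n} d_i^{2}}{n - 1}}
  = \frac{\sigma}{x_{\text{avg}}},
  \qquad x_{\text{avg}} > 0 .
```

Multiplying the final ratio by 1000 gives the value in ppt.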
Gaussian Distribution
• The Gaussian distribution gives the distribution of data points with respect to the true value. It is a bell-shaped curve, as shown in the figure.
[Figure: bell-shaped Gaussian curve. x-axis: Standard Deviation (−3 to 3); y-axis: Probability (0 to 0.4).]
• The Gaussian equation is
  $$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - X)^{2}}{2\sigma^{2}}\right]$$
  where σ is the standard deviation and $X$ is the true value ($x_{\text{true}}$).
  - The closer a value is to the true value, the higher its probability.
  - The area under the curve (the integral of the Gaussian function) gives the fraction of data points inside an interval:
    - ($x_{\text{true}}$ ± σ) incorporates 68.3% of the data points.
    - ($x_{\text{true}}$ ± 3σ) incorporates 99.7% of the data points.
    - ($x_{\text{true}}$ ± 3.8901σ) incorporates 99.99% of the data points.
    - ($x_{\text{true}}$ ± 4.4172σ) incorporates 99.999% of the data points.
    - ($x_{\text{true}}$ ± 6σ) incorporates nearly 100% of the data points.
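The coverage percentages can be reproduced numerically. For an ideal Gaussian, the fraction of points within ±kσ of the true value equals erf(k/√2), and `math.erf` is in Python's standard library. The function names below are assumptions made for this sketch.

```python
import math


def gaussian_pdf(x: float, true_value: float, sigma: float) -> float:
    """Gaussian probability density centered on the true value."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    exponent = -((x - true_value) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (math.sqrt(2 * math.pi) * sigma)


def coverage_within(k: float) -> float:
    """Fraction of an ideal Gaussian lying within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))


if __name__ == "__main__":
    for k in (1, 3, 3.8901, 4.4172, 6):
        print(f"+/- {k} sigma covers {coverage_within(k) * 100:.5f} % of the points")
```

The result depends only on k, not on σ itself, because the interval is measured in units of σ.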
Standard Deviation & Data Distribution
• The smaller the σ, the smaller the spread of the data points.
[Figure: Gaussian curves for σ = 0.5, σ = 1.0, and σ = 2.0. x-axis: Standard Deviation (−4 to 4); y-axis: Probability (0 to 0.8).]
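One way to read the three curves is through their peak heights. Evaluating the Gaussian equation at the true value, $x = X$, makes the exponential term equal to 1, so the peak height is inversely proportional to σ. The numbers below are computed from that formula for the three σ values named in the figure; they are not read from the original plot.

```latex
P(X) = \frac{1}{\sqrt{2\pi}\,\sigma}
\quad\Longrightarrow\quad
P(X)\big|_{\sigma = 0.5} \approx 0.798, \qquad
P(X)\big|_{\sigma = 1.0} \approx 0.399, \qquad
P(X)\big|_{\sigma = 2.0} \approx 0.199 .
```

Since the total area under each curve is 1, a curve with half the σ is twice as tall and half as wide.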
Approximation of Standard Deviation
• Computing the standard deviation is relatively costly, so a cheaper approximation based on the range of the data is often used instead.
• š = Ř / √N
  - Ř is the range of the data points from the lowest value to the highest value: Ř = $x_{\max} - x_{\min}$
  - N is the number of data points.
• For a small number of measurements, the approximation is accurate enough to replace the formal standard deviation.
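To get a feel for how close the range-based estimate is, it can be compared with the formal n − 1 standard deviation. The sketch below uses the three %S values from Example 1 of this deck; the function name is an assumption, and `statistics.stdev` from the standard library supplies the formal value.

```python
import math
import statistics


def range_estimate_of_std(values):
    """Approximate standard deviation: (x_max - x_min) / sqrt(N)."""
    if len(values) < 2:
        raise ValueError("at least two measurements are required")
    return (max(values) - min(values)) / math.sqrt(len(values))


if __name__ == "__main__":
    data = [28.72, 28.40, 28.57]  # %S values from Example 1
    print(f"range estimate:     {range_estimate_of_std(data):.3f}")
    print(f"formal (n - 1) std: {statistics.stdev(data):.3f}")
```

For these three points the two values differ in the second decimal place. That is close enough to judge precision, but not a substitute where an exact standard deviation is required.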
Accuracy and Precision
Accuracy & Precision of Measurements
• Accuracy is a measure of the closeness of a measured quantity to the true value.
• Precision is a measure of the agreement of replicate measurements.
• Measurements can be precise but not accurate, accurate but not precise, or neither. The best result is, of course, both accurate and precise.
[Figure: four panels labeled "Accurate & precise", "Precise but not accurate", "Not accurate & not precise", and "Accurate but not precise".]
Example 1 of Accuracy and Precision
• Measured %S values in H₂SO₄ are 28.72%, 28.40%, and 28.57%, where the true value is 32.69%. Determine the accuracy and precision.
• Answer:
  - Mean: $x_M$ = (28.72% + 28.40% + 28.57%) / 3 = 28.56%
  - Estimated precision by using the approximation š = Ř / √N:
    š = (28.72 − 28.40)% / √3 = 0.32% / 1.732 = 0.18%
  - Relative standard deviation: $s_r$ = š / $x_M$ = 0.18% / 28.56% = 0.0063
  - Accuracy = |X − $x_M$| = |32.69% − 28.56%| = 4.13%
  - Relative accuracy = Accuracy / True value = 4.13% / 32.69% = 0.126
• These results indicate that the data are precise but inaccurate.
Example 2 of Accuracy and Precision
• Measured %S values in H₂SO₄ are 28.89%, 32.56%, and 36.64%, where the true value is 32.69%. Determine the accuracy and precision.
• Answer:
  - Mean: $x_M$ = (28.89% + 32.56% + 36.64%) / 3 = 32.70%
  - Estimated precision by using the approximation š = Ř / √N:
    š = (36.64 − 28.89)% / √3 = 7.75% / 1.732 = 4.47%
  - Relative standard deviation: $s_r$ = š / $x_M$ = 4.47% / 32.70% = 0.137
  - Accuracy = |X − $x_M$| = |32.69% − 32.70%| = 0.01%
  - Relative accuracy = Accuracy / True value = 0.01% / 32.69% = 0.0003
• These results indicate that the data are imprecise but accurate.
Example 3 of Accuracy and Precision
• Measured %S values in H₂SO₄ are 25.62%, 33.56%, and 27.93%, where the true value is 32.69%. Determine the accuracy and precision.
• Answer:
  - Mean: $x_M$ = (25.62% + 33.56% + 27.93%) / 3 = 29.04%
  - Estimated precision by using the approximation š = Ř / √N:
    š = (33.56 − 25.62)% / √3 = 7.94% / 1.732 = 4.58%
  - Relative standard deviation: $s_r$ = š / $x_M$ = 4.58% / 29.04% = 0.158
  - Accuracy = |X − $x_M$| = |32.69% − 29.04%| = 3.65%
  - Relative accuracy = Accuracy / True value = 3.65% / 32.69% = 0.112
• These results indicate that the data are imprecise and inaccurate.
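The same five quantities are computed in all three examples, so the arithmetic can be re-run with one small helper. This is a sketch rather than the author's procedure: the function and dictionary key names are assumptions, and the code keeps full floating-point precision, so the last digit can differ from values worked by hand with rounded intermediates.

```python
import math

TRUE_PERCENT_S = 32.69  # true %S in H2SO4, as given in the examples


def summarize(values, true_value):
    """Mean, range-based precision estimate, and accuracy for one data set."""
    n = len(values)
    x_mean = sum(values) / n
    s_est = (max(values) - min(values)) / math.sqrt(n)  # range / sqrt(N)
    accuracy = abs(true_value - x_mean)
    return {
        "mean": x_mean,
        "s_est": s_est,
        "relative_s": s_est / x_mean,
        "accuracy": accuracy,
        "relative_accuracy": accuracy / true_value,
    }


if __name__ == "__main__":
    examples = {
        "Example 1": [28.72, 28.40, 28.57],
        "Example 2": [28.89, 32.56, 36.64],
        "Example 3": [25.62, 33.56, 27.93],
    }
    for name, values in examples.items():
        r = summarize(values, TRUE_PERCENT_S)
        print(
            f"{name}: mean={r['mean']:.2f}%  s_est={r['s_est']:.2f}%  "
            f"s_r={r['relative_s']:.4f}  accuracy={r['accuracy']:.2f}%  "
            f"relative accuracy={r['relative_accuracy']:.4f}"
        )
```

A small relative standard deviation with a large relative accuracy value points to precise but inaccurate data; the reverse pattern points to accurate but imprecise data.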
Data Rejection
• Replicate measurements of a given quantity are usually scattered.
  - Some values are closer to one another than others.
• Which values should be kept (or discarded)?
  - If a single result differs greatly from the others and the difference is caused by an identifiable error of the experimenter, the result should be discarded.
  - If a result is significantly "off" but no error was made in the experiment, the result should, in general, be kept.
• If in doubt, use the rejection coefficient (Q) test.
• Do not discard any result just to get "good precision".
Q Test
• The Q test is used to test the extreme values (the highest and lowest values).
• Procedure
  - Calculate the range:
    Range = $x_{\max} - x_{\min}$
  - Calculate the difference between each extreme value and its nearest neighbor:
    $d_{\text{hi}} = x_{\max} - x_{\text{nbor,hi}}$; $d_{\text{lo}} = |x_{\min} - x_{\text{nbor,lo}}|$
  - Calculate the ratio (Q value) of the difference to the range:
    $Q_{\text{hi}} = d_{\text{hi}}$ / Range; $Q_{\text{lo}} = d_{\text{lo}}$ / Range
• Compare the resulting Q value with the rejection table at the 90% confidence level (or another selected confidence level).
  - If the calculated Q value is greater than the Q value given in the table, reject the value.
Rejection Q Tables
Number of data    Q90     Q96     Q99
3                 0.94    0.98    0.99
4                 0.76    0.85    0.93
5                 0.64    0.73    0.82
6                 0.56    0.64    0.74
7                 0.51    0.59    0.68
8                 0.47    0.54    0.63
9                 0.44    0.51    0.60
10                0.41    0.48    0.57
Q Test - Example
• Data: 35.00, 35.05, 35.10, 35.80
• Calculate the range.
  - Range = $x_{\max} - x_{\min}$ = 35.80 − 35.00 = 0.80
• Calculate the difference between each extreme value and its nearest neighbor.
  - $d_{\text{hi}} = x_{\max} - x_{\text{nbor,hi}}$ = 35.80 − 35.10 = 0.70
  - $d_{\text{lo}} = |x_{\min} - x_{\text{nbor,lo}}|$ = |35.00 − 35.05| = 0.05
• Calculate the Q values as the ratio of each difference to the range.
  - $Q_{\text{hi}} = d_{\text{hi}}$ / Range = 0.70 / 0.80 = 0.88
  - $Q_{\text{lo}} = d_{\text{lo}}$ / Range = 0.05 / 0.80 = 0.063
• Compare the resulting Q values with the rejection table at the 90% confidence level.
  - For 4 data points, the Q value in the table is 0.76.
  - $Q_{\text{hi}}$ > 0.76; therefore, the highest value, 35.80, can be dropped.
  - $Q_{\text{lo}}$ < 0.76; therefore, the lowest value, 35.00, is kept.
  - Once a value is dropped, it is no longer in the data set and should not be used in calculating the mean and the various deviations.
Number of data    Q90
3                 0.94
4                 0.76
5                 0.64
6                 0.56
7                 0.51
8                 0.47
9                 0.44
10                0.41
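The Q test procedure can be sketched as a short function that tests both extremes against the Q90 column of the rejection table. The function name, the returned dictionary keys, and the decision to raise an error for data-set sizes outside the table are assumptions made for this illustration.

```python
# Q90 critical values copied from the rejection table (3 to 10 data points).
Q90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56, 7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}


def q_test(values, q_table=Q90):
    """Compute Q for the highest and lowest values and compare with the table."""
    n = len(values)
    if n not in q_table:
        raise ValueError(f"no critical Q value tabulated for {n} data points")
    ordered = sorted(values)
    data_range = ordered[-1] - ordered[0]
    if data_range == 0:
        raise ValueError("all values are identical; Q is undefined")
    q_hi = (ordered[-1] - ordered[-2]) / data_range  # gap to nearest neighbor / range
    q_lo = (ordered[1] - ordered[0]) / data_range
    q_crit = q_table[n]
    return {
        "q_hi": q_hi,
        "q_lo": q_lo,
        "q_crit": q_crit,
        "reject_high": q_hi > q_crit,
        "reject_low": q_lo > q_crit,
    }


if __name__ == "__main__":
    result = q_test([35.00, 35.05, 35.10, 35.80])  # data from the example
    print(f"Q_hi = {result['q_hi']:.3f}, Q_lo = {result['q_lo']:.3f}, "
          f"Q_crit = {result['q_crit']}")
    print("reject highest value:", result["reject_high"])
    print("reject lowest value: ", result["reject_low"])
```

If a value is rejected, the mean and deviations should be recomputed from the remaining data only.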


  • 1. Henry R. Kang (1/2010) General Chemistry Lecture 5 Statistical Data Analysis
  • 2. Henry R. Kang (7/2008) Outlines • Fundamental Statistics • Accuracy and Precision • Data Rejection
  • 3. Henry R. Kang (1/2010) Accuracy & Precision • Accuracy  Accuracy is a measure of the closeness of a measured quantity to the true value. • Precision  How close two or more measurements of the quantity agree with one another.  Precision is a measure of the agreement of replicate measurements.
  • 4. Henry R. Kang (7/2008) Fundamental Statistics
  • 5. Henry R. Kang (7/2008) Errors • All Measurements Contain Errors. • Types of Errors  Systematic errors  One-sided errors (either positive or negative) • Usually from a single source • Resulting data are consistently high or low  Results may be precise but inaccurate • Examples: Balance is incorrectly zeroed. Use incorrect constant for calculations.  Random errors  Randomly occurred  Positive and negative deviations occur with equal frequency and size. • A bell shape curve (Gaussian or normal distribution)  The source of the error is usually not known
  • 6. Henry R. Kang (7/2008) Gaussian Distribution • Gaussian distribution gives the distribution of data points with respect to the true value. It gives a bell-shaped curve as shown in the figure.  The closer to the true value, the higher the probability. 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 -3 -2 -1 0 1 2 3 Standard Deviation Probability
  • 7. Henry R. Kang (7/2008) Measuring Accuracy • Percent Error  If the true value is known • Part Per Thousand (PPT) • Part Per Million (PPM) • Unfortunately, the true value is often not known. % error = | true value – experimental value | | True value | × 100 PPT = | true value – experimental value | | true value – experimental value | | True value | | True value | × 1000 × 106 PPM =
  • 8. Henry R. Kang (7/2008) Measuring Precision • Mean (or Average) • Deviation and Absolute Deviation • Absolute Average Deviation • Relative Deviation • Relative Average Deviation (RAD) • Standard Deviation • Relative Standard Deviation
  • 9. Henry R. Kang (7/2008) Mean (Average) • For multiple measurements of a given quantity, we have numerical values x1, x2, x3, - - - -, xn, where n is the number of measurements. • Sum is defined as Sum = x1 + x2 + x3 + - - - + xn = ∑ xi • Mean xavg is defined as ∑ xiSum n n=xavg =
  • 10. Henry R. Kang (7/2008) Deviation & Absolute Deviations • Deviation is the difference (or variation) of a single measurement, xi, away from the mean value, xavg.  d1 = x1 – xavg  d2 = x2 – xavg  d3 = x3 – xavg  -- - -- -- - -- --  -- - -- -- - -- --  dn = xn – xavg • Absolute deviation is always positive.  d1 = | x1 – xavg|  d2 = | x2 – xavg|  d3 = |x3 – xavg|  -- - -- -- - -- --  -- - -- -- - -- --
  • 11. Henry R. Kang (7/2008) Absolute Average Deviation • Absolute average deviation, davg, is the arithmetic mean of individual absolute deviations, di. d1 = | x1 – xavg| d2 = | x2 – xavg| d3 = | x3 – xavg| --------- --- --------- --- dn = | xn – xavg| ∑ di n=davg
  • 12. Henry R. Kang (7/2008) Relative Deviation • Relative deviation, Di, is the ratio of individual absolute deviations, di, to the mean value, xavg. D1 = d1 / xavg = | x1 – xavg| / xavg D2 = d2 / xavg = | x2 – xavg| / xavg D3 = d3 / xavg = | x3 – xavg| / xavg ------------ Di = di / xavg = | xi – xavg| / xavg ------------
  • 13. Henry R. Kang (7/2008) Relative Average Deviation • Relative average deviation (RAD) is the absolute average deviation relative to the mean xavg A precision of 3 ppt or less is considered very good. RAD (ppt) = × 1000 davg xavg
  • 14. Henry R. Kang (7/2008) Standard Deviation • Standard deviation (σ) is useful in estimating data points distribution in the form of the Gaussian distribution (a bell-shaped curve).  (xavg ± σ) incorporates 68.3% of the data points.  (xavg ± 3σ) incorporates 99.7% of the data points.  The smaller the σ, the less spread of data points.  d1 = x1 – xavg d2 = x2 – xavg d3 = x3 – xavg ------------ dn = xn – xavg ∑ di 2 n – 1 =σ √ = √ d1 2 + d2 2 + d3 2 + - - - - + dn 2 n – 1
  • 15. Henry R. Kang (7/2008) Relative Standard Deviation • Relative standard deviation (σr) is the standard deviation relative to the mean value.  d1 = x1 – xavg d2 = x2 – xavg d3 = x3 – xavg --------- --- dn = xn – xavg where n is the number of measurements ∑ (di /xavg)2 n – 1 =σr √ = √ D1 2 +D2 2 +D3 2 + - - - - +Dn 2 n – 1 or σr (ppt) = (σ / xavg ) × 1000
  • 16. Henry R. Kang (7/2008) Gaussian Distribution • Gaussian distribution gives the distribution of data points with respect to the true value. It gives a bell-shaped curve as shown in the figure. 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 -3 -2 -1 0 1 2 3 Standard Deviation Probability • The Gaussian equation is P(x) = [(2π)1/2 σ]–1 exp[-(x – X)2 /(2σ2 )] where σ is the standard deviation and X is the true value.  The closer to the true value, the higher the probability.  The area under the curve (or the integration of the Gaussian function)  (xture ± σ) incorporates 68.3% of the data points.  (xture ± 3σ) incorporates 99.7% of the data points.  (xture ± 3.8901σ) incorporates 99.99% of the data points.  (xture ± 4.4172σ) incorporates 99.999% of the data points.  (xture ± 6σ) incorporates nearly 100% of the data points.
  • 17. Henry R. Kang (7/2008) Standard Deviation & Data Distribution • The smaller the σ, the less spread of data points. 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 -4 -3 -2 -1 0 1 2 3 4 Standard Deviation Probability σ = 0.5 σ = 1.0 σ = 2.0
  • 18. Henry R. Kang (7/2008) Approximation of Standard Deviation
    • Computing the standard deviation takes considerably more arithmetic than computing a range, so a range-based approximation is often used to estimate it at much lower cost.
    • š = Ř / √N
      - Ř is the range of the data points from the lowest value to the highest value: Ř = $x_{max} - x_{min}$
      - N is the number of data points.
    • For a small number of measurements, the approximation is accurate enough to replace the formal standard deviation.
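To get a feel for how close the range-based estimate is, the sketch below compares š = Ř / √N with the formal standard deviation for one small data set. The function name is hypothetical, and the readings are reused from Example 1 of this lecture; a single comparison is only an illustration, not a validation of the approximation.

```python
import math
import statistics


def range_std_estimate(values):
    """Approximate standard deviation: (x_max - x_min) / sqrt(N)."""
    return (max(values) - min(values)) / math.sqrt(len(values))


data = [28.72, 28.40, 28.57]  # %S readings reused from Example 1
approx = range_std_estimate(data)
formal = statistics.stdev(data)  # (n - 1) denominator
print(f"range-based estimate: {approx:.3f}")
print(f"formal std deviation: {formal:.3f}")
```

For these three readings the two values differ by a few hundredths of a percent, so they lead to the same conclusion about precision.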
  • 19. Henry R. Kang (7/2008) Accuracy and Precision
  • 20. Henry R. Kang (1/2010) Accuracy & Precision of Measurements
    • Accuracy is a measure of the closeness of a measured quantity to the true value.
    • Precision is a measure of the agreement of replicate measurements.
    • Measurements can be precise but not accurate, accurate but not precise, or neither. The best result is, of course, both accurate and precise.
    [Figure: four target diagrams labeled "accurate and precise", "precise but not accurate", "accurate but not precise", and "not accurate and not precise"]
  • 21. Henry R. Kang (1/2010) Example 1 of Accuracy and Precision
    • Measured %S values in H2SO4 are 28.72%, 28.40%, and 28.57%, where the true value is 32.69%. Determine the accuracy and precision.
    • Answer:
      - Mean: $x_M$ = (28.72% + 28.40% + 28.57%) / 3 = 28.56%
      - Estimated precision by using the approximation š = Ř / √N:
        š = (28.72 − 28.40)% / $3^{1/2}$ = 0.32% / 1.732 = 0.18%
      - Relative standard deviation: $s_r$ = š / $x_M$ = 0.18% / 28.56% = 0.0063
      - Accuracy = |X − $x_M$| = |32.69% − 28.56%| = 4.13%
      - Relative accuracy = Accuracy / True value = 4.13% / 32.69% = 0.126
    • These results indicate that the data are precise but inaccurate.
  • 22. Henry R. Kang (1/2010) Example 2 of Accuracy and Precision
    • Measured %S values in H2SO4 are 28.89%, 32.56%, and 36.64%, where the true value is 32.69%. Determine the accuracy and precision.
    • Answer:
      - Mean: $x_M$ = (28.89% + 32.56% + 36.64%) / 3 = 32.70%
      - Estimated precision by using the approximation š = Ř / √N:
        š = (36.64 − 28.89)% / $3^{1/2}$ = 7.75% / 1.732 = 4.47%
      - Relative standard deviation: $s_r$ = š / $x_M$ = 4.47% / 32.70% = 0.137
      - Accuracy = |X − $x_M$| = |32.69% − 32.70%| = 0.01%
      - Relative accuracy = Accuracy / True value = 0.01% / 32.69% = 0.0003
    • These results indicate that the data are imprecise but accurate.
  • 23. Henry R. Kang (1/2010) Example 3 of Accuracy and Precision
    • Measured %S values in H2SO4 are 25.62%, 33.56%, and 27.93%, where the true value is 32.69%. Determine the accuracy and precision.
    • Answer:
      - Mean: $x_M$ = (25.62% + 33.56% + 27.93%) / 3 = 29.04%
      - Estimated precision by using the approximation š = Ř / √N:
        š = (33.56 − 25.62)% / $3^{1/2}$ = 7.94% / 1.732 = 4.58%
      - Relative standard deviation: $s_r$ = š / $x_M$ = 4.58% / 29.04% = 0.158
      - Accuracy = |X − $x_M$| = |32.69% − 29.04%| = 3.65%
      - Relative accuracy = Accuracy / True value = 3.65% / 32.69% = 0.112
    • These results indicate that the data are imprecise and inaccurate.
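The same sequence of steps is applied in all three examples, so it can be collected into one helper. The sketch below is an illustration only: the function name `summarize` and the dictionary keys are hypothetical, and the code works with unrounded intermediate values, so the last digit may differ slightly from hand calculations that round at each step.

```python
import math

TRUE_VALUE = 32.69  # true %S in H2SO4, as given in the examples


def summarize(values, true_value):
    """Return mean, range-based precision estimate, and accuracy measures."""
    n = len(values)
    mean = sum(values) / n
    s_est = (max(values) - min(values)) / math.sqrt(n)  # š = Ř / √N
    accuracy = abs(true_value - mean)
    return {
        "mean": mean,
        "s_est": s_est,
        "rel_precision": s_est / mean,
        "accuracy": accuracy,
        "rel_accuracy": accuracy / true_value,
    }


examples = {
    "Example 1": [28.72, 28.40, 28.57],
    "Example 2": [28.89, 32.56, 36.64],
    "Example 3": [25.62, 33.56, 27.93],
}

for name, values in examples.items():
    r = summarize(values, TRUE_VALUE)
    print(
        f"{name}: mean = {r['mean']:.2f}%, s = {r['s_est']:.2f}%, "
        f"s_r = {r['rel_precision']:.4f}, accuracy = {r['accuracy']:.2f}%, "
        f"relative accuracy = {r['rel_accuracy']:.4f}"
    )
```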
  • 24. Henry R. Kang (7/2008) Data Rejection
  • 25. Henry R. Kang (7/2008) Data Rejection
    • Replicate measurements of a given quantity are usually scattered.
      - Some values are closer together than others.
    • Which values to keep (or which values to discard)
      - If a single result differs greatly from the others because of an identifiable error by the experimenter, then this result should be discarded.
      - If a result is significantly "off" but there was no error in the experiment, then the result should, in general, be kept.
    • If in doubt, use the rejection coefficient Q test.
    • Do not discard any result just to get "good precision".
  • 26. Henry R. Kang (7/2008) Q Test
    • The Q test is used to test the extreme values (the highest and lowest values).
    • Procedure
      - Calculate the range:
        Range = $x_{max} - x_{min}$
      - Calculate the difference between each extreme value and its nearest neighbor:
        $d_{hi} = x_{max} - x_{nbor,hi}$; $d_{lo} = |x_{min} - x_{nbor,lo}|$
      - Calculate the ratio (Q value) of the difference to the range:
        $Q_{hi} = d_{hi} / \mathrm{Range}$; $Q_{lo} = d_{lo} / \mathrm{Range}$
    • Compare the resulting Q value with the rejection table at the 90% confidence level (or another selected confidence level).
      - If the calculated Q value is greater than the Q value given in the table, then reject the value.
  • 27. Henry R. Kang (7/2008) Rejection Q Tables
    Number of Data   Q90    Q96    Q99
    3                0.94   0.98   0.99
    4                0.76   0.85   0.93
    5                0.64   0.73   0.82
    6                0.56   0.64   0.74
    7                0.51   0.59   0.68
    8                0.47   0.54   0.63
    9                0.44   0.51   0.60
    10               0.41   0.48   0.57
  • 28. Henry R. Kang (7/2008) Q Test - Example
    • Data: 35.00, 35.05, 35.10, 35.80
    • Calculate the range.
      - Range = $x_{max} - x_{min}$ = 35.80 – 35.00 = 0.80
    • Calculate the difference between each extreme value and its nearest neighbor.
      - $d_{hi} = x_{max} - x_{nbor,hi}$ = 35.80 – 35.10 = 0.70
      - $d_{lo} = |x_{min} - x_{nbor,lo}|$ = |35.00 – 35.05| = 0.05
    • Calculate the Q values as the ratio of each difference to the range.
      - $Q_{hi} = d_{hi} / \mathrm{Range}$ = 0.70 / 0.80 = 0.88
      - $Q_{lo} = d_{lo} / \mathrm{Range}$ = 0.05 / 0.80 = 0.063
    • Compare the resulting Q values with the rejection table at the 90% confidence level.
      - For 4 samples, the Q value in the table is 0.76.
      - $Q_{hi}$ > 0.76; therefore, the highest value, 35.80, can be dropped.
      - $Q_{lo}$ < 0.76; therefore, the lowest value, 35.00, is kept.
      - Once a value is dropped, it is no longer in the data set and should not be used in calculating the mean and the various deviations.
    #Data   Q90
    3       0.94
    4       0.76
    5       0.64
    6       0.56
    7       0.51
    8       0.47
    9       0.44
    10      0.41
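The Q test procedure and the worked example can be sketched as a small Python function. The Q90 critical values are copied from the rejection table in this lecture; the function name `q_test` and the shape of the returned dictionary are assumptions made for this illustration.

```python
# Q90 critical values from the rejection table (number of data -> Q90).
Q90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56, 7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}


def q_test(values, table=Q90):
    """Apply the Q test to the highest and lowest values of a data set."""
    data = sorted(values)
    n = len(data)
    if n not in table:
        raise ValueError("the table only covers 3 to 10 data points")
    data_range = data[-1] - data[0]
    if data_range == 0:
        raise ValueError("all values are identical; Q is undefined")
    q_hi = (data[-1] - data[-2]) / data_range  # highest value vs. its neighbor
    q_lo = (data[1] - data[0]) / data_range    # lowest value vs. its neighbor
    q_crit = table[n]
    return {
        "q_hi": q_hi,
        "q_lo": q_lo,
        "q_crit": q_crit,
        "reject_high": q_hi > q_crit,
        "reject_low": q_lo > q_crit,
    }


result = q_test([35.00, 35.05, 35.10, 35.80])
print(f"Q_hi = {result['q_hi']:.3f}, Q_lo = {result['q_lo']:.3f}, "
      f"Q_table = {result['q_crit']}")
print("reject highest value:", result["reject_high"])
print("reject lowest value:", result["reject_low"])
```

Passing a different dictionary as `table` would let the same function be used at another confidence level, such as the Q96 or Q99 columns.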