Statistics: Terms and Definitions 
Population: All data, continuous 
Sample: A subset of data, discrete. Use sample for inferential statistics. 
Every statistical problem contains five elements: 
•Questions to be answered. Identification of the populations 
•Design of experiment, sampling procedure 
•Analysis of the sampled data (equations and distributions) 
•Inference (based on confidence level) 
•How good the inference is, measure of goodness
Statistics: Terms and Definitions 
Measurements: Single Point 
Multiple Point 
Uncertainty is the total error associated with a measurement at a specific level of confidence. 
Errors: Bias or fixed error (Systematic Error) 
Precision or random error 
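Mean = $\mu = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $x_i$ is the $i$-th sample and $n$ is the total number of samples. 
Variance = $\sigma^2 = s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (\bar{x} - x_i)^2$ 
Average deviation from the mean = $\frac{1}{n}\sum_{i=1}^{n} |\bar{x} - x_i|$ 
R.M.S. deviation from the mean = $\sqrt{\frac{1}{n}\sum_{i=1}^{n} (\bar{x} - x_i)^2}$ 
Standard Deviation (SD) = $s = \sigma = \sqrt{s^2} = \sqrt{\sigma^2}$ 
Coefficient of Variation: the relative variation of the data, $s / \bar{x}$ 
Standard Error of the Mean = $s_{\bar{x}} = s / \sqrt{n}$ 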
Mode: The most frequent value in the measurements. 
Median: The central item when the data are arranged in ascending or descending order. 
Degrees of freedom: DF = n − k, where k is the number of constraints imposed on the data.
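The definitions above can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function name `describe` and the sample values are hypothetical, the average deviation is taken as the mean absolute deviation, and k = 1 is assumed for the degrees of freedom because the mean is estimated from the same data.

```python
import math
import statistics


def describe(sample):
    """Return the summary statistics defined above for a list of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    sq_dev = sum((mean - x) ** 2 for x in sample)  # sum of squared deviations
    variance = sq_dev / (n - 1)                    # sample variance (n - 1 in the denominator)
    std_dev = math.sqrt(variance)
    return {
        "n": n,
        "mean": mean,
        "variance": variance,
        # Assumption: "average deviation" means the mean absolute deviation.
        "avg_deviation": sum(abs(x - mean) for x in sample) / n,
        "rms_deviation": math.sqrt(sq_dev / n),
        "std_dev": std_dev,
        "coeff_variation": std_dev / mean,
        "std_error_mean": std_dev / math.sqrt(n),
        "mode": statistics.mode(sample),
        "median": statistics.median(sample),
        # Assumption: k = 1, since the mean is computed from the data.
        "degrees_of_freedom": n - 1,
    }


# Hypothetical sample, chosen only so the arithmetic is easy to check by hand.
stats = describe([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
for name, value in stats.items():
    print(f"{name}: {value:.4g}")
```

Note that the R.M.S. deviation divides by n while the standard deviation divides by n − 1, so the two differ noticeably for small samples.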
Probability Density Function (PDF) 
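Probability is a measure of how likely an event is to occur. 
Probability of an event between a and b: 
$P(a < x < b) = \int_a^b p(x)\,dx$ 
Total probability = $\int_{-\infty}^{\infty} p(x)\,dx = 1$ 
Gaussian Distribution 
$p(x) = \frac{1}{\sigma_x \sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma_x^2}\right)$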
Standard Normal Distribution 
If the data set is large and random, then after the following conversion the variable z should follow a standard normal distribution. 
$z = \frac{x - \mu}{\sigma_x}$ 
$p(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$ 
Area under the curve is one.
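As an illustration of the Gaussian density and the z conversion, here is a short Python sketch. It is not from the slides: the helper names and the values of `mu` and `sigma` are hypothetical, and the area is checked with a simple trapezoidal rule over ±8σ instead of an exact integral.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Gaussian probability density p(x) with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def standard_normal_cdf(z):
    """Area under the standard normal curve from -infinity to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_between(a, b, mu, sigma):
    """P(a < x < b) for a Gaussian variable, using z = (x - mu) / sigma."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return standard_normal_cdf(z_b) - standard_normal_cdf(z_a)


def trapezoid_area(f, lo, hi, steps=10_000):
    """Numerically integrate f from lo to hi with the trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, steps))
    return total * h


mu, sigma = 10.0, 2.0  # hypothetical mean and standard deviation
total_area = trapezoid_area(lambda x: gaussian_pdf(x, mu, sigma), mu - 8 * sigma, mu + 8 * sigma)
within_one_sigma = probability_between(mu - sigma, mu + sigma, mu, sigma)
print(f"total area = {total_area:.6f}")
print(f"P(mu - sigma < x < mu + sigma) = {within_one_sigma:.4f}")
```

The conversion to z lets one table (or one function) serve every Gaussian variable, whatever its mean and standard deviation.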
Histogram 
A histogram shows the frequency, and hence the estimated probability, of events within each increment. It can be used to check whether the data follow a standard distribution. The following steps can be used to draw a histogram: 
–Choose a number of class intervals (usually between 5 and 20) that cover the data range. Select the class marks, which are the mid-points of the class intervals. If you arrange the data in ascending order, the first data point should fall in the first class interval. 
–For each class interval, determine the number of data points that fall within that interval. If a data point falls exactly on a division point, it is placed in the lower interval. 
–Construct rectangles with centers at the class marks and areas proportional to the class frequencies. If the widths of the rectangles are the same, then the heights of the rectangles represent the class frequencies.
Histogram 
Data: 25 data points. 
3.0, 6.0, 7.5, 15.0, 12.0, 6.5, 8.0, 4.0, 5.5, 6.5, 5.5, 
12.0, 1.0, 3.5, 3.0, 7.5, 5.0, 10.0, 8.0, 3.5, 9.0, 2.0, 
6.5, 1.0, 5.0 
$\Delta x = \frac{x_{max} - x_{min}}{\text{number of class intervals}} = (15.0 - 1.0)/6 = 2.33$, rounded up to $\Delta x = 2.4$ 
$x_1 = -0.2$ 
$x_2 = x_1 + \Delta x = -0.2 + 2.4 = 2.2$ 
$x_3 = x_2 + \Delta x = 2.2 + 2.4 = 4.6$ 
$x_4 = x_3 + \Delta x = 4.6 + 2.4 = 7.0$ 
$x_5 = x_4 + \Delta x = 7.0 + 2.4 = 9.4$ 
$x_6 = x_5 + \Delta x = 9.4 + 2.4 = 11.8$ 
$x_7 = x_6 + \Delta x = 11.8 + 2.4 = 14.2$ 
$x_8 = x_7 + \Delta x = 14.2 + 2.4 = 16.6$ 
Class   Subinterval start   Subinterval end   Class mark   Class frequency
1             -0.2                2.2             1.0              3
2              2.2                4.6             3.4              5
3              4.6                7.0             5.8              8
4              7.0                9.4             8.2              5
5              9.4               11.8            10.6              1
6             11.8               14.2            13.0              2
7             14.2               16.6            15.4              1
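The class frequencies for this data set can be reproduced with a few lines of Python. This is a sketch under stated assumptions and not part of the original slides: the function name `class_frequencies` is hypothetical, the classes start at -0.2 with a width of 2.4 as in the example, and a value on a division point goes to the lower interval.

```python
import math

data = [3.0, 6.0, 7.5, 15.0, 12.0, 6.5, 8.0, 4.0, 5.5, 6.5, 5.5,
        12.0, 1.0, 3.5, 3.0, 7.5, 5.0, 10.0, 8.0, 3.5, 9.0, 2.0,
        6.5, 1.0, 5.0]


def class_frequencies(values, start, width, n_classes):
    """Count values per class; a value on a division point goes to the lower class."""
    counts = [0] * n_classes
    for v in values:
        # Rounding guards against floating-point noise at the division points.
        position = round((v - start) / width, 9)
        # ceil(position) - 1 maps the interval (start + i*width, start + (i+1)*width] to index i.
        index = math.ceil(position) - 1
        index = min(max(index, 0), n_classes - 1)
        counts[index] += 1
    return counts


start, width, n_classes = -0.2, 2.4, 7
frequencies = class_frequencies(data, start, width, n_classes)
class_marks = [round(start + (i + 0.5) * width, 1) for i in range(n_classes)]

# Text histogram: one row per class, bar length equal to the class frequency.
for mark, count in zip(class_marks, frequencies):
    print(f"{mark:5.1f} | {'#' * count} ({count})")
```

Because all classes have the same width, the bar lengths are directly proportional to the class frequencies.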
Uncertainty Analysis
[Slide equation garbled in extraction; surviving symbols: u, t, S, R] for 95% confidence level
Uncertainty and Level of Confidence 
The variation of the mean value is identified by the number of standard deviations (± σ or ± s) we select. That number is tied to the level of confidence we choose, which states how sure we are that the data fall within the identified range of standard deviations. 
The relationships between the confidence level and the standard deviation are as follows: 
About 68% level of confidence ± s 
About 95% level of confidence ± 2s 
(this is what engineers use, unless stated otherwise) 
About 99.7% level of confidence ± 3s 
For large sample: $\bar{x} \pm t_{\alpha}\, s_{\bar{x}}$ 
Here α = 1 − level of confidence. 
For small sample: $\bar{x} \pm t_{\alpha/2}\, \frac{s_x}{\sqrt{n}}$
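A large-sample interval for the mean can be sketched in Python as follows. This is an illustration and not the slides' own procedure: the function name and the sample are hypothetical, and the two-sided normal quantile (about 1.96 for 95%) is assumed as a stand-in for the t value, which it approaches when n is large. A small sample would need a Student-t table or a statistics library instead.

```python
import math
from statistics import NormalDist, mean, stdev


def large_sample_interval(sample, confidence=0.95):
    """Return (low, high) bounds for the mean of a large sample."""
    alpha = 1.0 - confidence
    # Assumption: the normal quantile replaces the t value for large n.
    k = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    x_bar = mean(sample)
    s_mean = stdev(sample) / math.sqrt(len(sample))  # standard error of the mean
    return x_bar - k * s_mean, x_bar + k * s_mean


# Hypothetical sample of 55 readings scattered evenly around 10.0.
sample = [10 + 0.1 * ((i * 7) % 11 - 5) for i in range(55)]
low, high = large_sample_interval(sample)
print(f"mean = {mean(sample):.3f}, 95% interval = [{low:.3f}, {high:.3f}]")
```

Raising the confidence level widens the interval, and increasing n narrows it through the 1/√n factor in the standard error.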
Identification of Possible Bad Data Point 
Z Score: The z score is a measure of the relative standing of a data point. 
$z = \frac{x - \bar{x}}{s}$ 
Data with |z| values higher than 1.96 (95% level of confidence) are discarded. 
Chauvenet's Criterion: 
•For a sample population, calculate $\bar{x}$ and $\sigma_x$. 
•Using the sample size n, find the ratio $\sigma_{max} / \sigma_x$. 
•Knowing $\sigma_x$, find $\sigma_{max}$ from the table below. 
•Calculate $|\bar{x} - x|$. Here x is the sample that you are assessing. If the difference is larger than $\sigma_{max}$, the sample is discarded; otherwise it is retained.
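Both screening rules can be sketched in Python. This is not from the slides: the function names and the readings are hypothetical, and the ratio σmax/σx is computed from the normal distribution in place of a printed table, using the usual statement of Chauvenet's criterion (reject a point when the two-sided tail probability of its deviation is below 1/(2n)).

```python
from statistics import NormalDist, mean, stdev


def z_scores(sample):
    """z = (x - mean) / s for every point in the sample."""
    x_bar, s = mean(sample), stdev(sample)
    return [(x - x_bar) / s for x in sample]


def chauvenet_ratio(n):
    """sigma_max / sigma_x for sample size n (assumed to stand in for the table)."""
    # Two-sided tail probability 1/(2n) means 1/(4n) in each tail.
    return NormalDist().inv_cdf(1.0 - 1.0 / (4.0 * n))


def chauvenet_outliers(sample):
    """Return the points whose deviation from the mean exceeds sigma_max."""
    x_bar, s = mean(sample), stdev(sample)
    sigma_max = chauvenet_ratio(len(sample)) * s
    return [x for x in sample if abs(x - x_bar) > sigma_max]


# Hypothetical readings with one suspicious value.
readings = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 13.5]
flagged_by_z = [x for x, z in zip(readings, z_scores(readings)) if abs(z) > 1.96]
flagged_by_chauvenet = chauvenet_outliers(readings)
print("z-score rule:", flagged_by_z)
print("Chauvenet's criterion:", flagged_by_chauvenet)
```

For n = 10 the two rules use nearly the same threshold. For larger samples Chauvenet's ratio grows, so it rejects fewer points than a fixed cut at 1.96.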
Linear Regression 
Linear regression is used extensively for calibration. It gives a relationship between the input (x) and the output (y). Calibration is used to eliminate bias error. 
$y = a_0 + a_1 x$ 
Where: 
The error associated with fitting the data with this equation is: 
This is a mathematical error.
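The slide's own expressions for the coefficients and the fitting error did not survive extraction. As a stand-in, the following Python sketch uses the ordinary least-squares formulas for a0 and a1 and the standard error of the fit with n − 2 degrees of freedom. Both choices, along with the function names and calibration data, are assumptions and not taken from the slides.

```python
import math


def fit_line(xs, ys):
    """Least-squares coefficients (a0, a1) for y = a0 + a1 * x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a0 = (sum_y - a1 * sum_x) / n
    return a0, a1


def standard_error_of_fit(xs, ys, a0, a1):
    """Root of the summed squared residuals divided by n - 2 degrees of freedom."""
    residual_sq = sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(residual_sq / (len(xs) - 2))


# Hypothetical calibration data: known inputs and instrument readings.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
readings = [1.1, 2.9, 5.2, 7.1, 8.8]
a0, a1 = fit_line(inputs, readings)
fit_error = standard_error_of_fit(inputs, readings, a0, a1)
print(f"y = {a0:.3f} + {a1:.3f} x, standard error of fit = {fit_error:.3f}")
```

Two degrees of freedom are removed because both a0 and a1 are estimated from the same data.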
Correlation Coefficient 
Correlation coefficient (r) is a measure of the strength of a linear relationship between two variables. 
Or
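The slide's formulas for r did not survive extraction. The sketch below uses the standard Pearson definition, which is assumed here and not confirmed by the source. The function name and data are hypothetical.

```python
import math


def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient r between two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical calibration data with a nearly linear response.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
readings = [1.1, 2.9, 5.2, 7.1, 8.8]
r = correlation_coefficient(inputs, readings)
print(f"r = {r:.4f}")
```

Values of r near +1 or −1 indicate a strong linear relationship, and values near 0 indicate a weak one.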
