SlideShare a Scribd company logo
1 of 11
Download to read offline
Course notes:
Descriptive
statistics
Course notes:
Descriptive
statistics
COURSE NOTES: INFERENTIAL
STATISTICS
Distributions
In statistics, when we talk about distributions we usually mean
probability distributions.
Definition (informal): A distribution is a function that shows
the possible values for a variable and how often they occur.
Definition (Wikipedia): In probability theory and statistics, a
probability distribution is a mathematical function that, stated
in simple terms, can be thought of as providing the
probabilities of occurrence of different possible outcomes in
an experiment.
Examples: Normal distribution, Studentโ€™s T distribution, Poisson
distribution, Uniform distribution, Binomial distribution
Graphical representation
It is a common mistake to believe that the distribution is the
graph. In fact the distribution is the โ€˜ruleโ€™ that determines how
values are positioned in relation to each other.
Very often, we use a graph to visualize the data. Since
different distributions have a particular graphical
representation, statisticians like to plot them.
Examples:
Uniform distribution Binomial distribution
Normal distribution Studentโ€™s T distribution
Definition
The Normal Distribution
The Normal distribution is also known as Gaussian
distribution or the Bell curve. It is one of the most
common distributions due to the following reasons:
โ€ข It approximates a wide variety of random variables
โ€ข Distributions of sample means with large enough
samples sizes could be approximated to normal
โ€ข All computable statistics are elegant
โ€ข Heavily used in regression analysis
โ€ข Good track record
Examples:
โ€ข Biology. Most biological measures are normally
distributed, such as: height; length of arms, legs,
nails; blood pressure; thickness of tree barks, etc.
โ€ข IQ tests
โ€ข Stock market information
๐‘~(๐œ‡, ๐œŽ2
)
N stands for normal;
~ stands for a distribution;
ฮผ is the mean;
๐œŽ2
is the variance.
The Normal Distribution
0
๐ = ๐Ÿ•๐Ÿ’๐Ÿ‘
Origin
0
๐ = ๐Ÿ’๐Ÿ•๐ŸŽ
๐ˆ = ๐Ÿ๐Ÿ’๐ŸŽ
0
๐ = ๐Ÿ—๐Ÿ”๐ŸŽ
๐ˆ = ๐Ÿ๐Ÿ’๐ŸŽ ๐ˆ = ๐Ÿ๐Ÿ’๐ŸŽ
Controlling for the standard deviation
Keeping the standard deviation constant, the graph
of a normal distribution with:
โ€ข a smaller mean would look in the same way, but
be situated to the left (in gray)
โ€ข a larger mean would look in the same way, but
be situated to the right (in red)
The Normal Distribution
0
๐ = ๐Ÿ•๐Ÿ’๐Ÿ‘
Origin
๐ˆ = ๐Ÿ๐Ÿ’๐ŸŽ
0
0
๐ = ๐Ÿ•๐Ÿ’๐Ÿ‘
๐ = ๐Ÿ•๐Ÿ’๐Ÿ‘
๐ˆ = ๐Ÿ•๐ŸŽ
๐ˆ = ๐Ÿ๐Ÿ๐ŸŽ
Controlling for the mean
Keeping the mean constant, a normal distribution
with:
โ€ข a smaller standard deviation would be situated in
the same spot, but have a higher peak and
thinner tails (in red)
โ€ข a larger standard deviation would be situated in
the same spot, but have a lower peak and fatter
tails (in gray)
The Standard Normal Distribution
๐‘~(0,1)
The Standard Normal distribution is a particular case
of the Normal distribution. It has a mean of 0 and a
standard deviation of 1.
Every Normal distribution can be โ€˜standardizedโ€™ using
the standardization formula:
๐‘ง =
๐‘ฅ โˆ’ ๐œ‡
๐œŽ
A variable following the Standard Normal
distribution is denoted with the letter z.
Why standardize?
Standardization allows us to:
โ€ข compare different normally distributed datasets
โ€ข detect normality
โ€ข detect outliers
โ€ข create confidence intervals
โ€ข test hypotheses
โ€ข perform regression analysis
Rationale of the formula for standardization:
We want to transform a random variable from ๐‘~ ฮผ, ๐œŽ2
to ๐‘~(0,1). Subtracting the mean from all observations would cause a
transformation from ๐‘~ ฮผ, ๐œŽ2
to ๐‘~ 0, ๐œŽ2
, moving the graph to the origin. Subsequently, dividing all observations by the standard
deviation would cause a transformation from ๐‘~ 0, ๐œŽ2
to ๐‘~ 0,1 , standardizing the peak and the tails of the graph.
The Central Limit Theorem
The Central Limit Theorem (CLT) is one of the greatest statistical insights. It states that no matter the underlying distribution of the
dataset, the sampling distribution of the means would approximate a normal distribution. Moreover, the mean of the sampling
distribution would be equal to the mean of the original distribution and the variance would be n times smaller, where n is the size of the
samples. The CLT applies whenever we have a sum or an average of many variables (e.g. sum of rolled numbers when rolling dice).
The theorem Why is it useful? Where can we see it?
โžข No matter the distribution
โžข The distribution of ๐‘ฅ1, ๐‘ฅ2, ๐‘ฅ3, ๐‘ฅ4,
โ€ฆ , ๐‘ฅ๐‘˜ would tend to ๐‘~ ฮผ,
๐œŽ2
๐‘›
โžข The more samples, the closer to
Normal ( k -> โˆž )
โžข The bigger the samples, the closer
to Normal ( n -> โˆž )
The CLT allows us to assume normality
for many different variables. That is very
useful for confidence intervals,
hypothesis testing, and regression
analysis. In fact, the Normal distribution
is so predominantly observed around
us due to the fact that following the
CLT, many variables converge to
Normal.
Since many concepts and events are a
sum or an average of different effects,
CLT applies and we observe normality
all the time. For example, in regression
analysis, the dependent variable is
explained through the sum of error
terms.
Click here for a CLT simulator.
Estimators and Estimates
Estimators Estimates
Broadly, an estimator is a mathematical function that
approximates a population parameter depending only on
sample information.
Examples of estimators and the corresponding parameters:
Term Estimator Parameter
Mean เดฅ
๐’™ ฮผ
Variance ๐’”๐Ÿ
๐ˆ๐Ÿ
Correlation r ฯ
An estimate is the output that you get from the estimator
(when you apply the formula). There are two types of
estimates: point estimates and confidence interval
estimates.
Point
estimates
Confidence
intervals
A single value.
Examples:
โ€ข 1
โ€ข 5
โ€ข 122.67
โ€ข 0.32
An interval.
Examples:
โ€ข ( 1 , 5 )
โ€ข ( 12 , 33)
โ€ข ( 221.78 , 745.66)
โ€ข ( - 0.71 , 0.11 )
Estimators have two important properties:
โ€ข Bias
The expected value of an unbiased estimator is the population
parameter. The bias in this case is 0. If the expected value of an
estimator is (parameter + b), then the bias is b.
โ€ข Efficiency
The most efficient estimator is the one with the smallest
variance.
Confidence intervals are much more precise than point
estimates. That is why they are preferred when making
inferences.
Confidence Intervals and the Margin of Error
Definition: A confidence interval is an interval within which we are confident (with a certain percentage of confidence) the population parameter will fall.
We build the confidence interval around the point estimate.
(1-ฮฑ) is the level of confidence. We are (1-ฮฑ)*100% confident that the population parameter will fall in the specified interval. Common alphas are: 0.01, 0.05, 0.1.
General formula:
[ เดค
๐’™- ME, เดค
๐’™ + ME ] , where ME is the margin of error.
ME = reliabilityfactorโˆ—
๐‘ ๐‘ก๐‘Ž๐‘›๐‘‘๐‘Ž๐‘Ÿ๐‘‘ ๐‘‘๐‘’๐‘ฃ๐‘–๐‘Ž๐‘ก๐‘–๐‘œ๐‘›
๐‘ ๐‘Ž๐‘š๐‘๐‘™๐‘’ ๐‘ ๐‘–๐‘ง๐‘’
Interval start Interval end
Point estimate
๐’›๐œถ/๐Ÿ โˆ—
๐ˆ
๐’
๐’•ฯ…,๐œถ/๐Ÿ โˆ—
๐’”
๐’
Term Effect on width of CI
(1-ฮฑ) โ†‘ โ†‘
๐ˆ โ†‘ โ†‘
n โ†‘ โ†“
Studentโ€™s T Distribution
The Studentโ€™s T distribution is used
predominantly for creating confidence
intervals and testing hypotheses with
normally distributed populations when
the sample sizes are small. It is
particularly useful when we donโ€™t have
enough information or it is too costly to
obtain it.
All else equal, the Studentโ€™s T distribution
has fatter tails than the Normal distribution
and a lower peak. This is to reflect the
higher level of uncertainty, caused by the
small sample size.
Studentโ€™s T distribution
Normal distribution
A random variable following the t-distribution is denoted ๐‘กฯ…,ฮฑ , where ฯ… are the degrees of freedom.
We can obtain the studentโ€™s T distribution for a variable with a Normally distributed population using the formula: ๐‘กฯ…,ฮฑ =
าง
๐‘ฅโˆ’๐œ‡
๐‘ / ๐‘›
Formulas for Confidence Intervals
๐’›๐œถ/๐Ÿ โˆ—
๐ˆ
๐’
๐’•๐’….๐’‡.,๐œถ/๐Ÿ โˆ—
๐’”
๐’
# populations
Population
variance
Samples Statistic Variance Formula
One known - z ๐œŽ2 เดค
x ยฑ z ฮค
ฮฑ 2
ฯƒ
n
One unknown - t ๐‘ 2 เดค
x ยฑ ๐‘ก๐‘›โˆ’1, ฮค
ฮฑ 2
s
n
Two - dependent t ๐‘ ๐‘‘๐‘–๐‘“๐‘“๐‘’๐‘Ÿ๐‘’๐‘›๐‘๐‘’
2 เดค
d ยฑ ๐‘ก๐‘›โˆ’1, ฮค
ฮฑ 2
๐‘ ๐‘‘
n
Two Known independent z ฯƒ๐‘ฅ
2
, ฯƒ๐‘ฆ
2
( าง
๐‘ฅ โˆ’ เดค
๐‘ฆ) ยฑ ๐‘ง ฮค
๐›ผ 2
ฯƒ๐‘ฅ
2
๐‘›๐‘ฅ
+
ฯƒ๐‘ฆ
2
๐‘›๐‘ฆ
Two
unknown,
assumed
equal
independent t
๐‘ ๐‘
2
=
๐‘›๐‘ฅ โˆ’ 1 ๐‘ ๐‘ฅ
2
+ ๐‘›๐‘ฆ โˆ’ 1 ๐‘ ๐‘ฆ
2
๐‘›๐‘ฅ + ๐‘›๐‘ฆ โˆ’ 2
( าง
๐‘ฅ โˆ’ เดค
๐‘ฆ) ยฑ ๐‘ก ฮค
๐‘›๐‘ฅ+๐‘›๐‘ฆโˆ’2,๐›ผ 2
๐‘ ๐‘
2
๐‘›๐‘ฅ
+
๐‘ ๐‘
2
๐‘›๐‘ฆ
Two
unknown,
assumed
different
independent t ๐‘ ๐‘ฅ
2
, ๐‘ ๐‘ฆ
2
( าง
๐‘ฅ โˆ’ เดค
๐‘ฆ) ยฑ ๐‘ก ฮค
ฯ…,๐›ผ 2
๐‘ ๐‘ฅ
2
๐‘›๐‘ฅ
+
๐‘ ๐‘ฆ
2
๐‘›๐‘ฆ

More Related Content

What's hot

Ch1 The Nature of Statistics
Ch1 The Nature of StatisticsCh1 The Nature of Statistics
Ch1 The Nature of StatisticsFarhan Alfin
ย 
Descriptive Statistics - Thiyagu K
Descriptive Statistics - Thiyagu KDescriptive Statistics - Thiyagu K
Descriptive Statistics - Thiyagu KThiyagu K
ย 
Day 4 normal curve and standard scores
Day 4 normal curve and standard scoresDay 4 normal curve and standard scores
Day 4 normal curve and standard scoresElih Sutisna Yanto
ย 
Measures of central tendency and dispersion
Measures of central tendency and dispersionMeasures of central tendency and dispersion
Measures of central tendency and dispersionDr Dhavalkumar F. Chaudhary
ย 
MEASURES OF CENTRAL TENDENCY AND MEASURES OF DISPERSION
MEASURES OF CENTRAL TENDENCY AND  MEASURES OF DISPERSION MEASURES OF CENTRAL TENDENCY AND  MEASURES OF DISPERSION
MEASURES OF CENTRAL TENDENCY AND MEASURES OF DISPERSION Tanya Singla
ย 
Statstics in nursing
Statstics in nursing Statstics in nursing
Statstics in nursing Monika Devi NR
ย 
Measure of Dispersion in statistics
Measure of Dispersion in statisticsMeasure of Dispersion in statistics
Measure of Dispersion in statisticsMd. Mehadi Hassan Bappy
ย 
Central tendency and Measure of Dispersion
Central tendency and Measure of DispersionCentral tendency and Measure of Dispersion
Central tendency and Measure of DispersionDr Dhavalkumar F. Chaudhary
ย 
Statistics in nursing research
Statistics in nursing researchStatistics in nursing research
Statistics in nursing researchNursing Path
ย 
The Normal Distribution
The Normal DistributionThe Normal Distribution
The Normal DistributionSan Benito CISD
ย 
Frequency Measures for Healthcare Professioanls
Frequency Measures for Healthcare ProfessioanlsFrequency Measures for Healthcare Professioanls
Frequency Measures for Healthcare Professioanlsalberpaules
ย 
Descriptive Analysis in Statistics
Descriptive Analysis in StatisticsDescriptive Analysis in Statistics
Descriptive Analysis in StatisticsAzmi Mohd Tamil
ย 
Measures of central tendency and dispersion
Measures of central tendency and dispersionMeasures of central tendency and dispersion
Measures of central tendency and dispersionRajaKrishnan M
ย 
Stat3 central tendency & dispersion
Stat3 central tendency & dispersionStat3 central tendency & dispersion
Stat3 central tendency & dispersionForensic Pathology
ย 
Stat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisStat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisForensic Pathology
ย 
statistics
statisticsstatistics
statisticsfoyezcse
ย 
15. descriptive statistics
15. descriptive statistics15. descriptive statistics
15. descriptive statisticsAshok Kulkarni
ย 
Ch3 Probability and The Normal Distribution
Ch3 Probability and The Normal Distribution Ch3 Probability and The Normal Distribution
Ch3 Probability and The Normal Distribution Farhan Alfin
ย 

What's hot (19)

Ch1 The Nature of Statistics
Ch1 The Nature of StatisticsCh1 The Nature of Statistics
Ch1 The Nature of Statistics
ย 
Descriptive Statistics - Thiyagu K
Descriptive Statistics - Thiyagu KDescriptive Statistics - Thiyagu K
Descriptive Statistics - Thiyagu K
ย 
Day 4 normal curve and standard scores
Day 4 normal curve and standard scoresDay 4 normal curve and standard scores
Day 4 normal curve and standard scores
ย 
Measures of central tendency and dispersion
Measures of central tendency and dispersionMeasures of central tendency and dispersion
Measures of central tendency and dispersion
ย 
MEASURES OF CENTRAL TENDENCY AND MEASURES OF DISPERSION
MEASURES OF CENTRAL TENDENCY AND  MEASURES OF DISPERSION MEASURES OF CENTRAL TENDENCY AND  MEASURES OF DISPERSION
MEASURES OF CENTRAL TENDENCY AND MEASURES OF DISPERSION
ย 
Statstics in nursing
Statstics in nursing Statstics in nursing
Statstics in nursing
ย 
Measure of Dispersion in statistics
Measure of Dispersion in statisticsMeasure of Dispersion in statistics
Measure of Dispersion in statistics
ย 
Measures of dispersion discuss 2.2
Measures of dispersion discuss 2.2Measures of dispersion discuss 2.2
Measures of dispersion discuss 2.2
ย 
Central tendency and Measure of Dispersion
Central tendency and Measure of DispersionCentral tendency and Measure of Dispersion
Central tendency and Measure of Dispersion
ย 
Statistics in nursing research
Statistics in nursing researchStatistics in nursing research
Statistics in nursing research
ย 
The Normal Distribution
The Normal DistributionThe Normal Distribution
The Normal Distribution
ย 
Frequency Measures for Healthcare Professioanls
Frequency Measures for Healthcare ProfessioanlsFrequency Measures for Healthcare Professioanls
Frequency Measures for Healthcare Professioanls
ย 
Descriptive Analysis in Statistics
Descriptive Analysis in StatisticsDescriptive Analysis in Statistics
Descriptive Analysis in Statistics
ย 
Measures of central tendency and dispersion
Measures of central tendency and dispersionMeasures of central tendency and dispersion
Measures of central tendency and dispersion
ย 
Stat3 central tendency & dispersion
Stat3 central tendency & dispersionStat3 central tendency & dispersion
Stat3 central tendency & dispersion
ย 
Stat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisStat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesis
ย 
statistics
statisticsstatistics
statistics
ย 
15. descriptive statistics
15. descriptive statistics15. descriptive statistics
15. descriptive statistics
ย 
Ch3 Probability and The Normal Distribution
Ch3 Probability and The Normal Distribution Ch3 Probability and The Normal Distribution
Ch3 Probability and The Normal Distribution
ย 

Similar to 1.1 course notes inferential statistics

estimation
estimationestimation
estimationMmedsc Hahm
ย 
Estimation
EstimationEstimation
EstimationMmedsc Hahm
ย 
Stat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisStat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisForensic Pathology
ย 
Stat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisStat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisForensic Pathology
ย 
lecture-2.ppt
lecture-2.pptlecture-2.ppt
lecture-2.pptNoorelhuda2
ย 
ders 3 Unit root test.pptx
ders 3 Unit root test.pptxders 3 Unit root test.pptx
ders 3 Unit root test.pptxErgin Akalpler
ย 
ders 3.2 Unit root testing section 2 .pptx
ders 3.2 Unit root testing section 2 .pptxders 3.2 Unit root testing section 2 .pptx
ders 3.2 Unit root testing section 2 .pptxErgin Akalpler
ย 
Review of Chapters 1-5.ppt
Review of Chapters 1-5.pptReview of Chapters 1-5.ppt
Review of Chapters 1-5.pptNobelFFarrar
ย 
template.pptx
template.pptxtemplate.pptx
template.pptxuzmasulthana3
ย 
Graphical presentation of data
Graphical presentation of dataGraphical presentation of data
Graphical presentation of datadrasifk
ย 
Introduction to Statistics53004300.ppt
Introduction to Statistics53004300.pptIntroduction to Statistics53004300.ppt
Introduction to Statistics53004300.pptTripthiDubey
ย 
M.Ed Tcs 2 seminar ppt npc to submit
M.Ed Tcs 2 seminar ppt npc   to submitM.Ed Tcs 2 seminar ppt npc   to submit
M.Ed Tcs 2 seminar ppt npc to submitBINCYKMATHEW
ย 
Review & Hypothesis Testing
Review & Hypothesis TestingReview & Hypothesis Testing
Review & Hypothesis TestingSr Edith Bogue
ย 
Basics of biostatistic
Basics of biostatisticBasics of biostatistic
Basics of biostatisticNeurologyKota
ย 
best for normal distribution.ppt
best for normal distribution.pptbest for normal distribution.ppt
best for normal distribution.pptDejeneDay
ย 
statical-data-1 to know how to measure.ppt
statical-data-1 to know how to measure.pptstatical-data-1 to know how to measure.ppt
statical-data-1 to know how to measure.pptNazarudinManik1
ย 
Statistical Methods in Research
Statistical Methods in ResearchStatistical Methods in Research
Statistical Methods in ResearchManoj Sharma
ย 

Similar to 1.1 course notes inferential statistics (20)

estimation
estimationestimation
estimation
ย 
Estimation
EstimationEstimation
Estimation
ย 
Stat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisStat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesis
ย 
Stat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesisStat 4 the normal distribution & steps of testing hypothesis
Stat 4 the normal distribution & steps of testing hypothesis
ย 
lecture-2.ppt
lecture-2.pptlecture-2.ppt
lecture-2.ppt
ย 
ders 3 Unit root test.pptx
ders 3 Unit root test.pptxders 3 Unit root test.pptx
ders 3 Unit root test.pptx
ย 
ders 3.2 Unit root testing section 2 .pptx
ders 3.2 Unit root testing section 2 .pptxders 3.2 Unit root testing section 2 .pptx
ders 3.2 Unit root testing section 2 .pptx
ย 
Review of Chapters 1-5.ppt
Review of Chapters 1-5.pptReview of Chapters 1-5.ppt
Review of Chapters 1-5.ppt
ย 
template.pptx
template.pptxtemplate.pptx
template.pptx
ย 
Graphical presentation of data
Graphical presentation of dataGraphical presentation of data
Graphical presentation of data
ย 
Introduction to Statistics53004300.ppt
Introduction to Statistics53004300.pptIntroduction to Statistics53004300.ppt
Introduction to Statistics53004300.ppt
ย 
Basic statistics
Basic statisticsBasic statistics
Basic statistics
ย 
M.Ed Tcs 2 seminar ppt npc to submit
M.Ed Tcs 2 seminar ppt npc   to submitM.Ed Tcs 2 seminar ppt npc   to submit
M.Ed Tcs 2 seminar ppt npc to submit
ย 
Review & Hypothesis Testing
Review & Hypothesis TestingReview & Hypothesis Testing
Review & Hypothesis Testing
ย 
Statistics excellent
Statistics excellentStatistics excellent
Statistics excellent
ย 
Inferential statistics-estimation
Inferential statistics-estimationInferential statistics-estimation
Inferential statistics-estimation
ย 
Basics of biostatistic
Basics of biostatisticBasics of biostatistic
Basics of biostatistic
ย 
best for normal distribution.ppt
best for normal distribution.pptbest for normal distribution.ppt
best for normal distribution.ppt
ย 
statical-data-1 to know how to measure.ppt
statical-data-1 to know how to measure.pptstatical-data-1 to know how to measure.ppt
statical-data-1 to know how to measure.ppt
ย 
Statistical Methods in Research
Statistical Methods in ResearchStatistical Methods in Research
Statistical Methods in Research
ย 

Recently uploaded

Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfJos Voskuil
ย 
BEST Call Girls In Old Faridabad โœจ 9773824855 โœจ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad โœจ 9773824855 โœจ Escorts Service In Delhi Ncr,BEST Call Girls In Old Faridabad โœจ 9773824855 โœจ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad โœจ 9773824855 โœจ Escorts Service In Delhi Ncr,noida100girls
ย 
/:Call Girls In Indirapuram Ghaziabad โžฅ9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad โžฅ9990211544 Independent Best Escorts In.../:Call Girls In Indirapuram Ghaziabad โžฅ9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad โžฅ9990211544 Independent Best Escorts In...lizamodels9
ย 
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCRashishs7044
ย 
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdfNewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdfKhaled Al Awadi
ย 
Pitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deckPitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deckHajeJanKamps
ย 
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCRashishs7044
ย 
8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCRashishs7044
ย 
Call Us ๐Ÿ“ฒ8800102216๐Ÿ“ž Call Girls In DLF City Gurgaon
Call Us ๐Ÿ“ฒ8800102216๐Ÿ“ž Call Girls In DLF City GurgaonCall Us ๐Ÿ“ฒ8800102216๐Ÿ“ž Call Girls In DLF City Gurgaon
Call Us ๐Ÿ“ฒ8800102216๐Ÿ“ž Call Girls In DLF City Gurgaoncallgirls2057
ย 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesKeppelCorporation
ย 
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
8447779800, Low rate Call girls in Uttam Nagar Delhi NCRashishs7044
ย 
Market Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMarket Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMintel Group
ย 
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...ictsugar
ย 
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607dollysharma2066
ย 
Call Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / Ncr
Call Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / NcrCall Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / Ncr
Call Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / Ncrdollysharma2066
ย 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCRashishs7044
ย 
Progress Report - Oracle Database Analyst Summit
Progress  Report - Oracle Database Analyst SummitProgress  Report - Oracle Database Analyst Summit
Progress Report - Oracle Database Analyst SummitHolger Mueller
ย 
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsCash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsApsara Of India
ย 
(8264348440) ๐Ÿ” Call Girls In Mahipalpur ๐Ÿ” Delhi NCR
(8264348440) ๐Ÿ” Call Girls In Mahipalpur ๐Ÿ” Delhi NCR(8264348440) ๐Ÿ” Call Girls In Mahipalpur ๐Ÿ” Delhi NCR
(8264348440) ๐Ÿ” Call Girls In Mahipalpur ๐Ÿ” Delhi NCRsoniya singh
ย 
Call Girls In Connaught Place Delhi โค๏ธ88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi โค๏ธ88604**77959_Russian 100% Genuine Escor...Call Girls In Connaught Place Delhi โค๏ธ88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi โค๏ธ88604**77959_Russian 100% Genuine Escor...lizamodels9
ย 

Recently uploaded (20)

Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdf
ย 
BEST Call Girls In Old Faridabad โœจ 9773824855 โœจ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad โœจ 9773824855 โœจ Escorts Service In Delhi Ncr,BEST Call Girls In Old Faridabad โœจ 9773824855 โœจ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad โœจ 9773824855 โœจ Escorts Service In Delhi Ncr,
ย 
/:Call Girls In Indirapuram Ghaziabad โžฅ9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad โžฅ9990211544 Independent Best Escorts In.../:Call Girls In Indirapuram Ghaziabad โžฅ9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad โžฅ9990211544 Independent Best Escorts In...
ย 
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
ย 
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdfNewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdf
ย 
Pitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deckPitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deck
ย 
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
ย 
8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR
ย 
Call Us ๐Ÿ“ฒ8800102216๐Ÿ“ž Call Girls In DLF City Gurgaon
Call Us ๐Ÿ“ฒ8800102216๐Ÿ“ž Call Girls In DLF City GurgaonCall Us ๐Ÿ“ฒ8800102216๐Ÿ“ž Call Girls In DLF City Gurgaon
Call Us ๐Ÿ“ฒ8800102216๐Ÿ“ž Call Girls In DLF City Gurgaon
ย 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation Slides
ย 
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
ย 
Market Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMarket Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 Edition
ย 
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
ย 
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
ย 
Call Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / Ncr
Call Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / NcrCall Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / Ncr
Call Girls in DELHI Cantt, ( Call Me )-8377877756-Female Escort- In Delhi / Ncr
ย 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR
ย 
Progress Report - Oracle Database Analyst Summit
Progress  Report - Oracle Database Analyst SummitProgress  Report - Oracle Database Analyst Summit
Progress Report - Oracle Database Analyst Summit
ย 
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsCash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
ย 
(8264348440) ๐Ÿ” Call Girls In Mahipalpur ๐Ÿ” Delhi NCR
(8264348440) ๐Ÿ” Call Girls In Mahipalpur ๐Ÿ” Delhi NCR(8264348440) ๐Ÿ” Call Girls In Mahipalpur ๐Ÿ” Delhi NCR
(8264348440) ๐Ÿ” Call Girls In Mahipalpur ๐Ÿ” Delhi NCR
ย 
Call Girls In Connaught Place Delhi โค๏ธ88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi โค๏ธ88604**77959_Russian 100% Genuine Escor...Call Girls In Connaught Place Delhi โค๏ธ88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi โค๏ธ88604**77959_Russian 100% Genuine Escor...
ย 

1.1 course notes inferential statistics

  • 2. Distributions In statistics, when we talk about distributions we usually mean probability distributions. Definition (informal): A distribution is a function that shows the possible values for a variable and how often they occur. Definition (Wikipedia): In probability theory and statistics, a probability distribution is a mathematical function that, stated in simple terms, can be thought of as providing the probabilities of occurrence of different possible outcomes in an experiment. Examples: Normal distribution, Studentโ€™s T distribution, Poisson distribution, Uniform distribution, Binomial distribution Graphical representation It is a common mistake to believe that the distribution is the graph. In fact the distribution is the โ€˜ruleโ€™ that determines how values are positioned in relation to each other. Very often, we use a graph to visualize the data. Since different distributions have a particular graphical representation, statisticians like to plot them. Examples: Uniform distribution Binomial distribution Normal distribution Studentโ€™s T distribution Definition
  • 3. The Normal Distribution The Normal distribution is also known as Gaussian distribution or the Bell curve. It is one of the most common distributions due to the following reasons: โ€ข It approximates a wide variety of random variables โ€ข Distributions of sample means with large enough samples sizes could be approximated to normal โ€ข All computable statistics are elegant โ€ข Heavily used in regression analysis โ€ข Good track record Examples: โ€ข Biology. Most biological measures are normally distributed, such as: height; length of arms, legs, nails; blood pressure; thickness of tree barks, etc. โ€ข IQ tests โ€ข Stock market information ๐‘~(๐œ‡, ๐œŽ2 ) N stands for normal; ~ stands for a distribution; ฮผ is the mean; ๐œŽ2 is the variance.
  • 4. The Normal Distribution 0 ๐ = ๐Ÿ•๐Ÿ’๐Ÿ‘ Origin 0 ๐ = ๐Ÿ’๐Ÿ•๐ŸŽ ๐ˆ = ๐Ÿ๐Ÿ’๐ŸŽ 0 ๐ = ๐Ÿ—๐Ÿ”๐ŸŽ ๐ˆ = ๐Ÿ๐Ÿ’๐ŸŽ ๐ˆ = ๐Ÿ๐Ÿ’๐ŸŽ Controlling for the standard deviation Keeping the standard deviation constant, the graph of a normal distribution with: โ€ข a smaller mean would look in the same way, but be situated to the left (in gray) โ€ข a larger mean would look in the same way, but be situated to the right (in red)
  • 5. The Normal Distribution 0 ๐ = ๐Ÿ•๐Ÿ’๐Ÿ‘ Origin ๐ˆ = ๐Ÿ๐Ÿ’๐ŸŽ 0 0 ๐ = ๐Ÿ•๐Ÿ’๐Ÿ‘ ๐ = ๐Ÿ•๐Ÿ’๐Ÿ‘ ๐ˆ = ๐Ÿ•๐ŸŽ ๐ˆ = ๐Ÿ๐Ÿ๐ŸŽ Controlling for the mean Keeping the mean constant, a normal distribution with: โ€ข a smaller standard deviation would be situated in the same spot, but have a higher peak and thinner tails (in red) โ€ข a larger standard deviation would be situated in the same spot, but have a lower peak and fatter tails (in gray)
  • 6. The Standard Normal Distribution ๐‘~(0,1) The Standard Normal distribution is a particular case of the Normal distribution. It has a mean of 0 and a standard deviation of 1. Every Normal distribution can be โ€˜standardizedโ€™ using the standardization formula: ๐‘ง = ๐‘ฅ โˆ’ ๐œ‡ ๐œŽ A variable following the Standard Normal distribution is denoted with the letter z. Why standardize? Standardization allows us to: โ€ข compare different normally distributed datasets โ€ข detect normality โ€ข detect outliers โ€ข create confidence intervals โ€ข test hypotheses โ€ข perform regression analysis Rationale of the formula for standardization: We want to transform a random variable from ๐‘~ ฮผ, ๐œŽ2 to ๐‘~(0,1). Subtracting the mean from all observations would cause a transformation from ๐‘~ ฮผ, ๐œŽ2 to ๐‘~ 0, ๐œŽ2 , moving the graph to the origin. Subsequently, dividing all observations by the standard deviation would cause a transformation from ๐‘~ 0, ๐œŽ2 to ๐‘~ 0,1 , standardizing the peak and the tails of the graph.
  • 7. The Central Limit Theorem The Central Limit Theorem (CLT) is one of the greatest statistical insights. It states that no matter the underlying distribution of the dataset, the sampling distribution of the means would approximate a normal distribution. Moreover, the mean of the sampling distribution would be equal to the mean of the original distribution and the variance would be n times smaller, where n is the size of the samples. The CLT applies whenever we have a sum or an average of many variables (e.g. sum of rolled numbers when rolling dice). The theorem Why is it useful? Where can we see it? โžข No matter the distribution โžข The distribution of ๐‘ฅ1, ๐‘ฅ2, ๐‘ฅ3, ๐‘ฅ4, โ€ฆ , ๐‘ฅ๐‘˜ would tend to ๐‘~ ฮผ, ๐œŽ2 ๐‘› โžข The more samples, the closer to Normal ( k -> โˆž ) โžข The bigger the samples, the closer to Normal ( n -> โˆž ) The CLT allows us to assume normality for many different variables. That is very useful for confidence intervals, hypothesis testing, and regression analysis. In fact, the Normal distribution is so predominantly observed around us due to the fact that following the CLT, many variables converge to Normal. Since many concepts and events are a sum or an average of different effects, CLT applies and we observe normality all the time. For example, in regression analysis, the dependent variable is explained through the sum of error terms. Click here for a CLT simulator.
  • 8. Estimators and Estimates Estimators Estimates Broadly, an estimator is a mathematical function that approximates a population parameter depending only on sample information. Examples of estimators and the corresponding parameters: Term Estimator Parameter Mean เดฅ ๐’™ ฮผ Variance ๐’”๐Ÿ ๐ˆ๐Ÿ Correlation r ฯ An estimate is the output that you get from the estimator (when you apply the formula). There are two types of estimates: point estimates and confidence interval estimates. Point estimates Confidence intervals A single value. Examples: โ€ข 1 โ€ข 5 โ€ข 122.67 โ€ข 0.32 An interval. Examples: โ€ข ( 1 , 5 ) โ€ข ( 12 , 33) โ€ข ( 221.78 , 745.66) โ€ข ( - 0.71 , 0.11 ) Estimators have two important properties: โ€ข Bias The expected value of an unbiased estimator is the population parameter. The bias in this case is 0. If the expected value of an estimator is (parameter + b), then the bias is b. โ€ข Efficiency The most efficient estimator is the one with the smallest variance. Confidence intervals are much more precise than point estimates. That is why they are preferred when making inferences.
  • 9. Confidence Intervals and the Margin of Error Definition: A confidence interval is an interval within which we are confident (with a certain percentage of confidence) the population parameter will fall. We build the confidence interval around the point estimate. (1-ฮฑ) is the level of confidence. We are (1-ฮฑ)*100% confident that the population parameter will fall in the specified interval. Common alphas are: 0.01, 0.05, 0.1. General formula: [ เดค ๐’™- ME, เดค ๐’™ + ME ] , where ME is the margin of error. ME = reliabilityfactorโˆ— ๐‘ ๐‘ก๐‘Ž๐‘›๐‘‘๐‘Ž๐‘Ÿ๐‘‘ ๐‘‘๐‘’๐‘ฃ๐‘–๐‘Ž๐‘ก๐‘–๐‘œ๐‘› ๐‘ ๐‘Ž๐‘š๐‘๐‘™๐‘’ ๐‘ ๐‘–๐‘ง๐‘’ Interval start Interval end Point estimate ๐’›๐œถ/๐Ÿ โˆ— ๐ˆ ๐’ ๐’•ฯ…,๐œถ/๐Ÿ โˆ— ๐’” ๐’ Term Effect on width of CI (1-ฮฑ) โ†‘ โ†‘ ๐ˆ โ†‘ โ†‘ n โ†‘ โ†“
  • 10. Studentโ€™s T Distribution The Studentโ€™s T distribution is used predominantly for creating confidence intervals and testing hypotheses with normally distributed populations when the sample sizes are small. It is particularly useful when we donโ€™t have enough information or it is too costly to obtain it. All else equal, the Studentโ€™s T distribution has fatter tails than the Normal distribution and a lower peak. This is to reflect the higher level of uncertainty, caused by the small sample size. Studentโ€™s T distribution Normal distribution A random variable following the t-distribution is denoted ๐‘กฯ…,ฮฑ , where ฯ… are the degrees of freedom. We can obtain the studentโ€™s T distribution for a variable with a Normally distributed population using the formula: ๐‘กฯ…,ฮฑ = าง ๐‘ฅโˆ’๐œ‡ ๐‘ / ๐‘›
  • 11. Formulas for Confidence Intervals ๐’›๐œถ/๐Ÿ โˆ— ๐ˆ ๐’ ๐’•๐’….๐’‡.,๐œถ/๐Ÿ โˆ— ๐’” ๐’ # populations Population variance Samples Statistic Variance Formula One known - z ๐œŽ2 เดค x ยฑ z ฮค ฮฑ 2 ฯƒ n One unknown - t ๐‘ 2 เดค x ยฑ ๐‘ก๐‘›โˆ’1, ฮค ฮฑ 2 s n Two - dependent t ๐‘ ๐‘‘๐‘–๐‘“๐‘“๐‘’๐‘Ÿ๐‘’๐‘›๐‘๐‘’ 2 เดค d ยฑ ๐‘ก๐‘›โˆ’1, ฮค ฮฑ 2 ๐‘ ๐‘‘ n Two Known independent z ฯƒ๐‘ฅ 2 , ฯƒ๐‘ฆ 2 ( าง ๐‘ฅ โˆ’ เดค ๐‘ฆ) ยฑ ๐‘ง ฮค ๐›ผ 2 ฯƒ๐‘ฅ 2 ๐‘›๐‘ฅ + ฯƒ๐‘ฆ 2 ๐‘›๐‘ฆ Two unknown, assumed equal independent t ๐‘ ๐‘ 2 = ๐‘›๐‘ฅ โˆ’ 1 ๐‘ ๐‘ฅ 2 + ๐‘›๐‘ฆ โˆ’ 1 ๐‘ ๐‘ฆ 2 ๐‘›๐‘ฅ + ๐‘›๐‘ฆ โˆ’ 2 ( าง ๐‘ฅ โˆ’ เดค ๐‘ฆ) ยฑ ๐‘ก ฮค ๐‘›๐‘ฅ+๐‘›๐‘ฆโˆ’2,๐›ผ 2 ๐‘ ๐‘ 2 ๐‘›๐‘ฅ + ๐‘ ๐‘ 2 ๐‘›๐‘ฆ Two unknown, assumed different independent t ๐‘ ๐‘ฅ 2 , ๐‘ ๐‘ฆ 2 ( าง ๐‘ฅ โˆ’ เดค ๐‘ฆ) ยฑ ๐‘ก ฮค ฯ…,๐›ผ 2 ๐‘ ๐‘ฅ 2 ๐‘›๐‘ฅ + ๐‘ ๐‘ฆ 2 ๐‘›๐‘ฆ