Hypothesis Testing - I
Definition
• A hypothesis test is a statistical test used to determine whether the sample data provide enough evidence to draw a conclusion about the entire population.
• Two types of hypotheses:
1. Null Hypothesis (Ho): the hypothesis that any observed variation in a sample is simply due to random chance. In other words, "there is no significant difference between the sample and the population, and any observed difference is due to randomness or experimental error."
Rupak Roy
2. Alternative Hypothesis (Ha): the hypothesis that is contrary to the null hypothesis.
Example:
If I replace the battery in my car, will my car give better mileage?
Null Hypothesis (Ho): there is no difference in mileage even if we replace the battery of the car.
Alternative Hypothesis (Ha): there is a difference in mileage if we replace the battery of the car.
Significance level, i.e. alpha (a)
The significance level is the threshold against which the p-value is compared. If the p-value is less than the significance level, e.g. 5% (0.05), we conclude that there is a difference between the sample and the population. In other words, we reject the null hypothesis.
The most commonly used significance level is 0.05; however, we can change it depending on our needs.
Example
• If p-value > significance level (a), we fail to reject (loosely, "accept") the null hypothesis.
• Else, if p-value < significance level (a), we reject the null hypothesis.
Another way of saying that we have rejected the null hypothesis is that the result is statistically significant.
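The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and the returned strings are assumptions, not part of the original slides.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject Ho only when p-value < alpha."""
    if p_value < alpha:
        return "reject Ho (statistically significant)"
    return "fail to reject Ho"


print(decide(0.09))  # p-value above the default 5% level
print(decide(0.01))  # p-value below the default 5% level
```

The default of 0.05 mirrors the standard value mentioned on the slide; pass a different `alpha` to use another significance level.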
Stages of Hypothesis
1. Select
Null Hypothesis (Ho): there is no difference in mileage if we replace the battery of the car.
Alternative Hypothesis (Ha): there is a difference in mileage if we replace the battery of the car.
2. Test Distribution: select an appropriate distribution, such as norm.dist, binom.dist or the t-distribution, with significance level alpha (a) = 5%, i.e. 0.05.
3. P-value (example: p = 1 - norm.dist(………) = 0.09)
4. Result: since 0.09 > 0.05, we fail to reject the null hypothesis, i.e. we retain the null hypothesis and discard the alternative hypothesis. We conclude that there is not enough evidence of a difference in mileage even if we replace the battery of the car.
Example
A food production unit produces a particular product with an average weight of 10 lbs. and a standard deviation of 0.35 lbs. A random sample of 30 units shows an increase in average weight of 2 lbs., i.e. 12 lbs. Are there any issues in the production process?
Significance level (a) = 0.05
Null Hypothesis (H0): There are no issues in the production process; what we found in the sample is due to random chance variation.
Alternative Hypothesis (H1): There are some issues in the production process that are leading to the increase in weight per unit.
Test Distribution: normal distribution
Example: continued
In Excel,
normal distribution = norm.dist(X, mean, standard deviation, cumulative)
where
X = 12, mean = 10, standard deviation = 0.35 and cumulative = TRUE/FALSE
Therefore,
P-value = 1 - norm.dist(X, mean, standard deviation, TRUE)
(because we need the probability of a weight of 12 lbs. or more, i.e. the upper tail)
= 1 - norm.dist(12, 10, 0.35, TRUE)
= 5.5089E-09, i.e. less than 0.05
Note: because 12 lbs. is the average of 30 units, the spread to use would strictly be the standard error 0.35/SQRT(30); the P-value is then even smaller, so the conclusion is the same.
Since the P-value is smaller than the significance level (a), we reject H0, i.e. we accept the alternative hypothesis and discard the null hypothesis.
In other words, we conclude that there are some issues in the production process that lead to the increase in weight per unit of production.
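For readers without Excel, the same calculation can be sketched with Python's standard library. `statistics.NormalDist` plays the role of norm.dist here; the variable names are illustrative, and the second P-value (using the standard error of the mean, 0.35/sqrt(30)) is an addition for comparison rather than something computed on the slide.

```python
from math import sqrt
from statistics import NormalDist

mean, sd, x, n = 10.0, 0.35, 12.0, 30
alpha = 0.05

# Upper-tail probability, mirroring =1-NORM.DIST(12,10,0.35,TRUE)
p_value = 1 - NormalDist(mu=mean, sigma=sd).cdf(x)

# Same tail, but with the standard error of the mean sd / sqrt(n) as the spread
p_value_se = 1 - NormalDist(mu=mean, sigma=sd / sqrt(n)).cdf(x)

print(f"p-value (sd = 0.35):      {p_value:.4e}")
print(f"p-value (standard error): {p_value_se:.4e}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Either way the P-value is far below 0.05, so the decision to reject H0 does not depend on which spread is used.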
Terminology
Confidence level: (1 - significance level). It refers to how confident you are about your conclusion.
So, if the null hypothesis is rejected at a 5% level of significance, you are 95% (1 - 0.05) confident about your conclusion.
Likewise, if the null hypothesis is rejected at a 1% level of significance, you are 99% (1 - 0.01) confident about your conclusion.
Central Limit Theorem (CLT)
• The central limit theorem says that, irrespective of the underlying population distribution, when you pick multiple random samples from the population, each with a sample size of at least 30, the distribution of the sample averages will be approximately normal even if the underlying population is not normal.
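A small simulation can make the theorem concrete. The sketch below is an illustration only: the exponential population, the seed and the number of samples are assumptions chosen for the demo. It draws many samples of size 30 from a heavily skewed population and checks that the sample averages cluster around the population mean with a spread close to sd/sqrt(30).

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 30    # size of each sample
NUM_SAMPLES = 5000  # how many samples we draw

# Exponential population with rate 1: mean 1, standard deviation 1, heavily skewed
sample_means = [
    mean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

center = mean(sample_means)   # should be close to the population mean, 1
spread = stdev(sample_means)  # should be close to 1 / sqrt(30)

# Share of sample averages within one standard error of the population mean;
# for a normal distribution this share is roughly 68%
std_error = 1 / sqrt(SAMPLE_SIZE)
within_1se = sum(abs(m - 1.0) <= std_error for m in sample_means) / NUM_SAMPLES

print(f"mean of sample averages: {center:.3f}")
print(f"sd of sample averages:   {spread:.3f} (theory: {std_error:.3f})")
print(f"share within 1 standard error: {within_1se:.1%}")
```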
Hypothesis testing when sample size is low
• Remember: the central limit theorem says that if the sample size is sufficiently large, the distribution of sample averages will be normal irrespective of the underlying population distribution. Otherwise, it will follow a t-distribution.
• So if the sample size is less than 30, we use t-dist to calculate the P-value.
• The t-distribution is also a continuous probability distribution.
• As the diagram on the slide shows, when the sample size increases to 30, the t-distribution approximates a normal distribution.
T-distance
In order to use the t-distribution we need the t-distance, i.e. the test statistic:
$$t = \frac{\bar{x} - \mu}{S / \sqrt{N}}$$
where $\bar{x}$ is the sample mean, $\mu$ is the population mean, $S$ is the standard deviation and $N$ is the sample size.
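As a minimal sketch, the t-distance can be wrapped in a small Python function. The function name and the numbers in the usage line are hypothetical, chosen only so the arithmetic is easy to follow.

```python
from math import sqrt


def t_statistic(sample_mean: float, population_mean: float,
                std_dev: float, sample_size: int) -> float:
    """t-distance: (sample mean - population mean) / (S / sqrt(N))."""
    if sample_size < 2:
        raise ValueError("sample_size must be at least 2")
    standard_error = std_dev / sqrt(sample_size)
    return (sample_mean - population_mean) / standard_error


# Hypothetical values: standard error = 4 / sqrt(16) = 1, so t = (52 - 50) / 1
t = t_statistic(sample_mean=52.0, population_mean=50.0, std_dev=4.0, sample_size=16)
print(t)
```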
Steps for T-distribution
• Select the null hypothesis (H0) and the alternative hypothesis (H1).
• Significance level: 5%
• Test distribution: t-distribution (calculate P-value)
• Conclusion: reject the null hypothesis or fail to reject the null hypothesis.
Example
• The seller of a manufacturing company claims that an average fluorescent light lasts for 320 days. The inspector randomly selects 10 fluorescent lights for inspection. The sampled lights last an average of 280 days with a standard deviation of 95 days. What is the likelihood that a randomly selected sample of fluorescent lights would have an average life of no more than 280 days?
Here, sample mean = 280
population mean = 320
sample std. deviation = 95
sample size = 10
• In Excel:
1) Calculate the t-distance
t = (280-320)/(95/SQRT(10))
Alternatively, (280-320)/(95/(10^0.5))
t = -1.331
2) Use the t-distance value in Excel with the following formula
= t.dist(t-distance, degrees of freedom, TRUE)
= t.dist(-1.331, 9, TRUE) = 0.10788 = 11%
Therefore there is an 11% likelihood that the average life of 10 randomly selected bulbs is less than 280 days.
ALTERNATIVELY,
= 1-(t.dist(t-distance, degrees of freedom, TRUE))
= 1-(t.dist(-1.331, 9, TRUE)) = 1 - 0.1078 = 0.89 = 89%
Therefore there is an 89% likelihood that the average life of 10 randomly selected bulbs is more than 280 days.
Note:
Df = degrees of freedom = N - 1 (here in the example N (sample size) = 10, so Df = 9)
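The fluorescent-light example can be reproduced outside Excel as well. Python's standard library has no t-distribution, so the sketch below integrates the t density numerically with Simpson's rule as a stand-in for t.dist(..., TRUE); the helper names `t_pdf` and `t_cdf` are assumptions of this sketch, and a statistics package would normally be used instead.

```python
from math import gamma, pi, sqrt


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t: float, df: int, steps: int = 2000) -> float:
    """P(T <= t), using Simpson's rule on [0, |t|] and symmetry around 0."""
    if t == 0:
        return 0.5
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 + area if t > 0 else 0.5 - area


n = 10
t_distance = (280 - 320) / (95 / sqrt(n))  # about -1.331
df = n - 1                                 # degrees of freedom = N - 1

p_below = t_cdf(t_distance, df)  # like =T.DIST(t, 9, TRUE)
p_above = 1 - p_below            # like =1-T.DIST(t, 9, TRUE)

print(f"t-distance = {t_distance:.3f}")
print(f"P(average life < 280 days) = {p_below:.4f}")
print(f"P(average life > 280 days) = {p_above:.4f}")
```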
• Note:
Why do we sometimes use
1 - normal.distribution
1 - t.distribution
In any distribution function, for example the normal distribution
= norm.dist(….cumulative)
the cumulative argument is TRUE or FALSE.
TRUE returns the cumulative probability (<) and FALSE returns the point probability (the density, for a continuous distribution).
There is no option for >, so to get it we have to enter manually
1 - appropriate.distribution
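The same three quantities can be seen side by side in a short Python sketch. The standard normal distribution and the point x = 1 are arbitrary choices for illustration; `NormalDist.cdf` corresponds to cumulative = TRUE and `NormalDist.pdf` to cumulative = FALSE.

```python
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)  # standard normal, chosen only for illustration
x = 1.0

p_less = dist.cdf(x)         # like NORM.DIST(x, 0, 1, TRUE):  P(X < x)
density = dist.pdf(x)        # like NORM.DIST(x, 0, 1, FALSE): height of the curve at x
p_greater = 1 - dist.cdf(x)  # no built-in ">" option, so take the complement

print(f"P(X < {x}) = {p_less:.4f}")
print(f"density at {x} = {density:.4f}")
print(f"P(X > {x}) = {p_greater:.4f}")
```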
What if population Std. deviation is not available
• If the population standard deviation is not known, the sample standard deviation can be substituted for the population standard deviation.
• Therefore, the standard error = sample standard deviation / SQRT(sample size)
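When only raw sample data are available, the sample standard deviation can be estimated from the data first. The sketch below uses a made-up list of measurements (hypothetical, not from the slides) and `statistics.stdev`, which divides by N - 1.

```python
from math import sqrt
from statistics import stdev

# Hypothetical measurements, used only to illustrate the calculation
sample = [2, 4, 4, 4, 5, 5, 7, 9]

sample_sd = stdev(sample)                      # sample standard deviation (N - 1 in the denominator)
standard_error = sample_sd / sqrt(len(sample))  # spread of the sample average

print(f"sample sd = {sample_sd:.4f}")
print(f"standard error = {standard_error:.4f}")
```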
What if population distribution is not normal?
• We are using the normal distribution to calculate the p-value for hypothesis testing, but not every hypothesis test has to use a normal distribution.
• If we already know the type of distribution, then it is better to use the right distribution directly for hypothesis testing.
• Remember the example from our previous slide "Stages of Hypothesis", where in point number 2 we mentioned that we can choose any appropriate type of distribution.
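As an illustration of testing with a different distribution, here is a sketch based on the binomial distribution (the counterpart of binom.dist). The coin-flip scenario, 60 heads in 100 flips of a supposedly fair coin, is a hypothetical example and not taken from the slides.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# H0: the coin is fair (p = 0.5); observed 60 heads in 100 flips (hypothetical data)
alpha = 0.05
p_value = binom_upper_tail(60, 100, 0.5)

print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```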
Recap:
"Stages of Hypothesis"
1. Select
Null Hypothesis (Ho): there is no difference in mileage if we replace the battery in the car.
Alternative Hypothesis (Ha): there is a difference in mileage if we replace the battery in the car.
2. Test Distribution: select an appropriate distribution, such as norm.dist or binom.dist, with significance level alpha (a) = 5%.
3. P-value (example: p = 1 - norm.dist(………) = 0.09)
4. Result: since 0.09 > 0.05, we fail to reject the null hypothesis, i.e. we retain the null hypothesis and discard the alternative hypothesis. We conclude that there is not enough evidence of a difference in mileage even if we replace the battery of the car.
Next
Directional hypothesis tests, like the one-tailed test, used if you have strong reason to believe in your hypothesis.
And more.
• To be continued.

VVVIP Call Girls In Greater Kailash ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...VVVIP Call Girls In Greater Kailash ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Greater Kailash ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
 
Regression analysis: Simple Linear Regression Multiple Linear Regression
Regression analysis:  Simple Linear Regression Multiple Linear RegressionRegression analysis:  Simple Linear Regression Multiple Linear Regression
Regression analysis: Simple Linear Regression Multiple Linear Regression
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.
 
Event mailer assignment progress report .pdf
Event mailer assignment progress report .pdfEvent mailer assignment progress report .pdf
Event mailer assignment progress report .pdf
 
Insurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usageInsurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usage
 

Hypothesis Testing with ease

  • 2. Definition: A hypothesis test is a statistical test used to determine whether the sample data provide enough evidence to draw a conclusion about the entire population. A test involves two hypotheses: 1. Null Hypothesis (H0): the hypothesis that any observed variation in a sample is simply due to random chance; in other words, “the hypothesis that there is no significant difference between the sample and the population, and any observed difference is due to randomness or experimental error.” Rupak Roy
  • 3. 2. Alternative Hypothesis (Ha): the hypothesis that is contrary to the null hypothesis. Example: If I replace the battery in my car, will my car give better mileage? Null Hypothesis (H0): there is no difference in mileage even if we replace the battery of the car. Alternative Hypothesis (Ha): there is a difference in mileage if we replace the battery of the car.
  • 4. Significance level (alpha, α): The significance level is the threshold used for rejecting the null hypothesis. If the p-value is less than the significance level, typically 5% (0.05), we conclude that there is a difference between the sample and the population; in other words, we reject the null hypothesis. The most common threshold is 0.05, but it can be changed depending on our needs.
  • 5. Example: If p-value > significance level (α), we fail to reject the null hypothesis. Else, if p-value < significance level (α), we reject the null hypothesis. Rejecting the null hypothesis is also described as a statistically significant result.
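The decision rule on slide 5 can be sketched as a small Python function. The name `decide` and its return strings are illustrative assumptions, not part of the original slides; the default threshold of 0.05 is the one the slides use.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the test decision for a p-value at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "reject H0"          # statistically significant result
    return "fail to reject H0"      # not enough evidence against H0


print(decide(0.09))  # p-value from the battery example on the next slide
print(decide(0.01))
```

Note that the function never returns "accept H0": a large p-value only means the data do not contradict the null hypothesis strongly enough.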
  • 6. Stages of Hypothesis: 1. Select the hypotheses. Null Hypothesis (H0): there is no difference in mileage if we replace the battery of the car. Alternative Hypothesis (Ha): there is a difference in mileage if we replace the battery of the car. 2. Test distribution: select an appropriate distribution such as norm.dist, binom.dist or the t-distribution, with significance level alpha (α) = 5%, i.e. 0.05. 3. P-value (example: p = 1 - norm.dist(………) = 0.09). 4. Result: since 0.09 > 0.05, we fail to reject the null hypothesis. We conclude that there is not enough evidence of a difference in mileage when we replace the battery of the car.
  • 7. Example: A food production unit produces a particular product with an average weight of 10 lbs. and a standard deviation of 0.35 lbs. A random sample of 30 units showed an increase in average weight of 2 lbs., i.e. 12 lbs. Are there any issues in the production process? Significance level (α) = 0.05. Null Hypothesis (H0): there are no issues in the production process; what we found in the sample is due to random chance variation. Alternative Hypothesis (H1): there are some issues in the production process that lead to the increase in weight per unit. Test distribution: normal distribution.
  • 8. Example (continued): In Excel, normal distribution = norm.dist(X, mean, standard deviation, cumulative), where X = 12, mean = 10, standard deviation = 0.35 and cumulative = TRUE/FALSE. We use 1 - norm.dist because we need the probability of a weight of 12 lbs. or more: = 1 - norm.dist(12, 10, 0.35, TRUE) = 5.5089E-09, i.e. less than 0.05. Since the p-value is smaller than the significance level (α), we reject H0 in favor of the alternative hypothesis H1. In other words, we conclude that there are some issues in the production process that lead to the increase in weight per unit of production. Note: because 12 lbs. is the average of 30 units, the standard deviation of that average is 0.35/√30 ≈ 0.064 lbs., which makes the p-value even smaller; the conclusion is unchanged.
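As a rough cross-check of the Excel formula on slide 8, the same upper-tail probability can be computed in Python with the standard library's `statistics.NormalDist`. This is a sketch that reuses the slide's inputs (mean 10, standard deviation 0.35, observed 12); the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 10.0, 0.35      # weight and standard deviation from the slide
observed = 12.0            # average weight seen in the sample
alpha = 0.05               # significance level

# Equivalent of Excel's 1 - NORM.DIST(12, 10, 0.35, TRUE):
# the probability of a value of 12 or more under H0.
p_value = 1 - NormalDist(mu=mean, sigma=sd).cdf(observed)
reject_h0 = p_value < alpha

print(f"p-value = {p_value:.4e}, reject H0: {reject_h0}")
```

The result should agree with the slide's 5.5089E-09 up to floating-point rounding.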
  • 9. Terminology: Confidence level: (1 - significance level); it indicates how confident you can be in your conclusion. If the null hypothesis is rejected at a 5% level of significance, the confidence level is 95% (1 - 0.05): a true null hypothesis would be wrongly rejected at most 5% of the time. Likewise, if the null hypothesis is rejected at a 1% level of significance, the confidence level is 99% (1 - 0.01).
  • 10. Central Limit Theorem (CLT): The central limit theorem says that, irrespective of the underlying population distribution, when you draw multiple random samples from a population with a sufficiently large sample size (30 or more is a common rule of thumb), the distribution of the sample averages will be approximately normal, even if the underlying population is not normal.
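A small simulation can illustrate the central limit theorem from slide 10. The sketch below assumes an exponential population (strongly skewed, mean 1, standard deviation 1), a choice made only for illustration; the seed and the number of samples are likewise arbitrary.

```python
import math
import random
from statistics import mean, stdev

random.seed(0)            # fixed seed so the run is repeatable
SAMPLE_SIZE = 30          # sample size from the slide's rule of thumb
NUM_SAMPLES = 5000        # number of repeated samples (arbitrary)

# Draw many samples from the skewed population and record each sample's average.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

center = mean(sample_means)                   # should be close to 1
spread = stdev(sample_means)                  # should be close to 1 / sqrt(30)
expected_spread = 1 / math.sqrt(SAMPLE_SIZE)

# For a roughly normal shape, about 95% of averages lie within 1.96 spreads.
within = sum(abs(m - 1.0) <= 1.96 * expected_spread for m in sample_means)
coverage = within / NUM_SAMPLES

print(f"center={center:.3f} spread={spread:.3f} "
      f"expected={expected_spread:.3f} coverage={coverage:.3f}")
```

Even though individual exponential values are far from normal, the averages cluster symmetrically around the population mean with a spread near σ/√n.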
  • 11. Hypothesis testing when sample size is low: Remember that the central limit theorem says that if the sample size is sufficiently large, the distribution of sample averages will be approximately normal irrespective of the underlying population distribution. When the sample size is small, the sample averages are instead modeled with the t-distribution. So, to compute the probability when the sample size is less than 30, we use t.dist to calculate the p-value. The t-distribution is also a continuous probability distribution. As the slide's diagram shows, when the sample size increases toward 30, the t-distribution approximates a normal distribution.
  • 12. T-distance: To use the t-distribution we need the t-distance, i.e. the test statistic $t = \dfrac{\bar{x} - \mu}{S / \sqrt{N}}$, where $\bar{x}$ is the sample mean, $\mu$ is the population mean, $S$ is the standard deviation and $N$ is the sample size.
  • 13. Steps for T-distribution: Select the null hypothesis (H0) and the alternative hypothesis (H1). Significance level: 5%. Test distribution: t-distribution (calculate the p-value). Conclusion: reject the null hypothesis or fail to reject the null hypothesis.
  • 14. Example: The seller for a manufacturing company claims that an average fluorescent light lasts 320 days. The inspector randomly selects 10 fluorescent lights for inspection. The sampled lights last an average of 280 days with a standard deviation of 95 days. What is the likelihood that a randomly selected sample of fluorescent lights would have an average life of no more than 280 days? Here, sample mean = 280, population mean = 320, sample std. deviation = 95, sample size = 10.
  • 15. In Excel: 1) Calculate the t-distance: t = (280-320)/(95/√10), or equivalently (280-320)/(95/(10^0.5)), so t = -1.331. 2) Use the t-distance in Excel with the following formula: = t.dist(t-distance, degrees of freedom, TRUE) = t.dist(-1.331, 9, TRUE) = 0.10788 ≈ 11%. Therefore there is an 11% likelihood that the average life of 10 randomly selected bulbs is no more than 280 days. Alternatively, = 1 - t.dist(t-distance, degrees of freedom, TRUE) = 1 - t.dist(-1.331, 9, TRUE) = 1 - 0.1078 = 0.89 = 89%. Therefore there is an 89% likelihood that the average life of 10 randomly selected bulbs is more than 280 days. Note: Df = degrees of freedom = N - 1 (here the sample size N = 10, so Df = 9).
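The fluorescent-light calculation on slides 14 and 15 can be reproduced in Python as a sketch. The standard library has no t-distribution, so the helper functions `t_pdf` and `t_cdf` below are illustrative assumptions: they integrate the Student's t density numerically with Simpson's rule rather than calling Excel's T.DIST.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry around 0."""
    upper = abs(x)
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                   # area between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area


sample_mean, claimed_mean = 280.0, 320.0
s, n = 95.0, 10                            # standard deviation and sample size
t_stat = (sample_mean - claimed_mean) / (s / math.sqrt(n))
df = n - 1                                 # degrees of freedom

p_below = t_cdf(t_stat, df)                # like T.DIST(t, 9, TRUE)
p_above = 1 - p_below

print(f"t = {t_stat:.3f}, P(<= 280) = {p_below:.4f}, P(> 280) = {p_above:.4f}")
```

The printed values should land close to the slide's t = -1.331, 11% and 89%; small differences come from the slide rounding t before passing it to T.DIST.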
  • 16. Note: Why do we sometimes use 1 - norm.dist or 1 - t.dist? In any of these functions, for example norm.dist(…, cumulative), the cumulative argument is TRUE or FALSE. TRUE returns the cumulative probability, i.e. the probability of a value less than or equal to X. FALSE returns the density at X (or the point probability for a discrete distribution such as binom.dist). There is no argument for "greater than", so for that we compute 1 - the appropriate cumulative distribution ourselves.
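The distinction on slide 16 between cumulative, point and "greater than" values can be sketched with Python's `statistics.NormalDist`, whose `cdf` and `pdf` methods play the roles of cumulative = TRUE and cumulative = FALSE. The standard normal distribution and the value 1.96 are chosen only for illustration.

```python
from statistics import NormalDist

z = NormalDist()          # standard normal: mean 0, standard deviation 1
x = 1.96

p_less = z.cdf(x)         # like NORM.DIST(x, 0, 1, TRUE): P(X <= x)
density = z.pdf(x)        # like NORM.DIST(x, 0, 1, FALSE): height of the curve at x
p_greater = 1 - p_less    # no built-in "greater than", so take the complement

print(f"P(X <= {x}) = {p_less:.4f}")
print(f"density at {x} = {density:.4f}")
print(f"P(X > {x}) = {p_greater:.4f}")
```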
  • 17. What if the population standard deviation is not available? If the population standard deviation is not known, the sample standard deviation can be substituted for it. The standard error is then: sample standard deviation / √(sample size).
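As a minimal sketch of slide 17, the sample standard deviation and the resulting standard error can be computed from raw measurements. The data values below are made up for illustration and do not come from the slides.

```python
import math
from statistics import stdev

# Hypothetical measurements, used only to illustrate the calculation.
data = [2, 4, 4, 4, 5, 5, 7, 9]

sample_sd = stdev(data)                        # sample standard deviation (divides by n - 1)
standard_error = sample_sd / math.sqrt(len(data))

print(f"sample sd = {sample_sd:.4f}, standard error = {standard_error:.4f}")
```

`statistics.stdev` uses the n - 1 denominator, which is the usual choice when the sample stands in for an unknown population standard deviation.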
  • 18. What if the population distribution is not normal? We have been using the normal distribution to calculate the p-value for hypothesis testing, but a hypothesis test does not have to use a normal distribution. If we already know the type of distribution, it is better to use that distribution directly. Remember the earlier slide “Stages of Hypothesis”, where point number 2 mentions that we can choose any appropriate type of distribution.
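When the data are counts of successes, the binomial distribution mentioned on the "Stages of Hypothesis" slide can be used directly. The sketch below assumes a hypothetical coin-flip scenario (20 flips, 15 heads, H0: the coin is fair) that is not in the original slides, and computes the upper-tail p-value with `math.comb`.

```python
import math


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summing exact point probabilities."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical scenario: 15 heads in 20 flips of a coin assumed fair under H0.
n_flips, heads, alpha = 20, 15, 0.05
p_value = binom_upper_tail(heads, n_flips, 0.5)
reject_h0 = p_value < alpha

print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

This mirrors the Excel pattern 1 - binom.dist(k - 1, n, p, TRUE), but sums the upper tail explicitly instead of taking a complement.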
  • 19. Recap: “Stages of Hypothesis” 1. Select the hypotheses. Null Hypothesis (H0): there is no difference in mileage if we replace the battery in the car. Alternative Hypothesis (Ha): there is a difference in mileage if we replace the battery in the car. 2. Test distribution: select an appropriate distribution such as norm.dist or binom.dist, with significance level alpha (α) = 5%. 3. P-value (example: p = 1 - norm.dist(………) = 0.09). 4. Result: since 0.09 > 0.05, we fail to reject the null hypothesis. We conclude that there is not enough evidence of a difference in mileage when we replace the battery of the car.
  • 20. Next: Directional hypothesis tests, such as the one-tailed test, used when you have strong reason to expect an effect in a particular direction. And more.
  • 21. To be continued.