Lecture 15 - Hypothesis Testing
2. Study Outline
Sampling Distributions and the Central Limit Theorem
◦ populations and samples
◦ sampling distribution of the mean (std dev known)
◦ sampling distribution of the mean (std dev unknown)
Point Estimate (PE) and Maximum Error in Estimation E
Maximum Likelihood Estimate (MLE)
Confidence Interval Estimation (CI)
Hypothesis Testing
Fall 2021 DR. MAHA A. HASSANEIN 2
3. Recall: Confidence Interval
For a large sample, the confidence interval for 𝜇 (𝜎 known) is the z-interval:
$$\bar{X} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
For a small sample from a Normal population, the confidence interval for 𝜇 (𝜎 unknown) is the t-interval:
$$\bar{X} - t_{\alpha/2} \cdot \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2} \cdot \frac{S}{\sqrt{n}}$$
This gives (1 − 𝛼)100% confidence that the interval contains the population mean 𝜇.
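The two intervals can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function names and the sample numbers are assumptions made for the example, not part of the lecture. The standard library has no t-distribution, so the value of t_{α/2} is passed in as read from a t-table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf=0.95):
    """z-interval for the mean when sigma is known."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def t_interval(xbar, s, n, t_crit):
    """t-interval for the mean when sigma is unknown.

    t_crit is t_{alpha/2} with n - 1 degrees of freedom, taken from a t-table.
    """
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical data: xbar = 50, sigma = 10, n = 100, 95% confidence.
z_lo, z_hi = z_interval(xbar=50.0, sigma=10.0, n=100, conf=0.95)
print(f"z-interval: ({z_lo:.2f}, {z_hi:.2f})")

# Hypothetical data: n = 20, 99% confidence, t_{0.005} with 19 df = 2.861.
t_lo, t_hi = t_interval(xbar=10.63, s=2.48, n=20, t_crit=2.861)
print(f"t-interval: ({t_lo:.2f}, {t_hi:.2f})")
```

Passing the critical value explicitly keeps the sketch dependency-free; a statistics package with a t-distribution could supply it instead.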
4. Hypothesis Testing
We start with a conjecture, or postulate, about a system
◦ For example:
◦ Drinking coffee increases the risk of cancer.
◦ There is a difference in accuracy between two measuring devices.
◦ Computer sales and temperature are independent variables.
◦ Blood type is independent of eye color.
We put the conjecture in the form of a statistical hypothesis.
We use the sample data as evidence to either reject or fail to reject the hypothesis.
This is a data-based decision procedure.
5. Hypothesis Testing
Definition
A statistical Hypothesis is an assertion or conjecture concerning one or
more populations.
In other words, a hypothesis is a claim that we want to test.
6. Illustrative Example
The proportion of defective parts in a lot is believed to be p = 0.1.
A random sample of size n = 100 is tested for defective parts; the proportion of defective parts in the sample is found to be p̂ = 0.12.
Does this sample lead us to reject the old belief that p = 0.1? Does it suggest p > 0.1, or p < 0.1?
How certain are you?
Suppose another random sample of size 100 has p̂ = 0.2.
Does this sample lead us to reject the old belief that p = 0.1? How certain are you?
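One way to put a number on "how certain" is a large-sample z statistic for a proportion, z = (p̂ − p0) / sqrt(p0(1 − p0)/n). The slide does not name a method, so the normal approximation, the function name, and the upper-tail probability below are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(p_hat, p0, n):
    """Large-sample z statistic for a sample proportion under H0: p = p0."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


p0, n = 0.10, 100
for p_hat in (0.12, 0.20):
    z = proportion_z(p_hat, p0, n)
    # Probability of a sample proportion at least this large if p really is p0.
    upper_tail = 1 - NormalDist().cdf(z)
    print(f"p_hat={p_hat:.2f}  z={z:.2f}  P(Z >= z)={upper_tail:.4f}")
```

With these numbers the first sample gives z of about 0.67 and the second about 3.33, so only the second sample is far from what p = 0.1 would typically produce.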
7. Types Of Hypothesis
Null hypothesis, denoted by 𝐻0
Alternative hypothesis, denoted by 𝐻1 (or 𝐻𝑎)
Definitions.
𝐻1: the question to be answered; the claim we wish to establish
“Research question”
𝐻0: the logical complement of 𝐻1; the claim we try to reject
“Default” or “current belief”
8. Illustrative Example
In a factory, it is established that the mean weight of a product is 5 gm.
A new supervisor claims that the factory no longer produces this product with a mean weight of 5 gm.
Random samples are tested, and the average weight of the product is recorded.
We define the hypotheses as follows:
𝑯𝟎 : 𝜇 = 5 and 𝑯𝟏 : 𝜇 ≠ 5
The question:
Using the samples, do we reject 𝑯𝟎 and accept 𝑯𝟏?
Otherwise, we fail to reject 𝑯𝟎.
9. Decision of HT
Reject the null hypothesis 𝑯𝟎 in favour of 𝑯𝟏: there is sufficient evidence in the data.
Fail to reject 𝑯𝟎: there is insufficient evidence in the data.
Type I error: rejecting 𝑯𝟎 when 𝑯𝟎 is true; its probability is denoted by 𝛼.
Type II error: failing to reject 𝑯𝟎 when 𝑯𝟏 is true; its probability is denoted by 𝛽.
Decision | 𝑯𝟎 is true | 𝑯𝟎 is false
Do not reject 𝑯𝟎 | Correct decision | Type II error
Reject 𝑯𝟎 | Type I error | Correct decision
10. Step 1 : Formulate 𝑯𝟎 and 𝑯𝟏
One-sided alternative Hypothesis
𝑯𝟎 : 𝜇 = 𝜇0 and 𝑯𝟏 : 𝜇 > 𝜇0
One-sided alternative Hypothesis
𝑯𝟎 : 𝜇 = 𝜇0 and 𝑯𝟏 : 𝜇 < 𝜇0
Two-sided alternative Hypothesis
𝑯𝟎 : 𝜇 = 𝜇0 and 𝑯𝟏 : 𝜇 ≠ 𝜇0
11. One-Tailed vs. Two-Tailed Tests
One-Tailed Hypothesis Test
Values of the test statistic leading to rejection of H0 fall in one tail of the sampling distribution curve.
Ex: H0: µ = 29 & Ha: µ > 29
Ex: H0: µ = 29 & Ha: µ < 29
Two-Tailed Hypothesis Test
Values of the test statistic leading to rejection of H0 fall in both tails of the sampling distribution curve.
Ex: H0: µ = 29 & Ha: µ ≠ 29
12. Step 2:
Specify the Level of Significance
Denoted by 𝛼
α = probability of making a Type I error (also called the level of significance)
The researcher decides the maximum acceptable error (α). Traditionally α = 5% (95% confidence) or 1% (99% confidence).
13. Step 3 - Rejection Region
Based on the sampling distribution of an appropriate statistic, we construct a criterion for testing the null hypothesis against the given alternative at level of significance α.
For the z-test: compute zα (one-tailed) or zα/2 (two-tailed), known as 𝑧𝑐𝑟𝑖𝑡𝑖𝑐𝑎𝑙
For the t-test: compute tα (one-tailed) or tα/2 (two-tailed), known as 𝑡𝑐𝑟𝑖𝑡𝑖𝑐𝑎𝑙
14. Steps 4-5
4. We calculate from the data the value of the test statistic on which the decision is to be based.
The test statistic
For the z-test: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
For the t-test: $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
5. We decide whether to reject the null hypothesis or fail to reject it, based on the critical region (rejection region).
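Steps 3 to 5 for the z-test can be sketched as a single function. This is an illustrative sketch, not the lecture's own code: the function name, the `alternative` labels, and the sample numbers are assumptions, and the critical value comes from the standard normal distribution in Python's standard library.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sd, n, alpha=0.05, alternative="two-sided"):
    """Return (z, reject_h0) for a z-test of H0: mu = mu0.

    alternative is "greater" (Ha: mu > mu0), "less" (Ha: mu < mu0),
    or "two-sided" (Ha: mu != mu0).
    """
    z = (xbar - mu0) / (sd / sqrt(n))         # Step 4: test statistic
    std_normal = NormalDist()
    if alternative == "greater":              # rejection region: right tail
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":               # rejection region: left tail
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif alternative == "two-sided":          # rejection region: both tails
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, reject                          # Step 5: decision


# Hypothetical numbers: xbar = 5.2, mu0 = 5.0, sd = 0.5, n = 25.
z, reject = z_test(xbar=5.2, mu0=5.0, sd=0.5, n=25, alpha=0.05)
print(f"z = {z:.2f}, reject H0: {reject}")
```

The same data lead to different decisions at different significance levels, which is why α must be fixed in Step 2 before looking at the statistic.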
15. Example
A product manager wants to introduce a production line into a new market area.
A random sample of 400 households in the new market area indicated that the average income is $30,000 with a standard deviation of $8,000.
It is believed that the product line will be successful if the average income per household is greater than $29,000.
Should the new product line be introduced, allowing a 5% error (α = 0.05)?
16. Solution
Hypothesized population mean 𝜇 = 29,000; sample size 𝑛 = 400
Sample mean and standard deviation: $\bar{x}$ = 30,000 and 𝑠 = 8,000
Step 1: H0: µ = 29,000 and Ha: µ > 29,000
Step 2: The maximum acceptable error α = 5%
Step 3: Using the one-tailed z-test (rejection region in the right tail)
𝑧𝑐𝑟𝑖𝑡𝑖𝑐𝑎𝑙 (for 𝛼 = 0.05) = 1.645; critical region: 𝑧 > 1.645
Step 4: The test statistic
$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{30000 - 29000}{8000/\sqrt{400}} = \frac{20}{8} = 2.5$$
Step 5: z = 2.5 > 1.645 = zcritical, so we reject H0 and accept Ha.
We can be confident enough to introduce the new product line based on the mean household income information available.
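The arithmetic of this solution can be checked with a few lines of Python. The variable names are illustrative, and the conversion of zcritical to an income cutoff (x_crit) is an added step, assumed here to show the same rejection region on the original scale.

```python
from math import sqrt
from statistics import NormalDist

mu0, n = 29_000, 400          # hypothesized mean and sample size
xbar, s = 30_000, 8_000       # sample mean and standard deviation
alpha = 0.05

se = s / sqrt(n)                              # standard error of the mean
z = (xbar - mu0) / se                         # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha)      # right-tail critical value
x_crit = mu0 + z_crit * se                    # same cutoff in dollars
reject_h0 = z > z_crit

print(f"z = {z:.2f}, z_crit = {z_crit:.3f}, x_crit = {x_crit:.0f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Comparing z with z_crit and comparing the sample mean with x_crit are equivalent ways of stating the decision.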
17. One-Tailed Hypothesis Test
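[Figure: sampling distribution under H0, centred at µ = 29,000 (z = 0). The rejection area (α = 5%) lies in the right tail beyond zcritical = 1.645 (xcritical on the x-axis); the observed z = 2.5 falls inside it.]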
18. Example
A company states that its product is on average 3.0 mm in diameter. An employee claims that the average is no longer equal to 3.0 mm.
A random sample of 100 products is measured, giving an average diameter of 3.2 mm with a standard deviation of 0.1 mm.
The claim is true if the average diameter is not 3.0 mm.
Should the claim be accepted at the 99% confidence level?
19. Solution
Population: 𝜇 = 3.0; sample: 𝑛 = 100, $\bar{x}$ = 3.2 and 𝑠 = 0.1
Step 1: H0: µ = 3 and Ha: µ ≠ 3
Step 2: The maximum acceptable error α = 0.01; α/2 = 0.005
Step 3: Using the two-tailed z-test (large sample size)
𝑧𝑐𝑟𝑖𝑡𝑖𝑐𝑎𝑙 (for 𝛼/2 = 0.005) = 2.575
Critical region: 𝑧 < −2.575 or 𝑧 > 2.575
Step 4: The test statistic
$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{3.2 - 3.0}{0.1/\sqrt{100}} = \frac{0.2}{0.01} = 20$$
Step 5: z = 20 > 2.575 = zcritical, so we reject H0 in favour of Ha.
At the 99% confidence level, the data give sufficient evidence that the product average diameter is no longer 3.0 mm.
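Evaluating the Step 4 formula numerically is a useful check on the arithmetic: with s = 0.1 and n = 100 the standard error is 0.1/10 = 0.01, so the statistic is 0.2/0.01 = 20. The sketch below uses the example's numbers; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, n = 3.0, 100             # hypothesized mean (mm) and sample size
xbar, s = 3.2, 0.1            # sample mean and standard deviation (mm)
alpha = 0.01

se = s / sqrt(n)                                  # 0.1 / 10 = 0.01
z = (xbar - mu0) / se                             # 0.2 / 0.01
z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # two-tailed critical value
reject_h0 = abs(z) > z_crit

print(f"z = {z:.1f}, z_crit = {z_crit:.3f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Because the test is two-tailed, the comparison uses |z| against the critical value for α/2.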
20. Two-Tailed Hypothesis Test
[Figure: sampling distribution under H0, centred at µ = 3.0. Rejection area 1 and rejection area 2 (α/2 = 0.5% each) lie in the two tails beyond xcritical 1 and xcritical 2, i.e. beyond z = ±2.575; the observed z is marked on the z-axis.]
21. Important Values for Tests
Confidence level (CL) | Significance level 𝛼 | 𝑍𝑐𝑟𝑖𝑡𝑖𝑐𝑎𝑙, one-tail test | 𝑍𝑐𝑟𝑖𝑡𝑖𝑐𝑎𝑙, two-tail test
0.90 | 0.10 | 1.28 | 1.645
0.95 | 0.05 | 1.645 | 1.96
0.98 | 0.02 | 2.05 | 2.33
0.99 | 0.01 | 2.33 | 2.575
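The tabulated critical values can be regenerated from the standard normal distribution: the one-tail value is the (1 − α) quantile and the two-tail value is the (1 − α/2) quantile. The sketch below is illustrative (the function name is an assumption) and uses only the standard library.

```python
from statistics import NormalDist


def critical_values(alpha):
    """Return (one_tail, two_tail) z critical values for significance level alpha."""
    std_normal = NormalDist()
    one_tail = std_normal.inv_cdf(1 - alpha)        # z_alpha
    two_tail = std_normal.inv_cdf(1 - alpha / 2)    # z_{alpha/2}
    return one_tail, two_tail


for alpha in (0.10, 0.05, 0.02, 0.01):
    one_tail, two_tail = critical_values(alpha)
    print(f"CL={1 - alpha:.2f}  alpha={alpha:.2f}  "
          f"one-tail={one_tail:.3f}  two-tail={two_tail:.3f}")
```

Printed to three decimals, the two-tail value for α = 0.01 is 2.576; tables that round or interpolate often list it as 2.575 or 2.58.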
22. Example 3
A manufacturer of fuses claims that with a 20% overload, the fuses will blow in 12.4 minutes on average.
To test this claim, a sample of 20 fuses was subjected to a 20% overload; the times to blow had a mean of 10.63 minutes and a standard deviation of 2.48 minutes.
If the data constitute a random sample from a normal population, do they tend to support or refute the manufacturer’s claim?
23. Solution
Population: normally distributed with mean 12.4 and 𝜎 unknown
Sample: n = 20, $\bar{X}$ = 10.63, S = 2.48
Step 1: H0: µ = 12.4 and Ha: µ < 12.4
Step 2: The maximum acceptable error α is not given.
Step 3: Using the one-tailed t-test with 𝜈 = 𝑛 − 1 = 19 degrees of freedom. Since α is not given, no fixed 𝑡𝑐𝑟𝑖𝑡𝑖𝑐𝑎𝑙 can be set, so we bound the P-value instead.
Step 4: The test statistic, with 𝜈 = 𝑛 − 1 = 19 degrees of freedom, is given by
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{10.63 - 12.4}{2.48/\sqrt{20}} = -3.19$$
Step 5: From the t-table, for 𝜈 = 19, 𝑃(𝑡 < −2.861) = 0.005, where 2.861 is the largest tabulated value of |t| at 𝜈 = 19. Since t = −3.19 < −2.861, the P-value satisfies 𝑝 = 𝑃(𝑡 < −3.19) < 0.005.
Because the P-value is a very small probability, we conclude that the data tend to refute the manufacturer’s claim, i.e. the mean blowing time of the fuses with a 20% overload is less than 12.4 minutes.
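The t statistic and the table comparison can be sketched as follows. The standard library has no t-distribution, so the table value 2.861 (the 0.005 upper-tail point for 19 degrees of freedom, as quoted in the solution) is entered by hand; the variable names are illustrative.

```python
from math import sqrt

mu0 = 12.4                    # claimed mean blowing time (minutes)
n, xbar, s = 20, 10.63, 2.48  # sample size, mean, standard deviation
t_table = 2.861               # t_{0.005} with 19 df, read from a t-table

df = n - 1
t = (xbar - mu0) / (s / sqrt(n))      # test statistic

# Left-tailed test: if t is below -t_table, the P-value is below 0.005.
p_below_0005 = t < -t_table

print(f"df = {df}, t = {t:.2f}")
print("P-value < 0.005" if p_below_0005 else "P-value >= 0.005")
```

This only bounds the P-value; an exact value would need a t-distribution CDF from a statistics package.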
25. P-values in Hypothesis Testing
- Specify the rejection region, whether two-tailed or one-tailed
- Calculate the test statistic and compare
The purpose of the P-value is no different from that of the rejection-region approach used before.
Before, we fixed α and selected the critical region accordingly, so the maximum risk of making a Type I error was controlled.
Now, the P-value approach gives an alternative, in terms of a probability p, to a mere "reject" or "do not reject" conclusion.
The P-value approach is more common in research papers, real problems, and applied statistics.
26. Definition: P-Value
A P-value is the lowest level (of significance) at which the observed value of the test statistic is significant,
i.e. it is the probability of obtaining a sample at least as extreme as the one observed in your data, assuming that the null hypothesis is true.
Statistical testing, P-value approach:
1- State the null and alternative hypotheses
2- Choose an appropriate test statistic
3- Compute the P-value based on the computed value of the test statistic
4- Draw conclusions based on the P-value and knowledge of the system
If p-value < α, we can reject H0 and accept Ha.
If p-value > α, we cannot reject H0 nor accept Ha.
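For a z statistic, the P-value and the decision rule above can be sketched as below. The function names and the `alternative` labels are assumptions made for the example; the tail areas come from the standard normal CDF in Python's standard library.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """P-value of an observed z statistic under H0.

    alternative is "greater" (right tail), "less" (left tail),
    or "two-sided" (both tails).
    """
    cdf = NormalDist().cdf
    if alternative == "greater":
        return 1 - cdf(z)
    if alternative == "less":
        return cdf(z)
    if alternative == "two-sided":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")


def decide(p, alpha):
    """Apply the rule: reject H0 when the P-value is below alpha."""
    return "reject H0" if p < alpha else "fail to reject H0"


# Illustrative use with an observed z of 2.5 in a right-tailed test.
p = p_value(2.5, "greater")
print(f"p = {p:.4f}: {decide(p, alpha=0.05)}")
```

Reporting p itself, and not only the decision, lets a reader apply a different α without redoing the test.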
27. Example 4
A random sample of 100 deaths in the US during the past year showed an average life span of 71.8 years.
Assuming a population standard deviation of 𝜎 = 8.9 years, does this indicate that the mean life span today is greater than 70 years? Use a 0.05 level of significance.
28. Solution
Population: 𝜇 = 70, 𝜎 = 8.9
Sample: 𝑛 = 100, $\bar{x}$ = 71.8
Step 1: H0: µ = 70 and H1: µ > 70
Step 2: α = 0.05
Step 3: Using the one-tailed z-test
𝑧𝑐𝑟𝑖𝑡𝑖𝑐𝑎𝑙 (for 𝛼 = 0.05) = 1.645; critical region: 𝑧 > 1.645
Step 4: The test statistic
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{71.8 - 70}{8.9/\sqrt{100}} = 2.02$$
Step 5: P = P(z > 2.02) = 0.0217 < 0.05 = α, so we reject H0 in favor of H1.
The evidence in favor of H1 is even stronger than that suggested by a 0.05 level of significance.
[Figure: standard normal curve with zcritical = 1.645 and the observed z = 2.02; the right-tail area beyond 2.02 is the P-value P.]
29. Reference
Textbook, Chapter 7:
Sec. 7.4 Tests of Hypothesis
Sec. 7.5 Null Hypotheses and Tests of Hypotheses
Sec. 7.6 Hypotheses concerning one mean
Sec. 7.7 The relation between Tests and Confidence Intervals
30. Thank you for your attention
Maha