BIOSTATISTICS
Lecture - 2
2. Normal Distribution:
• Values are arranged along a continuous scale and vary
continuously on both sides of the central value
• Probability distribution is symmetric about the mean
Examples of Normal distribution:
1. Measures of size of living tissue:
Length, height, skin area, weight
2. The length of inert appendages:
Hair, claws, nails, teeth
3. Certain physiological
measurements, such as blood
pressure of adult humans.
The curve plotted from normally distributed data is a bell-shaped,
symmetrical curve called the "normal distribution curve". This curve is also
known as the "Gaussian curve".
Mean = Median = Mode
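As a minimal sketch of these properties, Python's standard-library `statistics.NormalDist` can be used to check the symmetry of the curve and that mean, median and mode coincide. The mean of 120 and standard deviation of 10 are illustrative values assumed here, not figures from the lecture.

```python
from statistics import NormalDist

# Hypothetical parameters chosen only for illustration.
dist = NormalDist(mu=120, sigma=10)

# Mean, median and mode coincide for a normal distribution.
print(dist.mean, dist.median, dist.mode)

# Symmetry: the density is equal at the same distance on either side of the mean.
left = dist.pdf(120 - 15)
right = dist.pdf(120 + 15)
print(abs(left - right) < 1e-12)

# Half of the probability lies below the mean.
print(dist.cdf(120))
```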
3. Poisson Distribution :
It expresses the probability of a given number of events occurring in a fixed interval of
time or space, if these events occur with a known constant mean rate and
independently of the time since the last event
E.g. Telephone calls per hour
Online orders per day
Number of radioactive decay events per second
Number of plants per square metre
Number of mutations in a DNA strand per unit length
The proportion of cells that will be infected at a given MOI
The number of deaths per year in a given age group.
The number of bacteria in a certain amount of liquid.
Conditions for Poisson Distribution:
• An event can occur any number of times during a time period.
• Events occur independently. In other words, if an event
occurs, it does not affect the probability of another event
occurring in the same time period.
• The rate of occurrence is constant; that is, the rate does not
change based on time.
• The expected number of events is proportional to the
length of the time period (over a very short interval, so is the
probability of an event). For example, twice as many events are
expected in a 2-hour period as in a 1-hour period.
• Let X be the discrete random variable that represents the
number of events observed over a given time period.
• Let λ be the average number of events per interval.
• Let k be the number of times an event occurs in an interval
(k can take values 0, 1, 2, ...).
The probability of observing exactly k events is
$$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$
where e is Euler's number.
There is no upper limit on the value of k for this formula, though the
probability rapidly approaches 0 as k increases.
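The behaviour described above can be sketched with the standard Poisson probability mass function, P(X = k) = λ^k e^(−λ) / k!. The function name `poisson_pmf` and the rate λ = 3 are assumptions made for illustration only.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson distribution with mean lam."""
    return (lam ** k) * math.exp(-lam) / math.factorial(k)

lam = 3.0  # hypothetical average of 3 events per interval

# Probabilities for k = 0..30; the tail beyond this is negligible for lam = 3.
probs = [poisson_pmf(k, lam) for k in range(0, 31)]

for k in (0, 1, 2, 3, 10, 20):
    print(k, probs[k])

# The probabilities sum to (very nearly) 1.
print(sum(probs))
```

Printing the values for larger k shows how quickly the probability falls towards 0 once k is well above λ.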
IFAS ONLINE
Apply Your Mind
Given below are names of statistical distributions (Column I) and their characteristic
features (Column II)
Column I                    Column II
A. Binomial distribution    i. Each observation represents one of two outcomes (success or failure)
B. Poisson distribution     ii. Probability that is symmetric about the mean
C. Normal distribution      iii. Probability of a given number of events happening in a fixed interval of time
Which one of the following represents a correct match between columns I and II?
(DEC 2018)
(1) A - (ii) ; B - (i) ; C - (iii)
(2) A - (i) ; B - (ii) ; C - (iii)
(3) A - (i); B - (iii) ; C - (ii)
(4) A - (iii) ; B - (ii); C - (i)
Apply Your Mind
Which one of the following statements regarding normal distribution is NOT correct?
(JUNE 2019)
(1) It is symmetric around the mean
(2) It is symmetric around the median
(3) It is symmetric around the variance.
(4) It is symmetric around the mode.
Apply Your Mind
A weed is assumed to be dispersed randomly in a meadow. What statistical distribution
will describe the dispersion correctly? (DEC 2013)
(1) Binomial (2) Negative Binomial
(3) Poisson (4) Normal
Apply Your Mind
A researcher samples n individuals randomly from a population of blackbuck and
identifies their sex. The number of females in the sample follows (DEC 2019)
(1) an exponential distribution
(2) a binomial distribution
(3) a Poisson distribution
(4) a normal distribution
SAMPLING DISTRIBUTION
• First, you need to identify the target
population of your research.
• The population is the entire group
that you want to draw conclusions
about.
• The sample is the specific group of
individuals that you will collect data
from.
• The sampling frame is the actual list
of individuals that the sample will
be drawn from.
When you conduct
research about a
group of people, it’s
rarely possible to
collect data from
every person in that
group. Instead, you
select a sample.
There are two types of sampling methods:
Probability sampling involves random
selection, allowing you to make statistical
inferences about the whole group.
Non-probability sampling involves non-
random selection based on convenience or
other criteria, allowing you to easily collect
initial data.
Probability sampling methods
Probability sampling means that every member of the
population has a known, non-zero chance of being selected.
There are four main types of probability sample:
1. Simple Random Sample:
A simple random sample is a sample in which each item of the
population has an equal chance of being included.
You want to select a
simple random sample
of 100 employees of
Company X. You
assign a number to
every employee in the
company database
from 1 to 1000, and
use a random number
generator to select 100
numbers.
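A minimal sketch of this example, using Python's standard `random` module. The fixed seed is an assumption added only so that the draw is repeatable; the employee numbers 1 to 1000 stand in for the company database.

```python
import random

rng = random.Random(42)                  # fixed seed (assumption) for a repeatable draw
employee_ids = range(1, 1001)            # employees numbered 1 to 1000
sample = rng.sample(employee_ids, 100)   # 100 distinct IDs, each equally likely

print(sorted(sample)[:10])
print(len(sample))
```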
2. Systematic Random sampling
• Population is large, scattered
and not homogeneous.
• Samples are selected at regular
intervals from the population
All 1000 employees of the company
are listed in alphabetical order.
From the first 10 numbers, you
randomly select a starting
point: number 2. From number
2 onwards, every 10th person
on the list is selected (2, 12, 22,
32, and so on), and you end
up with a sample of 100
people.
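Systematic selection can be sketched as follows, assuming the 1000-employee list from the simple random sampling example, so that the sampling interval is population size divided by sample size (1000 / 100 = 10). The function name `systematic_sample` and the seed are hypothetical.

```python
import random

def systematic_sample(population, sample_size, rng):
    """Pick a random start in the first interval, then every k-th item."""
    interval = len(population) // sample_size   # sampling interval k
    start = rng.randrange(interval)             # random start within the first k items
    picks = [population[i] for i in range(start, len(population), interval)]
    return picks[:sample_size]

rng = random.Random(7)               # fixed seed (assumption) for repeatability
employees = list(range(1, 1001))     # positions 1..1000 in the alphabetical list
chosen = systematic_sample(employees, 100, rng)

print(chosen[:5], len(chosen))
```

Only the starting point is random; every later pick is fixed by the interval, which is why a hidden pattern in the list order can bias this method.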
3. Stratified Random sampling:
Used when the population is large and not homogeneous.
The population is divided into subgroups (strata) that share a characteristic,
and a probability sample is selected from within each stratum.
The company has 800 female
employees and 200 male
employees. You want to ensure
that the sample reflects the
gender balance of the company,
so you sort the population into
two strata based on gender.
Then you use random sampling
on each group, selecting 80
women and 20 men, which
gives you a representative
sample of 100 people.
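The proportional allocation in this example (80 women and 20 men out of 100) can be sketched as below. The record layout and the function name `stratified_sample` are assumptions for illustration.

```python
import random

rng = random.Random(1)   # fixed seed (assumption) for repeatability

# Hypothetical employee records: 800 women and 200 men, as in the example.
employees = [("F", i) for i in range(800)] + [("M", i) for i in range(200)]

def stratified_sample(records, total, rng):
    """Draw a simple random sample from each stratum, sized in proportion to it."""
    strata = {}
    for rec in records:
        strata.setdefault(rec[0], []).append(rec)
    sample = []
    for members in strata.values():
        n = round(total * len(members) / len(records))
        sample.extend(rng.sample(members, n))
    return sample

sample = stratified_sample(employees, 100, rng)
counts = {g: sum(1 for s in sample if s[0] == g) for g in ("F", "M")}
print(counts)
```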
4. Cluster sampling:
The population is divided into groups (clusters), and a sample of
whole clusters is chosen using a probability method.
The company has offices
in 10 cities across the
country (all with roughly
the same number of
employees in similar
roles). You don’t have the
capacity to travel to every
office to collect your data,
so you use random
sampling to select 3 offices
– these are your clusters.
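A minimal sketch of this example, assuming 10 hypothetical offices of 100 employees each and single-stage cluster sampling, in which everyone in a selected office is included.

```python
import random

rng = random.Random(3)   # fixed seed (assumption) for repeatability

# Hypothetical offices: 10 cities with 100 employees each.
offices = {
    f"city_{c}": [f"city_{c}_emp_{e}" for e in range(100)]
    for c in range(1, 11)
}

# Randomly pick 3 whole offices; these are the clusters.
selected_offices = rng.sample(sorted(offices), 3)

# Single-stage cluster sampling: include every employee of the chosen offices.
sample = [emp for office in selected_offices for emp in offices[office]]

print(selected_offices, len(sample))
```

The random step applies to offices, not to individual employees, which is the key difference from stratified sampling.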
Non-random sampling
• In a non-probability sample,
individuals are selected
based on non-random
criteria, and not every
individual has a chance of
being included.
• This type of sample is easier
and cheaper to access, but
you can’t use it to make valid
statistical inferences about
the whole population.
Appropriate for
exploratory
and qualitative research.
The aim is not to test
a hypothesis about a
broad population, but to
develop an initial
understanding
1. Convenience sampling
• It includes the individuals who happen
to be most accessible to the
researcher.
You are researching
opinions about student
support services in
your university, so after
each of your classes,
you ask your fellow
students to complete
a survey on the topic.
This is an easy and inexpensive way to
gather initial data, but there is no way
to tell if the sample is representative of
the population, so it can’t produce
generalizable results.
2. Opportunity/Voluntary response
sampling:
Only participants available and willing to participate are used.
3. Purposive sampling
• This type of sampling
involves the researcher
using their judgment to
select a sample that is
most useful to the
purposes of the
research.
You want to know more
about the opinions and
experiences of failed
students at your college,
so you deliberately select
students who have failed.
4. Snowball sampling
• If the population is hard to
access, snowball sampling can
be used to recruit participants
via other participants.
• The number of people you
have access to “snowballs” as
you get in contact with more
people.
You are researching
experiences of homelessness
in your city. Since there is no list of all
homeless people, probability sampling
isn’t possible.
One person agrees to
participate in the research, and
she puts you in contact with
other homeless people that she
knows in the area.
Apply Your Mind
Given below are sampling techniques and their features. Which one of the following
options correctly matches sampling techniques with their features? (DEC 2019 ASSAM)
(1) A-(ii); B-(i); C-(iv); D-(iii)
(2) A-(ii); B-(iv); C-(iii); D-(i)
(3) A-(i); B-(iv); C-(iii); D-(ii)
(4) A-(i); B-(iv); C-(ii); D-(iii)
PARAMETRIC AND NON-PARAMETRIC TESTS
Non-parametric tests:
• These tests don’t require that your data follow the normal
distribution.
• They’re also known as distribution-free tests and are
useful when the data are nominal or ordinal.
• Used when individual variability among the study groups is
high
Example: Chi square test,
Spearman Correlation,
Kruskal Wallis Test,
Mann-Whitney U test,
Mann-Kendall’s test
Parametric tests
• Make assumptions about the parameters of the population
distribution from which the sample is drawn.
• These tests assume that the population data are normally
distributed.
• Parametric tests are more powerful than non-parametric
tests when their assumptions hold.
• Results can be significantly affected by outliers in a
parametric test.
Example:
• Paired/unpaired t-test
• ANOVA
• Pearson correlation
Test to assess differences in continuous dependent
variable in two or more groups
• Parametric:
• t-test is used for differences in a continuous
dependent variable between two groups.
• An ANOVA assesses for difference in a continuous
dependent variable between more than two groups.
• Non-parametric:
• Mann-Whitney U test is used for differences in a
continuous dependent variable between two groups.
• Kruskal-Wallis test assesses for difference in a
continuous dependent variable between more than
two groups.
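To make the two-group case concrete, the sketch below computes the test statistics behind a pooled-variance (Student's) t-test and a Mann-Whitney U test by hand. The measurements are invented for illustration, and the function names are hypothetical; in practice a statistics library would normally supply both the statistic and the p-value.

```python
import math
from statistics import mean, variance

control = [12.1, 14.3, 11.8, 13.5, 12.9]   # hypothetical measurements
treated = [15.2, 16.8, 14.9, 17.1, 15.5]   # hypothetical measurements

def student_t(a, b):
    """Pooled-variance (Student's) t statistic for two independent groups."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))

def mann_whitney_u(a, b):
    """U statistic for group a: count of pairs where a's value exceeds b's (ties count 0.5)."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)

print(student_t(control, treated))
print(mann_whitney_u(control, treated), mann_whitney_u(treated, control))
```

The t statistic uses the actual values (means and variances), whereas U depends only on the ordering of the observations, which is why it is less sensitive to outliers.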
Test to assess strength of association between two variables
• Parametric test:
• Pearson correlation is used when assessing the relationship
between two continuous variables.
• Non-parametric test:
• Spearman correlation is appropriate when at least one of the
variables is measured on an ordinal scale.
• Kendall rank correlation is a non-parametric test that measures the
strength of dependence between two variables
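The difference between Pearson and Spearman correlation can be sketched with a small hand-written example. The data are hypothetical, the helper names are assumptions, and the ranking helper assumes there are no tied values to keep the sketch short.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / spread

def ranks(values):
    """Ranks 1..n; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]        # hypothetical data
y = [1, 4, 9, 16, 25]      # increases with x, but not linearly

print(pearson(x, y), spearman(x, y))
```

Because y always increases with x, the rank-based Spearman coefficient is 1, while Pearson is slightly below 1 since the relationship is not a straight line.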
Apply Your Mind
Choose the correct answer from the statements indicated below: (DEC 2018)
(1) Chi square test is parametric.
(2) Non-parametric test assumes normal distribution.
(3) Results can be significantly affected by outliers in a parametric test.
(4) Non-parametric test is more powerful as compared to parametric test.
Apply Your Mind
Two groups (Control, Treated) are to be compared to test the effect of a treatment.
Since individual variability is high in both groups, the appropriate statistical test to use is
(JUNE 2015)
(1) Analysis of variance.
(2) Kendall's test.
(3) Student's t-test.
(4) Mann-Whitney U-test.
Apply Your Mind
The frequency distributions of tree heights in two forest areas with different annual
rainfall are given. Which of the following statistical analyses will you choose to test
whether rainfall has an effect on tree heights? (JUNE 2013)
(1) t-test for comparison of means.
(2) A non-parametric comparison of the two groups
(3) Correlation analysis of rainfall and mean tree heights.
(4) Regression of tree heights on rainfall.
Apply Your Mind
The use of Kruskal Wallis test is most appropriate in which of these cases? (JUNE 2016)
(1) There are more than two groups and each group is normally distributed.
(2) There are more than two groups and the distribution in each group is not normal.
(3) There are two groups and each group is normally distributed.
(4) There are two groups and the distribution in each group is not normal
CONFIDENCE INTERVAL
• A Confidence Interval is a range of values that is
likely to contain the true value.
EXAMPLE: Average height of men
• We measure the heights of 100 randomly
chosen men, and get a mean height of
175 cm.
• We calculate the standard deviation and it comes
out to be 20 cm.
• The 95% Confidence Interval is: 175 cm ±
3.92 cm.
• This says the true mean height of ALL men is
likely to lie between 171.08 cm and 178.92 cm:
if we repeated the sampling many times, about
95% of the intervals built this way would
contain the true mean.
• So there is a 1-in-20 chance (5%) that our
Confidence Interval does NOT include the
true mean.
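The interval in this example can be reproduced with a short sketch that applies the usual large-sample formula, mean ± Z × (standard deviation / √n). The variable names are assumptions; the Z value is obtained from the standard normal distribution rather than read from a table.

```python
import math
from statistics import NormalDist

n = 100             # number of observations
sample_mean = 175   # cm
sample_sd = 20      # cm
confidence = 0.95

# Two-sided Z value: about 1.96 for a 95% interval.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error: Z * s / sqrt(n)
margin = z * sample_sd / math.sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

print(round(z, 2), round(margin, 2), round(lower, 2), round(upper, 2))
```

Changing `confidence` to 0.99 widens the interval, since a larger Z value is needed to cover more of the distribution.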
How to calculate CI
Step 1:
• Start with the number of
observations (n = 100)
• Calculate the sample mean (x̄ = 175 cm)
• Calculate the sample standard
deviation (s = 20 cm)
Step 2:
Decide what Confidence Interval we want:
95% or 99% are common choices.
Then find the "Z" value for that Confidence
Interval (90%: Z = 1.645; 95%: Z = 1.960; 99%: Z = 2.576).
Step 3:
Use that Z value in this formula for the
Confidence Interval:
$$\bar{x} \pm Z \frac{s}{\sqrt{n}}$$
Observations (n) = 100
Mean (x̄) = 175 cm
Standard deviation (s) = 20 cm
Z (for 95% CI) = 1.96
$$175 \pm 1.96 \times \frac{20}{\sqrt{100}} = 175 \pm 1.96 \times \frac{20}{10} = 175 \pm 1.96 \times 2 = 175 \pm 3.92$$
Apply Your Mind
The mean and standard deviation of serum cholesterol in a population of senior citizens
are assumed to be 200 and 24mg/dl, respectively. In a random sample of 36 senior
citizens, what values of cholesterol (to the nearest whole number) should lead to
rejection of the null hypothesis at 95% confidence level? (JUNE 2015)
(1) above 224
(2) above 248
(3) below 176 and above 224
(4) below 192 and above 208
Apply Your Mind
The null hypothesis for the number of seeds in the fruit of a plant species is
H0: µ = 30. A random sample of 9
fruits gives the mean number of seeds as 24 with a standard deviation of 6.12.
(a) What are the confidence limits for the sample mean?
(b) Would you reject or accept the null hypothesis at 95% confidence level?
(DEC 2014)
(1) (a) 18 and 30, (b) reject the hypothesis
(2) (a) 20 and 28, (b) reject the hypothesis
(3) (a) 20 and 28, (b) accept the hypothesis
(4) (a) 18 and 30, (b) accept the hypothesis
ERRORS AND LEVEL OF SIGNIFICANCE
Null Hypothesis:
• The null hypothesis, H0, is the commonly accepted
fact.
• It is the opposite of the alternate hypothesis (H1).
• Researchers work to reject, nullify or disprove the
null hypothesis.
• Researchers come up with an alternate hypothesis,
one that they think explains a phenomenon, and
then work to reject the null hypothesis.
Example: Null hypothesis, H0: Corona Virus will not be
effective in causing disease at high temperature
Alternate hypothesis H1 : Corona can cause disease even at
high temperature
Hypothesis Testing
• Hypothesis: an explanation based on limited evidence
• No hypothesis is 100% true unless proven
• There is always a chance of drawing incorrect conclusions
(errors)
Two types of errors in testing of hypothesis
1. Type I error: Rejection of null hypothesis which is true.
2. Type II error: Acceptance of null hypothesis which is false.
               Reject H0           Accept H0
H0 is True     Type I error        Correct Decision
H0 is False    Correct Decision    Type II error
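A Type I error rate can be illustrated by simulation. The sketch below repeatedly samples from a population in which the null hypothesis is true and counts how often a two-sided z-test at the 5% level rejects it anyway. All numerical settings (mean 50, standard deviation 10, sample size 25, 4000 trials, the seed) are hypothetical, and a z-test with known standard deviation is assumed for simplicity.

```python
import math
import random
from statistics import NormalDist, mean

rng = random.Random(0)                            # fixed seed (assumption) for repeatability
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # two-sided critical value
mu0, sigma, n, trials = 50.0, 10.0, 25, 4000      # hypothetical settings

rejections = 0
for _ in range(trials):
    # The sample is drawn with true mean mu0, so H0 is true in every trial.
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    z = (mean(sample) - mu0) / (sigma / math.sqrt(n))
    if abs(z) > z_crit:
        rejections += 1                           # rejecting a true H0: a Type I error

type_i_rate = rejections / trials
print(type_i_rate)
```

The observed rejection rate should land close to the chosen level of 0.05, which is what it means for that level to control the Type I error.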
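[Figure: 2 × 2 grid comparing the conclusion of research on a drug (Beneficial / Harmful) with reality (Beneficial / Harmful). The two cells where research and reality agree are marked OK; the two cells where they disagree are marked Type I error and Type II error.]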
Significance level and p-value:
• The p-value is used in hypothesis testing to help you
decide whether to reject the null hypothesis.
• The p-value measures the evidence against the null hypothesis:
it is the probability of obtaining a result at least as extreme as
the one observed if the null hypothesis were true.
• The smaller the p-value, the stronger the evidence
that you should reject the null hypothesis.
• The significance level is the cut-off chosen for p (commonly 0.05).
Rejecting the null hypothesis only when p falls at or below this level
keeps the probability of committing a Type I error at or below that level.
• Thus, the smaller the p-value, the lesser is the probability of
the error (Type I) of rejecting a true null hypothesis.
If p > 0.10 → “not significant”
If p ≤ 0.10 → “marginally significant”
If p ≤ 0.05 → “significant”
If p ≤ 0.01 → “highly significant”
Significance level (p): Error for Rejection of True Null Hypothesis:
p ≤ 0.05: This means that the probability of committing a Type I
error (rejecting a true null hypothesis) is at most 5%.
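As a sketch of how a p-value is obtained and labelled, the code below computes a two-sided p-value for a z-test and applies the thresholds listed above. The numbers (a sample mean of 175 tested against a hypothesised mean of 170, with standard deviation 20 and n = 100) are hypothetical, the function names are assumptions, and a z-test is used here as a simplification rather than a t-test.

```python
import math
from statistics import NormalDist

def two_sided_p(sample_mean, mu0, sd, n):
    """Two-sided p-value of a z-test for the mean."""
    z = (sample_mean - mu0) / (sd / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def label(p):
    """Apply the significance thresholds from the list above."""
    if p <= 0.01:
        return "highly significant"
    if p <= 0.05:
        return "significant"
    if p <= 0.10:
        return "marginally significant"
    return "not significant"

p = two_sided_p(sample_mean=175, mu0=170, sd=20, n=100)   # hypothetical numbers
print(round(p, 4), label(p))
```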
Apply Your Mind
In the following statement taken from a research paper, what does p in the parentheses
stand for? (DEC 2011)
“The mean temperature of this region now is significantly higher than the one 50
years ago (p<0.05, t-test)”
(1) Ratio of the mean temperature of the two time periods tested
(2) Probability of the error of rejecting a true null hypothesis
(3) Probability of the error of accepting a false null hypothesis
(4) Probability of the t-test being effective in detecting significant difference in the
mean annual temperatures of the two periods
THANKS

More Related Content

Similar to Biostats Lec-2.pdf

COT INQUIRIES Population and sampling.pptx
COT INQUIRIES Population and sampling.pptxCOT INQUIRIES Population and sampling.pptx
COT INQUIRIES Population and sampling.pptxFlorissaConise3
 
Sampling by dr najeeb memon
Sampling  by dr najeeb memonSampling  by dr najeeb memon
Sampling by dr najeeb memonmuhammed najeeb
 
sampling and statiscal inference
sampling and statiscal inferencesampling and statiscal inference
sampling and statiscal inferenceShruti MISHRA
 
Sampling Technique - Anish
Sampling Technique - AnishSampling Technique - Anish
Sampling Technique - AnishAnish Kumar
 
sampling technique
sampling techniquesampling technique
sampling techniqueAnish Kumar
 
Sampling.pptx
Sampling.pptxSampling.pptx
Sampling.pptxheencomm
 
DATA ANALYTICS ASSIGNMENT.pptx
DATA ANALYTICS ASSIGNMENT.pptxDATA ANALYTICS ASSIGNMENT.pptx
DATA ANALYTICS ASSIGNMENT.pptxSamirkumar497189
 
Business research methodology
Business research methodology Business research methodology
Business research methodology Jh Labonno
 
Chapter 7 sampling methods
Chapter 7 sampling methodsChapter 7 sampling methods
Chapter 7 sampling methodsNiranjanHN3
 
New Microsoft PowerPoint Presentation.pptx
New Microsoft PowerPoint Presentation.pptxNew Microsoft PowerPoint Presentation.pptx
New Microsoft PowerPoint Presentation.pptxSamirkumar497189
 
statistics - Populations and Samples.pdf
statistics - Populations and Samples.pdfstatistics - Populations and Samples.pdf
statistics - Populations and Samples.pdfkobra22
 
SAMPLING METHODS 5.pptx research community health
SAMPLING METHODS 5.pptx research community healthSAMPLING METHODS 5.pptx research community health
SAMPLING METHODS 5.pptx research community healthakoeljames8543
 

Similar to Biostats Lec-2.pdf (20)

COT INQUIRIES Population and sampling.pptx
COT INQUIRIES Population and sampling.pptxCOT INQUIRIES Population and sampling.pptx
COT INQUIRIES Population and sampling.pptx
 
Sampling
SamplingSampling
Sampling
 
Sampling
SamplingSampling
Sampling
 
Sampling by dr najeeb memon
Sampling  by dr najeeb memonSampling  by dr najeeb memon
Sampling by dr najeeb memon
 
Population &Sample
Population &SamplePopulation &Sample
Population &Sample
 
sampling and statiscal inference
sampling and statiscal inferencesampling and statiscal inference
sampling and statiscal inference
 
samplingdesignppt.pdf
samplingdesignppt.pdfsamplingdesignppt.pdf
samplingdesignppt.pdf
 
Sampling design ppt
Sampling design pptSampling design ppt
Sampling design ppt
 
Sampling Techniques.docx
Sampling Techniques.docxSampling Techniques.docx
Sampling Techniques.docx
 
Sampling Technique - Anish
Sampling Technique - AnishSampling Technique - Anish
Sampling Technique - Anish
 
sampling technique
sampling techniquesampling technique
sampling technique
 
Sampling.pptx
Sampling.pptxSampling.pptx
Sampling.pptx
 
DATA ANALYTICS ASSIGNMENT.pptx
DATA ANALYTICS ASSIGNMENT.pptxDATA ANALYTICS ASSIGNMENT.pptx
DATA ANALYTICS ASSIGNMENT.pptx
 
Brm unit iii - cheet sheet
Brm   unit iii - cheet sheetBrm   unit iii - cheet sheet
Brm unit iii - cheet sheet
 
Business research methodology
Business research methodology Business research methodology
Business research methodology
 
Chapter 7 sampling methods
Chapter 7 sampling methodsChapter 7 sampling methods
Chapter 7 sampling methods
 
New Microsoft PowerPoint Presentation.pptx
New Microsoft PowerPoint Presentation.pptxNew Microsoft PowerPoint Presentation.pptx
New Microsoft PowerPoint Presentation.pptx
 
statistics - Populations and Samples.pdf
statistics - Populations and Samples.pdfstatistics - Populations and Samples.pdf
statistics - Populations and Samples.pdf
 
Sampling.pptx
Sampling.pptxSampling.pptx
Sampling.pptx
 
SAMPLING METHODS 5.pptx research community health
SAMPLING METHODS 5.pptx research community healthSAMPLING METHODS 5.pptx research community health
SAMPLING METHODS 5.pptx research community health
 

Recently uploaded

Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...RKavithamani
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 

Recently uploaded (20)

Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 

Biostats Lec-2.pdf

  • 2. 2. Normal Distribution: • Distribution are arranged in linear fashion and vary continuously on both the sides from the central value • Probability distribution is symmetric about the mean
  • 3. Examples of Normal distribution: 1. Measures of size of living tissue: Length, height, skin area, weight 2. The length of inert appendages: Hair, claws, nails, teeth 3. Certain physiological measurements, such as blood pressure of adult humans.
  • 4. The curve plotted with the help of data of normal distribution presents a bell shaped symmetrical curve called "normal distribution curve". This curve is also known as the "Gaussian curve". Mean = Median = Mode
  • 5. 3. Poisson Distribution : It expresses the probability of a given number of events occurring in a fixed interval of time or space, if these events occur with a known constant mean rate and independently of the time since the last event Eg. Telephone call per hour Online order per day Number of radioactive decay events per second Number of plants per meter square Number of mutation in DNA strand per unit length The proportion of cells that will be infected at a given MOI The number of deaths per year in a given age group. The number of bacteria in a certain amount of liquid.
  • 6. Conditions for Poisson Distribution: • An event can occur any number of times during a time period. • Events occur independently. In other words, if an event occurs, it does not affect the probability of another event occurring in the same time period. • The rate of occurrence is constant; that is, the rate does not change based on time. • The probability of an event occurring is proportional to the length of the time period. For example, it should be twice as likely for an event to occur in a 2 hour time period than it is for an event to occur in a 1 hour period.
  • 7. • Let X be the discrete random variable that represents the number of events observed over a given time period. • Let λ is the average number of events per interval • k is the number of times an event occurs in an interval (k can take values 0, 1, 2, ....) where e is Euler's number. There is no upper limit on the value of k for this formula, though the probability rapidly approaches 0 as k increases
  • 8. IFAS ONLINE Apply Your Mind Given below are names of statistical distribution (Column I) and their characteristic features (Column II) Column I Column II A. Binomial distribution i. Each observation represents one of two outcomes (success or failure) B. Poisson distribution ii. Probability that is symmetric about the mean C. Normal distribution iii Probability of a given number of events happening in a fixed interval of time Which one of the following represents a correct match between columns I and II? (DEC 2018) (1) A - (ii) ; B - (i) ; C - (iii) (2) A - (i) ; B - (ii) ; C - (iii) (3) A - (i); B - (iii) ; C - (ii) (4) A - (iii) ; B - (ii); C - (i)
  • 9. IFAS ONLINE Apply Your Mind Which one of the following statements regarding normal distribution is NOT correct? (JUNE 2019) (1) It is symmetric around the mean (2) It is symmetric around the median (3) It is symmetric around the variance. (4) It is symmetric around the mode.
  • 10. IFAS ONLINE Apply Your Mind A weed is assumed to be dispersed randomly in a meadow. What statistical distribution will describe the dispersion correctly? (DEC 2013) (1) Binomial (2) Negative Binomial (3) Poisson (4) Normal
  • 11. IFAS ONLINE Apply Your Mind A researcher samples n individuals randomly from a population of blackbuck and identifies their sex. The number of females in the sample follows (DEC 2019) (1) an exponential distribution (2) a binomial distribution (3) a Poisson distribution (4) a normal distribution
  • 13. • First, you need to identify the target population of your research. • The population is the entire group that you want to draw conclusions about. • The sample is the specific group of individuals that you will collect data from. • The sampling frame is the actual list of individuals that the sample will be drawn from.
  • 14. When you conduct research about a group of people, it’s rarely possible to collect data from every person in that group. Instead, you select a sample. There are two types of sampling methods: • Probability sampling involves random selection, allowing you to make statistical inferences about the whole group. • Non-probability sampling involves non-random selection based on convenience or other criteria, allowing you to easily collect initial data.
  • 15. Probability sampling methods Probability sampling means that every member of the population has a chance of being selected. There are four main types of probability sample
  • 16. 1. Simple Random Sample: A simple random sample is a sample in which each member of the population has an equal chance of being included. Example: You want to select a simple random sample of 100 employees of Company X. You assign a number to every employee in the company database from 1 to 1000, and use a random number generator to select 100 numbers.
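A minimal sketch of the simple random sample described above, assuming the employees are represented only by their numbers 1 to 1000. The fixed seed is an assumption added so the sketch is repeatable; it is not part of the method.

```python
import random

# Employee numbers 1..1000 stand in for the company database.
population = list(range(1, 1001))

rng = random.Random(42)  # fixed seed only so the sketch is repeatable
# Sampling without replacement: every employee has the same chance of selection.
simple_random_sample = rng.sample(population, k=100)

print(len(simple_random_sample), "employees selected")
print("first five numbers drawn:", simple_random_sample[:5])
```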
  • 17. 2. Systematic Random sampling • Population is large, scattered and not homogeneous. • Samples are selected at regular intervals from the population. Example: All 1000 employees of the company are listed in alphabetical order. From the first 10 numbers, you randomly select a starting point: number 2. From number 2 onwards, every 10th person on the list is selected (2, 12, 22, 32, and so on), and you end up with a sample of 100 people.
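In systematic sampling the selection interval is usually the population size divided by the desired sample size. The sketch below assumes a list of 1000 employees and a target of 100 people, which gives an interval of 10; the seed and variable names are illustrative assumptions.

```python
import random

population = list(range(1, 1001))            # positions 1..1000 in the alphabetical list
sample_size = 100
interval = len(population) // sample_size    # 1000 // 100 = 10

rng = random.Random(7)                       # fixed seed only for repeatability
start = rng.randint(1, interval)             # random start among the first 10 positions

# Take the starting position and every 10th position after it.
systematic_sample = population[start - 1::interval]

print("start:", start)
print("first five selected:", systematic_sample[:5])
print("sample size:", len(systematic_sample))
```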
  • 18. 3. Stratified Random sampling: Used when the population is large and not homogeneous. The population is divided into groups (strata), and a probability sample is selected within each stratum. Example: The company has 800 female employees and 200 male employees. You want to ensure that the sample reflects the gender balance of the company, so you sort the population into two strata based on gender. Then you use random sampling on each group, selecting 80 women and 20 men, which gives you a representative sample of 100 people.
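The stratified example can be sketched with proportional allocation, where each stratum contributes in proportion to its share of the population. The ID labels (`F1`, `M1`, ...) and the seed are assumptions made for illustration.

```python
import random

rng = random.Random(1)  # fixed seed only for repeatability

# Two strata: 800 women and 200 men, identified by made-up labels.
strata = {
    "female": [f"F{i}" for i in range(1, 801)],
    "male": [f"M{i}" for i in range(1, 201)],
}
population_size = sum(len(members) for members in strata.values())
sample_size = 100

stratified_sample = {}
for name, members in strata.items():
    # Proportional allocation: 100 * 800/1000 = 80 and 100 * 200/1000 = 20.
    quota = round(sample_size * len(members) / population_size)
    stratified_sample[name] = rng.sample(members, quota)

for name, chosen in stratified_sample.items():
    print(name, len(chosen))
```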
  • 19. 4. Cluster sampling: The population is divided into groups (clusters), and a sample of whole clusters is chosen using a probability method. Example: The company has offices in 10 cities across the country (all with roughly the same number of employees in similar roles). You don’t have the capacity to travel to every office to collect your data, so you use random sampling to select 3 offices – these are your clusters.
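A sketch of the cluster example, assuming 10 offices of exactly 100 employees each and assuming that everyone in a chosen office is surveyed (single-stage cluster sampling). Office and employee labels are made up for illustration.

```python
import random

rng = random.Random(3)  # fixed seed only for repeatability

# Ten offices (clusters), each with 100 made-up employee labels.
offices = {
    f"city_{c}": [f"city_{c}_emp_{i}" for i in range(1, 101)]
    for c in range(1, 11)
}

# Randomly choose 3 whole offices rather than individual employees.
chosen_offices = rng.sample(sorted(offices), k=3)

# Assumption: every employee in a chosen office is included.
cluster_sample = [emp for office in chosen_offices for emp in offices[office]]

print("chosen offices:", chosen_offices)
print("sample size:", len(cluster_sample))
```

The unit of random selection here is the office, which is what distinguishes this design from stratified sampling, where individuals are drawn from every group.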
  • 20. Non-random sampling • In a non-probability sample, individuals are selected based on non-random criteria, and not every individual has a chance of being included. • This type of sample is easier and cheaper to access, but you can’t use it to make valid statistical inferences about the whole population. Appropriate for exploratory and qualitative research. The aim is not to test a hypothesis about a broad population, but to develop an initial understanding
  • 21. 1. Convenience sampling • It includes the individuals who happen to be most accessible to the researcher. Example: You are researching opinions about student support services in your university, so after each of your classes, you ask your fellow students to complete a survey on the topic. This is an easy and inexpensive way to gather initial data, but there is no way to tell if the sample is representative of the population, so it can’t produce generalizable results.
  • 22. 2. Opportunity/Voluntary response sampling: Only participants available and willing to participate are used.
  • 23. 3. Purposive sampling • This type of sampling involves the researcher using their judgment to select a sample that is most useful to the purposes of the research. You want to know more about the opinions and experiences of failed students at your college
  • 24. 4. Snowball sampling • If the population is hard to access, snowball sampling can be used to recruit participants via other participants. • The number of people you have access to “snowballs” as you get in contact with more people. Example: You are researching experiences of homelessness in your city. Since there is no list of all homeless people, probability sampling isn’t possible. You meet one person who agrees to participate in the research, and she puts you in contact with other homeless people that she knows in the area.
  • 25. IFAS ONLINE Apply Your Mind Given below are sampling techniques and their features. Which one of the following options correctly matches sampling techniques with their features? (DEC 2019 ASSAM) (1) A-(ii); B-(i); C-(iv); D-(iii) (2) A-(ii); B-(iv); C-(iii); D-(i) (3) A-(i); B-(iv); C-(iii); D-(ii) (4) A-(i); B-(iv); C-(ii); D-(iii)
  • 27. Non-parametric tests: • Tests don’t require that your data follow the normal distribution. • They’re also known as distribution-free tests and can provide benefits in certain situations (nominal/ordinal). • Used when individual variability among the study groups is high Example: Chi square test, Spearman Correlation, Kruskal Wallis Test, Mann-Whitney U test, Mann-Kendall’s test
  • 28. Parametric tests • Make assumptions about the parameters of the population distribution from which the sample is drawn. • These tests assume that the population data are normally distributed. • A parametric test is more powerful than a non-parametric test when its assumptions are met. • Results can be significantly affected by outliers in a parametric test. Example: • Paired/unpaired t-test • ANOVA • Pearson correlation
  • 29. Test to assess differences in continuous dependent variable in two or more groups • Parametric: • t-test is used for differences in a continuous dependent variable between two groups. • ANOVA assesses differences in a continuous dependent variable between more than two groups. • Non-parametric: • Mann-Whitney U test is used for differences in a continuous (or ordinal) dependent variable between two groups when normality cannot be assumed. • Kruskal-Wallis test assesses differences in a continuous (or ordinal) dependent variable between more than two groups when normality cannot be assumed.
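To show why the Mann-Whitney U test is less sensitive to outliers than a t-test, the sketch below computes only the U statistic by comparing every pair of observations. The data are invented, `mann_whitney_u` is a hypothetical helper, and the final step of comparing U with a tabulated critical value (or computing a p-value) is deliberately left out.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b). U_a counts pairs where a > b; ties count 0.5."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U values always add up to the number of pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b

# Invented measurements; each group contains one extreme value (40 and 95).
control = [12, 15, 11, 40, 13]
treated = [18, 22, 19, 95, 21]

u_control, u_treated = mann_whitney_u(control, treated)
print("U (control):", u_control)
print("U (treated):", u_treated)
print("test statistic (smaller U):", min(u_control, u_treated))
```

Because only the ordering of values matters, replacing 95 with 950 leaves both U values unchanged, whereas the group means used by a t-test would shift considerably.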
  • 30. Test to assess strength of association between two variables • Parametric test: • Pearson correlation is used when assessing the relationship between two continuous variables. • Non-parametric test: • Spearman correlation is appropriate when at least one of the variables is measured on an ordinal scale. • Kendall rank correlation is a non-parametric test that measures the strength of dependence between two variables
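The difference between Pearson and Spearman correlation can be seen on data that are monotonic but not linear. The sketch below is an illustration with invented data; the helper functions are hypothetical, and Spearman's coefficient is computed here as the Pearson correlation of the ranks.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for index in order[i:j + 1]:
            result[index] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation applied to ranks."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]   # always increasing, but far from a straight line

print(f"Pearson:  {pearson(x, y):.3f}")
print(f"Spearman: {spearman(x, y):.3f}")
```

Spearman's coefficient reaches 1 because the ordering of the two variables agrees perfectly, while Pearson's coefficient is lower because the relationship is not linear.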
  • 31. IFAS ONLINE Apply Your Mind Choose the correct answer from the statements indicated below: (DEC 2018) (1) Chi square test is parametric. (2) Non-parametric test assumes normal distribution. (3) Results can be significantly affected by outliers in a parametric test. (4) Non-parametric test is more powerful as compared to parametric test.
  • 32. IFAS ONLINE Apply Your Mind Two groups (Control, Treated) are to be compared to test the effect of a treatment. Since individual variability is high in both groups, the appropriate statistical test to use is (JUNE 2015) (1) Analysis of variance. (2) Kendall's test. (3) Student's t-test. (4) Mann-Whitney U-test.
  • 33. IFAS ONLINE Apply Your Mind The frequency distributions of tree heights in two forest areas with different annual rainfall are given. Which of the following statistical analyses will you choose to test whether rainfall has an effect on tree heights? (JUNE 2013) (1) t-test for comparison of means. (2) A non-parametric comparison of the two groups. (3) Correlation analysis of rainfall and mean tree heights. (4) Regression of tree heights on rainfall.
  • 34. IFAS ONLINE Apply Your Mind The use of Kruskal Wallis test is most appropriate in which of these cases? (JUNE 2016) (1) There are more than two groups and each group is normally distributed. (2) There are more than two groups and the distribution in each group is not normal. (3) There are two groups and each group is normally distributed. (4) There are two groups and the distribution in each group is not normal
  • 35. CONFIDENCE INTERVAL • A Confidence Interval is a range of values we are fairly sure our true value lies in.
  • 36. EXAMPLE: Average Height of humans • We measure the heights of 100 randomly chosen men, and get a mean height of 175 cm. • We calculate the standard deviation and it comes out to be 20 cm. • The 95% Confidence Interval is: 175 cm ± 3.92 cm. • This says the true mean height of ALL men (if we could measure them all) is likely to be between 171.08 cm and 178.92 cm; intervals built this way contain the true mean in 95% of cases. • So there is a 1-in-20 chance (5%) that our Confidence Interval does NOT include the true mean.
  • 37. How to calculate CI Step 1: • Start with the number of observations (n=100) • Calculate the mean (μ=175 cm) • Calculate the standard deviation (σ=20 cm)
  • 38. Step 2: Decide what Confidence Interval we want: 95% or 99% are common choices. Then find the "Z" value for that Confidence Interval from a standard normal table: Z = 1.645 for 90%, Z = 1.960 for 95%, Z = 2.576 for 99%.
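  • 42. • Step 3: Use that Z value in the formula for the Confidence Interval: $\mu \pm Z \frac{\sigma}{\sqrt{n}}$, where μ is the mean of the sample, σ is the standard deviation and n is the number of observations.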
  • 43. Observations (n) = 100; Mean (μ) = 175 cm; Standard deviation (σ) = 20 cm; Z (for 95% CI) = 1.96. $175 \pm 1.96 \times \frac{20}{\sqrt{100}} = 175 \pm 1.96 \times \frac{20}{10} = 175 \pm 1.96 \times 2 = 175 \pm 3.92$ cm
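The calculation for the height example (n = 100, mean 175 cm, standard deviation 20 cm) can be checked with a short sketch. It assumes the normal (Z) approximation used in the slides; `confidence_interval` is a hypothetical helper, and the Z value is taken from Python's standard library rather than a printed table.

```python
import math
from statistics import NormalDist

def confidence_interval(mean, sd, n, level=0.95):
    """Return (lower, upper, margin) for mean ± Z * sd / sqrt(n)."""
    # Two-sided interval: 95% leaves 2.5% in each tail, so Z is the 97.5th percentile.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin, margin

low, high, margin = confidence_interval(mean=175, sd=20, n=100, level=0.95)
print(f"95% CI: {low:.2f} cm to {high:.2f} cm (± {margin:.2f} cm)")
```

Passing `level=0.99` widens the interval, since a larger Z value is needed to cover more of the distribution.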
  • 44. IFAS ONLINE Apply Your Mind The mean and standard deviation of serum cholesterol in a population of senior citizens are assumed to be 200 and 24mg/dl, respectively. In a random sample of 36 senior citizens, what values of cholesterol (to the nearest whole number) should lead to rejection of the null hypothesis at 95% confidence level? (JUNE 2015) (1) above 224 (2) above 248 (3) below 176 and above 224 (4) below 192 and above 208
  • 45. IFAS ONLINE Apply Your Mind For the number of seeds in the fruit of a plant species, H0: µ = 30. A random sample of 9 fruits gives the mean number of seeds as 24 with a standard deviation of 6.12. (a) What are the confidence limits for the sample mean? (b) Would you reject or accept the null hypothesis at 95% confidence level? (DEC 2014) (1) (a) 18 and 30, (b) reject the hypothesis (2) (a) 20 and 28, (b) reject the hypothesis (3) (a) 20 and 28, (b) accept the hypothesis (4) (a) 18 and 30, (b) accept the hypothesis
  • 46. ERRORS AND LEVEL OF SIGNIFICANCE
  • 47. Null Hypothesis: • The null hypothesis, H0, is the commonly accepted fact. • It is the opposite of the alternate hypothesis (H1). • Researchers work to reject, nullify or disprove the null hypothesis. • Researchers come up with an alternate hypothesis, one that they think explains a phenomenon, and then work to reject the null hypothesis. Example: Null hypothesis, H0: Corona virus will not be effective in causing disease at high temperature. Alternate hypothesis, H1: Corona virus can cause disease even at high temperature.
  • 48. Hypothesis Testing • Hypothesis: an explanation based on limited evidence • No hypothesis is 100% true, unless proven • There is always a chance of drawing incorrect conclusions (errors)
  • 49. Two types of errors in testing of hypothesis: 1. Type I error: Rejection of a null hypothesis which is true. 2. Type II error: Acceptance of a null hypothesis which is false.
                | Reject H0        | Accept H0
    H0 is True  | Type I error     | Correct decision
    H0 is False | Correct decision | Type II error
  • 51. Significance level (p-value): • The level of significance (p-value) is used in hypothesis testing to help you decide whether to reject the null hypothesis. • The p-value measures the evidence against a null hypothesis: it is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true. • The smaller the p-value, the stronger the evidence that you should reject the null hypothesis. • Thus, the p-value is read as the probability of committing a Type I error if the null hypothesis is rejected on the basis of the observed result. • The smaller the p-value, the lesser the probability of the error (Type I) of rejecting a true null hypothesis.
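The link between a 0.05 significance level and the Type I error can be illustrated by simulation: when the null hypothesis is really true, a test at the 0.05 level should reject it in about 5% of repeated samples. The sketch below assumes a two-sided Z test with known standard deviation; the values 200, 24 and 36 are illustrative assumptions, and the seed and number of trials are arbitrary.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(2024)                      # fixed seed only for repeatability
alpha = 0.05
z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

mu0, sigma, n, trials = 200.0, 24.0, 36, 4000
false_rejections = 0

for _ in range(trials):
    # Draw a sample from a population in which H0 (mean = mu0) is true.
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    sample_mean = sum(sample) / n
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    if abs(z) > z_critical:
        false_rejections += 1                  # H0 rejected although it is true

type_i_rate = false_rejections / trials
print(f"Observed Type I error rate: {type_i_rate:.3f}")
```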
  • 52. Significance level (p): error for rejection of a true null hypothesis. If p > 0.10 → “not significant”; if p ≤ 0.10 → “marginally significant”; if p ≤ 0.05 → “significant”; if p ≤ 0.01 → “highly significant”. p ≤ 0.05: this means that the probability of committing a Type I error is at most 5%, so the alternative hypothesis is accepted with 95% confidence.
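The p-value thresholds above (0.10, 0.05 and 0.01) can be written as a small lookup. The function name `significance_label` is a hypothetical choice, and the strictest threshold is checked first so that each p-value receives only one label.

```python
def significance_label(p: float) -> str:
    """Map a p-value to the conventional label used for reporting."""
    if p <= 0.01:
        return "highly significant"
    if p <= 0.05:
        return "significant"
    if p <= 0.10:
        return "marginally significant"
    return "not significant"

for p in (0.20, 0.08, 0.03, 0.004):
    print(f"p = {p}: {significance_label(p)}")
```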
  • 53. IFAS ONLINE Apply Your Mind In the following statement taken from a research paper, what does p in the parenthesis stand for? (DEC 2011) “The mean temperature of this region now is significantly higher than the one 50 years ago (p<0.05, t-test)” (1) Ratio of the mean temperature of the two time periods tested (2) Probability of the error of rejecting a true null hypothesis (3) Probability of the error of accepting a false null hypothesis (4) Probability of the t-test being effective in detecting significant difference in the mean annual temperatures of the two periods