A short introduction to sample size estimation for the Research Methodology workshop at Dr. BVP RMC, Pravara Institute of Medical Sciences (DU), Loni, by Dr. Mandar Baviskar
Sample size Calculation:
Objectives:
Calculate the sample size appropriate to the type and purpose of the research.
Identify and select software to calculate the sample size appropriate to the type and purpose of the research.
Why calculate sample size?
To show that under certain conditions, the hypothesis test has a good chance of showing a desired difference (if it exists)
To show to the IRB committee and funding agency that the study has a reasonable chance to obtain a conclusive result
To show that the necessary resources (human, monetary, time) will be minimized and well utilized.
Most Important: sample size calculation is an educated guess
It is more appropriate for studies involving hypothesis testing
There is no magic involved; only statistical and mathematical logic and some algebra
Researchers need to know something about what they are measuring and how it varies in the population of interest.
SAMPLE SIZE:
How many subjects are needed to assure a given probability of detecting a statistically significant effect of a given magnitude if one truly exists?
POWER:
If a limited pool of subjects is available, what is the likelihood of finding a statistically significant effect of a given magnitude if one truly exists?
Before We Can Determine Sample Size We Need To Answer The Following:
1. What is the primary objective of the study?
2. What is the main outcome measure?
Is it a continuous or dichotomous outcome?
3. How will the data be analyzed to detect a group difference?
4. How small a difference is clinically important to detect?
5. How much variability is in our target population?
6. What are the desired α and β?
7. What is the anticipated drop out and non-response % ?
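The answer to question 7 is usually applied as a final inflation step. A minimal sketch, assuming the common convention of dividing the calculated sample size by (1 - dropout rate); the function name `inflate_for_dropout` and the example figures are illustrative and not from the workshop material:

```python
import math


def inflate_for_dropout(n_required: int, dropout_rate: float) -> int:
    """Inflate a calculated sample size for anticipated drop-out / non-response.

    Assumes the common adjustment n_adjusted = n / (1 - dropout_rate),
    rounded up to the next whole subject.
    """
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in the range [0, 1)")
    return math.ceil(n_required / (1 - dropout_rate))


# Hypothetical figures: 100 subjects needed, 15% expected drop-out.
print(inflate_for_dropout(100, 0.15))
```

With these hypothetical inputs, 100 / 0.85 is about 117.6, so 118 subjects would be recruited.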
Where do we get this knowledge?
Previous published studies
Pilot studies
If information is lacking, there is no good way to calculate the sample size.
Type I error: Rejecting H0 when H0 is true
α: The type I error rate.
Type II error: Failing to reject H0 when H0 is false
β: The type II error rate.
Power (1 - β): Probability of detecting a group difference, given the size of the effect and the sample size of the trial (N).
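The link between α, β, effect size and N can be sketched for one common case. The example below assumes a two-sided comparison of two group means with equal group sizes, a known common standard deviation, and the normal approximation; the function names, the symbols `delta` and `sigma`, and the numbers are illustrative assumptions, not part of the original slides:

```python
import math
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def n_per_group_two_means(delta, sigma, alpha=0.05, power=0.80):
    """Subjects per group to detect a mean difference `delta`.

    Normal-approximation formula (an assumption of this sketch):
        n = 2 * ((z_{1-alpha/2} + z_{1-beta}) * sigma / delta) ** 2
    """
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    z_beta = _Z.inv_cdf(power)  # power = 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


def power_two_means(n_per_group, delta, sigma, alpha=0.05):
    """Approximate power when only `n_per_group` subjects are available.

    Ignores the negligible chance of rejecting H0 in the wrong direction.
    """
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    standard_error = sigma * math.sqrt(2 / n_per_group)
    return _Z.cdf(delta / standard_error - z_alpha)


# Hypothetical inputs: detect a 5-unit difference when the SD is 10.
n = n_per_group_two_means(delta=5, sigma=10)
print(n, round(power_two_means(n, delta=5, sigma=10), 3))
```

The first function answers the SAMPLE SIZE question (how many subjects for a given power) and the second answers the POWER question (what power a fixed pool of subjects gives).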
Estimation of Sample Size in Three Ways:
By using
(1) Formulae (manual calculations)
(2) Sample size tables or nomograms
(3) Software.
SAMPLE SIZE FOR ADEQUATE PRECISION:
In a descriptive study, the results are summary statistics (mean, proportion).
Their reliability (or precision) is reported by giving a “confidence interval”.
The wider the C.I., the less reliable the sample statistic, and the less likely it is to give an accurate estimate of the true value of the population parameter.
Sample size calculation for cross-sectional studies/surveys:
Cross-sectional studies or surveys are done to estimate a population parameter, such as the prevalence of a disease in a community or the average value of a quantitative variable in a population.
The sample size formulae for qualitative and quantitative variables are different.
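The two formulae are not given in the text above. As a hedged sketch, the commonly used versions are n = Z² p(1 - p) / d² for a proportion (qualitative variable) and n = Z² σ² / d² for a mean (quantitative variable), where d is the absolute precision. The function names and example values below are assumptions for illustration:

```python
import math
from statistics import NormalDist


def n_for_proportion(p, d, confidence=0.95):
    """Sample size to estimate a proportion p within +/- d (absolute).

    Assumed formula: n = Z^2 * p * (1 - p) / d^2
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)


def n_for_mean(sigma, d, confidence=0.95):
    """Sample size to estimate a mean within +/- d, given the SD sigma.

    Assumed formula: n = Z^2 * sigma^2 / d^2
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / d) ** 2)


# Hypothetical inputs: expected prevalence 50%, precision 5 percentage points.
print(n_for_proportion(p=0.5, d=0.05))
# Hypothetical inputs: SD of 15 units, precision 2 units.
print(n_for_mean(sigma=15, d=2))
```

The expected prevalence or standard deviation would come from previous published studies or a pilot study, as noted earlier.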
2. Model Performance Metrics
Model performance metrics are measurements used to evaluate the
effectiveness and efficiency of a predictive model or machine learning
algorithm.
The following metrics are used to evaluate the performance of a predictive model:
Accuracy
Precision
Recall (Sensitivity)
F1-Score
Confusion Matrix
ROC Curve and AUC
3. TP TN FP FN
A true positive is an outcome where the model correctly predicts the
positive class. Similarly, a true negative is an outcome where the
model correctly predicts the negative class.
A false positive is an outcome where the model incorrectly predicts
the positive class. And a false negative is an outcome where the
model incorrectly predicts the negative class.
                  | Predicted Positive  | Predicted Negative
Actual Positive   | True Positive (TP)  | False Negative (FN)
Actual Negative   | False Positive (FP) | True Negative (TN)
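The four outcomes can be counted directly from paired actual and predicted labels. A minimal sketch in plain Python; the helper name `confusion_counts` and the eight example labels are made up for illustration:

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN for a binary classifier."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)
```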
4. Accuracy
Accuracy: Accuracy is the ratio of the number of correct predictions and the
total number of predictions. It is calculated as-
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Accuracy is useful in binary classification with balanced classes. It can also be
used to evaluate a multiclass classification model when the classes are balanced.
When classes in the dataset are highly imbalanced, meaning there is a
significant disparity in the number of instances between classes, accuracy can
be misleading. A model may achieve high accuracy by simply predicting the
majority class for every instance, ignoring the minority class entirely.
5. Example
Let's consider a medical diagnosis scenario where we are developing a model
to predict whether a patient has a rare disease or not. Suppose we have a
dataset of 100 patients, out of which only 2 have the disease. This dataset
represents a highly imbalanced scenario.
Let's say we develop a simple classifier that always predicts that a patient
does not have the disease. It is correct for the 98 healthy patients, so its
accuracy is 98 / 100 = 98%. Despite this high accuracy, the classifier is not
useful because it fails to identify any of the patients with the disease.
In such cases, evaluation metrics like precision, recall, or F1-score provide
more insightful information about the model's performance, especially
concerning its ability to correctly identify the minority class (patients with
the disease).
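The rare-disease example can be reproduced in a few lines. This sketch builds the 100-patient dataset described above (2 with the disease) and scores the always-negative classifier; the variable names are illustrative:

```python
# 1 = has the disease (positive class), 0 = disease-free (negative class).
y_true = [1] * 2 + [0] * 98
y_pred = [0] * 100  # classifier that always predicts "no disease"

correct = sum(actual == predicted for actual, predicted in zip(y_true, y_pred))
accuracy = correct / len(y_true)

tp = sum(a == 1 and p == 1 for a, p in zip(y_true, y_pred))
fn = sum(a == 1 and p == 0 for a, p in zip(y_true, y_pred))
recall = tp / (tp + fn)  # share of diseased patients that were detected

print(f"accuracy = {accuracy:.2f}, recall = {recall:.2f}")
```

Accuracy comes out at 0.98 while recall is 0, because neither of the two diseased patients is detected. Precision is undefined here (0 / 0), since the classifier makes no positive predictions at all.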
6. Precision
Precision: Precision is a measure of a model’s performance that tells us how
many of the positive predictions made by the model are actually correct. It is
calculated as-
Precision = TP / (TP + FP)
Precision is particularly useful in scenarios where the cost of false positives
is high.
Precision matters in music or video recommendation systems, e-commerce
websites, etc., where wrong results could lead to customer churn, which
harms the business.
It gives us insight into the model's ability to avoid false positives. A higher
precision indicates fewer false positives.
7. Example
• Suppose we have a dataset of 1000 emails, out of which 200 are spam
(positive class) and 800 are not spam (negative class). After training our
spam detection model, it predicts that 250 emails are spam.
• True Positives (TP): 150 (correctly identified spam emails)
• False Positives (FP): 100 (non-spam emails incorrectly classified as
spam)
• Using these numbers, let's calculate precision:
• Precision = 150 / (150 + 100) = 150 / 250 = 0.6
• So, the precision of the model is 0.6 or 60%. This means that out of all
the emails predicted as spam, 60% of them are actually spam.
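The same calculation as a small sketch in code, using the counts from the spam example. The helper name `precision` and its zero-division guard are choices made for this illustration:

```python
def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP); returns 0.0 if nothing was predicted positive."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


tp, fp = 150, 100  # counts from the spam example
print(tp + fp)     # 250 emails were predicted as spam
print(precision(tp, fp))

# The remaining cells follow from the dataset totals (200 spam, 800 non-spam).
fn = 200 - tp  # spam emails the model missed
tn = 800 - fp  # non-spam emails correctly left alone
print(fn, tn)
```

The derived counts (FN = 50, TN = 700) are not stated in the example but follow from its totals.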
8. Recall (Sensitivity)
Recall: Also known as sensitivity or true positive rate, recall measures the
proportion of true positive predictions among all actual positive instances in
the dataset. It is calculated as-
Recall = TP / (TP + FN).
Recall is particularly useful in scenarios where capturing all positive instances
is crucial, even if it means accepting a higher rate of false positives.
In medical diagnosis, missing a positive instance (false negative) can have
severe consequences for the patient's health or even lead to loss of life. High
recall ensures that the model identifies as many positive cases as possible,
reducing the likelihood of missing critical diagnoses.
It gives us insight into the model's ability to avoid false negatives, which are
cases where patients with the disease are incorrectly diagnosed as not having
it.
9. Example
• Suppose we have a dataset of 100 patients who were tested for a specific
disease, where 20 patients actually have the disease (positive class), and 80
patients do not have the disease (negative class).
• After training our diagnostic model, we obtain the following results:
• True Positives (TP): 15 (patients correctly diagnosed with the disease)
• False Positives (FP): 5 (patients incorrectly diagnosed with the disease)
• False Negatives (FN): 5 (patients with the disease incorrectly diagnosed as not
having the disease)
• True Negatives (TN): 75 (patients correctly diagnosed as not having the
disease)
• Recall = 15 / (15 + 5) = 15 / 20 = 0.75
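As a sketch, the four counts from this example are enough to compute recall alongside the other metrics listed earlier. The F1 formula used here, 2 × precision × recall / (precision + recall), is the standard definition and is not spelled out in the slides; the variable names are illustrative:

```python
tp, fp, fn, tn = 15, 5, 5, 75  # counts from the patient example

recall = tp / (tp + fn)            # 15 / 20
precision = tp / (tp + fp)         # 15 / 20
accuracy = (tp + tn) / (tp + tn + fp + fn)
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(f"recall={recall}, precision={precision}, accuracy={accuracy}, f1={f1}")
```

Because precision and recall are both 0.75 in this example, the F1-score is also 0.75.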
10. Precision vs Recall
• Precision can be seen as a
measure of quality.
• Higher precision means that an
algorithm returns more relevant
results than irrelevant ones.
• Precision measures the accuracy
of positive predictions.
• Precision is important when the
cost of false positives is high.
(e.g. spam detection).
• Recall can be seen as a measure of
quantity.
• Higher recall means that an
algorithm returns most of the
relevant results (whether or not
irrelevant ones are also returned).
• Recall measures the completeness
of positive predictions.
• Recall is important when the cost
of false negatives is high (e.g.
disease diagnosis).
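One way to see the quality-versus-quantity contrast is to change the decision threshold of a model that outputs scores. The sketch below uses ten made-up scores and labels (entirely hypothetical, not from the slides): a strict threshold gives high precision and lower recall, while a lenient one gives the reverse.

```python
def precision_recall_at(threshold, scores, labels):
    """Precision and recall when scores >= threshold are predicted positive."""
    predicted = [score >= threshold for score in scores]
    tp = sum(p and y == 1 for p, y in zip(predicted, labels))
    fp = sum(p and y == 0 for p, y in zip(predicted, labels))
    fn = sum((not p) and y == 1 for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical model scores and true labels (1 = positive class).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

strict = precision_recall_at(0.75, scores, labels)   # few, confident positives
lenient = precision_recall_at(0.25, scores, labels)  # many positives
print("strict :", strict)
print("lenient:", lenient)
```

With these made-up numbers the strict threshold yields precision 1.0 and recall 0.6, and the lenient threshold yields precision 0.625 and recall 1.0.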