Sampling Bias
Dr.K.Prabhakar
Bias
• Once we collect the data, we represent it by way of a model.
Let us assume a linear model.
• This may be written as y (outcome) = a1x1 + a2x2 + a3x3 + … + anxn + error
• The outcome is expressed as a set of predictor variables (the x's)
multiplied by a set of coefficients (the a's in the equation). These
coefficients are the parameters, and they tell us about the relationship
between each predictor and the outcome variable.
• The prediction will not be perfect: there will be an error, because we
are using sample data to predict the outcome variable.
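The linear model above can be illustrated with a minimal sketch. It assumes a single predictor, no intercept (matching the equation as written), and made-up data generated from a hypothetical true coefficient of 2.0; none of these values come from the slides.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical "true" relationship (an assumption): y = 2.0 * x + error
true_a = 2.0
x = [float(i) for i in range(1, 31)]
y = [true_a * xi + random.gauss(0, 3) for xi in x]

# Least-squares estimate for a one-predictor model without intercept:
# a_hat = sum(x * y) / sum(x * x)
a_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# Residuals: the part of each outcome that the fitted model does not reproduce
residuals = [yi - a_hat * xi for xi, yi in zip(x, y)]

print(f"estimated coefficient: {a_hat:.3f} (true value {true_a})")
print(f"largest absolute residual: {max(abs(r) for r in residuals):.3f}")
```

The estimate lands close to, but not exactly on, the true coefficient, and the residuals are not zero. That is the error term the model carries because it is fitted to a sample.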
The contexts for bias
• Things that bias the parameter estimates
• Things that bias standard errors and confidence intervals
• Things that bias test statistics and p-values. These biases are related,
because confidence intervals and test statistics are typically both
computed from the standard error. If the test statistics are biased, then
the confidence intervals will be biased, and a bias in the confidence
intervals will bias the test statistics.
• If the test statistics are biased, then the results will be biased, so we
need to identify and eliminate the biases as much as possible.
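The link between these contexts can be sketched in a few lines. The function name, the estimate, and the two standard errors below are illustrative assumptions, and the sketch uses a normal approximation rather than any specific test from the slides.

```python
from statistics import NormalDist

def interval_and_test(estimate, standard_error, level=0.95):
    """Return a confidence interval, z statistic and two-sided p-value.

    All three are derived from the same standard error, so a biased
    standard error distorts all of them together.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    lower = estimate - z_crit * standard_error
    upper = estimate + z_crit * standard_error
    z_stat = estimate / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return (lower, upper), z_stat, p_value

# Same estimate, two hypothetical standard errors
ci_a, z_a, p_a = interval_and_test(2.0, 0.5)
ci_b, z_b, p_b = interval_and_test(2.0, 1.5)

print("SE = 0.5:", ci_a, z_a, p_a)
print("SE = 1.5:", ci_b, z_b, p_b)
```

With the larger standard error the interval widens until it includes zero and the test statistic shrinks, so a distorted standard error changes the conclusion even though the estimate itself is unchanged.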
Assumptions that lead to bias
1. Presence of outliers
2. Additivity and linearity
3. Normality
4. Homoscedasticity or homogeneity of variance
5. Independence
Outliers
• The presence of outliers in the data will bias the results.
• For example, if the class average is 60 marks and the standard deviation
is 10 marks, then a few students scoring zero or 100 marks may bias
the data.
• Outliers need to be identified and removed or replaced to have a
better representation of the data. They generally affect the mean of the
data as well as the sum of squared errors. The sum of squares is
used to compute the standard deviation, which in turn is used to
estimate the standard error. The standard error is used for confidence
intervals around the parameter estimates. Thus an outlier has a domino
effect on the results.
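The domino effect described above can be traced step by step. This is a sketch with made-up marks (a class whose mean is exactly 60, plus three hypothetical zero scores) and a normal-approximation interval using 1.96; a t-based interval would be more appropriate for samples this small.

```python
from math import sqrt
from statistics import mean, stdev

def summarize(marks):
    """Mean -> standard deviation -> standard error -> confidence interval."""
    n = len(marks)
    m = mean(marks)
    sd = stdev(marks)          # sample standard deviation (uses the sum of squares)
    se = sd / sqrt(n)          # standard error of the mean
    half_width = 1.96 * se     # approximate 95% interval (normal approximation)
    return {"mean": m, "sd": sd, "se": se, "ci": (m - half_width, m + half_width)}

# Hypothetical marks for illustration only
clean = [45.0, 50.0, 52.0, 55.0, 58.0, 60.0, 60.0, 62.0, 65.0, 68.0, 70.0, 75.0]
with_outliers = clean + [0.0, 0.0, 0.0]

for label, marks in (("clean", clean), ("with outliers", with_outliers)):
    s = summarize(marks)
    print(f"{label}: mean={s['mean']:.1f} sd={s['sd']:.2f} "
          f"se={s['se']:.2f} ci=({s['ci'][0]:.1f}, {s['ci'][1]:.1f})")
```

Three zero scores pull the mean down, inflate the standard deviation and the standard error, and widen the confidence interval, which is the chain the slide describes.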
Additivity and Linearity
• The assumption is that the outcome variable is linearly related to each
predictor. That means the relationship can be summed up as a
straight line.
• If there are several predictors, as we have seen in the equation
y (outcome) = a1x1 + a2x2 + a3x3 + … + anxn + error
their combined effect is described by adding their effects together.
If the assumption holds, the model can be described accurately by the
equation given here.
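Additivity can be checked numerically in a small sketch. The coefficients and the interaction term below are hypothetical values chosen for illustration; the second function is included only as a contrast, to show what a violation of additivity looks like.

```python
def predict_additive(x1, x2, a1=1.5, a2=-0.5):
    """Additive linear model: each predictor contributes independently."""
    return a1 * x1 + a2 * x2

def predict_interaction(x1, x2, a1=1.5, a2=-0.5, a3=0.25):
    """Contrast case: the x1 * x2 term makes the effects non-additive."""
    return a1 * x1 + a2 * x2 + a3 * x1 * x2

def effects(model):
    """Effect of raising x1 by 1, x2 by 1, and both together, from (2, 4)."""
    base = model(2, 4)
    effect_x1 = model(3, 4) - base
    effect_x2 = model(2, 5) - base
    combined = model(3, 5) - base
    return effect_x1, effect_x2, combined

add_x1, add_x2, add_both = effects(predict_additive)
int_x1, int_x2, int_both = effects(predict_interaction)

print("additive:   ", add_x1 + add_x2, "vs combined", add_both)
print("interaction:", int_x1 + int_x2, "vs combined", int_both)
```

In the additive model the combined effect equals the sum of the separate effects; with the interaction term it does not, so the simple equation would no longer describe the data accurately.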
Assumption of Normality
• There is a mistaken belief that the assumption of normality means the data
need to be normally distributed. This misconception stems from the fact that if
the data are normally distributed, then the errors in the model and the sampling
distribution are also normally distributed.
• The central limit theorem means that there are a variety of situations in which
we can assume normality of the sampling distribution regardless of the shape of
the sample data.
• Normality matters when you construct confidence intervals around parameters of
the model or compute significance tests relating to those parameters, and even
then it matters mainly in small samples.
• As long as the sample size is fairly large and outliers are taken into account,
the assumption of normality will not be a pressing concern.
• Lumley, T., Diehr, P., Emerson, S., & Chen, L. (2002). The importance of
the normality assumption in large public health data sets. Annual Review of
Public Health, 23(1), 151-169.
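The central limit theorem point in this section can be illustrated with a small simulation. The choices here are assumptions made for the sketch: exponentially distributed data as the skewed example, samples of size 50, 2000 repetitions, and skewness as a rough measure of departure from a symmetric, normal-like shape.

```python
import random
from statistics import fmean, pstdev

def skewness(values):
    """Standardized third moment; close to 0 for a symmetric distribution."""
    values = list(values)
    m = fmean(values)
    s = pstdev(values)
    return fmean(((v - m) / s) ** 3 for v in values)

random.seed(1)  # fixed seed so the sketch is reproducible

# Raw data from a strongly right-skewed distribution
raw = [random.expovariate(1.0) for _ in range(5000)]

# Sampling distribution of the mean: many samples of size 50, one mean each
sample_means = [
    fmean(random.expovariate(1.0) for _ in range(50))
    for _ in range(2000)
]

skew_raw = skewness(raw)
skew_means = skewness(sample_means)

print(f"skewness of raw data:     {skew_raw:.2f}")
print(f"skewness of sample means: {skew_means:.2f}")
```

The raw data are clearly skewed, while the sample means are far more symmetric. That is why the normality assumption concerns the sampling distribution rather than the raw data, and why it becomes less pressing as the sample size grows.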
Homoscedasticity or homogeneity of variance

Bias in Research Methods
