This document summarizes four key assumptions that should be tested in multiple regression analysis: normality, linearity, reliability of measurement, and homoscedasticity. It discusses how violating these assumptions can lead to inefficient or biased results. Researchers are encouraged to check for normality of variables, linear relationships between variables, reliability of measurement tools, and equal variance of errors. Techniques like residual plots and transformations are mentioned as ways to test the assumptions. The document emphasizes that while methods exist to address issues like non-normality, they may inadvertently change the data or relationships in problematic ways.
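The assumption checks described above can be sketched in a few lines. This is a minimal illustration on synthetic data: the variable names, the Shapiro-Wilk normality test, and the Spearman-correlation check for heteroscedasticity are illustrative choices, not the only (or necessarily the best) diagnostics.

```python
import numpy as np
from scipy import stats

# Synthetic data for illustration: linear and homoscedastic by construction.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, 200)

# Fit a simple linear regression and compute residuals.
slope, intercept = np.polyfit(x, y, 1)
fitted = slope * x + intercept
residuals = y - fitted

# Normality of residuals: Shapiro-Wilk (a large p-value gives no evidence
# of non-normality; it does not prove normality).
w_stat, norm_p = stats.shapiro(residuals)

# Rough homoscedasticity check: |residuals| should be uncorrelated
# with the fitted values if error variance is constant.
rho, het_p = stats.spearmanr(np.abs(residuals), fitted)

print(f"slope={slope:.2f}, Shapiro p={norm_p:.3f}, heteroscedasticity p={het_p:.3f}")
```

In practice one would also plot residuals against fitted values; the numeric checks above complement, rather than replace, visual inspection.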
Controversy Over the Significance Test Controversy
Deborah Mayo (Professor of Philosophy, Virginia Tech, Blacksburg, Virginia) in PSA 2016 Symposium: Philosophy of Statistics in the Age of Big Data and Replication Crises
D. Mayo: Replication Research Under an Error Statistical Philosophy
D. Mayo (Virginia Tech) slides from her talk June 3 at the "Preconference Workshop on Replication in the Sciences" at the 2015 Society for Philosophy and Psychology meeting.
Exploratory Research is More Reliable Than Confirmatory Research
PSA 2016 Symposium:
Philosophy of Statistics in the Age of Big Data and Replication Crises
Presenter: Clark Glymour (Alumni University Professor in Philosophy, Carnegie Mellon University, Pittsburgh, Pennsylvania)
ABSTRACT: Ioannidis (2005) argued that most published research is false, and that “exploratory” research in which many hypotheses are assessed automatically is especially likely to produce false positive relations. Colquhoun (2014), using simulations, estimates that 30 to 40% of positive results using the conventional .05 cutoff for rejection of a null hypothesis are false. Their explanation is that true relationships in a domain are rare and the selection of hypotheses to test is roughly independent of their truth, so most relationships tested will in fact be false. Conventional use of hypothesis tests, in other words, suffers from a base rate fallacy. I will show that the reverse is true for modern search methods for causal relations because: a. each hypothesis is tested or assessed multiple times; b. the methods are biased against positive results; c. systems in which true relationships are rare are an advantage for these methods. I will substantiate the claim with both empirical data and with simulations of data from systems with a thousand to a million variables that result in fewer than 5% false positive relationships and in which 90% or more of the true relationships are recovered.
D. Mayo: The Science Wars and the Statistics Wars: scientism, popular statist...
I will explore the extent to which concerns about ‘scientism’ – an unwarranted obeisance to scientific over other methods of inquiry – are intertwined with issues in the foundations of the statistical data analyses on which (social, behavioral, medical and physical) science increasingly depends. The rise of big data, machine learning, and high-powered computer programs have extended statistical methods and modeling across the landscape of science, law and evidence-based policy, but this has been accompanied by enormous hand wringing as to the reliability, replicability, and valid use of statistics. Legitimate criticisms of scientism often stem from insufficiently self-critical uses of statistical methodology, broadly construed, i.e., from what might be called “statisticism”, particularly when those methods are applied to matters of controversy.
Statistical skepticism: How to use significance tests effectively
Prof. D. Mayo, presentation Oct. 12, 2017 at the ASA Symposium on Statistical Inference: “A World Beyond p < .05”, in the session “What are the best uses for P-values?”
A. Gelman "50 shades of gray: A research story," presented May 23 at the session on "The Philosophy of Statistics: Bayesianism, Frequentism and the Nature of Inference," 2015 APS Annual Convention in NYC.
Abstract: Mounting failures of replication in the social and biological sciences give a practical spin to statistical foundations in the form of the question: How can we attain reliability when methods make illicit cherry-picking and significance seeking so easy? Researchers, professional societies, and journals are increasingly getting serious about methodological reforms to restore scientific integrity – some are quite welcome (e.g., pre-registration), while others are quite radical. The American Statistical Association convened members from differing tribes of frequentists, Bayesians, and likelihoodists to codify misuses of P-values. Largely overlooked are the philosophical presuppositions of both criticisms and proposed reforms. Paradoxically, alternative replacement methods may enable rather than reveal illicit inferences due to cherry-picking, multiple testing, and other biasing selection effects. Crowd-sourced reproducibility research in psychology is helping to change the reward structure but has its own shortcomings. Focusing on purely statistical considerations, it tends to overlook problems with artificial experiments. Without a better understanding of the philosophical issues, we can expect the latest reforms to fail.
Surrogate Science: How Fisher, Neyman-Pearson, and Bayes Were Transformed int...
Gerd Gigerenzer (Director of Max Planck Institute for Human Development, Berlin, Germany) in the PSA 2016 Symposium: Philosophy of Statistics in the Age of Big Data and Replication Crises
Severe Testing: The Key to Error Correction
D. G. Mayo's slides for her presentation given March 17, 2017 at the Boston Colloquium for Philosophy of Science, Alfred I. Taub forum: "Understanding Reproducibility & Error Correction in Science"
Deborah G. Mayo: Is the Philosophy of Probabilism an Obstacle to Statistical Fraud Busting?
Presentation slides for: Revisiting the Foundations of Statistics in the Era of Big Data: Scaling Up to Meet the Challenge[*] at the Boston Colloquium for Philosophy of Science (Feb 21, 2014).
D. G. Mayo (Virginia Tech) "Error Statistical Control: Forfeit at your Peril" presented May 23 at the session on "The Philosophy of Statistics: Bayesianism, Frequentism and the Nature of Inference," 2015 APS Annual Convention in NYC.
Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance
Slides from Rutgers Seminar talk by Deborah G Mayo
December 3, 2014
Rutgers, Department of Statistics and Biostatistics
Abstract: Getting beyond today’s most pressing controversies revolving around statistical methods, I argue, requires scrutinizing their underlying statistical philosophies. Two main philosophies about the roles of probability in statistical inference are probabilism and performance (in the long-run). The first assumes that we need a method of assigning probabilities to hypotheses; the second assumes that the main function of statistical method is to control long-run performance. I offer a third goal: controlling and evaluating the probativeness of methods. An inductive inference, in this conception, takes the form of inferring hypotheses to the extent that they have been well or severely tested. A report of poorly tested claims must also be part of an adequate inference. I develop a statistical philosophy in which error probabilities of methods may be used to evaluate and control the stringency or severity of tests. I then show how the “severe testing” philosophy clarifies and avoids familiar criticisms and abuses of significance tests and cognate methods (e.g., confidence intervals). Severity may be threatened in three main ways: fallacies of statistical tests, unwarranted links between statistical and substantive claims, and violations of model assumptions.
These slides were presented on November 22, 2016 during the Annual Julius Symposium, organised by the Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht.
Only a few months ago, the American Statistical Association authoritatively issued an official statement on significance and p-values (American Statistician, 2016, 70:2, 129-133), claiming that the p-value is “commonly misused and misinterpreted.”
In this presentation I focus on the principles of the ASA statement.
Replication Crises and the Statistics Wars: Hidden Controversies
D. Mayo presentation at the X-Phil conference on "Reproducibility and Replicability in Psychology and Experimental Philosophy", University College London (June 14, 2018)
D. G. Mayo: Your data-driven claims must still be probed severely
In the session "Philosophy of Science and the New Paradigm of Data-Driven Science" at the American Statistical Association Conference on Statistical Learning and Data Science/Nonparametric Statistics
Hypothesis Testing. Inferential Statistics pt. 2
A hypothesis test is a statistical test that is used to determine whether there is enough evidence in a sample of data to infer that a certain condition is true for the entire population. A hypothesis test examines two opposing hypotheses about a population: the null hypothesis and the alternative hypothesis.
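A test of the kind just described can be sketched with a standard two-sample t-test. The data here are synthetic and the group names are illustrative assumptions; the point is only the mechanics of opposing a null hypothesis (equal population means) to an alternative (different means).

```python
import numpy as np
from scipy import stats

# Hypothetical sample data: scores for two groups, drawn (for illustration)
# from populations whose means actually differ.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=100, scale=15, size=50)
group_b = rng.normal(loc=110, scale=15, size=50)

# H0: the two population means are equal; H1: they differ.
t_stat, p_value = stats.ttest_ind(group_a, group_b)

alpha = 0.05  # conventional significance level
if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject the null hypothesis")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject the null hypothesis")
```

Note that the decision is about the population, made from sample evidence: a small p-value indicates the observed difference would be unlikely if the null hypothesis were true, not that the alternative is certainly correct.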
Cellular telephone workshop given at NECC 2009 in Washington DC by Vicki Davis, classroom teacher and former General Manager for a cellular telephone market.
SPSS presentation. Topics include general concepts of statistics, basic concepts of SPSS, variables and their types, data and its types, sources of data, the four windows of SPSS, the viewer window, and output results.
Lesson 2 Statistics Benefits, Risks, and MeasurementsAssignmen.docx
Lesson 2: Statistics: Benefits, Risks, and Measurements
Assignments
· See your Course Syllabus for the reading assignments.
· Work through the Lesson 2 online notes that follow.
· Complete the Practice Questions and Lesson 2 Assignment.
Learning Objectives
Chapters 1 and 3
After successfully completing this lesson, you should be able to:
· Identify the three conditions needed to conduct a proper study.
· Apply the seven pitfalls that can be encountered when asking questions in a survey.
· Distinguish between measurement variables and categorical variables.
· Distinguish between continuous variables and discrete variables for those that are measurement variables.
· Distinguish between validity, reliability, and bias.
Terms to Know
From Chapter 1
· statistics
· population
· sample
· observational study
· experiment
· selection bias
· nonresponse bias
From Chapter 3
· data (variable)
· categorical variables
· measurement variables
· measurement (discrete) variables
· measurement (continuous) variables
· validity
· reliability
· bias
2.1 What is Statistics?
Section 2.1. Chapter 1
Overview
What is statistics? If you think statistics is just another math course with many formulas and lifeless numbers, you are not alone. However, this is a myth that hopefully will be debunked as you work through this course. Statistics is about data. More precisely, statistics is a collection of procedures and principles for gaining and processing information from collected data. Knowing these principles and procedures will help you make intelligent decisions in everyday life when faced with uncertainty. The following examples are meant to illuminate the definition of statistics.
Example 2.1. Angry Women
Who are those angry women? (Streitfield, D., 1988 and Wallis, 1987.) In 1987, Shere Hite published a best-selling book called Women and Love: A Cultural Revolution in Progress. This 7-year research project produced a controversial 922-page publication that summarized the results from a survey that was designed to examine how American women feel about their relationships with men. Hite mailed out 100,000 fifteen-page questionnaires to women who were members of a wide variety of organizations across the U.S. These organizations included church, political, volunteer, senior citizen, and counseling groups, among many others. Questionnaires were actually sent to the leader of each organization. The leader was asked to distribute questionnaires to all members. Each questionnaire contained 127 open-ended questions with many parts and follow-ups. Part of Hite’s directions read as follows: “Feel free to skip around and answer only those questions you choose.” Approximately 4500 questionnaires were returned. Below are a few statements from this 1987 publication.
· 84% of women are not emotionally satisfied with their relationships
· 95% of women reported emotional and psychological harassment from their partners
· 70% of women married 5 years or more are having extramarital ...
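Before taking these figures at face value, it is worth computing the survey's response rate from the numbers given in the example; with roughly 95% of recipients never responding, nonresponse bias (a term from Chapter 1 above) is a serious concern.

```python
# Response rate for the Hite survey, using the figures from the text.
mailed = 100_000
returned = 4_500

response_rate = returned / mailed
print(f"response rate: {response_rate:.1%}")  # 4.5%
```

Women who felt strongly enough to complete a 127-question survey may differ systematically from those who did not respond, so the reported percentages may not describe American women in general.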
7 HYPOTHETICALS AND YOU TESTING YOUR QUESTIONS7 MEDIA LIBRARY.docx
7 HYPOTHETICALS AND YOU TESTING YOUR QUESTIONS
7: MEDIA LIBRARY
Premium Videos
Core Concepts in Stats Video
· Probability and Hypothesis Testing
Lightboard Lecture Video
· Hypothesis Testing
Difficulty Scale
(don’t plan on going out tonight)
WHAT YOU WILL LEARN IN THIS CHAPTER
· Understanding the difference between a sample and a population
· Understanding the importance of the null and research hypotheses
· Using criteria to judge a good hypothesis
SO YOU WANT TO BE A SCIENTIST
You might have heard the term hypothesis used in other classes. You may even have had to formulate one for a research project you did for another class, or you may have read one or two in a journal article. If so, then you probably have a good idea what a hypothesis is. For those of you who are unfamiliar with this often-used term, a hypothesis is basically “an educated guess.” Its most important role is to reflect the general problem statement or question that was the motivation for asking the research question in the first place.
That’s why taking the care and time to formulate a really precise and clear research question is so important. This research question will guide your creation of a hypothesis, and in turn, the hypothesis will determine the techniques you will use to test it and answer the question that was originally asked.
So, a good hypothesis translates a problem statement or a research question into a format that makes it easier to examine. This format is called a hypothesis. We will talk about what makes a hypothesis a good one later in this chapter. Before that, let’s turn our attention to the difference between a sample and a population. This is an important distinction, because while hypotheses usually describe a population, hypothesis testing deals with a sample and then the results are generalized to the larger population. We also address the two main types of hypotheses (the null hypothesis and the research hypothesis). But first, let’s formally define some simple terms that we have used earlier in Statistics for People Who (Think They) Hate Statistics.
SAMPLES AND POPULATIONS
As a good scientist, you would like to be able to say that if Method A is better than Method B in your study, this is true forever and always and for all people in the universe, right? Indeed. And, if you do enough research on the relative merits of Methods A and B and test enough people, you may someday be able to say that.
But don’t get too excited, because it’s unlikely you will ever be able to speak with such confidence. It takes too much money ($$$) and too much time (all those people!) to do all that research, and besides, it’s not even necessary. Instead, you can just select a representative sample from the population and test your hypothesis about the relative merits of Methods A and B on that sample.
Given the constraints of never enough time and never enough research funds, with which almost all scientists live, the next best strategy is to take a portion of a lar.
BUS 308 Week 2 Lecture 1
Examining Differences - overview
Expected Outcomes
After reading this lecture, the student should be familiar with:
1. The importance of random sampling.
2. The meaning of statistical significance.
3. The basic approach to determining statistical significance.
4. The meaning of the null and alternate hypothesis statements.
5. The hypothesis testing process.
6. The purpose of the F-test and the T-test.
Overview

Last week we collected clues and evidence to help us answer our case question about males and females getting equal pay for equal work. As we looked at the clues presented by the salary and compa-ratio measures of pay, things got a bit confusing, with results that did not seem to be consistent. We found, among other things, that the male and female compa-ratios were fairly close together, with the female mean being slightly larger. The salary analysis showed a different view; here we noticed that the averages were apparently quite different, with the males, on average, earning more. Contradictory findings such as this are not all that uncommon when examining data in the “real world.”

One issue that we could not fully address last week was how meaningful the differences were. That is, would a different sample have results that might be completely different, or can we be fairly sure that the observed differences are real and show up in the population as well? This issue, often referred to as sampling error, deals with the fact that random samples taken from a population will generally be a bit different from the actual population parameters, but will be “close” enough to the actual values to be valuable in decision making.

This week, our journey takes us to ways to explore differences, and how significant these differences are. Just as clues in mysteries are not all equally useful, not all differences are equally important; one of the best things statistics will do for us is tell us which differences we should pay attention to and which we can safely ignore.

Side note: this is a skill that many managers could benefit from. Not all differences in performance from one period to another are caused by intentional employee actions; some are due to random variations that employees have no control over. Knowing which differences to react to would make managers much more effective.

In keeping with our detective theme, this week could be considered the introduction of the crime scene experts who help detectives interpret what the physical evidence means and how it can relate to the crime being looked at. We are getting into the support being offered by experts who interpret details. We need to know how to use these experts to our fullest advantage. 😊
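The sampling-error idea discussed in the overview can be demonstrated with a quick simulation. The "population" of salaries below is entirely synthetic (the size, mean, and spread are assumptions for illustration); the point is that repeated random samples give means that vary, but cluster close to the population mean.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population of 10,000 salaries (illustrative numbers only).
population = rng.normal(loc=50_000, scale=8_000, size=10_000)
pop_mean = population.mean()

# Draw 500 random samples of 100 employees each and record each sample mean.
sample_means = [rng.choice(population, size=100, replace=False).mean()
                for _ in range(500)]

# Individual sample means differ from the population mean (sampling error),
# but their spread is much smaller than the spread of individual salaries.
spread = float(np.std(sample_means))
print(f"population mean: {pop_mean:.0f}")
print(f"sample means range: {min(sample_means):.0f} to {max(sample_means):.0f}")
print(f"std. dev. of sample means: {spread:.0f}")
```

The standard deviation of the sample means is roughly the population standard deviation divided by the square root of the sample size, which is why larger samples give more trustworthy estimates.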
Differences

In general, differences exist in virtually everything we measure that is man-made or influenced. The underlying issue in statistical analysis is that at times differences are important. When measu.
BUS 308 Week 2 Lecture 1
Examining Differences - overview
Expected Outcomes
After reading this lecture, the student should be familiar with:
1. The importance of random sampling.
2. The meaning of statistical significance.
3. The basic approach to determining statistical significance.
4. The meaning of the null and alternate hypothesis statements.
5. The hypothesis testing process.
6. The purpose of the F-test and the T-test.
Overview
Last week we collected clues and evidence to help us answer our case question about
males and females getting equal pay for equal work. As we looked at the clues presented by the
salary and comp-ratio measures of pay, things got a bit confusing with results that did not see to
be consistent. We found, among other things, that the male and female compa-ratios were fairly
close together with the female mean being slightly larger. The salary analysis showed a different
view; here we noticed that the averages were apparently quite different with the males, on
average, earning more. Contradictory findings such as this are not all that uncommon when
examining data in the “real world.”
One issue that we could not fully address last week was how meaningful the differences
were. That is, would a different sample have results that might be completely different, or
can we be fairly sure that the observed differences are real and show up in the population as
well? This issue, often referred to as sampling error, deals with the fact that random samples
taken from a population will generally be a bit different from the actual population parameters,
but will be “close” enough to the actual values to be valuable in decision making.
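Sampling error is easy to see with a quick simulation. The sketch below uses entirely made-up numbers (a hypothetical "salary" population) to show that each random sample's mean differs a bit from the population mean, yet stays close to it:

```python
# Illustrative sketch (hypothetical numbers): repeated random samples from the
# same population produce sample means that vary around the population mean.
import random

random.seed(42)
population = [random.gauss(50_000, 8_000) for _ in range(10_000)]  # "salaries"
pop_mean = sum(population) / len(population)

sample_means = []
for _ in range(5):
    sample = random.sample(population, 50)  # a random sample of 50 "employees"
    sample_means.append(sum(sample) / len(sample))

print(f"population mean: {pop_mean:,.0f}")
for m in sample_means:
    print(f"sample mean:     {m:,.0f}  (off by {m - pop_mean:+,.0f})")
```

Each sample mean lands near, but not exactly on, the population mean; that gap is sampling error.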
This week, our journey takes us to ways to explore differences, and how significant these
differences are. Just as clues in mysteries are not all equally useful, not all differences are
equally important; and one of the best things statistics will do for us is tell us what differences
we should pay attention to and what we can safely ignore.
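The workhorse tool for deciding which differences deserve attention is a significance test. Here is a minimal sketch of a two-sample t-test on hypothetical salary data (the numbers and the use of SciPy are my own illustration, not part of the course materials):

```python
# Hypothetical two-sample t-test: is the difference between two group means
# bigger than random sampling variation would explain? (Numbers are made up.)
from scipy import stats

group_a = [52, 61, 58, 66, 71, 55, 63, 60, 68, 59]  # salaries in $000s
group_b = [48, 55, 51, 57, 60, 49, 54, 52, 58, 50]

t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)  # Welch's t-test
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Difference is statistically significant at the 0.05 level.")
else:
    print("Difference could plausibly be due to sampling error.")
```

A small p-value says the observed gap would rarely arise from sampling error alone, so it is a difference worth paying attention to.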
Side note: this is a skill that many managers could benefit from. Not all differences in
performance from one period to another are caused by intentional employee actions; some are
due to random variations that employees have no control over. Knowing which differences to
react to would make managers much more effective.
In keeping with our detective theme, this week could be considered the introduction of
the crime scene experts who help detectives interpret what the physical evidence means and how
it can relate to the crime being looked at. We are getting into the support being offered by
experts who interpret details. We need to know how to use these experts to our fullest
advantage. 😊😊
Differences
In general, differences exist in virtually everything we measure that is man-made or
influenced. The underlying issue in statistical analysis is that at times differences are important.
When measu.
Statistics is a powerful tool for both researchers and decision makers; yet there remain many misuses, misinterpretations, and misrepresentations of statistics. This seminar aims at raising awareness of common misconceptions about statistics in the social sciences and beyond (e.g., in the media and among readers). I do not own the copyrights to the materials in this presentation; the sources are given at the bottom of each slide on which I borrowed figures from other sources.
LEARNING OBJECTIVES
· Explain how researchers use inferential statistics to evaluate sample data.
· Distinguish between the null hypothesis and the research hypothesis.
· Discuss probability in statistical inference, including the meaning of statistical significance.
· Describe the t test and explain the difference between one-tailed and two-tailed tests.
· Describe the F test, including systematic variance and error variance.
· Describe what a confidence interval tells you about your data.
· Distinguish between Type I and Type II errors.
· Discuss the factors that influence the probability of a Type II error.
· Discuss the reasons a researcher may obtain nonsignificant results.
· Define power of a statistical test.
· Describe the criteria for selecting an appropriate statistical test.
IN THE PREVIOUS CHAPTER, WE EXAMINED WAYS OF DESCRIBING THE RESULTS OF A STUDY USING DESCRIPTIVE STATISTICS AND A VARIETY OF GRAPHING TECHNIQUES. In addition to descriptive statistics, researchers use inferential statistics to draw more general conclusions about their data. In short, inferential statistics allow researchers to (a) assess just how confident they are that their results reflect what is true in the larger population and (b) assess the likelihood that their findings would still occur if their study was repeated over and over. In this chapter, we examine methods for doing so.
SAMPLES AND POPULATIONS
Inferential statistics are necessary because the results of a given study are based only on data obtained from a single sample of research participants. Researchers rarely, if ever, study entire populations; their findings are based on sample data. In addition to describing the sample data, we want to make statements about populations. Would the results hold up if the experiment were conducted repeatedly, each time with a new sample?
In the hypothetical experiment described in Chapter 12 (see Table 12.1), mean aggression scores were obtained in model and no-model conditions. These means are different: Children who observe an aggressive model subsequently behave more aggressively than children who do not see the model. Inferential statistics are used to determine whether the results match what would happen if we were to conduct the experiment again and again with multiple samples. In essence, we are asking whether we can infer that the difference in the sample means shown in Table 12.1 reflects a true difference in the population means.
Recall our discussion of this issue in Chapter 7 on the topic of survey data. A sample of people in your state might tell you that 57% prefer the Democratic candidate for an office and that 43% favor the Republican candidate. The report then says that these results are accurate to within 3 percentage points, with a 95% confidence level. This means that the researchers are very (95%) confident that, if they were able to study the entire population rather than a sample, the actual percentage who preferred th ...
What aspects of personality does this tell me about?
There has been much research on how people describe others, and five major dimensions of human personality have been found. They are often referred to as the OCEAN model of personality,
because of the acronym from the names of the five dimensions. Here are your results:
Open-Mindedness
High scorers tend to be original, creative, curious, complex; Low scorers tend to be conventional, down to earth, narrow interests, uncreative.
You typically don't seek out new experiences.
(Your percentile: 54)
Conscientiousness
High scorers tend to be reliable, well-organized, self-disciplined, careful; Low scorers tend to be disorganized, undependable, negligent.
You are very well-organized, and can be relied upon.
(Your percentile: 98)
Extraversion
High scorers tend to be sociable, friendly, fun loving, talkative; Low scorers tend to be introverted, reserved, inhibited, quiet.
You are relatively social and enjoy the company of others.
(Your percentile: 73)
Agreeableness
High scorers tend to be good natured, sympathetic, forgiving, courteous; Low scorers tend to be critical, rude, harsh, callous.
You tend to consider the feelings of others.
(Your percentile: 68)
Negative Emotionality
High scorers tend to be nervous, high-strung, insecure, worrying; Low scorers tend to be calm, relaxed, secure, hardy.
You are generally relaxed.
(Your percentile: 22)
What is the “Big Five”?
Personality psychologists are interested in what differentiates one person from another and why we behave the way that we do. Personality research, like any science, relies on quantifiable concrete data which can be used to examine what people are like. This is where the Big Five plays an important role.
The Big Five was originally derived in the 1970's by two independent research teams -- Paul Costa and Robert McCrae (at the National Institutes of Health), and Warren Norman (at the University of Michigan)/Lewis Goldberg (at the University of Oregon) -- who took slightly different routes in arriving at the same results: most human personality traits can be boiled down to five broad dimensions of personality, regardless of language or culture. These five dimensions were derived by asking thousands of people hundreds of questions and then analyzing the data with a statistical procedure known as factor analysis. It is important to realize that the researchers did not set out to find five dimensions, but that five dimensions emerged from their analyses of the data. In scientific circles, the Big Five is now the most widely accepted and used model of personality.
Cross-Cultural PsychologyChapter 2 Methodology of Cross-Cult.docxannettsparrow
Cross-Cultural Psychology
Chapter 2
Methodology of Cross-Cultural Research
A blind man who sees is better than a seeing man who is blind.
Persian Proverb
Never believe on faith, see for yourself! What you yourself don’t learn, you don’t know.
Bertolt Brecht (1898–1956)—
Twentieth-Century German Playwright
Shiraev/Levy Cross-Cultural Psychology 5/e
Goals of Cross-Cultural Research
Imagine a researcher who wants to find similarities and differences between arranged marriages practiced in India and non-arranged marriages in the United States, and how they affect marital stability. What does the psychologist aim to pursue in this particular project?
First, the researcher wants to describe the findings of this research.
Then, when some differences between ethnic groups are found, the researcher tries to explain whether these factors affect stability.
The practical value of the study may be significant if it not only explains but also predicts the factors that should determine successful marital relationships in both studied groups.
Love marriages are like hot soup that cools over time; arranged marriages are like cold soup that warms up.
-Outsourced
“There is never a time or place for true love. It happens accidentally, in a heartbeat, in a single flashing, throbbing moment.”
― Sarah Dessen, The Truth About Forever
Different cultures and even people within these cultures have different perspectives on love and marriage.
Factors that Affect Marital Stability
What we aim to do as cultural psychologists is to describe, explain, and predict behavior.
Two strategies in cross-cultural research
Application-Oriented Strategy
Comparativist Strategy
The application-oriented strategy attempts to establish whether research findings obtained in one country apply to the culture of another. The comparativist strategy tries to find similarities and differences in a sampling of cultures.
Equivalence indicates evidence that the methods selected for the study measure the same phenomenon across the cultures chosen for the study.
Method A is used to study anxiety in France and Italy
Method B is used to study anxiety in India and Pakistan
The results will likely be incomparable due to the equivalence problem.
Consider a study that measures anxiety using a self-report survey in France versus a study that uses observation of a population and measures the number of anxiety-inducing instances in an Indian population. While they may attempt to measure the sa.
IMRaD format
An acronym for Introduction, Method, Results, and Discussion. The IMRaD format is a way of structuring a scientific article. It is often used in health care and the natural sciences. Unlike theses in the social sciences, the IMRaD format does not include a separate theory chapter.
This is a modified version of Master Class that Dr Siobhan O'Dwyer delivered at the Griffith University School of Nursing's Annual Research School for postgraduate students.
FOUR ASSUMPTIONS RESEARCHERS SHOULD TEST
Four Assumptions of Multiple Regression that Researchers Should Always Test
A Reference Paper Review
Jasmine K. Tamanaha
University of North Carolina – Charlotte
Author’s Note
This paper was prepared for Course Project, STAT 4123/5123 Applied Statistics I, taught by Dr. Shaoyu Li.
Abstract
We live in a world where results are key and numbers answer questions
and solidify answers. How many times have you thought to yourself, “show me
the numbers”? Even as a numbers person, I oftentimes find myself asking or
thinking the same thing; however, I also like to dig a little deeper and ask the
follow-up questions that never seem to get asked or answered: “WHERE did you
get your numbers?” and, likewise, “HOW did you come to that conclusion?” This
review of a reference paper speaks to those types of questions: it responds to
how faulty the numbers can be, to the four assumptions that the practicing researcher
needs to take into account (Osborne and Waters, 2002), to how to test these four
assumptions, and to how pertinent this information is to data analysis, more
specifically analysis in the social sciences. “If any of these assumptions is
violated… then the forecasts, confidence intervals, and scientific insights yielded
by a regression model may be (at best) inefficient or (at worst) seriously biased or
misleading” (Roberts, 2014).
“Essentially, all models are wrong, but some are useful” (Box, 1987). This may
be one of the most analyzed and discussed quotes among analysts. The first time I
heard it was in my Applied Statistics I class taught by Dr. Li, and the quote really
resonated with me. I investigated further, and it was not hard to find. After typing
bits and pieces of the quote into Google, it quickly auto-filled, and immediately my
page was flooded. Suddenly I was inundated with information about George E. P. Box,
questions and discussions of “what does this mean,” and much more. Personally, I have
since re-quoted this many times, particularly whenever somebody wants to talk numbers.
The further you progress in your statistical studies, the more you come to realize
that numbers are not as reliable as you were originally taught in grade school.
Osborne and Waters do a remarkable job in Four Assumptions of Multiple Regression
That Researchers Should Always Test (2002) of bringing some issues to light, in
particular highlighting four assumptions that fellow researchers and analysts need to
acknowledge:
1) Normality Assumptions
2) Linearity Assumptions
3) Reliability of Measurement Assumptions
4) Homoscedasticity Assumptions
Awareness and understanding of the importance of checking these assumptions in regression analysis should be, and needs to be, general knowledge.
Regression analysis assumes that the data variables have a normal distribution, but
what about cases of non-normality? Most people know that non-normality exists, and if
the name does not ring a bell, words like “outlier” and “skewed” are most definitely key
buzzwords that everyone has either used or heard. In statistics, substantial outliers and
highly skewed variables can completely change the relationships in the data, as well as
the results of significance tests. In statistics you learn many ways to spot non-normality,
such as normality plots, Q-Q plots, and Kolmogorov-Smirnov tests, to name a few. As a
result of finding non-normality, we are taught about “data cleaning” and using
transformations. However, by removing outliers we may be deleting key information that
may or may not be relevant to the test at hand; by adding more data we can put the data
at high risk of multicollinearity and Type I or Type II errors; and by doing
transformations we may be complicating the interpretation of the results. Basically, we
have learned ways to improve normality, and maybe even accuracy, but at what cost?
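As a rough illustration of the checks named above, here is a minimal sketch on simulated (hypothetical) data, using SciPy's `skew` and `shapiro` functions; a right-skewed variable is flagged while a roughly normal one is not:

```python
# Sketch: two quick normality checks -- a skewness measure and the Shapiro-Wilk
# test (a common companion to Kolmogorov-Smirnov for modest sample sizes).
# The data here are simulated for illustration only.
import random
from scipy import stats

random.seed(1)
normal_ish = [random.gauss(0, 1) for _ in range(200)]
skewed = [random.expovariate(1.0) for _ in range(200)]  # right-skewed

for name, data in [("normal-ish", normal_ish), ("skewed", skewed)]:
    skew = stats.skew(data)
    w, p = stats.shapiro(data)
    # p < 0.05 suggests rejecting the hypothesis that the data are normal
    print(f"{name:>10}: skewness = {skew:+.2f}, Shapiro-Wilk p = {p:.4f}")
```

The skewed sample shows large positive skewness and a tiny p-value, exactly the kind of signal that prompts the "data cleaning" and transformation decisions discussed here.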
Changing data has always been a topic of curiosity for me, because solely for analytical
purposes I have my ideal goals for meeting basic requirements such as p-values, z-tests,
t-tests, adjusted R-squared, the F-statistic, and the list goes on. Conversely, it makes
me want to shout, “YOU ARE STILL CHANGING DATA.” How am I supposed to trust any statistics
regurgitated by news anchors, salesmen, and advertisements without knowing the steps that
were taken to support their “90% Accuracy” or “5.8% Unemployment Drop”?
Overall, we assume that there is even a relationship between the dependent and
independent variables, and multiple regression can only accurately estimate the
relationship between these variables if the relationships are linear in
nature (Osborne & Waters, 2002). This presents the
question, “What about the social sciences?” Non-linear relationships commonly occur in the
social sciences, of which Osborne has in-depth working knowledge, particularly
in psychology and education. In the presence of non-linearity, the results will typically
underestimate the true relationship between the independent and dependent variables. Osborne
and Waters (2002), Pedhazur (1997), Cohen and Cohen (1983), and Berry and Feldman
(1985) discuss or suggest three primary ways to detect non-linearity (Osborne & Waters,
2002). The first is the use of theory, or using past analyses to educate oneself as well as
to supplement the current analysis. The second is examining residual plots, which are easily
and readily accessible. The third is detecting curvilinearity by adding squared or cubed terms.
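The squared-term check can be sketched in a few lines. This illustrative example (simulated data; the variable names and numbers are my own, not from the paper) fits y on x alone, then on x and x², and compares how much variance each model explains:

```python
# Sketch of the "squared term" check for curvilinearity: if adding x² to the
# model soaks up variance that a straight line misses, the relationship is
# probably not linear. Data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 200)
y = 2 + 0.5 * x + 1.5 * x**2 + rng.normal(0, 1, 200)  # truly curvilinear

def r_squared(design, y):
    """R-squared of an ordinary least-squares fit on the given design matrix."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return 1 - resid.var() / y.var()

ones = np.ones_like(x)
r2_linear = r_squared(np.column_stack([ones, x]), y)
r2_quad = r_squared(np.column_stack([ones, x, x**2]), y)
print(f"R² linear: {r2_linear:.3f}, R² with x² term: {r2_quad:.3f}")
```

The jump in R² when the squared term enters is the tell-tale sign of curvilinearity that the purely linear model would have underestimated.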
The three “primary” methods for detecting non-linearity are not fail-safe, and they still
pose many concerns, especially for the social sciences. Logically, the social sciences
have many variables that are not directly measurable. How exactly do you measure
your stress and anxiety levels? Unfortunately, humans were not built with gauges displaying
our bodies’ levels, although we have come up with ways to use observable factors to test our
stress levels. Those factors are obviously important to measurement, but there is often a
clear correlation among the factors, which again can lead to underestimation or
overestimation, all based on unreliable measurements. Every statistician’s goal is to
accurately model the “real” relationship; that is where Cronbach’s alpha comes into play,
mainly in the world of social-science analyses. Error estimates and reliability estimates
are just that, estimates, and are oftentimes assumed to be acceptable. There are
accepted methods for dealing with low reliability in both simple and multiple regression.
Analysts, be aware: even small correlations can change your R-squared when correcting for
low reliability; in making adjustments you may also change the magnitude or even the
direction of relationships; and the most dramatic changes occur when the covariate has a
substantial relationship with the other variables.
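Cronbach's alpha itself is straightforward to compute from item scores. The following minimal sketch uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score); the 4-item questionnaire data are hypothetical:

```python
# Minimal Cronbach's alpha sketch: a measure of internal-consistency
# reliability for a multi-item scale. The scores below are invented.
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array-like, rows = respondents, columns = scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)      # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Five respondents answering a hypothetical 4-item stress questionnaire (1-5 scale)
scores = [[4, 5, 4, 4],
          [2, 2, 3, 2],
          [5, 4, 5, 5],
          [3, 3, 2, 3],
          [1, 2, 1, 2]]
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```

Items that move together across respondents, as these do, give a high alpha; weakly related items would drag it down, signaling unreliable measurement.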
Even the simplest of changes can cause a chain reaction of changes, which may even
change what your data was trying to say in the first place. In discussing unreliable
measurements, I also mentioned error estimates. What happens if the variance
of the errors is the same across all levels of the independent variables? This is
called homoscedasticity, and its opposite is heteroscedasticity. When
heteroscedasticity is very obvious, it can lead to serious distortions in your analysis,
which can certainly “weaken” it. Again, in weakening the analysis you will
run into overestimation errors. We can use our handy residual plots to check for it.
Visually, heteroscedasticity may look like a bow tie or even a fan, whereas we
want even randomness around 0 in our residuals. The fan shape can show up in the
Goldfeld-Quandt test, indicating that the error term either increases or decreases consistently
as the value of an explanatory variable increases, while in the Glejser test we recognize the
bow-tie shape from the error term having a small variance centrally and a larger
variance at the extreme points. Transformation may be helpful to reduce
heteroscedasticity.
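The Goldfeld-Quandt idea (compare residual variance at low versus high values of a predictor) can be sketched by hand. This illustrative example simulates fan-shaped errors; all names and numbers are my own, not from the paper:

```python
# Hand-rolled Goldfeld-Quandt-style check (illustrative only): fit a line,
# then compare residual variance in the low vs high segments of x. A large
# ratio suggests heteroscedasticity -- the "fan" shape in a residual plot.
import numpy as np

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(1, 10, 300))
y = 2 + 3 * x + rng.normal(0, 1, 300) * x  # error spread grows with x: a fan

# Fit a simple line and compute residuals
beta = np.polyfit(x, y, 1)
resid = y - np.polyval(beta, x)

# Compare residual variance in the lower and upper thirds of x
low, high = resid[:100], resid[-100:]
f_ratio = high.var(ddof=1) / low.var(ddof=1)
print(f"residual variance ratio (high/low): {f_ratio:.1f}")
```

A ratio near 1 would be consistent with homoscedasticity; the large ratio here reflects the consistently growing error term that the Goldfeld-Quandt test is designed to detect.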
As one can see, there is no quick fix or remedy without potential
consequences, but not making alterations may have consequences as well; it is very
much a catch-22, which is when one may decide to go about one’s research and
analysis differently. Osborne and Waters’ main goal in the article was to raise
awareness of the importance of checking assumptions in simple and multiple regression
(2002), and to show that the four assumptions given can be checked and dealt with
relatively easily, which has important benefits. As Osborne and Waters also state in their
introduction, “Most statistical tests rely upon certain assumptions about the variables
used in the analysis.” So it is our duty as researchers and analysts to recognize situations
that cause serious bias, to familiarize ourselves with violations even when they may have
little effect, and to identify when the violations of these four assumptions, and many others,
are essential to meaningful data analysis (Pedhazur, 1997, p. 33). We have a serious
situation: we have a rich literature in education and social science, but we are forced to
call into question the validity of many of its results, conclusions, and assertions, as we
have no idea whether the assumptions of the statistical tests were met (Osborne).
References
Osborne, J. W., & Waters, E. (2002). Four assumptions of multiple regression that researchers should always test. Practical Assessment, Research & Evaluation, 8(2). North Carolina State University and University of Oklahoma.
Box, G. E. P., & Draper, N. R. (1987). Empirical Model-Building and Response Surfaces, p. 424. Wiley. ISBN 0471810339.
Roberts, K. (2014). Global Warming: Utah's Future Threatens Hotter Temps, Longer and More Severe Droughts. Department of Decision Sciences, Duke University: The Fuqua School of Business. Updated 1 Dec. 2014. Web.
Berry, W. D., & Feldman, S. (1985). Multiple Regression in Practice (Sage University Paper Series on Quantitative Applications in the Social Sciences, series no. 07-050). Newbury Park, CA: Sage.
Cohen, J., & Cohen, P. (1983). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates.
Nunnally, J. C. (1978). Psychometric Theory (2nd ed.). New York: McGraw-Hill.
Osborne, J. W. (2001). A new look at outliers and fringeliers: Their effects on statistic accuracy and Type I and Type II error rates. Unpublished manuscript, Department of Educational Research and Leadership and Counselor Education, North Carolina State University.
Osborne, J. W., Christensen, W. R., & Gunter, J. (2001, April). Educational psychology from a statistician's perspective: A review of the power and goodness of educational psychology research. Paper presented at the national meeting of the American Education Research Association (AERA), Seattle, WA.
Pedhazur, E. J. (1997). Multiple Regression in Behavioral Research (3rd ed.). Orlando, FL: Harcourt Brace.
Tabachnick, B. G., & Fidell, L. S. (1996). Using Multivariate Statistics (3rd ed.). New York: HarperCollins College Publishers.
Tabachnick, B. G., & Fidell, L. S. (2001). Using Multivariate Statistics (4th ed.). Needham Heights, MA: Allyn and Bacon.