Four major threats to reproducibility are publication bias, low power, p-hacking, and HARKing. In this talk I explain these terms and show how study pre-registration can address them.
In this talk I discuss our recent Bayesian reanalysis of the Reproducibility Project: Psychology.
The slides at the end include the technical details underlying the Bayesian model averaging method we employ.
A. Gelman "50 shades of gray: A research story," presented May 23 at the session on "The Philosophy of Statistics: Bayesianism, Frequentism and the Nature of Inference," 2015 APS Annual Convention in NYC.
D. Mayo: Replication Research Under an Error Statistical Philosophy (jemille6)
D. Mayo (Virginia Tech) slides from her talk June 3 at the "Preconference Workshop on Replication in the Sciences" at the 2015 Society for Philosophy and Psychology meeting.
Controversy Over the Significance Test Controversy (jemille6)
Deborah Mayo (Professor of Philosophy, Virginia Tech, Blacksburg, Virginia) in PSA 2016 Symposium: Philosophy of Statistics in the Age of Big Data and Replication Crises
Exploratory Research is More Reliable Than Confirmatory Research (jemille6)
PSA 2016 Symposium: Philosophy of Statistics in the Age of Big Data and Replication Crises
Presenter: Clark Glymour (Alumni University Professor in Philosophy, Carnegie Mellon University, Pittsburgh, Pennsylvania)
ABSTRACT: Ioannidis (2005) argued that most published research is false, and that “exploratory” research in which many hypotheses are assessed automatically is especially likely to produce false positive relations. Colquhoun (2014), using simulations, estimates that 30 to 40% of positive results using the conventional .05 cutoff for rejection of a null hypothesis are false. Their explanation is that true relationships in a domain are rare and the selection of hypotheses to test is roughly independent of their truth, so most relationships tested will in fact be false. Conventional use of hypothesis tests, in other words, suffers from a base rate fallacy. I will show that the reverse is true for modern search methods for causal relations because: a. each hypothesis is tested or assessed multiple times; b. the methods are biased against positive results; c. systems in which true relationships are rare are an advantage for these methods. I will substantiate the claim with both empirical data and with simulations of data from systems with a thousand to a million variables that result in fewer than 5% false positive relationships and in which 90% or more of the true relationships are recovered.
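The base-rate arithmetic behind the Ioannidis/Colquhoun estimates is easy to check. The sketch below (Python, written for this post; the 10% prior on true hypotheses and the 80% power are illustrative assumptions, not figures from the abstract) computes the expected fraction of "significant" results that are false:

```python
# False discovery rate when only a fraction of tested hypotheses are true.
# Of all results that reach significance, what fraction are false positives?

def false_discovery_rate(prior_true, alpha=0.05, power=0.8):
    """Expected fraction of significant results that are false, given the
    base rate of true hypotheses, the significance cutoff, and test power."""
    false_pos = alpha * (1 - prior_true)   # true nulls wrongly rejected
    true_pos = power * prior_true          # real effects correctly detected
    return false_pos / (false_pos + true_pos)

# If only 10% of tested hypotheses are true and power is 80%:
print(round(false_discovery_rate(0.1), 2))  # prints 0.36, in Colquhoun's 30-40% range
```

Note that the result depends heavily on the assumed prior: with half of tested hypotheses true, the same cutoff gives a false discovery rate under 6%.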
Statistical skepticism: How to use significance tests effectively (jemille6)
Prof. D. Mayo, presentation Oct. 12, 2017 at the ASA Symposium on Statistical Inference: “A World Beyond p < .05”, in the session “What are the best uses for P-values?”
Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance (jemille6)
Slides from Rutgers Seminar talk by Deborah G Mayo
December 3, 2014
Rutgers, Department of Statistics and Biostatistics
Abstract: Getting beyond today’s most pressing controversies revolving around statistical methods, I argue, requires scrutinizing their underlying statistical philosophies. Two main philosophies about the roles of probability in statistical inference are probabilism and performance (in the long-run). The first assumes that we need a method of assigning probabilities to hypotheses; the second assumes that the main function of statistical method is to control long-run performance. I offer a third goal: controlling and evaluating the probativeness of methods. An inductive inference, in this conception, takes the form of inferring hypotheses to the extent that they have been well or severely tested. A report of poorly tested claims must also be part of an adequate inference. I develop a statistical philosophy in which error probabilities of methods may be used to evaluate and control the stringency or severity of tests. I then show how the “severe testing” philosophy clarifies and avoids familiar criticisms and abuses of significance tests and cognate methods (e.g., confidence intervals). Severity may be threatened in three main ways: fallacies of statistical tests, unwarranted links between statistical and substantive claims, and violations of model assumptions.
D. G. Mayo (Virginia Tech) "Error Statistical Control: Forfeit at your Peril" presented May 23 at the session on "The Philosophy of Statistics: Bayesianism, Frequentism and the Nature of Inference," 2015 APS Annual Convention in NYC.
Abstract: Mounting failures of replication in the social and biological sciences give a practical spin to statistical foundations in the form of the question: How can we attain reliability when methods make illicit cherry-picking and significance seeking so easy? Researchers, professional societies, and journals are increasingly getting serious about methodological reforms to restore scientific integrity – some are quite welcome (e.g., pre-registration), while others are quite radical. The American Statistical Association convened members from differing tribes of frequentists, Bayesians, and likelihoodists to codify misuses of P-values. Largely overlooked are the philosophical presuppositions of both criticisms and proposed reforms. Paradoxically, alternative replacement methods may enable rather than reveal illicit inferences due to cherry-picking, multiple testing, and other biasing selection effects. Crowd-sourced reproducibility research in psychology is helping to change the reward structure but has its own shortcomings. Focusing on purely statistical considerations, it tends to overlook problems with artificial experiments. Without a better understanding of the philosophical issues, we can expect the latest reforms to fail.
The research problem statement is one of the first steps in developing a Doctoral Thesis proposal. It is the starting point of the research process. Identifiable aspects of a research problem include something is broken, it has a cause and effect relationship, and there are initial observations and evidence mentioned. Developing a research problem statement from an identified problem isn’t easy but is an essential step in the thesis proposal process. To assist in the what and how, the Doctorate Hub team has been putting together this slideshow.
Surrogate Science: How Fisher, Neyman-Pearson, and Bayes Were Transformed int... (jemille6)
Gerd Gigerenzer (Director of Max Planck Institute for Human Development, Berlin, Germany) in the PSA 2016 Symposium: Philosophy of Statistics in the Age of Big Data and Replication Crises
Severe Testing: The Key to Error Correction (jemille6)
D. G. Mayo's slides for her presentation given March 17, 2017 at the Boston Colloquium for Philosophy of Science, Alfred I. Taub forum: "Understanding Reproducibility & Error Correction in Science"
What is the reproducibility crisis in science and what can we do about it? (Dorothy Bishop)
Talk given to the Rhodes Biomedical Association, 4th May 2016.
For references see: http://www.slideshare.net/deevybishop/references-on-reproducibility-crisis-in-science-by-dvm-bishop
The Role of Agent-Based Modelling in Extending the Concept of Bounded Rationality (Edmund Chattoe-Brown)
The Judgement and Decision Making Research Group in the Department of Neuroscience, Psychology and Behaviour, University of Leicester, kindly asked me to give a seminar on 25 January 2023 on "The Role of Agent-Based Modelling in Extending the Concept of Bounded Rationality". It discusses the challenges that subjective accounts pose for different research methods, and models a situation where people can be rational but communicate and have incomplete information about both the number of choices and their payoffs. The model is based on this paper: https://doi.org/10.1007/s11299-009-0060-7. One interesting result is that, without coercion or mass media, minority groups may be disadvantaged in their decision making by hegemonic discourse.
The Seven Habits of Highly Effective Statisticians (Stephen Senn)
If you know why the title of this talk is extremely stupid, then you clearly know something about control, data and reasoning: in short, you have most of what it takes to be a statistician. If you have studied statistics then you will also know that a large amount of anything, and this includes successful careers, is luck.
In this talk I shall try to share some of my experiences of being a statistician in the hope that it will help you make the most of whatever luck life throws you. In so doing, I shall try my best to overcome the distorting influence of that easiest of sciences, hindsight. Without giving too much away, I shall be recommending that you read, listen, think, calculate, understand, communicate, and do. I shall give you some examples of what I think works and what I think doesn't.
In all of this you should never forget the power of negativity, and also the joy of being able to wake up every day and say to yourself, 'I love the smell of data in the morning'.
Open Research Practices in the Age of a Papermill Pandemic (Dorothy Bishop)
Talk given to Open Research Group, Maynooth University, October 2022.
Describes the phenomenon of large-scale fraudulent science publishing (papermills), and discusses how open science practices can help tackle this.
Language-impaired preschoolers: A follow-up into adolescence (Dorothy Bishop)
Stothard, S. E., Snowling, M. J., Bishop, D. V., Chipchase, B. B., & Kaplan, C. A. (1998). Language-impaired preschoolers: A follow-up into adolescence. Journal of Speech, Language, and Hearing Research: JSLHR, 41(2), 407–418. https://doi.org/10.1044/jslhr.4102.407
ABSTRACT: This paper reports a longitudinal follow-up of 71 adolescents with a preschool history of speech-language impairment, originally studied by Bishop and Edmundson (1987). These children had been subdivided at 4 years into those with nonverbal IQ 2 SD below the mean (General Delay group), and those with normal nonverbal intelligence (SLI group). At age 5;6 the SLI group was subdivided into those whose language problems had resolved, and those with persistent SLI. The General Delay group was also followed up. At age 15-16 years, these children were compared with age-matched normal-language controls on a battery of tests of spoken language and literacy skills. Children whose language problems had resolved did not differ from controls on tests of vocabulary and language comprehension skills. However, they performed significantly less well on tests of phonological processing and literacy skill. Children who still had significant language difficulties at 5;6 had significant impairments in all aspects of spoken and written language functioning, as did children classified as having a general delay. These children fell further and further behind their peer group in vocabulary growth over time.
Otitis media with effusion: an illustration of ascertainment bias (Dorothy Bishop)
Otitis media with effusion (OME) provides an example of how ascertainment bias can induce spurious correlations. Early work suggested it impacted children's language, but when unbiased samples are studied, the effect is absent or very small.
Simulating data to gain insights into power and p-hacking (Dorothy Bishop)
Very basic introduction to simulating data to illustrate issues affecting reproducibility. Uses Excel and R, but assumes no prior knowledge of R. Please let me know of errors or things that need better explanation.
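For a flavour of what such a simulation looks like, here is a minimal sketch (in Python rather than the deck's Excel/R, and written for this listing, not taken from the slides). It uses the fact that p-values are uniform under a true null to show how testing several outcomes and reporting any hit inflates the false positive rate:

```python
import random

# Under a true null hypothesis, p-values are uniformly distributed on [0, 1].
# Simulate a researcher who measures several outcomes per study and calls the
# study "significant" if ANY outcome reaches p < .05 (one form of p-hacking).

def phacked_positive_rate(n_studies=50_000, n_outcomes=5, alpha=0.05, seed=1):
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(n_studies)
        if min(rng.random() for _ in range(n_outcomes)) < alpha
    )
    return hits / n_studies

# The nominal rate is 5%, but taking the best of 5 outcomes inflates it to
# about 1 - 0.95**5 = 0.23 (the simulated value varies slightly with the seed).
print(phacked_positive_rate())
```

With `n_outcomes=1` the simulated rate falls back to roughly the nominal 5%, which is a useful sanity check when adapting the sketch.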
Lecture by Prof Dorothy Bishop, 1st Feb 2017, University of Southampton:
What’s wrong with our Universities, and will the Teaching Excellence Framework put it right?
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt... (Sérgio Sacani)
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes on Io’s surface have been monitored from both spacecraft and ground-based telescopes. Here, we present the highest spatial resolution images of Io ever obtained from a ground-based telescope. These images, acquired by the SHARK-VIS instrument on the Large Binocular Telescope, show evidence of a major resurfacing event on Io’s trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images show that a plume deposit from a powerful eruption at Pillan Patera has covered part of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high-resolution imaging of Io’s surface using adaptive optics at visible wavelengths.
Richard's entangled adventures in wonderland (Richard Gill)
Since the loophole-free Bell experiments of 2020 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
Seminar on U.V. Spectroscopy by Samir Panda
Spectroscopy is a branch of science dealing with the study of the interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflectance spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that measures the amount of light absorbed by the analyte.
Brief information about the SCOP protein database used in bioinformatics.
The Structural Classification of Proteins (SCOP) database is a comprehensive and authoritative resource for the structural and evolutionary relationships of proteins. It provides a detailed and curated classification of protein structures, grouping them into families, superfamilies, and folds based on their structural and sequence similarities.
(May 29th, 2024) Advancements in Intravital Microscopy: Insights for Preclini... (Scintica Instrumentation)
Intravital microscopy (IVM) is a powerful tool utilized to study cellular behavior over time and space in vivo. Much of our understanding of cell biology has been accomplished using various in vitro and ex vivo methods; however, these studies do not necessarily reflect the natural dynamics of biological processes. Unlike traditional cell culture or fixed tissue imaging, IVM allows ultra-fast, high-resolution imaging of cellular processes over time and space in their natural environment. Real-time visualization of biological processes in the context of an intact organism helps maintain physiological relevance and provides insights into the progression of disease, response to treatments, and developmental processes.
In this webinar we give an overview of advanced applications of the IVM system in preclinical research. IVIM technology is a provider of all-in-one intravital microscopy systems and solutions optimized for in vivo imaging of live animal models at sub-micron resolution. The system’s unique features and user-friendly software enables researchers to probe fast dynamic biological processes such as immune cell tracking, cell-cell interaction as well as vascularization and tumor metastasis with exceptional detail. This webinar will also give an overview of IVM being utilized in drug development, offering a view into the intricate interaction between drugs/nanoparticles and tissues in vivo and allows for the evaluation of therapeutic intervention in a variety of tissues and organs. This interdisciplinary collaboration continues to drive the advancements of novel therapeutic strategies.
Insights from psychology on lack of reproducibility
1. Insights from psychology on lack of reproducibility
Dorothy V. M. Bishop
Professor of Developmental Neuropsychology
University of Oxford
@deevybee
Talk given at All Souls seminar on Reproducibility and Open Research, 31/10/18
http://users.ox.ac.uk/~phys1213/ReproAtASC.html
2. The four horsemen of the Apocalypse
P-hacking
Publication bias
Low power
HARKing
6. Historical timeline: concerns about reproducibility
1956 De Groot: Failure to distinguish between hypothesis-testing and hypothesis-generating (exploratory) research -> misuse of statistical tests
Describes P-hacking (though that term was not used)
The situation when 10 statistical tests are done in a study:
“….when N=10 it is as if one participates… in a game of chance with “probability of losing” α for each “draw” or “throw”. The probability that we do not lose a single time in 10 draws can be calculated in the case that the draws are independent; it equals (1 − α)^10. For α = 0.05, the traditional 5% level, this becomes 0.95^10 ≈ 0.60. This means, therefore, that we have a 40% chance of rejecting at least one of our 10 null hypotheses — falsely”
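De Groot's arithmetic is easy to verify; here is a minimal Python sketch (the function name is my own):

```python
# De Groot's multiple-testing calculation: with N independent tests each run
# at significance level alpha, the chance of at least one false positive
# across the family of tests is 1 - (1 - alpha)^N.

def familywise_error(alpha: float, n_tests: int) -> float:
    """Probability of at least one false rejection across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

# For alpha = 0.05 and 10 tests: 1 - 0.95**10 is roughly 0.40,
# i.e. about a 40% chance of at least one spurious "significant" result.
print(round(familywise_error(0.05, 10), 2))  # 0.4
```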
8. 1956 De Groot
1969 Cohen: Nonsignificant findings not published: literature gives a distorted impression
1975 Greenwald: Prejudice against the null
1979 Rosenthal: The “file drawer” problem
Publication bias
“As it is functioning in at least some areas of behavioral science research, the research-publication system may be regarded as a device for systematically generating and propagating anecdotal information.” (Greenwald, 1975)
10. If we’ve known about this for decades, why haven’t the problems been fixed?
No one cause: need to consider research environment and incentives
This talk: focus on cognitive biases that make it hard to do science well
Idea: doing good science is in opposition to many of our natural ways of thinking
11. Cognitive biases that make it hard to do science well
• Failure to understand probability
• Tendency to see patterns in things
• Confirmation bias
• Errors of omission seen as acceptable
• Need for narrative
12. Consider this problem (example from Daniel Kahneman & Amos Tversky)
A certain town is served by two hospitals. In the larger hospital about 45 babies are born each day, and in the smaller hospital about 15 babies are born each day. As you know, about 50% of all babies are boys. However, the exact percentage varies from day to day. Sometimes it may be higher than 50%, sometimes lower. For a period of 1 year, each hospital recorded the days on which more than 60% of the babies born were boys. Which hospital do you think recorded more such days?
1. The larger hospital
2. The smaller hospital
3. About the same (that is, within 5% of each other)
13. The same problem, with the answer revealed.
Expected value:
Hosp15 = 57 days
Hosp45 = 26 days
14. The same problem, with a year of simulated data.
[Chart: daily percentage of boys over the year, Hospital 15 (red) vs Hospital 45 (blue)]
The small sample gives noisier estimates: the red line bounces around much more than the blue line
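The expected values can be checked with an exact binomial calculation, assuming each birth is an independent fair coin flip (function and variable names below are mine). This sketch gives roughly 55 and 25 days per year, the same qualitative answer as the slide's figures:

```python
from math import comb

def expected_days_over_60pct(n_births: int, days: int = 365) -> float:
    """Expected number of days per year on which strictly more than 60%
    of n_births babies are boys, with each birth a fair coin flip."""
    # integer comparison 5*k > 3*n avoids floating-point trouble with 0.6*n
    p = sum(comb(n_births, k) for k in range(n_births + 1) if 5 * k > 3 * n_births)
    return p / 2 ** n_births * days

small = expected_days_over_60pct(15)  # ~55 days: small samples are noisy
large = expected_days_over_60pct(45)  # ~25 days
```

The small hospital crosses the 60% line more than twice as often, purely because 15 births per day is a noisier sample than 45.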
15. Insensitivity to sample size
• People have strong intuitions about random sampling;
• These intuitions are wrong in fundamental respects;
• These intuitions are shared by naive subjects and by trained scientists;
• Intuitions are applied with unfortunate consequences in the course of scientific inquiry
Tversky, A., & Kahneman, D. (1971). Belief in the law of small numbers. Psychological Bulletin, 76, 105-110.
16. Work in progress: The experimenter game
• You have a budget to improve reading ability across Oxfordshire – potential for roll-out to hundreds of schools. You’ve been offered a remedy that claims to boost children’s reading ability by half a standard deviation
• If you buy it and it turns out useless, you’ll lose a lot of money
• If you buy it and it really works, it will be worth a lot of money
• You’re not sure whether to trust the vendor – you think there’s a 50:50 chance that it really works
• You can run some tests on samples of children, but it costs money – the more children you test, the more expensive
• You have an optimization problem!
• So what’s your experimental strategy?
17. You decide to run a study with two groups of N children
What value of N should you start with? Let’s try 20 per group
Here’s a sample of data: do you think this is sufficient to decide whether to adopt/reject the intervention?
18. This sample was drawn from a population with no real difference
19. Here’s another sample of data: do you think this is sufficient to decide whether to adopt/reject the intervention?
20. This time the sample was drawn from a population with a true effect.
With a small sample, the difference can look small even when there is a true effect – this illustrates the problem of LOW POWER
22. This time, the impression from the sample of data gives a better indication of the true effect in the population
But how reliable this impression is depends on the effect size, i.e. the separation in the means of the population distributions
23. [Charts: samples drawn with population effect size = .2 and population effect size = .5]
Separation between red dots (drawn from a population with a true effect) and grey dots (drawn from a population with no effect) shows the sample size at which a true effect can be reliably detected
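The low-power problem can be made concrete with a quick Monte Carlo sketch, under assumptions of my own: normally distributed scores, a pooled-variance two-sample t-test at α = .05, and the group size and effect sizes from the slides (all names are mine):

```python
import random
import statistics

def simulated_power(effect_size: float, n: int = 20, sims: int = 4000,
                    seed: int = 1) -> float:
    """Monte Carlo estimate of the power of a two-sided, alpha = .05
    two-sample t-test with n participants per group."""
    rng = random.Random(seed)
    t_crit = 2.024  # critical t for df = 2n - 2 = 38, two-sided alpha = .05
    hits = 0
    for _ in range(sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(n)]
        # pooled-variance two-sample t statistic
        sp2 = (statistics.variance(control) + statistics.variance(treated)) / 2
        t = (statistics.mean(treated) - statistics.mean(control)) / (2 * sp2 / n) ** 0.5
        if abs(t) > t_crit:
            hits += 1
    return hits / sims

# With 20 per group, power is only about .10 for d = .2 and about .33 for
# d = .5: most studies of a real but modest effect will "fail" to detect it.
```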
24. Failure to appreciate power of ‘the prepared mind’
Tendency to see patterns in things
26. Example from Lazic, S. (2016) Experimental Design for Laboratory Biologists
Position of bomb hits: a general has a map of bomb hits and wants to know if the bombs were dropped at random or whether some sites are being targeted.
Which map suggests targeting? Blue, red or neither?
27. General message: we tend to assume random data are regular, and so try to interpret patterns when there is irregularity
The blue map may look as if there is targeting, especially if there are potential targets at A or B.
In fact, the blue X and Y co-ordinates were selected at random.
The red map does not suggest targeting, but it is not random: the co-ordinates were selected to be evenly distributed, and then jittered
28. But! Seeing novel patterns in complex data is one of the most important and exciting aspects of science!
Consider Brodmann (1909): identified brain regions with different cell types – not obvious: required expertise and painstaking study
Bailey and von Bonin (1951) noted problems in Brodmann's approach — lack of observer independence, reproducibility and objectivity
Yet Brodmann’s areas stood the test of time: still used today
29. Special expertise or Jesus in toast? How to decide
• Eradicate subjectivity from methods
• Adopt standards from industry for checking/double-checking
• Automate data collection and analysis as far as possible
• Make recordings of methods (e.g. Journal of Visualized Experiments)
• Make data and analysis scripts open
31. How to do good science
That is the idea that we all hope you have learned in studying science in school…
It’s a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty—a kind of leaning over backwards.
For example, if you’re doing an experiment, you should report everything that you think might make it invalid—not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you’ve eliminated by some other experiment, and how they worked—to make sure the other fellow can tell they have been eliminated.
Richard Feynman, Caltech 1974 commencement address
32. Wason task: a way of thinking about experimental design
Each card has a number on one side and a patch of colour on the other. You are asked to test the hypothesis that – for these 4 cards – if an even number appears on one side, then the opposite side is red.
• Are any of the cards irrelevant to the hypothesis?
• Are any of the cards critical to the hypothesis?
• Which card(s) would you turn over to test the hypothesis?
A B C D
33. Wason task (continued)
• The usual response is that B & C are critical.
• But C is not critical (we’re testing ‘if P then Q’, not ‘if Q then P’)
• D is critical as it has the potential to disconfirm the hypothesis – but it is usually overlooked
A B C D
34. Wason task: shows how confirmation bias can affect experimental design
We need to design experiments to look for disconfirmation of a theory.
In practice: "To test a hypothesis, we think of a result that would be found if the hypothesis were true and then look for that result" (J. Baron, 1988, p. 231).
In a survey of 84 scientists (physicists, biologists, psychologists, sociologists), Mahoney (1976) found fewer than 10% correctly identified the critical cards
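The card logic can be made explicit in code. The card faces below are an assumption of mine (the slide showed images): A shows 3, B shows 8, C shows red, D shows blue. A card is critical exactly when some possible hidden side would produce a counterexample to "if even, then red":

```python
# Hypothesis: "if one side shows an even number, the other side is red".
# A card can falsify it only if a possible hidden side would pair an even
# number with a non-red colour.
# Card faces are assumed (the slide showed images): A=3, B=8, C=red, D=blue.

NUMBERS = [3, 8]            # possible hidden numbers (any odd/even pair works)
COLOURS = ["red", "blue"]   # possible hidden colours

def violates(number: int, colour: str) -> bool:
    """A number/colour pairing that disconfirms the hypothesis."""
    return number % 2 == 0 and colour != "red"

def can_falsify(visible) -> bool:
    """Could turning this card over reveal a counterexample?"""
    if isinstance(visible, int):
        return any(violates(visible, c) for c in COLOURS)
    return any(violates(n, visible) for n in NUMBERS)

cards = {"A": 3, "B": 8, "C": "red", "D": "blue"}
critical = [name for name, face in cards.items() if can_falsify(face)]
print(critical)  # ['B', 'D'] — C (red) can never disconfirm "if even then red"
```

Turning C over tells you nothing either way, which is exactly the confirmation-bias trap the slides describe.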
35. Confirmation bias at the level of observations: seeing what you expect to see
“The self-deception comes in that over the next 20 years, people believed they saw specks of light that corresponded to what they thought Vulcan should look like during an eclipse: round objects crossing the face of the sun, which were interpreted as transits of Vulcan.”
36. Confirmation bias affects how we remember and process information
• Cherry-picking may not be deliberate
• We find it much easier to process and remember information that agrees with our viewpoint
37. A personal example: slide from talks I gave on genetics of language disorder
Twin studies of SLI: probandwise concordance, same-sex twins
                             MZ    DZ
Lewis & Thompson, 1992       .86   .48
Bishop et al, 1995           .70   .46
Tomblin & Buckwalter, 1998   .96   .69
Hayiou-Thomas et al, 2005    .36   .33
Twin concordance points to genetic influence when MZ > DZ
38. (Same table.) I continued to use the original slide after 2005, despite the additional study (Hayiou-Thomas et al, 2005) that I had co-authored
I failed to mention it in talks for several years – I literally forgot about it – presumably because it did not fit!
39. This example also illustrates how we will do further research to try to make sense of data that does not fit our ideas – but look far less closely when data does fit.
For the denouement of this story, see
41. Most literature reviews cherry-pick the evidence (that’s why I’m not identifying this specific example)
Example: Study published in 2013
“Regardless of etiology, cerebellar neuropathology commonly occurs in autistic individuals. Cerebellar hypoplasia and reduced cerebellar Purkinje cell numbers are the most consistent neuropathologies linked to autism [8, 9, 10, 11, 12, 13]. MRI studies report that autistic children have smaller cerebellar vermal volume in comparison to typically developing children [14].”
• I was surprised by this introduction to a paper, as it did not fit my impression of the literature on neuropathology in autism: but the authors seemed to cite a lot of supportive evidence
• I checked to see if there was a relevant meta-analysis: there was….
42. Meta-analysis: Traut et al (2018) https://doi.org/10.1016/j.biopsych.2017.09.029
Standardized mean difference is +ve when cerebellar volume is greater in ASD
Ref [14] found a larger cerebellum; the other studies mostly found no difference or an increase – the opposite of what was claimed in the 2013 paper
Though Webb et al (ref 14) did find the area of the vermis smaller in ASD after covarying cerebellum size
43. Confirmation bias tends to produce errors of omission – these are generally thought to be less serious than errors of commission (i.e. making stuff up)
But the consequences can be major
44. Errors of omission in reporting research
“[I]t is a truly gross ethical violation for a researcher to suppress reporting of difficult-to-explain or embarrassing data in order to present a neat and attractive package to a journal editor.” (Greenwald, 1975, p. 19)
“Failure to report results from a clinical trial is equivalent to fraud.” Iain Chalmers, personal communication
45. Consequence of omission errors in literature reviews
• When we read a peer-reviewed paper, we tend to trust the citations that back up a point
• When we come to write our own paper, we cite the same materials
• A good scientist won’t cite papers without reading them, but even this won’t save you from bias – you inherit it from prior papers
• If prior papers only cite materials agreeing with a viewpoint, that viewpoint gets entrenched
• You won’t know – unless you explicitly search – that there are other papers that give a different picture
46. The (partial*) solution: always start with a systematic review
• Systematic review: collect and summarise all empirical evidence that fits pre-specified eligibility criteria to address a specific question
• Meta-analysis: use statistical methods to summarise the results of these studies
*But depends on finding all relevant papers
47. Example from Lazic, S. (2016) Experimental Design for Laboratory Biologists
100 relevant studies on a gene/disease association:
• 95 studies find no association (negative findings tend not to be mentioned in Abstracts)
• 5 false-positive results (disease and gene mentioned in Abstract)
A PubMed search for disease AND gene finds the 5 supporting studies and one negative study
49. Let’s take another look at that cerebellum paper: statements that are not untrue, but are misleading
(Same quotation as before.) It gives the impression of a large body of work, but it is mostly reviews of the same few studies
Study 14 by Webb et al found an overall increase in cerebellum size: the smaller-vermis effect appeared only after adjusting for total cerebellar volume
50. In terms of ethical behaviour, rank order the following behaviours:
• Omission of relevant studies
• Stating that a study found something that it didn’t
• Stating a study result that was true, but in a misleading way
51. I don’t know of studies looking at this in science reporting, but analogous behaviour was rated in studies of negotiation by Rogers et al (2016)
• Omission of relevant information
• Lying (untrue statement)
• Stating something that is true, but in a misleading way (paltering)
[Chart: honesty judgements in the negotiation study* – omission 23%, lying 5%, paltering 32%]
*Rogers, T. et al (2016). Artful paltering: The risks and rewards of using truthful statements to mislead others. Journal of Personality and Social Psychology, 112(3), 456-473.
Neither omission of information nor paltering is seen as honest, but both are more acceptable than lying
52. How common are these in literature reviews? Does it matter?
• Omission of relevant studies
• Stating that a study found something that it didn’t
• Stating a study result that was true, but in a misleading way
My view: adoption of these behaviours in science is likely to depend on:
(a) Is it rewarded?
(b) Will it be detected?
(c) If it is, could you avoid blame?
(d) Are there obvious victims?
(e) Is ‘everyone doing it’?
Overlooked victims:
• Potential users (patients, etc)
• Researchers trying to build on results
• Funders
53. A further, overarching problem: the need for narrative
“Another reason why HARKed research reports may fare better in the review and publication process is that they not only provide a better fit to a specific good science script, they may also provide a better fit to the more general good story script.
Positing a theory serves as an effective "initiating event." It gives certain events significance and justifies the investigators' subsequent purposeful activities directed at the goal of testing the hypotheses. And, when one HARKs, a "happy ending” (i.e., confirmation) is guaranteed.”
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2, 196-217.
54. Darwinian processes in survival of ideas
“Examples of memes are tunes, ideas, catch-phrases, clothes fashions, ways of making pots or of building arches. Just as genes propagate themselves in the gene pool by leaping from body to body via sperms or eggs, so memes propagate themselves in the meme pool by leaping from brain to brain via a process which, in the broad sense, can be called imitation.”
R. Dawkins
55. Successful memes
• Easy to understand, remember, and communicate to others
• Not helped by reporting everything!
• Not helped by reporting null results!
• May be influenced by whether they confer an advantage on the person communicating
• Survival does not depend on whether they are useful, true, or potentially harmful
56. Cognitive biases pervade every step of the research process
• Reading literature: confirmation bias, omissions
• Experimental design: confirmation bias, law of small numbers
• Experimental observations: seeing patterns, confirmation bias
• Data analysis: confirmation bias, seeing patterns, law of small numbers, omissions
• Scientific reporting: confirmation bias, omissions, need for narrative
57. Will anything change?
“It really is striking just for how long there have been reports about the poor quality of research methodology, inadequate implementation of research methods and use of inappropriate analysis procedures as well as lack of transparency of reporting. All have failed to stir researchers, funders, regulators, institutions or companies into action”. Bustin, 2014
Reasons for optimism
• Concern from those who use research: doctors and patients, pharma companies
• Concern from funders
• Increase in studies quantifying the problem
• Social media
58.
Professor Dorothy Bishop, FRS, FMedSci, FBA,
Wellcome Trust Principal Research Fellow,
Department of Experimental Psychology,
Anna Watts Building,
Woodstock Road,
Oxford,
OX2 6GG. @deevybee
http://deevybee.blogspot.com/2012/11/bishopblog-catalogue-updated-24th-nov.html
https://www.slideshare.net/deevybishop
https://orcid.org/0000-0002-2448-4033