Towards reproducible research
and maximally-open data
Pablo Bernabeu
OSCG Open Scholarship Prize Competition, 14th May 2021
Psycholinguistics
> Conceptual processing
Reanalysis of data from
Hutchison et al. (2013)
How does the brain process the
meaning of words?
Statistical regularities in language
as well as perceptual, motor and
emotional information.
How does this process vary in
different contexts and across
different people?
Open information at each stage of research
Design: preregistration of theoretical background and methodological
protocol.
Development: procedural issues (corrections, errors, other changes)
that bear on the final materials.
Completion: data collection software, raw data, processed data, final
data set, analysis code and results.
Shared in my PhD
Shared in my MA
Application of
open science
My research
Community: open-source code workshops and software
Open data
Bernabeu et al. (2017), Bernabeu (2018)
• All materials from the completion
stage: experiment administration
software, raw data, processed
data, final data sets, analysis code
and results.
• Development stage proceedings
reported.
• Readme files describing the data
sets and linking to resources such
as data dashboards.
Maximally-open data
Bernabeu et al. (2017), Bernabeu (2018)
• R-based web applications open to scientists and the general public.
• Easy visualisation of the variance and inspection of procedural aspects such as
trimming, adjustments and other changes. Quicker usage of the data (see blog post).
Preregistration, power analysis and open data
Chen et al. (2018)
• Prereg.: https://psyarxiv.com/t2pjv/
• Video demonstration of the lab
procedure: https://osf.io/h36wr/
• Data: https://osf.io/waf48/
Preregistration
and open data
Bernabeu et al. (2021)
• Specification of the theoretical
background and the methodological
protocol for a forthcoming study.
• Integration of FAIR data from several
large-sample studies.
• Large, secondary data = valuable
alternative to small, noisy samples
(see Loken & Gelman, 2017).
Power analysis: How many participants required?
For the next study, the preregistration will include a power analysis based on two
large-sample pilots (combined, FAIR data sets), using power curves based on Monte Carlo
simulations (simr R package).
Preliminary curves below (pending more simulations for greater accuracy).
Y axis = power for a certain effect; X axis = 1 to 312 participants.
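The idea behind a simulation-based power curve can be sketched in a few lines. The following is a minimal illustration in Python, not the simr workflow used for the study: it assumes a simple one-sample design with a hypothetical standardised effect of 0.2 and a normal-approximation test, whereas the actual analysis relies on mixed-effects models fitted to the pilot data. The function names `simulate_power` and `power_curve` are made up for this sketch.

```python
import random
from statistics import NormalDist, mean, stdev


def simulate_power(n, effect_size, n_sims=500, alpha=0.05, seed=1):
    """Estimate power by Monte Carlo simulation for a sample of size n.

    Each simulated data set contains n scores drawn from a normal
    distribution with mean `effect_size` and SD 1. Power is the share of
    simulations in which a two-sided test rejects a true mean of zero.
    A normal approximation is used for the critical value, which is
    slightly liberal for very small n.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant = 0
    for _ in range(n_sims):
        sample = [rng.gauss(effect_size, 1.0) for _ in range(n)]
        standard_error = stdev(sample) / n ** 0.5
        if abs(mean(sample) / standard_error) > z_crit:
            significant += 1
    return significant / n_sims


def power_curve(sample_sizes, effect_size, **kwargs):
    """Return a mapping from each sample size to its estimated power."""
    return {n: simulate_power(n, effect_size, **kwargs) for n in sample_sizes}


# Hypothetical effect size; 312 mirrors the upper end of the X axis above.
curve = power_curve([10, 50, 150, 312], effect_size=0.2)
for n, power in curve.items():
    print(f"n = {n:3d}  estimated power = {power:.2f}")
```

Raising `n_sims` narrows the Monte Carlo error around each point of the curve, which is the reason the preliminary curves are described as pending more simulations.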
R workshops
Workshops and presentations on data
visualisation and analysis in R since 2018,
mostly in the context of a fellowship from
the Software Sustainability Institute.
• http://pablobernabeu.github.io/#workshops
• https://github.com/pablobernabeu/Data-is-present
Blogging
Several blog posts on psycholinguistic research,
open science and statistics.
http://pablobernabeu.github.io/blog
More open-source web applications
for research and teaching
Experimental data simulation: https://github.com/pablobernabeu/Experimental-data-simulation
WebVTT caption transcription: https://github.com/pablobernabeu/VTT-Transcription-App
Concluding thoughts
• Attainable for early-career researchers: individual and community applications.
Some win-win benefits
• Open science is a framework, rather than an all-or-nothing result.
Design stage: my preregistrations could be even more precise (see Bakker et al., 2020).
Development stage: my procedures could be even more open.
Completion stage: my materials could be even more easily accessible.
• Let’s not eschew studies that report adjustments or errors.
If errare humanum est, by definition (see Bakker et al., 2020), how many spotless studies should
there naturally be in journals?
• Reward structures (e.g., promotion) still often prioritise number of publications.
References
Bakker, M., Veldkamp, C. L., van Assen, M. A., Crompvoets, E. A., Ong, H. H., Nosek, B. A., Soderberg, C. K., Mellor, D., & Wicherts, J. M. (2020). Ensuring the quality and specificity of preregistrations. PLoS Biology, 18(12), e3000937. https://doi.org/10.1371/journal.pbio.3000937
Bernabeu, P. (2018). Dutch modality exclusivity norms for 336 properties and 411 concepts. PsyArXiv. https://psyarxiv.com/s2c5h
Bernabeu, P., Lynott, D., & Connell, L. (2021). Preregistration: The interplay between linguistic and embodied systems in conceptual processing. OSF. https://osf.io/ftydw/
Bernabeu, P., Willems, R. M., & Louwerse, M. M. (2017). Modality switch effects emerge early and increase throughout conceptual processing: Evidence from ERPs. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. J. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (pp. 1629–1634). Austin, TX: Cognitive Science Society. https://mindmodeling.org/cogsci2017/papers/0318
Chen, S., Szabelska, A., Chartier, C. R., Kekecs, Z., Lynott, D., Bernabeu, P., … Schmidt, K. (2018). Investigating object orientation effects across 14 languages. PsyArXiv. https://doi.org/10.31234/osf.io/t2pjv/
Hutchison, K. A., Balota, D. A., Neely, J. H., Cortese, M. J., Cohen-Shikora, E. R., Tse, C.-S., Yap, M. J., Bengson, J. J., Niemeyer, D., & Buchanan, E. (2013). The semantic priming project. Behavior Research Methods, 45, 1099–1114. https://doi.org/10.3758/s13428-012-0304-z
Loken, E., & Gelman, A. (2017). Measurement error and the replication crisis. Science, 355(6325), 584–585. https://doi.org/10.1126/science.aal3618
Thank you to OSCG, the sponsors and the audience!
Also, thank you to my mentors and everyone else who has
contributed to my research.
