SlideShare a Scribd company logo
Can data access
combat fake news?
Mercè Crosas
Chief Data Science and Technology Officer, Institute for Quantitative Social Science, Harvard University
๏ Science must maintain credibility when
moving from the lab to the public domain

๏ Journalistic credibility could be improved by
adopting some standards and best practices
used in science
Connecting Science with Journalism
In Science
What do we need to enable replication?
๏ Provide persistent access to data (and code)

๏ Data should:

๏ be well documented 

๏ be accessible in reusable formats

๏ be open, or have a clear license and terms of use

๏ include provenance describing origin and transformations

๏ archived in trusted repositories
It’s not “Fake”, but can we reproduce it?
In Psychology: 39 out of 100 rated
as successful replications
In Biomedicine: 6 out of 53
“landmark” cancer studies were
successfully replicated
“…data can be narrowed
and expanded (p-hacked)
to make either
hypothesis appear
correct.That’s because
answering even a simple
scientific question
requires lots of choices
that can shape the
results.”
Science isn’t unreliable, but is more
challenging than we give it credit for
Aschwanden, 2015, FiveThirtyEight
Science has a mechanism for self-correction
“Instances in which
scientists detect and
address flaws in work
constitute evidence of
success, not failure.”
“Ensuring that the
integrity of science is
protected is the
responsibility of many
stakeholders.”
Alberts et al., 2015, Science
Fraud happens, but is rare
“His lab’s controversial work raised the possibility
that the human heart could repair itself, but many
other scientists have been unable to replicate the
research.”
The scientific community is establishing
best practices to address replication issues
Journal Data Policies are being applied
across disciplines
Genetics
Biomedical
Computational
Science
Economics
Ecology
“weak”= recommend
“strong” = require
Castro, Crosas, Garnett, Sheridan, Altman, 2017, Journal of Scholarly Publishing
Authors comply with strong data policies
0
10
20
30
40
50
60
70
80
90
100
Percentage of articles with available replication material
Data Sharing in Top Political Science Journals (Key 2016)*
Mandatory Replication Material
or Verification
Replication Material
Expected
No Requirement
Key, 2016, Political Science & Politics; via: Sebastian Karcher
Recently, 27 political science journals adopted the
Journals Editors’ Transparency Statement
6 top journals
in Political Science
Journal guidelines encourage reporting of
statistics and open research practices
Giofrè D, Cumming G, Fresc L, Boedker I, Tressoldi P (2017) The influence of journal submission guidelines on authors' reporting
of statistics and use of open research practices. PLOS ONE 12(4): e0175583. https://doi.org/10.1371/journal.pone.0175583
NHST = null hypothesis significance testing;
CI = confidence intervals;
MA = meta-analysis;
CI_interp = confidence intervals interpretation;
ES_interp = effect size interpretation;
Data_excl = exclusion criteria reported;
Material = additional materials availability;
Prereg = preregistered study.
In Psychology Science:
• use of confidence
intervals grows from
28% in 2013 to 70%
in 2015
• availability to open
data, from 3% to 39%
From
Journal
Article
To Data
Example: Journals
integrate with
Data Repositories
For AJPS: Replication review becomes part
of the submission workflow
+ +
0 Resubmits
1 Resubmit
2 Resubmits
3 Resubmits
4 Resubmits
0 10 20 30 40
3
17
31
37
8
Total: 95 completed verifications
via Thu-Mai Lewis Christian, Data Archives, ODUM Institute
From News Article to Data
1800 bicycle collisions
from 2009 to 2012
In Journalism and
Bloggers World
Connecting Science with Journalism
๏ Science must maintain credibility when moving from the lab
to the public domain

๏ Journalistic credibility could be improved by adopting some
standards and best practices used in science

๏ Apply data policies that require access to data (and code)

๏ Include citation to data in trusted repositories

๏ Provide guidelines for documenting statistics and data

๏ Add data fluency to journalism school experience
Education is Key:
New data programs in journalism
Data Concentration Program in Columbia
School of Journalism offered since 2014
Barret, Phillips, 2016,Teaching Data and Computational Journalism Report
From 113 programs in Journalism Schools
“Journalism schools that embrace both teaching and
research into data journalism methods will be poised
to fundamentally improve the way future journalists
will inquire into maters of public interest and
communicate with their audiences.”
“Science is the subset of knowledge on
which we can aspire to universal
agreement*”
*Richard Muller, 2016, Now: The Physics of Time
But, can we also aspire to universal
agreement for information that adheres
to scientific rigor?
Report written by: David Lazer, Matthew Baum, Nir Grinberg, Lisa
Friedland, Kenneth Joseph,Will Hobbs, and Carolina Mattsson
This recent
comprehensive, useful
report says:
“As a research
community, …
collaborate more
closely with journalists
in order to make the
truth “louder,””
This talk adds:
make the truth “louder”
and more verifiable,
with access to data
Thanks
Mercè Crosas, Institute for Quantitative Social Science, Harvard University
mercecrosas.com
@mercecrosas

More Related Content

What's hot

Seattle-Denver VA Center for Innovation
Seattle-Denver VA Center for InnovationSeattle-Denver VA Center for Innovation
Seattle-Denver VA Center for Innovation
Brian Bot
 
Statistical and critical thinking
Statistical and critical thinkingStatistical and critical thinking
Statistical and critical thinking
RamiroGarcia103
 
Journal club summary: Open Science save lives
Journal club summary: Open Science save livesJournal club summary: Open Science save lives
Journal club summary: Open Science save lives
Dorothy Bishop
 
Collaborative Technologies for Biomedical Research
Collaborative Technologies for Biomedical ResearchCollaborative Technologies for Biomedical Research
Collaborative Technologies for Biomedical Research
Sean Ekins
 
biomedical research in an increasingly digital world
biomedical research in an increasingly digital worldbiomedical research in an increasingly digital world
biomedical research in an increasingly digital world
Brian Bot
 
Parker institute for Cancer Immunotherapy 2016-2017 Year in Review
Parker institute for Cancer Immunotherapy 2016-2017 Year in ReviewParker institute for Cancer Immunotherapy 2016-2017 Year in Review
Parker institute for Cancer Immunotherapy 2016-2017 Year in Review
Parker Institute for Cancer Immunotherapy
 
Digital transformation of translational medicine
Digital transformation of translational medicineDigital transformation of translational medicine
Digital transformation of translational medicine
Eagle Genomics
 
Data Commons & Data Science Workshop
Data Commons & Data Science WorkshopData Commons & Data Science Workshop
Data Commons & Data Science Workshop
Warren Kibbe
 
Public Data Archiving in Ecology and Evolution: How well are we doing?
Public Data Archiving in Ecology and Evolution: How well are we doing?Public Data Archiving in Ecology and Evolution: How well are we doing?
Public Data Archiving in Ecology and Evolution: How well are we doing?
Sandra Binning
 
In review, unlocking the potential of early sharing
In review, unlocking the potential of early sharingIn review, unlocking the potential of early sharing
In review, unlocking the potential of early sharing
Crossref
 
Journal Club - Best Practices for Scientific Computing
Journal Club - Best Practices for Scientific ComputingJournal Club - Best Practices for Scientific Computing
Journal Club - Best Practices for Scientific Computing
Bram Zandbelt
 
Warren-Jones "Using text-mining and summarisation technology to manage the gr...
Warren-Jones "Using text-mining and summarisation technology to manage the gr...Warren-Jones "Using text-mining and summarisation technology to manage the gr...
Warren-Jones "Using text-mining and summarisation technology to manage the gr...
National Information Standards Organization (NISO)
 
Monitoring the broad impact of the journal publication output on country leve...
Monitoring the broad impact of the journal publication output on country leve...Monitoring the broad impact of the journal publication output on country leve...
Monitoring the broad impact of the journal publication output on country leve...
Aravind Sesagiri Raamkumar
 
Roche_open_science_NIOO_KNAW_workshop_NL
Roche_open_science_NIOO_KNAW_workshop_NLRoche_open_science_NIOO_KNAW_workshop_NL
Roche_open_science_NIOO_KNAW_workshop_NL
Dominique Roche
 
Big Data: Big Opportunities or Big Trouble?
Big Data: Big Opportunities or Big Trouble?Big Data: Big Opportunities or Big Trouble?
Big Data: Big Opportunities or Big Trouble?
Shea Swauger
 
Christina engaging the biomedical researchers
Christina   engaging the biomedical researchersChristina   engaging the biomedical researchers
Christina engaging the biomedical researchersdjmichael156
 
Joao
JoaoJoao
De vry university math 221 week 3 quiz
De vry university math 221 week 3 quizDe vry university math 221 week 3 quiz
De vry university math 221 week 3 quiz
Olivia Fournier
 

What's hot (19)

Seattle-Denver VA Center for Innovation
Seattle-Denver VA Center for InnovationSeattle-Denver VA Center for Innovation
Seattle-Denver VA Center for Innovation
 
Statistical and critical thinking
Statistical and critical thinkingStatistical and critical thinking
Statistical and critical thinking
 
Journal club summary: Open Science save lives
Journal club summary: Open Science save livesJournal club summary: Open Science save lives
Journal club summary: Open Science save lives
 
Collaborative Technologies for Biomedical Research
Collaborative Technologies for Biomedical ResearchCollaborative Technologies for Biomedical Research
Collaborative Technologies for Biomedical Research
 
biomedical research in an increasingly digital world
biomedical research in an increasingly digital worldbiomedical research in an increasingly digital world
biomedical research in an increasingly digital world
 
Evaluating Sources
Evaluating SourcesEvaluating Sources
Evaluating Sources
 
Parker institute for Cancer Immunotherapy 2016-2017 Year in Review
Parker institute for Cancer Immunotherapy 2016-2017 Year in ReviewParker institute for Cancer Immunotherapy 2016-2017 Year in Review
Parker institute for Cancer Immunotherapy 2016-2017 Year in Review
 
Digital transformation of translational medicine
Digital transformation of translational medicineDigital transformation of translational medicine
Digital transformation of translational medicine
 
Data Commons & Data Science Workshop
Data Commons & Data Science WorkshopData Commons & Data Science Workshop
Data Commons & Data Science Workshop
 
Public Data Archiving in Ecology and Evolution: How well are we doing?
Public Data Archiving in Ecology and Evolution: How well are we doing?Public Data Archiving in Ecology and Evolution: How well are we doing?
Public Data Archiving in Ecology and Evolution: How well are we doing?
 
In review, unlocking the potential of early sharing
In review, unlocking the potential of early sharingIn review, unlocking the potential of early sharing
In review, unlocking the potential of early sharing
 
Journal Club - Best Practices for Scientific Computing
Journal Club - Best Practices for Scientific ComputingJournal Club - Best Practices for Scientific Computing
Journal Club - Best Practices for Scientific Computing
 
Warren-Jones "Using text-mining and summarisation technology to manage the gr...
Warren-Jones "Using text-mining and summarisation technology to manage the gr...Warren-Jones "Using text-mining and summarisation technology to manage the gr...
Warren-Jones "Using text-mining and summarisation technology to manage the gr...
 
Monitoring the broad impact of the journal publication output on country leve...
Monitoring the broad impact of the journal publication output on country leve...Monitoring the broad impact of the journal publication output on country leve...
Monitoring the broad impact of the journal publication output on country leve...
 
Roche_open_science_NIOO_KNAW_workshop_NL
Roche_open_science_NIOO_KNAW_workshop_NLRoche_open_science_NIOO_KNAW_workshop_NL
Roche_open_science_NIOO_KNAW_workshop_NL
 
Big Data: Big Opportunities or Big Trouble?
Big Data: Big Opportunities or Big Trouble?Big Data: Big Opportunities or Big Trouble?
Big Data: Big Opportunities or Big Trouble?
 
Christina engaging the biomedical researchers
Christina   engaging the biomedical researchersChristina   engaging the biomedical researchers
Christina engaging the biomedical researchers
 
Joao
JoaoJoao
Joao
 
De vry university math 221 week 3 quiz
De vry university math 221 week 3 quizDe vry university math 221 week 3 quiz
De vry university math 221 week 3 quiz
 

Similar to Can data access combat fake news?

Practical Implementation of research data policies: Solutions with Dataverse
Practical Implementation of research data policies: Solutions with DataversePractical Implementation of research data policies: Solutions with Dataverse
Practical Implementation of research data policies: Solutions with Dataverse
Merce Crosas
 
SAMSI Precision Medicine Keynote, August 2018: Data: where Precision Oncology...
SAMSI Precision Medicine Keynote, August 2018: Data: where Precision Oncology...SAMSI Precision Medicine Keynote, August 2018: Data: where Precision Oncology...
SAMSI Precision Medicine Keynote, August 2018: Data: where Precision Oncology...
Warren Kibbe
 
Data Management and Broader Impacts: a holistic approach
Data Management and Broader Impacts: a holistic approachData Management and Broader Impacts: a holistic approach
Data Management and Broader Impacts: a holistic approach
Megan O'Donnell
 
NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR
Warren Kibbe
 
Clinical Research Informatics Year-in-Review 2024
Clinical Research Informatics Year-in-Review 2024Clinical Research Informatics Year-in-Review 2024
Clinical Research Informatics Year-in-Review 2024
Peter Embi
 
One Funder’s View for Advancing Open Science
One Funder’s View for Advancing Open ScienceOne Funder’s View for Advancing Open Science
One Funder’s View for Advancing Open Science
Philip Bourne
 
Open science and medical evidence generation - Kees van Bochove - The Hyve
Open science and medical evidence generation - Kees van Bochove - The HyveOpen science and medical evidence generation - Kees van Bochove - The Hyve
Open science and medical evidence generation - Kees van Bochove - The Hyve
Kees van Bochove
 
Big Data: Learning from MIMIC- Celi
Big Data: Learning from MIMIC- CeliBig Data: Learning from MIMIC- Celi
Big Data: Learning from MIMIC- Celi
intensivecaresociety
 
Biomedical Research_House of Cards_Editorial
Biomedical Research_House of Cards_EditorialBiomedical Research_House of Cards_Editorial
Biomedical Research_House of Cards_EditorialRathnam Chaguturu
 
Secure Data Sharing and Related Matters – An NIH View
Secure Data Sharing and Related Matters – An NIH ViewSecure Data Sharing and Related Matters – An NIH View
Secure Data Sharing and Related Matters – An NIH View
Philip Bourne
 
Dichotomania and other challenges for the collaborating biostatistician
Dichotomania and other challenges for the collaborating biostatisticianDichotomania and other challenges for the collaborating biostatistician
Dichotomania and other challenges for the collaborating biostatistician
Laure Wynants
 
Research methodology
Research methodologyResearch methodology
Research methodologyTosif Ahmad
 
Open Science: Where Theory Meets Practice
Open Science: Where Theory Meets PracticeOpen Science: Where Theory Meets Practice
Open Science: Where Theory Meets Practice
Philip Bourne
 
GSmith Springer Nature Data policies and practices: HKU Open Data and Data Pu...
GSmith Springer Nature Data policies and practices: HKU Open Data and Data Pu...GSmith Springer Nature Data policies and practices: HKU Open Data and Data Pu...
GSmith Springer Nature Data policies and practices: HKU Open Data and Data Pu...
GrahamSmith646206
 
Big data for qualitative research by kathy a. mills (z lib.org)
Big data for qualitative research by kathy a. mills (z lib.org)Big data for qualitative research by kathy a. mills (z lib.org)
Big data for qualitative research by kathy a. mills (z lib.org)
MiguelRosario24
 
Thesis defense, Heather Piwowar, Sharing biomedical research data
Thesis defense, Heather Piwowar, Sharing biomedical research dataThesis defense, Heather Piwowar, Sharing biomedical research data
Thesis defense, Heather Piwowar, Sharing biomedical research data
Heather Piwowar
 
Public engagement while you sleep
Public engagement while you sleepPublic engagement while you sleep
Public engagement while you sleep
UoLResearchSupport
 
On community-standards, data curation and scholarly communication" Stanford M...
On community-standards, data curation and scholarly communication" Stanford M...On community-standards, data curation and scholarly communication" Stanford M...
On community-standards, data curation and scholarly communication" Stanford M...
Susanna-Assunta Sansone
 
Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014
Susanna-Assunta Sansone
 
Science as open enterprise
Science as open enterpriseScience as open enterprise

Similar to Can data access combat fake news? (20)

Practical Implementation of research data policies: Solutions with Dataverse
Practical Implementation of research data policies: Solutions with DataversePractical Implementation of research data policies: Solutions with Dataverse
Practical Implementation of research data policies: Solutions with Dataverse
 
SAMSI Precision Medicine Keynote, August 2018: Data: where Precision Oncology...
SAMSI Precision Medicine Keynote, August 2018: Data: where Precision Oncology...SAMSI Precision Medicine Keynote, August 2018: Data: where Precision Oncology...
SAMSI Precision Medicine Keynote, August 2018: Data: where Precision Oncology...
 
Data Management and Broader Impacts: a holistic approach
Data Management and Broader Impacts: a holistic approachData Management and Broader Impacts: a holistic approach
Data Management and Broader Impacts: a holistic approach
 
NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR
 
Clinical Research Informatics Year-in-Review 2024
Clinical Research Informatics Year-in-Review 2024Clinical Research Informatics Year-in-Review 2024
Clinical Research Informatics Year-in-Review 2024
 
One Funder’s View for Advancing Open Science
One Funder’s View for Advancing Open ScienceOne Funder’s View for Advancing Open Science
One Funder’s View for Advancing Open Science
 
Open science and medical evidence generation - Kees van Bochove - The Hyve
Open science and medical evidence generation - Kees van Bochove - The HyveOpen science and medical evidence generation - Kees van Bochove - The Hyve
Open science and medical evidence generation - Kees van Bochove - The Hyve
 
Big Data: Learning from MIMIC- Celi
Big Data: Learning from MIMIC- CeliBig Data: Learning from MIMIC- Celi
Big Data: Learning from MIMIC- Celi
 
Biomedical Research_House of Cards_Editorial
Biomedical Research_House of Cards_EditorialBiomedical Research_House of Cards_Editorial
Biomedical Research_House of Cards_Editorial
 
Secure Data Sharing and Related Matters – An NIH View
Secure Data Sharing and Related Matters – An NIH ViewSecure Data Sharing and Related Matters – An NIH View
Secure Data Sharing and Related Matters – An NIH View
 
Dichotomania and other challenges for the collaborating biostatistician
Dichotomania and other challenges for the collaborating biostatisticianDichotomania and other challenges for the collaborating biostatistician
Dichotomania and other challenges for the collaborating biostatistician
 
Research methodology
Research methodologyResearch methodology
Research methodology
 
Open Science: Where Theory Meets Practice
Open Science: Where Theory Meets PracticeOpen Science: Where Theory Meets Practice
Open Science: Where Theory Meets Practice
 
GSmith Springer Nature Data policies and practices: HKU Open Data and Data Pu...
GSmith Springer Nature Data policies and practices: HKU Open Data and Data Pu...GSmith Springer Nature Data policies and practices: HKU Open Data and Data Pu...
GSmith Springer Nature Data policies and practices: HKU Open Data and Data Pu...
 
Big data for qualitative research by kathy a. mills (z lib.org)
Big data for qualitative research by kathy a. mills (z lib.org)Big data for qualitative research by kathy a. mills (z lib.org)
Big data for qualitative research by kathy a. mills (z lib.org)
 
Thesis defense, Heather Piwowar, Sharing biomedical research data
Thesis defense, Heather Piwowar, Sharing biomedical research dataThesis defense, Heather Piwowar, Sharing biomedical research data
Thesis defense, Heather Piwowar, Sharing biomedical research data
 
Public engagement while you sleep
Public engagement while you sleepPublic engagement while you sleep
Public engagement while you sleep
 
On community-standards, data curation and scholarly communication" Stanford M...
On community-standards, data curation and scholarly communication" Stanford M...On community-standards, data curation and scholarly communication" Stanford M...
On community-standards, data curation and scholarly communication" Stanford M...
 
Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014
 
Science as open enterprise
Science as open enterpriseScience as open enterprise
Science as open enterprise
 

More from Merce Crosas

Research Data Management @Harvard
Research Data Management @HarvardResearch Data Management @Harvard
Research Data Management @Harvard
Merce Crosas
 
Cloud Dataverse: A Data repository platform for an OpenStack Cloud
Cloud Dataverse: A Data repository platform for an OpenStack CloudCloud Dataverse: A Data repository platform for an OpenStack Cloud
Cloud Dataverse: A Data repository platform for an OpenStack Cloud
Merce Crosas
 
Data Repositories Impact
Data Repositories ImpactData Repositories Impact
Data Repositories Impact
Merce Crosas
 
Dataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTagsDataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTags
Merce Crosas
 
FAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingFAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data Sharing
Merce Crosas
 
The Data Lifecycle (Harvard DataFest)
The Data Lifecycle (Harvard DataFest)The Data Lifecycle (Harvard DataFest)
The Data Lifecycle (Harvard DataFest)
Merce Crosas
 
Cloud Dataverse
Cloud DataverseCloud Dataverse
Cloud Dataverse
Merce Crosas
 
Making Data Accessible
Making Data AccessibleMaking Data Accessible
Making Data Accessible
Merce Crosas
 
Abcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasAbcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosas
Merce Crosas
 
The DataTags System: Sharing Sensitive Data with Confidence
The DataTags System: Sharing Sensitive Data with ConfidenceThe DataTags System: Sharing Sensitive Data with Confidence
The DataTags System: Sharing Sensitive Data with Confidence
Merce Crosas
 
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Merce Crosas
 
Connecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life CycleConnecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life Cycle
Merce Crosas
 
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
Merce Crosas
 
A very Brief History of Communicating Science
A very Brief History of Communicating ScienceA very Brief History of Communicating Science
A very Brief History of Communicating Science
Merce Crosas
 
Data Citation Implementation at Dataverse
Data Citation Implementation at DataverseData Citation Implementation at Dataverse
Data Citation Implementation at Dataverse
Merce Crosas
 
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Merce Crosas
 
Dataverse on the MOC
Dataverse on the MOCDataverse on the MOC
Dataverse on the MOC
Merce Crosas
 
The Dataverse Commons
The Dataverse CommonsThe Dataverse Commons
The Dataverse Commons
Merce Crosas
 
Data Publishing at Harvard's Research Data Access Symposium
Data Publishing at Harvard's Research Data Access SymposiumData Publishing at Harvard's Research Data Access Symposium
Data Publishing at Harvard's Research Data Access Symposium
Merce Crosas
 
Dataverse hpdm symposium
Dataverse   hpdm symposiumDataverse   hpdm symposium
Dataverse hpdm symposium
Merce Crosas
 

More from Merce Crosas (20)

Research Data Management @Harvard
Research Data Management @HarvardResearch Data Management @Harvard
Research Data Management @Harvard
 
Cloud Dataverse: A Data repository platform for an OpenStack Cloud
Cloud Dataverse: A Data repository platform for an OpenStack CloudCloud Dataverse: A Data repository platform for an OpenStack Cloud
Cloud Dataverse: A Data repository platform for an OpenStack Cloud
 
Data Repositories Impact
Data Repositories ImpactData Repositories Impact
Data Repositories Impact
 
Dataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTagsDataverse, Cloud Dataverse, and DataTags
Dataverse, Cloud Dataverse, and DataTags
 
FAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingFAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data Sharing
 
The Data Lifecycle (Harvard DataFest)
The Data Lifecycle (Harvard DataFest)The Data Lifecycle (Harvard DataFest)
The Data Lifecycle (Harvard DataFest)
 
Cloud Dataverse
Cloud DataverseCloud Dataverse
Cloud Dataverse
 
Making Data Accessible
Making Data AccessibleMaking Data Accessible
Making Data Accessible
 
Abcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasAbcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosas
 
The DataTags System: Sharing Sensitive Data with Confidence
The DataTags System: Sharing Sensitive Data with ConfidenceThe DataTags System: Sharing Sensitive Data with Confidence
The DataTags System: Sharing Sensitive Data with Confidence
 
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
 
Connecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life CycleConnecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life Cycle
 
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
The Rise of Data Publishing in the Digital World (and how Dataverse and DataT...
 
A very Brief History of Communicating Science
A very Brief History of Communicating ScienceA very Brief History of Communicating Science
A very Brief History of Communicating Science
 
Data Citation Implementation at Dataverse
Data Citation Implementation at DataverseData Citation Implementation at Dataverse
Data Citation Implementation at Dataverse
 
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
 
Dataverse on the MOC
Dataverse on the MOCDataverse on the MOC
Dataverse on the MOC
 
The Dataverse Commons
The Dataverse CommonsThe Dataverse Commons
The Dataverse Commons
 
Data Publishing at Harvard's Research Data Access Symposium
Data Publishing at Harvard's Research Data Access SymposiumData Publishing at Harvard's Research Data Access Symposium
Data Publishing at Harvard's Research Data Access Symposium
 
Dataverse hpdm symposium
Dataverse   hpdm symposiumDataverse   hpdm symposium
Dataverse hpdm symposium
 

Recently uploaded

一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Subhajit Sahu
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Enterprise Wired
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 

Recently uploaded (20)

一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 

Can data access combat fake news?

  • 1. Can data access combat fake news? Mercè Crosas Chief Data Science and Technology Officer, Institute for Quantitative Social Science, Harvard University
  • 2. ๏ Science must maintain credibility when moving from the lab to the public domain ๏ Journalistic credibility could be improved by adopting some standards and best practices used in science Connecting Science with Journalism
  • 4. What do we need to enable replication? ๏ Provide persistent access to data (and code) ๏ Data should: ๏ be well documented ๏ be accessible in reusable formats ๏ be open, or have a clear license and terms of use ๏ include provenance describing origin and transformations ๏ archived in trusted repositories
  • 5. It’s not “Fake”, but can we reproduce it? In Psychology: 39 out of 100 rated as successful replications In Biomedicine: 6 out of 53 “landmark” cancer studies were successfully replicated
  • 6. “…data can be narrowed and expanded (p-hacked) to make either hypothesis appear correct.That’s because answering even a simple scientific question requires lots of choices that can shape the results.” Science isn’t unreliable, but is more challenging than we give it credit for Aschwanden, 2015, FiveThirtyEight
  • 7. Science has a mechanism for self-correction “Instances in which scientists detect and address flaws in work constitute evidence of success, not failure.” “Ensuring that the integrity of science is protected is the responsibility of many stakeholders.” Alberts et al., 2015, Science
  • 8. Fraud happens, but is rare “His lab’s controversial work raised the possibility that the human heart could repair itself, but many other scientists have been unable to replicate the research.”
  • 9. The scientific community is establishing best practices to address replication issues
  • 10. Journal Data Policies are being applied across disciplines Genetics Biomedical Computational Science Economics Ecology “weak”= recommend “strong” = require Castro, Crosas, Garnett, Sheridan, Altman, 2017, Journal of Scholarly Publishing
  • 11. Authors comply with strong data policies 0 10 20 30 40 50 60 70 80 90 100 Percentage of articles with available replication material Data Sharing in Top Political Science Journals (Key 2016)* Mandatory Replication Material or Verification Replication Material Expected No Requirement Key, 2016, Political Science & Politics; via: Sebastian Karcher Recently, 27 political science journals adopted the Journals Editors’ Transparency Statement 6 top journals in Political Science
  • 12. Journal guidelines encourage reporting of statistics and open research practices Giofrè D, Cumming G, Fresc L, Boedker I, Tressoldi P (2017) The influence of journal submission guidelines on authors' reporting of statistics and use of open research practices. PLOS ONE 12(4): e0175583. https://doi.org/10.1371/journal.pone.0175583 NHST = null hypothesis significance testing; CI = confidence intervals; MA = meta-analysis; CI_interp = confidence intervals interpretation; ES_interp = effect size interpretation; Data_excl = exclusion criteria reported; Material = additional materials availability; Prereg = preregistered study. In Psychology Science: • use of confidence intervals grows from 28% in 2013 to 70% in 2015 • availability to open data, from 3% to 39%
  • 14. For AJPS: Replication review becomes part of the submission workflow + + 0 Resubmits 1 Resubmit 2 Resubmits 3 Resubmits 4 Resubmits 0 10 20 30 40 3 17 31 37 8 Total: 95 completed verifications via Thu-Mai Lewis Christian, Data Archives, ODUM Institute
  • 15. From News Article to Data 1800 bicycle collisions from 2009 to 2012
  • 17. Connecting Science with Journalism ๏ Science must maintain credibility when moving from the lab to the public domain ๏ Journalistic credibility could be improved by adopting some standards and best practices used in science ๏ Apply data policies that require access to data (and code) ๏ Include citation to data in trusted repositories ๏ Provide guidelines for documenting statistics and data ๏ Add data fluency to journalism school experience
  • 18. Education is Key: New data programs in journalism Data Concentration Program in Columbia School of Journalism offered since 2014 Barret, Phillips, 2016,Teaching Data and Computational Journalism Report From 113 programs in Journalism Schools
  • 19. “Journalism schools that embrace both teaching and research into data journalism methods will be poised to fundamentally improve the way future journalists will inquire into maters of public interest and communicate with their audiences.”
  • 20. “Science is the subset of knowledge on which we can aspire to universal agreement*” *Richard Muller, 2016, Now: The Physics of Time But, can we also aspire to universal agreement for information that adheres to scientific rigor?
  • 21. Report written by: David Lazer, Matthew Baum, Nir Grinberg, Lisa Friedland, Kenneth Joseph,Will Hobbs, and Carolina Mattsson This recent comprehensive, useful report says: “As a research community, … collaborate more closely with journalists in order to make the truth “louder,”” This talk adds: make the truth “louder” and more verifiable, with access to data
  • 22. Thanks Mercè Crosas, Institute for Quantitative Social Science, Harvard University mercecrosas.com @mercecrosas