GWAS and DAS

Verena139

Mining Data Availability Statements for GWAS data, presentation by Jo McEntyre, EMBL-EBI


  • 1. Jo McEntyre, EMBL-EBI: Mining Data Availability Statements for GWAS data
  • 2. GWAS and the GWAS Catalog
      • GWAS analyse variants across the genome to identify loci associated with a disease or phenotype
      • GWAS Catalog data:
          Study metadata, including:
            - Trait
            - Sample information
          Publication information
          Results:
            - Lead associations
            - Summary statistics
  • 3. GWAS Catalog content (as of October 2019; www.ebi.ac.uk/gwas)
      • 4,220 publications
      • 7,661 studies
      • 157,336 variant-trait associations
      • 276 publications with summary statistics, >8,000 datasets
  • 4. What is Europe PMC? Europe PMC is a free digital archive of biomedical and life sciences research publications
  • 5. Content in Europe PMC: Europe PMC is a partner in PubMed Central International
  • 6. Text mining infrastructure
      • Gene-disease relationships
      • Mutations
      • GeneRIFs
      • Diseases and phenotypes
      • Phosphorylation events
      • Transcription factor-target interactions
      • Organisms
      • Gene/proteins
      • GO terms
      • ChEBI
      • EFO
      • Grants
      • Accession numbers
  • 7. Text mining platform: SciLite application
  • 8. Accession numbers mined from full text publications ELIXIR Core Data Resources and Deposition Databases
  • 9. Cross-links between GWAS and Europe PMC
  • 11. <title> and XML path
      Top 10 combinations of <title> content containing “data” and XML path:

      Title                              XML path                       Frequency
      Data Availability                  article:front:notes            90,928
      Data accessibility                 article:back:sec               2,694
      Data Availability                  article:back:sec:fn-group      2,580
      Data                               article:body:sec               2,265
      Availability of supporting data    article:body:sec               1,593
      Major datasets                     article:back:sec:sec           1,074
      Database survey                    article:body:sec               986
      Extended Data                      article:body:sec               851
      Data availability                  article:body:sec               795
      Extended Data Figure 1             article:body:sec:SecTag:fig    689
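The slide does not say how the title and path combinations were collected. As a minimal sketch of one way such a tally could be produced, the Python below walks a JATS-like XML document, finds every <title> whose text contains "data", and counts it together with the colon-joined path of its parent element. The sample XML, the function name `data_titles` and the path format are assumptions made for illustration, not the Europe PMC pipeline.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented, heavily simplified article used only to exercise the sketch.
SAMPLE = """<article>
  <front><notes><title>Data Availability</title><p>All data are in the paper.</p></notes></front>
  <body>
    <sec><title>Methods</title><p>Samples were genotyped.</p></sec>
    <sec><title>Availability of supporting data</title><p>See repository.</p></sec>
  </body>
  <back><sec><title>Data accessibility</title><p>Deposited in a repository.</p></sec></back>
</article>"""


def data_titles(element, path=()):
    """Yield (title text, colon-joined path of the title's parent) for titles mentioning 'data'."""
    here = path + (element.tag,)
    for child in element:
        if child.tag == "title":
            text = "".join(child.itertext()).strip()
            if "data" in text.lower():
                yield text, ":".join(here)
        else:
            # Descend into non-title children, extending the path as we go.
            yield from data_titles(child, here)


counts = Counter(data_titles(ET.fromstring(SAMPLE)))
for (title, xml_path), frequency in counts.most_common():
    print(f"{title}\t{xml_path}\t{frequency}")
```

Run over a full-text corpus rather than one sample, the same counter would give a frequency column comparable in shape to the one on the slide.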
  • 14. Curating papers for the GWAS catalog
  • 15. GWAS Catalog literature identification: Query based vs machine learning

      Metric       Query-based    Machine learning
      Precision    6%             27%
      Recall       100%           96%

      Improved efficiency with machine learning: 80% reduction in publications to review (average 144 to 30 per week)
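The efficiency figure can be checked with simple arithmetic. The sketch below computes the relative reduction in weekly review load and, under the assumption (not stated on the slide) that the reported precision applies to the weekly volume, the expected number of eligible publications per week under each method. The function names are illustrative.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which the review load shrinks when going from `before` to `after`."""
    return (before - after) / before


def expected_eligible(reviewed_per_week: float, precision: float) -> float:
    """Expected eligible publications among those reviewed, given a precision."""
    return reviewed_per_week * precision


reduction = relative_reduction(144, 30)            # weekly load, from the slide
eligible_query = expected_eligible(144, 0.06)      # query-based search
eligible_ml = expected_eligible(30, 0.27)          # machine learning search

print(f"Reduction in publications to review: {reduction:.1%}")
print(f"Expected eligible per week (query-based): {eligible_query:.2f}")
print(f"Expected eligible per week (machine learning): {eligible_ml:.2f}")
```

The reduction works out to about 79%, which the slide rounds to 80%. Under the stated assumption both methods surface roughly eight to nine eligible publications per week, which is in line with the "similar capture of eligible studies" claim.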
  • 16. Summary statistics in the GWAS Catalog by publication year: % of publications with summary statistics over time and in the whole Catalog
  • 17. Summary statistics for users: facilitating data integration and downstream analyses
  • 20. GWAS Catalog literature identification
      • Previously used manual query based search term
          • Query: genomewide OR genome wide OR genome-wide OR GWAS
      • Now replaced with machine learning based search
          • Convolutional neural net trained on corpus of GWAS Catalog publications
          • Collaboration with Zhiyong Lu’s group (Lee et al, PMID 30102703, PLoS Comp Bio)
      • ML results triaged by curator in custom Pubtator interface
  • 21. Old literature search and triage process
      • Manual search in PubMed
          • Query: genomewide OR genome wide OR genome-wide OR GWAS
      • Curator assesses each publication for eligibility for inclusion in GWAS Catalog
      • Specific eligibility criteria: https://www.ebi.ac.uk/gwas/docs/methods/criteria
          • Genome wide association study of >100,000 variants distributed genome
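To illustrate why a keyword query of this kind has very high recall but low precision, the sketch below approximates the four OR-ed terms with a single case-insensitive regular expression and applies it to a few invented titles. This is a local approximation only: it does not reproduce how PubMed parses or expands the query, and the titles and function name are made up for the example.

```python
import re

# Approximation of: genomewide OR genome wide OR genome-wide OR GWAS
QUERY_PATTERN = re.compile(r"\b(?:genomewide|genome[ -]wide|GWAS)\b", re.IGNORECASE)


def matches_query(text: str) -> bool:
    """Return True if any of the query terms occurs in the text."""
    return QUERY_PATTERN.search(text) is not None


# Invented titles, used only to exercise the matcher.
titles = [
    "A genome-wide association study of height",
    "GWAS of type 2 diabetes in a large cohort",
    "Genome wide CRISPR screen identifies regulators of autophagy",
    "A candidate gene study of asthma",
]

hits = [title for title in titles if matches_query(title)]
for title in hits:
    print(title)
```

The third title matches the query although it is not an association study, which is the sort of false positive a curator would have to discard by hand.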
  • 22. Machine learning search
      • Deep learning algorithm (convolutional neural net) trained on corpus of GWAS Catalog publications
      • Figure 1, Lee et al, PMID 30102703, PLoS Comp Bio
  • 23. GWAS Catalog machine learning literature search method
      • Precision 27%
      • Recall 96%
      • Table 3, Lee et al, PMID 30102703, PLoS Comp Bio
  • 24. GWAS Catalog machine learning literature search method vs query based search
      • Machine learning:
          • Improved efficiency (80% reduction in publications to review, 144 to 30/week)
          • Similar capture of eligible studies
      • Table 3, Lee et al, PMID 30102703, PLoS Comp Bio
  • 26. DOI citations within DASs
      • Most popular data repositories based on DOI citations in DASs (Jan-Mar 2019)
      • Pattern used to capture the DOI prefix: (?i)(10[.]\d{4,9})(?=/)(?=[-._;()/:A-Z0-9]+)
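As a minimal sketch of how the pattern on this slide could be used to rank repositories, the Python below extracts DOI prefixes from a few data availability statements and counts them. It assumes the digit class in the pattern is `\d` (the backslash is easily lost in slide transcripts). The statements and DOI suffixes are invented placeholders, and the prefix-to-repository mapping is a small illustrative lookup rather than the one used in the talk.

```python
import re
from collections import Counter

# Captures the DOI prefix (registrant code) when it is followed by "/" and a suffix.
DOI_PREFIX = re.compile(r"(?i)(10[.]\d{4,9})(?=/)(?=[-._;()/:A-Z0-9]+)")

# Illustrative lookup of a few well-known repository prefixes.
PREFIX_NAMES = {
    "10.5281": "Zenodo",
    "10.6084": "figshare",
    "10.5061": "Dryad",
}

# Invented statements with placeholder DOI suffixes.
statements = [
    "Data are available from Zenodo at https://doi.org/10.5281/zenodo.0000000.",
    "Summary statistics were deposited in figshare (doi:10.6084/m9.figshare.0000000).",
    "Code and data: 10.5281/zenodo.0000001.",
    "Data are available from the authors on request.",
]

counts = Counter(
    prefix for statement in statements for prefix in DOI_PREFIX.findall(statement)
)

for prefix, frequency in counts.most_common():
    print(PREFIX_NAMES.get(prefix, "unknown"), prefix, frequency)
```

Because the suffix is only checked with lookaheads, each match is the prefix alone, so counting matches directly groups citations by registrant.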