WikiPathways
Pathway Models for Network Analysis
Martina Summer-Kutmon, PhD
Maastricht Centre for Systems Biology (MaCSBio)
Department of Bioinformatics (BiGCaT)
Maastricht University
4 September 2020
BioNetVisA 2020 workshop
Acknowledgements
core development and curation team
and many contributors and curators
around the world
WikiPathways Introduction
Slenter DN, Kutmon M, Hanspers K, Riutta A, Windsor J, Nunes N, Mélius J, Cirillo E, Coort SL,
Digles D, Ehrhart F, Giesbertz P, Kalafati M, Martens M, Miller R, Nishida K, Rieswijk L,
Waagmeester A, Eijssen LMT, Evelo CT, Pico AR, Willighagen EL
WikiPathways: a multifaceted pathway database bridging metabolomics to other omics research.
Nucleic Acids Res. 2018 Jan 4;46(D1):D661-D667. doi: 10.1093/nar/gkx1064.
WikiPathways
• Launched in 2008 as an experiment in
community-based curation of biological pathways
Too much data!
Difficult to keep knowledge
up-to-date, accessible and
integrated
Taking advantage of direct
participation by a greater portion
of the community (crowdsourcing)
Image:
https://www.vizioninteractive.com/blog/data-overload-when-it-all-becomes-too-much/
WikiPathways
• A Wikipedia for pathways
- Built on MediaWiki (the same wiki software package
used by wikipedia.org)
- Collection and curation of knowledge
- Community curated
- Everybody can contribute pathways
- Everybody can edit and curate pathways
- Everybody can use the pathway collections
WikiPathways
• Advantages
- Fast
- New findings can be added immediately
- Collaborative
- Researchers can exchange ideas and discuss pathways
- Collaborations with other manually curated pathway
databases (Reactome, NetPath)
- Flexible
- Pathways under development or hypothetical pathways
- Disease-specific pathways
- Cell-type-specific pathways
Pathway pages
https://www.wikipathways.org/index.php/Pathway:WP545
Title
Authors
Clickable diagram
Quality tags
Ontology tags
Bibliography
History
Discussion page
Download
Community portals
• Special interest groups
• Portal pages to highlight communities
COVID-19 portal
• Collaboration within the
COVID-19 DiseaseMap project
• Ongoing curation effort
• Grant for curation and development of new
software features
COVID-19 Disease Map, building a computational repository of
SARS-CoV-2 virus-host interaction mechanisms (2020)
https://doi.org/10.1038/s41597-020-0477-8
Content and coverage
WikiPathways content
• Statistics
- 2,887 pathways
- 739 contributors
• August 2020 release
- Curated collection
- 1,998 pathways in 25 species
- Focus still mainly on human pathways
- In the last month: 165 edits from 21 contributors
Images:
https://cybra.com/wp-content/uploads/2015/09/statistics.png
Data accessibility
• Download
- For each pathway
- Collections in monthly releases
• Data formats
- GPML (graphical pathway markup language)
- PNG, SVG, PDF (images)
- BioPAX (biological pathway exchange language)
- Gene lists / GMT files
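The gene lists can be read with a few lines of code. Below is a minimal Python sketch, assuming the common GMT layout (one gene set per line, tab-separated: set name, description, then gene identifiers). The function name `parse_gmt` and the sample line are made up for illustration and are not taken from a WikiPathways release.

```python
from typing import Dict, List


def parse_gmt(text: str) -> Dict[str, List[str]]:
    """Parse GMT-formatted text into {gene set name: [gene identifiers]}."""
    gene_sets: Dict[str, List[str]] = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank lines and lines without any gene column
        name, _description, *genes = fields
        # Drop empty strings left behind by trailing tabs.
        gene_sets[name] = [g for g in genes if g]
    return gene_sets


# Hypothetical example content; the name, URL and identifiers are placeholders.
sample = "Example pathway\thttp://example.org/WP0\t1234\t5678\t9012\n"
print(parse_gmt(sample))
```

The resulting dictionary can be passed directly to most enrichment or network-building code that expects named gene sets.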
Data accessibility
• Programmatic access
- REST API
- RDF, semantic web
- rWikiPathways
- Cytoscape app
- NDEx
- Wikidata
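As an illustration of the REST route, the sketch below only builds a request URL and does not contact any server. The base URL, the method name `getPathway` and the parameter names `pwId` and `format` are assumptions about the web service as it was around the time of this talk; check the current WikiPathways documentation before relying on them. `build_request_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumption: base URL of the WikiPathways web service (may have changed).
BASE_URL = "https://webservice.wikipathways.org"


def build_request_url(method: str, **params: str) -> str:
    """Build a GET URL for a web service method with query parameters."""
    query = urlencode(params)
    return f"{BASE_URL}/{method}?{query}" if query else f"{BASE_URL}/{method}"


# Assumed method and parameter names; WP545 is the pathway shown on the
# "Pathway pages" slide.
url = build_request_url("getPathway", pwId="WP545", format="json")
print(url)
```

The URL could then be fetched with `urllib.request.urlopen` or a similar HTTP client. For R users, the rWikiPathways package listed above wraps the same kind of access.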
User stats
• Statistics in the last year
- ~15k-20k visitors a month
- >500,000 REST webservice requests per month
Human gene coverage
40% of protein-coding
genes not present in
any pathway db
Only ~300 non-protein-coding
genes
Many protein-coding genes
only present in one of the databases
577 (KEGG), 710 (WP), 3,320 (Reactome)
Data: December 2018
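Coverage figures of this kind come down to set operations on the gene lists of each database. A minimal Python sketch follows, using toy gene identifiers (G1 to G8) that are purely hypothetical; it does not reproduce the numbers on this slide, and the function names are illustrative.

```python
from typing import Dict, Set


def unique_per_database(db_genes: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, for each database, the genes found in no other database."""
    unique: Dict[str, Set[str]] = {}
    for name, genes in db_genes.items():
        others = set().union(*(g for n, g in db_genes.items() if n != name))
        unique[name] = genes - others
    return unique


def uncovered_fraction(all_genes: Set[str], db_genes: Dict[str, Set[str]]) -> float:
    """Fraction of all_genes that appear in none of the databases."""
    covered = set().union(*db_genes.values())
    return len(all_genes - covered) / len(all_genes)


# Toy data, not real database content.
all_genes = {f"G{i}" for i in range(1, 9)}
db_genes = {
    "KEGG": {"G1", "G2", "G3"},
    "WP": {"G2", "G3", "G4"},
    "Reactome": {"G3", "G5", "G6"},
}
print(unique_per_database(db_genes))
print(uncovered_fraction(all_genes, db_genes))
```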
Missing knowledge
Extracting information from pathway figures
Improving coverage
Published
figures of
pathways
WikiPathways
Anders Riutta, Alex Pico
Identifying Genes in Published Pathway Figure Images (2018)
https://doi.org/10.1101/379446
Difficulty level of pathway figures
Publications with pathway figures
• PubMed Central image search for a set of
pathway types
- 235,000 figures between 1995 and 2020
• Classification of figures
- Machine learning -> 64,643 actual pathway figures
• OCR to identify genes in pathway figures
- Interesting gene sets that can be used to prioritize
curation and perform enrichment analysis
25 Years of Pathway Figures (2020)
https://doi.org/10.1101/2020.05.29.124503
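The last step, turning OCR output into a gene set, can be pictured as matching recognised text tokens against a list of known gene symbols. The Python sketch below is a deliberately simplified illustration of that idea, not the pipeline used in the cited work; the function name, the tokenisation rule and the case-insensitive matching are all assumptions.

```python
import re
from typing import Iterable, List, Set


def find_gene_symbols(ocr_text: str, known_symbols: Iterable[str]) -> List[str]:
    """Return known gene symbols found in OCR text, in order of first appearance."""
    lexicon: Set[str] = {s.upper() for s in known_symbols}
    found: List[str] = []
    # Tokens are runs of letters, digits and hyphens; everything else separates.
    for token in re.findall(r"[A-Za-z0-9-]+", ocr_text):
        symbol = token.upper()  # case-insensitive match (an assumption)
        if symbol in lexicon and symbol not in found:
            found.append(symbol)
    return found


# Hypothetical OCR output from a pathway figure and a tiny symbol lexicon.
text = "TP53 -> MDM2 activates p53; Akt1, TP53"
print(find_gene_symbols(text, ["TP53", "MDM2", "AKT1"]))
```

Exact lookup like this misses aliases and OCR misreads, and ignoring case can produce false positives for short symbols, so a real pipeline would need more careful normalisation.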
Interactive query tool
https://gladstone-bioinformatics.shinyapps.io/shiny-25years/
Network analysis
Pathway / Network view
WikiPathways App for Cytoscape: Making biological pathways
amenable to network analysis and visualization
(2014) https://doi.org/10.12688/f1000research.4254.2
Network analysis
Gene-pathway associations
Automatic extension
CyTargetLinker app update: A flexible solution
for network extension in Cytoscape (2019)
https://doi.org/10.12688/f1000research.14613.2
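Gene-pathway associations form a bipartite network: gene nodes on one side, pathway nodes on the other, with one edge per membership. A minimal Python sketch of that representation follows, assuming pathways are available as named gene sets (for example from a GMT file). The function names and the toy data are hypothetical; the result is a plain edge list that could be imported into Cytoscape as a table.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def gene_pathway_edges(pathways: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """One (gene, pathway) edge per membership, sorted for stable output."""
    return sorted((gene, pw) for pw, genes in pathways.items() for gene in genes)


def pathways_per_gene(edges: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Index the edge list by gene to see which genes link several pathways."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for gene, pw in edges:
        index[gene].add(pw)
    return dict(index)


# Toy data, not real pathway content.
pathways = {"P1": {"A", "B"}, "P2": {"B", "C"}}
edges = gene_pathway_edges(pathways)
print(edges)
print(pathways_per_gene(edges))
```

Genes that map to more than one pathway (here `B`) are the nodes that connect pathways in the network view.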
Pathway overlap / connections
Primary open‐angle glaucoma
Molecular pathogenesis
Comprehensive bioinformatics analysis of trabecular
meshwork gene expression data to unravel the molecular
pathogenesis of primary open‐angle glaucoma (2020)
https://doi.org/10.1111/aos.14154
Ilona Liesenborghs
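One common way to quantify pathway overlap is to count shared genes between each pair of pathways and normalise by the size of their union (the Jaccard index). The Python sketch below shows that calculation on toy data; it is an illustration of the general idea, not necessarily the method used in the cited study, and the function names and the `min_shared` threshold are assumptions.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pathway_overlap_edges(
    pathways: Dict[str, Set[str]], min_shared: int = 1
) -> List[Tuple[str, str, int, float]]:
    """Return (pathway1, pathway2, shared gene count, Jaccard index) per pair."""
    edges = []
    for p1, p2 in combinations(sorted(pathways), 2):
        shared = len(pathways[p1] & pathways[p2])
        if shared >= min_shared:
            edges.append((p1, p2, shared, jaccard(pathways[p1], pathways[p2])))
    return edges


# Toy data, not real pathway content.
pathways = {"P1": {"A", "B", "C"}, "P2": {"B", "C", "D"}, "P3": {"E"}}
print(pathway_overlap_edges(pathways))
```

Each returned tuple corresponds to one edge in a pathway-pathway network; pathways without shared genes (here `P3`) remain unconnected.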
Active module analysis
Beyond Pathway Analysis: Identification of
Active Subnetworks in Rett Syndrome (2019)
https://doi.org/10.3389/fgene.2019.00059
Ryan Miller
Network of all pathways
Active modules
Metabolic pathways
Denise Slenter
Sparse metabolic data
Pathway crosstalk
Manuscript in preparation
Summary
• WikiPathways – collaborative pathway database
• Open source / open data
• Easy programmatic access
• Pathway visualization in PathVisio/Cytoscape
• Network representations of pathway data
- RDF, Neo4j, Gene-Pathway networks
• Network analysis in Cytoscape
Thank you for your attention
Questions?
Martina Summer-Kutmon
martina.kutmon@maastrichtuniversity.nl
twitter: mkutmon
