Quest for Orthologs*
anchoring comparative biology research
Sharing and delivery of reusable phylogenetic knowledge
TDWG 2013 Annual Conference - October 2013 Florence, Italy
*http://questfororthologs.org/

Wednesday, October 23, 13
Evolutionary conservation allows knowledge
transfer from well-characterized model
organisms to human & other organisms and is
the basis for comparative genomic studies

The barriers
More than 30 phylogenomic databases provide their analysis
results to the scientific community.
The content of these databases differs
The concepts of these databases also differ
Complex/slow pipelines
Unavailability as stand-alone programs
Different output formats
Lack of benchmarking data sets
Consequently, comparing and choosing is difficult

Who we are
SWISS INSTITUTE OF BIOINFORMATICS
EUROPEAN BIOINFORMATICS INSTITUTE
STOCKHOLMS UNIVERSITET
EIDGENÖSSISCHE TECHNISCHE HOCHSCHULE ZÜRICH
INSTITUT DE GÉNÉTIQUE ET MICROBIOLOGIE
NATIONAL INSTITUTE FOR BASIC BIOLOGY, JAPAN
SANGER INSTITUTE
EUROPEAN MOLECULAR BIOLOGY LABORATORY
INSTITUT NATIONAL DE LA RECHERCHE AGRONOMIQUE
UNIVERSIDAD DE MURCIA
JOINT GENOME INSTITUTE
UNIVERSITY OF LAUSANNE
SYNGENTA
UNIVERSITÄT BONN
UNIVERSITY OF CAMBRIDGE
CENTRE INTERNATIONALE POUR LA RECHERCHE AGRONOMIQUE POUR LE DÉVELOPPEMENT
JACKSON LABORATORY
UNIVERSITÉ DE LYON
UNIVERSITÉ DE GENÈVE
PRINCETON UNIVERSITY
CENTRE DE REGULACIÓ GENOMICS, BARCELONA
UNIVERSITY OF PENNSYLVANIA

Quest for Orthologs’ objectives
A collaboration of phylogenomic databases
Use shared reference datasets (proteomes and
species trees)
Benchmark orthology predictions
Use an agreed format
Evaluate emerging new methods

QfO - proteomes
Criterion 1: include the major experimental model
organisms
Criterion 2: include a broad taxonomic range of genomes
Common dataset: QfO Reference Proteome:
http://www.ebi.ac.uk/reference_proteomes
Currently 147 species, all publicly available and
generated from UniProtKB, Ensembl and Ensembl
Genomes
Additional species on request; annual release in April

QfO - Format
Common format: OrthoXML: http://seqxml.org
Designed for representing the orthology relationships
that are generated as output
Sign ups: Ensembl Compara*, HCOP, InParanoid*,
MBGD Microbial Genome Database, OMA*,
OrthoInspector, OrthoMCL, Panther, PHOG, PhyloFacts,
PhylomeDB, ProGMap, Roundup*
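
A minimal sketch of reading ortholog groups from an OrthoXML document, using only the Python standard library. The embedded sample is a hypothetical, heavily simplified file: the element and attribute names (`species`, `gene`, `protId`, `orthologGroup`, `geneRef`) and the namespace follow the OrthoXML schema as best understood here and should be checked against the published schema before use. The gene identifiers and the `flat_ortholog_groups` helper are illustrative assumptions, and nested groups are deliberately ignored.

```python
import xml.etree.ElementTree as ET

# Namespace assumed from the OrthoXML schema; verify against the official XSD.
OX = "http://orthoXML.org/2011/"
NS = {"ox": OX}

# Hypothetical, simplified sample document (not real database output).
SAMPLE = """<orthoXML xmlns="http://orthoXML.org/2011/" version="0.3"
          origin="example" originVersion="1">
  <species name="Homo sapiens" NCBITaxId="9606">
    <database name="example_db" version="1">
      <genes>
        <gene id="1" protId="HUMAN_A"/>
      </genes>
    </database>
  </species>
  <species name="Mus musculus" NCBITaxId="10090">
    <database name="example_db" version="1">
      <genes>
        <gene id="2" protId="MOUSE_A"/>
        <gene id="3" protId="MOUSE_B"/>
      </genes>
    </database>
  </species>
  <groups>
    <orthologGroup id="g1">
      <geneRef id="1"/>
      <geneRef id="2"/>
    </orthologGroup>
  </groups>
</orthoXML>"""


def flat_ortholog_groups(xml_text):
    """Map each top-level ortholog group id to the protein ids it lists directly."""
    root = ET.fromstring(xml_text)
    # Internal gene id -> protein identifier, collected across all species.
    prot_by_id = {g.get("id"): g.get("protId") for g in root.iter(f"{{{OX}}}gene")}
    groups = {}
    for group in root.findall("ox:groups/ox:orthologGroup", NS):
        refs = group.findall("ox:geneRef", NS)  # direct children only
        groups[group.get("id")] = [prot_by_id[r.get("id")] for r in refs]
    return groups


print(flat_ortholog_groups(SAMPLE))  # {'g1': ['HUMAN_A', 'MOUSE_A']}
```

Because every provider writes the same group and gene-reference structure, one reader of this kind could in principle be reused across all databases that export the format.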

QfO - Benchmark
Compared: Ensembl Compara, InParanoid (Full, core),
MetaPhOrs (Missing 3 genomes), OMA (Pairs, Groups,
HOGs), OrthoInspector 1.30, PANTHER 8.0 (LDO only, all),
PhylomeDB, RSD 0.8 1e-5 (RoundUp)
OrthoBench: http://orthology.benchmarkservice.org
Battery of approaches: species-tree discordance test; gold
standard gene trees; gold standard (hierarchical)
orthologous groups
Minimum standard and sanity check (already useful):
Minimum Information for an Orthology Prediction Algorithm?
Guide to improve algorithms
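
To illustrate the idea of benchmarking against a gold standard, here is a minimal sketch that scores predicted ortholog pairs against a reference set using precision and recall. This is not the scoring used by the benchmark service: the function names and the toy gene identifiers are assumptions for illustration, and a real comparison would normally also restrict predictions to genes covered by the reference.

```python
def normalise(pairs):
    """Turn an iterable of (gene_a, gene_b) into a set of unordered pairs."""
    return {frozenset(p) for p in pairs if len(set(p)) == 2}


def precision_recall(predicted, reference):
    """Return (precision, recall) of predicted pairs against reference pairs."""
    pred, ref = normalise(predicted), normalise(reference)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    return precision, recall


# Toy data (hypothetical identifiers): pair order does not matter.
predicted = [("A1", "B1"), ("B2", "A2"), ("A3", "B9")]
reference = [("A1", "B1"), ("A2", "B2"), ("A4", "B4"), ("A5", "B5")]

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In the toy data, two of the three predictions are confirmed by the reference, and two of the four reference pairs are recovered.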

A species tree is key
A reliable species phylogeny enhances prediction of gene relationships
The current cladogram comprises 147 species from the reference
datasets and is based on information from various resources
Newick format
3 identifiers: UniProtKB species code, scientific name, NCBI taxid
Relevant publications for speciation nodes possible

For QfO benchmarking, the tree only needs to cover currently accepted models of
species evolution
A time-tree would be desirable to define rules for the introduction of
multi-furcating nodes for benchmarking purposes
Ortholog DB providers use it for gene/species tree reconciliation
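
A minimal sketch of reading a Newick species tree and reporting its leaves and multi-furcating nodes, in plain Python. The toy tree below is not the QfO tree; it only reuses UniProtKB-style species codes as leaf labels. How the three identifiers are encoded in the real labels is not stated here, so the sketch treats each label as an opaque string. The parser is deliberately simple and does not support quoted labels or comments.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (label, children) tuples."""
    text = text.strip().rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume '('
            children.append(node())
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ','
                children.append(node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ')'
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        # Drop an optional ':branch_length' suffix from the label.
        label = text[start:pos].split(":")[0].strip()
        return label, children

    tree = node()
    if pos != len(text):
        raise ValueError(f"unexpected character at position {pos}")
    return tree


def leaves(tree):
    """List leaf labels in left-to-right order."""
    label, children = tree
    if not children:
        return [label]
    return [leaf for child in children for leaf in leaves(child)]


def multifurcations(tree):
    """List labels of internal nodes with more than two children."""
    label, children = tree
    found = [label] if len(children) > 2 else []
    for child in children:
        found.extend(multifurcations(child))
    return found


# Toy tree for illustration only; the root is left unresolved on purpose.
TOY = "((HUMAN,MOUSE)Mammalia,(DROME,CAEEL)Ecdysozoa,YEAST)root;"
tree = parse_newick(TOY)
print(leaves(tree))           # ['HUMAN', 'MOUSE', 'DROME', 'CAEEL', 'YEAST']
print(multifurcations(tree))  # ['root']
```

Listing the multi-furcating nodes is one simple way to see where a tree leaves relationships unresolved, which matters when the tree is used as a benchmarking reference.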

Ontology-based annotation of trees

Resources
Quest for Orthologs: http://questfororthologs.org/
Alan Wilter Sousa da Silva: http://www.ebi.ac.uk/reference_proteomes
Erik Sonnhammer & Matthieu Muffato: http://seqxml.org
Adrian Altenhoff & Christophe Dessimoz: http://orthology.benchmarkservice.org
Brigitte Boeckmann: wiki.isb-sib.ch/swisstree

Our questions
Are model differences among the different Trees of Life (ToLs) documented and, if
yes, is this info made public or can it be made available to QfO?
Which tree format is best to use for data comparison and update?
Are confidence values for internal nodes in the ToLs made
available?
Are ToLs available for download (formats, update frequencies,
release identifiers, ...)?
Are species identifiers of ToL projects in sync with the NCBI TaxIds?
Is there a point of contact and communication? How can we
productively engage with the ToL & taxonomy community on a
cooperative effort?

Special thanks to
Brigitte Boeckmann (SWISS INSTITUTE OF BIOINFORMATICS)
Vincent Daubin (UNIVERSITÉ DE LYON)
Kristoffer Forslund (EUROPEAN MOLECULAR BIOLOGY LABORATORY)
Toni Gabaldon (CENTRE DE REGULACIÓ GENOMICS, BARCELONA)
Matthieu Muffato (EUROPEAN BIOINFORMATICS INSTITUTE)
Fabian Schreiber (SANGER INSTITUTE)
