Mungall keynote-biocurator-2017

Chris Mungall
Computer Research Scientist at Lawrence Berkeley National Laboratory
Biocuration, Stanford, 2017
2017: AN ONTOLOGY
BIOCURATION ODYSSEY
chrismungall
Outline
 My path towards biocuration
 Ontologies past and future
 Some final thoughts on biocuration
Edinburgh,
Scotland
Which path to AI? (circa 1990s)
Artificial Intelligence: Narrow AI, Broad AI
Knowledge-Based: logic, encoding, ‘knowing that’, cognitively inspired
Knowledge-Free: statistics, learning, ‘knowing how’, biologically inspired
- All cats are mammals
- All dogs are mammals
- Mammals have fur
- Dogs like balls
- Fido is a dog
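The knowledge-based approach on this slide can be sketched as a tiny deductive reasoner. This is a minimal illustration only: the dictionary encoding and the function names `classes_of` and `properties_of` are assumptions made for the example, not anything shown in the talk.

```python
# Facts from the slide, encoded in illustrative (assumed) data structures.
SUBCLASS_OF = {"cat": "mammal", "dog": "mammal"}      # All cats/dogs are mammals
CLASS_PROPERTIES = {
    "mammal": {"has fur"},                            # Mammals have fur
    "dog": {"likes balls"},                           # Dogs like balls
}
INSTANCE_OF = {"Fido": "dog"}                         # Fido is a dog


def classes_of(individual):
    """Return the asserted class followed by every superclass it implies."""
    result = []
    current = INSTANCE_OF.get(individual)
    while current is not None:
        result.append(current)
        current = SUBCLASS_OF.get(current)
    return result


def properties_of(individual):
    """Collect the properties inherited from every class of the individual."""
    props = set()
    for cls in classes_of(individual):
        props |= CLASS_PROPERTIES.get(cls, set())
    return props


if __name__ == "__main__":
    print(classes_of("Fido"))
    print(sorted(properties_of("Fido")))
```

Nothing here is learned from data: every conclusion about Fido follows from explicitly encoded statements, which is the ‘knowing that’ side of the slide.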
???
Answer: CAT
DOES NOT COMPUTE
From sequence to genome annotation
• Analysis pipeline
• Curation tools
• Annotation database
Chado
Mungall, C. J., Emmert, D. B., & FlyBase Consortium, (2007). A Chado case study: an ontology-based
modular schema for representing genome-associated biological information. Bioinformatics, 23(13),
i337-346. http://doi.org/10.1093/bioinformatics/btm189
Generalized community tools
• Analysis pipeline
• Curation tools
• Annotation database
• Functional annotation
Genomes to function annotation?
What does it do?
Gene Ontology: tool for the unification of biology (2000)
• Organize generalized biological knowledge as a graph
• Attach genes to nodes
• Propagate across species
• Create gene lists
• Interpret high throughput data
Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., … Sherlock, G. (2000). Gene ontology: tool for the
unification of biology. The Gene Ontology Consortium. Nat Genet, 25(1), 25–29. http://doi.org/10.1038/75556
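The first bullets on this slide (knowledge as a graph, genes attached to nodes, gene lists) can be sketched as follows. The term fragment is simplified and the gene names are hypothetical. Propagation here runs up the graph within one species; propagation across species, which the slide also mentions, is not modeled.

```python
# Toy, simplified fragment: child term -> parent terms (is_a only).
PARENTS = {
    "visual perception": ["sensory perception"],
    "sensory perception": ["nervous system process"],
    "nervous system process": ["biological_process"],
    "biological_process": [],
}

# Hypothetical genes attached directly to nodes of the graph.
DIRECT_ANNOTATIONS = {
    "geneA": {"visual perception"},
    "geneB": {"sensory perception"},
}


def ancestors(term):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in PARENTS.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def gene_list(term):
    """Genes annotated to the term itself or to any of its descendants."""
    return sorted(
        gene
        for gene, terms in DIRECT_ANNOTATIONS.items()
        if any(t == term or term in ancestors(t) for t in terms)
    )


if __name__ == "__main__":
    print(gene_list("sensory perception"))
```

A gene list built this way for a general term includes genes annotated to more specific terms, which is what makes the graph useful for interpreting high-throughput data.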
Ontologies as force amplifiers for data
domain knowledge, data, biocuration, experiment
Don’t worship the monolith
PROBLEM: GO and other ontologies were becoming monolithic
- lots of implicit overlap with other ontologies, latent structure
Open Biological Ontologies (OBO)
http://obofoundry.org
1. Well-integrated modular ontologies
2. Provide technical and sociotechnological framework for cooperation
3. Provide tools, best practices and infrastructure for forging new ontologies
4. Allow us to curate all of the things
@obofoundry
OBO Library PURLs
• PURL: Persistent URL
• Consistent, predictable, stable and versioned URLs for ontology objects
• Can be shortened as compact URIs (CURIEs), e.g. GO:0008150
• Can be registered and viewed on OBO site
• http://obofoundry.org
• Ontology PURLs
• Main ontology, subsets
• versionIRIs
• Ontology term PURLs
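The relationship between a CURIE and a term PURL can be sketched as a pair of string conversions. The base `http://purl.obolibrary.org/obo/` is the usual OBO Library prefix; the function names are assumptions for this example, and real tooling would normally consult a registered prefix map rather than hard-code one base.

```python
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_purl(curie):
    """Expand a CURIE such as GO:0008150 into an OBO term PURL."""
    prefix, local_id = curie.split(":", 1)
    return f"{OBO_PURL_BASE}{prefix}_{local_id}"


def purl_to_curie(purl):
    """Contract an OBO term PURL back into its CURIE form."""
    if not purl.startswith(OBO_PURL_BASE):
        raise ValueError(f"not an OBO PURL: {purl}")
    # Split on the last underscore so the local identifier stays intact.
    prefix, local_id = purl[len(OBO_PURL_BASE):].rsplit("_", 1)
    return f"{prefix}:{local_id}"


if __name__ == "__main__":
    print(curie_to_purl("GO:0008150"))
```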
One ontology to bind them: the Relation Ontology (RO)
[Diagram: compound eye, ommatidium, sense organ, eye disc, outer photoreceptor cell, and lamina monopolar neuron L3, linked by is_a, part_of, develops from, capable of, and synapsed by; GO term shown: detection of light stimulus involved in visual perception]
http://obofoundry.org/ontology/ro.html
Contributions to and uses of RO
>500 relations
Neurocellular (virtualflybrain.org)
Osumi-Sutherland, D. (2012). doi:10.1093/bioinformatics/bts113
• Has soma location
• Has synaptic terminal in
• Upstream in neural circuit with
• …
Biotic interaction (globalbioticinteractions.org)
• Eats
• Epiphyte of
• Parasite of
• Kleptoparasitizes
• Hyperparasitizes
Gene, drug, phenotype
• Is model of
• Has phenotype
• Molecularly controls
• Allosteric inhibitor of
• Causes or contributes to condition
• ...
David Osumi-Sutherland, Anne Thessen, Matt Brush, Greg Stupp
What happens when the pieces don’t fit together?
Making the pieces fit together: GO and CHEBI
Hill, D. P., Adams, N., Bada, M., Batchelor, C., Berardini, T. Z., Dietze, H., … Lomax, J. (2013). Dovetailing biology and
chemistry: integrating the Gene Ontology with the ChEBI chemical ontology. BMC Genomics, 14(1), 513.
http://doi.org/10.1186/1471-2164-14-513
GO CHEBI
• Some relationships didn’t make sense
• E.g. nucleotide isa carbohydrate
• Acids and conjugate bases
Harold Drabkin
David Hill
Jane Lomax
Tanya Berardini
Janna Hastings
• Fixed many is-as
• E.g. nucleotide isa carbohydrate
• Acids and conjugate bases
+ OWL reasoning
GO CHEBI
+ Design Patterns
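One way to picture design patterns combined with reasoning: if each GO term of the form "X metabolic process" is defined in terms of a chemical X, the GO hierarchy can be derived from the chemical hierarchy instead of being maintained by hand. The sketch below is a hedged illustration of that idea. The chemical fragment is simplified, the label template is an assumption, and a real pipeline would use an OWL reasoner rather than string templates.

```python
# Simplified, illustrative chemical hierarchy (child -> parent).
CHEBI_IS_A = {
    "glucose": "hexose",
    "hexose": "monosaccharide",
    "monosaccharide": "carbohydrate",
}

# Assumed label template standing in for a logical design pattern.
PATTERN = "{chemical} metabolic process"


def chebi_ancestors(chemical):
    """Walk up the chemical hierarchy, nearest ancestor first."""
    result = []
    while chemical in CHEBI_IS_A:
        chemical = CHEBI_IS_A[chemical]
        result.append(chemical)
    return result


def inferred_go_parents(chemical):
    """Derive superclasses of '<chemical> metabolic process' from the chemistry."""
    return [PATTERN.format(chemical=a) for a in chebi_ancestors(chemical)]


if __name__ == "__main__":
    for parent in inferred_go_parents("glucose"):
        print("glucose metabolic process is_a", parent)
```

Under this arrangement, an incorrect link in the chemical hierarchy (such as the nucleotide example on the slide) shows up as an odd inferred GO relationship, so it is caught rather than silently duplicated.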
Challenges of multi-species anatomy and phenotypes
[Diagram: how “lung” is placed in Mammalian Phenotype, Mouse Anatomy, FMA, and human development ontologies. Anatomy terms shown: lobular organ, parenchymatous organ, solid organ, pleural sac, thoracic cavity organ, thoracic cavity, lung alveolus, organ system, respiratory system, lower respiratory tract, alveolar sac, pulmonary acinus, lung bud, respiratory primordium, pharyngeal region. Phenotype terms shown: abnormal respiratory system morphology, abnormal lung morphology, abnormal pulmonary acinus morphology, abnormal pulmonary alveolus morphology. Edge types: develops_from, part_of, is_a (SubClassOf), surrounded_by.]
The perils of mappings
Class A | Class B | Mapped? | Useful?
FMA: extensor retinaculum of wrist | MouseAnatomy: retina | Yes | No
Plant Ontology: Pith | MouseAnatomy: medulla | Yes | No
Fly Anat: femur | MouseAnatomy: femur | Yes | No*
ZfishAnat: hypophysis | MouseAnatomy: pituitary | No | Yes
TAO: fossa | AdverseReactions: depression | Yes | No
FMA: colon | GAZ: Colón, Panama | Yes | No
Quality: male | Chebi: maleate 2(-) | Yes | No
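A minimal sketch of why purely lexical mapping produces rows like these. The matcher below is a deliberately naive assumption (case-folding plus accent stripping), not the method that generated the table, and the labels are shortened for the example.

```python
import unicodedata


def normalize(label):
    """Lowercase a label and strip accents, as a naive matcher might."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()


def lexically_mapped(label_a, label_b):
    """Declare a mapping whenever the normalized labels are identical."""
    return normalize(label_a) == normalize(label_b)


if __name__ == "__main__":
    # Mapped but not useful: an organ versus a city.
    print(lexically_mapped("colon", "Colón"))
    # Useful but not mapped: synonyms with no string overlap.
    print(lexically_mapped("hypophysis", "pituitary"))
```

String identity says nothing about whether two classes mean the same thing, which is the gap that curated bridging ontologies are meant to fill.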
Uberon
http://uberon.org
• Initial Phase
• Bottom-up
• Create groupings of terms
• Light curation
• Next Phase
• Top down
• 14k classes
• Design Patterns
• Periodic alignment and feeding back to curators
Uberon for gene expression curation
http://bgee.org/
dinosaurs, sponges, comb jellies and cephalopods, oh my
Thacker, R. W., et al. (2014). The Porifera Ontology (PORO): enhancing sponge systematics with an anatomy ontology. Journal of Biomedical Semantics, 5(1), 39. http://doi.org/10.1186/2041-1480-5-39
Graphic courtesy Nizar Ibrahim, Paul Sereno, et al.
Phenotype RCN
Wasila Dahdul
Bob Thacker
obofoundry.org/ontology/ceph.html
obofoundry.org/ontology/cteno.html
Phenotype and Disease Ontologies
• Problem: Many ontologies, vocabularies and condition/phenotype lists:
• HP, MP, WBPhenotype, FBcv, TO, VT, FYPO, APO, SNOMED
• OMIM, Orphanet, DO, NCIT, MESH, ICD, UMLS, MEDGEN …
• ZFIN, Phenoscape: EQ
Köhler, S. (2013). F1000Research, 1–12. http://doi.org/10.3410/f1000research.2-
Standardized Design Patterns + OWL Reasoning
Bayesian OWL Ontology Merging (BOOM)
Mungall, C. J., et al. (2016). kBOOM. bioRxiv 10.1101/048843
Monarch merged ‘upheno’ ontology
MonDO
Elvira Mitraka, Sue Bello, Nicole Vasilevsky
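The general idea behind probabilistic ontology merging can be sketched as a search for the most probable set of mapping interpretations that is still logically consistent. Everything below is a toy under stated assumptions: the class identifiers and prior probabilities are made up, only two interpretations are considered, and the single consistency rule (no two classes of one ontology may become equivalent to the same class of the other) stands in for full OWL reasoning. It is not the kBOOM implementation.

```python
from itertools import product

# Hypothetical candidate mappings with made-up prior probabilities.
CANDIDATES = {
    ("A:1", "B:1"): {"equivalent": 0.8, "unrelated": 0.2},
    ("A:2", "B:1"): {"equivalent": 0.6, "unrelated": 0.4},
}


def is_consistent(assignment):
    """Reject assignments that merge two classes from the same ontology."""
    left_to_right, right_to_left = {}, {}
    for (left, right), relation in assignment.items():
        if relation != "equivalent":
            continue
        if right_to_left.setdefault(right, left) != left:
            return False
        if left_to_right.setdefault(left, right) != right:
            return False
    return True


def most_probable(candidates):
    """Enumerate all interpretations and keep the likeliest consistent one."""
    pairs = list(candidates)
    best, best_p = None, 0.0
    for choice in product(*(candidates[pair] for pair in pairs)):
        assignment = dict(zip(pairs, choice))
        if not is_consistent(assignment):
            continue
        p = 1.0
        for pair, relation in assignment.items():
            p *= candidates[pair][relation]
        if p > best_p:
            best, best_p = assignment, p
    return best, best_p


if __name__ == "__main__":
    print(most_probable(CANDIDATES))
```

Exhaustive enumeration only works for a handful of candidates; it is used here purely to make the objective visible.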
[Diagram: variant prioritization pipeline. Whole exome: remove off-target and common variants; variant score based on allele frequency and pathological impact; Mendelian filters. Whole or partial phenome (HPO): OwlSim gene phenotype scores, drawing on the Monarch integrated KB (curated phenotype data; curated orthology, interaction, .. data; upheno). Both feed a combined score. +GENOMISER]
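The phenotype-matching step can be illustrated with one simple semantic similarity measure: the Jaccard overlap of two terms' ancestor closures. This is a sketch under assumptions. The hierarchy below is a simplified fragment built from term names that appear elsewhere in the deck, and OwlSim's actual scoring is not reproduced here.

```python
# Simplified, illustrative phenotype hierarchy (child -> parents).
PARENTS = {
    "abnormal pulmonary alveolus morphology": ["abnormal lung morphology"],
    "abnormal pulmonary acinus morphology": ["abnormal lung morphology"],
    "abnormal lung morphology": ["abnormal respiratory system morphology"],
    "abnormal respiratory system morphology": [],
}


def closure(term):
    """The term itself plus every ancestor reachable through PARENTS."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def jaccard(term_a, term_b):
    """Shared ancestors divided by all ancestors of either term."""
    a, b = closure(term_a), closure(term_b)
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    print(jaccard(
        "abnormal pulmonary alveolus morphology",
        "abnormal pulmonary acinus morphology",
    ))
```

Two phenotypes that are not identical still score above zero when they share ancestors, which is what lets a patient's profile match a model-organism gene with a related but differently worded phenotype.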
Environments
[Diagram of environment categories: animal-associated, soil, marine, plant-associated, sediment, aquatic, hot spring, food, cultured, freshwater, hydrothermal vent, terrestrial, sludge, waste water, extreme, organism-associated, air, microbial mat]
lite
http://obofoundry.org/ontology/envo.html
Ramona Walls
Pier Luigi Buttigieg
Environments: generalizing beyond microbes
https://github.com/cmungall/environmental-conditions
Biological knowledge and curation QC
Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530
Annotation errors can arise for different reasons
- machine error (inappropriate propagation)
- human error
Previous versions of the GO had various unusual annotations:
• Genes in chicken responsible for lactation
• Genes in slime mold responsible for dorsal fin development
Solution: Taxon constraints
Encode taxon constraints as OWL rules in the ontology
• only in taxon
• never in taxon
Can be propagated across ontologies
E.g.
• dorsal fin only in vertebrata (uberon)
• dorsal fin never in tetrapod (uberon)
• lactation only in mammals (go)
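A minimal sketch of how such constraints can flag suspect annotations. The taxonomy is a toy, the gene identifiers are hypothetical, and the dorsal fin constraints are attached directly to the process term for brevity; in the approach described on the slide they are stated on the Uberon anatomy class and propagated across ontologies by OWL reasoning.

```python
# Toy taxonomy (child -> parent); groupings are simplified.
TAXON_PARENT = {
    "chicken": "tetrapod",
    "mouse": "mammal",
    "mammal": "tetrapod",
    "tetrapod": "vertebrata",
    "zebrafish": "vertebrata",
    "vertebrata": "cellular organisms",
    "slime mold": "cellular organisms",
}

# Constraints from the slide, keyed by annotated term (an assumed shortcut).
ONLY_IN = {"lactation": "mammal", "dorsal fin development": "vertebrata"}
NEVER_IN = {"dorsal fin development": "tetrapod"}


def lineage(taxon):
    """The taxon together with all of its ancestors."""
    result = {taxon}
    while taxon in TAXON_PARENT:
        taxon = TAXON_PARENT[taxon]
        result.add(taxon)
    return result


def violations(annotations):
    """Return (gene, reason) pairs for annotations that break a constraint."""
    problems = []
    for gene, taxon, term in annotations:
        taxa = lineage(taxon)
        if term in ONLY_IN and ONLY_IN[term] not in taxa:
            problems.append((gene, f"{term} only in {ONLY_IN[term]}"))
        if term in NEVER_IN and NEVER_IN[term] in taxa:
            problems.append((gene, f"{term} never in {NEVER_IN[term]}"))
    return problems


if __name__ == "__main__":
    sample = [
        ("gene1", "chicken", "lactation"),
        ("gene2", "slime mold", "dorsal fin development"),
        ("gene3", "mouse", "lactation"),
    ]
    for gene, reason in violations(sample):
        print(gene, "violates:", reason)
```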
Hi, ROBOT
• How can we package things up and make them easier to use in ontology/curation QC pipelines?
• Enter ROBOT
• Design Patterns
• Continuous Integration
Next steps for ontology annotation
• Existing ontology annotation model:
• Bag of terms
[Diagram: one gene linked to many separate terms]
All GO annotations for (human) beta-catenin: (Molecular Function branch)
Next generation ontology annotation in Noctua
http://noctua.berkeleybop.org/
Generalization to phenotypes
http://noctua.berkeleybop.org/
Intelligent Concept Assistant
https://github.com/INCATools
Take homes
• Knowledge is a force multiplier
• Applies to all biocuration work
• But pinpoints need for QC
• Design for generality
• But acknowledge difficulties
• Better support required
• Biological knowledge is multifaceted and nuanced
• Computer scientists have a tendency towards hubris
• Biology is our nemesis
• Collaborative approach is vital
http://hoodline.com/2016/12/caught-on-camera-self-driving-uber-runs-red-light-in-soma
Curators are…
Acknowledgments
• Monarch Initiative: Jeremy Nguyen-Xuan, Kent Shefchek, Matt Brush, Tom Conlin, Lilly Winfree, Eric Douglass, Jules Jacobsen, Craig McLachan, Suzanna Lewis, Julie McMurry, Dan Keith, Nicole Washington, Nicole Vasilevsky, Nathan Dunn, Harry Hochheiser, William Bone, Neal Boerkoel, Damian Smedley, Tudor Groza, Sebastian Koehler, Melissa Haendel, Peter Robinson
• GO: Michael Ashburner, David Hill, Paola Roncaglia, David Osumi-Sutherland, Tanya Berardini, Jen Deegan, Jane Lomax, Karen Christie, Pascale Gaudet, Monica Munoz-Torres, Seth Carbon, Eric Douglass, Heiko Dietze, Ruth Lovering, Rachael Huntley, Midori Harris, Harold Drabkin, Kimberley Van Auken, Marc Feuermann, Petra Fey, Jim Hu, Debbie Siegel, Helen Parkinson, Tony Sawford, Stacia Engel, Sylvain Poux, Melanie Courtot, Becky Foulger, Emily Dimmer, Huaiyu Mi, Judy Blake, Paul Sternberg, Mike Cherry, Suzi Lewis, Paul Thomas
• OBO: Michael Ashburner, Suzanna Lewis, Barry Smith, Richard Scheuermann, Chris Stoeckert, Jie Zheng, Melanie Courtot, Simon Jupp, Ramona Walls, Darren Natale, Melissa Haendel, Lynn Schriml, Alan Ruttenberg, Seth Carbon, James Overton, Bjoern Peters, + all contributors
• Planteome: Pankaj Jaiswal, Dennis Stevenson, Laurel Cooper, Austin Meier, Marie Angelique Laporte, Elizabeth Arnaud
• Uberon: David Osumi-Sutherland, Paula Mabee, Jim Balhoff, Wasila Dahdul, Alex Dececchi, Nizar Ibrahim, Paul Sereno, Frederic Bastian, Ann Niknejad, Marc Robinson-Rechavi, David Blackburn, Terry Hayamizu, Yvonne Bradford, Ceri Van Slyke, Alex Diehl, Terry Meehan, Robert Druzinsky, Melissa Haendel
• ALL OF THE BIOCURATORS
Funding: NIH ORIP R24OD011883; NHGRI U41HG 002273; NSF DEB-0956049; DOE DE-AC02-05CH11231; NSF IOS 1340112; NSF DBI 1062404
Give me a place to stand and with a lever I will move the whole world
Uncovering latent meaning in ontologies
Mungall, C. J. (2004). Obol: Integrating Language and Meaning in Bio-Ontologies. Comparative and
Functional Genomics, 5(7), 509–520.
regulation of Notch signaling pathway involved in heart induction
(labels over the term: relation, relation, anatomic, pathway)
OWL EXPRESSION HERE
≡ ∃regulates.(NSP ⊓ ∃part-of.HI), where NSP = Notch signaling pathway and HI = heart induction
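The step from a term label to a logical expression can be sketched with a single pattern. The regular expression below is an assumption made for illustration and handles only labels of the form "regulation of X involved in Y"; it is not how Obol itself parses labels.

```python
import re

# Assumed pattern for labels shaped like "regulation of X involved in Y".
LABEL_PATTERN = re.compile(
    r"^regulation of (?P<process>.+?) involved in (?P<whole>.+)$"
)


def parse_label(label):
    """Split a label into (regulated process, containing process), or None."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        return None
    return match.group("process"), match.group("whole")


def to_expression(label):
    """Render the parsed label as a description-logic style expression."""
    parsed = parse_label(label)
    if parsed is None:
        return None
    process, whole = parsed
    return f"∃regulates.({process} ⊓ ∃part-of.{whole})"


if __name__ == "__main__":
    label = "regulation of Notch signaling pathway involved in heart induction"
    print(to_expression(label))
```

Once the label is decomposed this way, each piece can be linked to its own class, and the term's place in the hierarchy can be checked or inferred instead of asserted by hand.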
Open Biological Ontologies (OBO)
• To provide modular building blocks
• Not just functional annotation of genes and gene products
• Framework, tools and infrastructure for cooperation and harmonization
Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., … Lewis, S. (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol, 25(11), 1251–1255.
[Diagram of OBO domains: Function (GO), Anatomy, Environment, Chemicals (CHEBI), Phenotype and Disease, Genes (SO, GENO), …; example relation: occurs in]
http://obofoundry.org
OBO: Modularity
[Diagram of modules: Function (GO), Gross Anatomy, Chemicals (CHEBI), Abnormal Phenotype and Disease, Sequence, Cell Types; edges labeled “imported into”]
Relations: the glue that holds it together
• RO 2005 paper
• 10 relations
• Current RO
• >500 relations
• Molecular biology
• Neurobiology
• Biotic interactions
• …
• Many rules on how relations compose together
• Working with Wikidata
http://obofoundry.org/ontology/ro.html
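Composition rules of this kind can be sketched as a small chain table applied until no new facts appear. The two rules and the example facts below are illustrative assumptions that reuse term names from the RO diagram; they are not copied from the RO release, which expresses such rules as OWL property chains.

```python
# Illustrative chain rules: (first relation, second relation) -> inferred relation.
CHAINS = {
    ("part_of", "part_of"): "part_of",
    ("capable_of", "part_of"): "capable_of_part_of",
}

# Assumed example facts using terms from the diagram.
FACTS = [
    ("outer photoreceptor cell", "part_of", "ommatidium"),
    ("ommatidium", "part_of", "compound eye"),
    ("outer photoreceptor cell", "capable_of", "detection of light stimulus"),
    ("detection of light stimulus", "part_of", "visual perception"),
]


def infer(triples):
    """Apply the chain rules repeatedly until a fixed point is reached."""
    known = set(triples)
    changed = True
    while changed:
        changed = False
        for subj, rel1, mid in list(known):
            for mid2, rel2, obj in list(known):
                if mid != mid2:
                    continue
                inferred = CHAINS.get((rel1, rel2))
                if inferred and (subj, inferred, obj) not in known:
                    known.add((subj, inferred, obj))
                    changed = True
    return known


if __name__ == "__main__":
    for triple in sorted(infer(FACTS) - set(FACTS)):
        print(triple)
```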
Beyond the GO
Functional Genomics: Gene function | Gene Ontology | Links genes to what they do
Transcriptomics: Gene expression | Anatomy and Stage Ontology | Links genes to where they are expressed
Phenomics: Effects of gene mutations | Phenotype and Trait Ontology | Links genes to what happens when they are disrupted or when they vary
Also shown: Disease Ontology, Environment Ontology
[Diagram: Uberon classes (anatomical structure, organ, organ part, respiration organ, lung, lung bud, lung primordium, respiratory primordium, foregut, endoderm, endoderm of foregut, alveolus, alveolus of lung, alveolar sac, pulmonary acinus, swim bladder) linked to species-specific classes (FMA:lung, MA:lung, MA:lung alveolus, FMA:pulmonary alveolus, EHDAA:lung bud), to GO: respiratory gaseous exchange, and to taxa (NCBITaxon: Mammalia, NCBITaxon: Actinopterygii). Edge types: is_a (taxon equivalent), is_a (SubClassOf), part_of, develops_from, capable_of, only_in_taxon.]
http://uberon.org
Mungall, C. J., Torniai, C., Gkoutos, G. V, Lewis, S. E., & Haendel, M. A. (2012). Uberon, an integrative multi-species anatomy
ontology. Genome Biology, 13(1), R5. doi:10.1186/gb-2012-13-1-r5
Uberon bridges anatomy ontologies
Uberon for comparative Gene Expression
http://bgee.org/
Uberon Core
Extensions to other animals…
Thacker, R. W., Díaz, M. C., Kerner, A., Vignes-Lebbe, R., Segerdell, E., Haendel, M. A., & Mungall, C. J. (2014). The Porifera Ontology (PORO): enhancing sponge systematics with an anatomy ontology. Journal of Biomedical Semantics, 5(1), 39
Non-model/human extension
Porifera Ontology
Ctenophore Ontology
Cephalopod Ontology
Arthropod Ontology
http://phenotypercn.org
https://github.com/obophenotype/cephalopod-ontology
https://github.com/obophenotype/ctenophore-ontology
https://github.com/obophenotype/porifera-ontology
https://github.com/obophenotype/uberon
http://monarchinitiative.org/analyze/phenotypes/
PhenoGrid: visualizing phenotype matches
The Undiagnosed Disease Patient (UDP) Use Case
Clinical Phenotyping (HPO/phenotips)
Exome Sequencing
Causative Variant?
https://www.sanger.ac.uk/resources/databases/exomiser/query/exomiser2
Robinson, P., et al. (2013). Improved exome prioritization of disease genes through cross species phenotype comparison. Genome Research. doi:10.1101/gr.160325.113
TODO DEPRECATED The need for modularization
• Growing pains of GO
• Terms were added as-needed for curation
• Hard to maintain
• Scope: Encompassing all of biology is hard
• Biochemistry, cell biology, plants, animal development and physiology, …
• We needed to modularize
• Meanwhile
• Other ontologies in the ‘style’ of GO were popping up, for annotating other kinds of data
• Challenge: how were we going to coordinate this?
Biological knowledge and curation QC
• Taxon constraints
• CONCRETE EXAMPLE HERE
• Intersection rules
• (see Seth’s talk)
Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and
ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530
Knowledge-Based
• ice cream derived-from dairy
• Ice cream is yummy
Uberon/CL applications and users
• Ontology Modularization
• GO
• CLO
• Pheno Ontologies (EQ definitions)
• ENVO
• Transcriptomics and genome annotation
• ENCODE
• FANTOM5
• LINCS
• BgeeDb
• Phenomics
• Human and Mammalian Phenotype Ontologies
• Phenotype comparison algorithms
• Evolutionary Phenotypes: Phenoscape
http://uberon.github.io/about/adopters.html
The path to AI, 1990s
• Two goals
• Broad AI
• Narrow AI
• What path to get there?
• Knowledge-Based
• Explicit encoding of knowledge about the world
• Analytic or deductive reasoning
• Mathematical logic vs cognitively inspired (neats vs scruffs)
• ‘Knowing that’
• Knowledge-Free
• Machine Learning, Neural Networks
• Statistics
• Pattern Recognition
• Biologically inspired
• ‘Knowing how’
Opposites
Koehler et al, bioRxiv https://doi.org/10.1101/108977
Ethical issues associated with Genetically Modified Crops and Genetically Mod...Ethical issues associated with Genetically Modified Crops and Genetically Mod...
Ethical issues associated with Genetically Modified Crops and Genetically Mod...
PunithKumars618 views
별헤는 사람들 2023년 12월호 전명원 교수 자료 by sciencepeople
별헤는 사람들 2023년 12월호 전명원 교수 자료별헤는 사람들 2023년 12월호 전명원 교수 자료
별헤는 사람들 2023년 12월호 전명원 교수 자료
sciencepeople7 views
PRINCIPLES-OF ASSESSMENT by rbalmagro
PRINCIPLES-OF ASSESSMENTPRINCIPLES-OF ASSESSMENT
PRINCIPLES-OF ASSESSMENT
rbalmagro11 views
Open Access Publishing in Astrophysics by Peter Coles
Open Access Publishing in AstrophysicsOpen Access Publishing in Astrophysics
Open Access Publishing in Astrophysics
Peter Coles543 views
himalay baruah acid fast staining.pptx by HimalayBaruah
himalay baruah acid fast staining.pptxhimalay baruah acid fast staining.pptx
himalay baruah acid fast staining.pptx
HimalayBaruah5 views
"How can I develop my learning path in bioinformatics? by Bioinformy
"How can I develop my learning path in bioinformatics?"How can I develop my learning path in bioinformatics?
"How can I develop my learning path in bioinformatics?
Bioinformy18 views

Mungall keynote-biocurator-2017

  • 1. Chris Mungall Biocuration, Stanford, 2017 2017: AN ONTOLOGY BIOCURATION ODYSSEY chrismungall
  • 2. Outline  My path towards biocuration  Ontologies past and future  Some final thoughts on biocuration
  • 7. Which path to AI? (circa 1990s) Knowledge-Based Knowledge-Free statistics logic learning encoding Artificial Intelligence Narrow AI Broad AI ‘knowing that’ ‘knowing how’ Biologically inspired Cognitively inspired
  • 8. - All cats are mammals - All dogs are mammals
  • 9. - All cats are mammals - All dogs are mammals - Mammals have fur - Dogs like balls - Fido is a dog
  • 11. ???
  • 14. • Analysis pipeline • Curation tools • Annotation database From sequence to genome annotation
  • 15. • Analysis pipeline • Curation tools • Annotation database Chado Mungall, C. J., Emmert, D. B., & FlyBase Consortium (2007). A Chado case study: an ontology-based modular schema for representing genome-associated biological information. Bioinformatics, 23(13), i337-346. http://doi.org/10.1093/bioinformatics/btm189 Generalized community tools
  • 16. • Analysis pipeline • Curation tools • Annotation database • Functional annotation Genomes to function annotation? What does it do?
  • 17. Gene Ontology: tool for the unification of biology (2000)  Organize generalized biological knowledge as a graph  Attach genes to nodes  Propagate across species  Create gene lists  Interpret high throughput data Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., … Sherlock, G. (2000). Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet, 25(1), 25–29. http://doi.org/10.1038/75556
  • 18. Ontologies as force amplifiers for data domain knowledge data biocuration experimen
  • 19. Don’t worship the monolith PROBLEM: GO and other ontologies were becoming monolithic - lots of implicit overlap with other ontologies, latent structure
  • 20. Open Biological Ontologies (OBO) http://obofoundry.org 1. Well-integrated modular ontologies 2. Provide technical and sociotechnological framework for cooperation 3. Provide tools, best practices and infrastructure for forging new ontologies 4. Allow us to curate all of the things @obofoundry
  • 21. OBO Library PURLs  PURL: Persistent URL  Consistent, predictable, stable and versioned URLs for ontology objects  Can be shortened as compact URIs (CURIEs), e.g. GO:0008150  Can be registered and viewed on OBO site  http://obofoundry.org  Ontology purls  Main ontology, subsets  versionIRIs  Ontology term purls
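The relationship between a CURIE such as GO:0008150 and its term PURL can be sketched as below. This is a minimal illustration that assumes the common `http://purl.obolibrary.org/obo/PREFIX_ID` pattern for OBO term PURLs; the function names are hypothetical and not part of any OBO tool.

```python
# Base of OBO Library term PURLs (assumed pattern: <base><PREFIX>_<local id>).
OBO_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_purl(curie: str) -> str:
    """Expand a CURIE like 'GO:0008150' into an OBO term PURL."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    return f"{OBO_BASE}{prefix}_{local_id}"


def purl_to_curie(purl: str) -> str:
    """Contract an OBO term PURL back into its CURIE form."""
    if not purl.startswith(OBO_BASE):
        raise ValueError(f"not an OBO term PURL: {purl!r}")
    prefix, sep, local_id = purl[len(OBO_BASE):].partition("_")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not an OBO term PURL: {purl!r}")
    return f"{prefix}:{local_id}"


if __name__ == "__main__":
    print(curie_to_purl("GO:0008150"))
```

Keeping expansion and contraction as a matched pair makes it easy to check that identifiers survive a round trip unchanged.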
  • 22. compound eye ommatidium sense organ eye disc is_a part_of develops from detection of light stimulus involved in visual perception (GO) One ontology to bind them: the Relation Ontology (RO) capable of outer photoreceptor cell part_of http://obofoundry.org/ontology/ro.html lamina monopolar neuron L3 synapsed by
  • 23. Contributions to and uses of RO virtualflybrain.org globalbioticinteractions.org Osumi-Sutherland, D. (2012). doi:10.1093/bioinformatics/bts113  Has soma location  Has synaptic terminal in  Upstream in neural circuit with  …  Eats  Epiphyte of  Parasite of  Kleptoparasitizes  Hyperparasitizes Neurocellular Biotic interaction  Is model of  Has phenotype  Molecularly controls  Allosteric inhibitor of  Causes or contributes to condition  ... David Osumi-Sutherland Anne Thessen Matt Brush Greg Stupp Gene, drug, phenotype >500 relations
  • 25. What happens when the pieces don’t fit together?
  • 26. Making the pieces fit together: GO and CHEBI Hill, D. P., Adams, N., Bada, M., Batchelor, C., Berardini, T. Z., Dietze, H., … Lomax, J. (2013). Dovetailing biology and chemistry: integrating the Gene Ontology with the ChEBI chemical ontology. BMC Genomics, 14(1), 513. http://doi.org/10.1186/1471-2164-14-513 GO CHEBI • Some relationships didn’t make sense • E.g. nucleotide isa carbohydrate • Acids  conjugate bases Harold Drabkin David Hill Jane Lomax Tanya Berardini Janna Hastings
  • 27. Making the pieces fit together: GO and CHEBI Hill, D. P., Adams, N., Bada, M., Batchelor, C., Berardini, T. Z., Dietze, H., … Lomax, J. (2013). Dovetailing biology and chemistry: integrating the Gene Ontology with the ChEBI chemical ontology. BMC Genomics, 14(1), 513. http://doi.org/10.1186/1471-2164-14-513 GO CHEBI • Fixed many is-as • E.g. nucleotide isa carbohydrate • Acids  conjugate bases + OWL reasoning Harold Drabkin David Hill Jane Lomax Tanya Berardini Janna Hastings GO CHEBI + Design Patterns
  • 28. lung lung lobular organ parenchymatous organ solid organ pleural sac thoracic cavity organ thoracic cavity abnormal lung morphology abnormal respiratory system morphology Mammalian Phenotype Mouse Anatomy FMA abnormal pulmonary acinus morphology abnormal pulmonary alveolus morphology lung alveolus organ system respiratory system Lower respiratory tract alveolar sac pulmonary acinus organ system respiratory system Human development lung lung bud respiratory primordium pharyngeal region Challenges of multi-species anatomy and phenotypes develops_from part_of is_a (SubClassOf) surrounded_by
  • 29. The perils of mappings
      Class A | Class B | Mapped? | Useful?
      FMA: extensor retinaculum of wrist | MouseAnatomy: retina | Yes | No
      Plant Ontology: Pith | MouseAnatomy: medulla | Yes | No
      Fly Anat: femur | MouseAnatomy: femur | Yes | No*
      ZfishAnat: hypophysis | MouseAnatomy: pituitary | No | Yes
      TAO: fossa | AdverseReactions: depression | Yes | No
      FMA: colon | GAZ: Colón, Panama | Yes | No
      Quality: male | Chebi: maleate 2(-) | Yes | No
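Several of the mapped-but-useless pairs above are the kind of result a purely lexical matcher would produce. The sketch below is a deliberately naive matcher, written only to illustrate that failure mode; it is not the mapping tool behind the slide, and `normalize` and `naive_match` are hypothetical names. It does not cover the synonym-driven cases (pith/medulla, fossa/depression).

```python
import unicodedata


def normalize(label: str) -> str:
    """Lowercase a label and strip accents, as a naive matcher might."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.lower()


def naive_match(label_a: str, label_b: str) -> bool:
    """Match if any word in one label is a prefix of a word in the other."""
    words_a = normalize(label_a).replace(",", " ").split()
    words_b = normalize(label_b).replace(",", " ").split()
    return any(
        wa.startswith(wb) or wb.startswith(wa)
        for wa in words_a
        for wb in words_b
    )


if __name__ == "__main__":
    pairs = [
        ("extensor retinaculum of wrist", "retina"),
        ("colon", "Colón, Panama"),
        ("male", "maleate 2(-)"),
        ("hypophysis", "pituitary"),
    ]
    for a, b in pairs:
        print(f"{a!r} vs {b!r}: mapped={naive_match(a, b)}")
```

The first three pairs match on spelling alone even though the classes are unrelated, while hypophysis and pituitary, which name the same structure, are missed.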
  • 30. http://uberon.org • Initial Phase • Bottom-up • Create groupings of terms • Light curation • Next Phase • Top down • 14k classes • Design Patterns • Periodic alignment and feeding back to curators Uberon
  • 32. Uberon for gene expression curation http://bgee.org/
  • 33. Uberon for gene expression curation http://bgee.org/
  • 34. dinosaurs, sponges, comb jellies and cephalopods, oh my Thacker, R. W. (2014). The Porifera Ontology (PORO): enhancing sponge systematics with an anatomy ontology. Journal of Biomedical Semantics, 5(1), 39. http://doi.org/10.1186/2041-1480-5-39 Graphic courtesy Nizar Ibrahim, Paul Sereno, et al. Phenotype RCN Wasila Dahdul Bob Thacker obofoundry.org/ontology/ceph.html obofoundry.org/ontology/cteno.html
  • 35. Phenotype and Disease Ontologies  Problem: Many ontologies, vocabularies and condition/phenotype lists:  HP, MP, WBPhenotype, FBcv, TO, VT, FYPO, APO, SNOMED  OMIM, Orphanet, DO, NCIT, MESH, ICD, UMLS, MEDGEN …  ZFIN, Phenoscape: EQ Köhler, S. (2013). F1000Research, 1–12. http://doi.org/10.3410/f1000research.2- Standardized Design Patterns + OWL Reasoning Bayesian OWL Ontology Merging (BOOM) Mungall, C. J. et al. (2016) kBOOM. bioRxiv 10.1101/048843 Monarch merged ‘upheno’ ontology MonDO Elvira Mitraka Sue Bello Nicole Vasilevsky
  • 36. Combined score Remove off-target and common variants Whole exome Variant Score based on allele frequency and pathological impact Mendelian filters Whole or partial phenome (HPO) Owl Sim Gene phenotype scores Curated Phenotype Data Monarch Integrated KB upheno Curated Orthology, Interaction, .. Data +GENOMISER
  • 38. animal-associated soil marine plant-associated sediment aquatic hot spring food cultured freshwater hydrothermal vent terrestrial sludge waste water extreme organism-associated air microbial mat lite http://obofoundry.org/ontology/envo.html Ramona Walls Pier Luigi Buttigieg
  • 40. Biological knowledge and curation QC Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530 Annotation errors can arise for different reasons - machine error (inappropriate propagation) - human error Previous versions of the GO had various unusual annotations: • Genes in chicken responsible for lactation
  • 41. Biological knowledge and curation QC Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530 Annotation errors can arise for different reasons - machine error (inappropriate propagation) - human error Previous versions of the GO had various unusual annotations: • Genes in chicken responsible for lactation • Genes in slime mold responsible for dorsal fin development
  • 42. Solution: Taxon constraints Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530 Encode taxon constraints as OWL rules in the ontology only in taxon never in taxon Can be propagated across ontologies E.g. dorsal fin only in vertebrata (uberon) dorsal fin never in tetrapod (uberon) lactation only in mammals (go)
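The "only in taxon" and "never in taxon" checks can be sketched as a lookup against a taxon lineage. The example below is a toy illustration rather than the OWL encoding described in the paper: the taxonomy is heavily simplified, the constraint on dorsal fin development is assumed to be propagated from the Uberon dorsal fin constraints, and all names are hypothetical.

```python
# Toy taxonomy: child -> parent (heavily simplified, for illustration only).
PARENT = {
    "chicken": "tetrapod",
    "mouse": "mammal",
    "mammal": "tetrapod",
    "tetrapod": "vertebrata",
    "zebrafish": "vertebrata",
    "slime mold": "amoebozoa",
}

# Constraints keyed by term, following the examples on the slide.
ONLY_IN = {"lactation": "mammal", "dorsal fin development": "vertebrata"}
NEVER_IN = {"dorsal fin development": ["tetrapod"]}


def lineage(taxon: str) -> set:
    """Return the taxon together with all of its ancestors."""
    seen = {taxon}
    while taxon in PARENT:
        taxon = PARENT[taxon]
        seen.add(taxon)
    return seen


def violations(term: str, taxon: str) -> list:
    """List the taxon constraints broken by annotating `term` in `taxon`."""
    taxa = lineage(taxon)
    problems = []
    only = ONLY_IN.get(term)
    if only is not None and only not in taxa:
        problems.append(f"{term} only_in_taxon {only}")
    for never in NEVER_IN.get(term, []):
        if never in taxa:
            problems.append(f"{term} never_in_taxon {never}")
    return problems


if __name__ == "__main__":
    for term, taxon in [("lactation", "chicken"),
                        ("dorsal fin development", "slime mold"),
                        ("lactation", "mouse")]:
        print(term, "in", taxon, "->", violations(term, taxon) or "ok")
```

Run over an annotation set, a check like this flags the chicken lactation and slime mold dorsal fin annotations mentioned on the previous slides while leaving valid annotations alone.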
  • 43. Hi, ROBOT  How can we package things up and make them easier to use in ontology/curation QC pipelines?  Enter ROBOT  Design Patterns  Continuous Integration
  • 44. Next steps for ontology annotation  Existing ontology annotation model:  Bag of terms gene term term term term term term term term
  • 45. All GO annotations for (human) beta-catenin: (Molecular Function branch)
  • 50. Next generation ontology annotation in Noctua http://noctua.berkeleybop.org/
  • 53. Take homes  Knowledge is a force multiplier  Applies to all biocuration work  But pinpoints need for QC  Design for generality  But acknowledge difficulties  Better support required  Biological knowledge is multifaceted and nuanced  Computer scientists have a tendency towards hubris  Biology is our nemesis  Collaborative approach is vital
  • 59. Acknowledgments  Monarch Initiative: Jeremy Nguyen-Xuan, Kent Shefchek, Matt Brush, Tom Conlin, Lilly Winfree, Eric Douglass, Jules Jacobsen, Craig McLachan, Suzanna Lewis, Julie McMurry, Dan Keith, Nicole Washington, Nicole Vasilevsky, Nathan Dunn, Harry Hochheiser, William Bone, Neal Boerkel, Damian Smedley, Tudor Groza, Sebastian Koehler, Melissa Haendel, Peter Robinson  GO: Michael Ashburner, David Hill, Paola Roncaglia, David Osumi-Sutherland, Tanya Berardini, Jen Deegan, Jane Lomax, Karen Christie, Pascale Gaudet, Monica Munoz-Torres, Seth Carbon, Eric Douglass, Heiko Dietze, Ruth Lovering, Rachael Huntley, Midori Harris, Harold Drabkin, Kimberley Van Auken, Marc Feuermann, Petra Fey, Jim Hu, Debbie Siegel, Helen Parkinson, Tony Sawford, Stacia Engel, Sylvain Poux, Melanie Courtot, Becky Foulger, Emily Dimmer, Rachael Huntley, Huaiyu Mi, Judy Blake, Paul Sternberg, Mike Cherry, Suzi Lewis, Paul Thomas  OBO: Michael Ashburner, Suzanna Lewis, Barry Smith, Richard Scheuermann, Chris Stoeckert, Jie Zheng, Melanie Courtot, Simon Jupp, Ramona Walls, Darren Natale, Melissa Haendel, Lynn Schriml, Alan Ruttenberg, Seth Carbon, James Overton, Bjoern Peters, + all contributors  Planteome: Pankaj Jaiswal, Dennis Stevenson, Laurel Cooper, Austin Meier, Marie Angelique Laporte, Elizabeth Arnaud  Uberon: David Osumi-Sutherland, Paula Mabee, Jim Balhoff, Wasila Dahdul, Alex Dececchi, Nizar Ibrahim, Paul Sereno, Frederic Bastian, Ann Niknejad, Marc Robinson-Rechavi, David Blackburn, Terry Hayamizu, Yvonne Bradford, Ceri Van Slyke, Alex Diehl, Terry Meehan, Robert Druzinsky, Melissa Haendel  ALL OF THE BIOCURATORS NIH ORIP R24OD011883 NHGRI U41HG 002273 NSF DEB-0956049 DOE DE-AC02-05CH11231 NSF IOS 1340112 NSF DBI 1062404
  • 62. Give me a place to stand and with a lever I will move the whole world
  • 63. Uncovering latent meaning in ontologies Mungall, C. J. (2004). Obol: Integrating Language and Meaning in Bio-Ontologies. Comparative and Functional Genomics, 5(7), 509–520. regulation of Notch signaling pathway involved in heart induction relation relation anatomicpathway OWL EXPRESSION HERE ≡ ∃regulates (NSP ⊓ ∃ part-of HI)
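The idea of recovering a logical definition from a term label can be sketched with a single pattern. This is a hedged illustration only: Obol used a grammar-based approach, whereas the regular expression and the function name below are simplifications invented for this example, covering just the "regulation of X involved in Y" shape shown on the slide.

```python
import re

# One label shape only: "regulation of <process> involved in <whole>".
PATTERN = re.compile(
    r"^regulation of (?P<process>.+?) involved in (?P<whole>.+)$"
)


def logical_definition(label: str):
    """Return a description-logic style string for the label, or None."""
    match = PATTERN.match(label)
    if match is None:
        return None
    process = match.group("process")
    whole = match.group("whole")
    # Mirrors the slide's notation: ∃regulates (X ⊓ ∃part-of Y).
    return f"∃regulates ({process} ⊓ ∃part-of {whole})"


if __name__ == "__main__":
    label = ("regulation of Notch signaling pathway "
             "involved in heart induction")
    print(logical_definition(label))
```

A real implementation would also need to resolve each extracted phrase to an ontology class instead of leaving it as free text.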
  • 64. Open Biological Ontologies (OBO)  To provide modular building blocks  Not just functional annotation of genes and gene products  Framework, tools and infrastructure for cooperation and harmonization Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., … Lewis, S. (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol, 25(11), 1251–1255. Function (GO) Anatomy Environment Chemicals (CHEBI) Phenotype and Disease Genes (SO, GENO) Occurs in … http://obofoundry.org
  • 66. Relations: the glue that holds it together  RO 2005 paper  10 relations  Current RO  >500 relations  Molecular biology  Neurobiology  Biotic interactions  …  Many rules on how relations compose together  Working with wikidata http://obofoundry.org/ontology/ro.html
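How relations compose can be sketched as a small fixpoint over chain rules. The rule table below is illustrative, not a copy of RO's axioms: it assumes that part_of is transitive and that capable_of followed by part_of yields capable_of_part_of. The facts are an assumed reading of the compound eye figure, with "involved in" treated as part_of.

```python
# Chain rules: (first relation, second relation) -> inferred relation.
RULES = {
    ("part_of", "part_of"): "part_of",
    ("capable_of", "part_of"): "capable_of_part_of",
}

# Assumed reading of the compound eye example (subject, relation, object).
FACTS = {
    ("outer photoreceptor cell", "part_of", "ommatidium"),
    ("ommatidium", "part_of", "compound eye"),
    ("outer photoreceptor cell", "capable_of", "detection of light stimulus"),
    ("detection of light stimulus", "part_of", "visual perception"),
}


def infer(facts):
    """Apply the chain rules repeatedly until no new triple appears."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        snapshot = list(inferred)
        for a, r1, b in snapshot:
            for b2, r2, c in snapshot:
                if b != b2:
                    continue
                r = RULES.get((r1, r2))
                if r is not None and (a, r, c) not in inferred:
                    inferred.add((a, r, c))
                    changed = True
    return inferred


if __name__ == "__main__":
    for triple in sorted(infer(FACTS) - FACTS):
        print(triple)
```

In practice this work is done by an OWL reasoner over property chain axioms; the loop above only shows the shape of the inference.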
  • 67. Beyond the GO Functional Genomics: Gene function Transcriptomics: Gene expression Phenomics: Effects of gene mutations Gene Ontology Anatomy and Stage Ontology Phenotype and Trait Ontology Links genes to what they do Links genes to where they are expressed Links genes to what happens when they are disrupted or when they vary Disease Ontology Environment Ontology
  • 68. anatomical structure endoderm of foregut lung bud lung respiration organ organ foregut alveolus alveolus of lung organ part FMA:lung MA:lung endoderm GO: respiratory gaseous exchange MA:lung alveolus FMA: pulmonary alveolus is_a (taxon equivalent) develops_from part_of is_a (SubClassOf) capable_of NCBITaxon: Mammalia EHDAA: lung bud only_in_taxon pulmonary acinus alveolar sac lung primordium swim bladder respiratory primordium NCBITaxon: Actinopterygii http://uberon.org Mungall, C. J., Torniai, C., Gkoutos, G. V, Lewis, S. E., & Haendel, M. A. (2012). Uberon, an integrative multi-species anatomy ontology. Genome Biology, 13(1), R5. doi:10.1186/gb-2012-13-1-r5 Uberon bridges anatomy ontologies
  • 69. Uberon for comparative Gene Expression http://bgee.org/
  • 70. Uberon Core Extensions to other animals… Thacker, R. W., Díaz, M. C., Kerner, A., Vignes-Lebbe, R., Segerdell, E., Haendel, M. A., & Mungall, C. J. (2014). The Porifera Ontology (PORO): enhancing sponge systematics with an anatomy ontology. Journal of Biomedical Semantics, 5(1), 39 Non-model/human extension Porifera Ontology Ctenophore Ontology Cephalopod Ontology http://phenotypercn.org https://github.com/obophenotype/cephalopod-ontology https://github.com/obophenotype/ctenophore-ontology https://github.com/obophenotype/porifera-ontology https://github.com/obophenotype/uberon Arthropod Ontology
  • 72. The Undiagnosed Disease Patient (UDP) Use Case Clinical Phenotyping (HPO/PhenoTips) Exome Sequencing Causative Variant?
  • 73. https://www.sanger.ac.uk/resources/databases/exomiser/query/exomiser2 Robinson, P., et al. (2013). Improved exome prioritization of disease genes through cross species phenotype comparison. Genome Research. doi:10.1101/gr.160325.113
  • 74. TODO DEPRECATED The need for modularization  Growing pains of GO  Terms were added as-needed for curation  Hard to maintain  Scope: Encompassing all of biology is hard  Biochemistry, cell biology, plants, animal development and physiology, …  We needed to modularize  Meanwhile  Other ontologies in the ‘style’ of GO were popping up, for annotating other kinds of data  Challenge: how were we going to coordinate this?
  • 75. Biological knowledge and curation QC  Taxon constraints  CONCRETE EXAMPLE HERE  Intersection rules  (see Seth’s talk) Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530
  • 76. Knowledge-Based • ice cream derived-from dairy • Ice cream is yummy
  • 82. Uberon/CL applications and users  Ontology Modularization  GO  CLO  Pheno Ontologies (EQ definitions)  ENVO  Transcriptomics and genome annotation  ENCODE  FANTOM5  LINCS  BgeeDb  Phenomics  Human and Mammalian Phenotype Ontology  Phenotype comparison algorithms  Evolutionary Phenotypes: Phenoscape http://uberon.github.io/about/adopters.html
  • 83. The path to AI, 1990s  Two goals  Broad AI  Narrow AI  What path to get there?  Knowledge-Based  Explicit encoding of knowledge about the world  Analytic or deductive reasoning  Mathematical logic vs cognitively inspired (neats vs scruffs)  ‘Knowing that’  Knowledge-Free  Machine Learning, Neural Networks  Statistics  Pattern Recognition  Biologically inspired  ‘Knowing how’
  • 84. Opposites Koehler et al, bioRxiv https://doi.org/10.1101/108977
  • 85. compound eye ommatidium sense organ eye disc is_a part_of develops from detection of light stimulus involved in visual perception One ontology to bind them: the Relation Ontology (RO) capable of outer photoreceptor cell part_of http://obofoundry.org/ontology/ro.html lamina monopolar neuron L3 synapsed by