7. Which path to AI? (circa 1990s)
Artificial Intelligence: Narrow AI, Broad AI
Knowledge-Based: logic, encoding, ‘knowing that’, cognitively inspired
Knowledge-Free: statistics, learning, ‘knowing how’, biologically inspired
14. From sequence to genome annotation
• Analysis pipeline
• Curation tools
• Annotation database
15. Generalized community tools
• Analysis pipeline
• Curation tools
• Annotation database
Chado
Mungall, C. J., Emmert, D. B., & FlyBase Consortium (2007). A Chado case study: an ontology-based modular schema for representing genome-associated biological information. Bioinformatics, 23(13), i337-346. http://doi.org/10.1093/bioinformatics/btm189
16. Genomes to function annotation?
• Analysis pipeline
• Curation tools
• Annotation database
• Functional annotation
What does it do?
17. Gene Ontology: tool for the unification of biology (2000)
Organize generalized biological knowledge as a graph
Attach genes to nodes
Propagate across species
Create gene lists
Interpret high throughput data
Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., … Sherlock, G. (2000). Gene ontology: tool for the
unification of biology. The Gene Ontology Consortium. Nat Genet, 25(1), 25–29. http://doi.org/10.1038/75556
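The first steps listed on this slide (knowledge as a graph, genes attached to nodes, gene lists) can be sketched in a few lines. This is a minimal illustration only: the term names, gene names and the `ancestors` / `gene_lists` helpers are hypothetical and not taken from GO or any GO tool, and propagation across species is not modeled.

```python
from collections import defaultdict

# Toy ontology: term -> set of parent terms (is_a edges only).
# Term names are illustrative, not real GO identifiers.
PARENTS = {
    "biological process": set(),
    "signaling": {"biological process"},
    "Notch signaling": {"signaling"},
    "development": {"biological process"},
    "heart development": {"development"},
}

# Direct annotations: hypothetical gene -> terms it was curated to.
DIRECT = {
    "geneA": {"Notch signaling"},
    "geneB": {"heart development"},
    "geneC": {"Notch signaling", "heart development"},
}


def ancestors(term, parents=PARENTS):
    """Return the term plus every term reachable by following parent edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def gene_lists(direct=DIRECT):
    """Map each term to the genes annotated to it or to any of its descendants."""
    by_term = defaultdict(set)
    for gene, terms in direct.items():
        for term in terms:
            for anc in ancestors(term):
                by_term[anc].add(gene)
    return dict(by_term)


lists = gene_lists()
print(sorted(lists["signaling"]))           # genes under 'signaling'
print(sorted(lists["biological process"]))  # every annotated gene
```

Because an annotation to a specific term also counts for its ancestors, a gene list for a general term can be built without curating each gene to that term directly.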
18. Ontologies as force amplifiers for data
[Diagram labels: domain knowledge, data, biocuration, experiment]
19. Don’t worship the monolith
PROBLEM: GO and other ontologies were becoming monolithic
- lots of implicit overlap with other ontologies, latent structure
20. Open Biological Ontologies (OBO)
http://obofoundry.org
1. Well-integrated modular ontologies
2. Provide technical and sociotechnological framework for cooperation
3. Provide tools, best practices and infrastructure for forging new ontologies
4. Allow us to curate all of the things
@obofoundry
21. OBO Library PURLs
PURL: Persistent URL
Consistent, predictable, stable and versioned URLs for ontology objects
Can be shortened as compact URIs (CURIEs), e.g. GO:0008150
Can be registered and viewed on OBO site http://obofoundry.org
Ontology PURLs
Main ontology, subsets
versionIRIs
Ontology term PURLs
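A minimal sketch of how a CURIE such as GO:0008150 relates to its term PURL. The base `http://purl.obolibrary.org/obo/` and the `PREFIX_LOCALID` shape follow the OBO Library convention; the function names are hypothetical, and the sketch assumes the ontology prefix itself contains no underscore.

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_purl(curie):
    """Expand a CURIE such as 'GO:0008150' into an OBO term PURL."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    return f"{OBO_BASE}{prefix}_{local_id}"


def purl_to_curie(purl):
    """Contract an OBO term PURL back into its CURIE."""
    if not purl.startswith(OBO_BASE):
        raise ValueError(f"not an OBO PURL: {purl!r}")
    # Assumption: the first underscore separates prefix from local identifier.
    prefix, sep, local_id = purl[len(OBO_BASE):].partition("_")
    if not sep or not prefix or not local_id:
        raise ValueError(f"no prefix/identifier split in: {purl!r}")
    return f"{prefix}:{local_id}"


print(curie_to_purl("GO:0008150"))
```

Keeping the two forms mechanically interconvertible is what lets the short CURIE be used in annotation files while the PURL stays resolvable on the web.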
23. Contributions to and uses of RO
>500 relations
Neurocellular (virtualflybrain.org; Osumi-Sutherland, D. (2012). doi:10.1093/bioinformatics/bts113)
Has soma location
Has synaptic terminal in
Upstream in neural circuit with
…
Biotic interaction (globalbioticinteractions.org)
Eats
Epiphyte of
Parasite of
Kleptoparasitizes
Hyperparasitizes
Gene, drug, phenotype
Is model of
Has phenotype
Molecularly controls
Allosteric inhibitor of
Causes or contributes to condition
...
David Osumi-Sutherland, Anne Thessen, Matt Brush, Greg Stupp
26. Making the pieces fit together: GO and CHEBI
Hill, D. P., Adams, N., Bada, M., Batchelor, C., Berardini, T. Z., Dietze, H., … Lomax, J. (2013). Dovetailing biology and
chemistry: integrating the Gene Ontology with the ChEBI chemical ontology. BMC Genomics, 14(1), 513.
http://doi.org/10.1186/1471-2164-14-513
GO CHEBI
• Some relationships didn’t make sense
• E.g. nucleotide isa carbohydrate
• Acids conjugate bases
Harold Drabkin
David Hill
Jane Lomax
Tanya Berardini
Janna Hastings
27. Making the pieces fit together: GO and CHEBI
Hill, D. P., Adams, N., Bada, M., Batchelor, C., Berardini, T. Z., Dietze, H., … Lomax, J. (2013). Dovetailing biology and
chemistry: integrating the Gene Ontology with the ChEBI chemical ontology. BMC Genomics, 14(1), 513.
http://doi.org/10.1186/1471-2164-14-513
GO CHEBI
• Fixed many is-as
• E.g. nucleotide isa carbohydrate
• Acids conjugate bases
+ OWL reasoning
Harold Drabkin
David Hill
Jane Lomax
Tanya Berardini
Janna Hastings
GO CHEBI
+ Design Patterns
28. Challenges of multi-species anatomy and phenotypes
[Figure: how ‘lung’ is placed in different ontologies]
Ontologies shown: Mammalian Phenotype, Mouse Anatomy, FMA, Human development
Anatomy terms shown: lung, lobular organ, parenchymatous organ, solid organ, pleural sac, thoracic cavity organ, thoracic cavity, alveolus, alveolar sac, pulmonary acinus, lower respiratory tract, respiratory system, organ system, lung bud, respiratory primordium, pharyngeal region
Phenotype terms shown: abnormal lung morphology, abnormal respiratory system morphology, abnormal pulmonary acinus morphology, abnormal pulmonary alveolus morphology
Relations shown: develops_from, part_of, is_a (SubClassOf), surrounded_by
29. The perils of mappings
Class A                            | Class B                      | Mapped? | Useful?
FMA: extensor retinaculum of wrist | MouseAnatomy: retina         | Yes     | No
Plant Ontology: Pith               | MouseAnatomy: medulla        | Yes     | No
Fly Anat: femur                    | MouseAnatomy: femur          | Yes     | No*
ZfishAnat: hypophysis              | MouseAnatomy: pituitary      | No      | Yes
TAO: fossa                         | AdverseReactions: depression | Yes     | No
FMA: colon                         | GAZ: Colón, Panama           | Yes     | No
Quality: male                      | Chebi: maleate 2(-)          | Yes     | No
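The slide does not say how these mappings were produced. Assuming, purely for illustration, that some of them come from label matching, the sketch below shows how a naive matcher reproduces three of the unhelpful "Yes" rows and misses the useful hypophysis/pituitary pair. The `tokens` and `naive_match` functions and the prefix rule are hypothetical, not any real mapping tool.

```python
import unicodedata


def tokens(label):
    """Lowercase, strip accents, and split a label into alphabetic tokens."""
    folded = unicodedata.normalize("NFKD", label)
    folded = "".join(ch for ch in folded if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalpha() else " " for ch in folded.lower())
    return cleaned.split()


def naive_match(label_a, label_b, min_len=4):
    """Report a match when a token of one label is a prefix of a token of the other."""
    for a in tokens(label_a):
        for b in tokens(label_b):
            short, long_ = sorted((a, b), key=len)
            if len(short) >= min_len and long_.startswith(short):
                return True
    return False


pairs = [
    ("extensor retinaculum of wrist", "retina"),
    ("colon", "Colón, Panama"),
    ("male", "maleate 2(-)"),
    ("hypophysis", "pituitary"),
]
for a, b in pairs:
    print(f"{a!r} vs {b!r}: {naive_match(a, b)}")
```

String similarity alone says nothing about whether two classes mean the same thing, which is why the "Mapped?" and "Useful?" columns disagree.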
30. Uberon
http://uberon.org
• Initial Phase
  • Bottom-up
  • Create groupings of terms
  • Light curation
• Next Phase
  • Top down
  • 14k classes
  • Design Patterns
  • Periodic alignment and feeding back to curators
34. dinosaurs, sponges, comb jellies and cephalopods, oh my
Thacker, R. W., et al. (2014). The Porifera Ontology (PORO): enhancing sponge systematics with an anatomy ontology. Journal of Biomedical Semantics, 5(1), 39. http://doi.org/10.1186/2041-1480-5-39
Graphic courtesy Nizar Ibrahim, Paul Sereno, et al.
Phenotype RCN
Wasila Dahdul
Bob Thacker
obofoundry.org/ontology/ceph.html
obofoundry.org/ontology/cteno.html
35. Phenotype and Disease Ontologies
Problem: Many ontologies, vocabularies and condition/phenotype lists:
HP, MP, WBPhenotype, FBcv, TO, VT, FYPO, APO, SNOMED
OMIM, Orphanet, DO, NCIT, MESH, ICD, UMLS, MEDGEN …
ZFIN, Phenoscape: EQ
Köhler, S. (2013). F1000Research, 1–12. http://doi.org/10.3410/f1000research.2-
Standardized Design Patterns + OWL Reasoning
Bayesian OWL Ontology Merging (BOOM)
Mungall, C. J. et al. (2016). kBOOM. bioRxiv 10.1101/048843
Monarch merged ‘upheno’ ontology
MonDO
Elvira Mitraka, Sue Bello, Nicole Vasilevsky
36. Combined score
Whole exome
Remove off-target and common variants
Variant Score based on allele frequency and pathological impact
Mendelian filters
Whole or partial phenome (HPO)
OwlSim
Gene phenotype scores
Curated Phenotype Data
Curated Orthology, Interaction, .. Data
Monarch Integrated KB
upheno
+GENOMISER
40. Biological knowledge and curation QC
Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530
Annotation errors can arise for different reasons
- machine error (inappropriate propagation)
- human error
Previous versions of the GO had various unusual annotations:
• Genes in chicken responsible for lactation
• Genes in slime mold responsible for dorsal fin development
42. Solution: Taxon constraints
Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and
ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530
Encode taxon constraints as OWL rules in the ontology
only in taxon
never in taxon
Can be propagated across ontologies
E.g.
dorsal fin only in vertebrata (uberon)
dorsal fin never in tetrapod (uberon)
lactation only in mammals (go)
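The constraints above are encoded as OWL rules in the ontology; the sketch below only mimics the check in plain Python to show the logic. The taxonomy is deliberately simplified, and `TAXON_PARENT`, `INHERITS_FROM`, `lineage` and `violations` are hypothetical names, not part of any GO or Uberon tooling.

```python
# Toy taxonomy: taxon -> parent taxon (None at the root). Simplified on purpose.
TAXON_PARENT = {
    "cellular organisms": None,
    "Dictyostelium": "cellular organisms",  # slime mold
    "Vertebrata": "cellular organisms",
    "Actinopterygii": "Vertebrata",
    "Tetrapoda": "Vertebrata",
    "Mammalia": "Tetrapoda",
    "Mus musculus": "Mammalia",
    "Gallus gallus": "Tetrapoda",           # chicken
    "Danio rerio": "Actinopterygii",        # zebrafish
}

# Constraints from the slide, keyed by the term they are stated on.
ONLY_IN = {"dorsal fin": "Vertebrata", "lactation": "Mammalia"}
NEVER_IN = {"dorsal fin": "Tetrapoda"}

# Hypothetical cross-ontology link: a process term inherits the constraints
# of the anatomy term it refers to.
INHERITS_FROM = {"dorsal fin development": "dorsal fin"}


def lineage(taxon):
    """Yield the taxon and each of its ancestors up to the root."""
    while taxon is not None:
        yield taxon
        taxon = TAXON_PARENT[taxon]


def violations(term, taxon):
    """List the constraints broken by annotating a gene from `taxon` to `term`."""
    constrained = [term]
    if term in INHERITS_FROM:
        constrained.append(INHERITS_FROM[term])
    line = set(lineage(taxon))
    problems = []
    for t in constrained:
        if t in ONLY_IN and ONLY_IN[t] not in line:
            problems.append(f"{t} only in {ONLY_IN[t]}")
        if t in NEVER_IN and NEVER_IN[t] in line:
            problems.append(f"{t} never in {NEVER_IN[t]}")
    return problems


print(violations("lactation", "Gallus gallus"))
print(violations("dorsal fin development", "Dictyostelium"))
print(violations("dorsal fin development", "Danio rerio"))
```

Run over a whole annotation set, a check of this shape flags the chicken lactation and slime mold dorsal fin annotations and leaves the zebrafish annotation alone.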
43. Hi, ROBOT
How can we package things up and make them easier to use in ontology/curation QC pipelines?
Enter ROBOT
Design Patterns
Continuous Integration
44. Next steps for ontology annotation
Existing ontology annotation model: Bag of terms
[Diagram: one gene linked to eight terms]
53. Take homes
Knowledge is a force multiplier
Applies to all biocuration work
But pinpoints need for QC
Design for generality
But acknowledge difficulties
Better support required
Biological knowledge is multifaceted and nuanced
Computer scientists have a tendency towards hubris
Biology is our nemesis
Collaborative approach is vital
59. Acknowledgments
Monarch Initiative: Jeremy Nguyen-Xuan, Kent Shefcheck, Matt Brush, Tom Conlin, Lilly
Winfree, Eric Douglass, Jules Jacobsen, Craig McLachan, Suzanna Lewis, Julie McMurry, Dan
Keith, Nicole Washington, Nicole Vasilevsky, Nathan Dunn, Harry Hochheiser, William Bone, Neal
Boerkel, Damian Smedley, Tudor Groza, Sebastian Koehler, Melissa Haendel, Peter
Robinson
GO: Michael Ashburner, David Hill, Paola Roncaglia, David Osumi-Sutherland, Tanya Berardini,
Jen Deegan, Jane Lomax, Karen Christie, Pascale Gaudet, Monica Munoz-Torres, Seth
Carbon, Eric Douglass, Heiko Dietze, Ruth Loverin, Rachael Huntley, Midori Harris, Harold
Drabkin, Kimberley Van Auken, Marc Feuermann, Petra Fey, Jim Hu, Debbie Siegel, Helen
Parkinson, Tony Sawford, Stacia Engel, Sylav Poux, Melanie Courtot, Becky Foulger, Emily
Dimmer, Rachael Huntley, Huaiyu Mi, Judy Blake, Paul Sternberg, Mike Cherry, Suzi Lewis, Paul
Thomas
OBO: Michael Ashburner, Suzanna Lewis, Barry Smith, Richard Scheuermann, Chris Stockert,
Jie Zheng, Melanie Courtot, Simon Jupp, Ramona Walls, Darren Natale, Melissa Haendel, Lynn
Schriml, Alan Ruttenberg, Seth Carbon, James Overton, Bjoern Peters, + all contributors
Planteome: Pankaj Jaiswal, Dennis Stevenson, Laurel Cooper, Austin Meier, Marie Angelique
Laporte, Elizabeth Arnaud
Uberon: David Osumi-Sutherland, Paula Mabee, Jim Balhoff, Wasila Dahdul, Alex Dececci,
Nizar Ibrahim, Paul Sereno, Frederic Bastian, Ann Niknejad, Marc Robinson-Rechavi, David
Blackburn, Terry Hayamizu, Yvonne Bradford, Ceri Van Slyke, Alex Diehl, Terry Meehab,
Robert Druzinsky, Melissa Haendel
ALL OF THE BIOCURATORS
NIH ORIP R24OD011883
NHGRI U41HG 002273 NSF DEB-0956049 DOE DE-AC02-05CH11231
NSF IOS 1340112
NSF DBI 1062404
62. Give me a place to stand and with a lever I will move the whole world
63. Uncovering latent meaning in ontologies
Mungall, C. J. (2004). Obol: Integrating Language and Meaning in Bio-Ontologies. Comparative and Functional Genomics, 5(7), 509–520.
regulation of Notch signaling pathway involved in heart induction
Labels on the parts of the term name: relation, relation, anatomic, pathway
OWL expression:
≡ ∃regulates (NSP ⊓ ∃ part-of HI)
where NSP = Notch signaling pathway and HI = heart induction
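A rough sketch of the idea on this slide: a compositional term name can be split into its parts and turned into a logical expression. This is not Obol's implementation; the single regular expression, the `parse_label` and `to_expression` names, and the output notation are assumptions made for illustration, and only labels of the shape "regulation of X involved in Y" are handled.

```python
import re

# Pattern for labels shaped like "regulation of <process> involved in <whole>".
PATTERN = re.compile(r"^regulation of (?P<process>.+?) involved in (?P<whole>.+)$")


def parse_label(label):
    """Split a compositional term label into its parts, or return None."""
    match = PATTERN.match(label.strip())
    if match is None:
        return None
    return {
        "relation": "regulates",
        "process": match.group("process"),
        "part_of": match.group("whole"),
    }


def to_expression(parts):
    """Render parsed parts as a description-logic style string."""
    return f"∃regulates.({parts['process']} ⊓ ∃part-of.{parts['part_of']})"


label = "regulation of Notch signaling pathway involved in heart induction"
parts = parse_label(label)
print(parts)
print(to_expression(parts))
```

Labels that do not fit the pattern return `None`, which is the honest outcome for a one-pattern sketch; covering more term shapes would need a larger grammar.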
64. Open Biological Ontologies (OBO)
To provide modular building blocks
Not just functional annotation of genes and gene products
Framework, tools and infrastructure for cooperation and harmonization
Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., … Lewis, S. (2007). The OBO Foundry: coordinated
evolution of ontologies to support biomedical data integration. Nat Biotechnol, 25(11), 1251–1255.
Function (GO)
Anatomy
Environment
Chemicals (CHEBI)
Phenotype and Disease
Genes (SO, GENO)
Occurs in
…
http://obofoundry.org
66. Relations: the glue that holds it together
RO 2005 paper
10 relations
Current RO
>500 relations
Molecular biology
Neurobiology
Biotic interactions
…
Many rules on how relations compose together
Working with Wikidata
http://obofoundry.org/ontology/ro.html
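"Rules on how relations compose" can be pictured as a lookup table from a pair of relations to the relation they imply. In the sketch below the two rules, the example edges and the `infer` function are illustrative assumptions, not a copy of RO's actual axioms.

```python
# Composition rules: (r1, r2) -> r3 means "x r1 y" and "y r2 z" imply "x r3 z".
# Both rules are illustrative assumptions, not a copy of RO's axioms.
RULES = {
    ("part_of", "part_of"): "part_of",
    ("regulates", "part_of"): "regulates",
}


def infer(edges, rules=RULES):
    """Apply the composition rules until no new (subject, relation, object) edge appears."""
    known = set(edges)
    changed = True
    while changed:
        changed = False
        for (x, r1, y) in list(known):
            for (y2, r2, z) in list(known):
                if y == y2 and (r1, r2) in rules:
                    new_edge = (x, rules[(r1, r2)], z)
                    if new_edge not in known:
                        known.add(new_edge)
                        changed = True
    return known


asserted = {
    ("alveolus", "part_of", "lung"),
    ("lung", "part_of", "respiratory system"),
    ("process R", "regulates", "process P"),
    ("process P", "part_of", "process Q"),
}
for edge in sorted(infer(asserted) - asserted):
    print(edge)
```

The quadratic loop is fine for a toy graph; a reasoner over a real ontology would index edges by subject rather than rescan every pair.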
67. Beyond the GO
Functional Genomics: Gene function | Gene Ontology | Links genes to what they do
Transcriptomics: Gene expression | Anatomy and Stage Ontology | Links genes to where they are expressed
Phenomics: Effects of gene mutations | Phenotype and Trait Ontology | Links genes to what happens when they are disrupted or when they vary
Disease Ontology
Environment Ontology
68. Uberon bridges anatomy ontologies
http://uberon.org
[Figure: multi-species classes linked to species-specific anatomy ontologies]
Unprefixed classes: anatomical structure, organ, organ part, respiration organ, lung, lung bud, lung primordium, respiratory primordium, foregut, endoderm, endoderm of foregut, alveolus, alveolus of lung, alveolar sac, pulmonary acinus, swim bladder
Prefixed classes: FMA:lung, MA:lung, MA:lung alveolus, FMA: pulmonary alveolus, EHDAA: lung bud, GO: respiratory gaseous exchange, NCBITaxon: Mammalia, NCBITaxon: Actinopterygii
Relations: is_a (taxon equivalent), develops_from, part_of, is_a (SubClassOf), capable_of, only_in_taxon
Mungall, C. J., Torniai, C., Gkoutos, G. V, Lewis, S. E., & Haendel, M. A. (2012). Uberon, an integrative multi-species anatomy ontology. Genome Biology, 13(1), R5. doi:10.1186/gb-2012-13-1-r5
70. Uberon Core
Extensions to other animals…
Thacker, R. W., Díaz, M. C., Kerner, A., Vignes-Lebbe, R., Segerdell, E., Haendel, M. A., & Mungall, C. J. (2014). The Porifera Ontology (PORO): enhancing sponge systematics with an anatomy ontology. Journal of Biomedical Semantics, 5(1), 39
Non-model/human extension
Porifera Ontology
Ctenophore Ontology
Cephalopod Ontology
Arthropod Ontology
http://phenotypercn.org
https://github.com/obophenotype/cephalopod-ontology
https://github.com/obophenotype/ctenophore-ontology
https://github.com/obophenotype/porifera-ontology
https://github.com/obophenotype/uberon
74. TODO DEPRECATED The need for modularization
Growing pains of GO
Terms were added as-needed for curation
Hard to maintain
Scope: Encompassing all of biology is hard
Biochemistry, cell biology, plants, animal development and physiology, …
We needed to modularize
Meanwhile
Other ontologies in the ‘style’ of GO were popping up, for annotating other kinds of data
Challenge: how were we going to coordinate this?
75. Biological knowledge and curation QC
Taxon constraints
CONCRETE EXAMPLE HERE
Intersection rules
(see Seth’s talk)
Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and
ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530
82. Uberon/CL applications and users
Ontology Modularization
GO
CLO
Pheno Ontologies (EQ definitions)
ENVO
Transcriptomics and genome annotation
ENCODE
FANTOM5
LINCS
BgeeDb
Phenomics
Human and Mammalian Phenotype Ontologies
Phenotype comparison algorithms
Evolutionary Phenotypes: Phenoscape
http://uberon.github.io/about/adopters.html
83. The path to AI, 1990s
Two goals
Broad AI
Narrow AI
What path to get there?
Knowledge-Based
Explicit Encoding of knowledge about the world
Analytic or deductive reasoning
Mathematical Logic vs Cognitively inspired (neats vs scruffs)
‘Knowing that’
Knowledge-Free
Machine Learning, Neural Networks
Statistics
Pattern Recognition
Biologically Inspired
‘Knowing how’