2. My profile
<location>USA</location>
<work>postdoctoral training</work>
<work:company_type>University</work:company_type>
<work:has_title>research fellow</work:has_title>
<bioinformatics>ontology development</bioinformatics>
<bioinformatics>social network analysis</bioinformatics>
<bioinformatics>ontology applying data analysis</bioinformatics>
Postdoc training: Ontologies
<location>Japan</location>
<work>institution</work>
<work:company_type>non-profit organization</work:company_type>
<work:has_title>bioinformatician</work:has_title>
<bioinformatics>454 sequence assembly</bioinformatics>
<bioinformatics>non-model organism sequence analysis</bioinformatics>
Bioinformatician: NGS
<location>Japan</location>
<education:has_degree>Ph.D</education:has_degree>
<education:has_major>medical informatics</education:has_major>
<bioinformatics>ontology</bioinformatics>
<bioinformatics>data integration</bioinformatics>
<bioinformatics>biological pathway analysis</bioinformatics>
Ph.D. in Medical Informatics
<location>China</location>
<work>industry</work>
<work:company_type>start_up IT</work:company_type>
<work:has_title>content manager</work:has_title>
<work:has_title>project manager</work:has_title>
<IT_skill>web site building</IT_skill>
<IT_skill>relational database</IT_skill>
Content Manager & Project Manager
<location>China</location>
<education>Medical School</education>
<education:has_degree>master</education:has_degree>
<education:has_major>molecular immunology</education:has_major>
<bioinformatics>sequencing</bioinformatics>
<bioinformatics>protein 3D simulation</bioinformatics>
Master in Molecular Immunology
<location>China</location>
<education>Medical School</education>
<education:has_degree>bachelor</education:has_degree>
<education:has_major>Pediatrics</education:has_major>
M.D. in Pediatrics
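The tagged profile entries can be read as predicate/value pairs about one person. A minimal Python sketch of that reading, assuming each annotation is written as `<tag>value</tag>`; the function name `parse_annotations` and the sample string are illustrative and not part of the original slides:

```python
import re

# Matches <tag>value</tag>. The closing tag name is deliberately not checked,
# so an annotation closed with a different name is still picked up.
TAG_PATTERN = re.compile(r"<([\w:]+)>\s*(.*?)\s*</[\w:]+>")


def parse_annotations(text):
    """Return (predicate, value) pairs found in the tagged text."""
    return [(tag, value) for tag, value in TAG_PATTERN.findall(text)]


sample = """
<location>USA</location>
<work:has_title>research fellow</work:has_title>
<bioinformatics>ontology development</bioinformatics>
"""

pairs = parse_annotations(sample)
for predicate, value in pairs:
    print(f"me --{predicate}--> {value}")
```

Each pair corresponds to one statement about the profile owner, which is the same shape as the triples discussed later in the deck.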
3. Agenda
Introduction: ontologies, semantic web and big data
Selected projects:
1. Informed Consent Ontology (ICO)
2. miRNA and Aging Ontology (MIAGO)
3. Ontology of Drug Neuropathy Adverse Event (ODNAE)
4. LINCS-BD2K
5. mebdo (Medicare and Census big data project)
SOCR Data Dashboard
Conclusion
4. What is ontology?
Ontologies are a form of knowledge representation: structural frameworks for organizing terms hierarchically and defining relations between terms within a domain.
1. A hierarchical vocabulary: class, subclass, instance
2. Defined relations between terms to interlink the whole system
3. Constraints and logical definitions
4. Explicit specification of a conceptualization (Gruber, 1993)
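The class-subclass-instance hierarchy can be illustrated with a small Python sketch. The terms "adverse event" and "case report 1" are made up for illustration, and the two dictionaries are only a stand-in for a real ontology language:

```python
# Toy is_a hierarchy (child class -> parent class); terms are illustrative only.
IS_A = {
    "neuropathy AE": "adverse event",
    "drug-associated neuropathy AE": "neuropathy AE",
}
# Instances are attached to a class with an instance_of link.
INSTANCE_OF = {"case report 1": "drug-associated neuropathy AE"}


def ancestors(term):
    """Walk is_a links upward and return every superclass of a class."""
    found = []
    while term in IS_A:
        term = IS_A[term]
        found.append(term)
    return found


def classes_of(instance):
    """An instance belongs to its asserted class and to all its superclasses."""
    asserted = INSTANCE_OF[instance]
    return [asserted] + ancestors(asserted)


print(ancestors("drug-associated neuropathy AE"))
print(classes_of("case report 1"))
```

Because is_a is transitive, anything stated about a superclass also holds for the instance, which is what makes the hierarchy more than a flat vocabulary.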
5. Why ontology?
Knowledge management
•RDF, RDFS, OWL
Natural language processing
•Linguistic ontology: WordNet
E-commerce
Intelligent information integration
Knowledge acquisition and discovery
Database design and integration
Medical decision making agent
Linked Open Data, Semantic Web
6. Semantic Web Layer Cake
RDF: simple triples, graph-based queries, supports very large amounts of data
Bill -has_address- Location A
OWL: significantly more expressive language, strong axioms, inference capabilities, consistency verification, but can be rather slow
Bill -has_address- Location A
Location A -is_address_of- Bill
Inverse relation
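The difference between storing a triple and inferring one can be sketched in plain Python. This is a toy illustration under stated assumptions, not an RDF or OWL implementation: the `INVERSE_OF` table stands in for what `owl:inverseOf` declares, and `infer_inverses` and `query` are hypothetical helper names.

```python
# Asserted triples, as in the RDF example on the slide.
triples = {("Bill", "has_address", "Location A")}

# Declared inverse property pairs (the role owl:inverseOf plays in OWL).
INVERSE_OF = {"has_address": "is_address_of", "is_address_of": "has_address"}


def infer_inverses(asserted):
    """Return the asserted triples plus every triple implied by an inverse."""
    inferred = set(asserted)
    for subject, predicate, obj in asserted:
        if predicate in INVERSE_OF:
            inferred.add((obj, INVERSE_OF[predicate], subject))
    return inferred


def query(store, subject=None, predicate=None, obj=None):
    """Simple pattern match over triples; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


closure = infer_inverses(triples)
print(query(triples, subject="Location A"))  # asserted data only
print(query(closure, subject="Location A"))  # after inference
```

The asserted store has nothing to say about Location A as a subject; the inferred store does, at the cost of an extra reasoning step.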
7. SELECTED PROJECTS
1. Informed Consent Ontology (ICO)
2. miRNA and Aging Ontology (MIAGO)
3. Adverse event analysis
Ontology of Drug Neuropathy Adverse Event (ODNAE)
4. LINCS-BD2K
5. mebdo (Medicare and Census big data project)
SOCR Data Dashboard
10. The power of reasoning
miRNA and Aging Ontology (MIAGO)
Database
(in revision)
12. ODNAE: Linking knowledge together
[Figure, panel (A): general design pattern. Nodes: drug-associated neuropathy AE (ODNAE); neuropathy AE (OAE_0000418); drug administration (OAE_0000011); drug product (DrON_00000005); a drug (DrON, linked to RxNORM, NDFRT); chemical element (ChEBI); drug role in mechanism of action (NDFRT); biological process (GO); human (NCBITaxon_9606); a quality, e.g., age (PATO). Relations: is_a, preceded_by, has_specified_input, has_proper_part, has_role, is_realized_in, occurs in, has participant, has_quality.]
[Figure, panel (B): bupropion example. Nodes: bupropion (Aplezin, Wellbutrin, Zyban, Budeprion SR, Buproban, Forfivo XL)-associated neuropathy AE (ODNAE_0000043); neuropathy AE (OAE_0000418); drug administration (OAE_0000011); drug product (DrON_00000005); Bupropion Oral Tablet (DRON_00026665); bupropion (CHEBI_3219); Dopamine Uptake Inhibitors [MoA] (N0000000114); negative regulation of dopamine uptake (GO_0051585); negative regulation of neurotransmitter uptake (GO_0051581); human (NCBITaxon_9606); age (PATO_0000011). Relations as in panel (A).]
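A rough Python sketch of how such linked terms can be traversed, using the identifiers from the bupropion example. The subject/object pairing of each edge is an assumption inferred from the node and relation labels, since the figure layout is not available here; `find_path` is a hypothetical helper, not part of ODNAE.

```python
from collections import deque

# ASSUMPTION: which node each relation connects is inferred from the labels
# on the slide, not read from the figure. Treat the edges as illustrative.
TRIPLES = [
    ("ODNAE_0000043", "is_a", "OAE_0000418"),
    ("ODNAE_0000043", "preceded_by", "OAE_0000011"),
    ("OAE_0000011", "has_specified_input", "DRON_00026665"),
    ("DRON_00026665", "is_a", "DrON_00000005"),
    ("DRON_00026665", "has_proper_part", "CHEBI_3219"),
    ("CHEBI_3219", "has_role", "N0000000114"),
    ("N0000000114", "is_realized_in", "GO_0051585"),
    ("GO_0051585", "is_a", "GO_0051581"),
    ("ODNAE_0000043", "occurs in", "NCBITaxon_9606"),
    ("NCBITaxon_9606", "has_quality", "PATO_0000011"),
]


def find_path(start, goal):
    """Breadth-first search along triple direction; returns node path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for subject, _, obj in TRIPLES:
            if subject == path[-1] and obj not in seen:
                seen.add(obj)
                queue.append(path + [obj])
    return None


# From the adverse event to the general GO process, via drug, chemical and MoA.
path = find_path("ODNAE_0000043", "GO_0051581")
print(" -> ".join(path))
```

Under these assumed edges, one traversal connects an adverse event to a biological process through terms that come from five different source ontologies.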
13. ODNAE results:
215 neuropathy AE drugs knowledge base
related AEs and 20 AE types
[Figure: panels (A) and (B); the numeric labels could not be recovered from the extraction]
related chemical compounds
139 Mode of Action
ICBO 2015 VDOS workshop
14. What’s missing in ODNAE
Only 13 GO biological processes were mapped to some MoA.
Holistic analytic methods are needed to understand the mechanism.
We need more…
16. University of Miami Computational LINCS Center
LINCS Data Coordinating Center
http://lifeKB.org
BD2K LINCS Data Coordination and Integration Center
http://lincs-dcic.org/
NIH LINCS Program
Library of Integrated Network-based Cellular Signatures
17. Drug and Gene Knockdown Followed by Genome-Wide Expression
KO and Mutant Genes and their Disease Phenotypes
Drug and Knockdown Effects on Cell Viability
Transcription Factors and Histone Modifications Profiled by ChIP-Seq
Protein-Protein Interactions and Cell- or Metabolic-Pathways
Gene Expression from Patient Cohorts with Genomics and Clinical Outcome Data
Drugs and Toxic Chemicals that Cause Adverse Events
Networks
Bi-partite Graphs
Gene-Set Libraries
Hierarchical Trees
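How one representation converts into another can be sketched in Python. The gene-set library below is made up, and the two helper functions are hypothetical; this shows the general idea of the conversion, not the LINCS pipeline itself.

```python
from itertools import combinations

# Hypothetical gene-set library: term -> set of genes (values are made up).
gene_set_library = {
    "perturbation_A_up": {"TP53", "MYC", "EGFR"},
    "perturbation_B_up": {"MYC", "EGFR", "KRAS"},
}


def to_bipartite_edges(library):
    """Bipartite graph: one edge per (term, gene) membership."""
    return sorted((term, gene) for term, genes in library.items() for gene in genes)


def to_gene_network(library):
    """Single-entity network: two genes are linked when they share a term.

    The edge weight counts how many terms contain both genes.
    """
    weights = {}
    for genes in library.values():
        for a, b in combinations(sorted(genes), 2):
            weights[(a, b)] = weights.get((a, b), 0) + 1
    return weights


bipartite = to_bipartite_edges(gene_set_library)
network = to_gene_network(gene_set_library)
print(len(bipartite), "bipartite edges")
print(network[("EGFR", "MYC")], "terms shared by EGFR and MYC")
```

Keys in the network are alphabetically ordered gene pairs, so each undirected edge is stored once.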
19. Data integration and systems modeling
Data Sources
Metadata
Semantic model / ontology
Sets, graphs, trees, networks
bit set libraries, bipartite graphs, networks, hierarchy tree
protein, gene, cell, assay, disease, drug
application ontology
LIFE knowledge model
20. SOCR Analytics Dashboard
Statistics Online Computational Resource
Provides graphical querying, navigation and exploration of multivariate associations in complex heterogeneous datasets.
Integrates dispersed multi-source data and serves the mashed-up information via human and machine interactions in a secure, scalable manner.
http://socr.umich.edu/HTML5/Dashboard/
22. Conclusion:
1. Ontologies are important components for Big Data integration and manipulation.
2. Reusing ontologies will enable seamless integration with other resources.
3. However, ontologies cannot solve all the problems in the biomedical world; they are tools to support science.
4. Formalized ontologies can be used by humans and automated systems as a basis for communication and data exchange (such as RDF data).
5. Ontology-based applications may go beyond reasoning alone and use statistical analyses (enrichment), semantic similarity, network analysis, graph algorithms, clustering, etc.
6. Much more remains to be explored in the big-data era.
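As one example of the statistical analyses mentioned in point 5, term enrichment is commonly scored with a one-sided hypergeometric test. The sketch below is a generic illustration with made-up counts, not a method taken from the projects in this deck; `hypergeom_enrichment_p` is a hypothetical function name.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    population: total number of genes considered
    annotated:  genes in the population that carry the ontology term
    sample:     size of the study gene list
    overlap:    genes in the study list that carry the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up numbers: 20 genes in total, 5 annotated with a term,
# a study list of 4 genes, 3 of which carry the term.
p_value = hypergeom_enrichment_p(population=20, annotated=5, sample=4, overlap=3)
print(f"p = {p_value:.4f}")
```

A small p-value suggests the term appears in the study list more often than chance would predict; in practice the values are corrected for testing many terms at once.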
Editor's Notes
The Library of Integrated Network-based Cellular Signatures (LINCS) is an NIH Common Fund project that was recently expanded to its second phase. The idea is to perturb different types of human cells with many kinds of perturbations: drugs and other small molecules; genetic manipulations such as knockdown or overexpression of genes; manipulation of the extracellular microenvironment conditions, i.e., growing cells on different surfaces; and more. These perturbations are applied to various types of human cells, including induced pluripotent stem cells from patients differentiated into various lineages such as neurons or cardiomyocytes. Then, to better understand the molecular networks affected by these perturbations, changes in the levels of many different variables are measured, including mRNA, protein, and metabolites, as well as cellular phenotypic changes such as changes in cell morphology. In most cases, the data collected are genome-wide and span different regulatory layers.
Seven data types can be converted into single-entity networks, gene-set libraries, hierarchical trees and bipartite graphs.
LINCS is an important glue that connects various entities.