Why Research Data
Management
May Save Science
Anita de Waard
VP Research Data Collaborations
a.dewaard@elsevier.com
http://researchdata.elsevier.com/
Why Life is Difficult,
And What We Can Do About It
Outline:
• The problem: life is difficult.
• One approach to tackling this: claim-evidence
networks.
– How do we find claims?
– How do we find evidence?
– How do we connect the two?
• What is still missing?
• Call to action!
The Problem
Problem 1: a rose is not a rose:
• “…there was significant variability of the
injected venom composition from
specimen to specimen, in spite of their
common biogeographic origin.”
Jose A. Rivera-Ortiz, Herminsul Cano, Frank Marí, Intraspecies variability of the
injected venom of Conus ermineus, doi:10.1016/j.peptides.2010.11.014
• “…Strains DV-3/84 DV-7/84 (group 3)
showed 76.6% similarity to each other and
were similar to all other strains at the
67.6% level.”
Zofia Dzierżewicz et al., Intraspecies variability of Desulfovibrio desulfuricans
strains determined by the genetic profiles, FEMS Microbiology Letters, Volume
219, Issue 1, 14 February 2003, Pages 69–74, doi:10.1016/S0378-1097(02)01199-0
=> A specimen is not a species!
Problem 2: gene expression varies with:
Age: “SIRT1-Associated genes are deregulated in the aged brain”
Philipp Oberdoerffer et al., SIRT1 Redistribution on Chromatin Promotes Genomic Stability but Alters Gene Expression
during Aging, Cell, Volume 135, Issue 5, 28 November 2008, Pages 907–918, doi:10.1016/j.cell.2008.10.025
Smell: “…major urinary proteins […] mediate the pregnancy blocking
effects of male urine”
P.A. Brennan, et al, Patterns of expression of the immediate-early gene egr-1 in the accessory olfactory bulb of female
mice exposed to pheromonal constituents of male urine, Neuroscience, Volume 90, Issue 4, June 1999, Pages 1463–1470,
doi:10.1016/S0306-4522(98)00556-9
Hunger: “Out of the ~30K genes, about 10K are differentially expressed
in liver cells when an animal is in different states of satiety.”
Zhang F, Xu X, Zhou B, He Z, Zhai Q (2011) Gene Expression Profile Change and Associated Physiological and
Pathological Effects in Mouse Liver Induced by Fasting and Refeeding.
PLoS ONE 6(11): e27553. doi:10.1371/journal.pone.002755
Light: “Longer-term enrichment training also altered the mRNA levels of
many genes associated with structural changes that occur during
neuronal growth.”
Cailotto C., et al. (2009) Effects of Nocturnal Light on (Clock) Gene Expression in Peripheral Organs: A Role for the
Autonomic Innervation of the Liver. PLoS ONE 4(5): e5650. doi:10.1371/journal.pone.0005650
=> Knowing genes is not knowing
how they are expressed!
Problem 3:
No man (or mouse) is an island…
• “We found the diversity and abundance of each habitat’s
signature microbes to vary widely even among healthy
subjects, with strong niche specialization both within
and among individuals.”
The Human Microbiome Project Consortium, Structure, function and diversity of the healthy
human microbiome, Nature 486, 207–214 (14 June 2012) doi:10.1038/nature11234
• “Colonization of an infant’s gastrointestinal tract begins
at birth. The acquisition and normal development of the
neonatal microflora is vital for the healthy maturation of
the immune system.”
Mackie RI, Sghir A, Gaskins HR., Developmental microbial ecology of the neonatal
gastrointestinal tract. Am J Clin Nutr. 1999 May;69(5):1035S-1045S
=> An animal is an ecosystem!
Problem 4:
Interactions create more complexity:
• Computing cancer: “No amount of information about
what happens inside a single cell can ever tell you
what a tissue is going to do,” [Glazier] said. “Much of
the information and complexity of tissues and life is
embedded in the way cells talk to each other and the
extracellular environment.”
• Megadata: “These complex emergent systems are
impossible to understand,” “[we] founded Applied
Proteomics to create a protein diagnostic that reveals
not just where a cancer is, but how it interacts with
the body.”
Nature Special Issue Vol. 491 No. 7425, ‘Physical Scientists Take On Cancer’
=> The whole is more than the sum of its parts!
Big problems in biology:
http://en.wikipedia.org/wiki/File:Duck_of_Vaucanson.jpg
1. Intraspecies variability > A specimen is not a species!
2. Gene expression variability > Knowing genes is not
knowing how they are expressed!
3. Microbiome > An animal is an ecosystem!
4. Systems biology > Whole is more than the sum of its parts!
5. Models vs. experiment > Are we talking about the same
things? In a way we can all use?
6. Dynamics > Life is not in equilibrium!
Life is complicated!
Reductionism doesn’t
work for living systems.
Statistics could help!
With enough observations, trends and anomalies can be
detected:
• “Here we present resources from a population of 242
healthy adults sampled at 15 or 18 body sites up to three
times, which have generated 5,177 microbial taxonomic
profiles from 16S ribosomal RNA genes and over 3.5
terabases of metagenomic sequence so far.”
The Human Microbiome Project Consortium, Structure, function and diversity of
the healthy human microbiome, Nature 486, 207–214 (14 June 2012)
doi:10.1038/nature11234
• “The large sample size — 4,298 North Americans of
European descent and 2,217 African Americans — has
enabled the researchers to mine down into the human
genome.”
Nidhi Subbaraman, Nature News, 28 November 2012, High-resolution sequencing
study emphasizes importance of rare variants in disease.
But biological research is insular!
• Biology is small: at sizes of 10^-5 – 10^2 m,
a scientist can work alone (‘King’ and
‘subjects’).
• Biology is messy: it doesn’t
happen behind a terminal.
• Biology is competitive: many
people with similar skill sets,
vying for the same grants
• In summary: the structure of biological
research does not inherently promote
collaboration (unlike, for instance, high-energy physics or
astronomy, which are not all they’re cracked up to
be, either…).
[Figure labels: Prepare, Observe, Analyze, Ponder, Communicate]
How Can We Connect
This Knowledge?
Claim-Evidence Networks
Offer A Model for Connecting Knowledge:
Experimental
Evidence
Converging on Claim/Evidence/Networks, e.g. here:
• The Karyotype Ontology: a computational representation for human cytogenetic patterns. Jennifer Warrender and
Phillip Lord
• Lexical Analysis and Characterization of the OBOFoundry Ontologies. Manuel Quesada-Martínez, Jesualdo Tomás
Fernández-Breis and Robert Stevens
• Exomiser: improved exome prioritization of disease genes through cross species phenotype comparison. Peter
Robinson, Sebastian Köhler, Anika Oellrich, Kai Wang, Chris Mungall, Suzanna E. Lewis, Sebastian Bauer, Dominik
Seelow, Peter Krawitz, Christian Gilissen, Melissa Haendel and Damian Smedley
• BioAssay Ontology (BAO): Modularization, Integration and Applications. Uma Vempati, Hande Kucuk, Saminda
Abeyruwan, Ubbo Visser, Vance Lemmon, Ahsan Mir and Stephan Schürer
• eXframe: A Semantic Web Platform for Genomics Experiments. Emily Merrill, Stephane Corlosquet, Paolo
Ciccarese, Tim Clark and Sudeshna Das
• Ovopub: Modular data publication with minimal provenance. Alison Callahan and Michel Dumontier
• Zooma – A tool for automated ontology annotation. Tony Burdett, Simon Jupp, James Malone, Helen
Parkinson, Eleanor Williams and Adam Faulconbridge
• A Probabilistic Framework for Ontology-Based Annotation in Neuroimaging Literature. Chayan
Chakrabarti, Thomas B. Jones, Jiawei F. Xu, George F. Luger, Angela R. Laird, Matthew D. Turner and Jessica A.
Turner
• Preserving sequence annotations across reference sequences. Zuotian Tatum, Andrew Gibson, Marco Roos, Peter
E.M. Taschner, Mark Thompson, Erik A. Schultes and Jeroen F. J. Laros
• A Taxonomy for Immunologists. James A. Overton, Randi Vita, Jason A. Greenbaum, Heiko Dietze, Alessandro Sette
and Bjoern Peters
• Health Data Ontology Trunk: A middle-layer ontology for healthcare. Ulf Schwarz, Luc Schneider, Emilio
Sanfilippo, Holger Stenzhorn and Nikolina Koleva
• Structured representation of scientific evidence using semantic web techniques – a biochemistry use
case. Christian Bölling, Michael Weidlich and Hermann-Georg Holzhütter
• Synthetic Biology Open Language Visual: an ontological use case. Jacqueline Quinn, Michal Galdzicki, Robert
Step 1: Find claims:
E.g., using XIP for discourse analysis:
In contrast with previous hypotheses compact plaques form before significant
deposition of diffuse A beta, suggesting that different mechanisms are involved
in the deposition of diffuse amyloid and the aggregation into plaques.
[Figure labels: Entities, Relationships, Temporality, Connections, thematic roles, Status;
core information (proposition): information extraction;
rhetorical metadiscourse: discourse analysis, discourse structure]
Sándor, Àgnes and de Waard, Anita, (2012).
Finding Claimed Knowledge Updates:
Sandor, A. and de Waard, A. (2012)
Here we used mass spectrometry to identify HuD as a novel
neuronal SMN-interacting partner
Our analysis of known HuD-associated mRNAs in neurons identified
cpg15 mRNA as a highly abundant mRNA in HuD IPs
Our finding that SMN protein associates with HuD protein and the
HuD target cpg15 mRNA in neurons …
Definition of a Claimed Knowledge Update (CKU):
1) A CKU expresses a verbal or nominal proposition about biological entities.
2) A CKU is a new proposition.
3) The authors present the CKU as factual.
4) A CKU is derived from the experimental work described in the article.
5) The ownership of the proposition is attributed to the author(s) of the article.
6) Criteria 4) and 5) are either explicitly expressed or implicitly conveyed by a
structural position such as a title, section title or caption title.
Allow for Hedging and Uncertainty:
Ontology of Reasoning, Certainty and Attribution (ORCA)
For a Proposition P, an epistemically marked clause E
is an evaluation of P, written E_{V,B,S}(P), with:
– V = Value:
3 = assumed true, 2 = probable, 1 = possible, 0 = unknown,
-1 = possibly untrue, -2 = probably untrue, -3 = assumed untrue
– B = Basis:
Reasoning
Data
– S = Source:
A = speaker is author A, explicit
IA = speaker is author A, implicit
N = other author N, explicit
NN = other author NN, implicit
Based on a conversation with Ed Hovy;
de Waard, A. and Schneider, J. (2012)
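A minimal sketch of how an ORCA-style evaluation might be held in code. The class and field names (`Evaluation`, `Basis`, `Source`) are illustrative assumptions and not part of the published ontology; only the value scale and the basis and source categories are taken from the slide. The example proposition is a paraphrase used purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Basis(Enum):
    REASONING = "Reasoning"
    DATA = "Data"


class Source(Enum):
    AUTHOR_EXPLICIT = "A"
    AUTHOR_IMPLICIT = "IA"
    OTHER_EXPLICIT = "N"
    OTHER_IMPLICIT = "NN"


# Value scale from the slide: 3 = assumed true ... 0 = unknown ... -3 = assumed untrue.
VALUE_LABELS = {
    3: "assumed true", 2: "probable", 1: "possible", 0: "unknown",
    -1: "possibly untrue", -2: "probably untrue", -3: "assumed untrue",
}


@dataclass(frozen=True)
class Evaluation:
    """Hypothetical container for an evaluation E_{V,B,S}(P)."""
    proposition: str
    value: int
    basis: Optional[Basis] = None      # None stands for "Unknown"
    source: Optional[Source] = None    # None stands for "Unknown"

    def __post_init__(self):
        # Reject values outside the seven-point scale.
        if self.value not in VALUE_LABELS:
            raise ValueError(f"value must be between -3 and 3, got {self.value}")

    def describe(self) -> str:
        basis = self.basis.value if self.basis else "Unknown"
        source = self.source.value if self.source else "Unknown"
        return f"{VALUE_LABELS[self.value]} (basis: {basis}, source: {source})"


hedged = Evaluation("miR-372 inhibits expression of LATS2", value=1)
print(hedged.describe())
```

Keeping value, basis and source as separate fields means a hedge such as "possibly" is stored next to the proposition rather than being lost when the proposition is formalised.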
Turning claims into formal representations:

Biological statement with BEL/epistemic markup:
  These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition
  of the expression of the tumor-suppressor LATS2.
BEL representation:
  r(MIR:miR-372) -| (tscript(p(HUGO:Trp53)) -| kin(p(PFH:"CDK Family")))
  Increased abundance of miR-372 decreases abundance of LATS2:
  r(MIR:miR-372) -| r(HUGO:LATS2)
Epistemic evaluation:
  Value = Possible; Source = Unknown; Basis = Unknown

Biological statement with MedScan/epistemic markup:
  Furthermore, we present evidence that the secretion of nesfatin-1 into the culture
  media was dramatically increased during the differentiation of 3T3-L1 preadipocytes into
  adipocytes (P < 0.001) and after treatments with TNF-alpha, IL-6, insulin, and
  dexamethasone (P < 0.01).
MedScan representation:
  IL-6 → NUCB2 (nesfatin-1)
  Relation: MolTransport; Effect: Positive; CellType: Adipocytes; Cell Line: 3T3-L1
Epistemic evaluation:
  Value = Probable; Source = Author; Basis = Data
Claims Link to Evidence:
The evidence is in data. To structure this:
• There are many different research databases, both generic
(Dryad, Dataverse, DataBank, Zenodo, etc.) and specific
(NIF, IEDA, PDB)
• There are many systems for creating/sharing workflows
(Taverna, MyExperiment, Vistrails, Workflow4Ever)
• There are many e-lab notebooks
(LabGuru, LabArchives, LaBlog, etc.)
• There are scores of
projects, committees, standards, bodies, grants, initiatives and
conferences for discussing and connecting all of this
(KEfED, Pegasus, PROV, RDA, Science
Gateways, Codata, BRDI, Earthcube, etc.)
• … you could make a living out of this!
…but this is what most scientists do:
Using antibodies
and squishy bits
Grad Students experiment
and enter details into their
lab notebook.
The PI then tries to make
sense of their slides,
and writes a paper.
End of story.
One attempt to structure data:
CMU Urban Legend
de Waard, A., Burton, S. et al., 2013
Connecting experimental results:
[Figure labels: Prepare, Analyze, Communicate, Think, Observations]
• Across labs and experiments: track reagents and how they are used
• Compare outcome of interactions with these entities
• Build a ‘virtual reagent spectrogram’ by comparing how different entities
interacted in different experiments
• Reason collectively!
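One way to picture tracking reagents across labs and comparing outcomes is a simple grouping of observation records. This is only a sketch: the record layout, the reagent, lab and entity names, and the functions `reagent_profile` and `disagreements` are all hypothetical placeholders, not a description of any existing system.

```python
from collections import defaultdict

# Hypothetical observation records; every name and outcome here is a
# placeholder for illustration, not data from a real experiment.
observations = [
    {"lab": "lab_A", "reagent": "antibody_1", "target": "entity_X", "outcome": "binds"},
    {"lab": "lab_B", "reagent": "antibody_1", "target": "entity_X", "outcome": "no binding"},
    {"lab": "lab_B", "reagent": "antibody_1", "target": "entity_Y", "outcome": "binds"},
    {"lab": "lab_C", "reagent": "antibody_2", "target": "entity_X", "outcome": "binds"},
]


def reagent_profile(records):
    """Group outcomes by reagent, then by target entity, across all labs."""
    profile = defaultdict(lambda: defaultdict(list))
    for rec in records:
        profile[rec["reagent"]][rec["target"]].append((rec["lab"], rec["outcome"]))
    return profile


def disagreements(profile):
    """Return (reagent, target) pairs for which labs reported different outcomes."""
    return sorted(
        (reagent, target)
        for reagent, targets in profile.items()
        for target, results in targets.items()
        if len({outcome for _, outcome in results}) > 1
    )


profile = reagent_profile(observations)
print(disagreements(profile))
```

Such a grouping only works if reagents are named the same way in every lab, which is the reason to track them with shared identifiers in the first place.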
Connecting experimental results:
NIF Antibodies Registry
collects antibody information:
Step 3: Connect Claims and Evidence
Example: Hunter et al., Hanalyzer:
Example: Drug-Drug Interactions
Boyce, Schroeder et al., 2013
Step 1: Manually identify DDIs and
drug names in a wide collection of
content sources
Step 2: Develop a model of Drug-Drug
Interaction and define candidates
Step 3: Automate this process
and store as Linked Data
Connect recommendations
in clinical guidelines to underlying
evidence
Hoekstra, de Waard and Vdovjak, 2012
Example: do science ON the web:
Using what is known about interactions in fly & yeast,
predict new interactions with a human protein –
Running over data on the web that he neither created nor knew about!
Given a protein P in Species X:
Find proteins similar to P in Species Y
Retrieve interactors in Species Y
Sequence-compare Y-interactors with Species X genome
(1) → Keep only those with homologue in
Find proteins similar to P in Species Z
Retrieve interactors in Species Z
Sequence-compare Z-interactors with (1)
→ Putative interactors in Species X
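The interactor-prediction steps can be sketched with toy lookup tables standing in for the web services and sequence comparisons. Everything below is an assumption made for illustration: the protein names are invented, `HOMOLOGS` and `INTERACTORS` replace real similarity searches, step (1) is read as "keep only those with a homologue in Species X", and the final comparison with (1) is read as a set intersection.

```python
# Toy lookup tables; all protein names are made up for illustration.
HOMOLOGS = {
    # (protein, target species) -> similar proteins in that species
    ("P", "fly"): {"fP"},
    ("P", "yeast"): {"yP"},
    ("fA", "human"): {"hA"},
    ("fB", "human"): {"hB"},
    ("fC", "human"): set(),   # no human homologue, so dropped at step (1)
    ("yA", "human"): {"hA"},
    ("yD", "human"): {"hD"},
}
INTERACTORS = {
    "fly": {"fP": {"fA", "fB", "fC"}},
    "yeast": {"yP": {"yA", "yD"}},
}


def similar(protein, species):
    """Proteins in `species` that resemble `protein` (empty set if none)."""
    return HOMOLOGS.get((protein, species), set())


def candidates_via(protein, via_species, home_species):
    """Map interactors of the protein's homologues in via_species back to home_species."""
    found = set()
    for homologue in similar(protein, via_species):
        for partner in INTERACTORS[via_species].get(homologue, set()):
            found |= similar(partner, home_species)
    return found


def putative_interactors(protein, home_species, species_y, species_z):
    step_1 = candidates_via(protein, species_y, home_species)   # (1) in the slide
    from_z = candidates_via(protein, species_z, home_species)
    return step_1 & from_z   # keep candidates supported by both species


print(putative_interactors("P", "human", "fly", "yeast"))
```

With these toy tables only `hA` is supported by both the fly and the yeast evidence, so it is the single putative interactor returned.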
Great! So we’re almost
done, right – and we can all go
home!
Not so fast…
Both seminomas and the EC component of
nonseminomas share features with ES cells. To
exclude that the detection of miR-371-3 merely
reflects its expression pattern in ES cells, we tested
by RPA miR-302a-d, another ES cells-specific
miRNA cluster (Suh et al, 2004). In many of the
miR-371-3 expressing seminomas and
nonseminomas, miR-302a-d was undetectable (Figs
S7 and S8), suggesting that miR-371-3 expression is
a selective event during tumorigenesis.
Fact: Both seminomas and the EC component of nonseminomas share features with ES cells.
Goal: To exclude that
Hypothesis: the detection of miR-371-3 merely reflects its expression pattern in ES cells,
Method: we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004).
Result: In many of the miR-371-3 expressing seminomas and nonseminomas, miR-302a-d was undetectable (Figs S7 and S8),
Reg-Implication: suggesting that
Implication: miR-371-3 expression is a selective event during tumorigenesis.
Conceptual
knowledge
Experimental
Evidence
What is a claim? In a paragraph?
• Voorhoeve et al., 2006: “These miRNAs neutralize p53-mediated CDK
inhibition, possibly through direct inhibition of the expression of the tumor
suppressor LATS2.”
• Kloosterman and Plasterk, 2006: “In a genetic screen, miR-372 and miR-373
were found to allow proliferation of primary human cells that express
oncogenic RAS and active p53, possibly by inhibiting the tumor suppressor
LATS2 (Voorhoeve et al., 2006).”
• Okada et al., 2011: “Two oncogenic miRNAs, miR-372 and miR-373, directly
inhibit the expression of Lats2, thereby allowing tumorigenic growth in the
presence of p53 (Voorhoeve et al., 2006).”
“[Y]ou can transform .. fiction into fact, just by adding
or subtracting references”, Latour, 1987
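The loss of hedging across the three citing sentences can be made visible with a very small cue-word check. This is a sketch only: the `HEDGES` list is a hypothetical handful of cue words chosen for illustration, and real hedge detection needs a much richer lexicon and syntactic context.

```python
import re

# Hypothetical, deliberately tiny list of hedging cues.
HEDGES = ("possibly", "probably", "may", "might", "suggest", "suggesting", "appears")

# The three sentences quoted on the slide, in order of publication.
citations = [
    ("Voorhoeve et al., 2006",
     "These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct "
     "inhibition of the expression of the tumor suppressor LATS2."),
    ("Kloosterman and Plasterk, 2006",
     "In a genetic screen, miR-372 and miR-373 were found to allow proliferation of "
     "primary human cells that express oncogenic RAS and active p53, possibly by "
     "inhibiting the tumor suppressor LATS2 (Voorhoeve et al., 2006)."),
    ("Okada et al., 2011",
     "Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the expression of "
     "Lats2, thereby allowing tumorigenic growth in the presence of p53 "
     "(Voorhoeve et al., 2006)."),
]


def hedges_in(sentence):
    """Return the hedging cue words found in a sentence, in order of appearance."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [w for w in words if w in HEDGES]


for source, sentence in citations:
    print(source, hedges_in(sentence))
```

The first two sentences each contain "possibly"; the third contains no cue from the list, although it cites the same original paper.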
What is the claim? Who makes it?
> 50 My Papers
2 M scientists
2 My papers/year
Evidence is largely lost….
• Majority of data (90%?) is stored on local hard drives
• Some data (8%?) stored in large, generic data repositories:
Dryad: 7,631 files; Dataverse: 0.6 My; Datacite: 1.5 My
• A small portion of data (1-2%?) stored in small, topic-focused data repositories:
MiRB: 25k; PetDB: 1.5k; TAIR: 72.1k; PDB: 88.3k; SedDB: 0.6k
…or buried..
• In 220 publications only 40% of antibodies, 40% of cell lines and 25% of
constructs can be manually identified (Vasilevsky et al, submitted)
• The good news: we can find automatically
what we can find manually
• Proposal (NIH, June 2013):
– Author is asked to add methods section to a tool
– Tool extracts likely reagents / resources
– User interface asks author to confirm or select
…and you can’t extract it after the fact.
[Figure: counts of 49, 193, 76, 214 and 210 publications]
Entity type   Precision   Recall
Antibody      87.5        63.3
Resource      95.6        98.9
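The slide reports precision and recall only. One common way to combine the two into a single number is the F1 score (their harmonic mean); it is not given in the source and is computed here purely as a worked illustration from the reported figures, which are assumed to be percentages.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported on the slide.
reported = {"Antibody": (87.5, 63.3), "Resource": (95.6, 98.9)}

for entity, (p, r) in reported.items():
    print(f"{entity}: F1 = {f1(p, r):.1f}")
```

This gives roughly 73.5 for antibodies and 97.2 for resources; the lower antibody score comes from its recall of 63.3.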
Even if we can link to evidence:
• Is it true?
In Summary:
We’re not out of the woods
(or a job) just yet!
We need to improve claim networks:
• Can we make systems of computer-readable
meaning that still represent the fullness of
natural language?
>> Let’s work with computational linguists!
• Trace claims across publications:
>> Let’s work with legal/political argumentation
specialists! Sentiment analysis!
Improve evidence: scale up data curation!
[Figure repeated from “Evidence is largely lost….”: where research data is stored]
• Increase data digitisation
• Develop sustainable models
• Improve repository interoperability
Keep asking big questions:
• Is this true?
• Does it matter?
• To whom?
“Let us now build systems that allow a kid in Mali
who wants to learn about proteomics to not be
overwhelmed by the irrelevant and the untrue.”
- John Perry Barlow, iAnnotate, SF 2013
In Memoriam Douglas C. Engelbart, 1925-2013:
“This is an initial summary report of a project taking a new
and systematic approach to improving the intellectual
effectiveness of the individual human being. A detailed
conceptual framework explores the nature of the system
composed of the individual and the tools, concepts, and
methods that match his basic capabilities to his problems.
One of the tools that shows the greatest immediate promise
is the computer, when it can be harnessed for direct on-line
assistance, integrated with new concepts and methods.”
Summary:
• The problem: life is difficult.
• One approach to tackle this: claim-evidence
networks:
– Find claims
– Identify evidence
– Connect the two.
• But we still need:
– Better ways to represent subtlety of natural language
– Better evidence: more structured, better connected
– Focus on the big questions.
• There’s a lot of work to do!
Collaborations and discussions gratefully acknowledged:
• CMU: Nathan Urban, Shreejoy Tripathy, Shawn Burton, Ed Hovy
• UCSD: Phil Bourne, Brian Shoettlander, Ilya Zaslavsky
• NIF: Maryann Martone, Anita Bandrowski
• MSU: Brian Bothner
• OHSU: Melissa Haendel, Nicole Vasilevsky
• CDL: Carly Strasser, John Kunze, Stephen Abrams
• Harvard/MGH: Tim Clark, Paolo Ciccarese
• VU: Rinke Hoekstra, Frank van Harmelen, Paul Groth
• Columbia/IEDA: Kerstin Lehnert, Leslie Hsu
• University of Pittsburgh: Richard Boyce
• Xerox Research Europe: Agnes Sandor
• DERI: Jodi Schneider
Thank you!
References:
• de Waard, Buckingham Shum, Park, Samwald, Sandor, 2009: Hypotheses, Evidence and Relationships, ISWC2009
• Biological Expression Language – http://www.openbel.org
• Latour, B. and Woolgar, S., Laboratory Life: the Social Construction of Scientific Facts, 1979, Sage Publications
• Latour, B., Science in Action, 1987
• de Waard, A. and Pander Maat, H. (2012). Epistemic Modality and Knowledge Attribution in Scientific Discourse: A
Taxonomy of Types and Overview of Features. Proceedings of the 50th Annual Meeting of the Association for
Computational Linguistics, pages 47–55, Jeju, Republic of Korea, 12 July 2012.
• Data2Semantics project: http://www.data2semantics.org/
• Sándor, Àgnes and de Waard, Anita, (2012). Identifying Claimed Knowledge Updates in Biomedical Research
Articles, Workshop on Detecting Structure in Scholarly Discourse, ACL 2012.
• de Waard, A. and Schneider, J. (2012) Formalising Uncertainty: An Ontology of Reasoning, Certainty and Attribution
(ORCA), Semantic Technologies Applied to Biomedical Informatics and Individualized Medicine workshop, ISWC 2012
• de Waard, A., Burton, S.D., Gerkin, R.C., Harviston, M., Marques, D., Tripathy, S.J., Urban, N.N., Creating an Urban
Legend: A System for Electrophysiology Data Management and Exploration, Discovery Informatics, 2013
• Boyce, R.D., Horn, J.R., Hassanzadeh, O., de Waard, A., Schneider, J., Luciano, J.S., Liakata, M., Dynamic enhancement of
drug product labels to support drug safety, efficacy, and effectiveness. Jnl of Biomedical Semantics, 2013, 4:5.
• Hoekstra, R., de Waard, A., Vdovjak, R. (2012) Annotating Evidenced Based Clinical Guidelines - A Lightweight
Ontology, Proceedings of SWAT4LS 2012, Paris, Adrian Paschke, Albert Burger, Paolo Roma, M. Scott Marshall, Andrea
Splendiani (ed.), Springer.
http://researchdata.elsevier.com/

More Related Content

What's hot

Why Scientist Analyze Single Cells
Why Scientist Analyze Single CellsWhy Scientist Analyze Single Cells
Why Scientist Analyze Single CellsQIAGEN
 
Human genetic diversity and origin of major human groups
Human genetic diversity and origin of major human groupsHuman genetic diversity and origin of major human groups
Human genetic diversity and origin of major human groupsMayank Sagar
 
GROUP 7- KNOCK IN MOUSE MODEL USED FOR DISEASE MODELLING
GROUP 7- KNOCK IN MOUSE MODEL USED FOR DISEASE MODELLINGGROUP 7- KNOCK IN MOUSE MODEL USED FOR DISEASE MODELLING
GROUP 7- KNOCK IN MOUSE MODEL USED FOR DISEASE MODELLINGVinitha Govindan Rajan
 
Development of animal model (Knockout Mice)
Development of animal model   (Knockout Mice)Development of animal model   (Knockout Mice)
Development of animal model (Knockout Mice)AnilBehera8
 
Marine Host-Microbiome Interactions: Challenges and Opportunities
Marine Host-Microbiome Interactions: Challenges and OpportunitiesMarine Host-Microbiome Interactions: Challenges and Opportunities
Marine Host-Microbiome Interactions: Challenges and OpportunitiesJonathan Eisen
 
Biocuration: Deciphering the draft genome of Asian Citrus Psyllid one gene at...
Biocuration: Deciphering the draft genome of Asian Citrus Psyllid one gene at...Biocuration: Deciphering the draft genome of Asian Citrus Psyllid one gene at...
Biocuration: Deciphering the draft genome of Asian Citrus Psyllid one gene at...Prashant Hosmani
 
The genomics of why there are so many species of bats
The genomics of why there are so many species of batsThe genomics of why there are so many species of bats
The genomics of why there are so many species of batsLiliana Davalos
 
Rapid Impact Assessment of Climatic and Physio-graphic Changes on Flagship G...
Rapid Impact Assessment of Climatic and Physio-graphic Changes  on Flagship G...Rapid Impact Assessment of Climatic and Physio-graphic Changes  on Flagship G...
Rapid Impact Assessment of Climatic and Physio-graphic Changes on Flagship G...Arvinder Singh
 
Genomic Insights into bat Senses
Genomic Insights into bat SensesGenomic Insights into bat Senses
Genomic Insights into bat SensesLiliana Davalos
 
Journal Club presentation on the correlation between telomere shortening rate...
Journal Club presentation on the correlation between telomere shortening rate...Journal Club presentation on the correlation between telomere shortening rate...
Journal Club presentation on the correlation between telomere shortening rate...CristinaCardonaBarrena
 
Quantified Self On Being A Personal Genomic Observatory
Quantified Self On Being A Personal Genomic ObservatoryQuantified Self On Being A Personal Genomic Observatory
Quantified Self On Being A Personal Genomic ObservatoryLarry Smarr
 
Lewins genes xi [pdf][tahir99] vrg
Lewins genes xi [pdf][tahir99] vrgLewins genes xi [pdf][tahir99] vrg
Lewins genes xi [pdf][tahir99] vrgPabitra Augasti
 
SJawdy_CV_June2016_no_personal
SJawdy_CV_June2016_no_personalSJawdy_CV_June2016_no_personal
SJawdy_CV_June2016_no_personalSara Jawdy
 
Scott Edmunds at DataCite 2012: Adventures in Data Citation
Scott Edmunds at DataCite 2012: Adventures in Data CitationScott Edmunds at DataCite 2012: Adventures in Data Citation
Scott Edmunds at DataCite 2012: Adventures in Data CitationGigaScience, BGI Hong Kong
 
AI Decision System and Longevity
AI Decision System and LongevityAI Decision System and Longevity
AI Decision System and LongevityVeerendra Raju
 
Discuss an example of knockout mouse model used for disease modelling (Metast...
Discuss an example of knockout mouse model used for disease modelling (Metast...Discuss an example of knockout mouse model used for disease modelling (Metast...
Discuss an example of knockout mouse model used for disease modelling (Metast...SaniikaRenganadan
 
Neuroscience: Transforming Visual Percepts into Memories
Neuroscience: Transforming Visual Percepts into MemoriesNeuroscience: Transforming Visual Percepts into Memories
Neuroscience: Transforming Visual Percepts into Memoriesmustafa sarac
 

What's hot (20)

Why Scientist Analyze Single Cells
Why Scientist Analyze Single CellsWhy Scientist Analyze Single Cells
Why Scientist Analyze Single Cells
 
Human genetic diversity and origin of major human groups
Human genetic diversity and origin of major human groupsHuman genetic diversity and origin of major human groups
Human genetic diversity and origin of major human groups
 
GROUP 7- KNOCK IN MOUSE MODEL USED FOR DISEASE MODELLING
GROUP 7- KNOCK IN MOUSE MODEL USED FOR DISEASE MODELLINGGROUP 7- KNOCK IN MOUSE MODEL USED FOR DISEASE MODELLING
GROUP 7- KNOCK IN MOUSE MODEL USED FOR DISEASE MODELLING
 
Development of animal model (Knockout Mice)
Development of animal model   (Knockout Mice)Development of animal model   (Knockout Mice)
Development of animal model (Knockout Mice)
 
Marine Host-Microbiome Interactions: Challenges and Opportunities
Marine Host-Microbiome Interactions: Challenges and OpportunitiesMarine Host-Microbiome Interactions: Challenges and Opportunities
Marine Host-Microbiome Interactions: Challenges and Opportunities
 
BioPosterPP
BioPosterPPBioPosterPP
BioPosterPP
 
In a Different Class?
In a Different Class?In a Different Class?
In a Different Class?
 
sequencing-methods-review
sequencing-methods-reviewsequencing-methods-review
sequencing-methods-review
 
Biocuration: Deciphering the draft genome of Asian Citrus Psyllid one gene at...
Biocuration: Deciphering the draft genome of Asian Citrus Psyllid one gene at...Biocuration: Deciphering the draft genome of Asian Citrus Psyllid one gene at...
Biocuration: Deciphering the draft genome of Asian Citrus Psyllid one gene at...
 
The genomics of why there are so many species of bats
The genomics of why there are so many species of batsThe genomics of why there are so many species of bats
The genomics of why there are so many species of bats
 
Rapid Impact Assessment of Climatic and Physio-graphic Changes on Flagship G...
Rapid Impact Assessment of Climatic and Physio-graphic Changes  on Flagship G...Rapid Impact Assessment of Climatic and Physio-graphic Changes  on Flagship G...
Rapid Impact Assessment of Climatic and Physio-graphic Changes on Flagship G...
 
Genomic Insights into bat Senses
Genomic Insights into bat SensesGenomic Insights into bat Senses
Genomic Insights into bat Senses
 
Journal Club presentation on the correlation between telomere shortening rate...
Journal Club presentation on the correlation between telomere shortening rate...Journal Club presentation on the correlation between telomere shortening rate...
Journal Club presentation on the correlation between telomere shortening rate...
 
Quantified Self On Being A Personal Genomic Observatory
Quantified Self On Being A Personal Genomic ObservatoryQuantified Self On Being A Personal Genomic Observatory
Quantified Self On Being A Personal Genomic Observatory
 
Lewins genes xi [pdf][tahir99] vrg
Lewins genes xi [pdf][tahir99] vrgLewins genes xi [pdf][tahir99] vrg
Lewins genes xi [pdf][tahir99] vrg
 
SJawdy_CV_June2016_no_personal
SJawdy_CV_June2016_no_personalSJawdy_CV_June2016_no_personal
SJawdy_CV_June2016_no_personal
 
Scott Edmunds at DataCite 2012: Adventures in Data Citation
Scott Edmunds at DataCite 2012: Adventures in Data CitationScott Edmunds at DataCite 2012: Adventures in Data Citation
Scott Edmunds at DataCite 2012: Adventures in Data Citation
 
AI Decision System and Longevity
AI Decision System and LongevityAI Decision System and Longevity
AI Decision System and Longevity
 
Discuss an example of knockout mouse model used for disease modelling (Metast...
Discuss an example of knockout mouse model used for disease modelling (Metast...Discuss an example of knockout mouse model used for disease modelling (Metast...
Discuss an example of knockout mouse model used for disease modelling (Metast...
 
Neuroscience: Transforming Visual Percepts into Memories
Neuroscience: Transforming Visual Percepts into MemoriesNeuroscience: Transforming Visual Percepts into Memories
Neuroscience: Transforming Visual Percepts into Memories
 

Viewers also liked

Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In...
 Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In... Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In...
Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In...Anita de Waard
 
Linking data to publications: Towards the execution of papers
Linking data to publications: Towards the execution of papersLinking data to publications: Towards the execution of papers
Linking data to publications: Towards the execution of papersAnita de Waard
 
Looking for Data: Finding New Science
Looking for Data: Finding New ScienceLooking for Data: Finding New Science
Looking for Data: Finding New ScienceAnita de Waard
 

Viewers also liked (6)

Unknown Unknowns
Unknown UnknownsUnknown Unknowns
Unknown Unknowns
 
Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In...
 Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In... Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In...
Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In...
 
Ncbo webinar force11
Ncbo webinar force11Ncbo webinar force11
Ncbo webinar force11
 
deWaardAAMC2012
deWaardAAMC2012deWaardAAMC2012
deWaardAAMC2012
 
Linking data to publications: Towards the execution of papers
Linking data to publications: Towards the execution of papersLinking data to publications: Towards the execution of papers
Linking data to publications: Towards the execution of papers
 
Looking for Data: Finding New Science
Looking for Data: Finding New ScienceLooking for Data: Finding New Science
Looking for Data: Finding New Science
 

Why Life is Difficult, and What We Might Do About It

  • 1. Why Research Data Management May Save Science Anita de Waard VP Research Data Collaborations a.dewaard@elsevier.com http://researchdata.elsevier.com/ Why Life is Difficult, And What We Can Do About It
  • 2. Outline: • The problem: life is difficult. • One approach to tackling this: claim-evidence networks. – How do we find claims? – How do we find evidence? – How do we connect the two? • What is still missing? • Call to action!
  • 4. Problem 1: a rose is not a rose: • “…there was significant variability of the injected venom composition from specimen to specimen, in spite of their common biogeographic origin.” Jose A. Rivera-Ortiz, Herminsul Cano, Frank Marí, Intraspecies variability of the injected venom of Conus ermineus, doi:10.1016/j.peptides.2010.11.014 • “…Strains DV-3/84 DV-7/84 (group 3) showed 76.6% similarity to each other and were similar to all other strains at the 67.6% level.” Zofia Dzierżewicz et al., Intraspecies variability of Desulfovibrio desulfuricans strains determined by the genetic profiles, FEMS Microbiology Letters, Volume 219, Issue 1, 14 February 2003, Pages 69–74, doi:10.1016/S0378-1097(02)01199-0 => A specimen is not a species!
  • 5. Problem 2: gene expression varies with: Age: “SIRT1-Associated genes are deregulated in the aged brain” Philipp Oberdoerffer et al., SIRT1 Redistribution on Chromatin Promotes Genomic Stability but Alters Gene Expression during Aging, Cell, Volume 135, Issue 5, 28 November 2008, Pages 907–918, doi:10.1016/j.cell.2008.10.025 Smell: “…major urinary proteins […] mediate the pregnancy blocking effects of male urine” P.A. Brennan et al., Patterns of expression of the immediate-early gene egr-1 in the accessory olfactory bulb of female mice exposed to pheromonal constituents of male urine, Neuroscience, Volume 90, Issue 4, June 1999, Pages 1463–1470, doi:10.1016/S0306-4522(98)00556-9 Hunger: “Out of the ~30K genes, about 10K are differentially expressed in liver cells when an animal is in different states of satiety.” Zhang F, Xu X, Zhou B, He Z, Zhai Q (2011) Gene Expression Profile Change and Associated Physiological and Pathological Effects in Mouse Liver Induced by Fasting and Refeeding. PLoS ONE 6(11): e27553. doi:10.1371/journal.pone.002755 Light: “Longer-term enrichment training also altered the mRNA levels of many genes associated with structural changes that occur during neuronal growth.” Cailotto C., et al. (2009) Effects of Nocturnal Light on (Clock) Gene Expression in Peripheral Organs: A Role for the Autonomic Innervation of the Liver. PLoS ONE 4(5): e5650. doi:10.1371/journal.pone.0005650 => Knowing genes is not knowing how they are expressed!
  • 6. Problem 3: No man (or mouse) is an island… • “We found the diversity and abundance of each habitat’s signature microbes to vary widely even among healthy subjects, with strong niche specialization both within and among individuals.” The Human Microbiome Project Consortium, Structure, function and diversity of the healthy human microbiome, Nature 486, 207–214 (14 June 2012) doi:10.1038/nature11234 • “Colonization of an infant’s gastrointestinal tract begins at birth. The acquisition and normal development of the neonatal microflora is vital for the healthy maturation of the immune system.” Mackie RI, Sghir A, Gaskins HR., Developmental microbial ecology of the neonatal gastrointestinal tract. Am J Clin Nutr. 1999 May;69(5):1035S-1045S => An animal is an ecosystem!
  • 7. Problem 4: Interactions create more complexity: • Computing cancer: “No amount of information about what happens inside a single cell can ever tell you what a tissue is going to do,” [Glazier] said. “Much of the information and complexity of tissues and life is embedded in the way cells talk to each other and the extracellular environment.” • Megadata: “These complex emergent systems are impossible to understand,” “[we] founded Applied Proteomics to create a protein diagnostic that reveals not just where a cancer is, but how it interacts with the body..” Nature Special Issue Vol. 491 No. 7425 ‘Physical Scientists Take On Cancer’ => The whole is more than the sum of its parts!
  • 8. Big problems in biology: http://en.wikipedia.org/wiki/File:Duck_of_Vaucanson.jpg 1. Intraspecies variability > A specimen is not a species! 2. Gene expression variability > Knowing genes is not knowing how they are expressed! 3. Microbiome > An animal is an ecosystem! 4. Systems biology > The whole is more than the sum of its parts! 5. Models vs. experiment > Are we talking about the same things? In a way we can all use? 6. Dynamics > Life is not in equilibrium! Life is complicated! Reductionism doesn’t work for living systems.
  • 9. Statistics could help! With enough observations, trends and anomalies can be detected: • “Here we present resources from a population of 242 healthy adults sampled at 15 or 18 body sites up to three times, which have generated 5,177 microbial taxonomic profiles from 16S ribosomal RNA genes and over 3.5 terabases of metagenomic sequence so far.” The Human Microbiome Project Consortium, Structure, function and diversity of the healthy human microbiome, Nature 486, 207–214 (14 June 2012) doi:10.1038/nature11234 • “The large sample size — 4,298 North Americans of European descent and 2,217 African Americans — has enabled the researchers to mine down into the human genome.” Nidhi Subbaraman, Nature News, 28 November 2012, High-resolution sequencing study emphasizes importance of rare variants in disease.
  • 10. But biological research is insular! • Biology is small: size 10^-5 – 10^2 m, scientist can work alone (‘King’ and ‘subjects’). • Biology is messy: it doesn’t happen behind a terminal. • Biology is competitive: many people with similar skill sets, vying for the same grants • In summary: the structure of biological research does not inherently promote collaboration (vs., for instance, HE physics or astronomy (and they’re not all they’re cracked up to be, either…)). Prepare Observe Analyze Ponder Communicate
  • 11. How Can We Connect This Knowledge?
  • 12. Claim-Evidence Networks Offer A Model for Connecting Knowledge: Experimental Evidence
  • 13. Converging on Claim/Evidence/Networks, e.g. here: • The Karyotype Ontology: a computational representation for human cytogenetic patterns. Jennifer Warrender and Phillip Lord • Lexical Analysis and Characterization of the OBOFoundry Ontologies. Manuel Quesada-Martínez, Jesualdo Tomás Fernández-Breis and Robert Stevens • Exomiser: improved exome prioritization of disease genes through cross species phenotype comparison. Peter Robinson, Sebastian Köhler, Anika Oellrich, Kai Wang, Chris Mungall, Suzanna E. Lewis, Sebastian Bauer, Dominik Seelow, Peter Krawitz, Christian Gilissen, Melissa Haendel and Damian Smedley • BioAssay Ontology (BAO): Modularization, Integration and Applications. Uma Vempati, Hande Kucuk, Saminda Abeyruwan, Ubbo Visser, Vance Lemmon, Ahsan Mir and Stephan Schürer • eXframe: A Semantic Web Platform for Genomics Experiments. Emily Merrill, Stephane Corlosquet, Paolo Ciccarese, Tim Clark and Sudeshna Das • Ovopub: Modular data publication with minimal provenance. Alison Callahan and Michel Dumontier • Zooma – A tool for automated ontology annotation. Tony Burdett, Simon Jupp, James Malone, Helen Parkinson, Eleanor Williams and Adam Faulconbridge • A Probabilistic Framework for Ontology-Based Annotation in Neuroimaging Literature. Chayan Chakrabarti, Thomas B. Jones, Jiawei F. Xu, George F. Luger, Angela R. Laird, Matthew D. Turner and Jessica A. Turner • Preserving sequence annotations across reference sequences. Zuotian Tatum, Andrew Gibson, Marco Roos, Peter E.M. Taschner, Mark Thompson, Erik A. Schultes and Jeroen F. J. Laros • A Taxonomy for Immunologists. James A. Overton, Randi Vita, Jason A. Greenbaum, Heiko Dietze, Alessandro Sette and Bjoern Peters • Health Data Ontology Trunk: A middle-layer ontology for health-care. Ulf Schwarz, Luc Schneider, Emilio Sanfilippo, Holger Stenzhorn and Nikolina Koleva • Structured representation of scientific evidence using semantic web techniques – a biochemistry use case. Christian Bölling, Michael Weidlich and Hermann-Georg Holzhütter • Synthetic Biology Open Language Visual: an ontological use case. Jacqueline Quinn, Michal Galdzicki, Robert
  • 14. Step 1: Find claims: E.g., using XIP for discourse analysis: “In contrast with previous hypotheses compact plaques form before significant deposition of diffuse A beta, suggesting that different mechanisms are involved in the deposition of diffuse amyloid and the aggregation into plaques.” (Figure labels: entities, relationships, temporality, connections, thematic roles, status; core information (proposition); information extraction; rhetorical metadiscourse; discourse analysis; discourse structure.) Sándor, Ágnes and de Waard, Anita (2012).
  • 15. Finding Claimed Knowledge Updates (CKUs): Sándor, Á. and de Waard, A. (2012) Examples: “Here we used mass spectrometry to identify HuD as a novel neuronal SMN-interacting partner”; “Our analysis of known HuD-associated mRNAs in neurons identified cpg15 mRNA as a highly abundant mRNA in HuD IPs”; “Our finding that SMN protein associates with HuD protein and the HuD target cpg15 mRNA in neurons …” Definition: 1) A CKU expresses a verbal or nominal proposition about biological entities. 2) A CKU is a new proposition. 3) The authors present the CKU as factual. 4) A CKU is derived from the experimental work described in the article. 5) The ownership of the proposition is attributed to the author(s) of the article. 6) 4) and 5) are either explicitly expressed or are implicitly conveyed by a structural position as title, section or caption title.
  • 16. Allow for Hedging and Uncertainty: Ontology of Reasoning, Certainty and Attribution (ORCA) For a Proposition P, an epistemically marked clause E is an evaluation of P, E_{V,B,S}(P), with: – V = Value: 3 = assumed true, 2 = probable, 1 = possible, 0 = unknown (-1 = possibly untrue, -2 = probably untrue, -3 = assumed untrue) – B = Basis: Reasoning, Data – S = Source: A = speaker is author A, explicit; IA = speaker is author A, implicit; N = other author N, explicit; NN = other author NN, implicit. Based on a conversation with Ed Hovy; de Waard, A. and Schneider, J. (2012)
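The ORCA triple on this slide can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the class and member names are assumptions made here, and the `UNKNOWN` members for Basis and Source are borrowed from the "Basis = Unknown" and "Source = Unknown" labels used in the worked examples on the following slide.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class Value(IntEnum):
    """Certainty scale from the slide, from -3 (assumed untrue) to 3 (assumed true)."""
    ASSUMED_UNTRUE = -3
    PROBABLY_UNTRUE = -2
    POSSIBLY_UNTRUE = -1
    UNKNOWN = 0
    POSSIBLE = 1
    PROBABLE = 2
    ASSUMED_TRUE = 3


class Basis(Enum):
    REASONING = "Reasoning"
    DATA = "Data"
    UNKNOWN = "Unknown"  # assumption: taken from the slide 17 examples


class Source(Enum):
    AUTHOR_EXPLICIT = "A"
    AUTHOR_IMPLICIT = "IA"
    OTHER_EXPLICIT = "N"
    OTHER_IMPLICIT = "NN"
    UNKNOWN = "Unknown"  # assumption: taken from the slide 17 examples


@dataclass(frozen=True)
class Evaluation:
    """An epistemic evaluation E_{V,B,S}(P) of a proposition P."""
    proposition: str
    value: Value
    basis: Basis
    source: Source

    def label(self) -> str:
        # Render the evaluation in a compact, human-readable form.
        return (f"E[V={int(self.value)}, B={self.basis.value}, "
                f"S={self.source.value}]({self.proposition})")


# The second example on slide 17: Value = Probable, Source = Author, Basis = Data.
nesfatin = Evaluation(
    proposition="IL-6 increases secretion of nesfatin-1",
    value=Value.PROBABLE,
    basis=Basis.DATA,
    source=Source.AUTHOR_EXPLICIT,
)
print(nesfatin.label())
```

Using an ordered integer scale for Value makes it possible to compare or filter claims by certainty, while Basis and Source stay as plain categories.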
  • 17. Turning claims into formal representations: • Biological statement with BEL/epistemic markup: “These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor-suppressor LATS2.” BEL representation: r(MIR:miR-372) -| (tscript(p(HUGO:Trp53)) -| kin(p(PFH:"CDK Family"))); “Increased abundance of miR-372 decreases abundance of LATS2”: r(MIR:miR-372) -| r(HUGO:LATS2). Epistemic evaluation: Value = Possible, Source = Unknown, Basis = Unknown • Biological statement with MedScan/epistemic markup: “Furthermore, we present evidence that the secretion of nesfatin-1 into the culture media was dramatically increased during the differentiation of 3T3-L1 preadipocytes into adipocytes (P < 0.001) and after treatments with TNF-alpha, IL-6, insulin, and dexamethasone (P < 0.01).” MedScan representation: IL-6 → NUCB2 (nesfatin-1); Relation: MolTransport; Effect: Positive; CellType: Adipocytes; Cell Line: 3T3-L1. Epistemic evaluation: Value = Probable, Source = Author, Basis = Data
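To make the BEL strings on this slide concrete, here is a minimal sketch that splits a BEL-style statement into subject, relation and object at the first relation operator outside any parentheses. It is an illustrative toy, not the OpenBEL parser: the function and type names are hypothetical, and only the four short-form relation operators listed in `RELATIONS` are handled.

```python
from typing import NamedTuple, Optional

# Short-form BEL relation operators and their long-form names.
RELATIONS = {
    "-|": "decreases",
    "->": "increases",
    "=|": "directlyDecreases",
    "=>": "directlyIncreases",
}


class Statement(NamedTuple):
    subject: str
    relation: str
    obj: str


def split_statement(text: str) -> Optional[Statement]:
    """Split at the first relation operator found at parenthesis depth 0."""
    depth = 0
    in_quotes = False
    for i, ch in enumerate(text):
        if ch == '"':
            in_quotes = not in_quotes      # ignore anything inside quoted names
        elif in_quotes:
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0 and text[i:i + 2] in RELATIONS:
            return Statement(text[:i].strip(),
                             RELATIONS[text[i:i + 2]],
                             text[i + 2:].strip())
    return None  # a bare term with no relation


simple = split_statement("r(MIR:miR-372) -| r(HUGO:LATS2)")
print(simple)
```

For the nested statement on the slide, the object comes back as a parenthesised statement; stripping the outer parentheses and calling `split_statement` again yields the inner subject and object.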
  • 18. Claims Link to Evidence:
  • 19. The evidence is in data. To structure this: • There are many different research databases, both generic (Dryad, Dataverse, DataBank, Zenodo, etc.) and specific (NIF, IEDA, PDB) • There are many systems for creating/sharing workflows (Taverna, MyExperiment, Vistrails, Workflow4Ever) • There are many e-lab notebooks (LabGuru, LabArchives, LaBlog, etc.) • There are scores of projects, committees, standards, bodies, grants, initiatives and conferences for discussing and connecting all of this (KEfED, Pegasus, PROV, RDA, Science Gateways, Codata, BRDI, Earthcube, etc.) • … you could make a living out of this!
  • 20. …but this is what most scientists do: Using antibodies and squishy bits Grad Students experiment and enter details into their lab notebook. The PI then tries to make sense of their slides, and writes a paper. End of story.
  • 21. One attempt to structure data: CMU Urban Legend de Waard, A., Burton, S. et al., 2013
  • 22. Connecting experimental results: Across labs, experiments: track reagents and how they are used. [Figure: parallel experiment workflows with the stages Prepare, Analyze, Communicate, each producing Observations.]
  • 23. Connecting experimental results: Compare outcome of interactions with these entities. [Figure: parallel experiment workflows with the stages Prepare, Analyze, Communicate, each producing Observations.]
  • 24. Connecting experimental results: Build a ‘virtual reagent spectrogram’ by comparing how different entities interacted in different experiments. Think Reason collectively! [Figure: parallel experiment workflows with the stages Prepare, Analyze, Communicate, each producing Observations.]
  • 25. NIF Antibodies Registry collects antibody information:
  • 26. Step 3: Connect Claims and Evidence Example: Hunter et al., Hanalyzer:
  • 27. Example: Drug-Drug Interactions (Boyce, Schroeder et al., 2013): Step 1: Manually identify DDIs and drug names in a wide collection of content sources. Step 2: Develop a model of Drug-Drug Interaction and define candidates. Step 3: Automate this process and store as Linked Data.
  • 28. Example: Connect recommendations in clinical guidelines to underlying evidence (Hoekstra, de Waard and Vdovjak, 2012)
  • 29. Example: do science ON the web: Using what is known about interactions in fly & yeast, predict new interactions with a human protein – Running over data on the web that he neither created nor knew about! Given a protein P in Species X: • Find proteins similar to P in Species Y • Retrieve interactors in Species Y • Sequence-compare Y-interactors with Species X genome (1) → Keep only those with homologue in • Find proteins similar to P in Species Z • Retrieve interactors in Species Z • Sequence-compare Z-interactors with (1) → Putative interactors in Species X
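The cross-species workflow on this slide can be sketched in a few lines. This is a toy under stated assumptions, not the system shown in the talk: all protein names and lookup tables are hypothetical stand-ins for web data sources, sequence comparison is reduced to a dictionary lookup, the truncated "keep only those with homologue in" step is assumed to mean a homologue in Species X, and the comparison of Z-interactors with set (1) is modelled as a set intersection.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy data standing in for remote services.
# (protein in X, other species) -> similar proteins in that species
SIMILAR: Dict[Tuple[str, str], Set[str]] = {
    ("P_human", "fly"): {"P_fly"},
    ("P_human", "yeast"): {"P_yeast"},
}
# protein -> known interaction partners in the same species
INTERACTORS: Dict[str, Set[str]] = {
    "P_fly": {"A_fly", "B_fly"},
    "P_yeast": {"A_yeast", "C_yeast"},
}
# stand-in for sequence comparison against the Species X genome
HOMOLOGUE_IN_X: Dict[str, str] = {
    "A_fly": "A_human",
    "B_fly": "B_human",
    "A_yeast": "A_human",
}


def candidates_via(protein: str, species: str) -> Set[str]:
    """Species-X homologues of the interactors of proteins similar to `protein`."""
    found: Set[str] = set()
    for similar in SIMILAR.get((protein, species), set()):
        for partner in INTERACTORS.get(similar, set()):
            homologue = HOMOLOGUE_IN_X.get(partner)
            if homologue is not None:      # keep only those with a homologue
                found.add(homologue)
    return found


def putative_interactors(protein: str, species_y: str, species_z: str) -> Set[str]:
    set_1 = candidates_via(protein, species_y)   # result set (1) on the slide
    via_z = candidates_via(protein, species_z)
    return set_1 & via_z                         # supported by both species


print(putative_interactors("P_human", "fly", "yeast"))
```

Requiring support from both species is the conservative reading of the slide; taking the union instead would trade precision for recall.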
  • 30. Great! So we’re almost done, right – and we can all go home! Not so fast…
  • 31. What is a claim? In a paragraph? “Both seminomas and the EC component of nonseminomas share features with ES cells. To exclude that the detection of miR-371-3 merely reflects its expression pattern in ES cells, we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004). In many of the miR-371-3 expressing seminomas and nonseminomas, miR-302a-d was undetectable (Figs S7 and S8), suggesting that miR-371-3 expression is a selective event during tumorigenesis.” [Figure: the paragraph shown twice, with its clauses labelled Fact, Hypothesis, Method, Result, Implication, Goal and Reg-Implication, and with the categories Conceptual knowledge and Experimental Evidence.]
  • 32. What is the claim? Who makes it? • Voorhoeve et al., 2006: “These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor suppressor LATS2.” • Kloosterman and Plasterk, 2006: “In a genetic screen, miR-372 and miR-373 were found to allow proliferation of primary human cells that express oncogenic RAS and active p53, possibly by inhibiting the tumor suppressor LATS2 (Voorhoeve et al., 2006).” • Okada et al., 2011: “Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the expression of Lats2, thereby allowing tumorigenic growth in the presence of p53 (Voorhoeve et al., 2006).” • “[Y]ou can transform .. fiction into fact, just by adding or subtracting references”, Latour, 1987
  • 33. Evidence is largely lost…: > 50 My papers; 2 M scientists; 2 My papers/year • Majority of data (90%?) is stored on local hard drives • Some data (8%?) stored in large, generic data repositories (Dryad: 7,631 files; Dataverse: 0.6 My; Datacite: 1.5 My) • A small portion of data (1-2%?) stored in small, topic-focused data repositories (MiRB: 25 k; PetDB: 1.5 k; TAIR: 72.1 k; PDB: 88.3 k; SedDB: 0.6 k)
  • 35. …and you can’t extract it after the fact. • In 220 publications only 40% of antibodies, 40% of cell lines and 25% of constructs can be manually identified (Vasilevsky et al, submitted) • The good news: we can find automatically what we can find manually • Proposal (NIH, June 2013): – Author is asked to add methods section to a tool – Tool extracts likely reagents / resources – User interface asks author to confirm or select • Table: Entity Type | Precision | Recall: Antibody | 87.5 | 63.3; Resource | 95.6 | 98.9
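One way to compare the two rows of precision and recall on this slide is the F1 score, the harmonic mean of the two. The sketch below is an illustration only: the slide does not report F1, and it assumes the figures are percentages.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision / recall as given on the slide (assumed to be percentages).
reported = {
    "Antibody": (87.5, 63.3),
    "Resource": (95.6, 98.9),
}

for entity_type, (precision, recall) in reported.items():
    print(f"{entity_type}: F1 = {f1_score(precision, recall):.1f}")
```

Because the harmonic mean is pulled toward the lower of the two values, the lower antibody recall dominates its score.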
  • 36. Even if we can link to evidence: • Is it true?
  • 37. In Summary: We’re not out of the woods (or a job) just yet!
  • 38. We need to improve claim networks: • Can we make systems of computer-readable meaning that still represent the fullness of natural language? >> Let’s work with computational linguists! • Trace claims across publications: >> Let’s work with legal/political argumentation specialists! Sentiment analysis!
  • 39. Improve evidence: scale up data curation! > 50 My papers; 2 M scientists; 2 My papers/year • Majority of data (90%?) is stored on local hard drives • Some data (8%?) stored in large, generic data repositories (Dryad: 7,631 files; Dataverse: 0.6 My; Datacite: 1.5 My) • A small portion of data (1-2%?) stored in small, topic-focused data repositories (MiRB: 25 k; PetDB: 1.5 k; TAIR: 72.1 k; PDB: 88.3 k; SedDB: 0.6 k) • INCREASE DATA DIGITISATION • DEVELOP SUSTAINABLE MODELS • IMPROVE REPOSITORY INTEROPERABILITY
  • 40. Keep asking big questions: • Is this true? • Does it matter? • To whom? “Let us now build systems that allow a kid in Mali who wants to learn about proteomics to not be overwhelmed by the irrelevant and the untrue.” - John Perry Barlow, iAnnotate, SF 2013
  • 41. In Memoriam Douglas C. Engelbart, 1925-2013: “This is an initial summary report of a project taking a new and systematic approach to improving the intellectual effectiveness of the individual human being. A detailed conceptual framework explores the nature of the system composed of the individual and the tools, concepts, and methods that match his basic capabilities to his problems. One of the tools that shows the greatest immediate promise is the computer, when it can be harnessed for direct on-line assistance, integrated with new concepts and methods.”
  • 42. Summary: • The problem: life is difficult. • One approach to tackle this: claim-evidence networks: – Find claims – Identify evidence – Connect the two. • But we still need: – Better ways to represent subtlety of natural language – Better evidence: more structured, better connected – Focus on the big questions. • There’s a lot of work to do!
  • 43. Collaborations and discussions gratefully acknowledged: • CMU: Nathan Urban, Shreejoy Tripathy, Shawn Burton, Ed Hovy • UCSD: Phil Bourne, Brian Schottlaender, Ilya Zaslavsky • NIF: Maryann Martone, Anita Bandrowski • MSU: Brian Bothner • OHSU: Melissa Haendel, Nicole Vasilevsky • CDL: Carly Strasser, John Kunze, Stephen Abrams • Harvard/MGH: Tim Clark, Paolo Ciccarese • VU: Rinke Hoekstra, Frank van Harmelen, Paul Groth • Columbia/IEDA: Kerstin Lehnert, Leslie Hsu • University of Pittsburgh: Richard Boyce • Xerox Research Europe: Agnes Sandor • DERI: Jodi Schneider Thank you!
  • 44. References: • de Waard, Buckingham Shum, Park, Samwald, Sandor, 2009: Hypotheses, Evidence and Relationships, ISWC2009 • Biological Expression Language – http://www.openbel.org • Latour, B. and Woolgar, S., Laboratory Life: the Social Construction of Scientific Facts, 1979, Sage Publications • Latour, B., Science in Action, 1987 • de Waard, A. and Pander Maat, H. (2012). Epistemic Modality and Knowledge Attribution in Scientific Discourse: A Taxonomy of Types and Overview of Features. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pages 47–55, Jeju, Republic of Korea, 12 July 2012. • Data2Semantics project: http://www.data2semantics.org/ • Sándor, Àgnes and de Waard, Anita, (2012). Identifying Claimed Knowledge Updates in Biomedical Research Articles, Workshop on Detecting Structure in Scholarly Discourse, ACL 2012. • de Waard, A. and Schneider, J. (2012). Formalising Uncertainty: An Ontology of Reasoning, Certainty and Attribution (ORCA), Semantic Technologies Applied to Biomedical Informatics and Individualized Medicine workshop, ISWC 2012. • de Waard, A., Burton, S.D., Gerkin, R.C., Harviston, M., Marques, D., Tripathy, S.J., Urban, N.N., Creating an Urban Legend: A System for Electrophysiology Data Management and Exploration, Discovery Informatics, 2013. • Boyce, R.D., Horn, J.R., Hassanzadeh, O., de Waard, A., Schneider, J., Luciano, J.S., Liakata, M., Dynamic enhancement of drug product labels to support drug safety, efficacy, and effectiveness. Journal of Biomedical Semantics, 2013, 4:5. • Hoekstra, R., de Waard, A., Vdovjak, R. (2012). Annotating Evidenced Based Clinical Guidelines - A Lightweight Ontology, Proceedings of SWAT4LS 2012, Paris, Adrian Paschke, Albert Burger, Paolo Roma, M. Scott Marshall, Andrea Splendiani (eds.), Springer.