The Narrative Structure of Research Articles,
or, Why Science is Like a Fairy Tale
Anita de Waard, VP Research Data Collaborations
Research Data Management Services, Elsevier
Overview
1. Discourse Comprehension 101
2. Story grammars and the Cycle of Scientific
Investigation
3. How can we help scientists read?
Discourse Comprehension 101
• Letter < syllable < word < clause < sentence < discourse:
This is how linguistics is structured.
But it is not how we understand text!
• Kintsch and Van Dijk, ‘93: we read a text at three levels:
– surface code: literal text, exact words/syntax
– text base: preserves meaning, but not exact wording
– situation model: ‘microworld’ that the text is about:
constructed inferentially through interaction between the
text and background knowledge
• We use knowledge about text genre to activate a schema:
this allows creation of the text base and situation model
Discourse Comprehension 101
Examples of schemas:
The structure of a research paper:
Discussion:
• Statement of principal findings
• Strengths and weaknesses of the study
• Relation to other studies
• Unanswered questions and future research
Introduction: “Create a Research Space”
• Establish a research territory
• Establish a niche
• Occupy the niche
Methods and Results:
“Cycles of Scientific Investigation”
(see below)
Thorndyke, P.W. (1977). Cognitive Structures in Comprehension and Memory of
Narrative Discourse. Cognitive Psychology 9, 77-110.
A Story Grammar:
Story Grammar         The Story of Goldilocks and the Three Bears
Setting   Time        Once upon a time
          Character   a little girl named Goldilocks
          Location    She went for a walk in the forest. Pretty soon, she came upon a house.
Theme     Goal        She knocked and, when no one answered,
          Attempt     she walked right in.
Episode   Name        At the table in the kitchen, there were three bowls of porridge.
          Subgoal     Goldilocks was hungry.
          Attempt     She tasted the porridge from the first bowl.
          Outcome     This porridge is too hot! she exclaimed.
          Attempt     So, she tasted the porridge from the second bowl.
          Outcome     This porridge is too cold, she said
          Attempt     So, she tasted the last bowl of porridge.
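The episode structure in the table above can be sketched as a small data structure. This is a minimal illustration, not Thorndyke's formal grammar: the class names `Attempt` and `Episode` and the helper `open_attempts` are hypothetical, and only the rows shown on the slide are encoded.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attempt:
    action: str
    outcome: Optional[str] = None  # None when no outcome has been given yet


@dataclass
class Episode:
    name: str
    subgoal: str
    attempts: List[Attempt] = field(default_factory=list)

    def open_attempts(self) -> List[Attempt]:
        """Return the attempts that have no recorded outcome."""
        return [a for a in self.attempts if a.outcome is None]


# The porridge episode, transcribed from the slide.
porridge = Episode(
    name="At the table in the kitchen, there were three bowls of porridge.",
    subgoal="Goldilocks was hungry.",
    attempts=[
        Attempt("She tasted the porridge from the first bowl.",
                "This porridge is too hot! she exclaimed."),
        Attempt("So, she tasted the porridge from the second bowl.",
                "This porridge is too cold, she said"),
        Attempt("So, she tasted the last bowl of porridge."),
    ],
)

for attempt in porridge.open_attempts():
    print("Still waiting for an outcome:", attempt.action)
```

A reader who holds this schema expects each attempt to be followed by an outcome, which is why the last row of the table feels unfinished.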
Paper Grammar         The AXH Domain of Ataxin-1 Mediates Neurodegeneration through Its Interaction with Gfi-1/Senseless Proteins
Background            The mechanisms mediating SCA1 pathogenesis are still not fully understood, but some general principles have emerged.
Objects of study      the Drosophila Atx-1 homolog (dAtx-1) which lacks a polyQ tract,
Experimental setup    studied and compared in vivo effects and interactions to those of the human protein
Research goal         Gain insight into how Atx-1's function contributes to SCA1 pathogenesis. How these interactions might contribute to the disease process and how they might cause toxicity in only a subset of neurons in SCA1 is not fully understood.
Hypothesis            Atx-1 may play a role in the regulation of gene expression
Name                  dAtx-1 and hAtx-1 Induce Similar Phenotypes When Overexpressed in Flies
Subgoal               test the function of the AXH domain
Method                overexpressed dAtx-1 in flies using the GAL4/UAS system (Brand and Perrimon, 1993) and compared its effects to those of hAtx-1.
Results               Overexpression of dAtx-1 by Rhodopsin1(Rh1)-GAL4, which drives expression in the differentiated R1-R6 photoreceptor cells (Mollereau et al., 2000 and O'Tousa et al., 1985), results in neurodegeneration in the eye, as does overexpression of hAtx-1[82Q]. Although at 2 days after eclosion, overexpression of either Atx-1 does not show obvious morphological changes in the photoreceptor cells
Data                  (data not shown),
Results               both genotypes show many large holes and loss of cell integrity
Story Grammar For A Science Paper:
Rubber hits the road in Results:
Cycles of Scientific Investigation
© Gully Burns, 2011
CoSI in action:
© Gully Burns, 2011
3. We used for this experiment BJ/ET cells containing
p14ARFkd because, following RASV12 treatment, in
those cells p53 is still activated but more clearly
stabilized than in parental BJ/ET cells (Voorhoeve and
Agami, 2003), resulting in a sensitized system for slight
alterations in p53 in response to RASV12.
1. Importantly, our results so far indicate that the
expression of miR-372&3 did not reduce the activity
of RASV12, as these cells were still growing faster
than normal cells and were tumorigenic, for which
RAS activity is indispensable (Hahn et al, 1999 and
Kolfschoten et al, 2005).
2. To shed more light on this aspect, we
examined the effect of miR-372&3 expression on
p53 activation in response to oncogenic
stimulation.
4. Figure 4A shows that following RASV12
stimulation, p53 was stabilized and activated, and its
target gene, p21cip1, was induced in all cases,
indicating an intact p53 pathway in these cells.
In defense of the clause
as the unit of thought:
1. Importantly, our results so far indicate that the expression of miR-
372&3 did not reduce the activity of RASV12, as these cells were still
growing faster than normal cells and were tumorigenic, for which RAS
activity is indispensable (Hahn et al, 1999 and Kolfschoten et al, 2005).
2. To shed more light on this aspect, we examined the effect of miR-
372&3 expression on p53 activation in response to oncogenic
stimulation.
3. We used for this experiment BJ/ET cells containing p14ARFkd because,
following RASV12 treatment, in those cells p53 is still activated but
more clearly stabilized than in parental BJ/ET cells (Voorhoeve and
Agami, 2003), resulting in a sensitized system for slight alterations in
p53 in response to RASV12.
4. Figure 4A shows that following RASV12 stimulation, p53 was stabilized
and activated, and its target gene, p21cip1, was induced in all cases,
indicating an intact p53 pathway in these cells.
Regulatory
clause
Fact Goal Method Result Implication
Both seminomas and the EC component of
nonseminomas share features with ES cells. To
exclude that the detection of miR-371-3 merely
reflects its expression pattern in ES cells, we tested
by RPA miR-302a-d, another ES cells-specific
miRNA cluster (Suh et al, 2004). In many of the
miR-371-3 expressing seminomas and
nonseminomas, miR-302a-d was undetectable (Figs
S7 and S8), suggesting that miR-371-3 expression is
a selective event during tumorigenesis.
Fact: Both seminomas and the EC component of nonseminomas share features with ES cells.
Goal: To exclude that
Hypothesis: the detection of miR-371-3 merely reflects its expression pattern in ES cells,
Method: we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004).
Result: In many of the miR-371-3 expressing seminomas and nonseminomas, miR-302a-d was undetectable (Figs S7 and S8),
Reg-Implication: suggesting that
Implication: miR-371-3 expression is a selective event during tumorigenesis.
Conceptual
knowledge
Experimental
Evidence
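To make the idea of clause-level labels concrete, here is a minimal rule-based sketch that assigns some of the labels named on this slide (Fact, Goal, Method, Result, Reg-Implication) from surface cues. The cue patterns and the function name `tag_clause` are assumptions chosen only to cover the seminoma example; they are not the annotation scheme used in the talk, and a realistic tagger would need far richer features.

```python
import re

# Illustrative cue patterns (assumptions); the first match wins.
CUES = [
    ("Reg-Implication", re.compile(r"^(suggesting|indicating) that\b", re.I)),
    ("Goal", re.compile(r"^to \w+", re.I)),
    ("Method", re.compile(r"^we (tested|examined|used|generated)\b", re.I)),
    ("Result", re.compile(r"\(Fig(s|ure|ures)?\.? ", re.I)),
]


def tag_clause(clause: str, default: str = "Fact") -> str:
    """Return the first label whose cue matches, else the default label."""
    text = clause.strip()
    for label, pattern in CUES:
        if pattern.search(text):
            return label
    return default


clauses = [
    "Both seminomas and the EC component of nonseminomas share features with ES cells.",
    "To exclude that",
    "we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004).",
    "In many of the miR-371-3 expressing seminomas and nonseminomas, "
    "miR-302a-d was undetectable (Figs S7 and S8),",
    "suggesting that",
]
for clause in clauses:
    print(f"{tag_clause(clause):16s} {clause}")
```

Cues like these cannot tell a Hypothesis or an Implication from a Fact, because that distinction depends on the regulatory clause that introduces it ("To exclude that", "suggesting that") rather than on the clause itself.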
Clause, realm and tense:
Facts in the eternal present
  Science: Endogenous small RNAs (miRNAs) regulate gene expression by mechanisms conserved across metazoans.
  Mythology: I sing of golden-throned Hera whom Rhea bare. Queen of the immortals is she, surpassing all in beauty: she is the sister and the wife of loud-thundering Zeus, --the glorious one whom all the blessed throughout high Olympus reverence and honor.
Events in the simple past
  Science: Vehicle-treated animals spent equivalent time investigating a juvenile in the first and second sessions in experiments conducted in the NAC and the striatum: T1 values were 122 ± 6 s and 114 ± 5 s.
  Mythology: Now the wooers turned to the dance and to gladsome song, and made them merry, and waited till evening should come; and as they made merry dark evening came upon them.
Events with embedded facts
  Science: We also generated BJ/ET cells expressing the RASV12-ERTAM chimera gene, which is only active when tamoxifen is added (De Vita et al, 2005).
  Mythology: And she took her mighty spear, tipped with sharp bronze, heavy and huge and strong, wherewith she vanquishes the ranks of men-of warriors, with whom she is wroth, she, the daughter of the mighty sire.
Attribution in the present perfect
  Science: miRNAs have emerged as important regulators of development and control processes such as cell fate determination and cell death (Abrahante et al., 2003, Brennecke et al., 2003, Chang et al., 2004, Chen et al., 2004, Johnston and Hobert, 2003, Lee et al., 1993)
  Mythology: In this book I have had old stories written down, as I have heard them told by intelligent people, concerning chiefs who have held dominion in the northern countries, and who spoke the Danish tongue; and also concerning some of their family branches, according to what has been told me.
Implications are hedged, and in the present tense
  Science: These results indicate that although miR-372&3 confer complete protection to oncogene-induced senescence in a manner similar to p53 inactivation, the cellular response to DNA damage remains intact
  Mythology: Now it is said that ever since then whenever the camel sees a place where ashes have been scattered, he wants to get revenge with his enemy the rat and stomps and rolls in the ashes hoping to get the rat
Tense use in science and mythology:
Summing up:
1. Discourse Comprehension 101:
– We read gobs of text and integrate these with our
knowledge networks
– We understand through schemas
2. Story grammars and the Cycle of Scientific Investigation
– Papers are like fairytales
– Within Results, Cycles of Scientific Investigation connect
data to claims
– Tense helps identify the realm of the claim (like in
mythology)
3. How can we use this to help scientists read?
So how can this understanding
help us help scientists read papers?
• Why do we read?
To learn, i.e.: obtain the knowledge contained within the
text and integrate it with what we already know.
• What do we read?
Things that are ‘interesting’ :
– Pertinent
– Possibly/probably true
– Novel, but in agreement with what I know
• How do we read?
human breast cancer
noninvasive MCF7-Ras
antisense oligonucleotides
high-grade malignancy
cell viability
retroviral vector
miR-31
cloned
transiently expressed miRNA sponges
Is it pertinent? -> Possibly…
Is it true? -> ?
Is it new, but in agreement with what I know? -> ?
Represent a paper as
Collections of Noun Phrases?
miR-31 PREVENT acquisition of aggressive traits
miR-31 INHIBIT noninvasive MCF7-Ras cells
miR-31 ENHANCE invasion
cell viability AFFECT inhibitor
miR-31 expression DEPRIVE metastatic cells
Is it pertinent? -> Possibly…
Is it true? -> ?
Is it new, but in agreement with what I know? -> ?
Represent a paper as Triples
(Two Nouns and a Verb):
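As a rough sketch of what a triple store for this slide might look like, the five triples above can be held as tuples and indexed by subject. The names `Triple` and `by_subject` are hypothetical, and the triples are copied verbatim from the slide, including their errors.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    subject: str
    verb: str
    obj: str


# Triples exactly as listed on the slide.
TRIPLES = [
    Triple("miR-31", "PREVENT", "acquisition of aggressive traits"),
    Triple("miR-31", "INHIBIT", "noninvasive MCF7-Ras cells"),
    Triple("miR-31", "ENHANCE", "invasion"),
    Triple("cell viability", "AFFECT", "inhibitor"),
    Triple("miR-31 expression", "DEPRIVE", "metastatic cells"),
]


def by_subject(triples: List[Triple]) -> Dict[str, List[Triple]]:
    """Group triples by their subject noun phrase."""
    index: Dict[str, List[Triple]] = defaultdict(list)
    for triple in triples:
        index[triple.subject].append(triple)
    return dict(index)


for triple in by_subject(TRIPLES)["miR-31"]:
    print(triple.subject, triple.verb, triple.obj)
```

The index makes topical lookup easy, but it cannot say whether a triple is true: "miR-31 ENHANCE invasion" inverts the source sentence, which reports that suppression of miR-31 enhanced invasion.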
The preceding observations demonstrated that X expression deprives Y cells of
attributes associated with Z.
We next asked whether X also prevents the acquisition of A traits by B cells.
To do so, we transiently inhibited X in C cells with either D or E.
Both approaches inhibited X function by > 4.5-fold (Figure S7A).
Suppression of X enhanced invasion by 20-fold and motility by 5-fold, but F was
unaffected by either inhibitor (Figure 3A; Figure S7B).
The E sponge reduced X function by 2.5-fold, but did not affect the activity of other
known Js (Figures S8A and S8B).
Collectively, these data indicated that sustained X activity is necessary to prevent the
acquisition of Z traits by both K and untransformed B cells.
Is it pertinent? -> Need content
Is it true? -> Sounds likely! I know this stuff!
Is it new, but in agreement with what I know? -> Need content
Represent a paper’s Metadiscourse:
Claim:
• sustained miR-31 activity is necessary to prevent the acquisition of aggressive
traits by both tumor cells and untransformed breast epithelial cells.
Evidence: Method:
• We transiently inhibited miR-31 in noninvasive MCF7-Ras cells with either
antisense oligonucleotides or miRNA sponges.
Evidence: Result:
• Both approaches inhibited miR-31 function by >4.5-fold (Figure S7A).
• Suppression of miR-31 enhanced invasion by 20-fold and motility by 5-fold,
but cell viability was unaffected by either inhibitor (Figure 3A; Figure S7B).
• The miR-31 sponge reduced miR-31 function by 2.5-fold, but did not affect
the activity of other known antimetastatic miRNAs (Figures S8A and S8B).
Is it pertinent? -> Probably
Is it true? -> Sounds likely!
Is it new, but in agreement with what I know? -> Check/know
Represent a Paper as a
Set of Claims and Evidence:
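One way to picture a claim linked to its evidence is the sketch below. It is an assumption-laden illustration rather than an existing schema: the classes `Claim` and `Evidence` and the method `figures` are hypothetical, and the figure panels were transcribed by hand from the citations on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    kind: str  # "Method" or "Result", as labelled on the slide
    text: str
    figures: List[str] = field(default_factory=list)


@dataclass
class Claim:
    text: str
    evidence: List[Evidence] = field(default_factory=list)

    def figures(self) -> List[str]:
        """Figure panels cited by the evidence, in first-seen order."""
        seen: List[str] = []
        for item in self.evidence:
            for fig in item.figures:
                if fig not in seen:
                    seen.append(fig)
        return seen


claim = Claim(
    text=("sustained miR-31 activity is necessary to prevent the acquisition "
          "of aggressive traits by both tumor cells and untransformed breast "
          "epithelial cells"),
    evidence=[
        Evidence("Method", "We transiently inhibited miR-31 in noninvasive "
                 "MCF7-Ras cells with either antisense oligonucleotides or "
                 "miRNA sponges."),
        Evidence("Result", "Both approaches inhibited miR-31 function by "
                 ">4.5-fold.", ["S7A"]),
        Evidence("Result", "Suppression of miR-31 enhanced invasion by 20-fold "
                 "and motility by 5-fold, but cell viability was unaffected by "
                 "either inhibitor.", ["3A", "S7B"]),
        Evidence("Result", "The miR-31 sponge reduced miR-31 function by "
                 "2.5-fold, but did not affect the activity of other known "
                 "antimetastatic miRNAs.", ["S8A", "S8B"]),
    ],
)

print(claim.figures())
```

Keeping the figure panels on each evidence item lets a reader move from a claim straight to the data behind it.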
Is it pertinent? -> Possibly
Is it true? -> Probably!
Is it new, but in agreement with what I know? -> Need background
Show who wrote it, and where:
So we probably need all of these:
• Surface code provides noun phrases and triples that offer
pointers re. topical relevance
• Text base and situation model are created through specific
metadiscourse conventions (e.g. refs at the end) that create a
biological reasoning model:
• This can be expressed as a set of claims, linked to evidence, that
can help represent key points in the paper
• Journal name and author’s affiliation help define schema and
provide ‘willingness to be convinced’ socially/interpersonally.
Hypothesis: We next asked whether …
Goal/Method: To do so, we transiently inhibited…
Result: Suppression of X enhanced invasion …
Results: but F was unaffected …(Figure 3A). …
Implication: Collectively, these data indicated that … .
But wait: there’s a wolf in the woods!
This article has been retracted: please see Elsevier Policy on Article Withdrawal
(http://www.elsevier.com/locate/withdrawalpolicy). This article has been retracted
at the request of the authors.
Our study reported that miR-31 is a regulator of multiple mRNAs important for
different aspects of breast cancer metastasis. We recently identified concerns with
several figure panels in which original data were compiled from different replicate
experiments in order to assemble the presented figure. The scope of the figure
preparation issues includes compiling data from independent experiments to
present them as one internally controlled experiment, statistical analyses based
on technical replicates that are not reflective of the biological replicates, and
comparisons of selectively chosen data points from multiple experiments. As
many of the published figures are therefore not appropriate or accurate
representations of the original data, we believe that the responsible course of
action is to retract the paper. We apologize for any inconvenience we have caused.
In Summary:
1. Discourse Comprehension 101
2. Story grammars and the Cycle of Scientific
Investigation
3. How can we help scientists read?
– Tools that ‘read’ papers and allow easy access to claims
and evidence
– Tools and practices that record data (=evidence)
throughout the practice of creating it
– Tools that help us make sense out of all of this
networked knowledge
– Cultural habits to support these practices.
For Change to Occur,
We Need Networks of Collaboration:
Force11:
– Multi-stakeholder, member-driven organisation
– Unites scholars, tool developers, librarians, publishers, funding agencies etc. etc.
– E.g.: RRID initiative just got implemented in Cell: “STAR Methods: Structured,
Transparent, Accessible Reporting.”
National Data Service:
– Multi-stakeholder group, based around supercomputing centres
– Aims to be a ‘connective tissue’ between data creation, curation, storage etc. projects.
– Inviting Pilots: two or more partners who have not worked together, interested in
collaborating on a data-centric project to solve a real-world need
– E.g. Datasearch, Data Linking systems
RDA:
– Coleading Data publishing, linking group
– Colead Cost Recovery group, part of RDA US Sustainability effort
– Active in Chemistry, Earth Science groups, starting IG on Data Search
– SciDataCon, Sept 11-16, Denver, CO
Anita de Waard
VP Research Data Collaborations Research Data
Management Services, Elsevier
a.dewaard@elsevier.com
And we all live happily
ever after….
Addendum:
Can Computers Help Us Read?
Noun Phrases: some issues
• Problem 1: disambiguating terms (© GoPubMed):
– Hnrpa1 = Tis = Fli-2 = nuclear ribonucleoprotein A1 = helix
destabilizing protein = single-strand binding protein = hnRNP core
protein A1 = HDP-1 = topoisomerase-inhibitor suppressed.
– Cellulose 1,4-beta-cellobiosidase = exoglucanase
– COLD =/ C.O.L.D. =/ cold (runny nose) =/ cold (low T)
• Problem 2: disambiguating entities (© M. Martone):
– 95 antibodies were (manually!) identified in 8 articles
– 52 did not contain enough information to determine the antibody
used
– Some provided details in other papers
– Failed to give species, clonality, vendor, or catalog number
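Problem 1 above can be illustrated with a small synonym lookup that maps each name to one canonical label. This is only a sketch: picking `Hnrpa1` and `Cellulose 1,4-beta-cellobiosidase` as the canonical forms is an arbitrary assumption, the functions `build_lookup` and `normalize` are hypothetical, and real term normalisation relies on curated ontologies.

```python
from typing import Dict, List, Optional

# Synonym sets copied from the slide; the dictionary keys are arbitrary canonical labels.
SYNONYMS: Dict[str, List[str]] = {
    "Hnrpa1": [
        "Tis", "Fli-2", "nuclear ribonucleoprotein A1",
        "helix destabilizing protein", "single-strand binding protein",
        "hnRNP core protein A1", "HDP-1", "topoisomerase-inhibitor suppressed",
    ],
    "Cellulose 1,4-beta-cellobiosidase": ["exoglucanase"],
}


def build_lookup(synonyms: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every case-folded name to its canonical label."""
    lookup: Dict[str, str] = {}
    for canonical, names in synonyms.items():
        for name in [canonical, *names]:
            lookup[name.casefold()] = canonical
    return lookup


def normalize(term: str, lookup: Dict[str, str]) -> Optional[str]:
    """Return the canonical label for a term, or None if it is unknown."""
    return lookup.get(term.strip().casefold())


LOOKUP = build_lookup(SYNONYMS)
print(normalize("HDP-1", LOOKUP))
print(normalize("Exoglucanase", LOOKUP))
print(normalize("cold", LOOKUP))
```

Case-folding is convenient for synonyms, but it would merge COLD, C.O.L.D. and cold, which the slide lists as distinct; telling those apart needs the surrounding context, not a dictionary.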
Noun Phrases: some progress
• Despite these difficulties, noun phrase recall/precision is
quite high, e.g. i2b2 2011 [1], [2], others: 90%-98%
• Many tools, see [3] for a list; e.g. GoPubMed:
Triples: some issues:
• Contingent on good NP & VP detection
• Hard to parse text! E.g. a commercial tool gave:
insulin maintaining glucose homeostasis
When insulin secretion cannot be increased adequately (type I
diabetes defect) to overcome insulin resistance in maintaining
glucose homeostasis, hyperglycemia and glucose intolerance
ensues.
insulin may be involved glucose homeostasis
Because PANDER is expressed by pancreatic beta-cells and in
response to glucose in a similar way to those of insulin, PANDER
may be involved in glucose homeostasis.
Triples: some progress:
Biological Expression Language [4]:
We provide evidence that these miRNAs are potential novel oncogenes participating in the development
of human testicular germ cell tumors by numbing the p53 pathway, thus allowing tumorigenic growth in
the presence of wild-type p53.
Increased abundance of miR-372 decreases activity of TP53
r(MIR:miR-372) -| tscript(p(HUGO:Trp53))
Context: cancer
SET Disease = "Cancer"
Activity of TP53 decreases cell growth
tscript(p(HUGO:Trp53)) -| bp(GO:"Cell Growth")
Metadiscourse: why it matters
• Voorhoeve et al., 2006: “These miRNAs neutralize p53-mediated CDK
inhibition, possibly through direct inhibition of the expression of the tumor
suppressor LATS2.”
• Kloosterman and Plasterk, 2006: “In a genetic screen, miR-372 and miR-373
were found to allow proliferation of primary human cells that express
oncogenic RAS and active p53, possibly by inhibiting the tumor suppressor
LATS2 (Voorhoeve et al., 2006).”
• Yabuta et al., 2007: “[On the other hand,] two miRNAs, miRNA-372 and -373,
function as potential novel oncogenes in testicular germ cell tumors by
inhibition of LATS2 expression, which suggests that Lats2 is an important tumor
suppressor (Voorhoeve et al., 2006).”
• Okada et al., 2011: “Two oncogenic miRNAs, miR-372 and miR-373, directly
inhibit the expression of Lats2, thereby allowing tumorigenic growth in the
presence of p53 (Voorhoeve et al., 2006).”
“[Y]ou can transform .. fiction into fact just by adding or
subtracting references”, Bruno Latour [5]
Adding Metadiscourse To Triples
Claim ORCA Value
Together, Lats2 and ASPP1 shunt p53 to proapoptotic
promoters and promote the death of polyploid cells [1]. (…)
Value = 3
Source = N
Basis = 0
Further biochemical characterization of hMOBs showed that
only hMOB1A and hMOB1B interact with both LATS1 and LATS2
in vitro and in vivo [39]. (…)
Value = 3
Source = N
Basis = Data
Our findings reveal that miR-373 would be a potential
oncogene and it participates in the carcinogenesis of human
esophageal cancer by suppressing LATS2 expression.
Value = 1 or 2 ?
Source = Author
Basis = Data
Furthermore, we demonstrated that the direct inhibition of
LATS2 protein was mediated by miR-373 and manipulated the
expression of miR-373 to affect esophageal cancer cells growth.
Value = 2 (or 3?)
Source = Author
Basis = Data
Claims and Evidence: some issues:
• Data2Semantics [11]: linking clinical guidelines to evidence.
Inconsistency within guideline and guidelines v. evidence:
• Studies have demonstrated inconsistent results regarding the use of such
markers of inflammation as C-reactive protein (CRP), interleukins- 6 (IL-6) and
-8, and procalcitonin (PCT) in neutropenic patients with cancer [55–57].
• [55]: PCT and IL-6 are more reliable markers than CRP for predicting
bacteremia in patients with febrile neutropenia
• [56] In conclusion, daily measurement of PCT or IL-6 could help identify
neutropenic patients with a stable course when the fever lasts >3 d. …,
it would reduce adverse events and treatment costs.
• [57] Our study supports the value of PCT as a reliable tool to predict
clinical outcome in febrile neutropenia.
• Drug Interaction Knowledgebase [12]: how to identify evidence?
• R-citalopram_is_not_substrate_of_cyp2c19:
• At 10uM R- or S-CT, ketoconazole reduced reaction velocity to 55 -60% of
control, quinidine to 80%, and omeprazole to 80-85% of control (Fig. 6).
Claims and Evidence: some progress
• Defining ‘salient knowledge components’ in text:
– Argumentative zones, CoreSC can both be found
– Blake, Claim networks (more soon!)
– Claimed Knowledge Updates (Sandor/de Waard, [13]):

More Related Content

Similar to The Narrative Structure of Research Articles, or, Why Science is Like a Fairy Tale

Argumentation in biology papers
Argumentation in biology papersArgumentation in biology papers
Argumentation in biology papers
Anita de Waard
 
'Stories that persuade with data' - talk at CENDI meeting January 9 2014
'Stories that persuade with data' - talk at CENDI meeting January 9 2014'Stories that persuade with data' - talk at CENDI meeting January 9 2014
'Stories that persuade with data' - talk at CENDI meeting January 9 2014Anita de Waard
 
Transcriptomics and lexico-syntactic analysis
Transcriptomics and lexico-syntactic analysisTranscriptomics and lexico-syntactic analysis
Transcriptomics and lexico-syntactic analysis
Lars Juhl Jensen
 
Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.
Anita de Waard
 
An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...
An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...
An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...Zach Rana
 
Hsiao-DevNeurobiol2014
Hsiao-DevNeurobiol2014Hsiao-DevNeurobiol2014
Hsiao-DevNeurobiol2014Katie K. Hsiao
 
How to persuade with data
How to persuade with dataHow to persuade with data
How to persuade with data
Anita de Waard
 
How Scientists Read, How Computers Read, and What We Should Do
How Scientists Read, How Computers Read, and What We Should DoHow Scientists Read, How Computers Read, and What We Should Do
How Scientists Read, How Computers Read, and What We Should Do
Anita de Waard
 
Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In...
 Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In... Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In...
Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In...
Anita de Waard
 
Zoology Second Year Important Question | Exam Tips and Tricks
Zoology Second Year Important Question | Exam Tips and TricksZoology Second Year Important Question | Exam Tips and Tricks
Zoology Second Year Important Question | Exam Tips and Tricks
PreethyKs
 
MathiasHibbard_604FinalPaper
MathiasHibbard_604FinalPaperMathiasHibbard_604FinalPaper
MathiasHibbard_604FinalPaperMathias Hibbard
 
Nature Of Gene.pdf
Nature Of Gene.pdfNature Of Gene.pdf
Nature Of Gene.pdf
GauravYadav883711
 
Nature Of Gene.pdf
Nature Of Gene.pdfNature Of Gene.pdf
Nature Of Gene.pdf
GauravYadav883711
 
Gray et al. 2004 Science copy
Gray et al. 2004 Science copyGray et al. 2004 Science copy
Gray et al. 2004 Science copyPaul Gray
 
Bio
BioBio
Why Life is Difficult, and What We MIght Do About It
Why Life is Difficult, and What We MIght Do About ItWhy Life is Difficult, and What We MIght Do About It
Why Life is Difficult, and What We MIght Do About It
Anita de Waard
 
Kimpo and Raymond_JNsci 2007
Kimpo and Raymond_JNsci 2007Kimpo and Raymond_JNsci 2007
Kimpo and Raymond_JNsci 2007Rhea Kimpo
 

Similar to The Narrative Structure of Research Articles, or, Why Science is Like a Fairy Tale (20)

Argumentation in biology papers
Argumentation in biology papersArgumentation in biology papers
Argumentation in biology papers
 
'Stories that persuade with data' - talk at CENDI meeting January 9 2014
'Stories that persuade with data' - talk at CENDI meeting January 9 2014'Stories that persuade with data' - talk at CENDI meeting January 9 2014
'Stories that persuade with data' - talk at CENDI meeting January 9 2014
 
Transcriptomics and lexico-syntactic analysis
Transcriptomics and lexico-syntactic analysisTranscriptomics and lexico-syntactic analysis
Transcriptomics and lexico-syntactic analysis
 
Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.
 
An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...
An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...
An Evolutionarily Conserved Long Noncoding RNA TUNA Controls Pluripotency and...
 
Hsiao-DevNeurobiol2014
Hsiao-DevNeurobiol2014Hsiao-DevNeurobiol2014
Hsiao-DevNeurobiol2014
 
How to persuade with data
How to persuade with dataHow to persuade with data
How to persuade with data
 
How Scientists Read, How Computers Read, and What We Should Do
How Scientists Read, How Computers Read, and What We Should DoHow Scientists Read, How Computers Read, and What We Should Do
How Scientists Read, How Computers Read, and What We Should Do
 
Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In...
 Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In... Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In...
Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical In...
 
Zoology Second Year Important Question | Exam Tips and Tricks
Zoology Second Year Important Question | Exam Tips and TricksZoology Second Year Important Question | Exam Tips and Tricks
Zoology Second Year Important Question | Exam Tips and Tricks
 
MathiasHibbard_604FinalPaper
MathiasHibbard_604FinalPaperMathiasHibbard_604FinalPaper
MathiasHibbard_604FinalPaper
 
MLa_NCUR2015
MLa_NCUR2015MLa_NCUR2015
MLa_NCUR2015
 
Reiter lecture 11.11.14
Reiter lecture 11.11.14Reiter lecture 11.11.14
Reiter lecture 11.11.14
 
Nature Of Gene.pdf
Nature Of Gene.pdfNature Of Gene.pdf
Nature Of Gene.pdf
 
Nature Of Gene.pdf
Nature Of Gene.pdfNature Of Gene.pdf
Nature Of Gene.pdf
 
Gray et al. 2004 Science copy
Gray et al. 2004 Science copyGray et al. 2004 Science copy
Gray et al. 2004 Science copy
 
Bio
BioBio
Bio
 
CSBJ-9-e201401001
CSBJ-9-e201401001CSBJ-9-e201401001
CSBJ-9-e201401001
 
Why Life is Difficult, and What We MIght Do About It
Why Life is Difficult, and What We MIght Do About ItWhy Life is Difficult, and What We MIght Do About It
Why Life is Difficult, and What We MIght Do About It
 
Kimpo and Raymond_JNsci 2007
Kimpo and Raymond_JNsci 2007Kimpo and Raymond_JNsci 2007
Kimpo and Raymond_JNsci 2007
 

More from Anita de Waard

Mendeley Data: Enhancing Data Discovery, Sharing and Reuse
Mendeley Data: Enhancing Data Discovery, Sharing and ReuseMendeley Data: Enhancing Data Discovery, Sharing and Reuse
Mendeley Data: Enhancing Data Discovery, Sharing and Reuse
Anita de Waard
 
Why would a publisher care about open data?
Why would a publisher care about open data?Why would a publisher care about open data?
Why would a publisher care about open data?
Anita de Waard
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Anita de Waard
 
NFAIS Talk on Enabling FAIR Data
NFAIS Talk on Enabling FAIR DataNFAIS Talk on Enabling FAIR Data
NFAIS Talk on Enabling FAIR Data
Anita de Waard
 
CNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data CommonsCNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data Commons
Anita de Waard
 
Enabling FAIR Data: TAG B Authoring Guidelines
Enabling FAIR Data: TAG B Authoring GuidelinesEnabling FAIR Data: TAG B Authoring Guidelines
Enabling FAIR Data: TAG B Authoring Guidelines
Anita de Waard
 
Data, Data Everywhere: What's A Publisher to Do?
Data, Data Everywhere: What's  A Publisher to Do?Data, Data Everywhere: What's  A Publisher to Do?
Data, Data Everywhere: What's A Publisher to Do?
Anita de Waard
 
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy Tale

  • 1. The Narrative Structure of Research Articles, or, Why Science is Like a Fairy Tale Anita de Waard, VP Research Data Collaborations Research Data Management Services, Elsevier
  • 2. Overview 1. Discourse Comprehension 101 2. Story grammars and the Cycle of Scientific Investigation 3. How can we help scientists read?
  • 3. Discourse Comprehension 101 • Letter < syllable < word < clause < sentence < discourse: This is how linguistics is structured. But it is not how we understand text!
  • 9. Discourse Comprehension 101 • Letter < syllable < word < clause < sentence < discourse: This is how linguistics is structured. But it is not how we understand text! • Kintsch and Van Dijk, ‘93: we read a text at three levels: – surface code: literal text, exact words/syntax – text base: preserves meaning, but not exact wording – situation model: ‘microworld’ that the text is about: constructed inferentially through interaction between the text and background knowledge • We use knowledge about text genre to activate a schema: this allows creation of the text base and situation model
  • 11. The structure of a research paper: Introduction: “Create a Research Space” • Establish a research territory • Establish a niche • Occupy the niche Methods and Results: “Cycles of Scientific Investigation” (see below) Discussion: • Statement of principal findings • Strengths and weaknesses of the study • Relation to other studies • Unanswered questions and future research
  • 12. A Story Grammar: Thorndyke, P.W. (1977), Cognitive Structures in Comprehension and Memory of Narrative Discourse, Cognitive Psychology 9, 77-110.
  • 13. Story Grammar For A Science Paper:
    Story Grammar: The Story of Goldilocks and the Three Bears
    Setting
    – Time: Once upon a time
    – Character: a little girl named Goldilocks
    – Location: She went for a walk in the forest. Pretty soon, she came upon a house.
    Theme
    – Goal: She knocked and, when no one answered,
    – Attempt: she walked right in.
    Episode
    – Name: At the table in the kitchen, there were three bowls of porridge.
    – Subgoal: Goldilocks was hungry.
    – Attempt: She tasted the porridge from the first bowl.
    – Outcome: This porridge is too hot! she exclaimed.
    – Attempt: So, she tasted the porridge from the second bowl.
    – Outcome: This porridge is too cold, she said
    – Attempt: So, she tasted the last bowl of porridge.
    Paper Grammar: The AXH Domain of Ataxin-1 Mediates Neurodegeneration through Its Interaction with Gfi-1/Senseless Proteins
    – Background: The mechanisms mediating SCA1 pathogenesis are still not fully understood, but some general principles have emerged.
    – Objects of study: the Drosophila Atx-1 homolog (dAtx-1) which lacks a polyQ tract,
    – Experimental setup: studied and compared in vivo effects and interactions to those of the human protein
    – Research goal: Gain insight into how Atx-1's function contributes to SCA1 pathogenesis. How these interactions might contribute to the disease process and how they might cause toxicity in only a subset of neurons in SCA1 is not fully understood.
    – Hypothesis: Atx-1 may play a role in the regulation of gene expression
    – Name: dAtx-1 and hAtx-1 Induce Similar Phenotypes When Overexpressed in Flies
    – Subgoal: test the function of the AXH domain
    – Method: overexpressed dAtx-1 in flies using the GAL4/UAS system (Brand and Perrimon, 1993) and compared its effects to those of hAtx-1.
    – Results: Overexpression of dAtx-1 by Rhodopsin1(Rh1)-GAL4, which drives expression in the differentiated R1-R6 photoreceptor cells (Mollereau et al., 2000 and O'Tousa et al., 1985), results in neurodegeneration in the eye, as does overexpression of hAtx-1[82Q]. Although at 2 days after eclosion, overexpression of either Atx-1 does not show obvious morphological changes in the photoreceptor cells
    – Data: (data not shown),
    – Results: both genotypes show many large holes and loss of cell integrity
  • 14. Rubber hits the road in Results: Cycles of Scientific Investigation © Gully Burns, 2011
  • 15. CoSI in action (© Gully Burns, 2011): 1. Importantly, our results so far indicate that the expression of miR-372&3 did not reduce the activity of RASV12, as these cells were still growing faster than normal cells and were tumorigenic, for which RAS activity is indispensable (Hahn et al, 1999 and Kolfschoten et al, 2005). 2. To shed more light on this aspect, we examined the effect of miR-372&3 expression on p53 activation in response to oncogenic stimulation. 3. We used for this experiment BJ/ET cells containing p14ARFkd because, following RASV12 treatment, in those cells p53 is still activated but more clearly stabilized than in parental BJ/ET cells (Voorhoeve and Agami, 2003), resulting in a sensitized system for slight alterations in p53 in response to RASV12. 4. Figure 4A shows that following RASV12 stimulation, p53 was stabilized and activated, and its target gene, p21cip1, was induced in all cases, indicating an intact p53 pathway in these cells.
  • 16. In defense of the clause as the unit of thought: 1. Importantly, our results so far indicate that the expression of miR-372&3 did not reduce the activity of RASV12, as these cells were still growing faster than normal cells and were tumorigenic, for which RAS activity is indispensable (Hahn et al, 1999 and Kolfschoten et al, 2005). 2. To shed more light on this aspect, we examined the effect of miR-372&3 expression on p53 activation in response to oncogenic stimulation. 3. We used for this experiment BJ/ET cells containing p14ARFkd because, following RASV12 treatment, in those cells p53 is still activated but more clearly stabilized than in parental BJ/ET cells (Voorhoeve and Agami, 2003), resulting in a sensitized system for slight alterations in p53 in response to RASV12. 4. Figure 4A shows that following RASV12 stimulation, p53 was stabilized and activated, and its target gene, p21cip1, was induced in all cases, indicating an intact p53 pathway in these cells. Clause types: Regulatory clause, Fact, Goal, Method, Result, Implication
  • 17. Clause, realm and tense: Both seminomas and the EC component of nonseminomas share features with ES cells. To exclude that the detection of miR-371-3 merely reflects its expression pattern in ES cells, we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004). In many of the miR-371-3 expressing seminomas and nonseminomas, miR-302a-d was undetectable (Figs S7 and S8), suggesting that miR-371-3 expression is a selective event during tumorigenesis. Clause types: Fact, Hypothesis, Method, Result, Implication, Goal, Reg-Implication. Realms: Conceptual knowledge, Experimental Evidence.
  • 18. Tense use in science and mythology:
    – Facts in the eternal present
      Science: Endogenous small RNAs (miRNAs) regulate gene expression by mechanisms conserved across metazoans.
      Mythology: I sing of golden-throned Hera whom Rhea bare. Queen of the immortals is she, surpassing all in beauty: she is the sister and the wife of loud-thundering Zeus, --the glorious one whom all the blessed throughout high Olympus reverence and honor.
    – Events in the simple past
      Science: Vehicle-treated animals spent equivalent time investigating a juvenile in the first and second sessions in experiments conducted in the NAC and the striatum: T1 values were 122 ± 6 s and 114 ± 5 s.
      Mythology: Now the wooers turned to the dance and to gladsome song, and made them merry, and waited till evening should come; and as they made merry dark evening came upon them.
    – Events with embedded facts
      Science: We also generated BJ/ET cells expressing the RASV12-ERTAM chimera gene, which is only active when tamoxifen is added (De Vita et al, 2005).
      Mythology: And she took her mighty spear, tipped with sharp bronze, heavy and huge and strong, wherewith she vanquishes the ranks of men-of warriors, with whom she is wroth, she, the daughter of the mighty sire.
    – Attribution in the present perfect
      Science: miRNAs have emerged as important regulators of development and control processes such as cell fate determination and cell death (Abrahante et al., 2003, Brennecke et al., 2003, Chang et al., 2004, Chen et al., 2004, Johnston and Hobert, 2003, Lee et al., 1993)
      Mythology: In this book I have had old stories written down, as I have heard them told by intelligent people, concerning chiefs who have held dominion in the northern countries, and who spoke the Danish tongue; and also concerning some of their family branches, according to what has been told me.
    – Implications are hedged, and in the present tense
      Science: These results indicate that although miR-372&3 confer complete protection to oncogene-induced senescence in a manner similar to p53 inactivation, the cellular response to DNA damage remains intact
      Mythology: Now it is said that ever since then whenever the camel sees a place where ashes have been scattered, he wants to get revenge with his enemy the rat and stomps and rolls in the ashes hoping to get the rat
  • 19. Summing up: 1. Discourse Comprehension 101: – We read gobs of text and integrate these with our knowledge networks – We understand through schemas 2. Story grammars and the Cycle of Scientific Investigation – Papers are like fairytales – Within Results, Cycles of Scientific Investigation connect data to claims – Tense helps identify the realm of the claim (as in mythology) 3. How can we use this to help scientists read?
  • 20. So how can this understanding help us help scientists read papers? • Why do we read? To learn, i.e.: obtain the knowledge contained within the text and integrate it with what we already know. • What do we read? Things that are ‘interesting’ : – Pertinent – Possibly/probably true – Novel, but in agreement with what I know • How do we read?
  • 21. Represent a paper as Collections of Noun Phrases? human breast cancer; noninvasive MCF7-Ras; antisense oligonucleotides; high-grade malignancy; cell viability; retroviral vector; miR-31; cloned; transiently expressed; miRNA sponges. Is it pertinent? -> Possibly… Is it true? -> ? Is it new, but in agreement with what I know? -> ?
  • 22. Represent a paper as Triples (Two Nouns and a Verb): – miR-31 PREVENT acquisition of aggressive traits – miR-31 INHIBIT noninvasive MCF7-Ras cells – miR-31 ENHANCE invasion – cell viability AFFECT inhibitor – miR-31 expression DEPRIVE metastatic cells. Is it pertinent? -> Possibly… Is it true? -> ? Is it new, but in agreement with what I know? -> ?
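The triple view on slide 22 can be made concrete with a small data structure. The following is a minimal Python sketch, not a tool from the talk: the `Triple` type, its field names, and the `index_by_subject` helper are illustrative assumptions, and only the five triples themselves come from the slide.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-verb-object statement (hypothetical field names)."""
    subject: str
    verb: str
    obj: str


# The five triples are copied from slide 22.
TRIPLES = [
    Triple("miR-31", "PREVENT", "acquisition of aggressive traits"),
    Triple("miR-31", "INHIBIT", "noninvasive MCF7-Ras cells"),
    Triple("miR-31", "ENHANCE", "invasion"),
    Triple("cell viability", "AFFECT", "inhibitor"),
    Triple("miR-31 expression", "DEPRIVE", "metastatic cells"),
]


def index_by_subject(triples):
    """Group (verb, object) pairs under their subject noun phrase."""
    index = defaultdict(list)
    for t in triples:
        index[t.subject].append((t.verb, t.obj))
    return dict(index)


by_subject = index_by_subject(TRIPLES)
for verb, obj in by_subject["miR-31"]:
    print("miR-31", verb, obj)
```

Note what such a structure cannot hold: the triple "miR-31 ENHANCE invasion" has lost the words "Suppression of" that precede it in the source sentence quoted on slide 24, which is one reason the slide answers "Is it true?" with a question mark.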
  • 23. Represent a paper’s Metadiscourse: The preceding observations demonstrated that X expression deprives Y cells of attributes associated with Z. We next asked whether X also prevents the acquisition of A traits by B cells. To do so, we transiently inhibited X in C cells with either D or E. Both approaches inhibited X function by > 4.5-fold (Figure S7A). Suppression of X enhanced invasion by 20-fold and motility by 5-fold, but F was unaffected by either inhibitor (Figure 3A; Figure S7B). The E sponge reduced X function by 2.5-fold, but did not affect the activity of other known Js (Figures S8A and S8B). Collectively, these data indicated that sustained X activity is necessary to prevent the acquisition of Z traits by both K and untransformed B cells. Is it pertinent? -> Need content Is it true? -> Sounds likely! I know this stuff! Is it new, but in agreement with what I know? -> Need content
  • 24. Represent a Paper as a Set of Claims and Evidence: Claim: • sustained miR-31 activity is necessary to prevent the acquisition of aggressive traits by both tumor cells and untransformed breast epithelial cells Evidence: Method: • We transiently inhibited miR-31 in noninvasive MCF7-Ras cells with either antisense oligonucleotides or miRNA sponges. Evidence: Result: • Both approaches inhibited miR-31 function by >4.5-fold (Figure S7A). • Suppression of miR-31 enhanced invasion by 20-fold and motility by 5-fold, but cell viability was unaffected by either inhibitor (Figure 3A; Figure S7B). • The miR-31 sponge reduced miR-31 function by 2.5-fold, but did not affect the activity of other known antimetastatic miRNAs (Figures S8A and S8B). Is it pertinent? -> Probably Is it true? -> Sounds likely! Is it new, but in agreement with what I know? -> Check/know
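One way to picture the claim-and-evidence representation on slide 24 is as a claim record that links to typed evidence items, each pointing at the figures that back it. The Python sketch below is only an illustration under that assumption: the `Claim` and `Evidence` classes, their fields, and the `figures` method are hypothetical names, while the quoted text and figure labels are taken from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    """A method or result statement supporting a claim (hypothetical schema)."""
    kind: str                      # "method" or "result"
    text: str
    figures: list = field(default_factory=list)


@dataclass
class Claim:
    """A claim plus the evidence items it is linked to."""
    text: str
    evidence: list = field(default_factory=list)

    def figures(self):
        """Return every figure cited by the evidence, in order, without repeats."""
        seen = []
        for item in self.evidence:
            for fig in item.figures:
                if fig not in seen:
                    seen.append(fig)
        return seen


claim = Claim(
    text=("sustained miR-31 activity is necessary to prevent the acquisition "
          "of aggressive traits"),
    evidence=[
        Evidence("method", "We transiently inhibited miR-31 in noninvasive "
                           "MCF7-Ras cells."),
        Evidence("result", "Both approaches inhibited miR-31 function by "
                           ">4.5-fold.", ["Figure S7A"]),
        Evidence("result", "Suppression of miR-31 enhanced invasion by 20-fold "
                           "and motility by 5-fold.", ["Figure 3A", "Figure S7B"]),
        Evidence("result", "The miR-31 sponge reduced miR-31 function by "
                           "2.5-fold.", ["Figure S8A", "Figure S8B"]),
    ],
)

print(claim.figures())
```

Keeping the figure references attached to each result is what lets a reader move from the claim to the data behind it, which is the access path the slide argues for.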
  • 25. Show who wrote it, and where: Is it pertinent? -> Possibly Is it true? Is it new, but in agreement with what I know? -> Need background -> Probably!
  • 26. So we probably need all of these: • Surface code provides noun phrases and triples that offer pointers re. topical relevance • Text base and situation model are created through specific metadiscourse conventions (e.g. refs at the end) that create a biological reasoning model: • This can be expressed as a set of claims, linked to evidence, that can help represent key points in the paper • Journal name and author’s affiliation help define schema and provide ‘willingness to be convinced’ socially/interpersonally. Example: We next asked whether … To do so, we transiently inhibited… Suppression of X enhanced invasion … but F was unaffected …(Figure 3A). … Collectively, these data indicated that … . Clause types: Hypothesis, Goal/Method, Result, Results, Implication
  • 27. But wait: there’s a wolf in the woods! This article has been retracted: please see Elsevier Policy on Article Withdrawal (http://www.elsevier.com/locate/withdrawalpolicy). This article has been retracted at the request of the authors. Our study reported that miR-31 is a regulator of multiple mRNAs important for different aspects of breast cancer metastasis. We recently identified concerns with several figure panels in which original data were compiled from different replicate experiments in order to assemble the presented figure. The scope of the figure preparation issues includes compiling data from independent experiments to present them as one internally controlled experiment, statistical analyses based on technical replicates that are not reflective of the biological replicates, and comparisons of selectively chosen data points from multiple experiments. As many of the published figures are therefore not appropriate or accurate representations of the original data, we believe that the responsible course of action is to retract the paper. We apologize for any inconvenience we have caused.
  • 28. In Summary: 1. Discourse Comprehension 101 2. Story grammars and the Cycle of Scientific Investigation 3. How can we help scientists read? – Tools that ‘read’ papers and allow easy access to claims and evidence – Tools and practices that record data (=evidence) throughout the practice of creating it – Tools that help us make sense out of all of this networked knowledge – Cultural habits to support these practices.
  • 29. For Change to Occur, We Need Networks of Collaboration: Force11: – Multi-stakeholder, member-driven organisation – Unites scholars, tool developers, librarians, publishers, funding agencies etc. – E.g.: RRID initiative just got implemented in Cell: “STAR Methods: Structured, Transparent, Accessible Reporting.” National Data Service: – Multi-stakeholder group, based around supercomputing centres – Aims to be a ‘connective tissue’ between data creation, curation, storage etc. projects. – Inviting Pilots: two or more partners who have not worked together, interested in collaborating on a data-centric project to solve a real-world need – E.g. Datasearch, Data Linking systems RDA: – Co-leading Data publishing, linking group – Co-lead Cost Recovery group, part of RDA US Sustainability effort – Active in Chemistry, Earth Science groups, starting IG on Data Search – SciDataCon, Sept 11-16, Denver, CO
  • 30. Anita de Waard VP Research Data Collaborations Research Data Management Services, Elsevier a.dewaard@elsevier.com And we all live happily ever after….
  • 32. Noun Phrases: some issues • Problem 1: disambiguating terms (© GoPubMed): – Hnrpa1 = Tis = Fli-2 = nuclear ribonucleoprotein A1 = helix destabilizing protein = single-strand binding protein = hnRNP core protein A1 = HDP-1 = topoisomerase-inhibitor suppressed. – Cellulose 1,4-beta-cellobiosidase = exoglucanase – COLD ≠ C.O.L.D. ≠ cold (runny nose) ≠ cold (low T) • Problem 2: disambiguating entities (© M. Martone): – 95 antibodies were (manually!) identified in 8 articles – 52 did not contain enough information to determine the antibody used – Some provided details in other papers – Failed to give species, clonality, vendor, or catalog number
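The term-disambiguation problem on slide 32 amounts to mapping many surface forms onto one identifier. A minimal Python sketch of such a lookup follows. It is an assumption-laden illustration, not how GoPubMed works: the choice of "Hnrpa1" as the canonical name, the `SYNONYMS` table layout, and the `normalize` function are all hypothetical; only the synonym strings come from the slide.

```python
# Synonym sets copied from slide 32. Picking the first name as the canonical
# form is an assumption made for illustration only.
SYNONYMS = {
    "Hnrpa1": [
        "Tis",
        "Fli-2",
        "nuclear ribonucleoprotein A1",
        "helix destabilizing protein",
        "single-strand binding protein",
        "hnRNP core protein A1",
        "HDP-1",
        "topoisomerase-inhibitor suppressed",
    ],
    "Cellulose 1,4-beta-cellobiosidase": ["exoglucanase"],
}


def build_lookup(synonyms):
    """Invert the synonym table into a surface-form -> canonical-name map."""
    lookup = {}
    for canonical, alternatives in synonyms.items():
        lookup[canonical] = canonical
        for alt in alternatives:
            lookup[alt] = canonical
    return lookup


LOOKUP = build_lookup(SYNONYMS)


def normalize(term):
    """Return the canonical name for a term, or None if it is unknown.

    The match is deliberately case-sensitive: the slide's COLD / C.O.L.D. /
    cold example shows that case and punctuation can carry meaning.
    """
    return LOOKUP.get(term)


print(normalize("HDP-1"))
print(normalize("cold"))
```

A flat lookup like this handles the synonym half of the problem only. It cannot separate "cold (runny nose)" from "cold (low T)", since both share one surface form and need context to tell apart.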
  • 33. Noun Phrases: some progress • Despite these difficulties, noun phrase recall/precision is quite high, e.g. i2b2 2011 [1], [2], others: 90%-98% • Many tools, see [3] for a list; e.g. GoPubMed
  • 34. Triples: some issues: • Contingent on good NP & VP detection • Hard to parse text! E.g. a commercial tool gave: – “insulin / maintaining / glucose homeostasis” for: “When insulin secretion cannot be increased adequately (type I diabetes defect) to overcome insulin resistance in maintaining glucose homeostasis, hyperglycemia and glucose intolerance ensues.” – “insulin / may be involved / glucose homeostasis” for: “Because PANDER is expressed by pancreatic beta-cells and in response to glucose in a similar way to those of insulin, PANDER may be involved in glucose homeostasis.”
  • 35. Triples: some progress: Biological Expression Language [4]:
    “We provide evidence that these miRNAs are potential novel oncogenes participating in the development of human testicular germ cell tumors by numbing the p53 pathway, thus allowing tumorigenic growth in the presence of wild-type p53.”
    – Increased abundance of miR-372 decreases activity of TP53: r(MIR:miR-372) -| tscript(p(HUGO:Trp53))
    – Context: cancer: SET Disease = "Cancer"
    – Activity of TP53 decreases cell growth: tscript(p(HUGO:Trp53)) -| bp(GO:"Cell Growth")
  • 36. Metadiscourse: why it matters
      • Voorhoeve et al., 2006: “These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor suppressor LATS2.”
      • Kloosterman and Plasterk, 2006: “In a genetic screen, miR-372 and miR-373 were found to allow proliferation of primary human cells that express oncogenic RAS and active p53, possibly by inhibiting the tumor suppressor LATS2 (Voorhoeve et al., 2006).”
      • Yabuta et al., 2007: “[On the other hand,] two miRNAs, miRNA-372 and -373, function as potential novel oncogenes in testicular germ cell tumors by inhibition of LATS2 expression, which suggests that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).”
      • Okada et al., 2011: “Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the expression of Lats2, thereby allowing tumorigenic growth in the presence of p53 (Voorhoeve et al., 2006).”
      “[Y]ou can transform .. fiction into fact just by adding or subtracting references”, Bruno Latour [5]
  • 37. Adding Metadiscourse To Triples (columns: Claim | ORCA Value)
      – Together, Lats2 and ASPP1 shunt p53 to proapoptotic promoters and promote the death of polyploid cells [1]. (…) | Value = 3, Source = N, Basis = 0
      – Further biochemical characterization of hMOBs showed that only hMOB1A and hMOB1B interact with both LATS1 and LATS2 in vitro and in vivo [39]. (…) | Value = 3, Source = N, Basis = Data
      – Our findings reveal that miR-373 would be a potential oncogene and it participates in the carcinogenesis of human esophageal cancer by suppressing LATS2 expression. | Value = 1 or 2?, Source = Author, Basis = Data
      – Furthermore, we demonstrated that the direct inhibition of LATS2 protein was mediated by miR-373 and manipulated the expression of miR-373 to affect esophageal cancer cells growth. | Value = 2 (or 3?), Source = Author, Basis = Data
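The annotations on slide 37 attach three fields (Value, Source, Basis) to each claim, and two of the four Value entries are marked as uncertain. One way to keep that uncertainty explicit is to store candidate values rather than a single number. A minimal sketch: the class `AnnotatedClaim`, its field names, and the shortened claim texts are assumptions for illustration, not an ORCA schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AnnotatedClaim:
    text: str                # shortened claim text (illustrative)
    values: Tuple[int, ...]  # candidate Value scores; several if unsure
    source: str              # "N" or "Author", as on the slide
    basis: str               # "0" or "Data", as on the slide


CLAIMS = [
    AnnotatedClaim("Lats2 and ASPP1 shunt p53 to proapoptotic promoters", (3,), "N", "0"),
    AnnotatedClaim("only hMOB1A and hMOB1B interact with LATS1 and LATS2", (3,), "N", "Data"),
    AnnotatedClaim("miR-373 would be a potential oncogene", (1, 2), "Author", "Data"),
    AnnotatedClaim("inhibition of LATS2 protein was mediated by miR-373", (2, 3), "Author", "Data"),
]


def ambiguous(claims):
    """Return the claims whose Value annotation has more than one candidate."""
    return [c for c in claims if len(c.values) > 1]


for claim in ambiguous(CLAIMS):
    print(claim.values, claim.source, claim.text)
```

In this small sample, both ambiguous entries are the author's own claims, while the two cited claims each received a single value.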
  • 38. Claims and Evidence: some issues:
      • Data2Semantics [11]: linking clinical guidelines to evidence. Inconsistency within guideline and guidelines vs. evidence:
        – Studies have demonstrated inconsistent results regarding the use of such markers of inflammation as C-reactive protein (CRP), interleukins-6 (IL-6) and -8, and procalcitonin (PCT) in neutropenic patients with cancer [55–57].
        – [55]: PCT and IL-6 are more reliable markers than CRP for predicting bacteremia in patients with febrile neutropenia
        – [56]: In conclusion, daily measurement of PCT or IL-6 could help identify neutropenic patients with a stable course when the fever lasts >3 d. …, it would reduce adverse events and treatment costs.
        – [57]: Our study supports the value of PCT as a reliable tool to predict clinical outcome in febrile neutropenia.
      • Drug Interaction Knowledgebase [12]: how to identify evidence?
        – R-citalopram_is_not_substrate_of_cyp2c19:
        – At 10 µM R- or S-CT, ketoconazole reduced reaction velocity to 55-60% of control, quinidine to 80%, and omeprazole to 80-85% of control (Fig. 6).
  • 39. Claims and Evidence: some progress • Defining ‘salient knowledge components’ in text: – Argumentative zones, CoreSC can both be found – Blake, Claim networks (more soon!) – Claimed Knowledge Updates (Sandor/de Waard, [13]):