The document discusses structuring evidence from online discussions to synthesize knowledge. It describes annotating a corpus of Wikipedia deletion discussions to identify key decision factors for determining what content should be included. These factors were then used to build a computer system that semantically enriches the discussion data and generates a summary organized by decision factor. A user test found the experimental system was preferred as it made the discussions and decisions easier to understand. The process demonstrates how identifying a community's evidence criteria can help structure information and support knowledge synthesis.
How communities curate knowledge & how ontologists can help - EURECOM, 2015-01-19 - jodischneider
Invited talk given 2015-01-19 at EURECOM.
Two themes:
How do communities curate knowledge?
and
How can information technology help?
Q: How do communities curate knowledge?
A: Communities curate knowledge by discussing evidence and applying community standards to it.
In Wikipedia, 4 questions are used to evaluate borderline articles:
Notability – Is the topic appropriate for our encyclopedia?
Sources – Is the article well-sourced?
Maintenance – Can we maintain this article?
Bias – Is the article neutral? Is each point of view appropriately weighted?
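As a rough illustration of organizing evidence by these four decision factors, discussion comments could be bucketed under each question with keyword cues. The cue lists and function names below are invented for illustration; the actual system relied on manual annotation and semantic enrichment, not keyword matching.

```python
# Hypothetical sketch: bucket deletion-discussion comments under the four
# decision factors using keyword cues. The cue lists are invented for
# illustration; the actual system used manual annotation, not keywords.
FACTOR_CUES = {
    "Notability": ["notable", "notability", "significan"],
    "Sources": ["source", "citation", "reference"],
    "Maintenance": ["maintain", "cleanup", "update"],
    "Bias": ["neutral", "pov", "bias"],
}

def tag_factors(comment):
    """Return the decision factors whose cue words appear in the comment."""
    text = comment.lower()
    return [factor for factor, cues in FACTOR_CUES.items()
            if any(cue in text for cue in cues)]

def summarize(comments):
    """Group comments under each decision factor they mention."""
    summary = {factor: [] for factor in FACTOR_CUES}
    for comment in comments:
        for factor in tag_factors(comment):
            summary[factor].append(comment)
    return summary
```

A comment can invoke more than one factor ("no reliable sources and not notable"), so the grouping deliberately allows a comment to appear under several headings.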
Q: How can information technology help?
A: Information technology can organize evidence based on the criteria communities use.
In Wikipedia, we developed an alternate interface for deletion discussions.
Envisioning argumentation and decision making support for debates in open online collaboration communities - jodischneider
Paper for the First Workshop on Argumentation Mining at the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, Maryland, June 26, 2014.
Abstract:
Argumentation mining, a relatively new area of discourse analysis, involves automatically identifying and structuring arguments. Following a basic introduction to argumentation, we describe a new possible domain for argumentation mining: debates in open online collaboration communities. Based on our experience with manual annotation of arguments in debates, we envision argumentation mining as the basis for three kinds of support tools, for authoring more persuasive arguments, finding weaknesses in others’ arguments, and summarizing a debate’s overall conclusions.
Full paper:
http://jodischneider.com/pubs/aclargmining2014.pdf
Proceedings with links:
http://acl2014.org/acl2014/W14-21/index.html
Workshop homepage:
http://www.uncg.edu/cmp/ArgMining2014/
An informatics perspective on argumentation mining - SICSA, 2014-07-09 - jodischneider
Informal talk for the SICSA argumentation mining workshop: http://www.arg-tech.org/index.php/sicsa-workshop-on-argument-mining-2014/
For more details, see two related papers:
(1) Automated argumentation mining to the rescue? Envisioning argumentation and decision-making support for debates in open online collaboration communities.
ACL First Workshop on Argumentation Mining (summary of my PhD work)
http://jodischneider.com/pubs/aclargmining2014.pdf
(2) Modeling Arguments in Scientific Papers
Jodi Schneider, Carol Collins, Lisa Hines, John R Horn and Richard Boyce
ArgDiaP conference
http://jodischneider.com/pubs/argdiap2014.pdf
Talking is (virtual) work: supporting online argumentation - 2013-09-18, Malta ... - jodischneider
In open collaboration systems, work gets done through talking. We support a particular kind of talk-based work, deletion discussions in Wikipedia, by categorizing and summarizing discussions. In a user test, 84% of participants found this beneficial.
This talk about my thesis was given 2013-09-18 in Malta at the Virtual Work training school:
http://dynamicsofvirtualwork.com/malta-training-school/
part of the COST action on Virtual Work
http://cost.eu/domains_actions/isch/Actions/IS1202
Slides for a workshop session on "Open Knowledge: Wikipedia and Beyond" facilitated by Brian Kelly and Simon Grant, Cetis at the Cetis 2014 conference at the University of Bolton on 17-18 June 2014.
See http://ukwebfocus.wordpress.com/events/cetis-2014-open-knowledge-wikipedia-and-beyond/
Wikipedia, the encyclopedia that anyone can edit, “can never work in theory, only in practice.” Accounting for one in every 200 page views on the Internet, it has become a part of our everyday lives. Wikipedia is changing the way we think about the economics of the web, the potential and the pitfalls of engaging the masses, and the role of professional information architects in a world in which content arrives from literally every direction.
In this session, we’ll explore the nuts-and-bolts of how the Wikipedia project works. Who writes Wikipedia, and why? How does the English Wikipedia maintain quality, consistent tagging, and coherent organization across over two million articles? What happens when contributors disagree? We will take a tour behind the scenes at Wikipedia to learn what happens when users are encouraged to - as they say on Wikipedia… “be bold.”
Musings at the Crossroads of Digital Libraries, Information Retrieval, and Scientometrics - Guillaume Cabanac
Digital documents support and shape people’s daily activities. Regarding Computer Science, such documents are the cornerstone of two areas of research: Digital Libraries and Information Retrieval. In this presentation, we discuss the research questions that we addressed in these areas, such as:
* Digital Libraries:
- How to transpose paper-based annotations into digital documents?
- How to measure the social validity of a statement according to the argumentative discussion it sparked off?
- How to harness a quiescent capital present in any organization: its documents?
* Information Retrieval
- Is document tie-breaking affecting the evaluation of Information Retrieval systems?
- How to retrieve documents matching keywords and spatiotemporal constraints?
- Do operators in search queries (e.g., ‘+’, ‘^’) improve the effectiveness of search results?
Each question gives us the opportunity to recall background knowledge, such as how to evaluate the effectiveness of a search engine.
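On that piece of background knowledge: search effectiveness is commonly summarized with set-based measures such as precision and recall at a cutoff k. A minimal sketch (function names are ours, not from the talk):

```python
# Background-knowledge sketch: two standard effectiveness measures for a
# ranked result list. `ranked` is the system's ordering of document ids;
# `relevant` is the set of documents judged relevant for the query.
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents retrieved within the top k."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / len(relevant)
```

Precision rewards returning only relevant documents near the top; recall rewards finding all of them, so the two are usually reported together.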
Finally, we discuss some of our works related to Scientometrics, which may be defined as the study of science with scientific methods. We applied techniques of Information Retrieval to documents extracted from scientific Digital Libraries. We plan to introduce our findings to the following questions:
- How to recommend researchers according to their research topics and social clues?
- What is the landscape of research in Information Systems from the perspective of gatekeepers?
- What if submission date influenced the acceptance of conference papers?
Through this journey at the crossroads of Digital Libraries, Information Retrieval, and Scientometrics, we wish to pass on our enthusiasm for these subjects to academics and students alike.
Presented this deck numerous times at Wikimania, universities, and First Monday; this final version is for Barcamp NorthEast at Newcastle, May 2008. Questions or comments: http://cathyma.com
Where are all the Semantic Web agents? There are billions of "machine readable" open facts on the Semantic Web, i.e. Linked Open Data (LOD), isn't that enough? It looks like it's not. We're still far from seeing Lucy's and Pete's agents brilliantly solving their tasks with the help of other Semantic Web agents they can trust (Tim Berners Lee et al., The Semantic Web, Scientific American (2001) ). Despite its technological impact on many applications and areas, the Semantic Web promised to cause a breakthrough that we didn't yet experience. One issue is that LOD ontologies are not as linked as they should be. Another issue is that formalising only semi-structured Web pages or databases is not enough for making them able to operate. They also need to reason with commonsense knowledge, the encoding of which is a long-standing challenge in Artificial Intelligence. A third consideration is that most existing commonsense knowledge bases lack formal semantics and situational constraints. In this talk I will advocate the role of the Semantic Web as a provider of a knowledge graph of commonsense to Artificial Intelligence, and discuss ways and obstacles towards the achievement of this goal.
User Interests Identification From Twitter using Hierarchical Knowledge Base - Pavan Kapanipathi
Twitter, due to its massive growth as a social networking platform, has been in focus for the analysis of its user-generated content for personalization and recommendation tasks. A common challenge across these tasks is identifying user interests from tweets. Semantic enrichment of Twitter posts, to determine user interests, has been an active area of research in the recent past. These approaches typically use available public knowledge bases (such as Wikipedia) to spot entities and create entity-based user profiles. However, exploitation of such knowledge bases to create richer user profiles is yet to be explored. In this work, we leverage hierarchical relationships present in knowledge bases to infer user interests expressed as a Hierarchical Interest Graph. We argue that the hierarchical semantics of concepts can enhance existing systems to personalize or recommend items based on a varied level of conceptual abstractness. We demonstrate the effectiveness of our approach through a user study which shows an average of approximately eight of the top ten weighted hierarchical interests in the graph being relevant to a user's interests.
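One way such hierarchical inference could work, sketched under our own assumptions rather than as the paper's exact algorithm: entity-level interest scores are propagated up an (assumed acyclic) category hierarchy with a decay factor, so broader concepts accumulate weight from the specific entities beneath them.

```python
# Illustrative sketch, not the paper's exact algorithm: propagate entity-level
# interest scores up a category hierarchy (assumed acyclic) with a decay
# factor, so broader concepts accumulate weight from specific entities.
def propagate_interests(entity_scores, parents, decay=0.5):
    """entity_scores: {node: weight}. parents: {node: [parent, ...]}.
    Returns a weight for every node reached while walking up the hierarchy."""
    scores = dict(entity_scores)
    frontier = list(entity_scores.items())
    while frontier:
        node, weight = frontier.pop()
        for parent in parents.get(node, []):
            contribution = weight * decay
            scores[parent] = scores.get(parent, 0.0) + contribution
            frontier.append((parent, contribution))
    return scores
```

The decay keeps very abstract ancestors from dominating: an entity contributes less and less interest the further up the hierarchy its influence travels.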
A basic wiki primer. Examples are nonprofit but info applies to any organization. Covers:
What are the attributes of a wiki?
How do wikis differ from other commonly used communication and collaboration tools?
What kind of problems can a wiki solve? What are its uses?
What are examples of these use cases? (with screenshots)
How can you build a successful wiki?
Continued citation of bad science and what we can do about it - 2021-04-20 - jodischneider
Continued Citation of Bad Science and What We Can Do About It
Even papers that falsify data continue to be cited. I describe network and text analysis for studying how authors continue to cite bad science: articles retracted from the literature due to serious flaws or errors. I will present an in-depth case study of a human trial cited for over 10 years after it was retracted for falsifying data. Then, I will describe how the team scaled up to study a data set of 7000 retracted papers and hundreds of thousands of citations. Finally, I will discuss an ongoing Sloan-funded stakeholder consultation that is bringing editors, publishers, librarians, researchers, and research integrity experts together to address this problem.
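The core measurement behind such studies can be sketched in miniature: given citation records with publication years and a paper's retraction year, count and rate the citations that arrive after retraction. Record shapes and names below are illustrative, not the study's actual data format.

```python
# Hedged sketch of the basic measurement: which citations to a paper arrive
# after its retraction year, and at what rate. Record shapes are illustrative.
def post_retraction_citations(citations, retraction_year):
    """citations: iterable of (citing_paper, year) pairs.
    Returns the citations dated strictly after the retraction year."""
    return [(paper, year) for paper, year in citations
            if year > retraction_year]

def post_retraction_rate(citations, retraction_year):
    """Share of all citations that occur after retraction (0.0 if none)."""
    citations = list(citations)
    if not citations:
        return 0.0
    return len(post_retraction_citations(citations, retraction_year)) / len(citations)
```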
Biography: Jodi Schneider is Assistant Professor at the School of Information Sciences, University of Illinois at Urbana-Champaign, where she runs the Information Quality Lab. She studies the science of science through the lens of arguments, evidence, and persuasion with a special interest in controversies in science. Her recent work has focused on topics such as systematic review automation, semantic publication, and the citation of retracted papers. Interdisciplinarity (PhD in Informatics, MS Library & Information Science, MA Mathematics; BA Great Books/liberal arts) is a fundamental principle of her work. She has held research positions across the U.S. as well as in Ireland, England, France, and Chile. She leads the Alfred P. Sloan-funded project, Reducing the Inadvertent Spread of Retracted Science: Shaping a Research and Implementation Agenda. With Aaron Cohen and Neil Smalheiser she is working on the NIH R01 "Text Mining Pipeline to Accelerate Systematic Reviews in Evidence-Based Medicine". Talk with her about scoping reviews and about citation-based methods for updating systematic reviews!
Tuesday, April 20th, 2021
Noon-1PM Eastern
GWU - CNHS Informatics Seminar
Continued citation of bad science and what we can do about it - 2021-02-19 - jodischneider
Title: Continued Citation of Bad Science and What We Can Do About It
Abstract: Even papers that falsify data continue to be cited. Jodi describes network and text analysis for studying how authors continue to cite bad science: articles retracted from the literature due to serious flaws or errors. She will present an in-depth case study of a human trial cited for over 10 years after it was retracted for falsifying data. Then, she will describe how the team scaled up to study a data set of 7,000 retracted papers and hundreds of thousands of citations. Finally, she will discuss an ongoing Sloan-funded stakeholder consultation that is bringing editors, publishers, librarians, researchers, and research integrity experts together to address this problem.
The problems of post-retraction citation - and mitigation strategies that work - jodischneider
Presentation for the Bibliometrics & Research Assessment Symposium 2020 (bibSymp20) https://www.nihlibrary.nih.gov/services/bibliometrics/bibSymp20
October 9, 2020
Retraction is intended to remove articles from the citable literature. However, a series of studies spanning more than 30 years, from 1990 through 2020, have found that many retracted papers continue to be cited, and cited positively, even following misconduct-related retractions. For instance, a fraudulent clinical trial report retracted in 2008 continues to receive citations in 2020, and 96% of post-retraction citations do not mention its retraction - perhaps because the retraction is not marked on the publisher website and the retraction notice cannot be readily retrieved from 7 out of 8 databases (8 out of 9 database records) we tested. This talk draws on an ongoing systematic mapping study of research about retraction and our own research projects to summarize what is known about post-retraction citation in biomedicine. We outline practical steps that authors and reviewers can take to avoid being caught out by poorly marked retracted papers.
20 minutes including Q&A
Towards knowledge maintenance in scientific digital libraries with the keystone framework - jodischneider
JCDL2020 full paper.
Abstract:
Scientific digital libraries speed dissemination of scientific publications, but also the propagation of invalid or unreliable knowledge. Although many papers with known validity problems are highly cited, no auditing process is currently available to determine whether a citing paper’s findings fundamentally depend on invalid or unreliable knowledge. To address this, we introduce a new framework, the keystone framework, designed to identify when and how citing unreliable findings impacts a paper, using argumentation theory and citation context analysis. Through two pilot case studies, we demonstrate how the keystone framework can be applied to knowledge maintenance tasks for digital libraries, including addressing citations of a non-reproducible paper and identifying statements most needing validation in a high-impact paper. We identify roles for librarians, database maintainers, knowledge base curators, and research software engineers in applying the framework to scientific digital libraries.
doi:10.1145/3383583.3398514
Preprint: http://jodischneider.com/pubs/jcdl2020.pdf
Methods Pyramids as an Organizing Structure for Evidence-Based Medicine - SIGCM - jodischneider
Keynote talk 2020-08-01 for the JCDL Workshop on Conceptual Models: https://sig-cm.github.io/news/JCDL-2020-CFP/
Discussion points:
* Methods are a key part of the Knowledge Organizing Structure for Evidence-Based Medicine.
* Methods relate to how we GENERATE evidence.
* Different methods generate evidence of different kinds and strength.
* I believe Methods can be useful in mining claims and arguments from papers: methods AUTHORIZE claims.
* More specialized hierarchies of evidence can be found in medicine
* Various groups are complicating the “evidence pyramid” hierarchy of evidence.
Annotation examples. This is an overview of some of the software I have used for annotation (and a few extra features some of this software has). This was presented in the SwissUniversities Doctoral Programme, Language & Cognition, in the Module: Linguistic and corpus perspectives on argumentative discourse.
Screenshots are given of GATE, UAM Corpus Tool, Excel, BRAT, EPPI Reviewer, and a custom tool. In most cases there are references to one of my papers for further details.
I briefly describe a typical annotation process:
Find text of interest
Find phenomena of interest
Draft an annotation manual
Iteratively test annotation & revise manual
Find questionable annotations, check disagreements.
Revise the manual.
Iterate.
Annotate
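The "check disagreements" step in this process is typically quantified with an inter-annotator agreement statistic. A self-contained sketch of Cohen's kappa for two annotators (illustrative only; the annotation tools mentioned above often compute this for you):

```python
# Sketch of quantifying annotator disagreement: Cohen's kappa for two
# annotators labeling the same items, computed from scratch.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two equal-length label sequences.
    Undefined (division by zero) when expected agreement equals 1."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum((freq_a[label] / n) * (freq_b[label] / n)
                   for label in set(labels_a) | set(labels_b))
    return (observed - expected) / (1 - expected)
```

Kappa of 1 means perfect agreement and 0 means agreement no better than chance; interpretation thresholds vary by field, which is one reason the manual gets revised and the annotation iterated.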
Argumentation mining: an introduction for linguists - Fribourg, 2019-09-02 - jodischneider
An introduction to argumentation mining for PhD students. This was presented in the SwissUniversities Doctoral Programme, Language & Cognition, in the Module: Linguistic and corpus perspectives on argumentative discourse. The presentation largely follows Chapters 1-4 and Chapter 10 of my book, Argumentation Mining, co-authored with Manfred Stede in the Synthesis Lectures on Human Language Technologies from Morgan & Claypool: https://doi.org/10.2200/S00883ED1V01Y201811HLT040
Topics:
My book w/computational linguist Manfred Stede: Argumentation Mining
What is argumentation?
Argumentation mining: a first look
Argumentative language
Challenges for argumentation mining
Argumentation structures
Corpus annotation
Why study argumentation mining?
Beyond Randomized Clinical Trials: emerging innovations in reasoning about health - jodischneider
Talk at the 3rd European Conference on Argumentation
ABSTRACT: Specialized fields may at any time invent new inference rules—that is, new warrants—to improve on their stock of resources for drawing and defending conclusions. Yet disagreement over the acceptability of an invented warrant can always be re-opened. The Randomized Clinical Trial (RCT) is widely regarded as the gold standard for making inferences about causal relationships between medical treatments and patient outcomes. Once controversial, RCT achieved broad acceptance within the field as a result of warrant-establishing arguments circulating in the medical literature starting in the 1950s. And RCT has accumulated a very impressive track record of generating new conclusions that withstand critical scrutiny.
Here we look at two emerging innovations whose purpose is to support reasoning about health, offering ways to generate different classes of conclusions. These innovations could be seen as complementary to RCTs, but for both there are also hints of challenge to the enormous prestige of RCTs. We see this most particularly in the gap that has developed between the RCT-generated fact base and the decisions doctors and health policy officials have to make about treatments for patients. We’ve mentioned before that specialized inference methods that become stabilized within an expert community can meet unexpected challenges when they become components of reasoning by other communities. The two innovations considered here each allow us to explore the tensions that arise from the contrasting perspectives of scientists, clinicians, and patients.
Publishers are caretakers of science. Part of that work is maintaining the integrity of scientific literature. Science builds directly upon past work, so we need to be sure that we are building upon a solid foundation and not faulty research. Publishers need to take an active role in monitoring and tracking faulty, retracted research and its influence. I'm asking publishers to (1) clearly mark retracted papers; (2) alert authors who have already cited a retracted paper; and (3) before publishing an article, check its bibliography for retracted papers.
Retracted papers should be clearly marked everywhere they appear, but today that is not the case. Publishers can also use the CrossRef CrossMark service, which lets readers check for article updates (such as retraction) from a little red ribbon at the top of an article. Checking for citations to retracted articles, and limiting future citations, can help science self-correct by shoring up its foundations.
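The third recommendation, screening a bibliography against known retractions, amounts to a set lookup; a minimal sketch under our own assumptions (function name and the DOI comparison convention are ours, not from the talk):

```python
# Minimal sketch, under our own assumptions: check each DOI in a
# submission's bibliography against a set of known retracted DOIs.
# DOIs are compared case-insensitively, as DOI matching conventionally is.
def flag_retracted(bibliography_dois, retracted_dois):
    """Return the cited DOIs that appear in the retracted set."""
    retracted = {doi.lower() for doi in retracted_dois}
    return [doi for doi in bibliography_dois if doi.lower() in retracted]
```

In practice the retracted-DOI set would come from a retraction database or publisher metadata; the hard part is keeping that list complete and current, not the lookup itself.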
The structure of citation networks provides evidence about how scientific information diffuses. Problematic citation patterns include citation bias (the selective citation of positive findings) and the continued citation of retracted literature (i.e., literature formally withdrawn due to error, fraud, or ethical problems). For instance, there is some evidence that positive results tend to receive more citations. The public-domain licensing of the Open Citations Corpus makes it possible, in principle, to estimate the likelihood that any network of research papers suffers from problematic citation. To date, problematic citation has been documented ad hoc, in several striking studies. In Alzheimer's disease research, biased citation, ignoring critical findings, was used to support successful U.S. NIH grant proposals (Greenberg 2009). Mistranslation of obesity research has been used to justify exertion game research (Marshall & Linehan 2017). Citation of fraudulent research about Chronic Obstructive Pulmonary Disease continued after its retraction (Fulton et al. 2015). The data resulting from such studies are of great use to my lab in replicating, and determining how to generalize, the detection of problematic citation patterns. Previously, the detection of problematic citation patterns has been a side effect of astute researchers noticing suspicious findings while conducting systematic literature reviews. This talk will describe work in progress in my lab on detecting problematic citation patterns using natural language processing, combined with network analysis on the Open Citations Corpus.
Modeling Alzheimer’s Disease research claims, evidence, and arguments from a ...jodischneider
Presentation: Jodi Schneider and Novejot Sandhu, “Modeling Alzheimer’s Disease Research Claims, Evidence, and Arguments from a Biology Research Paper.” 9th International Conference on Argumentation, International Society for the Study of Argumentation, Amsterdam, Netherlands, July 5, 2018
Abstract: Argument visualization may help make research papers easier to understand, which could both speed quality assessment within a discipline and help build interdisciplinary knowledge networks. This paper presents a case study of the arguments in a single high-profile paper on Alzheimer's disease research. Within this one paper, we analyze and hand-annotate the main claim, which is supported by 4 subclaims, in turn supported by data, methods, and materials. We also investigate how the paper imports and uses knowledge claims from other research papers. We create a specialized argument-based knowledge representation called a micropublication. In future work, we will investigate automatic argumentation mining for experimental biology research papers. Our long-term vision is to create literature-scale claim-argument networks that help more quickly use new knowledge about human health.
Innovations in reasoning about health: the case of the Randomized Clinical Tr...jodischneider
Presentation: Jodi Schneider and Sally Jackson, “Innovations in Reasoning About Health: The Case of the Randomized Clinical Trial.” 9th International Conference on Argumentation, International Society for the Study of Argumentation, Amsterdam, Netherlands, July 5, 2018
Abstract: Field-dependence in argumentation comes about through forms of inference invented by specialized fields. In recent work we introduced the concept of a "warranting device": (1) an inference license (2) invented for a specialized argumentative purpose and (3) backed by institutional, procedural, and material assurances of the dependability of conclusions generated by the device. Once established, fields employ such devices across many situations without further defense, even as the devices develop in response to newly-noticed problems.
Many new warranting devices have appeared over the past century to solve problems in reasoning about health and medicine, replacing earlier forms of medical reasoning. One such device is the randomized clinical trial. This case study traces its historical evolution and discusses some current movements toward competing device types.
Rhetorical moves and audience considerations in the discussion sections of ra...jodischneider
European Conference on Argumentation talk
Jodi Schneider, Graciela Rosemblat, Shabnam Tafreshi and Halil Kilicoglu “Rhetorical moves and audience considerations in the discussion sections of Randomized Controlled Trials of health interventions” [Conference Panel Presentation], 2nd European Conference on Argumentation: Argumentation and Inference, Fribourg, Switzerland, June 20-23
1 of 3 talks in Jodi Schneider and Sally Jackson, organizers, “Innovations in Reasoning and Arguing about Health ”[Conference Panel], 2nd European Conference on Argumentation: Argumentation and Inference, Fribourg, Switzerland, June 20-23.
Citation practices and the construction of scientific fact--ECA-facts-preconf...jodischneider
Citation practices and the construction of scientific fact. Presentation at the European Conference on Argumentation preconference on status, relevance, and authority of facts.
What WikiCite can learn from biomedical citation networks--Wikicite2017--2017...jodischneider
This is a quick, high-level tour of some ideas from evidence-based medicine, and from citation-related ontologies for argumentation and evidence curation in biomedicine.
Medication safety as a use case for argumentation mining, Dagstuhl seminar 16...jodischneider
Medication safety as a use case for argumentation mining
We present a use case for argumentation mining, from biomedical informatics, specifically from medication safety. Tens of thousands of preventable medical errors occur in the U.S. each year, due to limitations in the information available to clinicians. Current knowledge sources about potential drug-drug interactions (PDDIs) often fail to provide essential management recommendations and differ significantly in their coverage, accuracy, and agreement. The Drug Interaction Knowledge Base Project (Boyce, 2006-present; dikb.org) is addressing this problem.
Our current work uses knowledge representations and human annotation to represent clinically relevant claims and evidence. Our data model incorporates an existing argumentation-focused ontology, the Micropublications Ontology. Further, to describe more specific information, such as the types of studies that allow inference of a particular type of claim, we are developing an evidence-focused ontology called DIDEO (the Drug-drug Interaction and Drug-drug Interaction Evidence Ontology). On the curation side, we will describe how our research team is hand-extracting knowledge claims and evidence from the primary research literature, case reports, and FDA-approved drug labels for 65 drugs.
We think that medication safety could be an important domain for applying automatic argumentation mining in the future. In discussions at Dagstuhl, we would like to investigate how current argumentation mining techniques might be used to scale up this work. We can also discuss possible implications for representing evidence from other biomedical domains.
Talk for Dagstuhl Seminar 16161: Natural Language Argumentation: Mining, Processing, and Reasoning over Textual Arguments
http://www.dagstuhl.de/en/program/calendar/semhp/?semnr=16161
Acquiring and representing drug-drug interaction knowledge and evidence, Litm...jodischneider
Presentation to Diane Litman's lab at the University of Pittsburgh about modeling and acquiring evidence for the Drug Interaction Knowledge Base (DIKB) project.
Persons, documents, models: organising and structuring information for the We...jodischneider
A talk for the Moore Institute for Humanities.
People and documents are of enduring interest. Documents may be generated by individuals, collective groups, and administrations, on any number of topics. We are particularly interested in the relationships between people and documents. The most important relationships are creation (authors, illustrators, translators, ...), usage (e.g. association copies), and topic-of (e.g. people may be the subjects of biographies).
In this lecture, we will talk about several approaches for modeling, or representing, people and documents. We pay particular attention to computer-based approaches to organization, and to organizing information for websites. We will talk briefly about TEI and XML, and then focus on my area of research expertise: modeling "linked data", a widely adopted approach for interlinking data. Adopted by the UK and US governments and by search engines such as Google and Yahoo!, linked data has also been widely used in the digital humanities and by libraries, archives, and museums. It consists of naming objects of interest (be they authors, documents, or whatnot) and using standard data formats to enable interlinking.
2. Overview
o My Background & Research Themes
o Structuring Evidence in Wikipedia Discussions
o Supporting Systematic Review of Biomedical Evidence
3. Themes in My Research
o How do people collaborate to generate knowledge?
o What counts as evidence in a given community?
o How can structuring evidence help synthesize info?
4. What knowledge should be included in Wikipedia?
Jodi Schneider, Krystian Samp, Alexandre Passant, and Stefan Decker. “Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups.” In CSCW 2013.
Jodi Schneider and Krystian Samp. “Alternative Interfaces for Deletion Discussions in Wikipedia: Some Proposals Using Decision Factors. [Demo]” In WikiSym 2012.
Jodi Schneider, Alexandre Passant, and Stefan Decker. “Deletion Discussions in Wikipedia: Decision Factors and Outcomes.” In WikiSym 2012.
13. Problem: Newcomers are confused about Wikipedia’s standards.
o “Why should a local cricket club not have it's own page on this website? Obviously a valid club and been established for a while. Nothing offensive or false on the page. All need to do is put in Emsworth Cricket Club into a search engine and information comes up. Why just because it is a small team and not major does it not deserve it's own page on here?” (sic)
o “At the end of the day the club has history which being 200 years is just as special as a article on a breed of dog or something similar.”
o “really is worth a mention. Especially on a website, where pointless people ... gets a mention.” (sic)
18. Problem Summary
o Long, no-consensus discussions → Summarize discussions
o Newcomers are confused about Wikipedia's standards → Make article criteria more explicit
19. Approach: Structure Evidence
1. Understand what evidence the community uses to establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test and refine the system.
21. Sample Corpus
o 72 discussions started on 1 day.
Each discussion has
• 3–33 messages
• 2–15 participants
o In total, 741 messages contributed by 244 users.
Each message has
• 3–350+ words
o 98 printed A4 sheets
22. Structuring the Data: Annotation
o Content analysis of the corpus
o Compare two different annotation approaches
o Iterative annotation
• Multiple annotators
• Refine to get good inter-annotator agreement
• 4 rounds of annotation
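The inter-annotator agreement refined here is commonly measured with Cohen's kappa, which corrects raw agreement for chance. A minimal sketch (the two annotators' factor labels below are invented for illustration, not the study's actual data):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators label identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[l] * freq_b[l] for l in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical decision-factor labels for ten comments:
a = ["Notability", "Sources", "Sources", "Bias", "Other",
     "Notability", "Sources", "Maintenance", "Bias", "Sources"]
b = ["Notability", "Sources", "Bias", "Bias", "Other",
     "Notability", "Sources", "Maintenance", "Sources", "Sources"]
print(round(cohens_kappa(a, b), 3))
```

Iterating annotation rounds until kappa is acceptably high is one standard way to operationalize "refine to get good inter-annotator agreement".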
23. 2 Types of Annotation
o 1. Walton’s Argumentation Schemes (Walton, Reed, and Macagno 2008)
• Informal argumentation (philosophical & computational argumentation)
• Identify & prevent errors in reasoning (fallacies)
• 60 patterns
o 2. Factors Analysis (Ashley 1991)
• Case-based reasoning
• E.g. factors for deciding cases in trade secret law, favoring either party (the plaintiff or the defendant).
26. Factor Example (used to justify ‘keep’)
4 Key Factors (& “Other”):
Notability – Anyone covered by another encyclopedic reference is considered notable enough for inclusion in Wikipedia.
Sources – Basic information about this album at a minimum is certainly verifiable, it's a major label release, and a highly notable band.
Maintenance – …this article is savable but at its current state, needs a lot of improvement.
Bias – It is by no means spam (it does not promote the products).
Other – I'm advocating a blanket “hangon” for all articles on newly-drafted players…
Jodi Schneider, Alexandre Passant & Stefan Decker, “Deletion Discussions in Wikipedia: Decision Factors and Outcomes”
27. Decision factors articulate values/criteria.
o 4 Factors in Deletion Discussions cover:
• 91% of comments
• 70% of discussions
o Readers who understand these criteria:
• Understand what content is appropriate.
• Are less likely to have content deleted, and less likely to take deletion personally.
28. To structure the data, we chose factors.
o 1. Walton’s Argumentation Schemes (Walton, Reed, and Macagno 2008)
• Most appropriate for writing support
• 15 categories + 2 non-argumentative categories
• Detailed analysis of content
o 2. Factors Analysis (drawing on Ashley 1991)
• Close to the community rules & policies
• 4 categories + 1 catchall
• Good domain coverage
29. Approach: Structure Evidence
1. Understand what evidence the community uses to establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test and refine the system.
35. Build a computer support system.
Pipeline: Original Discussion → Semantic Enrichment (using an Ontology) → Semantically Enriched RDFa → Querying → Queryable User Interface with Barchart
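A rough end-to-end sketch of this kind of pipeline, with plain dictionaries standing in for the RDFa enrichment; the keyword lists are invented placeholders, not the system's actual enrichment rules:

```python
# Sketch: raw comments -> decision-factor labels -> per-factor counts (barchart data).
# The FACTOR_KEYWORDS lists are illustrative placeholders only.
from collections import Counter

FACTOR_KEYWORDS = {
    "Notability": ["notable", "notability"],
    "Sources": ["source", "verifiable", "citation"],
    "Maintenance": ["cleanup", "improve", "maintain"],
    "Bias": ["neutral", "pov", "spam"],
}

def enrich(comment):
    """Attach a decision-factor label to a comment (falling back to 'Other')."""
    text = comment.lower()
    for factor, keywords in FACTOR_KEYWORDS.items():
        if any(k in text for k in keywords):
            return {"text": comment, "factor": factor}
    return {"text": comment, "factor": "Other"}

def barchart_counts(comments):
    """Per-factor comment counts, the data behind the interface's barchart."""
    return Counter(e["factor"] for e in map(enrich, comments))

discussion = [
    "The band is clearly notable.",
    "No reliable source covers this club.",
    "Article needs cleanup but is savable.",
    "Reads like spam to me.",
    "Keep, per the hangon request.",
]
print(barchart_counts(discussion))
```

In the real system this classification came from human annotation rather than keyword matching; the sketch only shows how labeled comments feed the per-factor summary view.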
47. Approach: Structure Evidence
1. Understand what evidence the community uses to establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test and refine the system.
53. Survey measures and statistical significance
PU* – Perceived usefulness, p < .001
PE* – Perceived ease of use, p = .001
DC – Decision completeness
PF – Perceived effort
IC* – Information completeness, p = .039
(* statistically significant)
55. Results: 84% prefer our system.
“Information is structured and I can quickly get an overview of the key arguments.”
“The ability to navigate the comments made it a bit easier to filter my mind set and to come to a conclusion.”
“It offers the structure needed to consider each factor separately, thus making the decision easier. Also, the number of comments per factor offers a quick indication of the relevance and the deepness of the decision.”
16 of 19 respondents, in a 20-participant user test (1 participant did not take the final survey).
56. Approach: Structure Evidence
1. Understand what evidence the community uses to establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test… & refine the system.
57. Summary
o Information technology can organize information based on a community’s key decision factors.
o In Wikipedia, we developed an alternate interface for deletion discussions.
o In Wikipedia, 4 questions are used to evaluate borderline articles:
o Notability – Is the topic appropriate for our encyclopedia?
o Sources – Is the article well-sourced?
o Maintenance – Can we maintain this article?
o Bias – Is the article neutral? POV appropriately weighted?
58. Summary: Our Process
1. Get to know a community and its needs.
Ethnography
2. Structure the data.
Annotation & ontology development
3. Build a computer support system.
Web standards: HTML, JavaScript, RDF/OWL, SPARQL
4. Test & refine the system.
Human computer interaction
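As a rough illustration of the RDF and SPARQL step in this process (a sketch only; the predicate names are invented, not the project's ontology), comments and their factors can be stored as subject-predicate-object triples and matched with a wildcard pattern query:

```python
# Toy triple store illustrating the RDF + SPARQL-style querying step.
# Predicate names here ("hasFactor", "supports") are invented for this sketch.
TRIPLES = [
    ("comment1", "hasFactor", "Notability"),
    ("comment1", "supports", "keep"),
    ("comment2", "hasFactor", "Sources"),
    ("comment2", "supports", "delete"),
    ("comment3", "hasFactor", "Notability"),
    ("comment3", "supports", "keep"),
]

def query(triples, s=None, p=None, o=None):
    """Match triples against a pattern; None acts as a wildcard, like a SPARQL variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# "Which comments argue from Notability?"
notability = [s for s, _, _ in query(TRIPLES, p="hasFactor", o="Notability")]
print(notability)
```

A real implementation would express the same pattern as a SPARQL query over the RDFa-enriched pages; the point is that factor labels, once in triple form, are directly queryable.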
61. Info overload now goes beyond papers
Bastian, Glasziou, and Chalmers. "75 trials and 11 systematic reviews a day: how will we ever keep up?" PLoS Medicine 7.9 (2010): e1000326.
62. For medication safety, how to structure evidence on drug-drug interactions and keep it up-to-date?
Jodi Schneider, Paolo Ciccarese, Tim Clark and Richard D. Boyce. “Using the Micropublications ontology and the Open Annotation Data Model to represent evidence within a drug-drug interaction knowledge base.” 4th Workshop on Linked Science 2014—Making Sense Out of Data (LISC2014) at ISWC 2014.
Mathias Brochhausen, Jodi Schneider, Daniel Malone, Philip E. Empey, William R. Hogan and Richard D. Boyce. “Towards a foundational representation of potential drug-drug interaction knowledge.” First International Workshop on Drug Interaction Knowledge Representation (DIKR-2014) at the International Conference on Biomedical Ontologies (ICBO 2014).
Jodi Schneider, Carol Collins, Lisa Hines, John R. Horn and Richard Boyce. “Modeling Arguments in Scientific Papers to Support Pharmacists.” At ArgDiaP 2014, The 12th ArgDiaP Conference: From Real Data to Argument Mining, Warsaw, Poland.
63. Part of a Larger Effort
o “Addressing gaps in clinically useful evidence on drug-drug interactions”
o 4-year project, U.S. National Library of Medicine R01 grant (PI, Richard Boyce; 1R01LM011838-01)
o Since February 2013: evidence panel of domain experts (Carol Collins, Lisa Hines, John R. Horn, Phil Empey) & informaticists (Tim Clark, Paolo Ciccarese, Jodi Schneider)
o Programmer: Yifan Ning
65. Prescribers consult drug interaction references which are maintained by expert pharmacists.
Medscape, Epocrates, Micromedex 2.0
67. Goals
o Support evidence-based updates to drug-interaction reference databases.
o Make sense of the EVIDENCE:
• New clinical trials
• Adverse drug event reports
• Drug product labels
• FDA regulatory updates
http://jama.jamanetwork.com/article.aspx?articleid=18345467
69. Evidence Base Competency Questions
o 40 competency questions, such as:
• List all evidence by drug, drug pair, …
• List all default assumptions (assertions not supported by evidence)
• Which single evidence items act as support or rebuttal for multiple assertions of type X? (e.g., substrate_of assertions)
• What data, methods, and materials were used in the study reported in evidence item X?
• Which research group conducted the study reported in evidence item X?
• What evidence has been deprecated since my last visit?
• Which assertions are supported by a specific FDA guidance statement?
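Two of these competency questions can be sketched against a toy in-memory evidence base; the records and field names below are invented for illustration, not the project's actual data model:

```python
# Hypothetical evidence records; the field names are invented for this sketch.
EVIDENCE = [
    {"id": "ev1", "drug_pair": ("fluoxetine", "tramadol"), "supports": ["a1", "a2"]},
    {"id": "ev2", "drug_pair": ("fluoxetine", "tramadol"), "supports": ["a1"]},
    {"id": "ev3", "drug_pair": ("ketoconazole", "midazolam"), "supports": ["a3"]},
]

def evidence_by_drug_pair(evidence, pair):
    """Competency question: list all evidence for a drug pair (order-independent)."""
    return [e["id"] for e in evidence
            if frozenset(e["drug_pair"]) == frozenset(pair)]

def multi_assertion_evidence(evidence):
    """Competency question: which single evidence items support multiple assertions?"""
    return [e["id"] for e in evidence if len(e["supports"]) > 1]

print(evidence_by_drug_pair(EVIDENCE, ("tramadol", "fluoxetine")))
print(multi_assertion_evidence(EVIDENCE))
```

In the actual project such questions would be posed against the ontology-backed knowledge base (e.g., as SPARQL queries) rather than Python lists; the sketch just shows what "answering a competency question" means operationally.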
70. An Ontology for Representing Evidence
Clark, Ciccarese, Goble (2014) Micropublications: a semantic model for claims, evidence, arguments and
annotations in biomedical communications
83. Next steps
o Continuing data model development & testing.
o NLP support: Create a pipeline for extracting
potential drug-drug interaction mentions from
scientific & clinical literature.
o NLP + "expertsourcing" and crowdsourcing
(distributed annotation).
o Test annotation tools: usability for domain experts.
o Resolving links to paywalled PDFs.
90. Walton’s Argumentation Schemes
Example Argumentation Scheme:
Argument from Rules – “we apply rule X”
Critical Questions
1. Does the rule require carrying out this type of
action?
2. Are there other established rules that might conflict
with or override this one?
3. Are there extenuating circumstances or an excuse
for noncompliance?
Walton, Reed, and Macagno 2008
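A scheme and its critical questions pair naturally in a data structure, which is one way a support tool could prompt users to probe an argument's weaknesses. The scheme text follows Walton, Reed, and Macagno (2008); the class itself is an illustrative sketch:

```python
from dataclasses import dataclass

@dataclass
class ArgumentationScheme:
    """An argumentation scheme paired with its critical questions (sketch)."""
    name: str
    premise: str
    critical_questions: list

argument_from_rules = ArgumentationScheme(
    name="Argument from Rules",
    premise="We apply rule X",
    critical_questions=[
        "Does the rule require carrying out this type of action?",
        "Are there other established rules that might conflict with "
        "or override this one?",
        "Are there extenuating circumstances or an excuse for noncompliance?",
    ],
)

# A tool could surface these as prompts when this scheme is detected.
for q in argument_from_rules.critical_questions:
    print("-", q)
```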
Online discussions are the focus of the first project, which addresses the question of “What knowledge should be included in Wikipedia?"
Wikipedia is extremely popular: it’s the world’s 7th most visited website. But what knowledge gets included?
It’s a little known fact that Wikipedia deletes articles. For most readers, messages like these are the only sign of articles at risk for deletion,
or deleted articles.
In fact, 1 in 4 Wikipedia articles is deleted.
While many articles are deleted without discussion, each week about 500 borderline articles are considered for deletion, through open online discussions that anyone can comment on.
Here is an example discussion. First, someone nominates the article for deletion. In this case, the article is about a baseball pitcher. The nominator says that we should delete the article: Heath Totten doesn’t merit an article since he doesn’t have a very good record and hasn’t played in a few years.
The second message responds and suggests keeping the article. This message gives new evidence to support keeping the article about Heath Totten. That he is actively playing.
We find that there are a few problems with these discussions. First of all, some discussions have no consensus, even after lengthy discussion. The same article may be repeatedly proposed for deletion, in some cases over 20 times.
One goal of this work is to summarize long discussions.
Second, newcomers are confused about Wikipedia’s standards. Newcomers make comments like these:
"Why just because it is a small team and not major does it not deserve it’s (sic) own page on here?"
"just as special as a article on a breed of dog"
"especially on a website where pointless people get a mention"
Making the criteria
A second goal of this work is to make the community standards more explicit.
Newcomers also do not understand particular terminology, such as “reliable secondary source”. A common argument from an old-hand in our corpus is that “Notability [is] not demonstrated in a reliable secondary source”.
Newcomers misunderstand what Wikipedia counts as a “reliable secondary source”. Here, a newcomer replies that the article “will have refs from other sources” once the website it is describing goes live. To a Wikipedian, this is not a convincing argument, because how does this person know this? The "refs from other sources" sound like press releases – but reliable secondary sources must be independent.
So again, this shows the need to make the community standards more explicit.
Technically started or relisted
Corpus is https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Log/2011_January_29
Categories (Walton’s argumentation schemes) vs. process (factors analysis)
Very few content standards need to be clearly communicated to readers in order to bring significant benefit.
69.5% of discussions and 91% of comments are well-represented by just four factors: Notability, Sources, Maintenance and Bias. The best way to avoid deletion is for readers to understand these criteria.
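Coverage figures like these come from tallying the share of annotated comments whose main argument falls under one of the four factors. A minimal sketch with toy labels (the real corpus labels and counts differ):

```python
# Toy coverage computation. Labels are illustrative, not the real annotations.
FACTORS = {"Notability", "Sources", "Maintenance", "Bias"}

comment_labels = ["Notability", "Sources", "Other", "Bias",
                  "Notability", "Maintenance", "Sources", "Notability",
                  "Bias", "Sources"]

covered = sum(1 for label in comment_labels if label in FACTORS)
coverage = covered / len(comment_labels)
print(f"{coverage:.0%} of comments covered by the four factors")  # 90%
```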
20 novice participants used both systems
“The ability to navigate the comments made it a bit easier to filter my mind set and to come to a conclusion.”
“summarise and, at the same time, evaluate which factor should be considered determinant for the final decision”
Identify and explicitly represent arguments, and in particular
successful arguments that are persuasive to a given audience.
Adverse drug events are a leading cause of death
Image from https://www.njpharmacy.com/wp-content/uploads/2013/02/drug-interactions-checker.png
Image from http://www.clipartbest.com/clipart-McLLpbGKi
Adverse drug events are a leading cause of death
Images from
http://www.knowabouthealth.com/android-version-of-medscape-app-ready-to-download/7568/
Android Play store
http://amazingsgs.blogspot.com/2011/10/top-5-free-android-medical-apps-for.html
Most sources of clinically-oriented PDDI knowledge disagree substantially in their content,
including about which drug combinations should never be co-administered. For
example, only one quarter of 59 contraindicated drug pairs were listed in three PDDI
information sources[4], only 18 (28%) of 64 pharmacy information and clinical decision
support systems correctly identified 13 PDDIs considered clinically significant
by a team of drug interaction experts[5], and four clinically oriented drug information
compendia agreed on only 2.2% of 406 PDDIs considered to be “major” by at least
one source[6].
From our paper: http://ceur-ws.org/Vol-1309/paper2.pdf
4. Wang, L.M., Wong, M., Lightwood, J.M., Cheng, C.M.: Black box
warning contraindicated comedications: concordance among three
major drug interaction screening programs. Ann. Pharmacother. 44,
28–34 (2010).
5. Saverno, K.R., Hines, L.E., Warholak, T.L., Grizzle, A.J., Babits, L.,
Clark, C., Taylor, A.M., Malone, D.C.: Ability of pharmacy clinical
decision-support software to alert users about clinically important
drug-drug interactions. J. Am. Med. Inform. Assoc. JAMIA. 18, 32–
37 (2011).
6. Abarca, J., Malone, D.C., Armstrong, E.P., Grizzle, A.J., Hansten,
P.D., Van Bergen, R.C., Lipton, R.B.: Concordance of severity ratings
provided in four drug interaction compendia. J. Am. Pharm. Assoc.
JAPhA. 44, 136–141 (2004).
40 competency questions
https://docs.google.com/document/d/1o0DYpu9FuXGCz861OOGkhYKA-KWMY-hHRBQ-R8IlqXc/edit
Not the only competency questions – also have e.g.
Queries Supporting Drug Interaction Management
https://docs.google.com/spreadsheets/d/1ikYsOB09XHUQiSl-KPlDZBQWScbi15rHUeyfOcUQz5M/edit#gid=0
Very precise specification of the entities
Improve sensitivity of information retrieval (recall/precision)
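Sensitivity here corresponds to recall in information-retrieval terms. The standard definitions, as a small self-contained sketch with hypothetical document IDs:

```python
def precision_recall(retrieved, relevant):
    """Standard IR measures:
    precision = |retrieved ∩ relevant| / |retrieved|
    recall (sensitivity) = |retrieved ∩ relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    return hits / len(retrieved), hits / len(relevant)

# Toy example: 3 of 4 retrieved items are relevant; 3 of 5 relevant items found.
p, r = precision_recall({"d1", "d2", "d3", "d4"},
                        {"d1", "d2", "d3", "d5", "d6"})
print(p, r)  # 0.75 0.6
```

Precise entity specification helps both sides of this trade-off: fewer spurious matches (precision) and fewer missed ones (recall).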
From http://dailymed.nlm.nih.gov/dailymed/fda/fdaDrugXsl.cfm?setid=13bb8267-1cab-43e5-acae-55a4d957630a&type=display
Evidence entry form from:
https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxkZGlrcmFuZGlyfGd4OjE0ZGIwY2IwNzJhOWNjMjY
From http://dailymed.nlm.nih.gov/dailymed/fda/fdaDrugXsl.cfm?setid=13bb8267-1cab-43e5-acae-55a4d957630a&type=display
For adding annotations: Existing MP plugin for Domeo
For viewing annotations: we want them highlighted in a web-based interface, BUT resolving annotations requires a method for pointing into paywalled/subscription PDF & HTML documents
An existing Micropublication plugin for Domeo [Ciccarese2014] is being modified as part of the project. Our plan is to use the revised plugin to support the evidence board in collecting evidence and associated annotation data. It will also enable the broader community to access and view annotations of PDDIs highlighted in a web-based interface. We anticipate that this approach will enable a broader community of experts to review each PDDI recorded in the DIKB and examine the underlying research study to confirm its appropriateness and relevance to the evidence base.
The usability of the annotation plug-in is critically important so that the panel of domain experts will not face barriers to annotating and entering evidence. This will require usability studies of the new PDDI Micropublication plugin. Another issue is that many PDDI evidence items can be found only in PDF documents. Currently, the tool chain for PDF annotation is relatively weak: compared to text and HTML, PDF annotation tools are not as widely available and not as familiar to end-users. Suitable tools will have to be integrated into the revised plugin.
PDF documents may be in proprietary portals or academic library systems
Annotations in the data model are a set of RDF resources that connect some target to a set of resources that are in some way about it.
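That target/body pattern can be sketched as a plain data structure, in the spirit of the W3C Web Annotation model. All URLs, field names, and values below are hypothetical:

```python
# Sketch of an annotation resource: it connects a target (the thing being
# annotated, here a text span in a PDF) to bodies (resources about it).
annotation = {
    "type": "Annotation",
    "target": {
        "source": "http://example.org/article.pdf",   # hypothetical document
        "selector": {
            "type": "TextQuoteSelector",
            "exact": "drug A increased the AUC of drug B",
        },
    },
    "body": [
        {"type": "TextualBody",
         "value": "Evidence for a PDDI between drug A and drug B"},
        {"id": "http://example.org/evidence/e1"},     # hypothetical resource
    ],
}
print(len(annotation["body"]))  # 2
```

The paywalled-PDF problem above is visible here: the `source` pointer only resolves for readers who can actually reach the document.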
We would count this as an Argument from Rules
Major Premise: If carrying out types of actions including A is the established rule for x, then (unless the case is an exception), a must carry out A.
Minor Premise: Carrying out types of actions including A is the established rule for a.
Conclusion: Therefore, a must carry out A.
Earlier in CSCW: Jodi Schneider, Krystian Samp, Alexandre Passant, Stefan Decker. “Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups”. In Computer Supported Cooperative Work and Social Computing (CSCW). San Antonio, TX, February 23-27, 2013.
Used as categories
Initial annotation
60 categories (each Walton argumentation scheme)
all arguments in each message
Round 4
15 most common argumentation schemes
main argument in each message
Good inter-annotator agreement for a hard task: 54% agreement (vs. 12% expected by chance) between 2 annotators
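One common way to fold those two figures into a single chance-corrected score is Cohen's kappa. A minimal sketch, assuming 54% observed agreement and 12% expected by chance as stated above:

```python
def cohens_kappa(observed, expected):
    """Cohen's kappa: agreement corrected for chance agreement."""
    return (observed - expected) / (1 - expected)

# Using the figures above: 54% observed agreement, 12% expected by chance.
kappa = cohens_kappa(0.54, 0.12)
print(round(kappa, 3))  # 0.477
```

A kappa around 0.48 is usually read as moderate agreement, which is respectable for a fine-grained annotation task with many categories.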