Improving Text Mining with Controlled Natural Language:
A Case Study for Protein Interactions
Tobias Kuhn (speaker)
Loïc Royer
Norbert E. Fuchs
Michael Schroeder
DILS'06, Hinxton (UK)
21 July 2006
Tobias Kuhn, DILS'06, Hinxton (UK), 21 July 2006 2
A cooperation between the University of Zurich (Norbert E. Fuchs, Tobias Kuhn) and TU Dresden (Loïc Royer, Michael Schroeder)
Introduction
• Biomedical literature is growing at a tremendous pace
• PubMed contains 16 million articles and grows by over 600,000 articles per year
• Computational support is needed!
Today's Solution
NLP, manual annotation
Our Approach
• Let the researchers express their own results in a formal language
• Perfect processing of scientific results by computers
• This formal language has to be:
  ◦ easy to learn and understand
  ◦ expressive enough to express even complicated scientific results
Knowledge Representation Languages
[Figure: candidate languages shown are OWL with RDF/XML, Description Logics, first-order logic, ACE, and UML]
Attempto Controlled English (ACE)
• Formal language that looks like natural English
• Unambiguously translatable into first-order logic
• Restricted grammar
• Unlimited vocabulary
• www.ifi.unizh.ch/attempto
Formal Summaries
ACE text:
BubR1 interacts-with a trunk-domain of Beta2-Adaptin.
Logical representation (DRS):
[A, B, C, D]
named(A, BubR1)-1
object(A, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
named(B, Beta2-Adaptin)-1
object(B, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
object(C, atomic, trunk-domain, unspecified, cardinality, count_unit, eq, 1)-1
relation(C, trunk-domain, of, B)-1
predicate(D, unspecified, interact_with, A, C)-1
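To illustrate why such a representation is easy for a computer to process, here is a minimal Python sketch that reads the DRS conditions from this slide and recovers the interaction as a (subject, relation, object) triple. The helper names (`parse_condition`, `describe`, `extract_facts`) are hypothetical and not part of any Attempto tool, and the parsing is only as general as this one example needs.

```python
import re

# DRS conditions copied from the slide; each carries a trailing index ("-1").
DRS_LINES = """\
named(A, BubR1)-1
object(A, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
named(B, Beta2-Adaptin)-1
object(B, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
object(C, atomic, trunk-domain, unspecified, cardinality, count_unit, eq, 1)-1
relation(C, trunk-domain, of, B)-1
predicate(D, unspecified, interact_with, A, C)-1
""".splitlines()

# functor(arg1, arg2, ...)-index
CONDITION = re.compile(r"^(\w+)\((.*)\)-(\d+)$")


def parse_condition(line):
    """Split one condition into (functor, argument list, trailing index)."""
    match = CONDITION.match(line.strip())
    if match is None:
        raise ValueError(f"not a DRS condition: {line!r}")
    functor, args, index = match.groups()
    return functor, [a.strip() for a in args.split(",")], int(index)


def describe(ref, names, relations):
    """Turn a discourse referent (A, B, ...) into readable text."""
    if ref in names:
        return names[ref]
    if ref in relations:
        noun, prep, owner = relations[ref]
        return f"{noun} {prep} {describe(owner, names, relations)}"
    return ref


def extract_facts(lines):
    """Collect names and relations, then resolve every predicate condition."""
    names, relations, predicates = {}, {}, []
    for line in lines:
        functor, args, _ = parse_condition(line)
        if functor == "named":
            names[args[0]] = args[1]
        elif functor == "relation":
            relations[args[0]] = (args[1], args[2], args[3])
        elif functor == "predicate":
            # args: referent, modifier, verb, subject, object
            predicates.append((args[2], args[3], args[4]))
    return [
        (describe(subj, names, relations), verb, describe(obj, names, relations))
        for verb, subj, obj in predicates
    ]


print(extract_facts(DRS_LINES))
```

The point of the sketch is that no linguistic analysis is needed at this stage: the referents A, B, C already tie the protein names, the domain, and the interaction together.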
Ontology for Protein Interactions
Empirical Study
• “How suitable is ACE together with our ontology for expressing scientific results about protein interactions?”
• Manual translation of 273 facts about protein interactions
• These facts are subheadings of the “Results” sections of 89 articles (Elsevier journals)
Empirical Study
[Charts: degree of matching for the 273 facts]
Total: 154 matched perfectly, 57 matched partially, 62 unmatched
Non-perfect, by reason: not covered by the model, relations of relations, fuzzy, not understood (values in chart order: 21, 56, 11, 31)
Authoring tool
• Helps users write ACE sentences
• Shows the possible continuations of the sentence step by step
• New words can be created on the fly
• Awareness of the underlying ontology
• Users do not need to know the details of the ACE syntax or of the underlying ontology
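The idea of showing possible continuations can be sketched with a tiny state machine in Python. Everything here is an assumption made for illustration: the word categories, the transitions, and the function names are hypothetical, and the grammar covers only sentences shaped like the examples in this talk, not the real ACE grammar or the authors' tool.

```python
# Toy lexicon: word category -> known words (illustrative only).
LEXICON = {
    "protein": {"BubR1", "Beta2-Adaptin", "Act1", "TRAF6"},
    "verb": {"interacts-with"},
    "det": {"a"},
    "domain": {"trunk-domain"},
    "of": {"of"},
    "end": {"."},
}

# Toy grammar: state -> {category that may come next: resulting state}.
TRANSITIONS = {
    "start": {"protein": "subject"},
    "subject": {"verb": "verb"},
    "verb": {"protein": "object", "det": "det"},
    "det": {"domain": "domain"},
    "domain": {"of": "of"},
    "of": {"protein": "object"},
    "object": {"end": "done"},
    "done": {},
}


def advance(state, word):
    """Return the state reached by reading `word`, or fail if it is not allowed."""
    for category, next_state in TRANSITIONS[state].items():
        if word in LEXICON[category]:
            return next_state
    raise ValueError(f"{word!r} cannot follow here")


def continuations(words):
    """List every word that may follow the partial sentence `words`."""
    state = "start"
    for word in words:
        state = advance(state, word)
    options = set()
    for category in TRANSITIONS[state]:
        options |= LEXICON[category]
    return sorted(options)


def add_word(category, word):
    """Create a new word on the fly by adding it to a category."""
    LEXICON[category].add(word)


print(continuations(["BubR1"]))
print(continuations(["BubR1", "interacts-with", "a"]))
```

Because the writer only ever picks from the offered continuations, every finished sentence is grammatical by construction, which is why no knowledge of the syntax rules is required.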
Authoring tool: Prototype demo
http://gopubmed.biotec.tu-dresden.de/AceWiki/
Benefits of our Approach
• Consistency / redundancy checks
  ◦ “Is there a paper that contradicts my results?”
  ◦ “Is there a paper that comes to the same or similar results?”
• Answer extraction
  ◦ “Which proteins interact with a certain domain of protein X?”
• Automatically updated knowledge bases
  ◦ “Give me an overview of the relations of a protein X to other proteins!”
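Once results are available as formal facts, questions like the ones above reduce to simple lookups. The following Python sketch assumes a flat `Fact` record and made-up article identifiers; the `ProteinX`/`ProteinY` entries are invented placeholders used only to demonstrate a contradiction, and none of this reflects the authors' actual knowledge base or reasoning machinery.

```python
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    source: str                    # hypothetical article identifier
    subject: str
    relation: str
    obj: str
    domain: Optional[str] = None   # domain of `obj` involved, if stated
    negated: bool = False


# Illustrative fact base; article ids and ProteinX/ProteinY are invented.
FACTS = [
    Fact("article-1", "BubR1", "interacts-with", "Beta2-Adaptin", domain="trunk-domain"),
    Fact("article-2", "Act1", "interacts-with", "TRAF6"),
    Fact("article-3", "ProteinY", "interacts-with", "ProteinX"),
    Fact("article-4", "ProteinY", "interacts-with", "ProteinX", negated=True),
]


def core(fact):
    """The claim itself, ignoring who made it and whether it is negated."""
    return (fact.subject, fact.relation, fact.obj, fact.domain)


def same_results(new, facts):
    """Redundancy check: which other articles state the same fact?"""
    return [f.source for f in facts
            if core(f) == core(new) and f.negated == new.negated
            and f.source != new.source]


def contradictions(new, facts):
    """Consistency check: which articles state the opposite?"""
    return [f.source for f in facts
            if core(f) == core(new) and f.negated != new.negated]


def interactors(protein, domain, facts):
    """Answer extraction: which proteins interact with `domain` of `protein`?"""
    return sorted({f.subject for f in facts
                   if f.relation == "interacts-with" and f.obj == protein
                   and f.domain == domain and not f.negated})


def overview(protein, facts):
    """Knowledge-base view: every stored fact that mentions `protein`."""
    return [f for f in facts if protein in (f.subject, f.obj)]


mine = Fact("my-article", "Act1", "interacts-with", "TRAF6")
print(same_results(mine, FACTS))
print(interactors("Beta2-Adaptin", "trunk-domain", FACTS))
```

Exact matching only finds identical facts; detecting "similar" results would need the ontology, for example to know that one relation is a special case of another.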
Conclusions
• Formal summaries for scientific articles can make text mining easier and more powerful
• ACE combines the power of ontologies with the convenience of natural language
• Let the researchers formalize their own results!
Thank you for your attention!
Questions & Discussion
Subheadings: Example
Degree of Matching: Examples
• Matched perfectly:
  ◦ Interaction of Act1 with TRAF6
    → Act1 interacts-with TRAF6.
• Matched partially:
  ◦ The mtFabD protein is part of the core of the FAS-II complex
    → MtFabD is a subunit of FAS-II.
• Unmatched:
  ◦ Cav1 interacts differentially with distinct Dyn2 forms
Reasons for Non-perfect Matching: Examples
• Not covered by the model:
  ◦ Daxx Potentiates Fas-Mediated Apoptosis
• Relations of relations:
  ◦ Kal-GEF1 activation of Pak does not require GEF activity
• Fuzzy:
  ◦ ANKRD1 contains potential CASQ2 binding sequences located in both its NT- and CT-regions
• Not understood:
  ◦ hSrb7 does not interact with other nuclear receptors

 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 

Improving Text Mining with Controlled Natural Language: A Case Study for Protein Interactions

  • 1. Improving Text Mining with Controlled Natural Language: A Case Study for Protein Interactions. Tobias Kuhn (speaker), Loïc Royer, Norbert E. Fuchs, Michael Schroeder. DILS'06, Hinxton (UK), 21 July 2006
  • 2. Cooperation of University of Zurich (Norbert E. Fuchs, Tobias Kuhn) and TU Dresden (Loïc Royer, Michael Schroeder)
  • 3. Introduction
      • Biomedical literature is growing at a tremendous pace
      • PubMed contains 16 million articles and grows by over 600'000 articles per year
      • Computational support is needed!
  • 4. Today's Solution: NLP, manual annotation
  • 5. Our Approach
      • Let the researchers express their own results in a formal language
      • Perfect processing of scientific results by computers
      • This formal language has to be ...
          • easy to learn and understand
          • expressive enough to express even complicated scientific results
  • 6. Knowledge Representation Languages: OWL with RDF/XML, Description Logics, first-order logic, ACE, UML
  • 7. Attempto Controlled English (ACE)
      • Formal language that looks like natural English
      • Unambiguously translatable into first-order logic
      • Restricted grammar
      • Unlimited vocabulary
      • www.ifi.unizh.ch/attempto
  • 8. Formal Summaries
  • 9. Formal Summaries
      ACE text:
          BubR1 interacts-with a trunk-domain of Beta2-Adaptin.
      Logical representation (DRS):
          [A, B, C, D]
          named(A, BubR1)-1
          object(A, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
          named(B, Beta2-Adaptin)-1
          object(B, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
          object(C, atomic, trunk-domain, unspecified, cardinality, count_unit, eq, 1)-1
          relation(C, trunk-domain, of, B)-1
          predicate(D, unspecified, interact_with, A, C)-1
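As a rough illustration of how a program might consume the DRS on slide 9, the sketch below stores each condition as a plain Python tuple and reads the interaction back out. The tuple layout and the helpers `describe` and `interactions` are assumptions made for this example; they are not the output format or API of the Attempto tools.

```python
# Hypothetical in-memory form of the slide-9 DRS: one tuple per condition.
# The sentence index ("-1") is dropped; argument order follows the slide.
drs = [
    ("named", "A", "BubR1"),
    ("object", "A", "atomic", "named_entity", "object",
     "cardinality", "count_unit", "eq", 1),
    ("named", "B", "Beta2-Adaptin"),
    ("object", "B", "atomic", "named_entity", "object",
     "cardinality", "count_unit", "eq", 1),
    ("object", "C", "atomic", "trunk-domain", "unspecified",
     "cardinality", "count_unit", "eq", 1),
    ("relation", "C", "trunk-domain", "of", "B"),
    ("predicate", "D", "unspecified", "interact_with", "A", "C"),
]


def describe(ref, conditions):
    """Return a readable label for a discourse referent such as "A" or "C"."""
    names = {c[1]: c[2] for c in conditions if c[0] == "named"}
    if ref in names:
        return names[ref]
    # An "of" relation: label the referent as "<noun> of <owner>".
    for c in conditions:
        if c[0] == "relation" and c[1] == ref and c[3] == "of":
            return f"{c[2]} of {describe(c[4], conditions)}"
    # Fall back to the noun stored in the object condition.
    for c in conditions:
        if c[0] == "object" and c[1] == ref:
            return c[3]
    return ref


def interactions(conditions):
    """List (subject, verb, object) triples for every predicate condition."""
    return [
        (describe(c[4], conditions), c[3], describe(c[5], conditions))
        for c in conditions
        if c[0] == "predicate"
    ]


print(interactions(drs))
# [('BubR1', 'interact_with', 'trunk-domain of Beta2-Adaptin')]
```

The point of the sketch is only that, once a result is available in this logical form, recovering "who interacts with what" is a lookup rather than a text-mining problem.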
  • 10. Ontology for Protein Interactions
  • 11. Empirical Study
      • “How suitable is ACE together with our ontology to express scientific results of protein interactions?”
      • Manual translation of 273 facts about protein interactions
      • These facts are subheadings of the “Results”-sections of 89 articles (journals by Elsevier)
  • 12. Empirical Study
      • Total: 154 matched perfectly, 57 matched partially, 62 unmatched
      • Non-perfect: not covered by the model, relations of relations, fuzzy, not understood (chart values: 21, 56, 11, 31)
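The figures on slide 12 can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers shown on slides 11 and 12; the four chart values are kept as an unlabeled list, because the slide text does not tie each value to a specific reason. The helper `share` is a name introduced for this example.

```python
# Counts as read from slide 12.
matched_perfectly, matched_partially, unmatched = 154, 57, 62
non_perfect_chart_values = [21, 56, 11, 31]  # reasons chart, labels not attached

total = matched_perfectly + matched_partially + unmatched
non_perfect = matched_partially + unmatched

# Slide 11 reports 273 translated facts; the three categories add up to that.
assert total == 273
# The reasons chart covers exactly the partially matched and unmatched facts.
assert sum(non_perfect_chart_values) == non_perfect == 119


def share(count, whole):
    """Percentage of `whole`, rounded to one decimal place."""
    return round(100 * count / whole, 1)


for label, count in [
    ("matched perfectly", matched_perfectly),
    ("matched partially", matched_partially),
    ("unmatched", unmatched),
]:
    print(f"{label}: {count} of {total} ({share(count, total)} %)")
```

Both totals are consistent: the three matching categories sum to the 273 facts of the study, and the four chart values sum to the 119 non-perfect cases.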
  • 13. Authoring tool
      • Helps writing ACE sentences
      • Shows step by step the possible continuations of the sentence
      • New words can be created on-the-fly
      • Awareness of the underlying ontology
      • The users do not need to know the details of the ACE syntax and of the underlying ontology
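The idea of showing possible continuations step by step can be sketched with a toy finite-state grammar. This is an illustration only: the grammar below covers a single sentence pattern and is far simpler than ACE, and `GRAMMAR`, `LEXICON`, `continuations`, and `add_word` are hypothetical names, not part of the prototype.

```python
# Toy grammar: state -> {word category: next state}. Not the ACE grammar.
GRAMMAR = {
    "start": {"protein": "subject"},
    "subject": {"verb": "verb"},
    "verb": {"protein": "complete", "article": "article"},
    "article": {"noun": "noun"},
    "noun": {"of": "of"},
    "of": {"protein": "complete"},
    "complete": {"period": "done"},
}

# Words known per category; proteins come from the slide examples.
LEXICON = {
    "protein": {"BubR1", "Beta2-Adaptin", "Act1", "TRAF6"},
    "verb": {"interacts-with"},
    "article": {"a"},
    "noun": {"trunk-domain"},
    "of": {"of"},
    "period": {"."},
}


def continuations(tokens):
    """Return the sorted words that may follow the given sentence prefix."""
    state = "start"
    for token in tokens:
        for category, next_state in GRAMMAR.get(state, {}).items():
            if token in LEXICON[category]:
                state = next_state
                break
        else:
            raise ValueError(f"unexpected token {token!r} in state {state!r}")
    return sorted(w for cat in GRAMMAR.get(state, {}) for w in LEXICON[cat])


def add_word(category, word):
    """Register a new word on the fly so it shows up in later suggestions."""
    LEXICON[category].add(word)


print(continuations([]))                           # proteins that can start a sentence
print(continuations(["BubR1", "interacts-with"]))  # a protein or the article "a"
add_word("protein", "MtFabD")
print(continuations([]))                           # now also offers "MtFabD"
```

Because the editor only ever offers words the grammar allows at the current position, a user following the suggestions cannot produce a sentence outside the language, which is the property the slide attributes to the authoring tool.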
  • 14. Authoring tool: Prototype demo (http://gopubmed.biotec.tu-dresden.de/AceWiki/)
  • 15. Benefits of our Approach
      • Consistency / redundancy checks
          • “Is there a paper that contradicts my results?”
          • “Is there a paper that comes to the same or similar results?”
      • Answer extraction
          • “Which proteins interact with a certain domain of protein X?”
      • Automatically updated knowledge bases
          • “Give me an overview of the relations of a protein X to other proteins!”
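To make the answer-extraction and overview questions on slide 15 concrete, here is a minimal sketch over a small fact list. The `Fact` record, the relation name `subunit-of`, and both query functions are assumptions for illustration; the facts themselves are taken from the examples on slides 9 and 19.

```python
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    """One formalized result. `part_of` names the protein that owns a domain."""
    subject: str
    relation: str
    obj: str
    part_of: Optional[str] = None


# Facts taken from the slide examples; the record layout is hypothetical.
FACTS = [
    Fact("BubR1", "interacts-with", "trunk-domain", part_of="Beta2-Adaptin"),
    Fact("Act1", "interacts-with", "TRAF6"),
    Fact("MtFabD", "subunit-of", "FAS-II"),
]


def interactors_of_domain(facts, protein, domain):
    """Which proteins interact with the given domain of the given protein?"""
    return sorted(
        f.subject
        for f in facts
        if f.relation == "interacts-with"
        and f.obj == domain
        and f.part_of == protein
    )


def overview(facts, protein):
    """All facts that mention the protein as subject, object, or domain owner."""
    return [f for f in facts if protein in (f.subject, f.obj, f.part_of)]


print(interactors_of_domain(FACTS, "Beta2-Adaptin", "trunk-domain"))
print(overview(FACTS, "TRAF6"))
```

A real knowledge base would reason over the first-order logic form rather than match strings, but the sketch shows why such questions become simple queries once results are stated formally.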
  • 16. Conclusions
      • Formal summaries for scientific articles can make text mining easier and more powerful
      • ACE combines the power of ontologies with the convenience of natural language
      • Let the researchers formalize their own results!
  • 17. Thank you for your attention! Questions & Discussion
  • 18. Subheadings: Example
  • 19. Degree of Matching: Examples
      • Matched perfectly:
          • Interaction of Act1 with TRAF6
          • → Act1 interacts-with TRAF6.
      • Matched partially:
          • The mtFabD protein is part of the core of the FAS-II complex
          • → MtFabD is a subunit of FAS-II.
      • Unmatched:
          • Cav1 interacts differentially with distinct Dyn2 forms
  • 20. Reasons for Non-perfect Matching: Examples
      • Not covered by the model:
          • Daxx Potentiates Fas-Mediated Apoptosis
      • Relations of relations:
          • Kal-GEF1 activation of Pak does not require GEF activity
      • Fuzzy:
          • ANKRD1 contains potential CASQ2 binding sequences located in both its NT- and CT-regions
      • Not understood:
          • hSrb7 does not interact with other nuclear receptors