Improving Text Mining with Controlled Natural Language: A Case Study for Protein Interactions
Tobias Kuhn (speaker)
Loïc Royer
Norbert E. Fuchs
Michael Schroeder
DILS'06, Hinxton (UK)
21 July 2006
Tobias Kuhn, DILS'06, Hinxton (UK), 21 July 2006 2
Cooperation of
University of Zurich
(Norbert E. Fuchs, Tobias Kuhn)
and
TU Dresden
(Loïc Royer, Michael Schroeder)
Introduction
- Biomedical literature is growing at a tremendous pace
- PubMed contains 16 million articles and grows by over 600,000 articles per year
- Computational support is needed!
Today's Solution
- NLP, manual annotation
Our Approach
- Let the researchers express their own results in a formal language
- Perfect processing of scientific results by computers
- This formal language has to be ...
  - easy to learn and understand
  - expressive enough to express even complicated scientific results
Knowledge Representation Languages
OWL with RDF/XML
Description Logics
first-order logic
ACE
UML
Attempto Controlled English (ACE)
- Formal language that looks like natural English
- Unambiguously translatable into first-order logic
- Restricted grammar
- Unlimited vocabulary
- www.ifi.unizh.ch/attempto
Formal Summaries
ACE text:
BubR1 interacts-with a trunk-domain of Beta2-Adaptin.
Logical representation (DRS):
[A, B, C, D]
named(A, BubR1)-1
object(A, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
named(B, Beta2-Adaptin)-1
object(B, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
object(C, atomic, trunk-domain, unspecified, cardinality, count_unit, eq, 1)-1
relation(C, trunk-domain, of, B)-1
predicate(D, unspecified, interact_with, A, C)-1
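As a rough illustration of how a program might consume such a DRS, the sketch below encodes a simplified version of the conditions above as plain Python tuples and looks up what BubR1 interacts with. The tuple layout and the helper names `referent_label` and `interaction_partners` are assumptions made for this example; they are not part of ACE or of the Attempto tools.

```python
# Conditions of the example DRS, transcribed by hand as (functor, args) tuples.
# The encoding is an assumption for illustration (not an Attempto format) and
# drops the cardinality details of the object(...) conditions.
DRS = [
    ("named", ("A", "BubR1")),
    ("named", ("B", "Beta2-Adaptin")),
    ("object", ("C", "trunk-domain")),
    ("relation", ("C", "trunk-domain", "of", "B")),
    ("predicate", ("D", "interact_with", "A", "C")),
]


def referent_label(drs, ref):
    """Return a readable label for a discourse referent such as 'A' or 'C'."""
    for functor, args in drs:
        if functor == "named" and args[0] == ref:
            return args[1]
    for functor, args in drs:
        if functor == "relation" and args[0] == ref:
            # relation(C, trunk-domain, of, B) -> "trunk-domain of <label of B>"
            return f"{args[1]} {args[2]} {referent_label(drs, args[3])}"
    return ref


def interaction_partners(drs, name):
    """List the labels of everything the named entity interacts with."""
    partners = []
    for functor, args in drs:
        if functor == "predicate" and args[1] == "interact_with":
            _, _, subject, obj = args
            if referent_label(drs, subject) == name:
                partners.append(referent_label(drs, obj))
    return partners


print(interaction_partners(DRS, "BubR1"))
# prints ['trunk-domain of Beta2-Adaptin']
```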
Ontology for Protein Interactions
Empirical Study
- “How suitable is ACE together with our ontology to express scientific results of protein interactions?”
- Manual translation of 273 facts about protein interactions
- These facts are subheadings of the “Results” sections of 89 articles (journals by Elsevier)
Empirical Study
Total (273 facts):
- matched perfectly: 154
- matched partially: 57
- unmatched: 62
Non-perfect (119 facts), by reason: not covered by the model, relations of relations, fuzzy, not understood (chart values: 21, 56, 11, 31)
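The shares behind these counts can be worked out with a few lines of Python. This is only a sketch: it assumes that the three totals on the slide pair with the labels "matched perfectly", "matched partially" and "unmatched" in the order in which they appear, and the variable names are made up for the example.

```python
# Counts as given on the slide (273 manually translated facts in total).
# The pairing of numbers and labels follows their order on the slide
# and is an assumption.
counts = {
    "matched perfectly": 154,
    "matched partially": 57,
    "unmatched": 62,
}

total = sum(counts.values())
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}
non_perfect = total - counts["matched perfectly"]

for label, n in counts.items():
    print(f"{label:18s} {n:4d}  {shares[label]:5.1f}%")
print(f"non-perfect: {non_perfect}")
```

Under this reading, roughly 56% of the facts match perfectly, and the 119 non-perfect facts equal the sum of the four values in the second chart (21 + 56 + 11 + 31).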
Authoring tool
- Helps with writing ACE sentences
- Shows step by step the possible continuations of the sentence
- New words can be created on-the-fly
- Awareness of the underlying ontology
- The users do not need to know the details of the ACE syntax and of the underlying ontology
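The idea of showing the possible continuations step by step can be sketched with a tiny state machine. Everything in this example is invented for illustration: the lexicon, the grammar states and the function names `continuations` and `add_word` are assumptions, and the real ACE grammar and the prototype are far richer.

```python
# Toy lexicon and grammar; both are invented for this sketch and are far
# smaller than the real ACE grammar or the protein-interaction ontology.
LEXICON = {
    "protein": {"BubR1", "Beta2-Adaptin", "Act1", "TRAF6"},
    "verb": {"interacts-with"},
    "det": {"a"},
    "noun": {"trunk-domain", "subunit"},
    "prep": {"of"},
    "end": {"."},
}

# state -> {word category -> next state}
GRAMMAR = {
    "start": {"protein": "subject"},
    "subject": {"verb": "verb"},
    "verb": {"protein": "object", "det": "det"},
    "det": {"noun": "noun"},
    "noun": {"prep": "prep"},
    "prep": {"protein": "object"},
    "object": {"end": "done"},
    "done": {},
}


def continuations(tokens):
    """Return the words that may follow the given partial sentence."""
    state = "start"
    for token in tokens:
        for category, next_state in GRAMMAR[state].items():
            if token in LEXICON[category]:
                state = next_state
                break
        else:
            raise ValueError(f"unexpected word {token!r} in state {state!r}")
    words = set()
    for category in GRAMMAR[state]:
        words |= LEXICON[category]
    return sorted(words)


def add_word(category, word):
    """Extend the lexicon on the fly."""
    LEXICON[category].add(word)


print(continuations(["BubR1", "interacts-with"]))
# prints ['Act1', 'Beta2-Adaptin', 'BubR1', 'TRAF6', 'a']
print(continuations(["BubR1", "interacts-with", "a", "trunk-domain"]))
# prints ['of']
```

Because the user only ever picks from the offered words, every finished sentence is grammatical by construction, which is why no knowledge of the syntax is needed.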
Authoring tool: Prototype demo
http://gopubmed.biotec.tu-dresden.de/AceWiki/
Benefits of our Approach
- Consistency / redundancy checks
  - “Is there a paper that contradicts my results?”
  - “Is there a paper that comes to the same or similar results?”
- Answer extraction
  - “Which proteins interact with a certain domain of protein X?”
- Automatically updated knowledge bases
  - “Give me an overview of the relations of a protein X to other proteins!”
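Once results are available as formal statements, queries of this kind reduce to simple lookups. The sketch below assumes a hypothetical fact store of (paper id, subject, relation, object) tuples; the paper ids and function names are invented, and exact matching stands in for the logical reasoning that real consistency checks would need.

```python
# Hypothetical fact store: (paper id, subject, relation, object) tuples as they
# might be derived from formal summaries. The paper ids are invented; the
# statements reuse example sentences from this talk.
FACTS = [
    ("paper-1", "BubR1", "interacts-with", ("trunk-domain", "Beta2-Adaptin")),
    ("paper-2", "Act1", "interacts-with", "TRAF6"),
    ("paper-3", "Act1", "interacts-with", "TRAF6"),
]


def proteins_interacting_with_domain(facts, domain, protein):
    """Answer extraction: which proteins interact with `domain` of `protein`?"""
    return sorted({s for _, s, rel, o in facts
                   if rel == "interacts-with" and o == (domain, protein)})


def papers_with_same_result(facts, subject, relation, obj):
    """Redundancy check: which papers already state exactly this fact?"""
    return [pid for pid, s, rel, o in facts
            if (s, rel, o) == (subject, relation, obj)]


def overview(facts, protein):
    """Overview of the distinct relations in which `protein` is the subject."""
    return list(dict.fromkeys((rel, o) for _, s, rel, o in facts if s == protein))


print(proteins_interacting_with_domain(FACTS, "trunk-domain", "Beta2-Adaptin"))
# prints ['BubR1']
print(papers_with_same_result(FACTS, "Act1", "interacts-with", "TRAF6"))
# prints ['paper-2', 'paper-3']
print(overview(FACTS, "Act1"))
# prints [('interacts-with', 'TRAF6')]
```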
Conclusions
- Formal summaries for scientific articles can make text mining easier and more powerful
- ACE combines the power of ontologies with the convenience of natural language
- Let the researchers formalize their own results!
Thank you for your attention!
Questions & Discussion
Subheadings: Example
Degree of Matching: Examples
- Matched perfectly:
  - Interaction of Act1 with TRAF6
    → Act1 interacts-with TRAF6.
- Matched partially:
  - The mtFabD protein is part of the core of the FAS-II complex
    → MtFabD is a subunit of FAS-II.
- Unmatched:
  - Cav1 interacts differentially with distinct Dyn2 forms
Reasons for Non-perfect Matching: Examples
- Not covered by the model:
  - Daxx Potentiates Fas-Mediated Apoptosis
- Relations of relations:
  - Kal-GEF1 activation of Pak does not require GEF activity
- Fuzzy:
  - ANKRD1 contains potential CASQ2 binding sequences located in both its NT- and CT-regions
- Not understood:
  - hSrb7 does not interact with other nuclear receptors

Step-By-Step Process to Develop a Mobile App From Scratch
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
 
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite SolutionIPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite Solution
 
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
 
The Impact of the Internet of Things (IoT) on Smart Homes and Cities
The Impact of the Internet of Things (IoT) on Smart Homes and CitiesThe Impact of the Internet of Things (IoT) on Smart Homes and Cities
The Impact of the Internet of Things (IoT) on Smart Homes and Cities
 
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptxUse Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
 
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
 
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptxIntroduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
 

Improving Text Mining with Controlled Natural Language: A Case Study for Protein Interactions

  • 1. Improving Text Mining with Controlled Natural Language: A Case Study for Protein Interactions. Tobias Kuhn (speaker), Loïc Royer, Norbert E. Fuchs, Michael Schroeder. DILS'06, Hinxton (UK), 21 July 2006
  • 2. Tobias Kuhn, DILS'06, Hinxton (UK), 21 July 2006 2 Cooperation of University of Zurich (Norbert E. Fuchs, Tobias Kuhn) and TU Dresden (Loïc Royer, Michael Schroeder)
  • 3. Introduction
      - Biomedical literature is growing at a tremendous pace
      - PubMed contains 16 million articles and grows by over 600'000 articles per year
      - Computational support is needed!
  • 4. Today's Solution: NLP, manual annotation
  • 5. Our Approach
      - Let the researchers express their own results in a formal language
      - Perfect processing of scientific results by computers
      - This formal language has to be ...
          - easy to learn and understand
          - expressive enough to express even complicated scientific results
  • 6. Knowledge Representation Languages: OWL with RDF/XML, Description Logics, first-order logic, ACE, UML
  • 7. Attempto Controlled English (ACE)
      - Formal language that looks like natural English
      - Unambiguously translatable into first-order logic
      - Restricted grammar
      - Unlimited vocabulary
      - www.ifi.unizh.ch/attempto
  • 8. Formal Summaries
  • 9. Formal Summaries
      ACE text:
          BubR1 interacts-with a trunk-domain of Beta2-Adaptin.
      Logical representation (DRS):
          [A, B, C, D]
          named(A, BubR1)-1
          object(A, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
          named(B, Beta2-Adaptin)-1
          object(B, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
          object(C, atomic, trunk-domain, unspecified, cardinality, count_unit, eq, 1)-1
          relation(C, trunk-domain, of, B)-1
          predicate(D, unspecified, interact_with, A, C)-1
  • 10. Ontology for Protein Interactions
  • 11. Empirical Study
      - “How suitable is ACE together with our ontology to express scientific results of protein interactions?”
      - Manual translation of 273 facts about protein interactions
      - These facts are subheadings of the “Results”-sections of 89 articles (journals by Elsevier)
  • 12. Empirical Study (chart)
      - Total: 154 matched perfectly, 57 matched partially, 62 unmatched
      - Non-perfect: not covered by the model, relations of relations, fuzzy, not understood (chart values 21, 56, 11, 31; the assignment of values to categories is not recoverable from the extracted text)
  • 13. Authoring tool
      - Helps writing ACE sentences
      - Shows step by step the possible continuations of the sentence
      - New words can be created on-the-fly
      - Awareness of the underlying ontology
      - The users do not need to know the details of the ACE syntax and of the underlying ontology
  • 14. Authoring tool: Prototype demo http://gopubmed.biotec.tu-dresden.de/AceWiki/
  • 15. Benefits of our Approach
      - Consistency / redundancy checks
          - “Is there a paper that contradicts my results?”
          - “Is there a paper that comes to the same or similar results?”
      - Answer extraction
          - “Which proteins interact with a certain domain of protein X?”
      - Automatically updated knowledge bases
          - “Give me an overview of the relations of a protein X to other proteins!”
  • 16. Conclusions
      - Formal summaries for scientific articles can make text mining easier and more powerful
      - ACE combines the power of ontologies with the convenience of natural language
      - Let the researchers formalize their own results!
  • 17. Thank you for your attention! Questions & Discussion
  • 18. Subheadings: Example
  • 19. Degree of Matching: Examples
      - Matched perfectly:
          - Interaction of Act1 with TRAF6
            → Act1 interacts-with TRAF6.
      - Matched partially:
          - The mtFabD protein is part of the core of the FAS-II complex
            → MtFabD is a subunit of FAS-II.
      - Unmatched:
          - Cav1 interacts differentially with distinct Dyn2 forms
  • 20. Reasons for Non-perfect Matching: Examples
      - Not covered by the model:
          - Daxx Potentiates Fas-Mediated Apoptosis
      - Relations of relations:
          - Kal-GEF1 activation of Pak does not require GEF activity
      - Fuzzy:
          - ANKRD1 contains potential CASQ2 binding sequences located in both its NT- and CT-regions
      - Not understood:
          - hSrb7 does not interact with other nuclear receptors
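
To make the idea of answer extraction over formal summaries concrete, the following is a minimal Python sketch. It is not the Attempto parser and not the authors' tool: the regular expression, the `Interaction` record, and the functions `parse_interaction` and `partners_of_domain` are all hypothetical names introduced here for illustration. The sketch assumes only the two sentence shapes that appear on the slides ("X interacts-with Y." and "X interacts-with a D of Y.") and shows how a question such as “Which proteins interact with a certain domain of protein X?” could be answered once facts are available in a structured form.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Hypothetical pattern covering only "X interacts-with Y." and
# "X interacts-with a D of Y."; real ACE has a far richer grammar.
_INTERACTION = re.compile(r"(\S+) interacts-with (?:an? (\S+) of )?(\S+)\.")


@dataclass(frozen=True)
class Interaction:
    """One interaction fact; the field names are illustrative assumptions."""
    subject: str           # e.g. "BubR1"
    domain: Optional[str]  # e.g. "trunk-domain", or None if no domain is named
    target: str            # e.g. "Beta2-Adaptin"


def parse_interaction(sentence: str) -> Optional[Interaction]:
    """Return an Interaction for a supported sentence, otherwise None."""
    match = _INTERACTION.fullmatch(sentence.strip())
    if match is None:
        return None
    subject, domain, target = match.groups()
    return Interaction(subject, domain, target)


def partners_of_domain(facts: Iterable[Interaction],
                       domain: str, target: str) -> List[str]:
    """Answer: which proteins interact with the given domain of `target`?"""
    return sorted({f.subject for f in facts
                   if f.domain == domain and f.target == target})


if __name__ == "__main__":
    sentences = [
        "BubR1 interacts-with a trunk-domain of Beta2-Adaptin.",
        "Act1 interacts-with TRAF6.",
        # Free-text subheading from the "unmatched" example; not parsed.
        "Cav1 interacts differentially with distinct Dyn2 forms",
    ]
    parsed = [parse_interaction(s) for s in sentences]
    facts = [f for f in parsed if f is not None]
    print(partners_of_domain(facts, "trunk-domain", "Beta2-Adaptin"))
```

The unmatched subheading is rejected rather than guessed at, which mirrors the point of the slides: a query is only reliable over statements that were written in the controlled language in the first place.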