CLTL 
Software and Web 
Services 
Rubén Izquierdo Beviá
About me 
 5-year degree in Computer Science (University of Alicante, 
Alicante, Spain) 
 National NLP projects and 1 European project (QALLME) 
(University of Alicante, Alicante, Spain) 
 Thesis on NLP & Word Sense Disambiguation (University 
of Alicante, Alicante, Spain, Sept 2010) 
 Postdoc position at the DutchSemCor project (University of 
Tilburg, Tilburg, Sept 2011-Sept 2012) 
 Postdoc position at the OpeNER project (Vrije Universiteit, 
Amsterdam, Sept 2012-)
CLTL software 
 In general, a common input/output format 
 KAF 
 NAF, as an extension of KAF 
 Single components performing single tasks 
 Integration of existing modules 
 Adaptation of input/output formats 
 Development of new ones
KAF 
Kyoto Annotation Format 
 Stand-off, layered, XML-based representation format 
 Different types of information are stored in different layers 
 Layers are linked by means of references 
 Suitable for creating pipelines based on this format 
 Layers: 
 Text: tokens 
 Term: lemmas, part-of-speech, term sentiment, word 
senses 
 Entities, chunks, opinions…
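
As a rough illustration of the stand-off, layered design described above, the sketch below parses a tiny hand-written KAF-like document with Python's standard library and follows the references from the term layer back to the text layer. The element and attribute names (`wf`, `wid`, `term`, `tid`, `span`, `target`) follow common KAF usage, but the sample is simplified, its part-of-speech values are illustrative, and it should be read as an assumption rather than the full format.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written KAF-like sample (illustrative only).
SAMPLE = """<KAF xml:lang="en">
  <text>
    <wf wid="w1" sent="1">Hotels</wf>
    <wf wid="w2" sent="1">were</wf>
    <wf wid="w3" sent="1">clean</wf>
  </text>
  <terms>
    <term tid="t1" lemma="hotel" pos="N"><span><target id="w1"/></span></term>
    <term tid="t2" lemma="be" pos="V"><span><target id="w2"/></span></term>
    <term tid="t3" lemma="clean" pos="A"><span><target id="w3"/></span></term>
  </terms>
</KAF>"""


def terms_with_tokens(kaf_xml):
    """Resolve each term's span references against the token layer."""
    root = ET.fromstring(kaf_xml)
    # Text layer: token id -> surface form.
    tokens = {wf.get("wid"): wf.text for wf in root.iter("wf")}
    result = []
    # Term layer: each term points at one or more token ids.
    for term in root.iter("term"):
        words = [tokens[target.get("id")] for target in term.iter("target")]
        result.append((term.get("tid"), term.get("lemma"), term.get("pos"), words))
    return result


for tid, lemma, pos, words in terms_with_tokens(SAMPLE):
    print(tid, lemma, pos, words)
```

Because the term layer only stores identifiers, a later module can add a new layer (entities, opinions) that points at term ids without touching the layers below it.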
NAF 
NewsReader Annotation Format 
 Extension of KAF 
 Allows cross-document processing 
 Event coreference 
 IDs are converted into valid URIs 
 Stores the same type of information provided by different 
tools 
 E.g. the results of two different POS taggers
How the software is provided I 
 All modules are publicly available on GitHub 
 CLTL GitHub 
 http://github.com/cltl 
 NewsReader GitHub 
 http://github.com/newsreader 
 OpeNER GitHub 
 http://github.com/opener-project/
How the software is provided II 
 Some are available as Web Services 
 Exposed as REST web services 
 Accept an input stream (KAF/NAF) 
 Generate an output stream (KAF/NAF) 
 Easy to call from the command line with curl 
 Easy to create module pipelines in the same way you create a 
Linux command pipeline 
 http://wordpress.let.vupr.nl/web-services/
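
The pipeline idea above can be sketched in Python as well: each service receives a KAF/NAF stream and returns a KAF/NAF stream, so the output of one call becomes the input of the next. The endpoint URLs below are hypothetical placeholders, and the assumption that each service accepts the document as a raw POST body with an XML content type is mine, not something the slides state.

```python
import urllib.request

# Hypothetical endpoints; the real service URLs are not given here.
PIPELINE = [
    "http://example.org/tokenizer",
    "http://example.org/pos-tagger",
]


def http_post(url, data):
    """POST raw bytes to url and return the response body as bytes."""
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/xml"},  # assumed content type
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def run_pipeline(document, urls, post=http_post):
    """Feed each service's output stream to the next one, like a shell pipe."""
    for url in urls:
        document = post(url, document)
    return document
```

Calling `run_pipeline(raw_text_bytes, PIPELINE)` would then play the role of `tokenizer | pos-tagger` on the command line. The `post` parameter exists only so that the chaining logic can be exercised without network access.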
Our software I 
 General modules (integrated) 
 Tokenizers: whitespace-based, OpenNLP-trained... 
 Sentence splitters: rule-based, OpenNLP 
 POS taggers: TreeTagger, OpenNLP POS taggers 
 Chunker: trained on Alpino data with OpenNLP 
 Parsers: Alpino (nl), Stanford (en)
Our software II 
 General modules (developed by us) 
 WordNet tools 
 Functions to use a WordNet in LMF format 
 Word Sense Disambiguation systems 
 UKB: unsupervised 
 SVM: supervised (for Dutch, derived from DutchSemCor) 
 Multiword tagger 
 Tags multiword sequences of terms according to WordNet 
 OntoTagger 
 Inserts (semantic) labels into the KAF representation on the basis 
of the lemma or WordNet synset representations of the text
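
To make the multiword tagger bullet more concrete, here is a minimal sketch of one common way to find multiword sequences: greedy longest-match lookup of lemma n-grams in a set of multiword entries. This is an assumed technique for illustration, not necessarily how the CLTL module works; the entry set, the function name `tag_multiwords`, and the `max_len` limit are all hypothetical.

```python
def tag_multiwords(lemmas, multiwords, max_len=4):
    """Return (start, end, entry) spans, end exclusive, by greedy longest match."""
    spans = []
    i = 0
    while i < len(lemmas):
        match = None
        # Try the longest candidate first so that longer entries win.
        for n in range(min(max_len, len(lemmas) - i), 1, -1):
            candidate = " ".join(lemmas[i:i + n])
            if candidate in multiwords:
                match = (i, i + n, candidate)
                break
        if match:
            spans.append(match)
            i = match[1]  # continue after the matched span
        else:
            i += 1
    return spans


# Hypothetical multiword entries, standing in for those found in a WordNet.
ENTRIES = {"ice cream", "new york", "new york city"}
LEMMAS = ["the", "ice", "cream", "be", "in", "new", "york", "city"]
print(tag_multiwords(LEMMAS, ENTRIES))
```

In a KAF setting the returned spans would be turned into multiword terms whose spans reference the underlying token ids.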
Our software III 
 General modules (developed by us) 
 Named Entity Recognizer 
 Detects dates and locations using specific resources + 
GeoNames 
 KyBot 
 Extracts tuples and relations by means of a set of profiles formulated 
using semantic and structural properties
Our software IV 
 OpeNER related (developed by us) 
 Hotel property tagger 
 Detects aspects related to cleanliness, staff, breakfast, 
rooms… 
 Term polarity tagger 
 Positive/negative terms, intensifiers, negators … 
 Opinion miner 
 Detects opinions: target + holder + expression 
 2 rule-based versions, 1 machine-learning version
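
The three term categories named for the polarity tagger (polar terms, intensifiers, negators) can be combined in a simple rule-based way. The sketch below is a toy illustration under my own assumptions: the lexicons, the weights, and the rule that a modifier applies to the next polar term are all hypothetical and are not taken from the OpeNER modules.

```python
# Toy lexicons (hypothetical values, for illustration only).
POLARITY = {"good": 1.0, "clean": 1.0, "dirty": -1.0, "bad": -1.0}
INTENSIFIERS = {"very": 2.0, "slightly": 0.5}
NEGATORS = {"not", "never"}


def score_terms(lemmas):
    """Sum term polarities, letting modifiers act on the next polar term."""
    total = 0.0
    weight = 1.0  # pending modifier built up from intensifiers and negators
    for lemma in lemmas:
        if lemma in NEGATORS:
            weight *= -1.0
        elif lemma in INTENSIFIERS:
            weight *= INTENSIFIERS[lemma]
        elif lemma in POLARITY:
            total += weight * POLARITY[lemma]
            weight = 1.0
        else:
            weight = 1.0  # any other word resets pending modifiers
    return total


print(score_terms(["the", "room", "be", "not", "very", "clean"]))
```

A real opinion miner would additionally link each scored expression to its target and holder, which this sketch does not attempt.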
Our software V 
 NewsReader related (developed by us) 
 Discourse Module 
 Splits incoming texts into headers and paragraphs 
 Factuality Classifier 
 Classifies whether a statement is factual/probable/possible or 
not 
 Event Coreference 
 Compares descriptions of events within and across 
documents to decide if they refer to the same events.
