KafNafParserPy 
A Python library for parsing KAF/NAF 
Ruben Izquierdo Bevia 
Vrije Universiteit Amsterdam 
CLTL meeting 19th Nov 2014
What is KAF / NAF ? 
• Annotation formats to represent linguistic information 
o XML based 
o Different kinds of information in separate, interconnected layers 
o Easy to use in NLP pipelines 
• KAF 
o https://github.com/opener-project/kaf/wiki/KAF-structure-overview 
• NAF 
o http://www.newsreader-project.eu/files/2013/01/techreport.pdf
What is KafNafParserPy? 
• It is a Python module/library 
• It parses a KAF or NAF file 
o Reads all the layers 
o Provides access to the information through Python classes (methods and attributes) 
• It generates new KAF/NAF files 
o Create new layers 
o Modify existing ones 
• It converts NAF ↔ KAF
KafNafParserPy 
philosophy 
• No validation against a DTD (the file only has to be valid XML) 
• One Python object for each XML element (header, text, token, terms…) 
• The attributes are not “parsed/read” 
o The KAF/NAF attributes are not defined as attributes of a class 
o Only a pointer to the XML element is stored 
• It provides access to all the attributes in “real time” 
• Modifications are made “on the fly” 
o If you change the object in memory → you need to dump it to a new file to keep the results
KafNafParserPy 
philosophy 
• Class Cterm (encapsulate a KAF/NAF term) 
o Attributes: 
• string lemma 
• string pos 
• string morphofeat 
• Cspan span …. 
o Methods 
• get_lemma(…) → returns the lemma attribute 
• get_pos(…) → returns the pos attribute 
• …..
KafNafParserPy 
philosophy 
• Class Cterm (encapsulate a KAF/NAF term) 
o Attributes: 
• string type (is NAF or KAF?) 
• Pointer to the xml element 
o Methods 
• get_lemma(…) → returns xml_element.get('lemma') 
• get_pos(…) → returns xml_element.get('pos') 
• get_id(…) 
o xml_element.get('id') for NAF 
o xml_element.get('tid') for KAF
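The design on this slide can be sketched in a few lines. This is an illustrative stand-in, not the library's own code: the class name TermSketch and the set_lemma method are hypothetical, and the standard-library xml.etree.ElementTree is used here instead of lxml so the sketch runs without extra installs.

```python
import xml.etree.ElementTree as ET


class TermSketch:
    """Hypothetical Cterm-like wrapper: it keeps a pointer to the XML element only."""

    def __init__(self, node, file_type="NAF"):
        self.type = file_type  # "NAF" or "KAF"
        self.node = node       # pointer to the XML element, nothing is copied

    def get_lemma(self):
        # Read from the element at call time, not from a cached attribute
        return self.node.get("lemma")

    def get_pos(self):
        return self.node.get("pos")

    def get_id(self):
        # NAF keeps the identifier in "id", KAF in "tid" (as stated on the slide)
        return self.node.get("id" if self.type == "NAF" else "tid")

    def set_lemma(self, lemma):
        # Hypothetical setter: the write goes straight to the element ("on the fly")
        self.node.set("lemma", lemma)


naf_node = ET.fromstring('<term id="t1" lemma="war" pos="N"/>')
kaf_node = ET.fromstring('<term tid="t1" lemma="war" pos="N"/>')

naf_term = TermSketch(naf_node, "NAF")
kaf_term = TermSketch(kaf_node, "KAF")

print(naf_term.get_id(), kaf_term.get_id())

naf_term.set_lemma("battle")
print(naf_node.get("lemma"))
```

Because the wrapper stores no copies, a change made through it is immediately visible in the underlying tree, which is why a later dump to file picks it up.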
Getting Started I 
• https://github.com/cltl/KafNafParserPy 
• Basic steps: 
o Install lxml library for Python 
• pip install lxml 
o Clone the repository 
• git clone https://github.com/cltl/KafNafParserPy 
o Make it available for Python 
• Put it in the same folder as the scripts that import it 
• Add it to PYTHONPATH 
• Create a symbolic link in your virtualenv 
• …
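Adding the clone to PYTHONPATH also has a run-time counterpart inside a script. A minimal sketch, assuming the repository was cloned into a folder such as ~/code (a hypothetical location; adjust it to wherever the clone lives):

```python
import os
import sys

# Hypothetical location: the folder that CONTAINS the cloned KafNafParserPy directory
repo_parent = os.path.expanduser("~/code")

# Put it at the front of the module search path, once
if repo_parent not in sys.path:
    sys.path.insert(0, repo_parent)

# From here on, "import KafNafParserPy" can be resolved if the clone exists there
print(sys.path[0])
```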
Getting Started II 
• Documentation: 
o HTML: http://kyoto.let.vu.nl/~izquierdo/api/KafNafParserPy/ 
o PDF: http://kyoto.let.vu.nl/~izquierdo/api/KafNafParserPy/api.pdf 
• The entry point is always 
o Module KafNafParserPy 
o Class KafNafParser
Getting tokens 
• How can I get them? 
o All we have is a “KafNafParser” object 
• Go to the API and check the methods of the KafNafParser class
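What reading the token layer amounts to can be sketched without the library. The example below is an assumption-laden illustration: it uses the standard-library xml.etree.ElementTree on a tiny hand-written NAF-like fragment (the sample text is invented), and it does not show KafNafParserPy's own methods, which should be looked up in the API.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written fragment of a NAF text layer (illustrative data only)
NAF_SAMPLE = """<NAF>
  <text>
    <wf id="w1" sent="1" offset="0" length="3">The</wf>
    <wf id="w2" sent="1" offset="4" length="3">War</wf>
  </text>
</NAF>"""

root = ET.fromstring(NAF_SAMPLE)

# Each token is a <wf> element: identifier and sentence are attributes, the word is the text
tokens = [(wf.get("id"), wf.text, wf.get("sent")) for wf in root.iter("wf")]

for tok_id, text, sent in tokens:
    print(tok_id, text, sent)
```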
Getting terms 
• Use KafNafParser::get_terms(…) 
• Use methods of Cterm
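The link between the term layer and the token layer can be illustrated with a small sketch. It assumes NAF-style <term> elements whose <span> points at token identifiers; the sample data is invented, and the code uses the standard-library xml.etree.ElementTree rather than the library's Cterm class.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment with a text layer and a terms layer (illustrative data only)
NAF_SAMPLE = """<NAF>
  <text>
    <wf id="w1" sent="1">The</wf>
    <wf id="w2" sent="1">wars</wf>
  </text>
  <terms>
    <term id="t1" lemma="the" pos="D">
      <span><target id="w1"/></span>
    </term>
    <term id="t2" lemma="war" pos="N">
      <span><target id="w2"/></span>
    </term>
  </terms>
</NAF>"""

root = ET.fromstring(NAF_SAMPLE)

# Index the tokens by identifier so that term spans can be resolved
token_text = {wf.get("id"): wf.text for wf in root.iter("wf")}

terms = []
for term in root.iter("term"):
    # The span lists the identifiers of the tokens this term covers
    span_ids = [target.get("id") for target in term.findall("span/target")]
    words = [token_text[tok_id] for tok_id in span_ids]
    terms.append((term.get("id"), term.get("lemma"), term.get("pos"), words))

for entry in terms:
    print(entry)
```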
Modifying one token 
• Change the text of token w7 from “War” to “Battle”
Modifying one token 
• Object “my_parser” after set_text(…) 
o is updated with “Battle” in memory 
o Original file “entities_example.naf” is not changed 
• If we want to keep the changes 
o Closing the program → memory is cleared → the changes are lost 
o We need to dump the object to a new file 
• The target can be a filename (string) or an open file
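The behaviour described here (change in memory, original file untouched, explicit dump to keep the result) can be reproduced in a small stand-alone sketch. It uses the standard-library xml.etree.ElementTree instead of the library's set_text(…) and dump, and the output filename entities_example.modified.naf is a hypothetical choice.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# One-token stand-in for the input file (illustrative data only)
NAF_SAMPLE = '<NAF><text><wf id="w7" sent="1">War</wf></text></NAF>'

workdir = tempfile.mkdtemp()
original_path = os.path.join(workdir, "entities_example.naf")
modified_path = os.path.join(workdir, "entities_example.modified.naf")  # hypothetical name

with open(original_path, "w", encoding="utf-8") as fh:
    fh.write(NAF_SAMPLE)

# Load the file and change token w7 in memory only
tree = ET.parse(original_path)
for wf in tree.getroot().iter("wf"):
    if wf.get("id") == "w7":
        wf.text = "Battle"

# Nothing is saved until the tree is written out explicitly, here to a new file
tree.write(modified_path, encoding="utf-8", xml_declaration=True)

original_text = ET.parse(original_path).getroot().find("text/wf").text
modified_text = ET.parse(modified_path).getroot().find("text/wf").text
print(original_text, modified_text)
```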
Read entities 
• KafNafParser::get_entities() is an iterator for 
entities 
• Centity::get_external_references() is an iterator for 
external references
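The nested iteration (entities, then the external references of each entity) can be sketched as two generators. This is an illustration on invented data using the standard-library xml.etree.ElementTree; the function names iter_entities and iter_external_references are hypothetical and only mirror the idea behind get_entities() and get_external_references().

```python
import xml.etree.ElementTree as ET

# Hand-written fragment of an entity layer (illustrative data only)
NAF_SAMPLE = """<NAF>
  <entities>
    <entity id="e1" type="LOCATION">
      <references><span><target id="t3"/></span></references>
      <externalReferences>
        <externalRef resource="example_resource"
                     reference="http://example.org/Amsterdam"
                     confidence="0.9"/>
      </externalReferences>
    </entity>
  </entities>
</NAF>"""


def iter_entities(root):
    """Yield every <entity> element in the document."""
    for entity in root.iter("entity"):
        yield entity


def iter_external_references(entity):
    """Yield every <externalRef> attached to one entity."""
    for ext_ref in entity.findall("externalReferences/externalRef"):
        yield ext_ref


root = ET.fromstring(NAF_SAMPLE)

found = []
for entity in iter_entities(root):
    for ext_ref in iter_external_references(entity):
        found.append((entity.get("id"), entity.get("type"),
                      ext_ref.get("resource"), ext_ref.get("reference")))

print(found)
```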
Adding a new external 
reference 
1. Create the new external reference object 
o “from KafNafParserPy import KafNafParser” 
o “from KafNafParserPy import *” 
2. Set the attributes with the set_XYZ() methods 
3. Add the new object to the layer/tree 
o By adding it to the specific element (the entity if we have it) 
o By adding it to the general parser object providing the identifier (sometimes not 
implemented)
Adding a new external 
reference 
• Create the new external reference 
• Find the element where we want to add it 
• Use the “adding” method of the element
Adding a new external 
reference 
• Create the new external reference 
• Use the “adding” method of the parser, providing the id of the target element 
• Not always implemented (but quite easy to add)
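Both routes (adding to the element itself, or adding through a top-level helper that looks the element up by id) can be sketched as follows. All function names and sample values here are hypothetical, and the code works on a plain standard-library xml.etree.ElementTree tree rather than on the library's objects.

```python
import xml.etree.ElementTree as ET

# Entity without external references yet (illustrative data only)
NAF_SAMPLE = """<NAF>
  <entities>
    <entity id="e1" type="LOCATION">
      <references><span><target id="t3"/></span></references>
    </entity>
  </entities>
</NAF>"""


def make_external_ref(resource, reference, confidence):
    """Steps 1 and 2: create the new element and set its attributes."""
    ext_ref = ET.Element("externalRef")
    ext_ref.set("resource", resource)
    ext_ref.set("reference", reference)
    ext_ref.set("confidence", confidence)
    return ext_ref


def add_to_entity(entity, ext_ref):
    """Route A: add it to the specific element we already hold."""
    container = entity.find("externalReferences")
    if container is None:
        container = ET.SubElement(entity, "externalReferences")
    container.append(ext_ref)


def add_by_entity_id(root, entity_id, ext_ref):
    """Route B: add it from the top level, given only the entity identifier."""
    for entity in root.iter("entity"):
        if entity.get("id") == entity_id:
            add_to_entity(entity, ext_ref)
            return True
    return False  # no entity with that identifier


root = ET.fromstring(NAF_SAMPLE)
entity = root.find("entities/entity")

add_to_entity(entity, make_external_ref("example_resource", "http://example.org/a", "0.9"))
added = add_by_entity_id(root, "e1", make_external_ref("example_resource", "http://example.org/b", "0.5"))
missing = add_by_entity_id(root, "e99", make_external_ref("example_resource", "http://example.org/c", "0.1"))

references = [r.get("reference") for r in entity.findall("externalReferences/externalRef")]
print(references, added, missing)
```

Route B is a thin lookup on top of route A, which is in line with the remark that a missing parser-level method is easy to add.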
KafNafParserPy 
Ruben Izquierdo Bevia 
ruben.izquierdobevia@vu.nl 
http://rubenizquierdobevia.com 
• GitHub: https://github.com/cltl/KafNafParserPy 
• API html: http://kyoto.let.vu.nl/~izquierdo/api/KafNafParserPy/ 
• API pdf: http://kyoto.let.vu.nl/~izquierdo/api/KafNafParserPy/api.pdf

 

KafNafParserPy: a python library for parsing/creating KAF and NAF files

  • 1. KafNafParserPy A python library for parsing KAF/NAF Ruben Izquierdo Bevia Vrije University of Amsterdam CLTL meeting 19th Nov 2014
  • 2. What is KAF / NAF?
      • Annotation formats to represent linguistic information
          o XML based
          o Different information in different, interconnected layers
          o Easy to use in NLP pipelines
      • KAF
          o https://github.com/opener-project/kaf/wiki/KAF-structure-overview
      • NAF
          o http://www.newsreader-project.eu/files/2013/01/techreport.pdf
  • 3. What is KafNafParserPy?
      • It is a Python module/library
      • It can parse a KAF or NAF file
          o Reads all the layers
          o Provides access to the information by means of Python classes (methods and attributes)
      • It can generate new KAF/NAF files
          o Create new layers
          o Modify existing ones
      • It can convert between NAF and KAF
  • 4. KafNafParserPy philosophy
      • No validation against a DTD (the file only has to be valid XML)
      • One Python object for each XML element (header, text, token, terms, …)
      • The attributes are not “parsed/read” up front
          o The KAF/NAF attributes are not defined as attributes of a class
          o Only a pointer to the XML element is stored
      • Access to all the attributes is provided in “real time”
      • Modifications are made “on the fly”
          o If you change the object in memory → you will need to dump it to a new file to keep the results
  • 5. KafNafParserPy philosophy
      • Class Cterm (encapsulates a KAF/NAF term)
          o Attributes:
              • string lemma
              • string pos
              • string morphofeat
              • Cspan span
              • …
          o Methods:
              • get_lemma(…) → returns the lemma attribute
              • get_pos(…) → returns the pos attribute
              • …
  • 7. KafNafParserPy philosophy
      • Class Cterm (encapsulates a KAF/NAF term)
          o Attributes:
              • string type (is it NAF or KAF?)
              • Pointer to the XML element
          o Methods:
              • get_lemma(…) → returns xml_element.get(‘lemma’)
              • get_pos(…) → returns xml_element.get(‘pos’)
              • get_id(…)
                  o xml_element.get(‘id’) for NAF
                  o xml_element.get(‘tid’) for KAF
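
The pointer-based design described on this slide can be sketched with the standard library alone. The block below is not KafNafParserPy code: it uses `xml.etree.ElementTree` instead of lxml, and the class name `TermWrapper` and the sample `<term>` elements are hypothetical, chosen only to mirror the attribute names listed above.

```python
import xml.etree.ElementTree as ET


class TermWrapper:
    """Hypothetical stand-in for a Cterm-like class.

    It stores only the format type and a pointer to the XML element,
    never copies of the attribute values.
    """

    def __init__(self, node, fmt="NAF"):
        self.type = fmt   # "NAF" or "KAF"
        self.node = node  # pointer to the underlying XML element

    def get_lemma(self):
        return self.node.get("lemma")

    def get_pos(self):
        return self.node.get("pos")

    def get_id(self):
        # The identifier attribute is named differently in the two formats.
        return self.node.get("id" if self.type == "NAF" else "tid")


# Made-up sample elements, one per format.
naf_term = ET.fromstring('<term id="t1" lemma="war" pos="N"/>')
kaf_term = ET.fromstring('<term tid="t1" lemma="war" pos="N"/>')

naf = TermWrapper(naf_term, "NAF")
kaf = TermWrapper(kaf_term, "KAF")
print(naf.get_id(), naf.get_lemma(), naf.get_pos())
print(kaf.get_id())

# Values are read from the element on every call, so a change made to
# the XML element is visible through the wrapper immediately.
naf_term.set("lemma", "battle")
print(naf.get_lemma())
```

Because the wrapper holds no cached values, there is nothing to keep in sync between the Python object and the XML tree, which is presumably what the slides mean by access in “real time”.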
  • 8. Getting Started I
      • https://github.com/cltl/KafNafParserPy
      • Basic steps:
          o Install the lxml library for Python
              • pip install lxml
          o Clone the repository
              • git clone https://github.com/cltl/KafNafParserPy
          o Make it available to Python
              • Put it in the same folder as the scripts that will import it
              • Add it to PYTHONPATH
              • Create a symbolic link in your virtualenv
              • …
  • 9. Getting Started II
      • Documentation:
          o HTML: http://kyoto.let.vu.nl/~izquierdo/api/KafNafParserPy/
          o PDF: http://kyoto.let.vu.nl/~izquierdo/api/KafNafParserPy/api.pdf
      • The entry point is always:
          o Module KafNafParserPy
          o Class KafNafParser
  • 10. Getting tokens
      • How could I get them?
          o We only have a “KafNafParser” object
      • Go to the API documentation and check the methods of the KafNafParser class
  • 14. Getting terms • Use KafNafParser::get_terms(…) • Use methods of Cterm
  • 15. Modifying one token
      • Change the text of token w7 from “War” to “Battle”
  • 16. Modifying one token
      • Object “my_parser” after set_text(…)
          o Is updated with “Battle” in memory
          o The original file “entities_example.naf” is not changed
      • If we want to keep the changes
          o Close the program → memory is cleared → changes are lost
          o We need to dump the object to a new file
              • The target can be a filename (string) or an open file
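
The in-memory versus on-disk behaviour described here can be reproduced with a small standard-library sketch. It does not use KafNafParserPy: `xml.etree.ElementTree` stands in for the parser object, the NAF snippet is a made-up minimal sample, and the output filename is hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Made-up minimal token layer containing the token from the slides.
NAF_SAMPLE = (
    '<NAF version="v3">'
    '<text><wf id="w7" sent="1">War</wf></text>'
    '</NAF>'
)

workdir = tempfile.mkdtemp()
original_path = os.path.join(workdir, "entities_example.naf")
modified_path = os.path.join(workdir, "entities_example.modified.naf")
with open(original_path, "w", encoding="utf-8") as handle:
    handle.write(NAF_SAMPLE)

tree = ET.parse(original_path)
token = tree.getroot().find("./text/wf[@id='w7']")
token.text = "Battle"  # the change exists only in memory so far

# The file on disk still holds the old text at this point.
with open(original_path, encoding="utf-8") as handle:
    original_after_edit = handle.read()

# Dumping the tree to a new file is what makes the change persistent.
tree.write(modified_path, encoding="utf-8", xml_declaration=True)
with open(modified_path, encoding="utf-8") as handle:
    modified_content = handle.read()

print("original still has 'War':", "War" in original_after_edit)
print("new file has 'Battle':", "Battle" in modified_content)
```

Writing to a separate file keeps the input untouched, so a failed or unwanted modification never destroys the source annotation.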
  • 17. Read entities • KafNafParser::get_entities() is an iterator for entities • Centity::get_external_references() is an iterator for external references
  • 18. Adding a new external reference
      1. Create the new external reference object
          o “from KafNafParserPy import KafNafParser”
          o “from KafNafParserPy import *”
      2. Set the attributes with the set_XYZ() methods
      3. Add the new object to the layer/tree
          o By adding it to the specific element (the entity, if we have it)
          o By adding it to the general parser object, providing the identifier (sometimes not implemented)
  • 19. Adding a new external reference • Create the new external reference • Find the element where we want to add it • Use the “adding” method of the element
  • 20. Adding a new external reference
      • Create the new external reference
      • Use the “adding” method of the parser, providing the id of the element
      • Not always implemented (but quite easy to do)
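
The two routes from the last slides (adding to the element we already hold, or adding through the document by identifier) can be sketched as follows. This is a standard-library illustration, not the KafNafParserPy API: the helper names `make_external_ref`, `add_to_entity` and `add_by_entity_id` are hypothetical, the resource and reference values are made up, and the `externalReferences`/`externalRef` layout is an assumption about how NAF stores external references.

```python
import xml.etree.ElementTree as ET

# Made-up minimal entity layer with a single entity.
NAF_SAMPLE = """
<NAF version="v3">
  <entities>
    <entity id="e1" type="LOCATION">
      <references><span><target id="t3"/></span></references>
    </entity>
  </entities>
</NAF>
"""


def make_external_ref(resource, reference, confidence):
    """Steps 1 and 2: create the new element and set its attributes."""
    ext_ref = ET.Element("externalRef")
    ext_ref.set("resource", resource)
    ext_ref.set("reference", reference)
    ext_ref.set("confidence", confidence)
    return ext_ref


def add_to_entity(entity, ext_ref):
    """Step 3, first route: add it to the entity element we already hold."""
    container = entity.find("externalReferences")
    if container is None:
        container = ET.SubElement(entity, "externalReferences")
    container.append(ext_ref)


def add_by_entity_id(root, entity_id, ext_ref):
    """Step 3, second route: look the entity up by id, then add."""
    entity = root.find(f"./entities/entity[@id='{entity_id}']")
    if entity is None:
        raise KeyError(f"no entity with id {entity_id!r}")
    add_to_entity(entity, ext_ref)


root = ET.fromstring(NAF_SAMPLE)
entity = root.find("./entities/entity")

add_to_entity(entity, make_external_ref("example_resource_a", "ref-1", "0.9"))
add_by_entity_id(root, "e1", make_external_ref("example_resource_b", "ref-2", "0.5"))

refs = [ref.get("resource") for ref in entity.iter("externalRef")]
print(refs)
```

The second route is a thin layer over the first, which fits the remark that it is easy to add where it is missing.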
  • 21. KafNafParserPy
      • Ruben Izquierdo Bevia
      • ruben.izquierdobevia@vu.nl
      • http://rubenizquierdobevia.com
      • GitHub: https://github.com/cltl/KafNafParserPy
      • API (HTML): http://kyoto.let.vu.nl/~izquierdo/api/KafNafParserPy/
      • API (PDF): http://kyoto.let.vu.nl/~izquierdo/api/KafNafParserPy/api.pdf