OWN-PT: TAKING STOCK
ALEXANDRE RADEMAKER (IBM RESEARCH BR, EMAP, FGV)
JOINT WORK WITH VALERIA DE PAIVA, LIVY REAL, FABRICIO CHALUB
NLCS 2018, OXFORD, UK
LEXICAL RESOURCES FOR PORTUGUESE?
• 6th most spoken language in the world (Ethnologue) or 7th (Wikipedia)
• Very few open source resources for PT, almost no connections between them
• Discuss:
• Initial developments
• NOMLEX-PT
• Applications
• Next steps
FOLK WISDOM
Linguistic resources are very easy to start working on, very
hard to improve on and extremely difficult to maintain, as
funding usually only works for new resources.
• Trying to buck the trend
• Review of work in the last 8 years…
OPENWORDNET-PT
• Wordnet is the most paradigmatic resource for English NLP
• Want a Portuguese Wordnet that is open access, downloadable and updateable, so that
it can be improved by the community
• Especially interested in NLP for KR and automated deduction (our team)
• But also word sense disambiguation, information retrieval, automatic text classification,
automatic text summarization, question answering, etc.
A BIT OF HISTORY
• Initially a transformation and extension of data from the Universal Wordnet/MENTA
(UWN/MENTA)
• Machine learning to construct relationships between graphs from Wikipedia in several
languages, plus machine-readable dictionaries
• Continuously improved through linguistically motivated additions and removals, either manual
or semi-automatic, making use of large Portuguese corpora (DHBB, Bosque, …)
• Two-tiered methodology: high precision for popular words, high recall for the long tail
• Could be used for other languages: best for languages well represented on the internet and
with a reasonably large Wikipedia.
NOMLEX-PT
• Useful for linguistic research as well as for information extraction; basic example:
destruction/destroy
• An extension of OWN-PT, with links connecting deverbal nouns with their
corresponding verbs.
• Bootstrapped manually, created c. 2,000 entries via translation of the English NOMLEX
• Useful to check issues with the coherence and richness of OpenWN-PT, e.g.
aviltar/aviltamento
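
The verb/noun links described above can be pictured with a small Python sketch. Everything in it is an illustrative assumption rather than the actual NOMLEX-PT format or tooling: the class `NominalizationLexicon`, its method names, and the toy lemma set are hypothetical, and the Portuguese pair destruir/destruição is simply the counterpart of the English destruction/destroy example. The `missing_from` method shows one possible reading of the "coherence" check: list linked words that a wordnet's lemma set does not contain.

```python
from collections import defaultdict


class NominalizationLexicon:
    """Toy store of verb <-> deverbal noun links (hypothetical, not the NOMLEX-PT format)."""

    def __init__(self):
        self._nouns_by_verb = defaultdict(set)
        self._verbs_by_noun = defaultdict(set)

    def add(self, verb, noun):
        """Record a link in both directions."""
        self._nouns_by_verb[verb].add(noun)
        self._verbs_by_noun[noun].add(verb)

    def nouns_for(self, verb):
        """Nominalizations linked to a verb, sorted for stable output."""
        return sorted(self._nouns_by_verb.get(verb, ()))

    def verbs_for(self, noun):
        """Verbs linked to a deverbal noun, sorted for stable output."""
        return sorted(self._verbs_by_noun.get(noun, ()))

    def missing_from(self, wordnet_lemmas):
        """Linked words that the given lemma set does not contain."""
        linked = set(self._nouns_by_verb) | set(self._verbs_by_noun)
        return sorted(linked - set(wordnet_lemmas))


lexicon = NominalizationLexicon()
lexicon.add("destruir", "destruição")
lexicon.add("aviltar", "aviltamento")

# Made-up lemma set standing in for a wordnet; which word is absent is invented
# here purely to exercise the check.
toy_wordnet = {"destruir", "destruição", "aviltar"}
gaps = lexicon.missing_from(toy_wordnet)
```

Keeping both directions indexed makes the noun-to-verb lookup as cheap as the verb-to-noun one, which is convenient when the starting point is a noun found in text.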
SOCIAL INTERFACE
• New social and collaborative interface implemented and deployed in 2016, “Seeing is
Correcting”
• OWN-PT part of Open Multilingual Wordnet, and Global WordNet Foundation
• Simple interface → content perspicuous
• Many experiments, described on the website, including
• Verb lexicon improvements, gentilics, morpholinks (many not yet finished)
• OWN-PT part of FreeLing, Google Translate, BabelNet, Onto.PT
APPLICATIONS
• FreeLing
• Tweets for football
• DHBB, recently open-sourced. Biographical data is very interesting, but requires good NER
• Comparison of wordnet-like resources for PT
• Lexical resources do not thrive in a vacuum; they need other resources to interact with.
• Universal Dependencies
• Linked Open Data: how to exploit?
CONCLUSION
• Despite:
• Very distributed team
• Different timelines and expectations
• No official project for all
• Quite a lot achieved:
• Use in major international projects like BabelNet, Google Translate, FreeLing, …
• 18 main publications, at least as many in different stages of preparation
• New development plan 2018-2020 from Oxford
SOME REFERENCES
• Valeria de Paiva, Alexandre Rademaker, and Gerard de Melo. OpenWordNet-PT: An open
Brazilian Wordnet for reasoning. In Proceedings of COLING 2012, Mumbai, India.
• Fabricio Chalub, Livy Real, Alexandre Rademaker, and Valeria de Paiva. Semantic links for
Portuguese. In 10th edition of the Language Resources and Evaluation Conference (LREC),
Portoroz, Slovenia, May 2016.
• Valeria de Paiva, Livy Real, Alexandre Rademaker, and Gerard de Melo. NomLex-PT: A lexicon of
Portuguese nominalizations. In LREC 2014, Reykjavik, Iceland, May 2014.
• Pedro Delfino, Bruno Cuconato, Guilherme Paulino Passos, Gerson Zaverucha, and Alexandre
Rademaker. Using OpenWordNet-PT for question answering on legal domain. In Global Wordnet
Conference 2018, Singapore, January 2018.
MORE REFERENCES
• Lluis Padro and Evgeny Stanilovsky. FreeLing 3.0: Towards Wider Multilinguality. In
Proceedings of the Language Resources and Evaluation Conference (LREC 2012).
• Valeria de Paiva, Dario Oliveira, Suemi Higuchi, Alexandre Rademaker, and Gerard de Melo.
Exploratory information extraction from a historical dictionary. In e-Science (e-Science), volume
2, pages 11–18. IEEE, October 2014.
• Alexandre Rademaker, Fabricio Chalub, Livy Real, Claudia Freitas, Eckhard Bick, and Valeria
de Paiva. Universal Dependencies for Portuguese. In Depling, pages 197–206, Pisa, Italy,
September 2017.
• Livy Real, Fabricio Chalub, Valeria de Paiva, Claudia Freitas, and Alexandre Rademaker.
Seeing is correcting: curating lexical resources using social interfaces. In Fourth Workshop on
Linked Data in Linguistic Resources and Applications (LDL 2015), Beijing, China.

OWN-PT: Taking Stock

  • 1. OWN-PT: TAKING STOCK
      ALEXANDRE RADEMAKER (IBM RESEARCH BR, EMAP, FGV)
      JOINT WORK WITH VALERIA DE PAIVA, LIVY REAL, FABRICIO CHALUB
      NLCS 2018, OXFORD, UK
  • 2. LEXICAL RESOURCES FOR PORTUGUESE?
      • 6th most spoken language in the world (Ethnologue) or 7th (Wikipedia)
      • Very few open source resources for PT, almost no connections between them
      • Discuss:
          • Initial developments
          • NOMLEX-PT
          • Applications
          • Next steps
  • 3. FOLK WISDOM
      Linguistic resources are very easy to start working on, very hard to improve on and extremely difficult to maintain, as funding usually only works for new resources.
      • Trying to buck the trend
      • Review of work in the last 8 years
  • 4. OPENWORDNET-PT
      • WordNet is the most paradigmatic resource for English NLP
      • Want a Portuguese wordnet that is open access, downloadable and updateable, so that it can be improved by the community
      • Especially interested in NLP for KR and automated deduction (our team)
      • But also word sense disambiguation, information retrieval, automatic text classification, automatic text summarization, question answering, etc.
  • 5. A BIT OF HISTORY
      • Initially a transformation and extension of data from the UniversalWordnet/MENTA (UWN/MENTA)
      • UWN/MENTA uses machine learning to construct relationships between graphs from Wikipedia in several languages, plus machine-readable dictionaries
      • Continuously improved through linguistically motivated additions and removals, either manual or semi-automatic, making use of large Portuguese corpora (DHBB, Bosque, …)
      • Two-tiered methodology: high precision for popular words, high recall for the long tail
      • Could be used for other languages: best for languages well represented on the internet and with a reasonably large Wikipedia
  • 6. NOMLEX-PT
      • Useful for linguistic research as well as for information extraction; basic example: destruction/destroy
      • An extension of OWN-PT, with links connecting deverbal nouns with their corresponding verbs
      • Bootstrapped manually: c. 2,000 entries created via translation of the English NOMLEX
      • Useful to check issues with the coherence and richness of OpenWN-PT, e.g. aviltar/aviltamento
  • 7. SOCIAL INTERFACE
      • New social and collaborative interface implemented and deployed in 2016, “Seeing is Correcting”
      • OWN-PT part of Open Multilingual Wordnet, and Global WordNet Foundation
      • Simple interface → content perspicuous
      • Many experiments, described on the website, including:
          • Verb lexicon improvements, gentilics, morpholinks (many not finished yet)
      • OWN-PT part of FreeLing, Google Translate, BabelNet, Onto.PT
  • 8. APPLICATIONS
      • FreeLing
      • Tweets for football
      • DHBB, recently open-sourced. Biographical data is very interesting, but requires good NER
      • Comparison of wordnet-like resources for PT
      • Lexical resources do not thrive in a vacuum; they need other resources to interact with:
          • Universal Dependencies
          • Linked Open Data: how to exploit?
  • 9. CONCLUSION
      • Despite:
          • Very distributed team
          • Different timelines and expectations
          • No official project for all
      • Quite a lot achieved:
          • Use in major international projects like BabelNet, Google Translate, FreeLing, …
          • 18 main publications, at least as many in different stages of preparation
          • New development plan 2018-2020 from Oxford
  • 10. SOME REFERENCES
      • Valeria de Paiva, Alexandre Rademaker, and Gerard de Melo. OpenWordNet-PT: An open Brazilian Wordnet for reasoning. In Proceedings of COLING 2012, Mumbai, India.
      • Fabricio Chalub, Livy Real, Alexandre Rademaker, and Valeria de Paiva. Semantic links for Portuguese. In 10th Edition of (LREC), Portoroz, Slovenia, May 2016.
      • Valeria de Paiva, Livy Real, Alexandre Rademaker, and Gerard de Melo. NomLex-PT: A lexicon of Portuguese nominalizations. LREC 2014, Reykjavik, Iceland, May 2014.
      • Pedro Delfino, Bruno Cuconato, Guilherme Paulino Passos, Gerson Zaverucha, and Alexandre Rademaker. Using OpenWordNet-PT for question answering on legal domain. In Global Wordnet Conference 2018, Singapore, January 2018.
  • 11. MORE REFERENCES
      • Lluis Padro and Evgeny Stanilovsky. FreeLing 3.0: Towards Wider Multilinguality. In Proceedings of the Language Resources and Evaluation Conference (LREC 2012).
      • Valeria De Paiva, Dario Oliveira, Suemi Higuchi, Alexandre Rademaker, and Gerard De Melo. Exploratory information extraction from a historical dictionary. e-Science (e-Science), volume 2, pages 11–18. IEEE, October 2014.
      • Alexandre Rademaker, Fabricio Chalub, Livy Real, Claudia Freitas, Eckhard Bick, and Valeria de Paiva. Universal Dependencies for Portuguese. (Depling), pages 197–206, Pisa, Italy, September 2017.
      • Livy Real, Fabricio Chalub, Valeria de Paiva, Claudia Freitas, and Alexandre Rademaker. Seeing is correcting: curating lexical resources using social interfaces. Fourth Workshop on Linked Data in Linguistic Resources and Applications (LDL 2015), Beijing, China.
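
The NOMLEX-PT slide says that verb/noun links are useful for checking the coherence of the wordnet (e.g. aviltar/aviltamento). A minimal sketch of that idea follows, assuming the links are simple verb/noun pairs and the wordnet is reduced to a set of lemmas. The names `Nominalization`, `missing_members`, `PAIRS` and `TOY_LEXICON` are hypothetical, the data is a toy example, and this is not the actual OWN-PT or NOMLEX-PT data model or tooling.

```python
from typing import Dict, Iterable, List, NamedTuple, Set


class Nominalization(NamedTuple):
    """A hypothetical NOMLEX-style link between a verb and its deverbal noun."""
    verb: str
    noun: str


def missing_members(
    pairs: Iterable[Nominalization], lexicon: Set[str]
) -> Dict[Nominalization, List[str]]:
    """For each pair, list the members that are absent from the lexicon.

    Pairs whose verb and noun are both present are left out of the result.
    """
    gaps: Dict[Nominalization, List[str]] = {}
    for pair in pairs:
        absent = [word for word in (pair.verb, pair.noun) if word not in lexicon]
        if absent:
            gaps[pair] = absent
    return gaps


# Toy data (assumption): two nominalization links and a tiny lemma set
# standing in for the wordnet. The noun "aviltamento" is deliberately missing.
PAIRS = [
    Nominalization("destruir", "destruição"),
    Nominalization("aviltar", "aviltamento"),
]
TOY_LEXICON = {"destruir", "destruição", "aviltar"}

gaps = missing_members(PAIRS, TOY_LEXICON)
```

In this toy setup, `gaps` flags the aviltar/aviltamento pair because the noun is not in the lemma set, which is the kind of gap a curator would then review by hand.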