Formalising the Swedish Constructicon
in Grammatical Framework
Normunds Grūzītis (1,3), Dana Dannélls (2), Benjamin Lyngfelt (2), Aarne Ranta (1)
(1) University of Gothenburg, Department of Computer Science and Engineering
(2) University of Gothenburg, Department of Swedish
(3) University of Latvia, Institute of Mathematics and Computer Science
ACL/IJCNLP Workshop on Grammar Engineering Across Frameworks
Beijing, China, July 30, 2015
Constructicon
• A collection of conventionalized (learned) pairings of form and meaning
(or function), typically based on principles of Construction Grammar, CxG
(e.g. Fillmore et al. 1988, Goldberg 1995)
– Semantics is associated directly with the surface form
– vs. Lexical units in a dictionary: pairings of word and meaning (frame)
• Including fixed multi-word units
• Each construction (cx) contains at least one variable element
– Often at least one fixed element as well
– Thus, “somewhere” in-between the syntax and the lexicon
• An example from Berkeley Constructicon: “make one’s way”
– Structure: {Motion verb [Verb] [PossNP]}
– Frame: MOTION
• [Theme They] {hacked their way} [Source out] [Goal into the open].
• [Theme We] {sang our way} [Path across Europe].
Constructicons
• Berkeley Constructicon (BCxn) for English
– A pilot project (around 70 cx), linked to Berkeley FrameNet
• Swedish Constructicon (SweCcn)
– An ongoing project (nearly 400 cx so far), partially linked to FrameNet
• ToDo: links to BCxn
• Brazilian Portuguese Constructicon
– An ongoing project
• ...
• A multilingual (interlingual) constructicon would allow for
non-compositional translation in a compositional way
– Constructions with a referential meaning may be linked via FrameNet frames,
while those with a more abstract grammatical function may be related in terms
of their grammatical properties
[Bäckström L., Lyngfelt B., Sköldberg E. (2014) Towards interlingual constructicography]
http://spraakbanken.gu.se/eng/sweccn
SweCcn
• Partially schematic multi-word units/expressions
• Particularly addresses constructions of relevance for second-language
learning, but also covers argument structure constructions
• Descriptions are manually derived from corpus examples
• Construction elements (CE):
– Internal CEs are a part of the cx
– External CEs are a part of the valency of the cx
– Described in more detail by attribute-value matrices specifying their syntactic and semantic features
• A central part of cx descriptions is the free text definitions
– ‘eat himself full’ vs. ‘feel himself tired’ (äta sig mätt vs. känna sig trött)
SweCcn → GF
• Task: convert the semi-formal SweCcn into a computational CxG
– Test Grammatical Framework (GF) as a framework for implementing CxG
• Why GF?
– There is no formal distinction between lexical and syntactic functions in GF –
fits the nature of constructicons
– The potential support for multilinguality
– Based on GF Resource Grammar Library (RGL) / an extension to RGL
– An extension to a FrameNet-based grammar and lexicon in GF
• Goals:
– From the linguistic point of view:
• Improve insights into the interaction between the lexicon and the grammar
• Allow for testing the linguistic descriptions of constructions
– From the language technology point of view:
• Facilitate language processing in both mono- and multilingual settings
– e.g. Information Extraction, Machine Translation
Conversion steps
• Preprocessing:
– Automatic normalization and consistency checking
– Automatic rewriting of the original structures in case of optional CEs and
alternative types of CEs, so that each combination has a separate GF function
• Does not apply to alternative LUs (either free variants or should be split into
alternative constructions, or the CE should be made more general)
– Automatic conversion of SweCcn categories to RGL categories
• May result in more rewriting
• Automatic generation of the abstract syntax
• Automatic generation of the concrete syntax
– By systematically applying the high-level RGL constructors
• And limited low-level means
• Manual verification and completion (ToDo)
– Requires good knowledge of, and linguistic intuition for, the language
Preprocessing examples
• behöva NP_1 till NP_2|VP →
behöva_V NP_1 till_Prep NP_2 | behöva_V NP_1 till_Prep VP
• snacka|prata|tala NP_indef → (~synonyms of “to talk”)
snacka_V|prata_V|tala_V aSg_Det CN |
snacka_V|prata_V|tala_V aPl_Det CN |
snacka_V|prata_V|tala_V CN
• V av Pn_refl (NP) →
V av_Prep refl_Pron NP | V av_Prep refl_Pron
• N|Adj+städa → (compounds)
N + städa_V | A + städa_V
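The rewriting of optional and alternative CEs can be illustrated with a minimal Python sketch. It is not the authors' preprocessing code: the function names `expand` and `is_category`, the underscore rendering of subscripts (`NP_1`), and the rule that categories start with an upper-case letter while lexical units do not are all assumptions made for the example. Alternative LUs are deliberately left untouched, as the conversion steps describe, and the later mapping of SweCcn categories to RGL categories is not covered.

```python
from itertools import product


def is_category(element: str) -> bool:
    # Assumption: categories (NP, VP, Pn_refl) start with an upper-case letter,
    # while lexical units (behöva, till, snacka) are lower-case words.
    return element[:1].isupper()


def expand(pattern: str) -> list[str]:
    """Expand optional '(X)' and alternative-category 'X|Y' elements.

    Each combination becomes one flat pattern, so that each can later be
    mapped to a separate function.
    """
    slots = []
    for token in pattern.split():
        optional = token.startswith("(") and token.endswith(")")
        core = token[1:-1] if optional else token
        parts = core.split("|")
        # Only alternatives between categories are split; alternative LUs stay as one slot.
        if len(parts) > 1 and all(is_category(p) for p in parts):
            options = parts
        else:
            options = [core]
        if optional:
            options = options + [""]  # the empty string stands for "element absent"
        slots.append(options)
    return [" ".join(e for e in combo if e) for combo in product(*slots)]


if __name__ == "__main__":
    for source in ("behöva NP_1 till NP_2|VP",
                   "V av Pn_refl (NP)",
                   "snacka|prata|tala NP_indef"):
        print(source, "->", expand(source))
```

With these assumptions, the first pattern yields two flat variants and the second yields one variant with and one without the optional NP, while the LU alternatives in the third are kept as a single slot.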
Abstract syntax
• Each construction is represented by one or more functions
depending on how many alternative structures are produced in the
preprocessing steps
• Each function takes one or more arguments that correspond to the
variable CEs of the respective alternative construction
• behöva_något_till_något_VP_1 : NP -> NP -> VP
behöva_något_till_något_VP_2 : NP -> VP -> VP
• snacka_NP_1 : CN -> VP
snacka_NP_2 : CN -> VP
snacka_NP_3 : CN -> VP
• verba_av_sig_transitiv_1 : V -> NP -> VP
verba_av_sig_transitiv_2 : V -> VP
• x_städa_1 : N -> VP
x_städa_2 : A -> VP
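How such signatures could be derived from the preprocessed patterns is sketched below in Python. This is an illustration under stated assumptions, not the authors' generator: the helper names `signature` and `abstract_functions` are hypothetical, fixed elements are assumed to be tagged lower-case tokens such as `behöva_V` or `till_Prep`, variable CEs are assumed to start with an upper-case letter, and the `_1`, `_2` suffix for alternative functions is an assumed naming convention.

```python
import re


def signature(name: str, index: int, pattern: str, result: str = "VP") -> str:
    """Build one abstract-syntax signature from a flat, preprocessed pattern.

    Only variable CEs (assumed to start with an upper-case letter) become
    arguments; fixed elements such as 'behöva_V' or '+' are skipped.
    """
    args = [re.sub(r"_\d+$", "", tok)  # drop an index such as the _1 in NP_1
            for tok in pattern.split() if tok[:1].isupper()]
    return f"{name}_{index} : " + " -> ".join(args + [result])


def abstract_functions(name: str, patterns: list[str], result: str = "VP") -> list[str]:
    # One function per alternative structure produced by preprocessing.
    return [signature(name, i, p, result) for i, p in enumerate(patterns, start=1)]


if __name__ == "__main__":
    for line in abstract_functions(
            "behöva_något_till_något_VP",
            ["behöva_V NP_1 till_Prep NP_2", "behöva_V NP_1 till_Prep VP"]):
        print(line)
```

The sketch treats the result category as a parameter because the experiment reported here covers VP constructions only; other construction types would need their own result categories.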
Concrete syntax
Construction | Elements | Pattern
behöva_något_till_något_VP_1 | behöva_V NP_1 till_Prep NP_2 | {V} NP {Prep} NP
behöva_något_till_något_VP_2 | behöva_V NP_1 till_Prep VP | {V} NP {Prep} VP
Code template
1. mkVP (mkVP (mkV2 mkV) NP) (mkAdv mkPrep NP)
2. The parser failed at token VP
• Many constructions can be implemented by systematically applying
the high-level RGL constructors
– A parsing problem: which constructors in which order?
A simple GF grammar
Final code (by automatic post-processing)
lin behöva_något_till_något_VP_1 np_1 np_2 = mkVP
(mkVP (mkV2 (mkV "behöver")) np_1)
(SyntaxSwe.mkAdv (mkPrep "till") np_2) ;
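The step from code template to final code can be sketched in Python as a substitution over the template. This is a guess at what the post-processing involves, not the authors' implementation: the function `fill_template` and the `word_forms` lookup (mapping the lemma "behöva" to the form "behöver" passed to `mkV`) are hypothetical, and qualifying `mkAdv` as `SyntaxSwe.mkAdv` is simply copied from the final code above, presumably to avoid a name clash.

```python
import re
from collections import defaultdict, deque


def fill_template(name: str, template: str, elements: str,
                  word_forms: dict[str, str]) -> str:
    """Turn a code template into a linearization rule (a sketch).

    Placeholders in the template (mkV, mkPrep, NP, ...) are replaced, left to
    right, by the construction elements they stand for.
    """
    queues = defaultdict(deque)  # placeholder -> replacements in order of use
    params = []
    for element in elements.split():
        head, sep, tag = element.rpartition("_")
        if not sep or tag.isdigit():
            # Variable CE (NP_1, VP): becomes a parameter of the rule.
            category = head if sep else element
            param = element.lower()
            queues[category].append(param)
            params.append(param)
        else:
            # Fixed CE (behöva_V, till_Prep): becomes a lexical constructor call.
            form = word_forms.get(head, head)
            queues[f"mk{tag}"].append(f'(mk{tag} "{form}")')

    def substitute(match: re.Match) -> str:
        token = match.group(0)
        if token == "mkAdv":
            return "SyntaxSwe.mkAdv"  # qualified as in the final code
        return queues[token].popleft() if queues[token] else token

    body = re.sub(r"[A-Za-z_][A-Za-z0-9_]*", substitute, template)
    return f"lin {name} {' '.join(params)} = {body} ;"


if __name__ == "__main__":
    print(fill_template(
        "behöva_något_till_något_VP_1",
        "mkVP (mkVP (mkV2 mkV) NP) (mkAdv mkPrep NP)",
        "behöva_V NP_1 till_Prep NP_2",
        {"behöva": "behöver"},  # hypothetical lemma-to-form lookup
    ))
```

Matching whole identifiers keeps `mkV` from being replaced inside `mkV2` or `mkVP`, which a plain string replacement would get wrong.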
GF RGL API
Code-generating grammar
A simplified fragment of the abstract syntax
A simplified fragment of the concrete syntax
parse -cat=VP "{V} {Prep} NP"
mkVP__V2_NP (mkV2__V (partV _mkV___V (toStr__Prep _mkPrep_))) _NP_
mkVP__V2_NP (mkV2__V_Prep _mkV___V _mkPrep_) _NP_
mkVP__VP_Adv (mkVP__V _mkV___V) (mkAdv _mkPrep_ _NP_)
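The idea of parsing a pattern to find candidate constructor applications can be imitated outside GF with a small span-based parser. The Python sketch below is a stand-in for the GF parser, not the code-generating grammar itself: the rule table is a hypothetical fragment whose argument and result categories are read off the constructor names in the output above, and the search assumes that every constructor linearizes its arguments in order.

```python
from functools import lru_cache

# Constructor name -> (argument categories, result category).
# A hypothetical fragment; the types are inferred from the constructor names.
RULES = {
    "mkVP__V2_NP": (("V2", "NP"), "VP"),
    "mkVP__VP_Adv": (("VP", "Adv"), "VP"),
    "mkVP__V": (("V",), "VP"),
    "mkV2__V": (("V",), "V2"),
    "mkV2__V_Prep": (("V", "Prep"), "V2"),
    "mkAdv": (("Prep", "NP"), "Adv"),
    "partV": (("V", "Str"), "V"),
    "toStr__Prep": (("Prep",), "Str"),
}

# Pattern token -> (leaf name, category).
LEAVES = {
    "{V}": ("_mkV___V", "V"),
    "{Prep}": ("_mkPrep_", "Prep"),
    "NP": ("_NP_", "NP"),
}


def wrap(tree: str) -> str:
    # Parenthesize complex subtrees, as in the GF output.
    return f"({tree})" if " " in tree else tree


def parse(pattern: str, cat: str = "VP") -> list[str]:
    """Return every constructor tree of category `cat` that covers the pattern."""
    tokens = tuple(pattern.split())

    @lru_cache(maxsize=None)
    def span(cat: str, i: int, j: int) -> tuple[str, ...]:
        trees = []
        if j - i == 1 and tokens[i] in LEAVES and LEAVES[tokens[i]][1] == cat:
            trees.append(LEAVES[tokens[i]][0])
        for name, (args, result) in RULES.items():
            if result != cat:
                continue
            if len(args) == 1:
                trees += [f"{name} {wrap(t)}" for t in span(args[0], i, j)]
            else:
                for k in range(i + 1, j):  # split point between the two arguments
                    for left in span(args[0], i, k):
                        for right in span(args[1], k, j):
                            trees.append(f"{name} {wrap(left)} {wrap(right)}")
        return tuple(trees)

    return list(span(cat, 0, len(tokens)))


if __name__ == "__main__":
    for tree in parse("{V} {Prep} NP"):
        print(tree)
```

In this toy rule set the pattern "{V} {Prep} NP" is three-ways ambiguous, which illustrates the question of which constructors to apply in which order. A pattern ending in a VP element finds no tree here because no leaf is declared for VP.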
Running examples
• parse "jag behöver något till något"
– PredVP (UsePron i_Pron)
(behöva_något_till_något_1 (DetNP someSg_Det) (DetNP someSg_Det))
– PredVP (UsePron i_Pron)
(behöva_något_till_något_1 (DetNP someSg_Det) something_NP)
– PredVP (UsePron i_Pron)
(behöva_något_till_något_1 something_NP (DetNP someSg_Det))
– PredVP (UsePron i_Pron)
(behöva_något_till_något_1 something_NP something_NP)
• parse "han äter sig mätt"
– PredVP (UsePron he_Pron)
(reflexiv_resultativ aeta_vb_1_1_V (PositA maett_av_1_1_A))
– PredVP (UsePron he_Pron)
(AdvVP (SI_refl aeta_vb_1_1_V) (PositAdvAdj maett_av_1_1_A))
– PredVP (UsePron he_Pron)
(AdvVP (reciprok_refl aeta_vb_1_1_V) (PositAdvAdj maett_av_1_1_A))
– PredVP (UsePron he_Pron)
(AdvVP (trans_refl aeta_vb_1_1_V) (PositAdvAdj maett_av_1_1_A))
– PredVP (UsePron he_Pron)
(V_refl_rörelse aeta_vb_1_1_V (PositAdvAdj maett_av_1_1_A))
Results
• In the current experiment, we considered only the 96 VP
constructions, which resulted in 127 functions
– VP constructions dominate in SweCcn and have the most complex internal structure
• For 98 of the 127 functions (77%), we automatically generated the
implementation, achieving 70–90% accuracy
– There is clear room for improvement
• Manual completion is postponed because SweCcn is under active
development (changes → synchronization)
• https://github.com/GrammaticalFramework/gf-contrib (SweCcn)
• A methodology for systematically formalising the semi-formal
representation of SweCcn in GF, showing that a GF construction grammar
can be, to a large extent, acquired automatically
• Consequence: feedback to SweCcn developers on how to improve the
annotation consistency and adequacy of the original construction resource

Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 

Formalising the Swedish Constructicon

  • 1. Formalising the Swedish Constructicon in Grammatical Framework
      Normunds Grūzītis (1, 3), Dana Dannélls (2), Benjamin Lyngfelt (2), Aarne Ranta (1)
      (1) University of Gothenburg, Department of Computer Science and Engineering
      (2) University of Gothenburg, Department of Swedish
      (3) University of Latvia, Institute of Mathematics and Computer Science
      ACL/IJCNLP Workshop on Grammar Engineering Across Frameworks
      Beijing, China, July 30, 2015
  • 2. Constructicon
      • A collection of conventionalized (learned) pairings of form and meaning (or function), typically based on principles of Construction Grammar, CxG (e.g. Fillmore et al. 1988, Goldberg 1995)
          – Semantics is associated directly with the surface form
          – vs. Lexical units in a dictionary: pairings of word and meaning (frame)
              • Including fixed multi-word units
      • Each construction (cx) contains at least one variable element
          – Often at least one fixed element as well
          – Thus, “somewhere” in-between the syntax and the lexicon
      • An example from Berkeley Constructicon: “make one’s way”
          – Structure: {Motion verb [Verb] [PossNP]}
          – Frame: MOTION
              • [Theme They] {hacked their way} [Source out] [Goal into the open].
              • [Theme We] {sang our way} [Path across Europe].
  • 3. Constructicons
      • Berkeley Constructicon (BCxn) for English
          – A pilot project (around 70 cx), linked to Berkeley FrameNet
      • Swedish Constructicon (SweCcn)
          – An ongoing project (nearly 400 cx so far), partially linked to FrameNet
              • ToDo: links to BCxn
      • Brazilian Portuguese Constructicon
          – An ongoing project
      • ...
      • A multilingual (interlingual) constructicon would allow for non-compositional translation in a compositional way
          – Constructions with a referential meaning may be linked via FrameNet frames, while those with a more abstract grammatical function may be related in terms of their grammatical properties
            [Bäckström L., Lyngfelt B., Sköldberg E. (2014) Towards interlingual constructicography]
  • 5. SweCcn
      • Partially schematic multi-word units/expressions
      • Particularly addresses constructions of relevance for second-language learning, but also covers argument structure constructions
      • Descriptions are manually derived from corpus examples
      • Construction elements (CE):
          – Internal CEs are a part of the cx
          – External CEs are a part of the valency of the cx
          – Described in more detail by attribute-value matrices specifying their syntactic and semantic features
      • A central part of cx descriptions is the free text definitions
          – ‘eat himself full’ vs. ‘feel himself tired’ (äta sig mätt vs. känna sig trött)
  • 6. SweCcn → GF
      • Task: convert the semi-formal SweCcn into a computational CxG
          – Test Grammatical Framework (GF) as a framework for implementing CxG
      • Why GF?
          – There is no formal distinction between lexical and syntactic functions in GF, which fits the nature of constructicons
          – The potential support for multilinguality
          – Based on GF Resource Grammar Library (RGL) / an extension to RGL
          – An extension to a FrameNet-based grammar and lexicon in GF
      • Goals:
          – From the linguistic point of view:
              • Improve insights into the interaction between the lexicon and the grammar
              • Allow for testing the linguistic descriptions of constructions
          – From the language technology point of view:
              • Facilitate the language processing in both mono- and multilingual settings, e.g. Information Extraction, Machine Translation
  • 7. Conversion steps
      • Preprocessing:
          – Automatic normalization and consistency checking
          – Automatic rewriting of the original structures in case of optional CEs and alternative types of CEs, so that each combination has a separate GF function
              • Does not apply to alternative LUs (either free variants or should be split into alternative constructions, or the CE should be made more general)
          – Automatic conversion of SweCcn categories to RGL categories
              • May result in more rewriting
      • Automatic generation of the abstract syntax
      • Automatic generation of the concrete syntax
          – By systematically applying the high-level RGL constructors
              • And limited low-level means
      • Manual verification and completion (ToDo)
          – Requires a good knowledge and linguistic intuition of the language
  • 8. Preprocessing examples
      • behöva NP_1 till NP_2|VP
          → behöva_V NP_1 till_Prep NP_2
          | behöva_V NP till_Prep VP
      • snacka|prata|tala NP_indef (~synonyms of “to talk”)
          → snacka_V|prata_V|tala_V aSg_Det CN
          | snacka_V|prata_V|tala_V aPl_Det CN
          | snacka_V|prata_V|tala_V CN
      • V av Pn_refl (NP)
          → V av_Prep refl_Pron NP
          | V av_Prep refl_Pron
      • N|Adj+städa (compounds)
          → N + städa_V
          | A + städa_V
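
The rewriting of optional and alternative CEs can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' preprocessing code: the function name `expand_pattern` is hypothetical, CEs are assumed to be separated by whitespace, an optional CE is assumed to be written in parentheses, and alternatives are split only when every alternative looks like a category symbol (starts with an upper-case letter), so that alternative LUs such as snacka|prata|tala stay together. Category conversion and compound patterns such as N|Adj+städa are not covered.

```python
from itertools import product


def expand_pattern(pattern):
    """Return one flat pattern per combination of optional/alternative CEs.

    Hypothetical helper: the pattern syntax handled here is an assumption
    based on the slide examples, not a specification of SweCcn.
    """
    choices = []
    for token in pattern.split():
        if token.startswith("(") and token.endswith(")"):
            # Optional CE: one variant with it, one variant without it.
            choices.append([token[1:-1], None])
        elif "|" in token and all(alt[0].isupper() for alt in token.split("|")):
            # Alternative CE types (e.g. NP_2|VP): one variant per type.
            choices.append(token.split("|"))
        else:
            # Fixed elements and alternative LUs are kept as they are.
            choices.append([token])
    return [
        " ".join(t for t in combination if t is not None)
        for combination in product(*choices)
    ]


for raw in ["behöva NP_1 till NP_2|VP", "V av Pn_refl (NP)", "snacka|prata|tala NP_indef"]:
    print(raw, "=>", expand_pattern(raw))
```

Each resulting flat pattern would then correspond to one GF function, which is why a single construction can yield several functions.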
  • 9. Abstract syntax
      • Each construction is represented by one or more functions depending on how many alternative structures are produced in the preprocessing steps
      • Each function takes one or more arguments that correspond to the variable CEs of the respective alternative construction
      • behöva_något_till_något_VP_1 : NP -> NP -> VP
        behöva_något_till_något_VP_2 : NP -> VP -> VP
      • snacka_NP_1 : CN -> VP
        snacka_NP_2 : CN -> VP
        snacka_NP_3 : CN -> VP
      • verba_av_sig_transitiv_1 : V -> NP -> VP
        verba_av_sig_transitiv_2 : V -> VP
      • x_städa_1 : N -> VP
        x_städa_2 : A -> VP
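
How a function type follows from a preprocessed pattern can be sketched as below. This is an assumption-laden illustration rather than the authors' generator: the helper `signature` and the category set `VARIABLE_CATS` are hypothetical, patterns are written in the underscore notation used on the concrete syntax slide, and a token counts as a variable CE only if it is a bare category symbol, optionally followed by an index.

```python
import re

# Assumed set of categories that may appear as variable CEs in these examples.
VARIABLE_CATS = {"NP", "VP", "CN", "V", "N", "A"}


def signature(pattern, result="VP"):
    """Derive a GF-style function type from a flat, preprocessed pattern."""
    args = []
    for token in pattern.split():
        base = re.sub(r"_?\d+$", "", token)  # NP_1 -> NP, NP2 -> NP
        if base in VARIABLE_CATS:
            args.append(base)  # variable CE: becomes an argument
        # Fixed elements (behöva_V, till_Prep, aSg_Det, "+") add no argument.
    return " -> ".join(args + [result])


for flat in [
    "behöva_V NP_1 till_Prep NP_2",
    "behöva_V NP till_Prep VP",
    "snacka_V|prata_V|tala_V aSg_Det CN",
    "V av_Prep refl_Pron",
    "N + städa_V",
]:
    print(f"{flat:40} : {signature(flat)}")
```

Fixed elements end up in the concrete syntax only, so alternatives that differ just in a fixed determiner (the three snacka variants) share the same type.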
  • 10. Concrete syntax
      Construction: behöva_något_till_något_VP_1
          Elements: behöva_V NP_1 till_Prep NP_2
          Pattern: {V} NP {Prep} NP
          Code template: mkVP (mkVP (mkV2 mkV) NP) (mkAdv mkPrep NP)
      Construction: behöva_något_till_något_VP_2
          Elements: behöva_V NP_1 till_Prep VP
          Pattern: {V} NP {Prep} VP
          Code template: The parser failed at token VP
      • Many constructions can be implemented by systematically applying the high-level RGL constructors
          – A parsing problem: which constructors in which order?
      A simple GF grammar
      Final code (by automatic post-processing):
          lin behöva_något_till_något_VP_1 np_1 np_2 = mkVP (mkVP (mkV2 (mkV "behöver")) np_1) (SyntaxSwe.mkAdv (mkPrep "till") np_2) ;
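
The slide does not show the post-processing itself, so the following is only a guess at its shape: a hypothetical `fill_template` helper that turns the code template into the final linearization term by attaching the lexical strings to `mkV` and `mkPrep`, numbering the argument variables, and qualifying `mkAdv` as in the final code above. The strings "behöver" and "till" are taken from that final code.

```python
import re


def fill_template(template, lexical, arg_cats, qualify=None):
    """Instantiate a constructor template (hypothetical post-processing step).

    lexical  : constructor name -> string argument, e.g. {"mkV": "behöver"}
    arg_cats : category symbols to replace by numbered variables (np_1, ...)
    qualify  : optional renaming of constructors, e.g. module qualification
    """
    qualify = qualify or {}
    counter = 0
    out = []
    for token in re.findall(r"[()]|[^\s()]+", template):
        if token in lexical:
            out.append(f'({token} "{lexical[token]}")')
        elif token in arg_cats:
            counter += 1
            out.append(f"{token.lower()}_{counter}")
        else:
            out.append(qualify.get(token, token))
    # Re-join tokens and tidy the spacing around parentheses.
    return " ".join(out).replace("( ", "(").replace(" )", ")")


term = fill_template(
    "mkVP (mkVP (mkV2 mkV) NP) (mkAdv mkPrep NP)",
    lexical={"mkV": "behöver", "mkPrep": "till"},
    arg_cats={"NP"},
    qualify={"mkAdv": "SyntaxSwe.mkAdv"},
)
print(f"lin behöva_något_till_något_VP_1 np_1 np_2 = {term} ;")
```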
  • 12. Code-generating grammar
      [Figure: a simplified fragment of the abstract syntax]
      [Figure: a simplified fragment of the concrete syntax]
      parse -cat=VP "{V} {Prep} NP"
          mkVP__V2_NP (mkV2__V (partV _mkV___V (toStr__Prep _mkPrep_))) _NP_
          mkVP__V2_NP (mkV2__V_Prep _mkV___V _mkPrep_) _NP_
          mkVP__VP_Adv (mkVP__V _mkV___V) (mkAdv _mkPrep_ _NP_)
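
The idea of treating code generation as parsing can be imitated outside GF. The sketch below is not the authors' grammar: the rule table is an assumption reconstructed from the three trees above, constructor names are shortened (mkVP rather than mkVP__V2_NP), and `candidate_terms` is a hypothetical helper that enumerates every constructor term of the target category whose leaves match the pattern in order.

```python
from functools import lru_cache

# (constructor, result category, argument categories); assumed rule table.
RULES = [
    ("mkVP", "VP", ("V2", "NP")),
    ("mkVP", "VP", ("V",)),
    ("mkVP", "VP", ("VP", "Adv")),
    ("mkV2", "V2", ("V",)),
    ("mkV2", "V2", ("V", "Prep")),
    ("partV", "V", ("V", "Str")),
    ("toStr", "Str", ("Prep",)),
    ("mkAdv", "Adv", ("Prep", "NP")),
]


def candidate_terms(pattern, target="VP"):
    """Enumerate constructor terms covering the pattern, left to right."""
    tokens = tuple(pattern.replace("{", "").replace("}", "").split())

    @lru_cache(maxsize=None)
    def build(cat, i, j):
        terms = []
        if j - i == 1 and tokens[i] == cat:
            terms.append(cat)  # a single pattern element of the right category
        for name, result, args in RULES:
            if result != cat:
                continue
            if len(args) == 1:
                # Unary constructor over the same span (the table has no unary cycles).
                terms += [f"({name} {t})" for t in build(args[0], i, j)]
            else:
                # Binary constructor: try every split point inside the span.
                for k in range(i + 1, j):
                    for left in build(args[0], i, k):
                        for right in build(args[1], k, j):
                            terms.append(f"({name} {left} {right})")
        return tuple(terms)

    return list(build(target, 0, len(tokens)))


for term in candidate_terms("{V} {Prep} NP"):
    print(term)
```

With this rule table the pattern is three-ways ambiguous, mirroring the particle-verb, prepositional-object and adverbial readings in the parse output above; choosing among such candidates is where manual verification comes in.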
  • 13. Running examples
      • parse "jag behöver något till något"
          – PredVP (UsePron i_Pron) (behöva_något_till_något_1 (DetNP someSg_Det) (DetNP someSg_Det))
          – PredVP (UsePron i_Pron) (behöva_något_till_något_1 (DetNP someSg_Det) something_NP)
          – PredVP (UsePron i_Pron) (behöva_något_till_något_1 something_NP (DetNP someSg_Det))
          – PredVP (UsePron i_Pron) (behöva_något_till_något_1 something_NP something_NP)
      • parse "han äter sig mätt"
          – PredVP (UsePron he_Pron) (reflexiv_resultativ aeta_vb_1_1_V (PositA maett_av_1_1_A))
          – PredVP (UsePron he_Pron) (AdvVP (SI_refl aeta_vb_1_1_V) (PositAdvAdj maett_av_1_1_A))
          – PredVP (UsePron he_Pron) (AdvVP (reciprok_refl aeta_vb_1_1_V) (PositAdvAdj maett_av_1_1_A))
          – PredVP (UsePron he_Pron) (AdvVP (trans_refl aeta_vb_1_1_V) (PositAdvAdj maett_av_1_1_A))
          – PredVP (UsePron he_Pron) (V_refl_rörelse aeta_vb_1_1_V (PositAdvAdj maett_av_1_1_A))
  • 14. Results
      • In the current experiment, we have considered only the 96 VP constructions, which resulted in 127 functions
          – VP constructions dominate in SweCcn and have the most complex internal structure
      • Given the 127 functions, we have automatically generated the implementation for 98 functions (77%), achieving a 70–90% accuracy
          – There is clear room for improvement
      • Manual completion is postponed because of the active development of SweCcn (changes → synchronization)
      • https://github.com/GrammaticalFramework/gf-contrib (SweCcn)
      • A methodology for systematically formalising the semi-formal representation of SweCcn in GF, showing that a GF construction grammar can be, to a large extent, acquired automatically
      • Consequence: feedback to SweCcn developers on how to improve the annotation consistency and adequacy of the original construction resource
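
The reported coverage can be rechecked from the counts on the slide. The variable names are only illustrative, and the functions-per-construction ratio is derived here rather than reported by the authors.

```python
# Counts as reported on the Results slide.
vp_constructions = 96
functions = 127
implemented = 98

coverage = implemented / functions                  # share of functions with generated code
functions_per_cx = functions / vp_constructions     # derived, not reported in the slides

print(f"coverage: {coverage:.0%}")
print(f"functions per construction: {functions_per_cx:.2f}")
```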