1. Constructive Adpositional Grammars, Formally
DIP Colloquium, ILLC, Universiteit van Amsterdam
Marco Benini
marco.benini@uninsubria.it
Università degli Studi dell’Insubria
13th February 2015
2. Abstract
Constructive Adpositional Grammars, already presented in this series of
lectures by Federico Gobbo, are a way to describe natural languages,
developed to clarify the relations between syntax, semantics and
pragmatics. In this talk, the mathematical model supporting the grammars
is introduced and motivated.
Specifically, it will be shown that, using Category Theory even in a rather
elementary way, it is possible to formalise not only a natural language but
also, more significantly, how and why this formalisation is done. This makes
clear what the fundamental choices are, what the sound alternatives are, and
how variations can be used to obtain essentially equivalent grammars
oriented towards a specific purpose. A formal model thus becomes a
powerful way to reason about the structure of a language, making
conscious choices about what to emphasise and what to hide, in order
to describe just the aspects of interest.
(2 of 17)
3. Example
“Paul is going to study maths in the library”
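[Figure: adtree of the sentence. Its leaves are the words in, the, library, -ing, maths, Paul (three times), study, to, go and is; its nodes carry the grammar characters D, E, A, O, $I^2$, $I^2_1$, $I^2_2$ and $I^{2*}$, and its branches are marked with ← and → arrows.]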
4. Language as a category
A language is a collection of expressions, which are constructed one
from another. In the example, “Paul is going to study maths in the
library” is constructed from “Paul is going to study maths”, which, in
turn, is constructed from “is”.
So, a language is a mathematical category where
the objects are the expressions;
the arrows are the (concrete) constructions;
the identities are the empty constructions;
the composition g ∘ f is the rule “apply the construction g to the result
of the construction f”.
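The four clauses above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Construction`, the helper names and the step labels are hypothetical and do not come from the talk; an arrow is modelled simply as a source expression, a target expression and the ordered list of steps that leads from one to the other.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Construction:
    """An arrow: a chain of construction steps between two expressions."""
    source: str
    target: str
    steps: Tuple[str, ...] = ()  # hypothetical step labels, in order of application


def identity(expression: str) -> Construction:
    # The empty construction: no steps, same source and target.
    return Construction(expression, expression)


def compose(g: Construction, f: Construction) -> Construction:
    # g ∘ f: apply g to the result of f, so f must end where g starts.
    if f.target != g.source:
        raise ValueError("g must start where f ends")
    return Construction(f.source, g.target, f.steps + g.steps)


# The two constructions mentioned on the slide (step labels are invented).
f = Construction("is", "Paul is going to study maths", ("build the clause",))
g = Construction(
    "Paul is going to study maths",
    "Paul is going to study maths in the library",
    ("add 'in the library'",),
)
gf = compose(g, f)
```

With this representation the identity laws hold by construction, since composing with an empty tuple of steps changes nothing, and associativity follows from associativity of tuple concatenation.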
5. Language as a category
Expressions are abstract entities: they have a unique role in the
syntax and a unique meaning in the semantics, although their
presentations, i.e., the way they are written, may be ambiguous.
A grammar is an abstract description of the constructions, i.e., of the
arrows in the category. It is sound when it describes only existing
arrows, and it is complete when it describes all the existing arrows.
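Soundness and completeness, as defined here, are two set inclusions. A minimal sketch, assuming arrows are encoded as (source, target) pairs; the function names and the toy data are hypothetical:

```python
def is_sound(described, existing):
    # Sound: every arrow the grammar describes exists in the category.
    return set(described) <= set(existing)


def is_complete(described, existing):
    # Complete: every arrow of the category is described by the grammar.
    return set(existing) <= set(described)


# Toy data: arrows as (source expression, target expression) pairs.
existing = {("library", "the library"), ("maths", "study maths")}
partial_grammar = {("library", "the library")}
overgenerating_grammar = existing | {("maths", "the maths the")}
```

Here `partial_grammar` is sound but not complete, while `overgenerating_grammar` is complete but not sound.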
A grammar is built to achieve a purpose, for example, efficient
parsing. The adpositional grammars were built to help the analysis of
the language, to provide a neat description of linguistic phenomena,
and to emphasise the structural aspects.
6. Grammar characters
Given a category C, the collection of grammar characters is a family
$G = \{G_i\}_i$ such that $\bigcup_i G_i = \mathrm{Obj}\,C$, $G_i \cap G_j = \emptyset$ when $i \neq j$, and each $G_i$
is inhabited. The collection of possible indexes is called GC.
In the example the grammar characters are indexed by
$GC = \{I^2, I^2_1, I^2_2, I^2, I^{2*}, A, E, O, U, D\}$.
In words, G is a partition of expressions, a way to classify them.
Although this is the basis of all existing grammars, it is less trivial
than it seems to be. . .
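The three conditions on the family (covering, pairwise disjoint, each class inhabited) can be checked mechanically on a finite toy language. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the tiny lexicon only borrows a few words and characters from the guiding example.

```python
def is_grammar_character_family(family, objects):
    """Check that `family` (index -> set of expressions) partitions `objects`."""
    classes = [set(members) for members in family.values()]
    # Each class must be inhabited.
    if any(len(members) == 0 for members in classes):
        return False
    # The classes must cover all the objects, and nothing else.
    union = set().union(*classes)
    if union != set(objects):
        return False
    # Pairwise disjoint: no expression is counted in two classes.
    return sum(len(members) for members in classes) == len(union)


objects = {"Paul", "library", "maths", "the", "in", "to"}
family = {
    "O": {"Paul", "library", "maths"},
    "A": {"the"},
    "U": {"in", "to"},
}
```

Moving "the" into both "O" and "A", or adding an empty class, makes the check fail.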
7. Grammar characters
. . . in fact, we may define a grammar category G whose objects are the
grammar characters and whose arrows are the abstract constructions
[Diagram: C drawn above G, the language category C lying over the grammar category G.]
Thus, concrete constructions are generated by the grammar G as
sheaf-like structures: C is the étalé space over the fibres on G, and the
concrete constructions are the counter-images of the abstract ones.
How effective and useful this generation process is depends heavily on
the structure of the grammar category.
8. Defining constructions
Naturally, an abstract construction is the collection of the concrete
constructions it generates through the fibres.
Assuming that constructions are applied to expressions, the index in
the fibre on the domain is clear. In fact, an abstract construction is
defined by the indexing in the fibre of the codomain.
In the example, the abstract rule O → O, which associates a noun with
an article, generates the concrete construction library → the library.
It does so by choosing the element in the fibre on the domain which
says that library has the O grammar character, and by choosing the
element in the fibre on the codomain which says that the has the A
grammar character; thus, the library is the concrete instance of the
abstract construction.
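The way an abstract rule generates concrete constructions through the fibres can be sketched as follows. This is a simplification, not the model of the talk: fibres are plain lists of expressions indexed by grammar character, and the `combine` argument, which decides how the two expressions are written together, is a hypothetical stand-in for the linearisation.

```python
# Fibres: for each grammar character, the expressions lying over it (toy data).
fibres = {"O": ["library"], "A": ["the"]}


def concrete_instances(domain, dependent, fibres, combine):
    """All concrete constructions x -> combine(x, y) generated by an abstract
    rule whose domain and dependent grammar characters are given."""
    return [
        (x, combine(x, y))
        for x in fibres[domain]
        for y in fibres[dependent]
    ]


# The abstract rule O -> O that associates a noun with an article.
instances = concrete_instances(
    "O", "A", fibres, lambda noun, article: f"{article} {noun}"
)
```

Adding more nouns to the fibre over O makes the same abstract rule generate one concrete construction per noun, which is the sense in which a single abstract arrow stands for a collection of concrete ones.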
9. Defining constructions
In general, an abstract construction can be specified by a product of
grammar characters $C_1 \times \cdots \times C_n$ so that, given the expression $e$, the
construction builds the expression which combines $e$ with an instance
$(c_1, \ldots, c_n)$ of the product.
In the adpositional paradigm, we assume that the above combination
is performed by generating a concrete construction of the form
$e \to a_1(e, a_2(c_1, \ldots a_n(c_{n-1}, c_n)))$
with $a_1, \ldots, a_n$ from the distinguished grammar character U, which groups
the linking expressions, called adpositions.
In the example, go is in the grammar character I( ,to), so applying it
to the pair (Paul,study maths) yields Paul go to study maths.
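The nested shape of the adpositional combination is easy to get wrong by one index, so a small sketch may help. It builds the term as nested tuples `(adposition, first, rest)` and does not attempt to linearise it into a sentence; the function name is hypothetical, and the empty string is an assumed encoding of the blank adposition in I( ,to).

```python
def adpositional_term(e, adpositions, components):
    """Build a1(e, a2(c1, ... an(c_{n-1}, c_n))) as nested tuples."""
    n = len(adpositions)
    if n == 0 or len(components) != n:
        raise ValueError("need n >= 1 adpositions and exactly n components")
    # First arguments of a1, ..., an are e, c1, ..., c_{n-1}.
    firsts = [e] + list(components[:-1])
    # The innermost second argument is c_n; wrap outwards from an to a1.
    term = components[-1]
    for adposition, first in zip(reversed(adpositions), reversed(firsts)):
        term = (adposition, first, term)
    return term


# go is in I( ,to): two adpositions, the blank one and "to".
term = adpositional_term("go", ["", "to"], ["Paul", "study maths"])
```

For n = 1 the function returns the single node a1(e, c1), which is the elementary case.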
10. Adtrees
The adpositional abstract constructions provide a way to represent
concrete expressions: adpositional trees (adtrees, for short). Evidently,
this representation can be extended to constructions as well.
For instance, the example in the previous slide can be rewritten as
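[Figure: a construction, drawn as an arrow from go, with grammar character I( ,to), to the adtree in which go is first combined with Paul (O), giving the grammar character I(to), and then, through the adposition to, with study maths (O), giving the grammar character I.]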
Assuming that we have a collection of atomic expressions, and that each
abstract construction can be reduced to elementary constructions, the
adtree presentation allows any expression to be written recursively.
11. Adtrees
Elementary constructions have the form
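[Figure: a construction, drawn as an arrow from X, with grammar character G, to the adtree with adposition a and grammar character E, whose dependent is Y (grammar character D) and whose governor is X (grammar character G).]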
Thus, in formal terms, they are abstract constructions G → E from the
governor G to the grammar character E of the resulting expression, on
the product (D,a), i.e., the pair formed by the dependent D and the
adposition a. The concrete expression X denotes the index in the fibre
of the domain G, while Y is the concrete instance of the product D.
By swapping X and Y, and G and D, we get the conjugate
construction, which is, of course, equivalent, but moves the focus of
the construction.
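An elementary construction and its conjugate can be written down as a small record type. This is a sketch under assumptions: the class and field names are hypothetical, and the example instance (empty adposition, result character O) is one plausible reading of the noun-and-article rule used earlier, not something stated on this slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Elementary:
    governor: str        # X, the concrete governor
    governor_char: str   # G
    dependent: str       # Y, the concrete dependent
    dependent_char: str  # D
    adposition: str      # a
    result_char: str     # E


def conjugate(c: Elementary) -> Elementary:
    # Swap X with Y and G with D; adposition and result are unchanged.
    return Elementary(
        governor=c.dependent,
        governor_char=c.dependent_char,
        dependent=c.governor,
        dependent_char=c.governor_char,
        adposition=c.adposition,
        result_char=c.result_char,
    )


noun_with_article = Elementary("library", "O", "the", "A", "", "O")
article_with_noun = conjugate(noun_with_article)
```

Conjugating twice gives back the original construction, which matches the remark that the two are equivalent and differ only in focus.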
12. Adtrees
Adtrees together with the instances of abstract constructions form a
category Ad(C). There is an evident embedding of C into Ad(C),
given by the realisation of abstract constructions in C.
In general, Ad(C) extends C: it contains adtrees for non-valid
expressions, like “Paul go to study maths”, which does not exist in
the English language.
This extended world is much more regular and therefore easier to
analyse. Moreover, it is easy to get back to the world of “sound”
expressions by means of redundancy transformations, which are
instances of a far more general tool.
Appendix B of F. Gobbo, M. Benini, Constructive Adpositional
Grammars: Foundations of Constructive Linguistics, Cambridge
Scholars Publishing (2011) contains the technical details omitted here.
13. Transformations
A transformation is a functor Ad(C) → Ad(C). Technically, a
transformation maps each adtree into another adtree, and
constructions into constructions, coherently, preserving identities and
composition.
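The functor conditions can be tested on a toy model. In the sketch below, which is an illustration and not the construction of the talk, adtrees are nested tuples `(adposition, governor, dependent)`, a construction is a list of `(adposition, dependent)` steps, the identity is the empty list, and the transformation is induced by a hypothetical word-level `rename` function.

```python
def run(steps, tree):
    # Apply a construction: attach each dependent in turn to the current tree.
    for adposition, dependent in steps:
        tree = (adposition, tree, dependent)
    return tree


def compose(g, f):
    # g ∘ f: the steps of f first, then those of g.
    return f + g


def transform_tree(rename, tree):
    # Action of the transformation on adtrees: rename every leaf and adposition.
    if isinstance(tree, str):
        return rename(tree)
    return tuple(transform_tree(rename, part) for part in tree)


def transform_steps(rename, steps):
    # Action of the transformation on constructions, step by step.
    return [(rename(a), transform_tree(rename, d)) for a, d in steps]


def rename(word):
    # Hypothetical respelling used as the toy transformation.
    return {"math": "maths"}.get(word, word)


f = [("", "math")]                    # study -> study math
g = [("in", ("", "library", "the"))]  # ... -> ... in the library
```

Identities are preserved because the empty list is mapped to the empty list, composition is preserved because the mapping works step by step, and the two actions are coherent: transforming the result of a construction is the same as running the transformed construction on the transformed adtree.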
Intuitively, a transformation is a way to uniformly establish a
correspondence between constructions. For example, changing the
voice of a verb from active to passive.
Or, adding redundancies wherever appropriate: for example, making
the gender or the number of a noun agree with the other parts of
the discourse, thus realising the retraction from Ad(C) to C.
As a matter of fact, as my colleague has already shown in his talk, all
the major linguistic phenomena are modelled through transformations,
sometimes leading to surprising insights.
14. Transformations
A grammar functor is a functor F : G → G from the grammar category
to itself. As a matter of fact, the interesting transformations are the
ones generating grammar functors, simply because these
transformations preserve the grammar.
By allowing hidden or null expressions in the language, grammar
functors can describe very complex and abstract constructions, like in
the guiding example:
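[Figure: the adtree of study ($I^2$) with Paul (O, $I^2_1$) and maths (O, $I^2_2$) is mapped (→) to a larger adtree in which the same subtree appears together with to, with the additional labels D and O.]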
15. Conclusion
Necessarily, many aspects of the mathematical model behind
adpositional grammars have been left out of this talk.
The important points are:
an adpositional grammar is a natural instance of a general
paradigm which is based on the categorical notion of sheaves;
grammar characters and abstract constructions are choices made
in the design of a grammar: sound alternatives are given in an
elementary way by conjugate constructions; more complex
alternatives can be obtained by applying transformations whose
generated grammar functor is isomorphic to the identity;
hiding expressions by means of transformations leads to a uniform
rendering of highly complex constructions, such as infinitives,
gerunds, and the like.
16. Conclusion
Not mentioned, but in the background, is the idea of looking at
grammars as sheaves, i.e., as Grothendieck toposes: among many
other things, this would lead to a logical system where abstract
constructions and transformations become inference rules, allowing
one to reason about the structures in the language.
Of course, this way of reasoning uses the most abstract part of
contemporary mathematics and, in the words of P. Johnstone, one
of the most respected experts in topos theory, it is not for the
faint-hearted.