Head-Driven Phrase Structure Grammar (HPSG) is a linguistic theory developed in the 1980s that uses feature structures to represent syntactic and semantic information about linguistic objects. It synthesizes ideas from earlier theories and assumes a small set of universal principles together with feature-based representations of words and phrases. Phrases are licensed by constraint-based rules that combine heads with their arguments and share features between a phrase and its head. This allows natural language syntax to be modeled in a way that is both computationally precise and psycholinguistically plausible.
2. Head-Driven Phrase Structure Grammar
HPSG has been developed by Carl Pollard and Ivan Sag since 1987, initially
as a refinement and extension of Generalized Phrase Structure
Grammar (Gazdar, 1981). It belongs to a family of phrase
structure-theoretic approaches in which a rich set of lexical
specifications, coupled with a few very general combinatorial
constraints and restrictions on information sharing, interact
monotonically to give rise to sets of complex objects called feature
structures, which model the properties of linguistic signs.
3. HPSG
HPSG, developed by Ivan Sag and Carl Pollard in the mid-1980s, is a
‘sign-based’ grammar in which phonological, syntactic and semantic
information is integrated into a formally precise description of
linguistic objects using feature structures.
• Universal grammar:
a) linguistic signs;
b) combination principles.
4. Features
• Features and values characterize linguistic objects;
• Structure sharing makes it possible to state that certain values in a feature
structure are identical;
• Valence information is represented in lists in a complex description of the
head;
• Types allow for the classification of (linguistic) objects.
5. HPSG
HPSG assumes feature structures as models of linguistic objects.
Feature structures are written as AVMs (attribute-value matrices);
An AVM consists of feature-value pairs;
The values can be atomic or feature descriptions;
Every feature structure is of a certain type;
Types are ordered in hierarchies;
Hierarchies have the most general type at the top and the most specific types at the bottom.
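The points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the type names in the toy hierarchy (object, sign, word, phrase, head, verb, noun), the VFORM feature, and the helper names is_subtype and make_fs are hypothetical and not taken from any HPSG implementation.

```python
# Hypothetical toy hierarchy: each type maps to its immediate supertype.
# "object" is the most general type (the top of the hierarchy).
SUPERTYPE = {
    "object": None,
    "sign": "object",
    "word": "sign",
    "phrase": "sign",
    "head": "object",
    "verb": "head",
    "noun": "head",
}


def is_subtype(t, ancestor):
    """Return True if t equals ancestor or lies below it in the hierarchy."""
    while t is not None:
        if t == ancestor:
            return True
        t = SUPERTYPE[t]
    return False


def make_fs(fs_type, **features):
    """Build a typed feature structure: a type plus feature-value pairs.

    Values are either atoms (strings) or further feature structures.
    """
    return {"TYPE": fs_type, **features}


# An atomic value (VFORM) and an embedded feature structure (HEAD).
verb_head = make_fs("verb", VFORM="fin")
some_word = make_fs("word", HEAD=verb_head)
```

Here is_subtype("verb", "head") holds, while is_subtype("word", "head") does not, since word sits under sign.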
6. Attribute Value Matrices
AVM for the word ‘talks’:
The verb’s categorial information is divided into features that describe the verb itself, HEAD, and
features that describe its arguments, VALENCE.
‘Talks’ is a sign of type word with a head of type verb.
As an intransitive verb, it takes no complement and requires a subject that is a third-person singular
noun.
The semantic value of the subject is co-indexed with the verb’s only argument (the
individual doing the talking).
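The description of ‘talks’ can be approximated as a nested Python structure. This is a sketch reconstructed from the prose only. HEAD and VALENCE come from the slide; every other feature name (PHON, CAT, SUBJ, COMPS, INDEX, CONTENT, RELN, TALKER) is an assumption chosen for illustration. Co-indexing is modelled by letting two paths point at the same Python object.

```python
# One shared index object: the subject's index and the verb's only
# semantic argument are the same value, not two equal copies.
subj_index = {"PER": "3rd", "NUM": "sg"}

talks = {
    "TYPE": "word",
    "PHON": ["talks"],
    "CAT": {
        "HEAD": {"TYPE": "verb"},
        "VALENCE": {
            # Requires one subject: a third-person singular noun.
            "SUBJ": [{"HEAD": {"TYPE": "noun"}, "INDEX": subj_index}],
            # Intransitive: no complements.
            "COMPS": [],
        },
    },
    # Hypothetical semantics: a talk relation with a single argument.
    "CONTENT": {"RELN": "talk", "TALKER": subj_index},
}

# Structure sharing is token identity, which "is" checks directly.
co_indexed = talks["CAT"]["VALENCE"]["SUBJ"][0]["INDEX"] is talks["CONTENT"]["TALKER"]
```

Using object identity rather than equality is the design point: two separate dictionaries with the same PER and NUM values would be equal but not structure-shared.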
8. Feature
Computer-processable grammars that scale up and can be implemented;
Most widespread grammatical framework employed in computational
linguistics;
A must for everyone working on natural language processing;
Increased precision;
Framework for integration;
Psycholinguistic plausibility.
9. Applications
Various parsers based on the HPSG formalism have been written;
Currently there are grammars for German, Mandarin Chinese, Maltese and
Persian that share a common core and are publicly available;
Large HPSG grammars of various languages are being developed in the
Deep Linguistic Processing with HPSG initiative;
The Babel system is a system for analysing written language.
10. HPSG Theory
The theory presented, head-driven phrase structure grammar - so called
because of its central notion of the grammatical head - is an information-
based (or unification-based) theory that has roots in different research
programs within linguistics and neighboring disciplines (philosophy and
computer science).
HPSG draws upon and attempts to synthesize theories, such as categorial
grammar, lexical-functional grammar, generalized phrase-structure grammar,
and government-binding theory; but many key ideas arise from semantic
theories like situation semantics and discourse representation theory, and
from computational work in knowledge representation, data type theory, and
formalisms based on the unification of partial information.
11. HPSG
HPSG is an information-based theory of natural language syntax and semantics. It
was developed by synthesizing a number of the theories mentioned above.
In HPSG, syntactic features are classified as head features, binding
features and the subcategorization feature; accordingly, HPSG uses three principles of
universal grammar:
Head Feature Principle
Binding Inheritance Principle
Subcategorization Principle
12. Head Feature Principle
Similar to GPSG’s Head Feature Convention. It states that head
features (e.g., part of speech, the case of nouns, verb inflection) of a
phrasal sign be shared with its head daughter, e.g., case of a noun
phrase is determined by the case of its head noun, etc.
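A minimal sketch of the Head Feature Principle in Python, assuming signs are plain dictionaries. The function name project_head_features and the feature names POS, CASE and DAUGHTERS are hypothetical; the only point illustrated is that the mother's HEAD value is the head daughter's HEAD value.

```python
def project_head_features(head_daughter, other_daughters):
    """Build a phrasal sign whose HEAD value is shared with its head daughter."""
    return {
        "TYPE": "phrase",
        "HEAD": head_daughter["HEAD"],  # shared with the daughter, not copied
        "DAUGHTERS": [head_daughter, *other_daughters],
    }


# Hypothetical lexical entries; the case value is chosen for illustration.
noun = {"TYPE": "word", "PHON": "book", "HEAD": {"POS": "noun", "CASE": "nom"}}
det = {"TYPE": "word", "PHON": "the", "HEAD": {"POS": "det"}}

np = project_head_features(noun, [det])
```

The case of the noun phrase np is read off np["HEAD"]["CASE"], which is the head noun's own case value.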
13. Binding Inheritance Principle
Similar to GPSG’s Foot Feature Principle. Binding features encode
syntactic dependencies of signs that are essentially nonlocal such as
the presence of gaps, relative pronouns, etc.
This principle states that dependency information be transmitted up
the sign’s constituent structure until the dependency can become
“bound/saturated”.
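The upward transmission of dependency information can be sketched as set arithmetic. This is a simplified illustration, not the principle's formal statement: unbound dependencies are represented as a set of category labels, the function name inherit_binding is hypothetical, and the example phrase "who Kim likes" is an assumed test case with a gap in object position.

```python
def inherit_binding(daughter_sets, bound_here=frozenset()):
    """Return the mother's unbound dependencies.

    The mother inherits every dependency carried by its daughters,
    minus whatever becomes bound at this node.
    """
    inherited = frozenset().union(*daughter_sets)
    return inherited - frozenset(bound_here)


NO_GAP = frozenset()
NP_GAP = frozenset({"NP"})

# "likes" + object gap: the VP carries an unbound NP dependency.
vp = inherit_binding([NO_GAP, NP_GAP])
# "Kim" + VP: the dependency is passed further up.
s = inherit_binding([NO_GAP, vp])
# "who" + S: the filler binds the dependency, leaving nothing unbound.
top = inherit_binding([NO_GAP, s], bound_here={"NP"})
```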
14. Subcategorization Principle
Generalization of categorial grammar’s “argument cancellation”.
Subcategorization is described by a SUBCAT feature.
The SUBCAT value is a list of signs with which the sign in question must
combine to be saturated. For example, the SUBCAT value of the past-tense
intransitive verb walked is the list <NP[NOM]>, since walked must
combine with a single nominative-case NP (the subject) to be saturated;
the past-tense transitive verb liked has the SUBCAT value <NP[ACC],
NP[NOM]>, since liked requires an accusative-case NP (the direct object) and
a nominative-case NP (the subject).
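The cancellation idea can be sketched in Python using the two SUBCAT lists from the examples above. This is a simplification under stated assumptions: signs are reduced to (category, case) pairs, requirements are cancelled from the front of the list, and the function name combine is hypothetical.

```python
def combine(head_subcat, complement):
    """Cancel the first SUBCAT requirement against a complement.

    Returns the mother's SUBCAT list; raises ValueError if the head is
    already saturated or the complement does not match the requirement.
    """
    if not head_subcat:
        raise ValueError("head is already saturated")
    wanted, *rest = head_subcat
    if complement != wanted:
        raise ValueError(f"expected {wanted}, got {complement}")
    return rest


WALKED = [("NP", "NOM")]
LIKED = [("NP", "ACC"), ("NP", "NOM")]

vp = combine(LIKED, ("NP", "ACC"))  # liked + direct object
s = combine(vp, ("NP", "NOM"))      # verb phrase + subject
```

After the object is cancelled, vp still requires the nominative subject; after the subject is cancelled, s has an empty SUBCAT list and is saturated.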
15. HPSG
HPSG principles are more explicitly formulated, and thus implementations are more
likely to be faithful to the theory. Less work is left to language-specific rules of
grammar. In Pollard & Sag (1987) only four highly schematic HPSG rules accounted
for a substantial English fragment. One rule, informally written as
[SUBCAT <>] → H[LEX -], C
subsumes a number of conventional phrase structure rules, such as those below.
S → NP VP
NP → DET NOM
NP → NP’s NOM
16. HPSG
According to the rule above, one possibility for an English phrase is to be a
saturated sign ([SUBCAT <>], with <> denoting the empty list) whose
constituents are a phrasal head (H[LEX -]) and a single
complement (C).
Another HPSG rule, expressed informally as
[SUBCAT <[ ]>] → H[LEX +], C*
says that another option for English phrases is to be a sign subcategorizing
for exactly one complement ([SUBCAT <[ ]>], with “<[ ]>” standing for any list
of length one) whose daughters are a lexical head (H[LEX +]) and any
number of complement daughters.
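The two schematic rules can be read as checks on a local tree. The sketch below is a rough illustration under stated assumptions: a local tree is reduced to three facts (the length of the mother's SUBCAT list, whether the head daughter is lexical, and the number of complement daughters), and the function name licensed_by and the labels "rule 1" and "rule 2" are hypothetical.

```python
def licensed_by(mother_subcat_len, head_is_lexical, n_complements):
    """Return the schematic rule that licenses a local tree, or None."""
    # Rule 1: saturated mother, phrasal head, exactly one complement.
    if mother_subcat_len == 0 and not head_is_lexical and n_complements == 1:
        return "rule 1"
    # Rule 2: mother still needs one complement, lexical head,
    # any number of complement daughters (including none).
    if mother_subcat_len == 1 and head_is_lexical and n_complements >= 0:
        return "rule 2"
    return None


# A sentence: phrasal head (VP) plus one complement (the subject NP).
sentence = licensed_by(0, head_is_lexical=False, n_complements=1)
# A verb phrase with a lexical verb and one object NP.
verb_phrase = licensed_by(1, head_is_lexical=True, n_complements=1)
# A verb phrase consisting of an intransitive verb alone.
bare_verb = licensed_by(1, head_is_lexical=True, n_complements=0)
```

Because neither check mentions a specific category such as V or N, one schematic rule covers many conventional category-specific phrase structure rules.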
17. HPSG
This rule subsumes a number of conventional phrase structure rules, such as
VP → V; VP → V S’; AP → A;
VP → V NP; AP → A PP; PP → P NP;
VP → V PP; VP → V VP; VP → V AP;
VP → V NP NP; VP → V NP PP; etc.
HPSG rules determine constituency only; this follows GPSG, where
generalizations about the relative order of constituents are factored out of phrase
structure rules and expressed in independent language-specific linear precedence
(LP) constraints. Unlike in GPSG, some LP constraints may refer not only to syntactic
categories but also to their grammatical relations.
18. HPSG
An HPSG grammar as a tuple (Atom, Feat, Var, Type, Init, Rule):
• Atom - set of atoms
• Feat - set of features or attributes
• Type = (T, subtype) - type hierarchy
• Init - set of initial AVMs (attribute-value matrices)
• Rule - set of rules
HPSG principles are defined and used to define HPSG modules.
19. HPSG Feature Structure Descriptions
[Figure: pizza example. Types: pizza thing, pizza, topping set, vegetarian, non-vegetarian. Features: CRUST, TOPPINGS, OLIVES, ONIONS, MUSHROOMS, SAUSAGE, PEPPERONI, HAM.]