The Edge of Linguistics lecture series by Prof. Frederick J. Newmeyer
From Oct 7 to Oct 17, Prof. Newmeyer gave a lecture series on a wide range of linguistic topics at Beijing Language and Culture University.
Lecture 1: The Chomskyan Revolution
Lecture 2: Constraining the Theory
Lecture 3: The Boundary between Syntax and Semantics
Lecture 4: The Boundary between Competence and Performance
Lecture 5: Can One Language Be ‘More Complex’ Than Another?
Background:
Frederick J. Newmeyer is Professor Emeritus of Linguistics at the University of Washington and adjunct professor in the University of British Columbia Department of Linguistics and the Simon Fraser University Department of Linguistics. He has published widely in theoretical and English syntax.
Constraining the Theory - Prof. Frederick J. Newmeyer
1. Class 2: Constraining the Theory
FREDERICK J. NEWMEYER
UNIVERSITY OF WASHINGTON, UNIVERSITY OF BRITISH COLUMBIA,
AND SIMON FRASER UNIVERSITY
2. The need for constraints
Syntactic Structures contrasted three formal models
of syntax:
Finite-State grammars
Phrase-structure grammars
Transformational grammars
3. The need for constraints
Finite-State grammars. Every rule is of the form:
S_i → a S_j (a terminal symbol followed by a nonterminal symbol)
Notice that an FSG does generate an infinite set of sentences.
But it provides no structure to these sentences.
And it cannot generate sentences like:
If S_i, then S_j
Either S_i or S_j
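The finite-state idea can be sketched in a few lines of Python. This is only an illustration: the states S0 to S2 and the tiny vocabulary are made up here and do not come from the lecture. A loop on one state is enough to license sentences of unbounded length.

```python
# Toy finite-state grammar; the states and vocabulary are made up for illustration.
# Each state maps to (terminal, next_state) arcs; next_state None ends the sentence.
RULES = {
    "S0": [("the", "S1")],
    "S1": [("old", "S1"), ("man", "S2")],  # the loop on S1 licenses "old old old ..."
    "S2": [("comes", None)],
}

def accepts(words, rules=RULES, start="S0"):
    """Return True if the word sequence can be read off a path through the states."""
    states = {start}
    for word in words:
        states = {nxt for s in states if s is not None
                  for terminal, nxt in rules[s] if terminal == word}
        if not states:
            return False
    return None in states

def sentence(n):
    """Build 'the (old)^n man comes' for any n >= 0."""
    return ["the"] + ["old"] * n + ["man", "comes"]

print(all(accepts(sentence(n)) for n in range(100)))  # True
print(accepts(["the", "comes"]))                      # False
```

Note that the recogniser only answers yes or no: nothing in it records constituent structure, which is the limitation the slide points to.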
4. The need for constraints
Phrase-structure grammars. Every rule is of the form:
A → B C
B → a D
These rules give you a tree diagram:
[A [B a D] C]
• So they can handle structure and constituency.
• But they cannot handle discontinuous elements or long-distance dependencies:
• Mary is work+ing
• Who did you see?
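As a rough sketch of how phrase-structure rules assign constituency, the two rules from the slide (A → B C, B → a D) can be expanded into a labelled bracketing. The function names `expand` and `bracket` are hypothetical helpers introduced for this illustration.

```python
# The two rules from the slide; symbols with no rule (a, C, D) are treated as leaves.
RULES = {"A": ["B", "C"], "B": ["a", "D"]}

def expand(symbol, rules):
    """Rewrite a symbol top-down into a nested (label, children) tree."""
    if symbol not in rules:
        return symbol
    return (symbol, [expand(child, rules) for child in rules[symbol]])

def bracket(tree):
    """Render the tree as a labelled bracketing."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    return "[" + label + " " + " ".join(bracket(c) for c in children) + "]"

print(bracket(expand("A", RULES)))  # [A [B a D] C]
```

Every rule rewrites one symbol in place, so material that belongs together always ends up adjacent. That is why discontinuous dependencies such as is ... -ing fall outside what rules of this form can express directly.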
5. The need for constraints
Transformational rules. They transform trees into
trees:
6. The need for constraints
The problem: Transformational rules are too
powerful!
They allow things to happen that never occur in any
human language:
To reverse all of the words in a sentence.
To move a preposition in the highest clause and attach it to an adjective in the lowest clause.
To delete every other word.
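To see how little it takes to state such unattested operations once rules are unconstrained, here is a minimal sketch. It treats a sentence as a flat word list, which is a simplification: real transformations were defined over trees.

```python
def reverse_words(words):
    """An unattested 'rule': reverse the whole sentence."""
    return words[::-1]

def delete_every_other_word(words):
    """Another unattested 'rule': keep words 1, 3, 5, ... and drop the rest."""
    return words[::2]

sentence = "the child saw the dog".split()
print(reverse_words(sentence))            # ['dog', 'the', 'saw', 'child', 'the']
print(delete_every_other_word(sentence))  # ['the', 'saw', 'dog']
```

Both operations simply count positions and ignore structure entirely. A theory that allows them as easily as it allows Passive or Wh-Movement says nothing about why human languages never use them.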
7. The need for constraints
So the important thing was to constrain
transformational rules — to make them less
powerful.
This was especially important, given that a child
acquires a transformational grammar.
The less powerful the rules, the smaller the ‘search
space’ for the child, and therefore the easier it is to
explain the rapidity of language acquisition.
8. Island constraints
The most important set of constraints over the years
has been the so-called ‘island constraints’.
They prohibit movement out of a particular syntactic
configuration.
A configuration out of which nothing can move is
called an ‘island’.
9. Island constraints
The first island constraint was the A-over-A principle
(from Chomsky 1964).
Mary saw the boy walking to the railroad station is
ambiguous:
Mary saw [NP the boy [VP walking to the railroad
station]]
Mary saw [NP [NP the boy] [VP walking to the railroad
station]]
10. Island constraints
Who did Mary see walking to the railroad station? is
unambiguous
It can be a question corresponding to (i), but not to (ii):
(i) Mary saw [NP the boy [VP walking to the
railroad station]]
(ii) Mary saw [NP [NP the boy] [VP walking to the
railroad station]]
11. Island constraints
The idea of the A-over-A principle: You can’t move
an NP (or a VP, S, etc.) if it is dominated by an NP
(or a VP, S, etc.):
So A1 is an island, preventing the movement of x.
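The A-over-A check can be sketched as a walk down a tree: a node is frozen if some node dominating it carries the same category label. The tuple encoding of the tree and the helper names below are assumptions made for this illustration only.

```python
def label(node):
    """Category label of a (label, children) node, or the word itself for a leaf."""
    return node if isinstance(node, str) else node[0]

def blocked_by_a_over_a(tree, path):
    """True if the node reached by `path` (child indices) is dominated by a same-label node."""
    ancestors = []
    node = tree
    for i in path:
        ancestors.append(label(node))
        node = node[1][i]
    return label(node) in ancestors

# Structure (ii): Mary saw [NP [NP the boy] [VP walking to the railroad station]]
tree = ("S", [
    ("NP", ["Mary"]),
    ("VP", [
        ("V", ["saw"]),
        ("NP", [
            ("NP", ["the boy"]),
            ("VP", ["walking to the railroad station"]),
        ]),
    ]),
])

print(blocked_by_a_over_a(tree, [1, 1, 0]))  # True: inner NP sits inside an NP
print(blocked_by_a_over_a(tree, [1, 1]))     # False: outer NP is not inside an NP
```

On this encoding the inner NP the boy cannot be questioned out of the larger NP, which is why the question lacks the reading corresponding to (ii).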
12. Island constraints
JOHN R. ROSS
In his 1967 dissertation, John R. Ross showed that the A-over-A principle did not work.
In its place he proposed six or seven other constraints.
13. Island constraints
Complex Noun Phrase Constraint:
Element A cannot be
moved out of NP1
*Who do you believe the claim that John saw?
14. Island constraints
The Coordinate Structure Constraint
Neither conjunct in a coordinate structure can be moved.
*What did John eat beans and?
15. Island Constraints
Ross proposed several other constraints as
well.
We still talk today about ‘Ross constraints’.
However, Chomsky in 1973 found a way to
unify most of them under one simple
constraint.
16. Island Constraints
The crucial notion is ‘bounding node’.
Bounding nodes (for English) are the nodes S
(=IP), and NP (=DP).
Subjacency (for English): No element may be
moved across more than one bounding node.
17. Island Constraints
Who did you believe Bill saw? is a grammatical sentence. The first movement
crosses only one bounding node and the second also crosses only one.
18. Island Constraints
*Who did you believe the claim that Bill saw? is an ungrammatical sentence.
The first movement crosses only one bounding node, but the second crosses
two.
19. Island Constraints
[Tree diagram for the sentence below; leaves as extracted: did, you, wonder(ed), where, John, put, what]
*What did you wonder where John put? is
impossible because it is a Subjacency violation.
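Subjacency for English can be sketched as a count over movement steps. In the sketch below each derivation is annotated by hand with the nodes each step crosses; those annotations (including the assumption that successive steps pass through the embedded clause-initial position, and that a filled wh-position forces a single long step) are assumptions of the illustration rather than something computed from trees.

```python
# Bounding nodes for English, as given on the slides: S (=IP) and NP (=DP).
ENGLISH_BOUNDING = {"IP", "DP"}

def obeys_subjacency(steps, bounding=ENGLISH_BOUNDING):
    """Each step lists the nodes one movement crosses; at most one may be bounding."""
    return all(sum(node in bounding for node in step) <= 1 for step in steps)

# Hand-annotated derivations (an assumption of this sketch).
DERIVATIONS = {
    "Who did you believe Bill saw?": [["IP"], ["CP", "IP"]],
    "Who did you believe the claim that Bill saw?": [["IP"], ["CP", "DP", "IP"]],
    "What did you wonder where John put?": [["IP", "CP", "IP"]],
}

for sentence, steps in DERIVATIONS.items():
    mark = "" if obeys_subjacency(steps) else "*"
    print(mark + sentence)
```

Only the first sentence passes. In the complex-NP case the second step crosses DP and IP, and in the wh-island case the single step crosses two IPs.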
20. SIMPLIFYING THE SYNTACTIC COMPONENT
The trend in Chomskyan thinking has been to reduce
the scope and complexity of the ‘narrow syntax’ in
two ways:
1. To derive syntactic complexity, to the extent possible,
from independently needed principles,
and
2. To shift the burden of accounting for specific
phenomena from the syntax to other components
(lexicon, morphology, interfaces).
21. SIMPLIFYING THE SYNTACTIC COMPONENT
It’s the first (more interesting!) strategy that
has dominated syntactic theory for the most part.
In early transformational syntax, grammars were
lists of complex rules, the rule lists of one language
not looking very much like the rule lists of another
language.
23. SIMPLIFYING THE SYNTACTIC COMPONENT
Throughout most of the history of generative syntax,
language-particular rules have been simplified (or
eliminated).
More general principles have replaced them.
24. SIMPLIFYING THE SYNTACTIC COMPONENT
A good example of deriving syntactic facts from
independently needed principles:
Joe Emonds’s Structure Preserving Constraint:
A large class of T-rules can move an element only into a position that could have been created by the Phrase-Structure rules.
25. SIMPLIFYING THE SYNTACTIC COMPONENT
PASSIVE RULE
Russia defeated Germany.
Germany was defeated by Russia.
BUT NOT
a. *Germany Russia was defeated by
b. *Germany was Russia defeated by
c. *Germany was by Russia defeated
d. *Germany was defeated Russia by
Structure preservation already excludes (a)-(d), so the passive rule itself can be greatly simplified.
26. THE LEXICALIST HYPOTHESIS
THE LEXICALIST HYPOTHESIS: Transformational
rules cannot change the syntactic category of an
item, perform derivational morphology, etc.
Before the late 1960s (and in Generative Semantics),
John’s refusal of the offer was derived
transformationally from something like the fact that
John refused the offer.
27. THE LEXICALIST HYPOTHESIS
But that fails to explain why
John’s three unexpected refusals of the offer
has exactly the same structure as
John’s three boring books about surfing
In other words, you lose a generalization by deriving
nouns from verbs.
28. THE LEXICALIST HYPOTHESIS
Another argument for the LH is that the relationship between
verbs and their corresponding nominalizations can be very
idiosyncratic:
motion, but *mote; usher, but *ush; tuition, but *tuit; etc.
profess (‘declare openly’) — professor (‘university teacher’) — profession (‘career’)
ignore (‘pay no attention to’) — ignorance (‘lack of knowledge’) — ignoramus (‘very
stupid person’)
person (‘human individual’) — personal (‘private’) — personable (‘friendly’) —
personality (‘character’) — personalize (‘tailor to the individual’) — impersonate (‘pass
oneself off as’)
social (‘pertaining to society’; ‘interactive with others’) — socialist (‘follower of a
particular political doctrine’) — socialite (‘member of high society’)
29. SURFACE SEMANTIC INTERPRETATION
The lexicalist hypothesis emphasized the importance
of ‘shallow’ levels of syntactic structure.
Shallow levels became even more important with the
introduction of surface semantic interpretation in
the late 1960s.
RAY JACKENDOFF
30. SURFACE SEMANTIC INTERPRETATION
[S [NP [Q many] [N men]] [VP [V read] [NP [Q few] [N books]]]]
many: INTERPRET WITH WIDE SCOPE
few: INTERPRET WITH NARROW SCOPE
31. THE EXTENDED STANDARD THEORY
But some aspects of interpretation still seemed to
take place at Deep Structure.
For example, interpretation seems to have to take
place before Passive, since in a passive sentence like
Mary was seen by John, Mary is interpreted in
object position.
The model with both Deep and Surface Structure
rules of interpretation was called the Extended
Standard Theory (EST).
33. TRACES AND OTHER ABSTRACT ELEMENTS
The ‘price paid’ for the addition of constraints on
movement, surface interpretation, and so on was an
explosion of the number of ‘invisible elements’ like
traces, PRO, pro, and so on.
Look at how traces work(ed).
34. TRACES AND OTHER ABSTRACT ELEMENTS
[S [NP Mary] [AUX was] [VP [V seen] [NP t]]]
35. TRACES AND OTHER ABSTRACT ELEMENTS
But the biggest plus for traces and other empty
elements was that they seemed to allow for the
unification of constraints on movement and
constraints on anaphora.
36. TRACES AND OTHER ABSTRACT ELEMENTS
Mary helped herself and
Mary seemed t to be happy
are grammatical for the same reason.
*Mary asked [John to help herself] and
*Mary seemed [to be true t to be happy]
are ungrammatical for the same reason.
37. MOVE-α
By the mid 1970s the transformational component
had been ‘cleaned up’ to the point where Chomsky
suggested that there could be one all-purpose
movement rule, Move-α.
39. ALL INTERPRETATION NOW ON THE SURFACE
Note that traces allow all interpretation to take place
on the surface.
John_i was seen t_i by Mary
The trace of John marks its original D-Structure
position.
The model on the next slide (also called the
Extended Standard Theory) was dominant from the
mid 1970s to the mid 1990s.
41. THE GOVERNMENT-BINDING THEORY (GB)
The next major step forward in syntactic theory was
the Government-Binding Theory (GB).
Published in 1981, Chomsky’s Lectures on Government and Binding (LGB) synthesized all of the results of the previous decade.
42. THE GOVERNMENT-BINDING THEORY (GB)
GB posited that a grammar is a set of interacting principles.
Movement applies freely, constrained by these principles.
The principles are:
Bounding
Government
Theta-theory
Binding
Case
Control
X-bar
43. THE GOVERNMENT-BINDING THEORY (GB)
Bounding: The principles that determine how far an
element can move. Subjacency is the most important
Bounding principle.
Government: The relation between a head and its
dependent element, e.g. V and NP, V and PP, INFL
and the subject position.
The centrepiece of government was the Empty
Category Principle: Every empty element needs to be
governed in a particularly strong way.
44. THE GOVERNMENT-BINDING THEORY (GB)
*Who did you wonder if solved the problem is an ECP
violation (note that it does not violate Subjacency).
45. THE GOVERNMENT-BINDING THEORY (GB)
Theta-theory: governs the positioning of arguments
(elements that have thematic roles).
A consequence of theta-theory: an element can move
only into a position that is not assigned a semantic role:
it seems [John is willing to help]
John_i seems [t_i willing to help]
This movement is possible because seem does not assign
a thematic role to its subject.
46. THE GOVERNMENT-BINDING THEORY (GB)
Binding theory: Governs the relationship between an
element and its antecedent:
Principle-A: An anaphor must be bound in its governing
category: They_i like each other_i, but not *They_i think that
Mary likes each other_i.
Principle-B: A pronominal must be free in its governing
category: John_i likes him_j, but not *John_i likes him_i.
Principle-C: A referring expression must be free
everywhere: *He_i thinks that John_i is smart.
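The three binding principles can be sketched as a small checker over the examples above. Two loud simplifications are assumed here: the governing category is taken to be the minimal clause, and c-command is approximated as "precedes, and sits in the same or a higher clause". Neither is the full GB definition; the `NP` record and its fields are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class NP:
    word: str
    kind: str      # "anaphor", "pronominal", or "r-expression"
    index: str     # referential index, e.g. "i" or "j"
    clause: tuple  # clause path: (0,) is the matrix clause, (0, 1) is embedded in it

def binders(nps, pos):
    """Coindexed NPs that precede the target in the same or a higher clause (crude c-command)."""
    target = nps[pos]
    return [a for a in nps[:pos]
            if a.index == target.index
            and target.clause[:len(a.clause)] == a.clause]

def violations(nps):
    """Return the list of binding principles ('A', 'B', 'C') that the sentence violates."""
    out = []
    for pos, np in enumerate(nps):
        bound_by = binders(nps, pos)
        local = [a for a in bound_by if a.clause == np.clause]
        if np.kind == "anaphor" and not local:
            out.append("A")  # anaphor not bound within its own clause
        if np.kind == "pronominal" and local:
            out.append("B")  # pronominal bound within its own clause
        if np.kind == "r-expression" and bound_by:
            out.append("C")  # referring expression bound somewhere
    return out

bad_anaphor = [NP("they", "pronominal", "i", (0,)),
               NP("Mary", "r-expression", "j", (0, 1)),
               NP("each other", "anaphor", "i", (0, 1))]
print(violations(bad_anaphor))  # ['A']
```

The same function flags *John_i likes him_i under Principle B and *He_i thinks that John_i is smart under Principle C, and it returns an empty list for the grammatical examples.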
47. THE GOVERNMENT-BINDING THEORY (GB)
Case theory: Every NP must be Case-marked:
e was seen John
John_i was seen e_i
John has to move because participles do not assign
Case. (It only looks like Passive is obligatory.)
Control theory: The relationship between an
antecedent and PRO (John_i wants PRO_i to leave).
48. THE GOVERNMENT-BINDING THEORY (GB)
X-bar theory: The principles that govern phrase
structure:
Functional and lexical categories
49. THE GOVERNMENT-BINDING THEORY (GB)
A SENTENCE IS THE PRODUCT OF THE INTERACTION OF THE DIFFERENT PRINCIPLES:
GOVERNMENT THEORY, BINDING THEORY, CASE THEORY, THETA-THEORY, X-BAR THEORY, ETC.
50. PARAMETERS
The principles are parameterized: (Ideally) by
allowing a small number of principles each to have a
small number of settings, the superficially complex
differences of the world’s languages can be
accounted for.
LUIGI RIZZI
51. PARAMETERS
Rizzi noticed that extraction is more permissive in
Italian than in English.
In Italian the literal equivalent of English *What did
you wonder where John put? is grammatical.
Rizzi proposed that Subjacency is parameterized:
Bounding nodes in English: IP and DP
Bounding nodes in Italian: CP and DP.
52. PARAMETERS
Other languages are more restrictive than English.
Russian has Wh-Movement, but the wh-element
cannot be extracted from its clause.
So in Russian you can say things like Who did you
see?, but not *Who did you ask Mary to see?
Therefore in Russian, the bounding nodes are both
CP and IP.
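The parametric view can be sketched by keeping one Subjacency check and varying only the set of bounding nodes per language, using the three settings given on the slides. The node sequences crossed by each movement step are hand-annotated here, and treating the Russian embedded infinitive as a CP is an assumption of the sketch.

```python
# Bounding-node settings as stated on the slides.
BOUNDING_NODES = {
    "English": {"IP", "DP"},
    "Italian": {"CP", "DP"},
    "Russian": {"CP", "IP"},
}

def extraction_ok(steps, language):
    """One principle for all languages: no step may cross more than one bounding node."""
    bounding = BOUNDING_NODES[language]
    return all(sum(node in bounding for node in step) <= 1 for step in steps)

# Hand-annotated movement steps (assumptions of this sketch).
WH_ISLAND = [["IP", "CP", "IP"]]            # What did you wonder where John put?
CLAUSE_EXTRACTION = [["IP"], ["CP", "IP"]]  # Who did you ask Mary to see?
LOCAL_QUESTION = [["IP"]]                   # Who did you see?

for language in BOUNDING_NODES:
    print(language,
          extraction_ok(WH_ISLAND, language),
          extraction_ok(CLAUSE_EXTRACTION, language),
          extraction_ok(LOCAL_QUESTION, language))
```

Under these annotations the wh-island extraction comes out bad in English but fine in Italian, and extraction from an embedded clause comes out bad only in Russian, while the local question is fine everywhere.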
53. PARAMETERS
But what about languages like Chinese that appear not to have
any Wh-Movement (wh-in situ languages)?
Zhangsan xiang-zhidao [Lisi mai-le shenme] (Chinese)
Zhangsan wonder Lisi bought what
‘Zhangsan wonders what Lisi bought’
John-ga dare-o butta ka (Japanese)
John-SU who-OB hit Q
‘Whom did John hit?’ (compare: John-ga Bill-o butta ‘John hit
Bill’)
54. PARAMETERS
James Huang has worked out the parameters for
Chinese and typologically similar languages:
JAMES HUANG
55. PARAMETERS
Huang has proposed a ‘Question Movement
Parameter’:
In languages like English, Italian, and Russian,
Wh-Movement applies in the overt syntax.
In languages like Chinese and Japanese,
Wh-Movement applies covertly in LF.
56. PARAMETERS
So the first sentence below would have the LF
representation under it:
Zhangsan xiang-zhidao [Lisi mai-le shenme]
Zhangsan wonder Lisi bought what
‘Zhangsan wonders what Lisi bought’
Zhangsan xiang-zhidao [CP shenme_i [IP Lisi mai-le t_i]]
Zhangsan wonder what Lisi bought
57. PARAMETERS
Why believe in LF movement?
Because interpretations in wh-in-situ languages (often!) obey at least some
island constraints.
*Ni xiangxin Lisi weishenme lai de shuofa
you believe [the claim [that [Lisi came why]]]
‘*Why do you believe the claim that Lisi came ___?’
*John-wa Mary-ga naze sore-o katta kadooka siritagatte iru no
John wants to know [whether [Mary bought it why]]
‘Why does John want to know whether Mary bought it ___?’
This can be captured if LF Wh-Movement is subject to these constraints.
58. PARAMETERS
Another parametric difference among languages:
No null arguments (English, French): It is raining,
*is raining; Mary left, *left
Null subjects (Spanish, Italian): llueve (‘It is
raining’); comió la manzana (‘He/She ate the apple’)
Null subjects are usually analyzed as the empty
pronominal ‘pro’
59. PARAMETERS
Both null subjects and null objects (Chinese):
Zhangsan_i xiwang [e_i keyi kanjian Lisi]
Zhangsan hope can see Lisi
‘Zhangsan hopes that he can see Lisi’
Zhangsan_i shuo Lisi kanjian-le e_i
Zhangsan say Lisi see LE
‘Zhangsan said Lisi saw him’
Huang analysed the empty position as a null topic (old
information, salient in discourse).
60. PARAMETERS
The Head Parameter has also been historically very important.
Joseph Greenberg’s 1963 paper launched modern typology.
JOSEPH GREENBERG, 1915-2001
61. PARAMETERS
The Greenbergian correlations:
VO correlate | OV correlate
adposition - NP | NP - adposition
copula verb - predicate | predicate - copula verb
‘want’ - VP | VP - ‘want’
tense/aspect auxiliary verb - VP | VP - tense/aspect auxiliary verb
negative auxiliary - VP | VP - negative auxiliary
complementizer - S | S - complementizer
question particle - S | S - question particle
adverbial subordinator - S | S - adverbial subordinator
article - N' | N' - article
plural word - N' | N' - plural word
noun - genitive | genitive - noun
noun - relative clause | relative clause - noun
adjective - standard of comparison | standard of comparison - adjective
verb - PP | PP - verb
verb - manner adverb | manner adverb - verb
62. PARAMETERS
The correlations are generally captured by the Head
parameter: A language is either head-initial (VO) or
head-final (OV).
The problem: Many, probably most, languages are
not completely consistent.
For example, Chinese is consistently head-final
except in the rule expanding X’ to X0 (if the head is
verbal it precedes the complement).
63. PARAMETERS
So Chinese manifests the ordering V-NP, but NP-N:
you sange ren mai-le shu
HAVE three man buy-ASP book
‘Three men bought books’
Zhangsan de sanben shu
Zhangsan DE three book
‘Zhangsan’s three books’
64. PARAMETERS
The usual assumption has been that ‘inconsistent’
languages have more complex grammars than
‘consistent’ languages.
So Huang has suggested that Chinese has a more
complicated X-bar schema to ‘pay’ for its
inconsistency:
XP → YP X'
X' → X0 YP iff X = [+v]
X' → YP X0 otherwise
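The conditional X-bar schema above can be sketched as a linearization function in which a single feature, [+v], decides whether the head precedes or follows its sister. Treating the numeral-classifier sequence sanben as the YP sister of the noun is a simplifying assumption of this sketch, not a claim from the lecture.

```python
def linearize_xbar(head, sister, verbal):
    """X' -> X0 YP iff X = [+v]; otherwise X' -> YP X0."""
    return [head, sister] if verbal else [sister, head]

def linearize_xp(specifier, xbar):
    """XP -> YP X': the specifier always precedes X'."""
    return [specifier] + xbar

# Verbal head: head-initial, as in mai-le shu 'bought books'.
print(linearize_xbar("mai-le", "shu", verbal=True))    # ['mai-le', 'shu']
# Nominal head: head-final, as in sanben shu 'three books'.
print(linearize_xbar("shu", "sanben", verbal=False))   # ['sanben', 'shu']
```

The extra condition on the rule is the 'price' the schema pays: a fully consistent head-final language would need only the second ordering and no reference to [+v] at all.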
65. PARAMETERS
Lisa Travis has suggested a different way of handling the inconsistent ordering of Chinese.
LISA TRAVIS
Normally, if a language is head-final, it assigns Case and Theta-Role to the left, as in
(a). However, Chinese has a special setting (b) that violates this default ordering.
a. Unmarked setting: HEAD-RIGHT & THETA-ASSIGNMENT TO LEFT & CASE-ASSIGNMENT TO LEFT
b. Marked setting (Chinese): HEAD-RIGHT & THETA-ASSIGNMENT TO RIGHT & CASE-ASSIGNMENT TO RIGHT
66. PARAMETERS
Some other important parameters:
o Serial verbs (YES, as in Chinese; NO, as in English)
o Polysynthesis (YES, as in Inuit and Athabaskan; NO, as in English and Chinese)
o Accusative or Ergative (Accusative as in English and Chinese; Ergative as in Dyirbal and Georgian)
68. PARAMETERS
Parameterized principles came to play such an
important role that the Government-Binding theory
is sometimes called the ‘Principles-and-Parameters’
approach.
At one point, syntacticians were confident that
acquiring a language was just a matter of finding the
right ON-OFF settings for each language.
69. PARAMETERS
The theory of parameters is in difficulty now:
o One might need hundreds or even thousands of them.
o The clustering effects have not worked out very well.
o They are out of keeping with the spirit of the Minimalist Program.
o There have been recent attempts to derive them from the process of language acquisition.
70. THE MINIMALIST PROGRAM
Starting in the 1990s, the Government-Binding
theory has been gradually replaced by the Minimalist
Program.
But the MP is in many ways not an abrupt change of
direction from GB.
Many conceptions from GB were incorporated
directly into the MP.
71. THE MINIMALIST PROGRAM
Everybody knew that there was huge redundancy in the
scope of the GB principles.
Binding, bounding, Case, theta, etc. overlapped
considerably in their domains.
Some ungrammatical sentences were ruled out by 3 or
4 different principles!
So it became clear that it was desirable to reduce the
number of principles.
72. THE MINIMALIST PROGRAM
Many of the principles seemed to have a ‘least-effort/
economy’ essence.
That is, they moved elements as short a distance as
possible …
… or they looked at only the closest possible
relationship between an anaphor and its antecedent or
a gap and its filler.
That seemed to suggest that ‘formal economy’ should
be at the centre of the theory.
73. THE MINIMALIST PROGRAM
Other things that were important in the early theory no
longer seemed so important.
The levels of D(eep)-Structure and S(urface)-Structure
seemed to be doing less and less work.
X-bar theory seemed to follow from independent
principles.
More and more generalisations seemed to apply at the
interfaces with PF and LF, rather than in the course of
the derivation.
74. THE MINIMALIST PROGRAM
Chomsky’s big idea: Rebuild grammatical theory from
the ‘bottom up’: Start with only what we know is
necessary and go from there.
That’s why it is called the ‘Minimalist Program’.
The idea that language might be ‘perfect’ is a leading idea
of the MP.
If that is true, language is unlike all other known
biological systems.
75. THE MINIMALIST PROGRAM
• The Minimalist Program is committed to probing to
what extent the human language faculty is an
optimal solution to minimal design specifications.
• The hope is that the only grammatical processes are
those that are subject to ‘virtual conceptual
necessity’.
• Notice that one consequence is that there is a much
smaller innate UG.
76. THE MINIMALIST PROGRAM
The MP has no level of D-Structure or S-Structure and
leaves a more important role for the
semantic and phonetic interfaces.
So processes that used to be considered
syntax-internal, like binding, bounding,
etc., are now handled at LF or at PF.
77. THE MINIMALIST PROGRAM
The MP has only one structure-building operation, namely
‘Merge’. In other words, recursion is all that there is
in the narrow syntax.
Sentences are built from the ‘bottom up’, in the
manner of categorial grammar.
Movement is considered to be ‘Internal Merge’, that
is, the merging (expansion) of an element already in
the derivation.
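Merge and Internal Merge can be sketched with unordered sets. This is a deliberately bare illustration: labels, features, and copies versus traces are all left out, and the function names are hypothetical.

```python
def merge(a, b):
    """External Merge: combine two syntactic objects into an unordered set."""
    return frozenset({a, b})

def contains(obj, part):
    """True if `part` is `obj` itself or occurs anywhere inside it."""
    if obj == part:
        return True
    return isinstance(obj, frozenset) and any(contains(x, part) for x in obj)

def internal_merge(obj, part):
    """Internal Merge ('movement'): re-merge an element already inside `obj` with `obj`."""
    if obj == part or not contains(obj, part):
        raise ValueError("part must be properly contained in obj")
    return merge(part, obj)

# Bottom-up derivation of 'John was seen', with John starting as the object of 'seen'.
vp = merge("seen", "John")
tp = merge("was", vp)
moved = internal_merge(tp, "John")
print(moved == frozenset({"John", tp}))  # True
```

Because `internal_merge` is just `merge` applied to something already in the structure, no separate movement operation has to be stipulated.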
79. THE MINIMALIST PROGRAM
The MP appeals to ‘third-factor’ explanations: those that are
based on factors outside of universal grammar.
The reason for that is clear — the more that you
remove from UG, the more that other systems are
going to need to take over the work.
So maybe economy principles arise from pressure for
efficient computation and have nothing to do with
UG.
80. THE MINIMALIST PROGRAM
The next few classes will go into more detail about the
MP, its strengths and its weaknesses.
One thing to keep in mind: If the narrow syntax is not
accounting for grammatical complexity, then what is?
The answer: the lexicon and the interface components.
If so, does the MP lead to an overall simplification?
81. LEXICALIST APPROACHES
Not all formal linguists work in ‘Chomskyan’ syntax.
Recall that the idea was to impose more and more
constraints on syntactic transformations.
By the late 1970s, some linguists had ‘constrained’
transformations out of existence.
These became (super)-lexicalist approaches.