1. Rethinking language structures Composition and constituency
Exploring Language (探索語言)
Forming sentences
chenhaochiu@ntu.edu.tw
Mar. 8, 2022
1 / 87
Outline
In today’s lecture, we will be talking about:
1 Rethinking language structures
2 Composition and constituency
Rethinking language structures
我在街上看見一個殺了殺了殺了人的人的人的人。
On the street, I saw a person who killed a person who
killed a person who killed a person.
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo
buffalo.
(The) Buffalo buffalo (that) Buffalo buffalo (often)
buffalo (in turn) buffalo (other) Buffalo buffalo.
(Buffalo bison that are often intimidated by Buffalo
bison in turn intimidate other Buffalo bison.)
[(Buffalonian bison) (that) (Buffalonian bison
intimidate)] intimidate (Buffalonian bison)
Architecture
Defining characteristics:
Greek: triangular roof, pillars
Roman: Greek + arch
Byzantine: onion dome
Gothic: tall, stained glass
Renaissance: symmetric
Baroque: contrast; “rich” surface
The composition
“Words” are (relatively) easy to identify, and they are
very important for language use. But these basic units
(elements) do not occur randomly or arbitrarily.
Random piles of rocks or stones don’t make
“buildings” (i.e., they don’t serve any function).
“Words” are combined systematically into larger, more
complicated structures. These structures convey
meanings for communication.
There are just too many sentences to list.
Speakers know more than lists.
The way words are organized into sentences is not
random. There are patterns.
⇒ Syntax
Languages are not buildings?
https://youtu.be/7 ToAF46GPQ
B-line buses in Vancouver
B for Broadway?
If there is a B-line, then where is the A-line (bus)?
B for bee?
Wiki: a bee line is the shortest route or a straight line
between two points.
Rethinking language structures
Language structures are like architecture, but they are
expressed linearly (sequentially).
How do we break sentences into pieces
“systematically”?
我在街上看見一個殺了殺了殺了人的人的人的人。
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo
buffalo.
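Languages as a bee line?
How do we break sentences into pieces “systematically”?
羅馬人對建築工程的重大貢獻,是發明了一種被稱為
「黏漿」(Caementum) 的萬用材料,這是一種用火山
灰岩、石灰和水拌成的石漿,再加上碎石或碎磚,用
之於建築業,非常堅固,又增添色彩。這就是世界上
第一種足以支撐大跨度建築的混凝土。羅馬人發明了
混凝土,從此大拱門、大圓頂、大拱頂就都能獨立,
而無需像古希臘建築那樣靠許多柱子來支撐了,從而
成為世界建築史上劃時代的創舉。
(English translation: The Romans’ major contribution to
building engineering was the invention of an all-purpose
material called “caementum,” a mortar made by mixing
volcanic ash rock, lime, and water, with crushed stone or
broken brick added. Used in construction, it was very
strong and also added color. This was the world’s first
concrete able to support large-span structures. Once the
Romans had invented concrete, great arches, domes, and
vaults could stand on their own, without relying on many
columns for support as ancient Greek buildings did. This
was an epoch-making achievement in the history of world
architecture.)
(Source: https://kknews.cc/travel/oegga5.html)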
n-gram?
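One way to read the n-gram idea is to slide a fixed-size window over the text, one character at a time. The sketch below does this for a single clause from the passage; the function name char_ngrams and the choice of clause are assumptions made for illustration, not part of the lecture.

```python
def char_ngrams(text: str, n: int) -> list[str]:
    """Return every run of n consecutive characters in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

# "The Romans invented concrete", a clause taken from the passage.
clause = "羅馬人發明了混凝土"

bigrams = char_ngrams(clause, 2)
trigrams = char_ngrams(clause, 3)
print(bigrams)
print(trigrams)
```

Some windows line up with words (羅馬 "Rome", 發明 "invent", 混凝土 "concrete"), but many cut across word boundaries (人發, 了混), so a fixed window alone does not break the sentence into meaningful pieces.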
Positions?
Parts of speech?
Do you find any patterns?
Parts of speech
As in architecture, buildings are composed of different
“elements” and “parts” that serve different functions.
Speech, too! What are the “parts of speech”?
Some lexical categories: noun, verb, participle,
interjection, pronoun, preposition, adverb,
and conjunction.
Parts of speech are also called lexical categories or
word classes.
Testing for categories
1 Affixation: walk, walks, walked vs. Fred, yellow
2 Distribution:
___ apple fell straight to the ground.
An apple ___ straight to the ground.
3 Meaning: Doraemon, the moon, etc.
⇒ Both names refer directly to (specific)
individuals in the world. (But this test doesn’t always
capture all the words that actually belong in the category.)
Generative rules
How are these categories organized in sentences?
Alice sighed.
N V
The rabbit sighed.
DET N V
The rabbit saw Alice.
DET N V N
The girl saw the rabbit.
DET N V DET N
The white rabbit left.
DET ADJ N V
The white rabbit saw Alice.
DET ADJ N V N
The girl saw the white rabbit.
DET N V DET ADJ N
(DET = determiner)
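The labelling above can be reproduced mechanically with a word-to-category lookup. This is a minimal sketch: the LEXICON table is an assumption that covers only the words on this slide, and the function name categories is made up for illustration.

```python
# Toy lexicon covering only the words used on this slide (an assumption).
LEXICON = {
    "alice": "N", "rabbit": "N", "girl": "N",
    "sighed": "V", "saw": "V", "left": "V",
    "the": "DET", "white": "ADJ",
}

def categories(sentence: str) -> list[str]:
    """Map each word of a sentence to its lexical category."""
    words = sentence.rstrip(".").lower().split()
    return [LEXICON[word] for word in words]

for s in ["Alice sighed.", "The rabbit saw Alice.", "The girl saw the white rabbit."]:
    print(s, "->", " ".join(categories(s)))
```

A word missing from the table raises a KeyError, which is a reminder that a lookup like this only restates the labels and does not explain how the categories are assigned.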
The sentence structures from the previous slide:
S → N V
S → DET N V
S → DET N V N
S → DET N V DET N
S → DET ADJ N V
S → DET ADJ N V N
S → DET N V DET ADJ N
Listing every possible structure is redundant.
Can we state them in a more general, compact way?
S → N V
S → DET N V
S → DET N V N
S → DET N V DET N
⇒ An N position can be filled by either N or DET N.
⇒ (DET) N
S → DET ADJ N V
S → DET ADJ N V N
S → DET N V DET ADJ N
⇒ An N position can be filled by N, DET N, or DET ADJ N.
⇒ (DET) (ADJ) N
So far: S → (DET) (ADJ) N V ((DET) (ADJ) N)
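To see what the compact rule covers, one can expand every optional element both ways and collect the resulting flat patterns. The sketch below assumes a made-up encoding of a rule as (category, is_optional) pairs; the names expand and NOUN_GROUP are illustrative only.

```python
from itertools import product

def expand(slots):
    """Expand (category, is_optional) pairs into the set of flat patterns."""
    choices = [[(), (cat,)] if optional else [(cat,)] for cat, optional in slots]
    return {sum(combo, ()) for combo in product(*choices)}

# (DET) (ADJ) N
NOUN_GROUP = [("DET", True), ("ADJ", True), ("N", False)]
noun_patterns = expand(NOUN_GROUP)

# S -> (DET) (ADJ) N V ((DET) (ADJ) N): the second noun group is optional as a whole.
sentences = {subj + ("V",) + obj
             for subj in noun_patterns
             for obj in noun_patterns | {()}}
generated = {" ".join(pattern) for pattern in sentences}

listed = {"N V", "DET N V", "DET N V N", "DET N V DET N",
          "DET ADJ N V", "DET ADJ N V N", "DET N V DET ADJ N"}
print(len(generated), listed <= generated)
print(sorted(generated - listed))
```

Under this encoding the single rule yields 20 patterns. All seven listed rules are among them, along with patterns that were never listed, such as ADJ N V.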
What about “Mary knows that Alice sighed”?
A sentence can be embedded within another sentence!
Shall we revise our sentence structure?
How about S → (DET) (ADJ) N V (that) (S)?
What does that predict?
⇒ Recursion
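Because S appears on both sides of the rule, it can be applied to its own output any number of times. The sketch below is a simplified illustration: it takes ready-made "N V" clauses and nests them with "that". The function name embed and the extra clause "John thinks" are assumptions, not examples from the lecture.

```python
def embed(clauses: list[str]) -> str:
    """Nest clauses as S -> N V that S, recursing on the remaining clauses."""
    head, *rest = clauses
    if not rest:
        # Base case: the innermost sentence has no embedded S.
        return head
    # Recursive case: this S contains another S after "that".
    return f"{head} that {embed(rest)}"

print(embed(["Mary knows", "Alice sighed"]))
print(embed(["John thinks", "Mary knows", "Alice sighed"]))
```

There is no upper limit on the depth. Note also that the rule as written makes (that) and (S) independently optional, so it appears to predict strings such as "Mary knows that" with nothing following, which is worth checking against speakers' judgments.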
Word order
Different languages may have different word orders.
(cf. the architectural styles)
SVO: English, Mandarin, etc.
SOV: Korean, Japanese, Turkish, etc.
VSO: Arabic, Irish, etc.
About 35% of the world’s languages have SVO order.
What about SOV? About 44%.
About 19% of languages, including Arabic and Irish,
have VSO word order.
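A word-order label can be read as an instruction for arranging the three roles. The sketch below is only an illustration with English placeholder words; the function name linearize is an assumption, and the percentages are the ones quoted on this slide.

```python
def linearize(order: str, subject: str, verb: str, obj: str) -> str:
    """Arrange subject, verb, and object following a label such as 'SOV'."""
    role = {"S": subject, "V": verb, "O": obj}
    return " ".join(role[letter] for letter in order)

for order in ["SVO", "SOV", "VSO"]:
    print(order, "->", linearize(order, "the-man", "washes", "clothes"))

# Share left over for the other three orders, using the slide's figures.
remaining = 100 - (35 + 44 + 19)
print(remaining)
```

With these figures, only about 2% of languages are left for the other three possible orders.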
The remaining patterns, VOS, OVS, and OSV, are
quite rare. Below is an example of Malagasy, a VOS
Austronesian language spoken in Madagascar.
Manasa lamba amin’ny savony ny lehilahy
washes clothes with-the soap the man
‘The man washes clothes with the soap.’
While it may be convenient to label a language as
being VOS, SOV, etc., such labels can be misleading.
Consider the following German examples.
1 Karl kocht die Suppe.
Karl cooks the soup
‘Karl is cooking the soup.’
⇒ SVO
2 Magda ist froh, dass Karl die Suppe kocht.
Magda is happy that Karl the soup cooks
‘Magda is happy that Karl is cooking the soup.’
⇒ The subordinate clause has an SOV order.
Word order is not restricted to subjects (S), verbs (V),
and objects (O). Compare the order of the noun and
the determiner in the following examples.
1 [English]: these books. *books these
2 [Malay]: buku-buku ini
books these
‘these books’
*ini buku-buku
Constraints on prepositions:
Sally finally met with that person.
*Sally finally met that person with.
Now consider Japanese:
kono kodomo to
this child with
‘with this child’
*to kono kodomo
⇒ Postposition
Constituents
OK. These generative rules seem to work just fine,
though stringing categories together linearly (sort of)
hints that the strongest relationships hold between neighbours.
Consider: The child found a puppy.
Do you think “a puppy” (DET N) has a stronger
relationship than “found a” (V DET)?
(NB: The generative rules either suggest that DET-N
and V-DET are equally closely related, or fail to
reveal any difference between these pairs of
neighbouring units!)
So things that bear a stronger relationship should be
grouped together, at least structurally.
The child found a puppy.
⇒ [The child] [found] [a puppy].
In linguistics, we call these groupings “constituents.”
(This term is more straightforward in a tree-like
structure, which we will come back to in a few slides.)
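The bracketing above can be written down as a nested list, with one inner list per group. This is a minimal sketch of that idea, not a full tree representation; the names sentence, groups, and flatten are chosen for illustration.

```python
# [The child] [found] [a puppy], written as one inner list per group.
sentence = [["The", "child"], ["found"], ["a", "puppy"]]

def groups(tree):
    """Return each bracketed group as a string, in left-to-right order."""
    return [" ".join(group) for group in tree]

def flatten(tree):
    """Recover the linear word string from the grouped structure."""
    return " ".join(word for group in tree for word in group)

print(groups(sentence))
print(flatten(sentence))
print("a puppy" in groups(sentence), "found a" in groups(sentence))
```

Flattening gives back the same linear string, but the grouped form records something the flat string does not: "a puppy" is a unit here, while "found a" is not.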
How do we know if a group is a true constituent? There are
some tests.
1 Stand alone: If a group of words can stand alone
(usually in answering a question), they form a
constituent. For example,
“What did the child find?” “A puppy.”
2 Replacement: A constituent can be replaced by a
pronoun or a word like “do.” For example,
“Where did the child find a puppy?” “He found it in
the park.”
3 Movement: If a group of words can be moved together
and remain grammatical, they form a constituent. For
example,
“A puppy was found by the child.”
YOUR CONTRIBUTION
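• Give some examples (in any language you know) showing
where constituency may (seemingly) fail. For example,
你在家嗎? (“Are you at home?”)
⇒ 「在!」 (“(I) am!”; literally “at”)
天能你看了嗎? (“Have you seen Tenet?”)
⇒ 「看了。」 (“(I have) seen (it).”)
「我咖啡、他可樂」 (“Coffee for me, cola for him”; literally “I coffee, he cola”)
[Japanese] Hoshii! (meaning “(I) want (this).”)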