Discovering the units in language cognition: From empirical evidence to a computational model
1. Jinbiao Yang
Promotor: Prof. dr. Antal van den Bosch
Copromotor: Dr. Stefan L. Frank
2. What are the real language units that we use in daily life to perceive, memorize, and produce language?
Cognitive units
Words
Word parts
Word combinations
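“bicycle”
? “biunique”
“紫菜” (laver, an edible seaweed)
? “紫贝” (purple shell)
“pine·wood”
? “pine·apple”
“马术” (horsemanship)
? “马桶” (toilet)
“how are you?”
? “how is Jim?”
“人工智能” (artificial intelligence)
? “人工心脏” (artificial heart)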
• What are the cognitive units?
• Which types of units are more likely to be cognitive units?
4. The processing of the units during reading
[Figure: timeline of reading the text, with example strings "ABCD" and "AB CD". Stage labels: detecting the familiar units, recognizing the detected units, integration. Time markers: 80~120 ms, 170 ms ~, 230 ms ~.]
5. What are the cognitive units?
• The larger units tend to be the cognitive units in use:
• Fewer units in each sentence;
• Less effort for working memory.
Unit Detection: Familiarity-based
Unit Recognition: Larger-first
• The units familiar to the language user tend to be the cognitive units.
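The larger-first recognition principle can be illustrated with a greedy longest-match segmenter. This is only a minimal sketch under stated assumptions: the function name `segment_larger_first` and the toy lexicon are hypothetical, and the slides do not say that recognition is implemented this way.

```python
def segment_larger_first(text, lexicon, max_len=None):
    """Split text into units, always preferring the largest familiar unit.

    `lexicon` stands in for the set of units familiar to the language user
    (an assumption made for illustration only).
    """
    if max_len is None:
        max_len = max((len(unit) for unit in lexicon), default=1)
    units = []
    i = 0
    while i < len(text):
        # Try the largest candidate first and shrink until one is familiar.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single symbol is always accepted as a fallback unit.
            if size == 1 or candidate in lexicon:
                units.append(candidate)
                i += size
                break
    return units


toy_lexicon = {"go", "in", "to", "going", "goingto", "rain"}
print(segment_larger_first("goingtorain", toy_lexicon))
```

With this toy lexicon the input is covered by two large units rather than many small ones, which is the sense in which larger units mean fewer units per sentence.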
6. What are the cognitive units?
• "what's this letter"
• "okay"
• "let's see"
• "what's this wonderful toy"
• "i don't know"
• …
• Working memory: least effort
• (Only one unit in each sentence)
• Long-term memory: heavy load
• (Too many units in the mental lexicon)
7. What are the cognitive units?
• "what's this letter"
• "okay"
• "let's see"
• "what's this wonderful toy"
• "i don't know"
• …
• Working memory: heavy load
• (Too many units in each sentence)
• Long-term memory: least effort
• (Only symbol units in the mental lexicon)
8. What are the cognitive units?
• "what's this letter"
• "okay"
• "let's see"
• "what's this wonderful toy"
• "i don't know"
• …
• Working memory: less effort
• (Fewer units in each sentence)
• Long-term memory: less effort
• (Fewer units in the mental lexicon)
Cognitive units are the units that minimize the cognitive load.
9. Unsupervised learning of cognitive units
g,o,i,n,g,t,r,a,
go,in,ing,to,ra,
going,rain,
goingto
The Less-is-Better model (LiB)
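The progression from single symbols to larger chunks shown on this slide can be imitated with a simple bottom-up merging loop. The sketch below is a BPE-style stand-in that merges the most frequent adjacent pair. It is not the LiB algorithm, whose details are not given in these slides. The names `learn_units` and `merge_pair`, the toy corpus, and the stopping rule are all hypothetical.

```python
from collections import Counter


def merge_pair(seq, a, b):
    """Replace every adjacent occurrence of (a, b) in seq with the chunk a + b."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
            out.append(a + b)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_units(corpus, n_merges):
    """Grow a unit lexicon from single symbols by repeated pair merging."""
    sequences = [list(sentence) for sentence in corpus]
    lexicon = {symbol for seq in sequences for symbol in seq}
    for _ in range(n_merges):
        pairs = Counter()
        for seq in sequences:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:  # placeholder stopping rule: ignore pairs seen only once
            break
        lexicon.add(a + b)
        sequences = [merge_pair(seq, a, b) for seq in sequences]
    return lexicon, sequences


toy_corpus = ["goingtorain", "goingtorain", "goingtorain"]
lexicon, sequences = learn_units(toy_corpus, n_merges=4)
print(sorted(lexicon, key=len))
print(sequences[0])
```

The number of merges controls the trade-off discussed earlier: few merges leave many small units in each sentence, while unlimited merging stores whole sentences as single units in the lexicon.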
11. LiB units = cognitive units?
Hypotheses:
1. Reading is cognitive-unit-wise
2. LiB units = cognitive units
Eye fixations = Centers of cognitive units = Centers of LiB units
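Under these hypotheses, a predicted fixation position is simply the center of each unit. A minimal sketch, assuming the text has already been segmented into units in reading order and that positions are 0-based character indices. The function name `unit_centers` and the example segmentation are hypothetical.

```python
def unit_centers(units):
    """Return the center position of each unit within the concatenated text."""
    centers = []
    start = 0
    for unit in units:
        # The center of a unit spanning [start, start + len - 1].
        centers.append(start + (len(unit) - 1) / 2)
        start += len(unit)
    return centers


print(unit_centers(["goingto", "rain"]))  # [3.0, 8.5]
```

A segmentation into larger units yields fewer predicted fixations than a word-by-word segmentation of the same text, which is what makes the two reading models distinguishable in eye-tracking data.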
12. Prediction F1 scores
Model                    English    Dutch
LiB-unit-wise Reading    53.1%      51.9%
Word-wise Reading        38.3%      38.7%
LiB units ≈ Cognitive units
[Figure labels: Train, Predict]
13. Take-Home Message
• The familiar/larger units tend to be the cognitive units.
• Reading proceeds cognitive unit by cognitive unit.
• The LiB model can learn cognitive units.
• Cognitive units are:
• ✘ words/morphemes/phrases.
• ✓ the units that minimize the cognitive load.
(The need for cognitive economy)
17. For the experiment:
Four types of 4-character Chinese strings
• Phrase
• e.g., “希腊神话”, translation: Greek mythology.
• Random words
• e.g., “存款电脑”, translation: Deposit-computer.
• Idiom
• e.g., “以逸待劳”, translation: Wait at your ease for the exhausted enemy.
• Random characters
• e.g., “投其顾此”, a nonsense word.
23. Prediction performance (F1 scores)
Model                                        English    Dutch
Less-is-Better                               53.06      51.87
Adaptor Grammar (collocation)                53.35      51.45
Chunk-Based Learner                          52.20      50.04
Fixation counts determined by word length    50.82      50.57
Word-by-Word reading                         38.32      38.68
Adaptor Grammar (word)                       30.10      28.95
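For readers unfamiliar with the metric, an F1 score combines precision and recall of the predicted fixations against the observed ones. The slides do not state how fixations were matched, so the sketch below assumes each fixation is reduced to a discrete position label (for example a word index) and that a prediction counts as correct when the same label was observed. The function name `fixation_f1` and the example data are hypothetical.

```python
def fixation_f1(predicted, observed):
    """F1 score between predicted and observed fixation positions."""
    predicted, observed = set(predicted), set(observed)
    true_positives = len(predicted & observed)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that were observed
    recall = true_positives / len(observed)      # share of observations that were predicted
    return 2 * precision * recall / (precision + recall)


# Toy example: 3 predicted positions, 4 observed positions, 2 in common.
score = fixation_f1(predicted=[0, 2, 3], observed=[0, 3, 5, 6])
print(f"{score:.4f}")  # 0.5714
```

Multiplying such a score by 100 gives values on the same scale as the table, although the actual evaluation procedure behind those numbers may differ from this sketch.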
24. • Q1: What takes priority in processing within the language hierarchy?
• A: The global & familiar units (Yang et al. 2020a).
• Q2: How can the flexible cognitive units be learned/segmented?
• A: The Less-is-Better unsupervised model (Yang et al. 2020b).
• Q3: Can a computational model generate empirical cognitive units?
• A: Very likely, because we can predict eye fixations using the LiB model (Conditional Acceptance).
25. Previous research (methodology)
Toolboxes:
• Analyzing MEEG: EasyEEG (Yang et al. 2018)
• Running experiments: Expy
• Making stimuli: CharDB, VoiceGen
Denoising algorithms:
• Removing EOG noise: DeEOG
• Finding the true zero/baseline point of MEEG wave: DeTrend, TrialAlign
Improving the trial-by-trial decoding:
• by desensitizing the phase of high-frequency bands
• by Contrast Learning (ongoing work)
Editor's Notes
Word-by-Word reading: assumes that the cognitive units are equal to words.
Only-Length: assumes that the fixation count on a word is determined by the word length.
with eye fixation knowledge
Q1 Discussion: Cognitive units are not just words; they are flexible and reflect the larger-first principle.