Discovering the units in language cognition: From empirical evidence to a computational model

Jinbiao Yang
Promotor: Prof. dr. Antal van den Bosch
Copromotor: Dr. Stefan L. Frank

What are the real language units that
we use in our daily life?
perceive, memorize, and produce...
Cognitive units
Words
Word parts
Word combinations
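“bicycle” ? “biunique”
“紫菜” (laver, an edible seaweed) ? “紫贝” (purple shell)
“pine·wood” ? “pine·apple”
“马术” (horsemanship) ? “马桶” (toilet)
“how are you?” ? “how is Jim?”
“人工智能” (artificial intelligence) ? “人工心脏” (artificial heart)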
• What are the cognitive units?
• Which types of units are more likely to be the cognitive units?

Experiments

The processing of the units during reading
Stages: Reading the text, Detecting the familiar units, Recognizing the detected units, Integration
Time windows: 80~120 ms, 170 ms ~, 230 ms ~
Example strings: ABCD, AB CD, AB

What are the cognitive units?
• The larger units tend to be the cognitive units in use:
• Fewer units in the sentence;
• Less effort for working memory.
Unit Detection: Familiarity-based
Unit Recognition: Larger-first
• The units familiar to the language user tend to be the cognitive units.
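The larger-first idea can be illustrated with a greedy longest-match segmenter. This is a minimal sketch, not the procedure used in the experiments or in the model: the function name `segment_larger_first` and the toy lexicon are assumptions made for illustration, and "familiar" is simplified to "present in the lexicon".

```python
def segment_larger_first(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation that prefers the longest known unit."""
    if max_len is None:
        # Longest unit in the lexicon bounds how far we need to look ahead.
        max_len = max(1, max((len(u) for u in lexicon), default=1))
    units = []
    i = 0
    while i < len(text):
        # Try the largest candidate first and shrink until a familiar unit is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single symbol is always accepted as a fallback unit.
            if size == 1 or candidate in lexicon:
                units.append(candidate)
                i += size
                break
    return units


# Hypothetical toy lexicon of familiar units.
toy_lexicon = {"go", "in", "to", "going", "rain", "goingto"}
print(segment_larger_first("goingtorain", toy_lexicon))  # ['goingto', 'rain']
```

With this lexicon the segmenter picks "goingto" rather than "go" or "going", so the sentence is covered by fewer units.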

What are the cognitive units?
• "what's this letter"
• "okay"
• "let's see"
• "what's this wonderful toy"
• "i don't know"
• …
• Working memory: least effort
• (Only one unit in each sentence)
• Long-term memory: heavy load
• (Too many units in the mental lexicon)

What are the cognitive units?
• "what's this letter"
• "okay"
• "let's see"
• "what's this wonderful toy"
• "i don't know"
• …
• Working memory: heavy load
• (Too many units in each sentence)
• Long-term memory: least effort
• (Only symbol units in the mental lexicon)

What are the cognitive units?
• "what's this letter"
• "okay"
• "let's see"
• "what's this wonderful toy"
• "i don't know"
• …
• Working memory: less effort
• (Fewer units in each sentence)
• Long-term memory: less effort
• (Fewer units in the mental lexicon)
Cognitive units are the units that can minimize the cognitive load.
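The two kinds of load can be made concrete with two simple counts: the mean number of units per utterance (a stand-in for working-memory load) and the number of distinct units (a stand-in for the size of the mental lexicon). The sketch below is only an illustration under those assumptions; the helper `load` is hypothetical, and whitespace-separated words are used merely as one convenient intermediate segmentation, not as the cognitive units themselves.

```python
utterances = [
    "what's this letter",
    "okay",
    "let's see",
    "what's this wonderful toy",
    "i don't know",
]


def load(segmentations):
    """Return (mean units per utterance, number of distinct units)."""
    total_units = sum(len(units) for units in segmentations)
    lexicon = {unit for units in segmentations for unit in units}
    return total_units / len(segmentations), len(lexicon)


# Extreme 1: every utterance is a single unit.
whole = [[u] for u in utterances]
# Extreme 2: every symbol (character, spaces dropped) is a unit.
symbols = [list(u.replace(" ", "")) for u in utterances]
# An intermediate segmentation (assumption: whitespace-separated words).
words = [u.split() for u in utterances]

for name, seg in [("whole utterances", whole), ("symbols", symbols), ("words", words)]:
    mean_units, lexicon_size = load(seg)
    print(f"{name}: {mean_units:.1f} units per utterance, {lexicon_size} distinct units")
```

With only five utterances the lexicon sizes are not representative. The trade-off described on the slides concerns a whole corpus, where the number of distinct utterances keeps growing while the symbol inventory stays small.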

Unsupervised learning of cognitive units
g,o,i,n,g,t,r,a,
go,in,ing,to,ra,
going,rain,
goingto
The Less-is-Better model (LiB)
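The bottom-up growth from symbols to larger units shown above can be sketched with a generic pair-merging loop. This is a BPE-style illustration of how small units might be chunked into larger ones, not the LiB algorithm itself, whose details are not given on the slides; the function name and the toy corpus are assumptions.

```python
from collections import Counter


def merge_most_frequent_pair(sequences):
    """Merge the most frequent adjacent pair of units in every sequence.

    Returns the updated sequences and the newly created unit
    (or None when no adjacent pair exists).
    """
    pairs = Counter()
    for seq in sequences:
        pairs.update(zip(seq, seq[1:]))
    if not pairs:
        return sequences, None
    (left, right), _ = pairs.most_common(1)[0]
    merged = []
    for seq in sequences:
        out, i = [], 0
        while i < len(seq):
            # Replace each occurrence of the chosen pair with one larger unit.
            if i + 1 < len(seq) and seq[i] == left and seq[i + 1] == right:
                out.append(left + right)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        merged.append(out)
    return merged, left + right


# Hypothetical toy corpus, starting from single symbols.
corpus = [list("goingtorain"), list("goingtogo"), list("raining")]
for step in range(6):
    corpus, new_unit = merge_most_frequent_pair(corpus)
    if new_unit is None:
        break
    print(step, new_unit, corpus)
```

Each pass creates one larger unit, so the sequences get shorter while the inventory of units grows.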

• Unit examples:
• yeah, what, can you, that’s, ing, ly
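• 的 (possessive particle), 没有 (not have), 我们 (we), 写完 (finish writing), 一个 (one/a), 长大了 (has grown up)
• Segmentation examples:
• Allright·whydon’t·we·puthimaway·now
• 这个·出口信贷·项目·委托·中国银行·为·代理·银行 (this · export credit · project · entrusts · Bank of China · as · agent · bank)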

LiB units = cognitive units?
Hypotheses:
1. Reading is cognitive-unit-wise
2. LiB units = cognitive units
Eye fixations = Centers of cognitive units = Centers of LiB units
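Under these hypotheses, a segmentation can be turned into predicted fixation positions by taking the center of each unit. The sketch below assumes positions are measured as character offsets in the concatenated text; the slides do not specify the coordinate system, and `unit_centers` is a hypothetical helper.

```python
def unit_centers(units):
    """Return the center position (character offset) of each unit in the joined text."""
    centers = []
    start = 0
    for unit in units:
        # The center of a unit spanning offsets start .. start + len - 1.
        centers.append(start + (len(unit) - 1) / 2)
        start += len(unit)
    return centers


print(unit_centers(["goingto", "rain"]))  # [3.0, 8.5]
```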

Prediction F1 scores

Model                  English  Dutch
LiB-unit-wise Reading  53.1%    51.9%
Word-wise Reading      38.3%    38.7%

LiB units ≈ Cognitive units
(Diagram labels: Train, Predict)
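For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below scores predicted fixation positions against observed ones by exact match on discrete positions; that matching criterion is an assumption, since the slides do not state how predictions and fixations were aligned.

```python
def f1_score(predicted, observed):
    """F1 between two collections of discrete positions (exact-match criterion)."""
    predicted, observed = set(predicted), set(observed)
    true_positives = len(predicted & observed)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(observed)
    return 2 * precision * recall / (precision + recall)


# Hypothetical positions: two of three predictions match two of four observed fixations.
print(round(f1_score({3, 8, 15}, {3, 8, 20, 25}), 3))  # 0.571
```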

Take-Home Message
• The familiar/larger units tend to be the cognitive units.
• Reading is cognitive-unit-by-cognitive-unit.
• The LiB Model can learn cognitive units.
• Cognitive units are:
• ✘ words/morphemes/phrases.
• ✓ the units that can minimize the cognitive load.
(The need for cognitive economy)

Expanding the horizon
Cognitive units may be better units for psycholinguistics and NLP.

Thank you for listening!

For the experiment:
Four types of 4-character Chinese strings
• Phrase
• e.g., “希腊神话”, translation: Greek mythology.
• Random words
• e.g., “存款电脑”, translation: Deposit-computer.
• Idiom
• e.g., “以逸待劳”, translation: Wait for the exhausted enemy at your ease.
• Random characters
• e.g., “投其顾此”, a nonsense word.

[Figure: The brain signal (EEG) of reading, 80~120 ms; labels: Group, Timing]

Task: Decide whether the target is familiar or not.
[Diagram labels: Target, Processing; conditions: FAST, SLOW]

[Figure: The brain signal (EEG) of reading, 170 ms ~; labels: Group, Timing]

[Figure: The brain signal (EEG) of reading, 170 ms ~ and 230 ms ~; labels: Group, Timing]

• Lexicon examples:
• the, yeah, you, what, wanna, can you, two, and, that’s
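• 没有 (not have), 中国 (China), 我们 (we), 经济 (economy), 已经 (already), 孩子 (child), 但是 (but), 教育 (education), 可以 (can)
• Segmentation examples:
• allright·whydon’t·we·puthimaway·now
• 这个·出口信贷·项目·委托·中国银行·为·代理·银行 (this · export credit · project · entrusts · Bank of China · as · agent · bank)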
Performance on computational tasks

Metric                      Corpus   Symbols  BPE subwords  Words   LiB units
Minimum Description Length  BRphono  490      451           289     281
Minimum Description Length  CTB8     21,921   18,362        16,809  16,755
2-gram surprisal            BRphono  1.539    0.695         0.677   0.548
2-gram surprisal            CTB8     2.466    1.932         1.617   1.452
3-gram surprisal            BRphono  0.950    0.390         0.405   0.335
3-gram surprisal            CTB8     1.404    0.827         0.806   0.626
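As a reminder of what the surprisal rows measure, the surprisal of a unit is the negative log probability of that unit given its context. The sketch below computes a mean 2-gram surprisal from unsmoothed counts over the same sequences it is evaluated on. The estimation details behind the reported figures (smoothing, train/test split, normalization) are not given on the slides, so this is only an assumed, simplified version; `mean_bigram_surprisal` and the `<s>` start marker are illustrative choices.

```python
import math
from collections import Counter


def mean_bigram_surprisal(sequences):
    """Mean of -log2 P(unit | previous unit), estimated from unsmoothed counts."""
    context_counts = Counter()
    bigram_counts = Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the sequence start
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    total_bits = 0.0
    n_units = 0
    for (prev, cur), count in bigram_counts.items():
        prob = count / context_counts[prev]
        total_bits += -math.log2(prob) * count
        n_units += count
    return total_bits / n_units if n_units else 0.0


# Hypothetical toy segmentations of two utterances.
print(mean_bigram_surprisal([["a", "b"], ["a", "c"]]))  # 0.5
```

In the toy input, "a" always starts a sequence (0 bits each) and is followed by "b" or "c" with equal probability (1 bit each), which gives 2 bits over 4 units.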

Prediction performance (F1 scores)

Model                                      English  Dutch
Less-is-better                             53.06    51.87
Adaptor Grammar (collocation)              53.35    51.45
Chunk-Based Learner                        52.20    50.04
Fixation counts determined by word length  50.82    50.57
Word-by-Word reading                       38.32    38.68
Adaptor Grammar (word)                     30.10    28.95

• Q1: What takes processing priority in the language hierarchy?
• A: The global & familiar units (Yang et al. 2020a).
• Q2: How can the flexible cognitive units be learned/segmented?
• A: The Less-is-Better unsupervised model (Yang et al. 2020b).
• Q3: Can a computational model generate empirical cognitive units?
• A: Very likely, because we can predict eye fixations using the LiB model (Conditional Acceptance).

Previous research (methodology)
Toolboxes:
• Analyzing MEEG: EasyEEG (Yang et al. 2018)
• Running experiments: Expy
• Making stimuli: CharDB, VoiceGen
Denoising algorithms:
• Removing EOG noise: DeEOG
• Finding the true zero/baseline point of MEEG wave: DeTrend, TrialAlign
Improving the trial-by-trial decoding:
• by desensitizing the phase of high-frequency bands
• by Contrast Learning (ongoing work)

Editor's Notes

Word-by-Word reading assumes that the cognitive units are equal to words.
Only-Length assumes that the fixation count on a word is determined by the word length, with eye fixation knowledge.
Q1 Discussion: Cognitive units are not just words; they are flexible and reflect the larger-first principle.