Jinbiao Yang
Promotor: Prof. dr. Antal van den Bosch
Copromotor: Dr. Stefan L. Frank
Discovering the units in
language cognition:
From empirical evidence to
a computational model
What are the real language units that
we use in our daily life?
perceive, memorize, and produce...
Cognitive units
Words
Word parts
Word
combinations
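“bicycle”
? “biunique”
“紫菜” (laver, an edible seaweed)
? “紫贝” (purple shell)
“pine·wood”
? “pine·apple”
“马术” (horsemanship)
? “马桶” (toilet)
“how are you?”
? “how is Jim?”
“人工智能” (artificial intelligence)
? “人工心脏” (artificial heart)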
• What are the cognitive units?
• Which type of units are more
likely to be the cognitive units?
Experiments
The processing of the units during reading
[Diagram: a string ABCD and its parts AB, CD]
• Reading the text
• 80~120 ms: Detecting the familiar units
• 170 ms ~: Recognizing the detected units
• 230 ms ~: Integration
What are the cognitive units?
Unit Detection: Familiarity-based
• The familiar units (to the language user) tend to be the cognitive units.
Unit Recognition: Larger-first
• The larger units tend to be the cognitive units in use:
• Fewer units per sentence;
• Less effort on working memory.
What are the cognitive units?
• "what's this letter"
• "okay"
• "let's see"
• "what's this wonderful toy"
• "i don't know"
• …
If every sentence is a single unit:
• Working memory: least effort (only one unit in each sentence)
• Long-term memory: heavy load (too many units in mental lexicon)
If only symbols are units:
• Working memory: heavy load (too many units in each sentence)
• Long-term memory: least effort (only symbol units in mental lexicon)
In between:
• Working memory: less effort (fewer units in each sentence)
• Long-term memory: less effort (fewer units in mental lexicon)
Cognitive units are the units that can minimize the cognitive load.
Unsupervised learning of cognitive units
g,o,i,n,g,t,r,a,
go,in,ing,to,ra,
going,rain,
goingto
The Less-is-Better model (LiB)
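The progression above (single symbols, then "go", "ing", then "going", "goingto") can be pictured as repeatedly merging adjacent units into larger ones. The sketch below is only a stand-in: it uses a BPE-style rule (merge the most frequent adjacent pair), not the LiB model's own learning procedure, and the corpus and the number of merge steps are made up for illustration.

```python
from collections import Counter

def merge_most_frequent_pair(sequences):
    """One merge step: join the most frequent adjacent pair of units everywhere."""
    pair_counts = Counter()
    for seq in sequences:
        pair_counts.update(zip(seq, seq[1:]))
    if not pair_counts:
        return sequences, None  # nothing left to merge
    (left, right), _ = pair_counts.most_common(1)[0]
    merged = []
    for seq in sequences:
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == left and seq[i + 1] == right:
                out.append(left + right)  # the pair becomes one larger unit
                i += 2
            else:
                out.append(seq[i])
                i += 1
        merged.append(out)
    return merged, left + right

# Hypothetical toy corpus without spaces, in the spirit of the slide's example.
corpus = ["goingtorain", "goingtogo", "rainagain"]
units = [list(text) for text in corpus]  # start from single symbols
learned = []
for _ in range(6):  # arbitrary number of merge steps
    units, new_unit = merge_most_frequent_pair(units)
    if new_unit is None:
        break
    learned.append(new_unit)

print(learned)
print(["·".join(seq) for seq in units])
```

Each step adds one larger unit to the lexicon and shortens the sequences, which is the trade-off between lexicon size and units per sentence that the slides describe.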
• Unit examples:
• yeah, what, can you, that’s, ing, ly
• 的 (possessive particle), 没有 (not have), 我们 (we), 写完 (finish writing), 一个 (one), 长大了 (grown up)
• Segmentation examples:
• allright·whydon’t·we·puthimaway·now
• 这个·出口信贷·项目·委托·中国银行·为·代理·银行 (this · export credit · project · entrust · Bank of China · as · agent · bank)
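A larger-first segmentation like the examples above can be sketched as greedy longest matching against a lexicon of known units. This is a minimal illustration, assuming a hand-written lexicon (hypothetical, not learned by LiB) and a plain ASCII apostrophe; the LiB model's actual segmentation procedure may differ.

```python
def segment_larger_first(text, lexicon):
    """Greedy left-to-right segmentation that always tries the longest known unit first."""
    max_len = max(map(len, lexicon), default=1)
    units, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single symbol when no known unit matches.
            if size == 1 or candidate in lexicon:
                units.append(candidate)
                i += size
                break
    return units

# Hypothetical lexicon containing both small and large units.
lexicon = {"allright", "all", "right", "whydon't", "why", "we",
           "puthimaway", "put", "him", "away", "now"}

print("·".join(segment_larger_first("allrightwhydon'tweputhimawaynow", lexicon)))
# prints: allright·whydon't·we·puthimaway·now
```

Because the longest match wins, "puthimaway" is taken as one unit even though "put", "him", and "away" are also in the lexicon.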
LiB units = cognitive units?
Hypotheses:
1. Reading is cognitive-unit-wise
2. LiB units = cognitive units
Eye fixations
=
Centers of cognitive units
=
Centers of LiB units
Prediction F1 scores
Model                   English   Dutch
LiB-unit-wise Reading   53.1%     51.9%
Word-wise Reading       38.3%     38.7%
LiB units ≈ Cognitive units
Predict
Train
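The hypothesis "eye fixations = centers of units" can be scored with an F1 measure. The sketch below is one possible setup, not the evaluation used in the study: it assumes fixations are given as character indices and counts a prediction as correct only on an exact match. The observed fixation positions are invented for illustration.

```python
def unit_centers(units):
    """Character index of the center of each unit within the unsegmented text."""
    centers, start = [], 0
    for unit in units:
        centers.append(start + (len(unit) - 1) // 2)
        start += len(unit)
    return centers

def f1_score(predicted, observed):
    """F1 between two sets of fixated character positions (exact matches only)."""
    predicted, observed = set(predicted), set(observed)
    hits = len(predicted & observed)
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(observed)
    return 2 * precision * recall / (precision + recall)

units = ["allright", "whydon't", "we", "puthimaway", "now"]
predicted = unit_centers(units)   # one predicted fixation per unit
observed = [3, 11, 22, 30]        # made-up fixation positions
print(predicted, round(f1_score(predicted, observed), 3))
```

Swapping `units` for a word-by-word segmentation of the same text gives the word-wise baseline under the same assumptions.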
Take-Home Message
• The familiar/larger units tend to be the cognitive units;
• Reading is cognitive-unit-by-cognitive-unit.
• The LiB Model can learn cognitive units.
• Cognitive units are:
• ✘ words/morphemes/phrases.
• ✓ the units that can minimize the cognitive load.
(The need for cognitive economy)
Expanding the horizon
Cognitive units may be the better units for psycholinguistics
and NLP.
Thank you for listening!
For the experiment:
Four types of 4-character Chinese strings
• Phrase
• e.g., “希腊神话”, translation: Greek mythology.
• Random words
• e.g., “存款电脑”, translation: Deposit-computer.
• Idiom
• e.g., “以逸待劳”, translation: Wait for the exhausted enemy at your ease.
• Random characters
• e.g., “投其顾此”, a nonsense word.
[Figure: The brain signal (EEG) of reading; labels: Group, Timing; 80~120 ms]
Task: Decide whether the target is familiar or not.
[Diagram: target processing, FAST vs. SLOW]
[Figure: The brain signal (EEG) of reading; labels: Group, Timing; 170 ms ~]
[Figure: The brain signal (EEG) of reading; labels: Group, Timing; 170 ms ~ and 230 ms ~]
• Lexicon examples:
• the, yeah, you, what, wanna, can you, two, and, that’s
• 没有 (not have), 中国 (China), 我们 (we), 经济 (economy), 已经 (already), 孩子 (child), 但是 (but), 教育 (education), 可以 (can)
• Segmentation examples:
• allright·whydon’t·we·puthimaway·now
• 这个·出口信贷·项目·委托·中国银行·为·代理·银行 (this · export credit · project · entrust · Bank of China · as · agent · bank)
Performance on computational tasks
Measure                      Corpus    Symbols   BPE subwords   Words    LiB units
Minimum Description Length   BRphono   490       451            289      281
                             CTB8      21,921    18,362         16,809   16,755
2-gram surprisal             BRphono   1.539     0.695          0.677    0.548
                             CTB8      2.466     1.932          1.617    1.452
3-gram surprisal             BRphono   0.950     0.390          0.405    0.335
                             CTB8      1.404     0.827          0.806    0.626
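For readers unfamiliar with n-gram surprisal, the sketch below computes a mean 2-gram surprisal, $-\log_2 P(\text{unit} \mid \text{previous unit})$, for a segmented corpus. It is an illustration under assumptions: probabilities are maximum-likelihood estimates from the same sequences, `<s>` is a hypothetical start marker, and the normalization used for the table above is not stated in the slides, so the numbers are not comparable to it.

```python
import math
from collections import Counter

def mean_bigram_surprisal(sequences):
    """Mean surprisal in bits per unit under a bigram model estimated on the same data."""
    context_counts, bigram_counts = Counter(), Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the start of a sequence
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    total_bits = sum(
        -count * math.log2(count / context_counts[prev])
        for (prev, _), count in bigram_counts.items()
    )
    return total_bits / sum(bigram_counts.values())

# The same toy text segmented two ways (made-up data).
as_symbols = [list("goingtorain"), list("goingtogo")]
as_units = [["goingto", "rain"], ["goingto", "go"]]
print(mean_bigram_surprisal(as_symbols), mean_bigram_surprisal(as_units))
```

Per-unit averages depend on unit size, so a fair comparison across unit types needs a common normalization, for example bits per symbol.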
Prediction performance (F1 scores)
Model                                       English   Dutch
Less-is-better                              53.06     51.87
Adaptor Grammar (collocation)               53.35     51.45
Chunk-Based Learner                         52.20     50.04
Fixation counts determined by word length   50.82     50.57
Word-by-Word reading                        38.32     38.68
Adaptor Grammar (word)                      30.10     28.95
• Q1: What takes processing priority in the language hierarchy?
• A: The global & familiar units (Yang et al. 2020a).
• Q2: How to learn/segment the flexible cognitive units?
• A: The Less-is-Better unsupervised model (Yang et al. 2020b).
• Q3: Can a computational model generate empirical cognitive units?
• A: Very likely, because we can predict eye fixations using the LiB model (Conditional Acceptance).
Previous research (methodology)
Toolboxes:
• Analyzing MEEG: EasyEEG (Yang et al. 2018)
• Running experiments: Expy
• Making stimuli: CharDB, VoiceGen
Denoising algorithms:
• Removing EOG noise: DeEOG
• Finding the true zero/baseline point of MEEG wave: DeTrend, TrialAlign
Improving the trial-by-trial decoding:
• by desensitizing the phase of high-frequency bands
• by Contrast Learning (ongoing work)

Editor's Notes

  1. Word-by-Word reading assumes that the cognitive units are equal to words. Only-Length assumes that the fixation count on a word is determined by the word length (with eye fixation knowledge).
  2. Q1 Discussion: Cognitive units are not just words; they are flexible and reflect the larger-first principle.