How to supervise a PhD in NLP in the ChatGPT era?
WiMLDS
September 27th, 2023
Laure Soulier
Who am I?
Associate professor at Sorbonne University - MLIA team in the ISIR lab
Research interests:
- Information retrieval and NLP
- Deep learning, representation learning
- Language models
Supervision:
- 3 defended theses
- 6 ongoing theses
- 1 postdoctoral researcher per year
- 2-3 master's interns per year
Conversational search & neural ranking models
Data-to-text generation
Language grounding
Why this topic?
→ The ChatGPT craze
1 million users in 5 days
173 million active users in April 2023
[Chart: x-axis years 2015 to 2023; y-axis 0 to 40,000; series: "Large language models" and "Language models"]
→ Emergence of large language models
→ Things are moving faster and faster in the research community
(statistics extracted from Google Scholar)
A Survey of Large Language Models, Zhao et al, 2023
Who is this talk for?
→ Colleagues: opening up a debate
- What to expect from Ph.D. students
- How to « survive »
→ (Future) Ph.D. students
- What to expect from your advisors
- How to « survive »
→ Industrial partners
- How to collaborate with Ph.D. students during a CIFRE
- Identifying what Ph.D. students are good at
→ Curious people
- What does a thesis look like?
Outline of the talk
➜ Overview of LLMs
➜ The impact of recent advances in LLMs on NLP use cases
- Conversational search
- Data-to-text generation
This talk is based on my own experience and does not commit my colleagues.
You might have different opinions or different experiences.
Feel free to share them in the Q&A session or during the cocktail reception!
(Large) Language Models
Given a sequence of items $w_1, w_2, \dots, w_{t-1}$, what is the probability of the next item $w_t$?
$P(w_t \mid w_1, w_2, \dots, w_{t-1})$
"A salad is composed of" → (Large) language model →
- Lettuce: probability 0.9
- Tomatoes: probability 0.85
- Corn: probability 0.6
- Ice cream: probability 0.001
- ...
Principle:
- Modeling the probability of sequences $w_1, w_2, \dots, w_T$
- Items may be words, characters, character n-grams, word pieces, etc.
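The next-item probability can be illustrated with a minimal sketch, assuming a toy bigram model that conditions only on the previous item (an LLM conditions on the whole context with a neural network). The corpus and the function names `train_bigram` and `next_item_probs` are hypothetical and not from the talk.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each item follows each previous item."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def next_item_probs(counts, prev):
    """Estimate P(next | prev) by relative frequency."""
    total = sum(counts[prev].values())
    if total == 0:
        return {}
    return {item: c / total for item, c in counts[prev].items()}

# Toy corpus (assumption, for illustration only)
corpus = [
    "a salad is composed of lettuce",
    "a salad is composed of tomatoes",
    "a salad is composed of lettuce",
    "a salad is composed of corn",
]
model = train_bigram(corpus)
probs = next_item_probs(model, "of")
for item, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{item}: {p:.2f}")
```

Unlike the illustrative numbers on the slide, these estimates form a proper distribution over the next item and sum to 1.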
Semantics, word representation and latent space
[Figure: Salad, Lettuce, Tomatoes, Corn and Ice cream plotted as points in the latent space]
Salad = (0.3, 0.2, 0.45, -0.1, -0.3)
Lettuce = (0.2, 0.1, 0.38, -0.5, -0.4)
…
Ice cream = (-0.9, -0.3, -0.5, 0.8, 0.7)
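One way to read such vectors is to compare them with a similarity measure. The sketch below assumes cosine similarity, which the slide does not name, and reuses the three example vectors; the function name `cosine` is hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Example vectors taken from the slide
salad = (0.3, 0.2, 0.45, -0.1, -0.3)
lettuce = (0.2, 0.1, 0.38, -0.5, -0.4)
ice_cream = (-0.9, -0.3, -0.5, 0.8, 0.7)

sim_lettuce = cosine(salad, lettuce)
sim_ice_cream = cosine(salad, ice_cream)
print(f"salad vs lettuce:   {sim_lettuce:.2f}")
print(f"salad vs ice cream: {sim_ice_cream:.2f}")
```

With these vectors, salad is much closer to lettuce than to ice cream, which matches the intuition that related words sit near each other in the latent space.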
(Large) Language Models
Transformer networks (2017)
An encoder-decoder neural network with:
- About 65M parameters
- Successive feed-forward blocks
- Parallel heads
… that estimates contextual representations of items with self-attention
Example: distinguishing Washington (the city) from Washington (the man)
(Vaswani et al., 2017)
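Self-attention can be sketched in a few lines. This is a minimal single-head version in plain Python, assuming identity projections (queries, keys and values are the input vectors themselves); a real Transformer learns separate projection matrices and runs several heads in parallel. The function names and the toy input `X` are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X):
    """Scaled dot-product self-attention with Q = K = V = X (simplifying assumption)."""
    d = len(X[0])
    outputs, weights = [], []
    for q in X:
        # Similarity of this item to every item in the sequence
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in X]
        w = softmax(scores)
        weights.append(w)
        # Contextual representation: attention-weighted average of all items
        outputs.append([sum(wj * v[i] for wj, v in zip(w, X)) for i in range(d)])
    return outputs, weights

# Toy sequence of three 2-dimensional item vectors (assumption)
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(X)
```

Each output vector mixes information from the whole sequence, which is what lets the same word receive different representations in different contexts.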
Large Language Models: interesting properties
➜ Scaling law
Larger language models reach a good performance level earlier than smaller language models:
- fewer optimization steps
- fewer data points
(Kaplan et al., 2020)
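As background, scaling laws of this kind are usually written as power laws. The form below is a sketch of the general shape, where $L$ is the test loss, $N$ the number of parameters and $D$ the dataset size; the constants $N_c$, $D_c$, $\alpha_N$ and $\alpha_D$ are fitted empirically and are not given on the slide.

```latex
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N},
\qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}
```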
© https://aibusiness.com/companies/nvidia-and-microsoft-build-the-world-s-largest-530bn-parameter-language-model
➜ Emerging properties
Large Language Models: interesting properties
➜ Prompting
➜ Prompt: an instruction explicitly expressing what is expected
➜ Challenge: writing a good prompt (task, context, expected output, …)
➜ Implication: everything is generation
(From Thomas Gerald, 2023)
Prompt: Translate this sentence into French: « the sun shines »
Output: Le soleil brille
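A prompt built from the three ingredients listed above (task, context, expected output) can be sketched as a simple template. The function name `build_prompt` and the field labels are hypothetical; no particular model or API is assumed.

```python
def build_prompt(task, context="", expected_output=""):
    """Assemble a prompt from a task, optional context and expected output format."""
    parts = [task]
    if context:
        parts.append(f"Context: {context}")
    if expected_output:
        parts.append(f"Expected output: {expected_output}")
    return "\n".join(parts)

prompt = build_prompt(
    task="Translate this sentence into French: « the sun shines »",
    expected_output="a single French sentence",
)
print(prompt)
```

The resulting string is what would be sent to the language model, which then generates the answer as a continuation.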
Large Language Models: interesting properties
➜ In-context learning
• Learning from examples mentioned in the prompt
• Without fine-tuning of the model
Multimodal few-shot learning with frozen language models, Tsimpoukelli et al. 2021
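In-context learning amounts to placing a few solved examples in the prompt before the new input. The sketch below shows one possible text layout; the `Input:`/`Output:` labels, the function name `few_shot_prompt` and the example pairs are assumptions, and the model itself stays frozen.

```python
def few_shot_prompt(instruction, examples, query):
    """Build a prompt with solved examples followed by the new input to complete."""
    blocks = [instruction]
    for source, target in examples:
        blocks.append(f"Input: {source}\nOutput: {target}")
    # The final block is left open so the model generates the missing output
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)

examples = [
    ("the sun shines", "le soleil brille"),
    ("the cat sleeps", "le chat dort"),
]
prompt = few_shot_prompt("Translate English to French.", examples, "the dog runs")
print(prompt)
```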
Large Language Models: interesting properties
1. Language model: general knowledge
2. Adaptation to a new task with fine-tuning
[Diagram, pretraining: massive corpus → text → encoder / decoder → words & text representations; tasks: word prediction, sentence completion, ...; example: "It's raining MASK and PRED" (cat, dog); annotation: "= 3% of the corpus"]
[Diagram, fine-tuning: pretrained language model + your (small) data + expected target → adapted (fine-tuned) language model]
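Masked-word prediction, one of the pretraining tasks hinted at by the "MASK" example, can be sketched as follows: some items are hidden and the model is trained to recover them. The function name `mask_tokens`, the sentence and the chosen positions are assumptions for illustration.

```python
def mask_tokens(tokens, positions, mask_token="MASK"):
    """Hide the tokens at the given positions and return the targets to predict."""
    masked = [mask_token if i in positions else tok for i, tok in enumerate(tokens)]
    targets = {i: tokens[i] for i in positions}
    return masked, targets

tokens = "it's raining cats and dogs".split()
masked, targets = mask_tokens(tokens, positions={2, 4})
print(" ".join(masked))
print(targets)
```

During pretraining the model sees only the masked sequence and is optimized to predict the hidden items; fine-tuning then continues training on a small task-specific dataset.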
The impact of LLMs on research
Use case on conversational search
Introduction
Objectives [Radlinski and Craswell 2017, Culpepper et al. 2018]
→ Replacing or augmenting IR systems to perform search sessions in natural language
Use case on conversational search
Initial definition of the research project
→ Understanding users’ information needs
What we need:
- Capturing the semantics of words
- Leveraging the conversation context
What current LLMs do:
- Word representations
- Prompting*
→ Retrieving documents according to the conversation context
What we need:
- Matching contextual information needs with documents
- Leveraging users’ feedback
What current LLMs do:
- Word representations
- Neural ranking models
→ Generating a response according to the retrieved documents
What we need:
- Synthesizing document content into a structured response
What current LLMs do:
- Text generation
- Prompting*
2017: Pierre Erbacher’s thesis
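The three steps (understand the need, retrieve, respond) can be sketched as a toy pipeline. This sketch swaps in plain term overlap for the neural ranking models mentioned on the slide and returns the top document instead of generating text; the documents, turns and all function names are hypothetical.

```python
def contextual_query(turns):
    """Naive context handling: merge all user turns into one bag of query terms."""
    return " ".join(turns).lower().split()

def rank(documents, query_terms):
    """Order documents by the number of distinct query terms they contain."""
    query = set(query_terms)
    scored = []
    for doc_id, text in documents.items():
        overlap = len(set(text.lower().split()) & query)
        scored.append((overlap, doc_id))
    return [doc_id for overlap, doc_id in sorted(scored, reverse=True) if overlap > 0]

def respond(documents, ranking):
    """Return the best document, or ask for clarification when nothing matches."""
    return documents[ranking[0]] if ranking else "Could you clarify your request?"

documents = {
    "d1": "Paris is the capital of France",
    "d2": "The transformer is a neural network architecture",
    "d3": "Lettuce and tomatoes are common in a salad",
}
turns = ["Tell me about France", "What is its capital"]
ranking = rank(documents, contextual_query(turns))
answer = respond(documents, ranking)
print(ranking, "->", answer)
```

The second turn alone ("What is its capital") does not mention France; merging the turns is the simplest way to show why the conversation context matters for retrieval.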
Use case on conversational search
Proactive information systems with clarifying questions
First strategy: thinking about the next step (2018-2019)
Contributions
→ Multi-turn clarification framework and analysis of its impact on retrieval effectiveness [Erbacher et al., SIGIR 2021]
→ What existed:
- Small human-annotated datasets
- Single-turn interaction datasets
Except that…
Use case on conversational search
How to react? Which strategy?
- Stop your thesis? Change thesis subject?
- Change task?
- Design an open-source model, since GPT-3 and ChatGPT are not open source?
- … What else?
Use case on conversational search
Second strategy: leveraging existing models (2023)
→ Generating new conversational search sessions using IR datasets
- LLM with the prompt « Query: q Facet: f », fine-tuned to generate clarifying questions
- LLM with the prompt « Query: q Intent: i Question: cq », fine-tuned to generate the user’s yes/no answer
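The two prompt formats can be written as small template functions. Only the prompt strings follow the slide; the function names and the example query, facet, intent and question are hypothetical, and the fine-tuned models themselves are not shown.

```python
def clarifying_question_prompt(query, facet):
    """Prompt for the model fine-tuned to generate a clarifying question."""
    return f"Query: {query} Facet: {facet}"

def user_answer_prompt(query, intent, question):
    """Prompt for the model fine-tuned to simulate the user's yes/no answer."""
    return f"Query: {query} Intent: {intent} Question: {question}"

# Hypothetical example values
query = "jaguar"
question_prompt = clarifying_question_prompt(query, facet="car")
answer_prompt = user_answer_prompt(
    query,
    intent="animal",
    question="Are you looking for the car brand?",
)
print(question_prompt)
print(answer_prompt)
```

Chaining the two models turn by turn over the queries of an IR dataset is one way such prompts could produce synthetic multi-turn sessions.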
Use case on conversational search
Second strategy: leveraging existing models (2023)
→ Beyond Toolformer: teaching an LLM when to search
Toolformer: Language Models Can Teach Themselves to Use Tools, Schick et al, 2023
Our approach
(Erbacher et al – under submission)
Toolformer
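The general idea of inline tool calls can be sketched as follows: the model emits a marker in its text, and a wrapper executes the call and inserts the result. This is a toy illustration in the spirit of Toolformer, not the approach under submission; the `Search` tool name, the marker syntax and the tiny index are all assumptions.

```python
import re

# Hypothetical marker syntax for a search call embedded in generated text
CALL_PATTERN = re.compile(r'\[Search\("([^"]*)"\)\]')

def execute_calls(text, search):
    """Replace each search marker with the marker followed by its result."""
    def replace(match):
        query = match.group(1)
        return f'[Search("{query}") → {search(query)}]'
    return CALL_PATTERN.sub(replace, text)

def toy_search(query):
    """Stand-in for a retrieval system (assumption)."""
    index = {"capital of France": "Paris"}
    return index.get(query, "no result")

generated = 'The capital of France is [Search("capital of France")] Paris.'
augmented = execute_calls(generated, toy_search)
print(augmented)
```

Learning when to emit such a marker, rather than how to execute it, is the part that requires training the language model.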
Conclusion
Conclusion - Discussion
What has it changed in a thesis?
→ Huge competition
→ Big actors, huge number of (un)submitted papers
→ Big GPU clusters (but we have Jean Zay!)
→ Collaborative projects between Ph.D. students (and advisors)
→ Faster reaction to the literature
→ More experiments
→ Not a 3-year project anymore
→ Adapt the research project to ongoing innovations
Conclusion - Discussion
Wrap-up for future and current Ph.D. students
→ Don’t be afraid!
→ You are not the only one facing the tornado
→ No pressure: you don't have to create version 10 of the transformer
→ It is always possible to find a good idea
→ You are learning valuable knowledge and skills
→ It might be difficult to design effective models
→ You are learning a methodology
→ You are accumulating knowledge on the best LLMs
→ Be passionate!
Thank you for your attention
@LaureSoulier
laure-soulier-18829948
https://pages.isir.upmc.fr/soulier/

More Related Content

Similar to How to supervise a thesis in NLP in the ChatGPT era? By Laure Soulier

Gadgets pwn us? A pattern language for CALL
Gadgets pwn us? A pattern language for CALLGadgets pwn us? A pattern language for CALL
Gadgets pwn us? A pattern language for CALLLawrie Hunter
 
Big Data and Natural Language Processing
Big Data and Natural Language ProcessingBig Data and Natural Language Processing
Big Data and Natural Language ProcessingMichel Bruley
 
2-Chapter Two-N-gram Language Models.ppt
2-Chapter Two-N-gram Language Models.ppt2-Chapter Two-N-gram Language Models.ppt
2-Chapter Two-N-gram Language Models.pptmilkesa13
 
Landscape of AI/ML in 2023
Landscape of AI/ML in 2023Landscape of AI/ML in 2023
Landscape of AI/ML in 2023HyunJoon Jung
 
Survey on Common Strategies of Vocabulary Reuse in Linked Open Data Modeling ...
Survey on Common Strategies of Vocabulary Reuse in Linked Open Data Modeling ...Survey on Common Strategies of Vocabulary Reuse in Linked Open Data Modeling ...
Survey on Common Strategies of Vocabulary Reuse in Linked Open Data Modeling ...JohannWanja
 
Big Data: the weakest link
Big Data: the weakest linkBig Data: the weakest link
Big Data: the weakest linkCS, NcState
 
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...ssuser4edc93
 
Analyzing Big Data's Weakest Link (hint: it might be you)
Analyzing Big Data's Weakest Link  (hint: it might be you)Analyzing Big Data's Weakest Link  (hint: it might be you)
Analyzing Big Data's Weakest Link (hint: it might be you)HPCC Systems
 
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language ModelsDataScienceConferenc1
 
The I in PRIMM - Code Comprehension and Questioning
The I in PRIMM - Code Comprehension and QuestioningThe I in PRIMM - Code Comprehension and Questioning
The I in PRIMM - Code Comprehension and QuestioningSue Sentance
 
1066_multitask_prompted_training_en.pdf
1066_multitask_prompted_training_en.pdf1066_multitask_prompted_training_en.pdf
1066_multitask_prompted_training_en.pdfssusere320ca
 
Software Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceSoftware Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceCarole Goble
 
Open domain Question Answering System - Research project in NLP
Open domain  Question Answering System - Research project in NLPOpen domain  Question Answering System - Research project in NLP
Open domain Question Answering System - Research project in NLPGVS Chaitanya
 
Schema-agnositc queries over large-schema databases: a distributional semanti...
Schema-agnositc queries over large-schema databases: a distributional semanti...Schema-agnositc queries over large-schema databases: a distributional semanti...
Schema-agnositc queries over large-schema databases: a distributional semanti...Andre Freitas
 
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-ProcessingAn-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-ProcessingTheodore J. LaGrow
 
Using construction grammar in conversational systems
Using construction grammar in conversational systemsUsing construction grammar in conversational systems
Using construction grammar in conversational systemsCJ Jenkins
 

Similar to How to supervise a thesis in NLP in the ChatGPT era? By Laure Soulier (20)

ChatGPT PPT
ChatGPT PPTChatGPT PPT
ChatGPT PPT
 
Gadgets pwn us? A pattern language for CALL
Gadgets pwn us? A pattern language for CALLGadgets pwn us? A pattern language for CALL
Gadgets pwn us? A pattern language for CALL
 
Big Data and Natural Language Processing
Big Data and Natural Language ProcessingBig Data and Natural Language Processing
Big Data and Natural Language Processing
 
2-Chapter Two-N-gram Language Models.ppt
2-Chapter Two-N-gram Language Models.ppt2-Chapter Two-N-gram Language Models.ppt
2-Chapter Two-N-gram Language Models.ppt
 
Landscape of AI/ML in 2023
Landscape of AI/ML in 2023Landscape of AI/ML in 2023
Landscape of AI/ML in 2023
 
Survey on Common Strategies of Vocabulary Reuse in Linked Open Data Modeling ...
Survey on Common Strategies of Vocabulary Reuse in Linked Open Data Modeling ...Survey on Common Strategies of Vocabulary Reuse in Linked Open Data Modeling ...
Survey on Common Strategies of Vocabulary Reuse in Linked Open Data Modeling ...
 
Big Data: the weakest link
Big Data: the weakest linkBig Data: the weakest link
Big Data: the weakest link
 
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
 
Analyzing Big Data's Weakest Link (hint: it might be you)
Analyzing Big Data's Weakest Link  (hint: it might be you)Analyzing Big Data's Weakest Link  (hint: it might be you)
Analyzing Big Data's Weakest Link (hint: it might be you)
 
ESWC 2014 Tutorial part 3
ESWC 2014 Tutorial part 3ESWC 2014 Tutorial part 3
ESWC 2014 Tutorial part 3
 
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
 
The I in PRIMM - Code Comprehension and Questioning
The I in PRIMM - Code Comprehension and QuestioningThe I in PRIMM - Code Comprehension and Questioning
The I in PRIMM - Code Comprehension and Questioning
 
1066_multitask_prompted_training_en.pdf
1066_multitask_prompted_training_en.pdf1066_multitask_prompted_training_en.pdf
1066_multitask_prompted_training_en.pdf
 
Combinatorial Optimisation with Policy Adaptation using latent Space Search, ...
Combinatorial Optimisation with Policy Adaptation using latent Space Search, ...Combinatorial Optimisation with Policy Adaptation using latent Space Search, ...
Combinatorial Optimisation with Policy Adaptation using latent Space Search, ...
 
Software Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceSoftware Sustainability: Better Software Better Science
Software Sustainability: Better Software Better Science
 
Open domain Question Answering System - Research project in NLP
Open domain  Question Answering System - Research project in NLPOpen domain  Question Answering System - Research project in NLP
Open domain Question Answering System - Research project in NLP
 
LuisValeroInterests
LuisValeroInterestsLuisValeroInterests
LuisValeroInterests
 
Schema-agnositc queries over large-schema databases: a distributional semanti...
Schema-agnositc queries over large-schema databases: a distributional semanti...Schema-agnositc queries over large-schema databases: a distributional semanti...
Schema-agnositc queries over large-schema databases: a distributional semanti...
 
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-ProcessingAn-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
 
Using construction grammar in conversational systems
Using construction grammar in conversational systemsUsing construction grammar in conversational systems
Using construction grammar in conversational systems
 

More from Paris Women in Machine Learning and Data Science

More from Paris Women in Machine Learning and Data Science (20)

Managing international tech teams, by Natasha Dimban
Managing international tech teams, by Natasha DimbanManaging international tech teams, by Natasha Dimban
Managing international tech teams, by Natasha Dimban
 
Optimizing GenAI apps, by N. El Mawass and Maria Knorps
Optimizing GenAI apps, by N. El Mawass and Maria KnorpsOptimizing GenAI apps, by N. El Mawass and Maria Knorps
Optimizing GenAI apps, by N. El Mawass and Maria Knorps
 
Perspectives, by M. Pannegeon
Perspectives, by M. PannegeonPerspectives, by M. Pannegeon
Perspectives, by M. Pannegeon
 
Evaluation strategies for dealing with partially labelled or unlabelled data
Evaluation strategies for dealing with partially labelled or unlabelled dataEvaluation strategies for dealing with partially labelled or unlabelled data
Evaluation strategies for dealing with partially labelled or unlabelled data
 
An age-old question, by Caroline Jean-Pierre
An age-old question, by Caroline Jean-PierreAn age-old question, by Caroline Jean-Pierre
An age-old question, by Caroline Jean-Pierre
 
Applying Churn Prediction Approaches to the Telecom Industry, by Joëlle Lautré
Applying Churn Prediction Approaches to the Telecom Industry, by Joëlle LautréApplying Churn Prediction Approaches to the Telecom Industry, by Joëlle Lautré
Applying Churn Prediction Approaches to the Telecom Industry, by Joëlle Lautré
 
Global Ambitions Local Realities, by Anna Abreu
Global Ambitions Local Realities, by Anna AbreuGlobal Ambitions Local Realities, by Anna Abreu
Global Ambitions Local Realities, by Anna Abreu
 
Plug-and-Play methods for inverse problems in imagine, by Julie Delon
Plug-and-Play methods for inverse problems in imagine, by Julie DelonPlug-and-Play methods for inverse problems in imagine, by Julie Delon
Plug-and-Play methods for inverse problems in imagine, by Julie Delon
 
Sales Forecasting as a Data Product by Francesca Iannuzzi
Sales Forecasting as a Data Product by Francesca IannuzziSales Forecasting as a Data Product by Francesca Iannuzzi
Sales Forecasting as a Data Product by Francesca Iannuzzi
 
Identifying and mitigating bias in machine learning, by Ruta Binkyte
Identifying and mitigating bias in machine learning, by Ruta BinkyteIdentifying and mitigating bias in machine learning, by Ruta Binkyte
Identifying and mitigating bias in machine learning, by Ruta Binkyte
 
“Turning your ML algorithms into full web apps in no time with Python" by Mar...
“Turning your ML algorithms into full web apps in no time with Python" by Mar...“Turning your ML algorithms into full web apps in no time with Python" by Mar...
“Turning your ML algorithms into full web apps in no time with Python" by Mar...
 
Nature Language Processing for proteins by Amélie Héliou, Software Engineer @...
Nature Language Processing for proteins by Amélie Héliou, Software Engineer @...Nature Language Processing for proteins by Amélie Héliou, Software Engineer @...
Nature Language Processing for proteins by Amélie Héliou, Software Engineer @...
 
Sandrine Henry presents the BechdelAI project
Sandrine Henry presents the BechdelAI projectSandrine Henry presents the BechdelAI project
Sandrine Henry presents the BechdelAI project
 
Anastasiia Tryputen_War in Ukraine or how extraordinary courage reshapes geop...
Anastasiia Tryputen_War in Ukraine or how extraordinary courage reshapes geop...Anastasiia Tryputen_War in Ukraine or how extraordinary courage reshapes geop...
Anastasiia Tryputen_War in Ukraine or how extraordinary courage reshapes geop...
 
Khrystyna Grynko WiMLDS - From marketing to Tech.pdf
Khrystyna Grynko WiMLDS - From marketing to Tech.pdfKhrystyna Grynko WiMLDS - From marketing to Tech.pdf
Khrystyna Grynko WiMLDS - From marketing to Tech.pdf
 
Iana Iatsun_ML in production_20Dec2022.pdf
Iana Iatsun_ML in production_20Dec2022.pdfIana Iatsun_ML in production_20Dec2022.pdf
Iana Iatsun_ML in production_20Dec2022.pdf
 
41 WiMLDS Kyiv Paris Poznan.pdf
41 WiMLDS Kyiv Paris Poznan.pdf41 WiMLDS Kyiv Paris Poznan.pdf
41 WiMLDS Kyiv Paris Poznan.pdf
 
Emergency plan to secure winter: what are the measures set up by RTE?
Emergency plan to secure winter: what are the measures set up by RTE?Emergency plan to secure winter: what are the measures set up by RTE?
Emergency plan to secure winter: what are the measures set up by RTE?
 
New edge prediction and anomaly-detection in large computer networks
New edge prediction and anomaly-detection in large computer networksNew edge prediction and anomaly-detection in large computer networks
New edge prediction and anomaly-detection in large computer networks
 
transformers_multimodal_ehr.pdf
transformers_multimodal_ehr.pdftransformers_multimodal_ehr.pdf
transformers_multimodal_ehr.pdf
 

Recently uploaded

DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage examplePragyanshuParadkar1
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfROCENODodongVILLACER
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 

Recently uploaded (20)

DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage example
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 

How to supervise a thesis in NLP in the ChatGPT era? By Laure Soulier

  • 1. How to supervise a PhD in NLP in the ChatGPT area? WiMLDS September 27th, 2023 Laure Soulier
  • 2. Who I am? 2 Associate professor at Sorbonne University - MLIA team in the ISIR lab Research interests: - Information retrieval and NLP - Deep learning, representation learning - Language models Supervision: - 3 defended theses - 6 on-going theses - 1 postdoctoral researcher /year - 2-3 master intern students /year Conversational search & neural ranking models Data-to-text generation Language grounding 2
  • 3. Why this topic? à The ChatGPT craze 3 1 million users in 5 days 173 million active users in April 2023 0 5000 10000 15000 20000 25000 30000 35000 40000 2 0 1 5 2 0 1 6 2 0 1 7 2 0 1 8 2 0 1 9 2 0 2 0 2 0 2 1 2 0 2 2 2 0 2 3 Large language models Language models à Emergence of large language models àThings are moving faster and faster in the research community (statistics extracted from google scholar) A Survey of Large Language Models, Zhao et al, 2023
  • 4. For who is this talk? à Colleagues: opening up a debate - What to expect from Ph.D. students - How to « survive » 4
  • 5. For who is this talk? à (Future) PhD sutdents - What to expect from your advisors - How to « survive » 5
  • 6. For who is this talk? à Industrial partners - How to collaborate with Ph.D. students during a CIFRE - Indentifying what Ph.D. are good at 6
  • 7. For who is this talk? à Curious people - What does a thesis look like? 7
  • 8. Outline of the talk ➜ Overview of LLM ➜ The impact of recent advances of LLM on NLP use cases 8 This talk is built on the basis of my own experience and does not engage colleagues. You might have different opinions or different experiences. Feel free to share them in the Q&A sessions or during the cocktail! Conversational search Data-to-text generation
  • 9. (Large) Language Models Given a sequence of items !!, !", … , !#$!, what is the probability of the next item !#? $ !# !!, !", … , !#$!) A salad is composed of (Large) Language model Lettuce Probability: 0.9 Tomatoes Probability: 0.85 Corn Probability: 0.6 Ice cream Probability: 0.001 . . . Principle: - Modeling the probability of sequences !!, !", … , !_' - Items may be words, characters, character ngrams, word pieces, etc Semantics, word representation and latent space Salad Lettuce Tomatoes Ice cream Corn Salad = (0.3, 0.2, 0.45, -0.1, -0.3) Lettuce = (0.2, 0.1, 0.38, -0.5, -0.4) … Ice cream = (-0.9, -0.3, -0.5, 0.8, 0.7) 9
  • 10. (Large) Language Models Transformer networks (2017): an encoder-decoder neural network with: - About 65M parameters - Successive feed-forward blocks - Parallel heads … that estimates contextual representations of items with self-attention, e.g. distinguishing Washington/city from Washington/man (Vaswani et al., 2017)
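As a rough sketch of why self-attention gives contextual representations, the code below applies single-head scaled dot-product attention to toy 2-dimensional vectors. It is a simplification under stated assumptions: queries, keys and values are the raw inputs (a real Transformer learns separate projections and uses several heads), and the vectors for "washington", "city" and "man" are invented for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head self-attention with Q = K = V = input (no learned weights)."""
    d = len(vectors[0])
    outputs, weights = [], []
    for q in vectors:
        # Scaled dot product between this item and every item in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(w_j * v[i] for w_j, v in zip(w, vectors)) for i in range(d)])
    return outputs, weights


# Hypothetical toy vectors: the same "washington" input in two contexts.
washington, city, man = [1.0, 1.0], [1.0, 0.0], [0.0, 1.0]

out_city, w_city = self_attention([washington, city])
out_man, w_man = self_attention([washington, man])
```

The input vector for "washington" is identical in both calls, but `out_city[0]` and `out_man[0]` differ because each output mixes in the neighbouring item.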
  • 11. Large Language Models: interesting properties ➜ Scaling law: larger language models reach a good performance level earlier than small language models: - fewer optimization steps - fewer data points (Kaplan et al., 2020) © https://aibusiness.com/companies/nvidia-and-microsoft-build-the-world-s-largest-530bn-parameter-language-model ➜ Emergent properties
  • 12. Large Language Models: interesting properties ➜ Prompting ➜ Prompt: instruction explicitly expressing what is expected ➜ Challenge: writing a good prompt (task, context, expected output …) ➜ Implication: everything is generation. Example (from Thomas Gerald, 2023): Translate this sentence into French: « the sun shines » Output: Le soleil brille
  • 13. Large Language Models: interesting properties ➜ In-context learning - Learning from examples mentioned in the prompt - Without fine-tuning of the model (Multimodal Few-Shot Learning with Frozen Language Models, Tsimpoukelli et al., 2021)
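In-context learning amounts to placing worked examples inside the prompt itself. The sketch below only assembles such a prompt as a string; the function name, the "Input:/Output:" layout and the second demonstration are assumptions made for illustration, and no model is called.

```python
def few_shot_prompt(instruction, demonstrations, query):
    """Build a prompt made of an instruction, solved examples, and a new input."""
    lines = [instruction]
    for source, target in demonstrations:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
    # The final "Output:" is left empty for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = few_shot_prompt(
    "Translate each sentence into French.",
    [
        ("the sun shines", "Le soleil brille"),
        ("the cat sleeps", "Le chat dort"),  # hypothetical extra demonstration
    ],
    "the dog runs",
)
```

The model's weights stay untouched: everything it "learns" about the task comes from the demonstrations in `prompt`.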
  • 14. Large Language Models: interesting properties 1. Language model: general knowledge. Pretraining on a massive text corpus (word prediction, sentence completion, ...; e.g. "It's raining MASK and PRED") gives a pretrained language model (encoder, decoder) producing word and text representations (e.g. cat, dog) 2. Adaptation to a new task with fine-tuning: pretrained language model + your (small) data and expected target → adapted (fine-tuned) language model [Figure annotation: "= 3% of the corpus"]
  • 15. The impact of LLM on research
  • 16. Use case on conversational search Introduction → Replacing or augmenting IR systems to perform search sessions in natural language. Objectives [Radlinski and Craswell 2017, Culpepper et al. 2018]
  • 17. Use case on conversational search Pierre Erbacher’s thesis: initial definition of the research project (2017). → Understanding users’ information need [What current LLMs do / What we need]: - Capturing the semantics of words - Leveraging the conversation context - Word representations - Prompting* → Retrieving documents according to the conversation context [What current LLMs do / What we need]: - Matching contextual information needs with documents - Leveraging users’ feedback - Word representations - Neural ranking models → Generating a response according to the retrieved documents [What current LLMs do / What we need]: - Synthesizing document content into a structured response - Text generation - Prompting*
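The three steps of this slide (understanding the need, retrieving in context, generating a response) can be sketched as a toy pipeline. This is not the method of the thesis: the term-overlap ranking stands in for a neural ranking model, the template response stands in for text generation, and all function names and documents are hypothetical.

```python
def contextualize(history, utterance):
    """Step 1: build a query from the conversation history and the last turn."""
    return " ".join(history + [utterance]).lower()


def retrieve(query, documents, k=1):
    """Step 2: rank documents by the number of terms shared with the query."""
    query_terms = set(query.split())
    scored = [(len(query_terms & set(doc.lower().split())), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def respond(retrieved):
    """Step 3: placeholder for generation, falls back to a clarifying question."""
    if not retrieved:
        return "Could you clarify what you are looking for?"
    return "According to the retrieved documents: " + " ".join(retrieved)


documents = [
    "The Louvre is a museum in Paris",
    "Transformers are neural networks",
    "Salad is composed of lettuce",
]
query = contextualize(["I am planning a trip to Paris"], "which museums should I visit")
top = retrieve(query, documents)
answer = respond(top)
```

The last turn alone ("which museums should I visit") does not mention Paris; the match with the Louvre document comes from the conversation context.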
  • 18. Use case on conversational search Proactive information systems with clarifying questions. First strategy: thinking about the next step (2018-2019). → What existed: - Small human-annotated datasets - Single-turn interaction datasets. Contributions → Multi-turn clarification framework and analysis of its impact on retrieval effectiveness [Erbacher et al., SIGIR 2021]. Except that…
  • 19. Use case on conversational search How to react? Which strategy? - Stop your thesis? Change thesis subject? - Change task? - Since GPT-3 and ChatGPT are not open source, design an open-source model? - … What else?
  • 20. Use case on conversational search Second strategy: leveraging existing models (2023) → Generating new conversational search sessions using IR datasets: - an LLM with the prompt « Query: q Facet: f », fine-tuned to generate clarifying questions - an LLM with the prompt « Query: q Intent: i Question: cq », fine-tuned to generate the user’s yes/no answer
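The two prompt formats on this slide can be turned into fine-tuning examples with a few lines of code. The sketch below only builds input/target pairs; the helper names, the dictionary layout and the sample query are assumptions for illustration, and the actual training setup is not described on the slide.

```python
def clarifying_question_example(query, facet, question):
    """Training pair for the model that generates clarifying questions."""
    return {"input": f"Query: {query} Facet: {facet}", "target": question}


def user_answer_example(query, intent, question, answer):
    """Training pair for the model that simulates the user's yes/no answer."""
    if answer not in ("yes", "no"):
        raise ValueError("answer must be 'yes' or 'no'")
    return {
        "input": f"Query: {query} Intent: {intent} Question: {question}",
        "target": answer,
    }


# Hypothetical sample, invented for this example.
cq = "Are you looking for dinosaur fossils?"
ex_question = clarifying_question_example("dinosaurs", "dinosaur fossils", cq)
ex_answer = user_answer_example("dinosaurs", "dinosaur movies", cq, "no")
```

Chaining the two models (ask a question, simulate the answer, repeat) is one way such pairs could yield multi-turn sessions from a single-turn IR dataset.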
  • 21. Use case on conversational search Second strategy: leveraging existing models (2023) → Beyond Toolformer: teaching an LLM when to search. Toolformer (Toolformer: Language Models Can Teach Themselves to Use Tools, Schick et al., 2023) vs. our approach (Erbacher et al., under submission)
  • 23. Conclusion - Discussion What has changed in a thesis? → Huge competition → Big actors, huge number of (un)submitted papers → Big GPU clusters (but we have Jean Zay!!!!) → Collaborative projects between Ph.D. students (and advisors) → Faster reaction to the literature → More experiments → Not a 3-year project anymore → Adapt the research project to ongoing innovations
  • 24. Conclusion - Discussion Wrap-up for future and current Ph.D. students → Don’t be afraid! → You are not the only one facing the tornado → No pressure: you don't have to create version 10 of the transformer → It is always possible to find a good idea → You are learning valuable knowledge and skills → Might be difficult to design effective models → You are learning a methodology → You are accumulating knowledge on the best LLMs → Be passionate!
  • 25. Thank you for your attention @LaureSoulier laure-soulier-18829948 https://pages.isir.upmc.fr/soulier/