How to supervise a PhD in NLP
in the ChatGPT era?
WiMLDS
September 27th, 2023
Laure Soulier
Who am I?
Associate professor at Sorbonne University - MLIA team in the ISIR lab
Research interests:
- Information retrieval and NLP
- Deep learning, representation learning
- Language models
Supervision:
- 3 defended theses
- 6 on-going theses
- 1 postdoctoral researcher /year
- 2-3 master intern students /year
Conversational search
& neural ranking models
Data-to-text generation
Language grounding
Why this topic?
→ The ChatGPT craze
1 million users in 5 days
173 million active users in April 2023
[Figure: yearly counts from 2015 to 2023 (y-axis: 0 to 40,000) for two series, "Language models" and "Large language models"]
→ Emergence of large language models
→ Things are moving faster and faster in the research community
(statistics extracted from Google Scholar)
A Survey of Large Language Models, Zhao et al, 2023
Who is this talk for?
→ Colleagues: opening up a debate
- What to expect from Ph.D. students
- How to « survive »
→ (Future) PhD students
- What to expect from your advisors
- How to « survive »
→ Industrial partners
- How to collaborate with Ph.D. students during a CIFRE (industry-sponsored Ph.D. contract)
- Identifying what Ph.D. students are good at
→ Curious people
- What does a thesis look like?
Outline of the talk
➜ Overview of LLMs
➜ The impact of recent advances in LLMs on NLP use cases
This talk is based on my own experience and does not commit my colleagues.
You might have different opinions or different experiences.
Feel free to share them in the Q&A sessions or during the cocktail!
Conversational search
Data-to-text generation
(Large) Language Models
Given a sequence of items $w_1, w_2, \dots, w_{n-1}$, what is the probability of the next item $w_n$?
$P(w_n \mid w_1, w_2, \dots, w_{n-1})$
Input: "A salad is composed of" → (Large) language model → candidate next items:
- Lettuce: probability 0.9
- Tomatoes: probability 0.85
- Corn: probability 0.6
- Ice cream: probability 0.001
- …
Principle:
- Modeling the probability of sequences $w_1, w_2, \dots, w_n$
- Items may be words, characters, character n-grams, word pieces, etc.
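To make the principle concrete, here is a minimal sketch of a next-item estimator. It is an assumption-laden toy, not how large language models are built: it is a bigram model that conditions only on the last item instead of the whole sequence, and the corpus and function names (`train_bigram_model`, `next_item_distribution`) are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each item, which items follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for previous, following in zip(tokens, tokens[1:]):
            counts[previous][following] += 1
    return counts


def next_item_distribution(counts, context):
    """Estimate P(next item | last item of the context) from the counts."""
    last_item = context.lower().split()[-1]
    followers = counts.get(last_item, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {item: count / total for item, count in followers.items()}


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    "a salad is composed of lettuce",
    "a salad is composed of tomatoes",
    "a salad is composed of lettuce",
    "a salad is composed of corn",
]

model = train_bigram_model(toy_corpus)
distribution = next_item_distribution(model, "A salad is composed of")
for item, probability in sorted(distribution.items(), key=lambda kv: -kv[1]):
    print(f"{item}: {probability:.2f}")
```

A neural language model replaces the count table with a network that conditions on the full context, but the output has the same shape: one probability per candidate next item.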
Semantics, word representation and latent space
[Figure: Salad, Lettuce, Tomatoes, Corn and Ice cream placed in a latent space]
Salad = (0.3, 0.2, 0.45, -0.1, -0.3)
Lettuce = (0.2, 0.1, 0.38, -0.5, -0.4)
…
Ice cream = (-0.9, -0.3, -0.5, 0.8, 0.7)
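One common way to read such a latent space is to compare vectors with cosine similarity. The sketch below reuses the three example vectors from the slide. Cosine similarity is an assumption here, since the slide does not say which measure is used.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Example vectors taken from the slide.
embeddings = {
    "salad": (0.3, 0.2, 0.45, -0.1, -0.3),
    "lettuce": (0.2, 0.1, 0.38, -0.5, -0.4),
    "ice cream": (-0.9, -0.3, -0.5, 0.8, 0.7),
}

sim_lettuce = cosine_similarity(embeddings["salad"], embeddings["lettuce"])
sim_ice_cream = cosine_similarity(embeddings["salad"], embeddings["ice cream"])
print(f"salad vs lettuce:   {sim_lettuce:.2f}")
print(f"salad vs ice cream: {sim_ice_cream:.2f}")
```

With these values, salad and lettuce point in a similar direction (positive similarity), while salad and ice cream point in nearly opposite directions (negative similarity).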
(Large) Language Models
Transformer networks (2017)
An encoder-decoder neural network with:
- About 65M parameters
- Successive feed-forward blocks
- Parallel attention heads
… that estimates contextual representations of items with self-attention
Example: distinguishing Washington/city from Washington/man
(Vaswani et al., 2017)
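The idea of contextual representations can be sketched with a stripped-down self-attention step. This is a simplification under stated assumptions: a single head, identity projections instead of learned query/key/value matrices, no feed-forward block, and hypothetical 2-dimensional embeddings chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Each output is a weighted average of all input vectors.
        context = [
            sum(w * value[i] for w, value in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(context)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical embeddings: the same static vector for "washington" in both contexts.
washington = [1.0, 1.0]
capital = [2.0, 0.0]
president = [0.0, 2.0]

city_outputs, city_weights = self_attention([washington, capital])
man_outputs, man_weights = self_attention([washington, president])

print("washington next to 'capital':  ", city_outputs[0])
print("washington next to 'president':", man_outputs[0])
```

The input vector for "washington" is identical in both calls, yet its output differs because it is mixed with its neighbours. That is the sense in which the representation is contextual.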
Large Language Models: interesting properties
➜ Scaling law
Larger language models reach a given performance level earlier than smaller ones:
- with fewer optimization steps
- with fewer data points
(Kaplan et al., 2020)
© https://aibusiness.com/companies/nvidia-and-microsoft-build-the-world-s-largest-530bn-parameter-language-model
➜ Emerging properties
Large Language Models: interesting properties
➜ Prompting
➜ Prompt: an instruction explicitly expressing what is expected
➜ Challenge: writing a good prompt (task, context, expected output, …)
➜ Implication: everything is generation
From Thomas Gerald - 2023
Translate this sentence into French: « the sun shines »
Output: Le soleil brille
Large Language Models: interesting properties
➜ In-context learning
• Learning from examples mentioned in the prompt
• Without fine-tuning of the model
Multimodal few-shot learning with frozen language models, Tsimpoukelli et al. 2021
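In-context learning can be illustrated by how the prompt itself is assembled. The sketch below builds a text-only few-shot prompt. The `Input:`/`Output:` layout, the helper name `build_few_shot_prompt`, and the second example pair are assumptions for illustration; no model is called and no weights are updated.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, solved examples, and an unsolved query."""
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


examples = [
    ("the sun shines", "Le soleil brille"),  # pair taken from the prompting slide
    ("the cat sleeps", "Le chat dort"),      # hypothetical extra example
]

prompt = build_few_shot_prompt(
    "Translate this sentence into French.",
    examples,
    "the moon rises",
)
print(prompt)
```

The model only sees the examples as part of its input text, which is why this works without fine-tuning.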
Large Language Models: interesting properties
1. Language model: general knowledge
2. Adaptation to a new task with fine-tuning
[Figure: pretraining and fine-tuning pipeline. Pretraining on a massive corpus: text goes through an encoder and a decoder to produce word and text representations (word prediction; sentence completion; ...). Example: "It's raining MASK and PRED", with predicted words "cat" and "dog" and the label "= 3% of the corpus". Fine-tuning: pretrained language model + your (small) data and expected target → adapted (fine-tuned) language model.]
The impact of LLMs on research
Use case on conversational search
Introduction
→ Replacing or augmenting IR systems to perform search sessions in natural language
Objectives [Radlinski and Craswell 2017, Culpepper et al 2018]
Use case on conversational search
→ Understanding users’ information need
→ Retrieving documents according to the conversation context
→ Generating a response according to the retrieved documents
Initial definition of the research project
What current LLMs do
What we need
- Capturing the semantics of words
- Leveraging the conversation context
- Word representations
- Prompting*
What current LLMs do
What we need
- Matching contextual information needs with documents
- Leveraging users’ feedback
- Word representations
- Neural ranking models
What current LLMs do
What we need
- Synthesizing document content into a structured response
- Text generation
- Prompting*
2017
Pierre Erbacher’s thesis
Use case on conversational search
Proactive information systems with clarifying questions
First strategy: Thinking about the next step 2018-2019
→ Multi-turn clarification framework and analysis of its impact on retrieval effectiveness
[Erbacher et al., SIGIR 2021]
Contributions
→ What existed:
- Small human-annotated datasets
- Single-turn interaction datasets
Except that…
Use case on conversational search
How to react? Which strategy?
- Stop your thesis? Change thesis subject?
- Change task?
- Since GPT-3 and ChatGPT are not open source, design an open-source model?
- … What else?
Use case on conversational search
Second strategy: Leveraging existing models 2023
→ Generating new conversational search sessions using IR datasets
LLM with the following prompt:
« Query: q Facet: f »
fine-tuned to generate clarifying questions
LLM with the following prompt:
« Query: q Intent: i Question: cq »
fine-tuned to generate a yes/no user answer
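The two prompt templates above can be sketched as a small data-preparation step. Only the template strings come from the slide. The `TrainingPair` container, the helper names, and the "jaguar" example are hypothetical, and how the actual training data was built is not described here.

```python
from dataclasses import dataclass


@dataclass
class TrainingPair:
    """One fine-tuning example: the model reads `prompt` and must produce `target`."""
    prompt: str
    target: str


def clarifying_question_pair(query, facet, clarifying_question):
    """Pair for the model that generates clarifying questions."""
    return TrainingPair(f"Query: {query} Facet: {facet}", clarifying_question)


def user_answer_pair(query, intent, clarifying_question, answer):
    """Pair for the model that simulates the user's yes/no answer."""
    if answer not in ("yes", "no"):
        raise ValueError("answer must be 'yes' or 'no'")
    prompt = f"Query: {query} Intent: {intent} Question: {clarifying_question}"
    return TrainingPair(prompt, answer)


# Hypothetical example values, for illustration only.
question = "Are you looking for the Jaguar car brand?"
question_pair = clarifying_question_pair("jaguar", "car", question)
answer_pair = user_answer_pair("jaguar", "animal", question, "no")

print(question_pair)
print(answer_pair)
```

Chaining the two models, one asking and one answering, is what would let an existing IR dataset be turned into multi-turn conversational sessions.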
Use case on conversational search
Second strategy: Leveraging existing models 2023
→ Beyond Toolformer: teaching an LLM when to search
Toolformer: Language Models Can Teach Themselves to Use Tools, Schick et al, 2023
Our approach
(Erbacher et al, under submission)
Toolformer
Conclusion
Conclusion - Discussion
What has it changed in a thesis?
→ Huge competition
→ Big actors, huge number of (un)submitted papers
→ Big GPU clusters (but we have Jean Zay!!!!)
→ Collaborative projects between Ph.D. students (and advisors)
→ Reacting faster to the literature
→ More experiments
→ Not a 3-year project anymore
→ Adapt the research project to ongoing innovations
Conclusion - Discussion
Wrap up for future and current Ph.D. students
→ Don’t be afraid!
→ You are not the only one facing the tornado
→ No pressure: you don't have to create version 10 of the transformer
→ It is always possible to find a good idea
→ You are learning valuable knowledge and skills
→ It might be difficult to design effective models
→ You are learning a methodology
→ You are accumulating knowledge on the best LLMs
→ Be passionate!
Thank you for your attention
@LaureSoulier
laure-soulier-18829948
https://pages.isir.upmc.fr/soulier/

Runway Orientation Based on the Wind Rose Diagram.pptx
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
ethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.pptethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.ppt
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 

How to supervise a thesis in NLP in the ChatGPT era? By Laure Soulier

  • 1. How to supervise a PhD in NLP in the ChatGPT era? WiMLDS, September 27th, 2023, Laure Soulier
  • 2. Who am I? Associate professor at Sorbonne University - MLIA team in the ISIR lab Research interests: - Information retrieval and NLP - Deep learning, representation learning - Language models Supervision: - 3 defended theses - 6 ongoing theses - 1 postdoctoral researcher per year - 2-3 master's interns per year Conversational search & neural ranking models Data-to-text generation Language grounding
  • 3. Why this topic? → The ChatGPT craze: 1 million users in 5 days, 173 million active users in April 2023 → Emergence of large language models → Things are moving faster and faster in the research community [Figure: yearly counts from 2015 to 2023 for "Large language models" and "Language models", y-axis from 0 to 40000; statistics extracted from Google Scholar] A Survey of Large Language Models, Zhao et al., 2023
  • 4. Who is this talk for? → Colleagues: opening up a debate - What to expect from Ph.D. students - How to « survive »
  • 5. Who is this talk for? → (Future) Ph.D. students - What to expect from your advisors - How to « survive »
  • 6. Who is this talk for? → Industrial partners - How to collaborate with Ph.D. students during a CIFRE - Identifying what Ph.D. students are good at
  • 7. Who is this talk for? → Curious people - What does a thesis look like?
  • 8. Outline of the talk ➜ Overview of LLMs ➜ The impact of recent advances in LLMs on NLP use cases (conversational search, data-to-text generation) This talk is based on my own experience and does not speak for my colleagues. You might have different opinions or different experiences. Feel free to share them in the Q&A sessions or during the cocktail!
  • 9. (Large) Language Models Given a sequence of items $w_1, w_2, \dots, w_{t-1}$, what is the probability of the next item $w_t$? $P(w_t \mid w_1, w_2, \dots, w_{t-1})$ Example: « A salad is composed of » → (Large) Language model → Lettuce (probability: 0.9), Tomatoes (probability: 0.85), Corn (probability: 0.6), Ice cream (probability: 0.001), ... Principle: - Modeling the probability of sequences $w_1, w_2, \dots, w_n$ - Items may be words, characters, character n-grams, word pieces, etc. Semantics, word representation and latent space [Figure: Salad, Lettuce, Tomatoes, Ice cream and Corn as points in the latent space] Salad = (0.3, 0.2, 0.45, -0.1, -0.3) Lettuce = (0.2, 0.1, 0.38, -0.5, -0.4) … Ice cream = (-0.9, -0.3, -0.5, 0.8, 0.7)
  • 10. (Large) Language Models Transformer networks (2017) An encoder-decoder neural network with: - About 65M parameters - Successive feed-forward blocks - Parallel heads … that estimates contextual representations of items with self-attention, e.g. distinguishing Washington/city from Washington/man (Vaswani et al., 2017)
  • 11. Large Language Models: interesting properties ➜ Scaling law: larger language models reach a good performance level earlier than small language models: - fewer optimization steps - fewer data points (Kaplan et al., 2020) © https://aibusiness.com/companies/nvidia-and-microsoft-build-the-world-s-largest-530bn-parameter-language-model ➜ Emerging properties
  • 12. Large Language Models: interesting properties ➜ Prompting ➜ Prompt: instruction explicitly expressing what is expected ➜ Challenge: writing a good prompt (task, context, expected output …) ➜ Implication: everything is generation Example (from Thomas Gerald, 2023): Translate this sentence in French: « the sun shines » Output: Le soleil brille
  • 13. Large Language Models: interesting properties ➜ In-context learning • Learning from examples mentioned in the prompt • Without fine-tuning of the model (Multimodal few-shot learning with frozen language models, Tsimpoukelli et al., 2021)
  • 14. Large Language Models: interesting properties 1. Language model: general knowledge 2. Adaptation to a new task with fine-tuning [Diagram: pretraining text from a massive corpus feeds an encoder and a decoder, giving words & text representations (word prediction; sentence completion; ...) and a pretrained language model; masked example « It's raining MASK and PRED » with the labels « cat » and « dog »; the pretrained language model plus your (small) data and the expected target gives a fine-tuned, adapted language model; diagram annotation: « = 3% of the corpus »]
  • 15. The impact of LLMs on research
  • 16. Use case on conversational search Introduction → Replacing or augmenting IR systems to perform search sessions in natural language Objectives [Radlinski and Craswell 2017, Culpepper et al. 2018]
  • 17. Use case on conversational search Initial definition of the research project (2017, Pierre Erbacher’s thesis) → Understanding users’ information need. What current LLMs do / What we need: - Capturing the semantics of words - Leveraging the conversation context - Word representations - Prompting* → Retrieving documents according to the conversation context. What current LLMs do / What we need: - Matching contextual information needs with documents - Leveraging users’ feedback - Word representations - Neural ranking models → Generating a response according to the retrieved documents. What current LLMs do / What we need: - Synthesizing document content into a structured response - Text generation - Prompting*
  • 18. Use case on conversational search: proactive information systems with clarifying questions First strategy: thinking about the next step (2018-2019) → Multi-turn clarification framework and analysis of its impact on retrieval effectiveness [Erbacher et al., SIGIR 2021] Contributions → What existed: - Small human-annotated datasets - Single-turn interaction datasets Except that…
  • 19. Use case on conversational search How to react? Which strategy? - Stop your thesis? Change thesis subject? - Change task? - Since GPT-3 and ChatGPT are not open source, design an open-source model? - … What else?
  • 20. Use case on conversational search Second strategy: leveraging existing models (2023) → Generating new conversational search sessions using IR datasets - LLM with the following prompt: « Query: q Facet: f », fine-tuned to generate clarifying questions - LLM with the following prompt: « Query: q Intent: i Question: cq », fine-tuned to generate a yes/no user answer
  • 21. Use case on conversational search Second strategy: leveraging existing models (2023) → Beyond Toolformer: teaching an LLM when to search. Toolformer: Language Models Can Teach Themselves to Use Tools, Schick et al., 2023. Our approach (Erbacher et al., under submission)
  • 23. Conclusion - Discussion What has it changed in a thesis? → Huge competition → Big actors, huge number of (un)submitted papers → Big GPU clusters (but we have Jean Zay!!!!) → Collaborative projects between Ph.D. students (and advisors) → Reacting faster to the literature → More experiments → Not a 3-year project anymore → Adapt the research project to ongoing innovations
  • 24. Conclusion - Discussion: wrap-up for future and current Ph.D. students → Don’t be afraid! → You are not the only one facing the tornado → No pressure: you don't have to create version 10 of the transformer → It is always possible to find a good idea → You are learning valuable knowledge and skills → Might be difficult to design effective models → You are learning a methodology → You are accumulating knowledge on the best LLMs → Be passionate!
  • 25. Thank you for your attention @LaureSoulier laure-soulier-18829948 https://pages.isir.upmc.fr/soulier/
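
Slide 9 describes words as points in a latent space and lists vectors for Salad, Lettuce and Ice cream. As a minimal sketch of what that representation allows, the snippet below compares those three vectors with cosine similarity. The vectors are copied from the slide; the choice of cosine similarity and the helper names `cosine_similarity` and `rank_by_similarity` are assumptions made for illustration, not something the talk specifies.

```python
import math

# Vectors copied from slide 9. Tomatoes and Corn are omitted because
# the slide does not give their coordinates.
EMBEDDINGS = {
    "salad": (0.3, 0.2, 0.45, -0.1, -0.3),
    "lettuce": (0.2, 0.1, 0.38, -0.5, -0.4),
    "ice cream": (-0.9, -0.3, -0.5, 0.8, 0.7),
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(word, embeddings=EMBEDDINGS):
    """Return the other words sorted from most to least similar to `word`."""
    target = embeddings[word]
    scores = [
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for other, score in rank_by_similarity("salad"):
        print(f"salad vs {other}: {score:.3f}")
```

With these vectors, Lettuce ranks above Ice cream as a neighbour of Salad, which matches the intuition behind the slide: items that appear in similar contexts end up close together in the latent space.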