With the ChatGPT phenomenon, Laure talked about a topic that is currently very hot in the community: how to supervise a PhD thesis in NLP in the age of Large Language Models (LLMs)?
Deep Learning for Information Retrieval: Models, Progress, & Opportunities, by Matthew Lease
Talk given at the 8th Forum for Information Retrieval Evaluation (FIRE, http://fire.irsi.res.in/fire/2016/), December 10, 2016, and at the Qatar Computing Research Institute (QCRI), December 15, 2016.
French machine reading for question answering, by Ali Kabbadj
This paper proposes to unlock the main barrier to machine reading and comprehension of French natural language texts. This opens the way for a machine to find, for a given question, a precise answer buried in a mass of unstructured French text, or to create a universal French chatbot. Deep learning has produced extremely promising results for various tasks in natural language understanding, particularly topic classification, sentiment analysis, question answering, and language translation. But to be effective, deep learning methods need very large training datasets. Until now, these techniques could not actually be used for French question answering (Q&A) applications, since no large French Q&A training dataset existed. We produced a large (100,000+) French training dataset for Q&A by translating and adapting the English SQuAD v1.1 dataset, together with French GloVe word and character embedding vectors built from the French Wikipedia dump. We trained and evaluated three different Q&A neural network architectures in French and obtained French Q&A models with an F1 score around 70%.
https://bigscience.huggingface.co/
EN: Presentation of the BigScience project: a research initiative launched by HuggingFace and aiming to build a large language model (inspired by OpenAI and GPTx) over multiple languages and a very large processing cluster. The participants plan to investigate the dataset and the model from all angles: bias, social impact, capabilities, limitations, ethics, potential improvements, specific domain performances, carbon impact, general AI/cognitive research landscape.
FR: Presentation of the BigScience project: an open research project launched by HuggingFace whose objective is to build a language model (somewhat like OpenAI and GPT-3) while exploring the problems related to the dataset and the model from the angles of cognitive biases, social and environmental impact, ethical limits, possible performance gains, and the overall impact of this type of approach when the goal is not just to "have a bigger model".
The slides present a text recovery method based on probabilistic post-recognition processing of the output of an Optical Character Recognition system. The proposed method tries to fill in gaps of missing text resulting from the recognition of degraded documents. For this task, a corpus of up to 5-grams provided by Google is used. Several heuristics for using this corpus are described, after presenting the general problem and alternative solutions. These heuristics have been validated through a set of experiments, which are discussed together with the results obtained.
Welcome to my SlideShare presentation on ChatGPT, a powerful language model based on the GPT-3.5 architecture.
In this presentation, I will introduce you to ChatGPT and explore its features and capabilities. ChatGPT is a state-of-the-art language model that can understand natural language and generate responses that are highly relevant and accurate.
I will discuss the underlying technology behind ChatGPT, including its neural network architecture and training process. I will also highlight the benefits of using ChatGPT, such as its ability to understand complex language and its potential applications in various industries.
Additionally, I will share examples of how ChatGPT can be used to improve customer service, create conversational interfaces, and generate human-like responses in various applications.
In conclusion, ChatGPT is a powerful tool for businesses and individuals looking to enhance their communication capabilities. Its advanced language understanding and generation capabilities make it an ideal solution for a variety of use cases. I hope this presentation has been informative and has given you a better understanding of the capabilities of ChatGPT.
Big Data and Natural Language Processing, by Michel Bruley
Natural Language Processing (NLP) is the branch of computer science focused on developing systems that allow computers to communicate with people using everyday language.
Presenting the landscape of AI/ML in 2023: a quick summary of the last 10 years of progress, the current situation, and a look at what is happening behind the scenes.
Survey on Common Strategies of Vocabulary Reuse in Linked Open Data Modeling ..., by JohannWanja
The choice of which vocabulary to reuse when modeling and publishing Linked Open Data (LOD) is far from trivial. No study has investigated the different strategies for reusing vocabularies in LOD modeling and publishing. In this paper, we present the results of a survey with 79 participants that examines the most preferred vocabulary reuse strategies in LOD modeling. The participants, LOD publishers and practitioners, were asked to assess different vocabulary reuse strategies and explain their ranking decisions. We found significant differences between the modeling strategies, which range from reusing popular vocabularies, to minimizing the number of vocabularies, to staying within one domain vocabulary. A very interesting insight is that popularity, in the sense of how frequently a vocabulary is used in a data source, matters more than how often its individual classes and properties are used in the LOD cloud. Overall, the results of this survey help in better understanding the strategies data engineers use when reusing vocabularies and may also be used to develop future vocabulary engineering tools.
Analyzing Big Data's Weakest Link (hint: it might be you), by HPCC Systems
Tim Menzies, NC State University, presents at the 2015 HPCC Systems Engineering Summit Community Day.
For Big Data applications, there is a lack of gold standards for "good analysis" or methods to assess our certification programs. Hence, we are still in the dark about whether or not our human analysts are making the best possible use of the tools of Big Data. While much progress has been made on the systems aspects of Big Data, certain critical human-centered aspects remain an open issue. Regardless of the sophistication of the analysis tools and environment, all that architecture can still be used incorrectly by users. If this issue were confined to a small number of inexperienced users, then it could be addressed via process improvements such as better training. But is it? What do we know about our analysts? Where are the studies that mine the people doing the data mining?
This presentation offers some preliminary results on tools that combine ECL with other methods that recognize the code generated by experienced or inexperienced developers. While the results are preliminary, they do raise the possibility that we can better characterize what it means to be experienced (or inexperienced) at Big Data applications.
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models, by DataScienceConferenc1
As many organizations are bundling large language models (LLMs) in their products, they face the problem of rigorous model selection. This talk gives a data-centric understanding of how LLMs are built and evaluated. We will discuss the limitations of current models and pay special attention to the available evaluation protocols. How do we distinguish good models from the others? What tasks and datasets should we try or avoid? How do we incorporate feedback from our users? We will present the guidelines the attendees can use in their future experiments.
The I in PRIMM - Code Comprehension and Questioning, by Sue Sentance
Slides from a talk given at the CAS London conference on 29th February 2020. Discusses the teaching of computer programming using PRIMM and in particular, the Investigate stage. Looks at the Block Model and how we can explore students' understanding by asking a range of different questions.
Keynote on software sustainability given at the 2nd Annual Netherlands eScience Symposium, November 2014.
Based on the article: Carole Goble, "Better Software, Better Research", IEEE Internet Computing, vol. 18, no. 5, Sept.-Oct. 2014, pp. 4-8, IEEE Computer Society.
http://www.computer.org/csdl/mags/ic/2014/05/mic2014050004.pdf
http://doi.ieeecomputersociety.org/10.1109/MIC.2014.88
http://www.software.ac.uk/resources/publications/better-software-better-research
Open domain Question Answering System - Research project in NLP, by GVS Chaitanya
Using a computer to answer questions has been a human dream since the beginning of the digital era. A first step towards achieving such an ambitious goal is to deal with natural language, enabling the computer to understand what its user asks. The discipline that studies the connection between natural language and the representation of its meaning via computational models is computational linguistics. According to this discipline, Question Answering can be defined as the task that, given a question formulated in natural language, aims at finding one or more concise answers. Improvements in technology and the explosive demand for better information access have reignited interest in Q&A systems. The wealth of information on the web makes it an attractive resource for seeking quick answers to factual questions such as "Who was the first American to land in space?" or "What is the second tallest mountain in the world?", yet today's most advanced web search systems (Bing, Google, Yahoo) make it surprisingly tedious to locate the answers. A Q&A system aims to develop techniques that go beyond retrieval of relevant documents in order to return exact answers to natural language factoid questions.
Schema-agnostic queries over large-schema databases: a distributional semanti..., by Andre Freitas
The evolution of data environments towards growth in the size, complexity, dynamicity and decentralisation (SCoDD) of schemas drastically impacts contemporary data management. The SCoDD trend emerges as a central data management concern in Big Data scenarios, where users and applications demand more complete data, produced by independent data sources, under different semantic assumptions and contexts of use. Most Database Management Systems (DBMSs) today target a closed communication scenario, where the symbolic schema of the database is known a priori by the database user, who is able to interpret it in an unambiguous way. The context in which the data is consumed and produced is well-defined, and it is typically the same context in which the data was created. In contrast, data management under the SCoDD conditions targets an open communication scenario where the symbolic system of the database is unknown to the user and multiple interpretation contexts are possible. In this case the database can be created under a different context from the database user's. The emergence of this new data environment demands revisiting the semantic assumptions behind databases and designing data access mechanisms that can support semantically heterogeneous (open communication) data environments.
This work aims at filling this gap by proposing a complementary semantic model for databases, based on distributional semantic models. Distributional semantics provides a complementary perspective to the formal perspective of database semantics, which supports semantic approximation as a first-class database operation. Differently from models which describe uncertain and incomplete data or probabilistic databases, distributional-relational models focus on the construction of conceptual approximation approaches for databases, supported by a comprehensive semantic model automatically built from large-scale unstructured data external to the database, which serves as a semantic/commonsense knowledge base. The semantic model can be used to support schema-agnostic queries, i.e. abstracting the data consumer from a specific conceptualization behind the data.
The proposed distributional-relational semantic model is supported by a distributional structured vector space model, named τ-Space, which represents structured data under a distributional semantic model representation and which, in coordination with a query planning approach, supports a schema-agnostic query mechanism for large-schema databases. The query mechanism is materialized in the Treo query engine and is evaluated using schema-agnostic natural language queries.
The evaluation of the query mechanism confirms that distributional semantics provides a high-recall, medium-high precision, and low-maintainability solution to cope with the abstraction and conceptual-level differences in schema-agnostic queries over large-schema / schema-less open domain datasets.
As electricity is difficult to store, it is crucial to strictly maintain the balance between production and consumption. The integration of intermittent renewable energies into the production mix has made the management of the balance more complex. However, access to near real-time data and communication with consumers via smart meters suggest demand response. Specifically, sending signals would encourage users to adjust their consumption according to the production of electricity. The algorithms used to select these signals must learn consumer reactions and optimize them while balancing exploration and exploitation. Various sequential or reinforcement learning approaches are being considered.
Online violence amplifies IRL discriminations, and the lack of diversity grows in a vicious circle. Understanding cyber-violence, its forms and mechanisms, can help us fight back. To process massive volumes of data, AI finally comes into play for good.
More Related Content
Similar to How to supervise a thesis in NLP in the ChatGPT era? By Laure Soulier
In the energy sector, the use of temporal data stands as a pivotal topic. At GRDF, we have developed several methods to effectively handle such data. This presentation will specifically delve into our approaches for anomaly detection and data imputation within time series, leveraging transformers and adversarial training techniques.
Natasha shares her experience to delve into the complexities, challenges, and strategies associated with effectively leading tech teams dispersed across borders.
Nour and Maria present the work they did at Tweag, Modus Create's innovation arm, where the GenAI team developed an evaluation framework for Retrieval-Augmented Generation (RAG) systems. RAG systems provide an easy and low-cost way to extend the knowledge of Large Language Models (LLMs), but measuring their performance is not an easy task.
The presentation will review existing evaluation frameworks, ranging from those based on the traditional ML approach of using groundtruth datasets, including Tweag's, to those that use LLMs to compute evaluation metrics.
It will also delve into the practical implementation of Tweag's chatbot over two distinct documents datasets and provide insights on chunking, embedding and how open source and commercial LLMs compare.
Sharone Dayan, Machine Learning Engineer and Daria Stefic, Data Scientist, both from Contentsquare, delve into evaluation strategies for dealing with partially labelled or unlabelled data.
Abstract: Who hasn't heard of the "Pilot Syndrome"? 85% of Data Science Pilots remain pilots and do not make it to the production stage. Let's build a production-ready and end-user-friendly Data Science application. 100% python and 100% open source.
Phase 1 | Building the GUI: create an interactive and powerful interface in a few lines of code
Phase 2 | Integrated back end: Manage your models and pipelines and create scenarios the smart way
"Nature Language Processing for proteins" by Amélie Héliou, Software Engineer @ Google Research
Abstract: Over the past few months, Large Language Models have become very popular.
We'll see how a simple LLM works, from input sentence to prediction.
I'll then present an application of LLM to protein name prediction.
Twitter: @Amelie_hel
"We are not passing by, and we are not a trend". What if an automated and large scale version of the Bechdel-Wallace test could confirm the speech of Alice Diop at the Cesar 2023?
That is the objective of BechdelAI: to build an open-source tool based on Artificial Intelligence that measures the inequalities and under-representation of women in film and audiovisual media.
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines, by Christina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
Water billing management system project report.pdf, by Kamal Acharya
Our project, entitled "Water Billing Management System", aims to generate water bills with all charges and penalties. The manual system currently employed is extremely laborious and quite inadequate, and only makes the process more difficult.
The aim of our project is to develop a system that partially computerizes the work performed by the Water Board, such as generating the monthly water bill, recording the units of water consumed, and storing customer records and previous unpaid records.
We used HTML/PHP as the front end and MySQL as the back end for developing our project. HTML is primarily a visual design environment: we design the forms that make up the user interface, add application code to the forms and to objects such as buttons and text boxes, and add any required support code in additional modules.
MySQL is a free, open-source database that facilitates the effective management of databases by connecting them to the software. It is a stable, reliable and powerful solution with advanced features and advantages such as data security.
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA), RCA pavement has fewer comprehensive studies and sustainability assessments.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL), by MdTanvirMahtab2
This presentation describes the working procedures of Shahjalal Fertilizer Company Limited (SFCL), a government-owned company under the Bangladesh Chemical Industries Corporation, Ministry of Industries.
How to supervise a thesis in NLP in the ChatGPT era? By Laure Soulier
1. How to supervise a PhD in NLP in the ChatGPT era?
WiMLDS
September 27th, 2023
Laure Soulier
2. Who am I?
Associate professor at Sorbonne University - MLIA team in the ISIR lab
Research interests:
- Information retrieval and NLP
- Deep learning, representation learning
- Language models
Supervision:
- 3 defended theses
- 6 ongoing theses
- 1 postdoctoral researcher per year
- 2-3 master's intern students per year
Conversational search & neural ranking models
Data-to-text generation
Language grounding
3. Why this topic?
→ The ChatGPT craze
1 million users in 5 days
173 million active users in April 2023
[Chart: number of Google Scholar results per year, 2015-2023, for "language models" vs. "large language models" (y-axis from 0 to 40,000)]
→ Emergence of large language models
→ Things are moving faster and faster in the research community (statistics extracted from Google Scholar)
A Survey of Large Language Models, Zhao et al, 2023
4. Who is this talk for?
→ Colleagues: opening up a debate
- What to expect from Ph.D. students
- How to « survive »
5. Who is this talk for?
→ (Future) PhD students
- What to expect from your advisors
- How to « survive »
6. Who is this talk for?
→ Industrial partners
- How to collaborate with Ph.D. students during a CIFRE
- Identifying what PhD students are good at
7. Who is this talk for?
→ Curious people
- What does a thesis look like?
8. Outline of the talk
➜ Overview of LLMs
➜ The impact of recent advances in LLMs on NLP use cases
This talk is based on my own experience and does not speak for my colleagues.
You might have different opinions or different experiences.
Feel free to share them in the Q&A sessions or during the cocktail!
Conversational search
Data-to-text generation
9. (Large) Language Models
Given a sequence of items $w_1, w_2, \ldots, w_{n-1}$, what is the probability of the next item $w_n$?
$P(w_n \mid w_1, w_2, \ldots, w_{n-1})$
Example: "A salad is composed of …" → (Large) Language model →
- Lettuce: probability 0.9
- Tomatoes: probability 0.85
- Corn: probability 0.6
- Ice cream: probability 0.001
- …
Principle:
- Modeling the probability of sequences $w_1, w_2, \ldots, w_T$
- Items may be words, characters, character n-grams, word pieces, etc.
Semantics, word representation and latent space: words such as Salad, Lettuce, Tomatoes, Corn and Ice cream are mapped to vectors, e.g.
Salad = (0.3, 0.2, 0.45, -0.1, -0.3)
Lettuce = (0.2, 0.1, 0.38, -0.5, -0.4)
…
Ice cream = (-0.9, -0.3, -0.5, 0.8, 0.7)
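To make the slide's definition concrete, here is a minimal sketch of next-token probability with a causal language model. It assumes the Hugging Face transformers library and the public gpt2 checkpoint, neither of which is prescribed by the talk; the candidate words and printed probabilities are illustrative only.

```python
# Minimal sketch: scoring candidate next words with a causal language model.
# Assumes the Hugging Face `transformers` library and the "gpt2" checkpoint
# (illustrative choices, not from the talk).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "A salad is composed of"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits            # shape: (1, seq_len, vocab_size)
next_token_probs = torch.softmax(logits[0, -1], dim=-1)   # P(w_n | w_1..w_{n-1})

for word in [" lettuce", " tomatoes", " corn", " ice"]:
    # Only the first sub-token of each candidate is scored in this sketch.
    token_id = tokenizer(word, add_special_tokens=False)["input_ids"][0]
    print(f"{word.strip():10s} P = {next_token_probs[token_id].item():.4f}")
```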
10. (Large) Language Models
Transformer networks (2017): an encoder-decoder neural network with:
- About 65M parameters
- Successive feed-forward blocks
- Parallel heads
… that estimates contextual representations of items with self-attention
Distinguishing Washington (the city) from Washington (the man)
(Vaswani et al., 2017)
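As a rough illustration of the self-attention operation at the core of the Transformer, here is a sketch of standard scaled dot-product attention (Vaswani et al., 2017). The toy inputs and dimensions are invented for illustration, not taken from the slides.

```python
# Sketch of scaled dot-product self-attention:
#   Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # query/key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # row-wise softmax
    return weights @ V                                   # weighted mix of values

# Toy example: 3 items (e.g. "Washington", "the", "city"), dimension 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(X, X, X)              # self-attention: Q = K = V
print(out.shape)  # (3, 4): one contextual representation per item
```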
12. Large Language Models: interesting properties
➜ Prompting
➜ Prompt: an instruction explicitly expressing what is expected
➜ Challenge: writing a good prompt (task, context, expected output, …)
➜ Implication: everything is generation
From Thomas Gerald - 2023
Translate this sentence into French: « the sun shines »
Output: Le soleil brille
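A hedged sketch of prompting, matching the translation example on the slide: the task is stated in the input text and the output is generated. The pipeline API and the google/flan-t5-small checkpoint are assumptions for illustration; the talk does not name a specific model.

```python
# Illustration of prompting: the task ("translate") is expressed in the prompt.
# Assumes the Hugging Face `transformers` library and the instruction-tuned
# "google/flan-t5-small" checkpoint (illustrative, not from the talk).
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-small")

prompt = "Translate this sentence to French: the sun shines"
result = generator(prompt, max_new_tokens=20)
print(result[0]["generated_text"])   # expected output, roughly: "Le soleil brille"
```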
13. Large Language Models: interesting properties
➜ In-context learning
• Learning from examples mentioned in the prompt
• Without fine-tuning of the model
Multimodal few-shot learning with frozen language models, Tsimpoukelli et al. 2021
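In-context learning can be illustrated purely at the prompt level: the examples are placed in the input and the model weights are never updated. The sentiment task and example reviews below are an invented illustration, not material from the talk.

```python
# In-context (few-shot) learning sketch: the labelled examples live only in the
# prompt text; no fine-tuning takes place. Task and examples are illustrative.
few_shot_prompt = (
    "Review: The food was delicious. Sentiment: positive\n"
    "Review: The service was painfully slow. Sentiment: negative\n"
    "Review: The dessert was amazing. Sentiment: positive\n"
    "Review: The room was cold and noisy. Sentiment:"
)
print(few_shot_prompt)
# Any LLM (e.g. the `generator` pipeline from the previous sketch) can be asked
# to continue this prompt and is expected to answer "negative" without any
# parameter update.
```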
14. Large Language Models: interesting properties
1. Language model: general knowledge
2. Adaptation to a new task with fine-tuning
[Diagram: a language model (encoder-decoder producing word & text representations) is pretrained on a massive corpus with objectives such as word prediction and sentence completion (e.g. "It's raining MASK and PRED"); the pretrained language model is then combined with your (small) data and expected targets to obtain an adapted, fine-tuned model for the new task (e.g. predicting labels such as "cat" / "dog").]
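A minimal fine-tuning sketch matching the "pretrained model + your (small) data + expected target" recipe above. The model name, dataset, labels and hyperparameters are placeholders, not from the talk; it assumes the Hugging Face transformers and datasets libraries.

```python
# Sketch: adapting a pretrained language model to a new task by fine-tuning.
# Model, dataset and hyperparameters are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"                      # pretrained general knowledge
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

dataset = load_dataset("imdb", split="train[:2000]")        # your (small) labelled data
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True,
                         padding="max_length", max_length=128),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=dataset,
)
trainer.train()   # the pretrained weights are adapted to the new (sentiment) task
```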
16. Use case on conversational search
Introduction
→ Replacing or augmenting IR systems to perform search sessions in natural language
Objectives [Radlinski and Craswell 2017, Culpepper et al. 2018]
17. Use case on conversational search
→ Understanding users’ information need
→ Retrieving documents according to the conversation context
→ Generating a response according to the retrieved documents
Initial definition of the research project
For understanding users' information need:
- What we need: capturing the semantics of words; leveraging the conversation context
- What current LLMs do: word representations; prompting*
For retrieving documents:
- What we need: matching contextual information needs with documents; leveraging users' feedback
- What current LLMs do: word representations; neural ranking models
For generating a response:
- What we need: synthesizing document content into a structured response
- What current LLMs do: text generation; prompting*
2017
Pierre Erbacher's thesis
18. Use case on conversational search
Proactive information systems with clarifying questions
First strategy: thinking ahead to the next step (2018-2019)
→ A multi-turn clarification framework and an analysis of its impact on retrieval effectiveness
[Erbacher et al., SIGIR 2021]
Contributions
→ What existed:
- Small human-annotated datasets
- Single-turn interaction datasets
Except that….
19. Use case on conversational search
How to react? Which strategy?
- Stop your thesis? Change thesis subject?
- Change task?
- Since GPT-3 and ChatGPT are not open-sourced, design an open-source model?
- … What else?
20. Use case on conversational search
Second strategy: leveraging existing models (2023)
→ Generating new conversational search sessions using IR datasets
An LLM given the prompt « Query: q Facet: f » is fine-tuned to generate clarifying questions.
An LLM given the prompt « Query: q Intent: i Question: cq » is fine-tuned to produce a yes/no user answer.
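A hedged sketch of how such prompt/target pairs could be assembled from an IR dataset before fine-tuning, following the two prompt templates on the slide. The field names and the example record are invented for illustration; the actual data construction in Erbacher et al. may differ.

```python
# Sketch: turning IR-style records into prompt/target pairs for fine-tuning,
# following the prompt templates shown on the slide. Field names and the
# example record below are invented for illustration.
def make_clarifying_question_example(query: str, facet: str,
                                     clarifying_question: str) -> dict:
    """Pair 'Query: q Facet: f' with the clarifying question to generate."""
    return {"prompt": f"Query: {query} Facet: {facet}",
            "target": clarifying_question}

def make_user_answer_example(query: str, intent: str,
                             clarifying_question: str, answer: str) -> dict:
    """Pair 'Query: q Intent: i Question: cq' with the simulated yes/no answer."""
    return {"prompt": f"Query: {query} Intent: {intent} Question: {clarifying_question}",
            "target": answer}

example = make_clarifying_question_example(
    query="jaguar", facet="the animal",
    clarifying_question="Are you interested in the animal or the car brand?")
print(example["prompt"], "->", example["target"])
```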
21. Use case on conversational search
Second strategy: leveraging existing models (2023)
→ Beyond Toolformer: teaching an LLM when to search
Toolformer: Language Models Can Teach Themselves to Use Tools, Schick et al, 2023
Our approach (Erbacher et al. – under submission) vs. Toolformer
23. Conclusion - Discussion
What has it changed in a thesis?
→ Huge competition
→ Big actors, a huge number of (un)submitted papers
→ Big GPU clusters (but we have Jean Zay!)
→ Collaborative projects between PhD students (and advisors)
→ Faster reactivity to the literature
→ More experiments
→ Not a 3-year project anymore
→ Adapting the research project to ongoing innovations
24. Conclusion - Discussion
à Don’t be afraid!
à You are not the only one facing the tornado
à No pression: you don't have to create
version 10 of the transformer
à It is always possible to find a good idea
à You are learning valuable knowledge and skills
à Might be difficult to design effective models
à You are learning a methodology
à You are accumulating knowledge on the best
LLMs
à Be passionate!
Wrap-up for future and current PhD students
25. Thank you for your attention
@LaureSoulier
laure-soulier-18829948
https://pages.isir.upmc.fr/soulier/