Intro to LLMs
Loic Merckel
September 2023
linkedin.com/in/merckel
1966: ELIZA
Image source: en.wikipedia.org/wiki/ELIZA#/media/File:ELIZA_conversation.png
“While ELIZA was capable of
engaging in discourse, it
could not converse with true
understanding. However,
many early users were
convinced of ELIZA's
intelligence and
understanding, despite
Weizenbaum's insistence to
the contrary.”
Source: en.wikipedia.org/wiki/ELIZA (and
references therein).
2005: SCIgen - An Automatic CS Paper Generator
nature.com/articles/d41586-021-01436-7
news.mit.edu/2015/how-three-mit-students-fooled-scientific-journals-0414
A project based on rather rudimentary technology, which aimed to "maximize amusement, rather than coherence," still causes trouble today...
pdos.csail.mit.edu/archive/scigen
2017: Google Revolutionized Text Generation
■ Vaswani (2017), Attention Is All You Need (doi.org/10.48550/arXiv.1706.03762)
■ openai.com/research/better-language-models
Image generated with DALL.E: “A small robot standing on the
shoulder of a giant robot” (and slightly modified with The Gimp)
OpenAI’s Generative Pre-trained
Transformer (DALL.E, 2021; ChatGPT,
2022), as the name suggests, rests on
Transformers.
Google introduced the Transformer,
which rapidly became the state-of-the-art
approach to solving most NLP problems.
● Kiela et al. (2021), Dynabench: Rethinking Benchmarking in NLP: arxiv.org/abs/2104.14337
● Roser (2022), The brief history of artificial intelligence: The world has changed fast – what might be next?: ourworldindata.org/brief-history-of-ai
Transformers, 2017. (Text and shapes in blue have been added to the original work from Max Roser.)
What Are Transformers?
Source: Vaswani (2017), Attention Is All You Need
(doi.org/10.48550/arXiv.1706.03762)
Generative (deep learning) models for understanding and generating text,
images and many other types of data.
Transformers analyze chunks of data, called "tokens," and learn to predict
the next token in a sequence based on previous and, if available, following
tokens.
Auto-regressive means that the output of the model, such as the prediction
of a word in a sentence, is influenced by the previous words it has
generated.
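To make the auto-regressive loop concrete, here is a minimal sketch of greedy next-token decoding, assuming the Hugging Face transformers library and using GPT-2 purely as an illustration:

```python
# Minimal auto-regressive (greedy) decoding sketch; GPT-2 is only an example.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tokenizer("The Transformer was introduced in", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(20):                     # generate 20 tokens, one at a time
        logits = model(ids).logits          # scores over the whole vocabulary
        next_id = logits[0, -1].argmax()    # greedy: pick the most likely token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)  # feed it back in

print(tokenizer.decode(ids[0]))
```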
Music—MusicLM (Google) and Jukebox (OpenAI) generate music from text.
Image—Imagen (Google) and DALL.E (OpenAI) generate novel images from text.
Text—OpenAI’s GPT has become widely known, but other players have similar technology
(including Google, Meta, Anthropic and others).
Others—Recommender (movies, books, flight destinations), drug discovery…
Models that learn from a given dataset how to
generate new data instances.
2022: ChatGPT
“ChatGPT, the popular chatbot
from OpenAI, is estimated to have
reached 100 million monthly
active users in January, just two
months after launch, making it the
fastest-growing consumer
application in history”
statista.com/chart/29174/time-to-one-million-users
Reuters, Feb 1, 2023
https://reut.rs/3yQNlGo
The Mushrooming of Transformer-Based LLMs
● Google: PaLM (540b), LaMDA (137b) and others (Bard relies on LaMDA)
● Meta: OPT-IML (175b), Galactica (120b), BlenderBot3 (175b), Llama 2 (70b)
● Baidu: ERNIE 3.0 Titan (260b)
● OpenAI: GPT-3 (175b), GPT-3.5 (?b), GPT-4 (?b)
● BigScience: BLOOM (176b)
● Huawei: PanGu-α (200b)
● AI21 Labs: Jurassic-1 (178b), Jurassic-2 (?b)
● LG: Exaone (300b)
● Microsoft/NVIDIA: Megatron-Turing NLG (530b)
(It appears that all of those models rely on transformer-based decoders only.)
Source: github.com/Mooler0410/LLMsPracticalGuide
Now What?
In Finance…
bloomberg.com/news/articles/2023-03-07/griffin-says-trying-to-negotiate-enterprise-wide-chatgpt-license
bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance
AI Mentions Boost Stock Prices
● AI-mentioning companies: +4.6% avg. stock price increase (nearly double that of non-mentioning companies).
● In general, 67% of companies that mentioned AI observed an increase in their stock prices → +8.5% on average.
● Tech companies: 71% → +11.9% on avg.
● Non-tech companies: 65% → +6.8% on avg.
- Mentions of "AI" and related terms (machine learning, automation, robots, etc.).
- S&P 500 companies in 2023.
- 3-day change from the date the earnings call transcript was published. Source: wallstreetzen.com/blog/ai-mention-moves-stock-prices-2023
GPU Demand Skyrockets
Before LLMs, GPUs were primarily needed for training, and
CPUs were used for inference. However, with the emergence
of LLMs, GPUs have become almost essential for both tasks.
Paraphrasing Brannin McBee, co-founder of CoreWeave, on a Bloomberg podcast*:
While you may train the model using 10,000 GPUs, the real
challenge arises when you need 1 million GPUs to meet the
entire inference demand. This surge in demand is expected
during the initial one to two years after the launch, and it's likely
to keep growing thereafter.
* How to Build the Ultimate GPU Cloud to Power AI | Odd Lots (youtube.com/watch?v=9OOn6u6GIqk&t=1308s)
Enhancing Productivity With Generative AI?
nature.com/articles/d41586-023-02270-9
science.org/doi/10.1126/science.adh2586
Limitations
Beware of “Hallucinations,” Which Remain Very Real
“Hallucinations” are “confident statements that are not true.”¹
For the moment, this phenomenon inexorably affects all known LLMs.
1: fr.wikipedia.org/wiki/Hallucination_(intelligence_artificielle)
Yves Montand in “Le Cercle Rouge” during an attack of delirium tremens
This thing probably doesn't exist.
Concrete Hallucinations (GPT-4)
We asked ChatGPT the first part of the third
question of the British Mathematical Olympiad
1977: bmos.ukmt.org.uk/home/bmo-1977.pdf
Is that so? Although not an obvious
hallucination, it may remind us of Fermat’s
lack of space in the margin to give the proof
of his last theorem… Perhaps here there is a
lack of tokens?
Here, a total hallucination: this statement is evidently false. Perhaps it meant “the product of two negative numbers.”
Here, a total hallucination: this statement is evidently false. (Although in this case the inequality is indeed clearly true.)
The Saga of the Lawyer Who Used ChatGPT
nytimes.com/2023/06/08/nyregion/lawyer-chatgpt-sanctions.html
nytimes.com/2023/05/27/nyregion/avianca-airline-lawsuit-chatgpt.html
nytimes.com/2023/06/22/nyregion/lawyers-chatgpt-schwartz-loduca.html
ChatGPT: Achieving Human-Level Performance in
Professional and Academic Benchmarks
● GPT-4's performance in recent tests is
undeniably impressive.
● Study conducted by OpenAI
(openai.com/papers/gpt-4.pdf).
● Most of those tests focus on high-school-level content.
● Many candidates prepare through test-prep courses and resources.
● By contrast, university exams typically
require a deeper understanding of course
material and critical thinking skills.
● Uniform Bar Exam: Worth noting, but
potential overestimation concerns (see
dx.doi.org/10.2139/ssrn.4441311).
Exploring the MIT Mathematics and EECS Curriculum Using
Large Language Models
Published on Jun 15, 2023
Authors: Sarah J. Zhang, Samuel Florin, Ariel N. Lee, Eamon Niknafs, Andrei Marginean, Annie Wang, Keith
Tyser, Zad Chin, Yann Hicke, Nikhil Singh, Madeleine Udell, Yoon Kim, Tonio Buonassisi, Armando
Solar-Lezama, Iddo Drori
Abstract
We curate a comprehensive dataset of 4,550 questions and solutions from problem sets,
midterm exams, and final exams across all MIT Mathematics and Electrical Engineering and
Computer Science (EECS) courses required for obtaining a degree. We evaluate the ability of
large language models to fulfill the graduation requirements for any MIT major in Mathematics
and EECS. Our results demonstrate that GPT-3.5 successfully solves a third of the entire MIT
curriculum, while GPT-4, with prompt engineering, achieves a perfect solve rate on a test set
excluding questions based on images. We fine-tune an open-source large language model on
this dataset. We employ GPT-4 to automatically grade model responses, providing a detailed
performance breakdown by course, question, and answer type. By embedding questions in a
low-dimensional space, we explore the relationships between questions, topics, and classes and
discover which questions and classes are required for solving other questions and classes
through few-shot learning. Our analysis offers valuable insights into course prerequisites and
curriculum design, highlighting language models' potential for learning and improving
Mathematics and EECS education.
Source: arxiv.org/abs/2306.08997
i.e., GPT-4 scored 100% on the MIT EECS (Electrical Engineering and Computer Science) curriculum.
“No, GPT4 can’t ace MIT”
Three MIT undergrads have debunked the myth.
- 4% of the questions were unsolvable. (How did GPT-4 achieve 100%?)
- Information leak in some few-shot prompts: for those, the answer was
quasi-given in the question.
- The automatic grading using GPT-4 itself has some severe issues: a prompt cascade reprompted (many times) when the given answer was deemed incorrect. 16% of the questions were multiple-choice, hence a quasi-guaranteed correct response.
- Bugs were found in the research script that raise serious questions regarding the soundness of the study.
Source: flower-nutria-41d.notion.site/No-GPT4-can-t-ace-MIT-b27e6796ab5a48368127a98216c76864
Note: The paper has since been withdrawn (see official statement at people.csail.mit.edu/asolar/CoursesPaperStatement.pdf)
Chemistry May Not Be ChatGPT's Cup of Tea
A study conducted by three researchers at the University of Hertfordshire (UK) showed that ChatGPT is no fan of chemistry.
Real exams were used, and the authors note that “[a] well-written
question item aims to create intellectual challenge and to require
interpretation and inquiry. Questions that cannot be easily
‘Googled’ or easily answered through a single click in an
internet search engine is a focus.”
“The overall grade on the year 1 paper calculated from the top
four graded answers would be 34.1%, which does not meet the
pass criteria. The overall grade on the year 2 paper would be
18.3%, which does not meet the pass criteria.”
Source: Fergus et al., 2023, Evaluating Academic Answers Generated Using ChatGPT (pubs.acs.org/doi/10.1021/acs.jchemed.3c00087)
The “Drift” Phenomenon
Sources:
- wsj.com/articles/chatgpt-openai-math-artificial-intelligence-8aba83f0
- Chen et al., 2023, arxiv.org/abs/2307.09009
● New research from Stanford and UC Berkeley
highlights a fundamental challenge in AI
development: "drift."
● Drift occurs when improving one aspect of
complex AI models leads to a decline in
performance in other areas.
● ChatGPT has shown deterioration in basic math
operations despite advancements in other tasks.
● GPT-4 exhibits reduced responsiveness to
chain-of-thought prompting (may be intended to
mitigate potential misuse with malicious
prompts).
The “behavior of the ‘same’ LLM service can
change substantially in a relatively short amount of
time, highlighting the need for continuous monitoring
of LLMs” (Chen et al., 2023).
Techniques for Tailoring LLMs to
Specific Problems
Prompt Engineering
Fine-Tuning
Reinforcement Learning From Human Feedback (RLHF)
First We Must Have a Problem to Solve…
Source: DeepLearning.AI, licensed under CC BY-SA 2.0
Then We Need a Model
Commercial APIs
- Google, OpenAI, Anthropic, Microsoft...
- Privacy concerns may arise.
- No specific hardware requirement.
- Prompt engineering (OpenAI also offers fine-tuning).
Use a foundation model (many open-source models are available)
- As is (prompt engineering),
- or fine-tuned (either full or parameter-efficient fine-tuning).
- May require specific hardware/infrastructure for hosting, fine-tuning and
inference.
Train a model from scratch
- Requires huge resources (both data and computing power).
- (E.g., BloombergGPT, arxiv.org/abs/2303.17564.)
A Plethora of Open-Source Pre-Trained Models
huggingface.co/models
Models should be selected
depending on:
● The problem at hand.
● The strength of the model.
● The operating costs (larger
models require more
resources).
● Other considerations (e.g.,
license).
Prompt Engineering: “Query Crafting”
Improving the output with actions like rephrasing queries, specifying styles, providing context, or assigning roles (e.g., “Act as a mathematics teacher”) (Wikipedia, 2023).
Some hints can be found in OpenAI’s “GPT best practices” (OpenAI, 2023).
Chain-of-thought: a popular technique that consists of “guiding [LLMs] to produce a sequence of intermediate steps before giving the final answer” (Wei et al., 2022).
Sources:
- Wei et al., 2022, Emergent Abilities of Large Language Models, arxiv.org/abs/2206.07682
- OpenAI, 2023, platform.openai.com/docs/guides/gpt-best-practices/six-strategies-for-getting-better-results
- Wikipedia, 2023, Prompt Engineering, en.wikipedia.org/wiki/Prompt_engineering
(graph from Wei et al., 2022)
About GSM8K benchmark: arxiv.org/abs/2110.14168
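As an illustration, a chain-of-thought prompt prepends a worked, step-by-step example so the model imitates the reasoning before answering. A sketch in Python; the arithmetic examples below are ours, in the spirit of Wei et al. (2022):

```python
# A hypothetical chain-of-thought prompt: the worked example guides the
# model to produce intermediate reasoning steps before the final answer.
cot_prompt = """\
Q: A cafeteria had 23 apples. It used 20 to make lunch and bought 6 more.
How many apples does it have now?
A: The cafeteria started with 23 apples. It used 20, leaving 23 - 20 = 3.
It bought 6 more, so 3 + 6 = 9. The answer is 9.

Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each.
How many tennis balls does he have now?
A:"""
# Sending cot_prompt to an LLM should elicit: 5 + 2 * 3 = 11. The answer is 11.
```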
Prompt Engineering: In-Context Learning (ICL)
In-Context Learning (ICL) consists of providing “a few input-output
examples in the model’s context (input) as a preamble before asking the
model to perform the task for an unseen inference-time example” (Wei et al., 2022).
It is a kind of “ephemeral supervised learning.”
- Zero-shot prompting or zero-shot learning: no example given (works for the largest LLMs; smaller ones may struggle).
- One-shot prompting: one example provided.
- Few-shot prompting: a few examples (typically 3~6).
⚠ Context window limits (e.g., 4096 tokens).
Example of an input and output for two-shot prompting:
Input:
Tweet: @lufthansa Please find our missing luggage!!
Sentiment: negative
Tweet: Will be on LH to FRA very soon. Cheers!
Sentiment: positive
Tweet: Refused to compensate me for 2 days cancelled flights . Joke of a airline
Sentiment:
LLM output: negative
Source: Wei, J.et al., 2022. Emergent abilities of large language models, arxiv.org/abs/2206.07682
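A minimal sketch of running the two-shot example above through an LLM API, assuming the 2023-era openai Python client (any completion endpoint would do):

```python
import openai  # pip install openai (0.x-era client assumed)

prompt = (
    "Tweet: @lufthansa Please find our missing luggage!!\nSentiment: negative\n\n"
    "Tweet: Will be on LH to FRA very soon. Cheers!\nSentiment: positive\n\n"
    "Tweet: Refused to compensate me for 2 days cancelled flights . "
    "Joke of a airline\nSentiment:"
)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
    max_tokens=5,   # only a short label is needed
    temperature=0,  # deterministic classification
)
print(response.choices[0].message.content)  # expected: "negative"
```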
Fine-Tuning: Introduction
Few-shot learning:
- May not be sufficient for smaller models.
- Consumes tokens from the context window.
Fine-tuning is a supervised learning process
that leads to a new model (in contrast with
in-context learning, which is “ephemeral”).
Task-specific prompt-completion pairs are required.
Base LLM + task-specific prompt-completion pairs:
(Prompt_1, completion_1), (Prompt_2, completion_2), …, (Prompt_n, completion_n)
→ Fine-tuned LLM
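A sketch of assembling such prompt-completion pairs as a JSONL training file; the records below follow OpenAI's chat fine-tuning format as an assumption, and field names vary by provider:

```python
import json

# Illustrative task-specific pairs (summarization); the texts are made up.
pairs = [
    ("Summarize: The board met on Monday to review Q3 results...",
     "The board reviewed Q3 results on Monday."),
    ("Summarize: The new travel policy requires prior approval...",
     "Travel now requires prior approval."),
]

with open("train.jsonl", "w") as f:
    for prompt, completion in pairs:
        record = {"messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": completion},
        ]}
        f.write(json.dumps(record) + "\n")  # one JSON object per line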
Full Fine-Tuning: Updating All Parameters
Fine-tuning very often means “instruction fine-tuning.”
Instruction fine-tuning: each prompt-completion pair includes a specific
instruction (summarize this, translate that, classify this tweet, …).
● Fine-tuning on a single task (e.g., summarization) may lead to a phenomenon referred to as “catastrophic forgetting” (arxiv.org/pdf/1911.00202), where the model loses its abilities on other tasks (this may not be a business issue, though).
● Fine-tuning on multiple tasks (e.g., summarization, translation, classification, …) requires a lot more training data. (E.g., see FLAN in Wei et al., 2022.)
Full fine-tuning is extremely resource-demanding, even more so for large models.
Source: Wei et al., 2022, Finetuned Language Models Are Zero-Shot Learners. arxiv.org/abs/2109.01652
Parameter Efficient Fine-Tuning (PEFT)
Unlike full fine-tuning, PEFT preserves the vast majority of the weights of the original
model.
● Less prone to “catastrophic forgetting” on a single task.
● Often a single GPU is enough.
Three methods:
● Selective—subset of initial params to fine-tune.
● Reparameterization—reparameterize model weights using a low-rank
representation, e.g., LoRA (Hu et al., 2021).
● Additive—add trainable layers or parameters to the model, two approaches:
- Adapters: add new trainable layers to the architecture of the model.
- Soft prompts: focus on manipulating the input (this is not prompt engineering).
Source:
- coursera.org/learn/generative-ai-with-llms/lecture/rCE9r/parameter-efficient-fine-tuning-peft
- Hu et al., 2021, LoRA: Low-Rank Adaptation of Large Language Models. arxiv.org/abs/2106.09685
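A minimal LoRA sketch with the Hugging Face peft library; the hyperparameters and the GPT-2 target module are illustrative assumptions:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")
config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
)
model = get_peft_model(base, config)   # original weights stay frozen
model.print_trainable_parameters()     # typically well under 1% trainable
```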
Fine-Tuning With OpenAI GPT (PEFT)
The OpenAI API offers fine-tuning for gpt-3.5-turbo, but not “yet” for GPT-4.
platform.openai.com/docs/guides/fine-tuning
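A sketch of launching such a job with the 2023-era openai client (an assumption; "train.jsonl" is the file prepared earlier):

```python
import openai

# Upload the training data, then start a fine-tuning job on gpt-3.5-turbo.
upload = openai.File.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = openai.FineTuningJob.create(training_file=upload.id, model="gpt-3.5-turbo")
print(job.id)  # poll the job until the fine-tuned model name is available
```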
Reinforcement Learning From Human Feedback
LLMs are trained on web data containing a lot of irrelevant matter (unhelpful) or, worse,
abundant false (dishonest) and/or harmful information, e.g.,
● Potentially dangerous false medical advice.
● Valid techniques for illegal activities (hacking, deceiving, building weapons, …).
HHH (Helpful, Honest & Harmless) alignment (Askell et al., 2021): ensuring that the
model’s behavior and outputs are consistent with human values, intentions, and ethical
standards.
Reinforcement Learning from Human Feedback, or RLHF (Casper et al., 2023)
● “is a technique for training AI systems to align with human goals.”
● “[It] has emerged as the central method used to finetune state-of-the-art [LLMs].”
● It relies on human judgment and consensus.
Source:
- Casper et al., 2023, Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. arxiv.org/abs/2307.15217
- Ziegler et al., 2019, Fine-Tuning Language Models from Human Preferences. arxiv.org/abs/1909.08593
- Askell et al., 2021, A General Language Assistant as a Laboratory for Alignment. arxiv.org/abs/2112.00861
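To sketch the mechanics, reward models behind RLHF are commonly trained with a pairwise preference loss: the score of the human-preferred output should exceed that of the rejected one. Names and numbers below are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor,
                    reward_rejected: torch.Tensor) -> torch.Tensor:
    # -log sigmoid(r_chosen - r_rejected): minimal when the reward model
    # scores the human-preferred completion higher than the rejected one.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy scores from a hypothetical reward model over three comparisons.
loss = preference_loss(torch.tensor([1.2, 0.3, 2.0]),
                       torch.tensor([0.4, 0.9, 1.1]))
print(loss)  # the trained reward then drives RL fine-tuning (e.g., PPO)
```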
What Is RLHF by Sam Altman
5:59
What is RLHF? Reinforcement Learning with Human Feedback, …
6:07
… So, we trained these models on a lot of text data and, in that process, they
learned the underlying, …. And they can do amazing things.
6:26
But when you first play with that base model, that we call it, after you finish
training, … it can do a lot of, you know, there's knowledge in there. But it's not
very useful or, at least, it's not easy to use, let's say. And RLHF is how we
take some human feedback,
6:45
the simplest version of this is show two outputs, ask which one is better
than the other,
6:50
which one the human raters prefer, and then feed that back into the model
with reinforcement learning.
6:56
And that process works remarkably well with, in my opinion, remarkably little
data to make the model more useful. So, RLHF is how we align the model to
what humans want it to do.
Sam Altman: OpenAI CEO on
GPT-4, ChatGPT, and the Future of
AI | Lex Fridman Podcast #367
(youtu.be/L_Guz73e6fw?si=vfkdtN
CyrQa1RzZR&t=359)
RLHF: Example of Alignment Tasks
Source: Liu et al., 2022, Aligning Generative Language Models with Human Values. aclanthology.org/2022.findings-naacl.18
Performance Evaluation
Assessing and Comparing LLMs
Metrics while training the model—ROUGE (summarization) or BLEU (translation).
Benchmarks—A non-exhaustive list:
- ARC (Abstraction and Reasoning Corpus, arxiv.org/pdf/2305.18354),
- HellaSwag (arxiv.org/abs/1905.07830),
- TruthfulQA (arxiv.org/abs/2109.07958),
- GLUE & SuperGLUE (General Language Understanding Evaluation, gluebenchmark.com),
- HELM (Holistic Evaluation of Language Models, crfm.stanford.edu/helm),
- MMLU (Massive Multitask Language Understanding, arxiv.org/abs/2009.03300),
- BIG-bench (arxiv.org/pdf/2206.04615).
Others—“Auto-Eval of Question-Answering Tasks”
(blog.langchain.dev/auto-eval-of-question-answering-tasks).
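For instance, ROUGE can be computed with the Hugging Face evaluate library (a sketch; the toy sentences are ours):

```python
import evaluate  # pip install evaluate rouge_score

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],
    references=["a cat was sitting on the mat"],
)
print(scores)  # rouge1 / rouge2 / rougeL F-measures
```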
Source: Wu et al., 2023, BloombergGPT: A Large Language Model for Finance. arxiv.org/abs/2303.17564 (Table 13: “BIG-bench hard results using standard 3-shot prompting”)
Source: Touvron et al., 2023, Llama 2: Open Foundation and Fine-Tuned Chat Models,
scontent-fra3-1.xx.fbcdn.net/v/t39.2365-6/10000000_662098952474184_2584067087619170692_n.pdf
Application Example:
Conversing With Annual Reports
Question ChatGPT About the Latest Financial
Reports?
—blog.langchain.dev/tutorial-chatgpt-over-your-data
“[ChatGPT] doesn’t know about
your private data, it doesn’t know
about recent sources of data.
Wouldn’t it be useful if it did?”
Workflow Overview
Question: « Quels vont être les dividendes payés par action par le Groupe Crit ? » (“What dividends per share will Groupe Crit pay?”)
Answer: « Le Groupe CRIT proposera lors de sa prochaine Assemblée Générale, le 9 juin 2023, le versement d'un dividende exceptionnel de 3,5 € par action. » (“At its next Annual General Meeting, on June 9, 2023, Groupe CRIT will propose paying an exceptional dividend of €3.5 per share.”)
The example (the question and associated answer) is real (the LLM was OpenAI’s “gpt-3.5-turbo”).
Technique described in: Lewis et al., 2020.
Retrieval-augmented generation for knowledge-intensive
nlp tasks. (doi.org/10.48550/arXiv.2005.11401)
Workflow: (1) split the documents into chunks and compute embeddings, stored in a vector store; (2) given the question, extract the relevant information (“context”) from the vector store; (3) generate a prompt accordingly (“question + context”) and submit it to the LLM. A code sketch follows below.
Preliminary Prototype
Financial reports retrieved directly from the French AMF (“Autorité
des marchés financiers”) via their API (info-financiere.fr).
XHTML documents in the French language.
The question and answer are in English (they would be in French had the question been asked in French).
Except where otherwise noted, this work is licensed under
https://creativecommons.org/licenses/by/4.0/
619.io
More Related Content

What's hot

The Future of AI is Generative not Discriminative 5/26/2021
The Future of AI is Generative not Discriminative 5/26/2021The Future of AI is Generative not Discriminative 5/26/2021
The Future of AI is Generative not Discriminative 5/26/2021Steve Omohundro
 
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...David Talby
 
Leveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesLeveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesDianaGray10
 
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...ssuser4edc93
 
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Mihai Criveti
 
Using the power of Generative AI at scale
Using the power of Generative AI at scaleUsing the power of Generative AI at scale
Using the power of Generative AI at scaleMaxim Salnikov
 
LLMs_talk_March23.pdf
LLMs_talk_March23.pdfLLMs_talk_March23.pdf
LLMs_talk_March23.pdfChaoYang81
 
Unlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdfUnlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdfPremNaraindas1
 
Landscape of AI/ML in 2023
Landscape of AI/ML in 2023Landscape of AI/ML in 2023
Landscape of AI/ML in 2023HyunJoon Jung
 
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapEpisode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapAnant Corporation
 
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!taozen
 
The current state of generative AI
The current state of generative AIThe current state of generative AI
The current state of generative AIBenjaminlapid1
 
An Introduction to Generative AI - May 18, 2023
An Introduction  to Generative AI - May 18, 2023An Introduction  to Generative AI - May 18, 2023
An Introduction to Generative AI - May 18, 2023CoriFaklaris1
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1DianaGray10
 
Transformers, LLMs, and the Possibility of AGI
Transformers, LLMs, and the Possibility of AGITransformers, LLMs, and the Possibility of AGI
Transformers, LLMs, and the Possibility of AGISynaptonIncorporated
 
Let's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchersLet's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchersSteven Van Vaerenbergh
 
Generative AI: Past, Present, and Future – A Practitioner's Perspective
Generative AI: Past, Present, and Future – A Practitioner's PerspectiveGenerative AI: Past, Present, and Future – A Practitioner's Perspective
Generative AI: Past, Present, and Future – A Practitioner's PerspectiveHuahai Yang
 
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAutomate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAnant Corporation
 
Generative AI Use cases for Enterprise - Second Session
Generative AI Use cases for Enterprise - Second SessionGenerative AI Use cases for Enterprise - Second Session
Generative AI Use cases for Enterprise - Second SessionGene Leybzon
 
An Introduction to Generative AI
An Introduction  to Generative AIAn Introduction  to Generative AI
An Introduction to Generative AICori Faklaris
 

What's hot (20)

The Future of AI is Generative not Discriminative 5/26/2021
The Future of AI is Generative not Discriminative 5/26/2021The Future of AI is Generative not Discriminative 5/26/2021
The Future of AI is Generative not Discriminative 5/26/2021
 
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
 
Leveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesLeveraging Generative AI & Best practices
Leveraging Generative AI & Best practices
 
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
 
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
 
Using the power of Generative AI at scale
Using the power of Generative AI at scaleUsing the power of Generative AI at scale
Using the power of Generative AI at scale
 
LLMs_talk_March23.pdf
LLMs_talk_March23.pdfLLMs_talk_March23.pdf
LLMs_talk_March23.pdf
 
Unlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdfUnlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdf
 
Landscape of AI/ML in 2023
Landscape of AI/ML in 2023Landscape of AI/ML in 2023
Landscape of AI/ML in 2023
 
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapEpisode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
 
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!
 
The current state of generative AI
The current state of generative AIThe current state of generative AI
The current state of generative AI
 
An Introduction to Generative AI - May 18, 2023
An Introduction  to Generative AI - May 18, 2023An Introduction  to Generative AI - May 18, 2023
An Introduction to Generative AI - May 18, 2023
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
 
Transformers, LLMs, and the Possibility of AGI
Transformers, LLMs, and the Possibility of AGITransformers, LLMs, and the Possibility of AGI
Transformers, LLMs, and the Possibility of AGI
 
Let's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchersLet's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchers
 
Generative AI: Past, Present, and Future – A Practitioner's Perspective
Generative AI: Past, Present, and Future – A Practitioner's PerspectiveGenerative AI: Past, Present, and Future – A Practitioner's Perspective
Generative AI: Past, Present, and Future – A Practitioner's Perspective
 
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAutomate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
 
Generative AI Use cases for Enterprise - Second Session
Generative AI Use cases for Enterprise - Second SessionGenerative AI Use cases for Enterprise - Second Session
Generative AI Use cases for Enterprise - Second Session
 
An Introduction to Generative AI
An Introduction  to Generative AIAn Introduction  to Generative AI
An Introduction to Generative AI
 

Similar to Intro to LLMs

Case study on machine learning
Case study on machine learningCase study on machine learning
Case study on machine learningHarshitBarde
 
Ntegra 20231003 v3.pptx
Ntegra 20231003 v3.pptxNtegra 20231003 v3.pptx
Ntegra 20231003 v3.pptxISSIP
 
Deep Neural Networks for Machine Learning
Deep Neural Networks for Machine LearningDeep Neural Networks for Machine Learning
Deep Neural Networks for Machine LearningJustin Beirold
 
Genetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary MethodologyGenetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary Methodologyacijjournal
 
History of AI - Presentation by Sanjay Kumar
History of AI - Presentation by Sanjay KumarHistory of AI - Presentation by Sanjay Kumar
History of AI - Presentation by Sanjay KumarSanjay Kumar
 
NITLE IT Leaders 2009: Emerging Technologies in a Submerging Economy
NITLE IT Leaders 2009: Emerging Technologies in a Submerging EconomyNITLE IT Leaders 2009: Emerging Technologies in a Submerging Economy
NITLE IT Leaders 2009: Emerging Technologies in a Submerging EconomyBryan Alexander
 
Unraveling Information about Deep Learning
Unraveling Information about Deep LearningUnraveling Information about Deep Learning
Unraveling Information about Deep LearningIRJET Journal
 
Denmark future of ai 20180927 v8
Denmark future of ai 20180927 v8Denmark future of ai 20180927 v8
Denmark future of ai 20180927 v8ISSIP
 
Spohrer SIRs 20230511 v16.pptx
Spohrer SIRs 20230511 v16.pptxSpohrer SIRs 20230511 v16.pptx
Spohrer SIRs 20230511 v16.pptxISSIP
 
UpdatedSociety5, 2Oct23
UpdatedSociety5, 2Oct23UpdatedSociety5, 2Oct23
UpdatedSociety5, 2Oct23HeilaPienaar
 
Tds — big science dec 2021
Tds — big science dec 2021Tds — big science dec 2021
Tds — big science dec 2021Gérard Dupont
 
20210128 jim spohrer ai house_fund v4
20210128 jim spohrer ai house_fund v420210128 jim spohrer ai house_fund v4
20210128 jim spohrer ai house_fund v4ISSIP
 
Hicss52 20190108 v3
Hicss52 20190108 v3Hicss52 20190108 v3
Hicss52 20190108 v3ISSIP
 
Worker Productivity 20230628 v1.pptx
Worker Productivity 20230628 v1.pptxWorker Productivity 20230628 v1.pptx
Worker Productivity 20230628 v1.pptxISSIP
 
Semantic Web: In Quest for the Next Generation Killer Apps
Semantic Web: In Quest for the Next Generation Killer AppsSemantic Web: In Quest for the Next Generation Killer Apps
Semantic Web: In Quest for the Next Generation Killer AppsJie Bao
 
Future of AI - 2023 07 25.pptx
Future of AI - 2023 07 25.pptxFuture of AI - 2023 07 25.pptx
Future of AI - 2023 07 25.pptxGreg Makowski
 
Ai open powermeetupmarch25th
Ai open powermeetupmarch25thAi open powermeetupmarch25th
Ai open powermeetupmarch25thIBM
 

Similar to Intro to LLMs (20)

Case study on machine learning
Case study on machine learningCase study on machine learning
Case study on machine learning
 
Ntegra 20231003 v3.pptx
Ntegra 20231003 v3.pptxNtegra 20231003 v3.pptx
Ntegra 20231003 v3.pptx
 
Deep Neural Networks for Machine Learning
Deep Neural Networks for Machine LearningDeep Neural Networks for Machine Learning
Deep Neural Networks for Machine Learning
 
NHH 20231105 v6.pptx
NHH 20231105 v6.pptxNHH 20231105 v6.pptx
NHH 20231105 v6.pptx
 
Genetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary MethodologyGenetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary Methodology
 
History of AI - Presentation by Sanjay Kumar
History of AI - Presentation by Sanjay KumarHistory of AI - Presentation by Sanjay Kumar
History of AI - Presentation by Sanjay Kumar
 
History of AI
History of AIHistory of AI
History of AI
 
NITLE IT Leaders 2009: Emerging Technologies in a Submerging Economy
NITLE IT Leaders 2009: Emerging Technologies in a Submerging EconomyNITLE IT Leaders 2009: Emerging Technologies in a Submerging Economy
NITLE IT Leaders 2009: Emerging Technologies in a Submerging Economy
 
Unraveling Information about Deep Learning
Unraveling Information about Deep LearningUnraveling Information about Deep Learning
Unraveling Information about Deep Learning
 
Denmark future of ai 20180927 v8
Denmark future of ai 20180927 v8Denmark future of ai 20180927 v8
Denmark future of ai 20180927 v8
 
Spohrer SIRs 20230511 v16.pptx
Spohrer SIRs 20230511 v16.pptxSpohrer SIRs 20230511 v16.pptx
Spohrer SIRs 20230511 v16.pptx
 
UpdatedSociety5, 2Oct23
UpdatedSociety5, 2Oct23UpdatedSociety5, 2Oct23
UpdatedSociety5, 2Oct23
 
Tds — big science dec 2021
Tds — big science dec 2021Tds — big science dec 2021
Tds — big science dec 2021
 
20210128 jim spohrer ai house_fund v4
20210128 jim spohrer ai house_fund v420210128 jim spohrer ai house_fund v4
20210128 jim spohrer ai house_fund v4
 
Hicss52 20190108 v3
Hicss52 20190108 v3Hicss52 20190108 v3
Hicss52 20190108 v3
 
Worker Productivity 20230628 v1.pptx
Worker Productivity 20230628 v1.pptxWorker Productivity 20230628 v1.pptx
Worker Productivity 20230628 v1.pptx
 
Semantic Web: In Quest for the Next Generation Killer Apps
Semantic Web: In Quest for the Next Generation Killer AppsSemantic Web: In Quest for the Next Generation Killer Apps
Semantic Web: In Quest for the Next Generation Killer Apps
 
Future of AI - 2023 07 25.pptx
Future of AI - 2023 07 25.pptxFuture of AI - 2023 07 25.pptx
Future of AI - 2023 07 25.pptx
 
Null
NullNull
Null
 
Ai open powermeetupmarch25th
Ai open powermeetupmarch25thAi open powermeetupmarch25th
Ai open powermeetupmarch25th
 

Recently uploaded

办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 

Recently uploaded (20)

办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 

Intro to LLMs

  • 1. Intro to LLMs Loic Merckel Septembre 2023 linkedin.com/in/merckel
  • 2. 1966: ELIZA Image source: en.wikipedia.org/wiki/ELIZA#/media/File:ELIZA_conversation.png “While ELIZA was capable of engaging in discourse, it could not converse with true understanding. However, many early users were convinced of ELIZA's intelligence and understanding, despite Weizenbaum's insistence to the contrary.” Source: en.wikipedia.org/wiki/ELIZA (and references therein).
  • 3. 2005: SCIgen - An Automatic CS Paper Generator nature.com/articles/d41586-021-01436-7 news.mit.edu/2015/how-three-mit-students-fooled-scientific-journals-0414 A project using a rather rudimentary technology that aimed to "maximize amusement, rather than coherence" is still the cause of troubles today... pdos.csail.mit.edu/archive/scigen
  • 4. 2017: Google Revolutionized Text Generation ■ Vaswani (2017), Attention Is All You Need (doi.org/10.48550/arXiv.1706.03762) ■ openai.com/research/better-language-models Image generated with DALL.E: “A small robot standing on the shoulder of a giant robot” (and slightly modified with The Gimp) OpenAI’s Generative Pre-trained Transformer (DALL.E, 2021; ChatGPT, 2022), as the name suggests, reposes on Transformers. Google introduced the Transformer, which rapidly became the state-of-the-art approach to solve most NLP problems.
  • 5. ● Kiela et al. (2021), Dynabench: Rethinking Benchmarking in NLP: arxiv.org/abs/2104.14337 ● Roser (2022), The brief history of artificial intelligence: The world has changed fast – what might be next?: ourworldindata.org/brief-history-of-ai Transformers 2017 Text and shapes in blue have been added to the original work from Max Roser.
  • 6. What Are Transformers? Source: Vaswani (2017), Attention Is All You Need (doi.org/10.48550/arXiv.1706.03762) Generative (deep learning) models for understanding and generating text, images and many other types of data. Transformers analyze chunks of data, called "tokens" and learn to predict the next token in a sequence, based on previous and, if available, following tokens. The auto-regressive concept means that the output of the model, such as the prediction of a word in a sentence, is influenced by the previous words it has generated. Music—MusicLM (Google) and Jukebox (OpenAI) generate music from text. Image—Imagen (Google) and DALL.E (OpenAI) generate novel images from text. Texte—OpenAI’s GPT has become widely known, but other players have similar technology (including Google, Meta, Anthropic and others). Others—Recommender (movies, books, flight destinations), drug discovery… Models that learn from a given dataset how to generate new data instances.
  • 7. 2022: ChatGPT “ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users in January, just two months after launch, making it the fastest-growing consumer application in history” statista.com/chart/29174/time-to-one-million-users Reuters, Feb 1, 2023 https://reut.rs/3yQNlGo
  • 8. The Mushrooming of Transformer-Based LLMs PaML (540b), LaMDA (137b) and others (Bard relies on LaMDA) OPT-IML (175b), Galactica (120b), BlenderBot3 (175b), Llama 2 (70b) ERNIE 3.0 Titan (260b) GPT-3 (175b), GPT-3.5 (?b), GPT-4 (?b) BLOOM (176b) PanGu-𝛼 (200b) Jurassic-1 (178b), Jurassic-2 (?b) Exaone (300b) Megatron-Turing NLG (530b) (It appears that all those models rely only on transformer-based decoders)
  • 12. AI Mentions Boost Stock Prices ● AI-mentioning companies: +4.6% avg. stock price increase (nearly double of the non-mentioning). ● In general, 67% of companies that mentioned AI observed an increase in their stock prices → +8.5% on average. ● Tech companies: 71% → +11.9% on avg. ● Non-tech companies: 65% → +6.8% on avg. - Mentions of "AI" and related terms (machine learning, automation, robots, etc.). - S&P 500 companies in 2023. - 3-day change from the date the earnings call transcript was published. Source: wallstreetzen.com/blog/ai-mention-moves-stock-prices-2023
  • 13. GPUs Demand Skyrockets Before LLMs, GPUs were primarily needed for training, and CPUs were used for inference. However, with the emergence of LLMs, GPUs have become almost essential for both tasks. Paraphrasing Brannin McBee, co-founder of CoreWeave, in Bloomberg Podcast*: While you may train the model using 10,000 GPUs, the real challenge arises when you need 1 million GPUs to meet the entire inference demand. This surge in demand is expected during the initial one to two years after the launch, and it's likely to keep growing thereafter. * How to Build the Ultimate GPU Cloud to Power AI | Odd Lots (youtube.com/watch?v=9OOn6u6GIqk&t=1308s)
  • 14. Enhancing Productivity With Generative AI? nature.com/articles/d41586-023-02270-9 science.org/doi/10.1126/science.adh2586
  • 16. Beware of “Hallucinations” Which Do Remain Very Real “Hallucinations” are “confident statements that are not true”1 . For the moment, this phenomenon inexorably affects all known LLMs. 1: fr.wikipedia.org/wiki/Hallucination_(intelligence_artificielle) Yves Montand in “Le Cercle Rouge” during an attack of delirium tremens This thing probably doesn't exist.
  • 17. Concrete Hallucinations (GPT-4) We asked ChatGPT the first part of the third question of the British Mathematical Olympiad 1977: bmos.ukmt.org.uk/home/bmo-1977.pdf Is that so? Although not an obvious hallucination, it may remind us of Fermat’s lack of space in the margin to give the proof of his last theorem… Perhaps here there is a lack of tokens? Here a total hallucination, this statement is evidently false. Perhaps it meant “the product of two negative numbers” Here a total hallucination, this statement is evidently false. (Although in this case the inequality is indeed clearly true.)
  • 18. The Saga of the Lawyer Who Used ChatGPT nytimes.com/2023/06/08/nyregion/law yer-chatgpt-sanctions.html nytimes.com/2023/05/27/nyregion/avia nca-airline-lawsuit-chatgpt.html nytimes.com/2023/06/22/nyregion/la wyers-chatgpt-schwartz-loduca.html
  • 19. ChatGPT: Achieving Human-Level Performance in Professional and Academic Benchmarks ● GPT-4's performance in recent tests is undeniably impressive. ● Study conducted by OpenAI (openai.com/papers/gpt-4.pdf). ● Most of those tests mainly focus on high school-level content. ● Many are prepared through test prep courses and resources. ● By contrast, university exams typically require a deeper understanding of course material and critical thinking skills. ● Uniform Bar Exam: Worth noting, but potential overestimation concerns (see dx.doi.org/10.2139/ssrn.4441311).
  • 20. Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models Published on Jun 15, 2023 Authors: Sarah J. Zhang, Samuel Florin, Ariel N. Lee, Eamon Niknafs, Andrei Marginean, Annie Wang, Keith Tyser, Zad Chin, Yann Hicke, Nikhil Singh, Madeleine Udell, Yoon Kim, Tonio Buonassisi, Armando Solar-Lezama, Iddo Drori Abstract We curate a comprehensive dataset of 4,550 questions and solutions from problem sets, midterm exams, and final exams across all MIT Mathematics and Electrical Engineering and Computer Science (EECS) courses required for obtaining a degree. We evaluate the ability of large language models to fulfill the graduation requirements for any MIT major in Mathematics and EECS. Our results demonstrate that GPT-3.5 successfully solves a third of the entire MIT curriculum, while GPT-4, with prompt engineering, achieves a perfect solve rate on a test set excluding questions based on images. We fine-tune an open-source large language model on this dataset. We employ GPT-4 to automatically grade model responses, providing a detailed performance breakdown by course, question, and answer type. By embedding questions in a low-dimensional space, we explore the relationships between questions, topics, and classes and discover which questions and classes are required for solving other questions and classes through few-shot learning. Our analysis offers valuable insights into course prerequisites and curriculum design, highlighting language models' potential for learning and improving Mathematics and EECS education. Source: arxiv.org/abs/2306.08997 i.e., GPT-4 scored 100% on MIT EECS Curriculum (Electrical Engineering and Computer Science)
  • 21. “No, GPT4 can’t ace MIT” Three MIT undergrads have debunked the myth. - 4% of the questions were unsolvable. (How did GPT-4 achieve 100%?) - Information leak in some few-shot prompts: for those, the answer was quasi-given in the question. - The automatic grading using GPT-4 itself has some severe issues: prompt cascade that reprompted (many times) when the given answer was deemed incorrect. 16% of the questions were multi-choices questions, hence a quasi-guaranteed correct response. - Bugs found in the research script that raise serious questions regarding the soundness of the study. Source: flower-nutria-41d.notion.site/No-GPT4-can-t-ace-MIT-b27e6796ab5a48368127a98216c76864 Note: The paper has since been withdrawn (see official statement at people.csail.mit.edu/asolar/CoursesPaperStatement.pdf)
  • 22. Chemistry May Not Be ChatGPT Cup of Tea A study conducted by three researchers of the University of Hertfordshire (UK) showed that ChatGPT is not a fan of chemistry. Real exams were used, and the authors note that “[a] well-written question item aims to create intellectual challenge and to require interpretation and inquiry. Questions that cannot be easily ‘Googled’ or easily answered through a single click in an internet search engine is a focus.” “The overall grade on the year 1 paper calculated from the top four graded answers would be 34.1%, which does not meet the pass criteria. The overall grade on the year 2 paper would be 18.3%, which does not meet the pass criteria.” Source: Fergus et al., 2023, Evaluating Academic Answers Generated Using ChatGPT (pubs.acs.org/doi/10.1021/acs.jchemed.3c00087)
  • 23. The “Drift” Phenomenon Sources: - wsj.com/articles/chatgpt-openai-math-artificial-intelligence-8aba83f0 - Chaîne et al., 2023, arxiv.org/abs/2307.09009 ● New research from Stanford and UC Berkeley highlights a fundamental challenge in AI development: "drift." ● Drift occurs when improving one aspect of complex AI models leads to a decline in performance in other areas. ● ChatGPT has shown deterioration in basic math operations despite advancements in other tasks. ● GPT-4 exhibits reduced responsiveness to chain-of-thought prompting (may be intended to mitigate potential misuse with malicious prompts). The “behavior of the ‘same’ LLM service can change substantially in a relatively short amount of time, highlighting the need for continuous monitoring of LLMs” (Chain et al., 2023).
  • 24. Techniques for Tailoring LLMs to Specific Problems Prompts Engineering Fine-Tuning Reinforcement Learning From Human Feedback (RLHF)
  • 25. First We Must Have a Problem to Solve… Source: DeepLearning.AI, licensed under CC BY-SA 2.0
  • 26. Then We Need a Model Commercial APIs - Google, OpenAI, Anthropic, Microsoft... - Privacy concerns may arise. - No specific hardware requirement. - Prompt engineering (OpenAI offers prompt fine-tuning). Use a foundation model (many open sources models are available) - As it is (prompt engineering), - or fine-tuned (either full or parameter efficient fine-tuning). - May required specific hardware/infrastructure for hosting, fine-tuning and inferences. Train a model from the scratch - Requires huge resources (both data and computing power). - (e.g., BloombergGPT, arxiv.org/abs/2303.17564.)
  • 27. A Plethora of Open Source Pre-Trained Models huggingface.co/models Models should be selected depending on: ● The problem at hand. ● The strength of the model. ● The operating costs (larger models require more resources). ● Other considerations (e.g., license).
  • 28. Prompt Engineering: “Query Crafting” Improving the output with actions like phrasing queries, specifying styles, providing context, or assigning roles (e.g., 'Act as a mathematics teacher') (Wikipedia, 2023). Some hints can be found in OpenAI’s “GPT best practices” (OpenAi, 2023). Chain-of-thought: popular technique consisting in “guiding [LLMs] to produce a sequence of intermediate steps before giving the final answer” (Wei et al., 2022). Sources: - Wei, J.et al., 2022. Emergent abilities of large language models, arxiv.org/abs/2206.07682 - OpenAI, 2023, platform.openai.com/docs/guides/gpt-best-practices/six-strategies-for-getting-better-results - Wikipedia, 2023, , Prompt Engineering, en.wikipedia.org/wiki/Prompt_engineering (graph from Wei et al., 2022) About GSM8K benchmark: arxiv.org/abs/2110.14168
  • 29. Prompt Engineering: In-Context Learning (ICL) In-Context Learning (ICL) consists in “a few input-output examples in the model’s context (input) as a preamble before asking the model to perform the task for an unseen inference-time example” (Wei et al., 2022). It is a kind of “ephemeral supervised learning.” - Zero-shot prompting or Zero-shot learning: no example given (for largest LLMs, smaller ones may struggle). - One-shot prompting: one example provided. - Few-shot prompting: a few examples (typically 3~6). ⚠ Context window limits (e.g., 4096 tokens). Tweet: @lufthansa Please find our missing luggage!! Sentiment: negative Tweet: Will be on LH to FRA very soon. Cheers! Sentiment: positive Tweet: Refused to compensate me for 2 days cancelled flights . Joke of a airline Sentiment: LLM negative Example of an input and output for two-shot prompting Source: Wei, J.et al., 2022. Emergent abilities of large language models, arxiv.org/abs/2206.07682
  • 30. Fine-Tuning: Introduction Few-shot learning: - May not be sufficient for smaller models. - Consumes tokens from the context window. Fine-tuning is a supervised learning process that leads to a new model (in contrast with in-context learning, which is “ephemeral”). Task-specific prompt-completion pairs are required. [Diagram: Base LLM + task-specific prompt-completion pairs (prompt_1, completion_1), (prompt_2, completion_2), …, (prompt_n, completion_n) → Fine-tuned LLM]
  • 31. Full Fine-Tuning: Updating All Parameters Fine-tuning very often means “instruction fine-tuning.” Instruction fine-tuning: each prompt-completion pair includes a specific instruction (summarize this, translate that, classify this tweet, …). ● Fine-tuning on a single task (e.g., summarization) may lead to a phenomenon referred to as “catastrophic forgetting” (arxiv.org/pdf/1911.00202), where the model loses its abilities on other tasks (which may not be a business issue, though). ● Fine-tuning on multiple tasks (e.g., summarization, translation, classification, …) avoids this, but requires a lot more training data. (E.g., see FLAN in Wei et al., 2022.) Full fine-tuning is extremely resource-demanding, even more so for large models. Source: Wei et al., 2022, Finetuned Language Models Are Zero-Shot Learners. arxiv.org/abs/2109.01652
  • 32. Parameter Efficient Fine-Tuning (PEFT) Unlike full fine-tuning, PEFT preserves the vast majority of the weights of the original model. ● Less prone to “catastrophic forgetting” on a single task. ● Often a single GPU is enough. Three methods: ● Selective—fine-tune only a subset of the initial parameters. ● Reparameterization—reparameterize the model weights using a low-rank representation, e.g., LoRA (Hu et al., 2021); see the sketch below. ● Additive—add trainable layers or parameters to the model, with two approaches: - Adapters: add new trainable layers to the architecture of the model. - Soft prompts: add trainable tokens to the input (this is not prompt engineering). Sources: - coursera.org/learn/generative-ai-with-llms/lecture/rCE9r/parameter-efficient-fine-tuning-peft - Hu et al., 2021, LoRA: Low-Rank Adaptation of Large Language Models. arxiv.org/abs/2106.09685
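  A minimal LoRA sketch with Hugging Face's peft library (hyper-parameters are illustrative; "gpt2" is a small stand-in base model):

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, TaskType, get_peft_model

    base_model = AutoModelForCausalLM.from_pretrained("gpt2")
    lora_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=8,             # rank of the low-rank update matrices
        lora_alpha=32,   # scaling factor applied to the update
        lora_dropout=0.1,
    )
    model = get_peft_model(base_model, lora_config)
    # Only the small LoRA matrices are trainable; the base weights stay frozen.
    model.print_trainable_parameters()

  The wrapped model can then be trained with a standard training loop or the transformers Trainer; only the low-rank matrices receive gradient updates.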
  • 33. Fine-Tuning With OpenAI GPT (PEFT) The OpenAI API offers fine-tuning for gpt-3.5-turbo, but not “yet” for GPT-4. platform.openai.com/docs/guides/fine-tuning
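  A sketch of the corresponding API calls (pre-1.0 openai package; the training file and its single example are illustrative, and a real job requires more examples):

    import json
    import openai

    # Each training example is a short chat in the gpt-3.5-turbo format.
    examples = [
        {"messages": [
            {"role": "user", "content": "Tweet: Lost my bag again..."},
            {"role": "assistant", "content": "negative"},
        ]},
        # ... more prompt-completion pairs ...
    ]
    with open("train.jsonl", "w") as f:
        for example in examples:
            f.write(json.dumps(example) + "\n")

    uploaded = openai.File.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
    job = openai.FineTuningJob.create(training_file=uploaded["id"], model="gpt-3.5-turbo")
    print(job["id"])  # the job runs asynchronously on OpenAI's side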
  • 34. Reinforcement Learning From Human Feedback LLMs are trained on web data containing a lot of irrelevant material (unhelpful) or, worse, abundant false (dishonest) and/or harmful information, e.g., ● Potentially dangerous false medical advice. ● Valid techniques for illegal activities (hacking, deceiving, building weapons, …). HHH (Helpful, Honest & Harmless) alignment (Askell et al., 2021): ensuring that the model's behavior and outputs are consistent with human values, intentions, and ethical standards. Reinforcement Learning from Human Feedback, or RLHF (Casper et al., 2023): ● “is a technique for training AI systems to align with human goals.” ● “[It] has emerged as the central method used to finetune state-of-the-art [LLMs].” ● It rests on human judgment and consensus. Sources: - Casper et al., 2023, Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. arxiv.org/abs/2307.15217 - Ziegler et al., 2019, Fine-Tuning Language Models from Human Preferences. arxiv.org/abs/1909.08593 - Askell et al., 2021, A General Language Assistant as a Laboratory for Alignment. arxiv.org/abs/2112.00861
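  The "show two outputs, ask which one is better" signal (see the next slide) is typically distilled into a reward model; one common pairwise (Bradley-Terry style) formulation of its loss can be sketched in PyTorch as follows (a sketch, not the exact loss of any particular system):

    import torch
    import torch.nn.functional as F

    def preference_loss(reward_chosen, reward_rejected):
        # Push the reward of the human-preferred completion above that of
        # the rejected one; both arguments are tensors of scalar scores.
        return -F.logsigmoid(reward_chosen - reward_rejected).mean()

    # Toy scores for a batch of three prompt/completion pairs.
    loss = preference_loss(torch.tensor([1.2, 0.3, 0.8]), torch.tensor([0.4, 0.9, 0.1]))

  The reward model's scores then serve as the reinforcement signal (e.g., via PPO) to fine-tune the LLM itself.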
  • 35. What Is RLHF by Sam Altman 5:59 What is RLHF? Reinforcement Learning with Human Feedback, … 6:07 … So, we trained these models on a lot of text data and, in that process, they learned the underlying, …. And they can do amazing things. 6:26 But when you first play with that base model, that we call it, after you finish training, … it can do a lot of, you know, there's knowledge in there. But it's not very useful or, at least, it's not easy to use, let's say. And RLHF is how we take some human feedback, 6:45 the simplest version of this is show two outputs, ask which one is better than the other, 6:50 which one the human raters prefer, and then feed that back into the model with reinforcement learning. 6:56 And that process works remarkably well with, in my opinion, remarkably little data to make the model more useful. So, RLHF is how we align the model to what humans want it to do. Sam Altman: OpenAI CEO on GPT-4, ChatGPT, and the Future of AI | Lex Fridman Podcast #367 (youtu.be/L_Guz73e6fw?si=vfkdtNCyrQa1RzZR&t=359)
  • 36. Source: Liu et al., 2022, Aligning Generative Language Models with Human Values. aclanthology.org/2022.findings-naacl.18 RLHF: Example of Alignment Tasks
  • 38. Assessing and Comparing LLMs Metrics while training the model—ROUGE (summarization) or BLEU (translation). Benchmarks—A non-exhaustive list: - ARC (Abstraction and Reasoning Corpus, arxiv.org/pdf/2305.18354), - HellaSwag (arxiv.org/abs/1905.07830), - TruthfulQA (arxiv.org/abs/2109.07958), - GLUE & SuperGLUE (General Language Understanding Evaluation, gluebenchmark.com), - HELM (Holistic Evaluation of Language Models, crfm.stanford.edu/helm), - MMLU (Massive Multitask Language Understanding, arxiv.org/abs/2009.03300), - BIG-bench (arxiv.org/pdf/2206.04615). Others—“Auto-Eval of Question-Answering Tasks” (blog.langchain.dev/auto-eval-of-question-answering-tasks).
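  For the training-time metrics, a minimal sketch of a ROUGE computation with Hugging Face's evaluate library (the two sentences are toy data):

    import evaluate

    rouge = evaluate.load("rouge")
    scores = rouge.compute(
        predictions=["the cat sat on the mat"],
        references=["a cat was sitting on the mat"],
    )
    print(scores)  # rouge1, rouge2, rougeL F-measures in [0, 1]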
  • 39. Source: Wu et al., 2023, BloombergGPT: A Large Language Model for Finance. arxiv.org/abs/2303.17564 (Table 13: “BIG-bench hard results using standard 3-shot prompting”)
  • 40. Source: Touvron et al., 2023, Llama 2: Open Foundation and Fine-Tuned Chat Models, scontent-fra3-1.xx.fbcdn.net/v/t39.2365-6/10000000_662098952474184_2584067087619170692_n.pdf
  • 42. Question ChatGPT About the Latest Financial Reports? —blog.langchain.dev/tutorial-chatgpt-over-your-data “[ChatGPT] doesn’t know about your private data, it doesn’t know about recent sources of data. Wouldn’t it be useful if it did?”
  • 43. Workflow Overview Question: « Quels vont être les dividendes payés par action par le Groupe Crit ? » (“What dividend per share will Groupe CRIT pay?”) Answer: « Le Groupe CRIT proposera lors de sa prochaine Assemblée Générale, le 9 juin 2023, le versement d'un dividende exceptionnel de 3,5 € par action. » (“At its next Annual General Meeting, on June 9, 2023, Groupe CRIT will propose the payment of an exceptional dividend of €3.5 per share.”) The question and associated answer form a real example (the LLM was “gpt-3.5-turbo” from OpenAI). Technique described in: Lewis et al., 2020, Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (doi.org/10.48550/arXiv.2005.11401). [Diagram: 1. split the documents into chunks; 2. compute embeddings; 3. index them in a vector store; then, for each question, extract the relevant information (“context”) and generate a prompt accordingly (“question + context”) for the LLM.]
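  A minimal sketch of this workflow with LangChain (2023-era API) and FAISS as the vector store; the file name is hypothetical and an OpenAI API key is assumed to be configured:

    from langchain.chat_models import ChatOpenAI
    from langchain.document_loaders import UnstructuredHTMLLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import FAISS
    from langchain.chains import RetrievalQA

    docs = UnstructuredHTMLLoader("report.xhtml").load()       # load the report
    chunks = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=100
    ).split_documents(docs)                                    # 1. split into chunks
    store = FAISS.from_documents(chunks, OpenAIEmbeddings())   # 2. embeddings, 3. vector store
    qa = RetrievalQA.from_chain_type(
        llm=ChatOpenAI(model_name="gpt-3.5-turbo"),
        retriever=store.as_retriever(),  # fetches the relevant "context"
    )
    print(qa.run("Quels vont être les dividendes payés par action par le Groupe Crit ?"))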
  • 44. Preliminary Prototype Financial reports are retrieved directly from the French AMF (“Autorité des marchés financiers”) via their API (info-financiere.fr). The source is an XHTML document in French. The question and answer shown are in English (they would be in French had the question been asked in French).
  • 45. Except where otherwise noted, this work is licensed under https://creativecommons.org/licenses/by/4.0/ 619.io