Implications of GPT-3
Raven Jiang
raven@cs.stanford.edu
Overview
This document covers:
1. What is GPT-3?
2. How is GPT-3 different?
3. OpenAI’s API strategy
4. Potential commercial implications
Details and charts from the OpenAI paper and a talk given on 7/24 by Ben Mann, the second author.
Please send feedback to Raven Jiang (raven@cs.stanford.edu)
Disclaimer: I am not affiliated with OpenAI, nor am I an expert in deep learning, but I have
practical knowledge of its implementation.
What is GPT-3?
• Text generator deep learning model trained by OpenAI
• Transformer architecture pioneered by Google
• Same architecture family as GPT-2, BERT, XLNet, and RoBERTa
• Task agnostic
• Unsupervised learning
• More than 100x larger (more parameters) than its predecessor GPT-2 (2019)
• Estimated to have cost about $12 million in compute to train
• Trained on text data from books and websites
What does it do?
• Seemingly many things
• Translation
• Write new poetry
• Generate stories
• Have a conversation
• Answer questions
• Generate working React code
• Generate Figma designs
• Magical VLOOKUP backed by the Internet
• Maybe creativity is not hard for AI
GPT-3 training data
Share of the training mix by source:
• Common Crawl: 59%
• WebText2: 22%
• Books1: 8%
• Books2: 8%
• Wikipedia: 3%
Notes:
• Common Crawl is scraped web data, automatically filtered for some quality issues
• Books1 and Books2 are mostly fiction
• Books2 includes non-English content
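The mix above can be read as sampling weights. The sketch below is only an illustration of that reading: it checks that the slide's percentages add up and draws source names in proportion to their share. The function name `sample_sources` and the fixed seed are assumptions for the example, not anything taken from OpenAI's training code.

```python
import random

# Mixture percentages exactly as listed on the slide.
TRAINING_MIX = {
    "Common Crawl": 59,
    "WebText2": 22,
    "Books1": 8,
    "Books2": 8,
    "Wikipedia": 3,
}


def sample_sources(mix, n, seed=0):
    """Draw n source names with probability proportional to their share.

    Hypothetical helper: it only illustrates what a weighted mix means.
    """
    rng = random.Random(seed)
    names = list(mix)
    weights = [mix[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


total = sum(TRAINING_MIX.values())
web_share = TRAINING_MIX["Common Crawl"] + TRAINING_MIX["WebText2"]
print(f"total={total}%, web-derived share={web_share}%")
print(sample_sources(TRAINING_MIX, 5))
```

Under these numbers, the two web-derived sources account for roughly four fifths of the mix.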
Transformer architecture
Transformer-based models vs. older NLP neural network models
Examples
• Transformer-based: Google’s BERT, OpenAI’s GPT-3, Microsoft’s Turing-NLG
• Older: Google’s GNMT
Task
• Transformer-based: Task-agnostic. The same model is successful at many different language tasks without additional training
• Older: Trained for a specific task. Models are usually trained for a certain task and fine-tuned for a related task with additional training data (e.g. Transfer Learning)
Training
• Transformer-based: Unsupervised training. Model is trained with large collections of text without special annotations
• Older: Supervised training. Trained with large quantities of input annotated with expected output, usually human-generated
Transformers are the state of the art for NLP neural networks
Example translation workflow
Goal 1: Find the French translation for “You pass butter.”
Transformer
1. Train on unrelated English and French text
2. Query describes desired pattern:
"""Translate these sentences:
Hello => Bonjour
That is a cat => C'est un chat
You pass butter =>"""
3. Result:
"""Translate these sentences:
Hello => Bonjour
That is a cat => C'est un chat
You pass butter => Tu passes du beurre"""
Pre-Transformer Language Models
1. Create dataset of English-French text examples
2. Train on dataset
3. Query:
"You pass butter"
4. Result:
"Tu passes du beurre"
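The query in the Transformer workflow is just a string with a recognizable pattern. As a minimal sketch, the helper below assembles that exact string from an instruction, a few worked examples, and the new input. The name `build_few_shot_prompt` is hypothetical; no model is called here.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a pattern-style prompt: instruction, worked examples, open slot.

    Hypothetical helper for illustration; it only builds the text that
    would be sent to a text-completion model.
    """
    lines = [instruction]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    # Leave the last line unfinished so the model fills in the answer.
    lines.append(f"{query} =>")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate these sentences:",
    [("Hello", "Bonjour"), ("That is a cat", "C'est un chat")],
    "You pass butter",
)
print(prompt)
```

The examples play the role that an annotated training set plays in the older workflow, except that they are supplied at query time rather than at training time.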
Generalizability of Transformer models
Goal 2: Tell some dad jokes
Transformer
1. Use the same model as the previous task
2. Query describes new pattern:
"""Here are some great dad jokes:
Q: How do you make a lemon drop? A: Let it fall.
Q: What has ears but cannot hear? A: A cornfield.
Q:"""
3. Result:
"""Here are some great dad jokes:
Q: How do you make a lemon drop? A: Let it fall.
Q: What has ears but cannot hear? A: A cornfield.
Q: How does a vampire start a letter? A: Dear blood."""
Pre-Transformer Language Models
1. Create/source a new annotated dataset suited for the new task
2. Retrain the model either with Transfer Learning or from scratch
3. Query
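Switching tasks only changes the text of the query, and the result repeats the query followed by new text. The sketch below builds the dad-joke query and separates the newly generated part from the echoed prompt. Both function names are hypothetical, and the `result` string is copied from the slide rather than produced by a model.

```python
def build_qa_prompt(header, qa_pairs):
    """Build a Q/A pattern prompt that ends with an open 'Q:' slot."""
    lines = [header]
    for question, answer in qa_pairs:
        lines.append(f"Q: {question} A: {answer}")
    lines.append("Q:")
    return "\n".join(lines)


def extract_new_text(prompt, result):
    """Return only the text appended after the prompt.

    Assumes the result begins with the prompt verbatim, as on the slide.
    """
    if not result.startswith(prompt):
        raise ValueError("result does not begin with the prompt")
    return result[len(prompt):].strip()


prompt = build_qa_prompt(
    "Here are some great dad jokes:",
    [
        ("How do you make a lemon drop?", "Let it fall."),
        ("What has ears but cannot hear?", "A cornfield."),
    ],
)
# Stand-in for the model's result, taken from the slide.
result = prompt + " How does a vampire start a letter? A: Dear blood."
new_joke = extract_new_text(prompt, result)
print(new_joke)
```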
How is GPT-3 different?
• It is huge.
• 175 billion parameters
• Its predecessor GPT-2 has 1.5 billion parameters
• The previous record holder, Microsoft’s Turing-NLG, has 17 billion parameters
• Innovation of scale, not technique
[Bar chart: Parameters (Billions) for GPT-2 (2019/02), Turing-NLG (2020/02), and GPT-3 (2020/07)]
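The parameter counts above translate into simple ratios. The sketch below computes how many times larger GPT-3 is than each earlier model, plus a rough size for the raw weights. The 2 bytes per parameter figure (16-bit storage) is an assumption for illustration, not something stated in the deck.

```python
# Parameter counts in billions, as listed on the slide.
PARAMS_BILLIONS = {"GPT-2": 1.5, "Turing-NLG": 17, "GPT-3": 175}


def scale_ratio(larger, smaller):
    """How many times more parameters `larger` has than `smaller`."""
    return PARAMS_BILLIONS[larger] / PARAMS_BILLIONS[smaller]


def weight_size_gb(model, bytes_per_param=2):
    """Rough size of the raw weights in GB (10^9 bytes).

    bytes_per_param=2 assumes 16-bit storage; this is an illustrative
    assumption, not a published figure.
    """
    return PARAMS_BILLIONS[model] * bytes_per_param


print(f"GPT-3 vs GPT-2: {scale_ratio('GPT-3', 'GPT-2'):.1f}x")
print(f"GPT-3 vs Turing-NLG: {scale_ratio('GPT-3', 'Turing-NLG'):.1f}x")
print(f"GPT-3 weights at 2 bytes/param: {weight_size_gb('GPT-3')} GB")
```

So GPT-3 is roughly 117 times the size of GPT-2 and about 10 times the size of Turing-NLG.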
Power of scale
• Scale made a dramatic difference in performance
• Accuracy increased from 25% to 65% on a specific benchmarking task going from 13B parameters to 175B parameters
Uncanny Valley
• Participants were asked to spot fake news generated by GPT-3
• More parameters = harder to spot
• Human accuracy is very close to chance (50-50) at GPT-3 scale
Returns to scale
• Task performance appears to continue improving with scale
• How will GPT-4 perform?
Consequences of scale
• Querying is extremely powerful
• Unexpectedly good performance on a large variety of tasks
• Compared to older task-specific models, API-only access is useful for a broader range of applications
• Caveat 1: performance is probably still inferior to task-specific models
• Caveat 2: performance may continue to improve with scale
OpenAI’s API strategy
• Gated API access to selected partners
• No access to underlying GPT-3 model and its trained weights
• Turns NLP from annotation/training problem into meta-programming
• Designing queries to yield useful results on a range of language problems
• Much friendlier paradigm for small teams and product-driven startups
• MaaS (Model as a Service) is a viable business(?)
• Extremely large NLP models as OpEx instead of CapEx
• No need to fine-tune a model with additional training data for each new problem
• Concerns over AGI risk
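The "meta-programming" point can be sketched as follows: a language task becomes a prompt template plus a call to some text-completion function. Everything here is hypothetical. `make_task` and `fake_complete` are illustrative names, and `fake_complete` is an offline stand-in that returns a canned answer; it is not OpenAI's API.

```python
from typing import Callable

# Any function that maps a prompt to generated text.
Completer = Callable[[str], str]


def make_task(template: str, complete: Completer) -> Callable[[str], str]:
    """Turn a prompt template into a callable task.

    The template must contain a single {text} placeholder.
    """
    def run(text: str) -> str:
        prompt = template.format(text=text)
        return complete(prompt).strip()
    return run


def fake_complete(prompt: str) -> str:
    """Offline stand-in for a hosted model; returns a canned answer."""
    canned = {"You pass butter =>": " Tu passes du beurre"}
    last_line = prompt.splitlines()[-1]
    return canned.get(last_line, "")


translate = make_task(
    "Translate these sentences:\nHello => Bonjour\n{text} =>",
    fake_complete,
)
print(translate("You pass butter"))
```

Under this sketch, adding a new feature means writing a new template rather than collecting annotations and training a model, which is the shift the slide describes.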
Potential commercial implications
• Access to GPT-3 (or future GPT-4) API accelerates go-to-market speed
of a startup doing applied NLP
• Build MVP using GPT-3 without investing in any training data or infrastructure
• Switch to better performing fine-tuned models over time
• Companies like Grammarly may face low-cost competitors
• Building NLP-powered product features may be as simple as
programming GPT-3 to answer the right questions
• Caveat: Only if GPT-3 (or GPT-4) turns out to be Good Enough for
these applications. Unclear without wider access to the API
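One way to keep the "build the MVP now, switch to a fine-tuned model later" path open is to hide the model behind a small interface. The sketch below is an assumption about how a startup might structure this; all class names are hypothetical and both backends return placeholder strings instead of calling a real model.

```python
from typing import Protocol


class TextBackend(Protocol):
    """Anything that can complete a prompt."""

    def complete(self, prompt: str) -> str: ...


class HostedModelBackend:
    """Placeholder for a general-purpose hosted model (MVP stage)."""

    def complete(self, prompt: str) -> str:
        return "[hosted model output]"


class FineTunedBackend:
    """Placeholder for an in-house fine-tuned model (later stage)."""

    def complete(self, prompt: str) -> str:
        return "[fine-tuned model output]"


class GrammarFeature:
    """Product feature that depends only on the TextBackend interface."""

    def __init__(self, backend: TextBackend) -> None:
        self.backend = backend

    def suggest(self, sentence: str) -> str:
        prompt = f"Correct the grammar:\n{sentence} =>"
        return self.backend.complete(prompt)


mvp = GrammarFeature(HostedModelBackend())
later = GrammarFeature(FineTunedBackend())
print(mvp.suggest("Me and him goes to school"))
print(later.suggest("Me and him goes to school"))
```

The feature code stays the same when the backend changes, so the switch is a configuration decision rather than a rewrite.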
Apps using OpenAI API
Conclusion
• Task-agnostic NLP models that deliver acceptable performance may
soon be available as a service
• GPT-3 may be that model
• Potential explosion of startups building MVPs on such an API
• Investor warning: startups dependent on the API may lack expertise
and tools to iterate off MVP
• Happy to chat more: raven@cs.stanford.edu

More Related Content

What's hot

OpenAI Chatgpt.pptx
OpenAI Chatgpt.pptxOpenAI Chatgpt.pptx
OpenAI Chatgpt.pptx
Nawroz University
 
Fine tuning large LMs
Fine tuning large LMsFine tuning large LMs
Fine tuning large LMs
SylvainGugger
 
ChatGPT vs. GPT-3.pdf
ChatGPT vs. GPT-3.pdfChatGPT vs. GPT-3.pdf
ChatGPT vs. GPT-3.pdf
Addepto
 
CHATGPT.pptx
CHATGPT.pptxCHATGPT.pptx
CHATGPT.pptx
SajedRahman2
 
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAutomate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Anant Corporation
 
LLMs Bootcamp
LLMs BootcampLLMs Bootcamp
LLMs Bootcamp
Fiza987241
 
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
David Talby
 
Large Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfLarge Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdf
David Rostcheck
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language Models
Leon Dohmen
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
DianaGray10
 
Breaking down the AI magic of ChatGPT: A technologist's lens to its powerful ...
Breaking down the AI magic of ChatGPT: A technologist's lens to its powerful ...Breaking down the AI magic of ChatGPT: A technologist's lens to its powerful ...
Breaking down the AI magic of ChatGPT: A technologist's lens to its powerful ...
rahul_net
 
What Are the Problems Associated with ChatGPT?
What Are the Problems Associated with ChatGPT?What Are the Problems Associated with ChatGPT?
What Are the Problems Associated with ChatGPT?
Windzoon Technologies
 
ChatGPT Use- Cases
ChatGPT Use- Cases ChatGPT Use- Cases
ChatGPT Use- Cases
Bluechip Technologies
 
Let's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchersLet's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchers
Steven Van Vaerenbergh
 
ChatGPT ppt.pptx
ChatGPT  ppt.pptxChatGPT  ppt.pptx
ChatGPT ppt.pptx
YuvrajS9
 
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Mihai Criveti
 
GPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersGPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask Learners
Young Seok Kim
 
Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Neural Language Generation Head to Toe
Neural Language Generation Head to Toe
Hady Elsahar
 
Generative AI
Generative AIGenerative AI
Generative AI
All Things Open
 
Uses of AI text bot.pdf
Uses of AI text bot.pdfUses of AI text bot.pdf
Uses of AI text bot.pdf
SreeNivas983124
 

What's hot (20)

OpenAI Chatgpt.pptx
OpenAI Chatgpt.pptxOpenAI Chatgpt.pptx
OpenAI Chatgpt.pptx
 
Fine tuning large LMs
Fine tuning large LMsFine tuning large LMs
Fine tuning large LMs
 
ChatGPT vs. GPT-3.pdf
ChatGPT vs. GPT-3.pdfChatGPT vs. GPT-3.pdf
ChatGPT vs. GPT-3.pdf
 
CHATGPT.pptx
CHATGPT.pptxCHATGPT.pptx
CHATGPT.pptx
 
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAutomate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
 
LLMs Bootcamp
LLMs BootcampLLMs Bootcamp
LLMs Bootcamp
 
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
 
Large Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfLarge Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdf
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language Models
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
 
Breaking down the AI magic of ChatGPT: A technologist's lens to its powerful ...
Breaking down the AI magic of ChatGPT: A technologist's lens to its powerful ...Breaking down the AI magic of ChatGPT: A technologist's lens to its powerful ...
Breaking down the AI magic of ChatGPT: A technologist's lens to its powerful ...
 
What Are the Problems Associated with ChatGPT?
What Are the Problems Associated with ChatGPT?What Are the Problems Associated with ChatGPT?
What Are the Problems Associated with ChatGPT?
 
ChatGPT Use- Cases
ChatGPT Use- Cases ChatGPT Use- Cases
ChatGPT Use- Cases
 
Let's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchersLet's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchers
 
ChatGPT ppt.pptx
ChatGPT  ppt.pptxChatGPT  ppt.pptx
ChatGPT ppt.pptx
 
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
 
GPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersGPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask Learners
 
Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Neural Language Generation Head to Toe
Neural Language Generation Head to Toe
 
Generative AI
Generative AIGenerative AI
Generative AI
 
Uses of AI text bot.pdf
Uses of AI text bot.pdfUses of AI text bot.pdf
Uses of AI text bot.pdf
 

Similar to Implications of GPT-3

SPOTLIGHT IGNITE (10 MINUTES): THE FUTURE OF DEVELOPER TOOLS: FROM STACKOVERF...
SPOTLIGHT IGNITE (10 MINUTES): THE FUTURE OF DEVELOPER TOOLS: FROM STACKOVERF...SPOTLIGHT IGNITE (10 MINUTES): THE FUTURE OF DEVELOPER TOOLS: FROM STACKOVERF...
SPOTLIGHT IGNITE (10 MINUTES): THE FUTURE OF DEVELOPER TOOLS: FROM STACKOVERF...
DevOpsDays Tel Aviv
 
Unleashing the Power of OpenAI GPT-3 in FME Data Integration Workflows
Unleashing the Power of OpenAI GPT-3 in FME Data Integration WorkflowsUnleashing the Power of OpenAI GPT-3 in FME Data Integration Workflows
Unleashing the Power of OpenAI GPT-3 in FME Data Integration Workflows
Safe Software
 
ChatGPT and OpenAI.pdf
ChatGPT and OpenAI.pdfChatGPT and OpenAI.pdf
ChatGPT and OpenAI.pdf
Sonal Tiwari
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learning
Sanghamitra Deb
 
ChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps ProductivityChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps Productivity
VictorSzoltysek
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Alok Singh
 
Use Case Patterns for LLM Applications (1).pdf
Use Case Patterns for LLM Applications (1).pdfUse Case Patterns for LLM Applications (1).pdf
Use Case Patterns for LLM Applications (1).pdf
M Waleed Kadous
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or reality
Awantik Das
 
Open, Secure & Transparent AI Pipelines
Open, Secure & Transparent AI PipelinesOpen, Secure & Transparent AI Pipelines
Open, Secure & Transparent AI Pipelines
Nick Pentreath
 
Developing Apps with GPT-4 and ChatGPT_ Build Intelligent Chatbots, Content G...
Developing Apps with GPT-4 and ChatGPT_ Build Intelligent Chatbots, Content G...Developing Apps with GPT-4 and ChatGPT_ Build Intelligent Chatbots, Content G...
Developing Apps with GPT-4 and ChatGPT_ Build Intelligent Chatbots, Content G...
BIHI Oussama
 
OpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and MisconceptionsOpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and Misconceptions
Ivo Andreev
 
PyCon Korea - Real World Graphene
PyCon Korea - Real World GraphenePyCon Korea - Real World Graphene
PyCon Korea - Real World Graphene
Marcin Gębala
 
Whats Next for Machine Learning
Whats Next for Machine LearningWhats Next for Machine Learning
Whats Next for Machine Learning
Ogilvy Consulting
 
LLMs for the “GPU-Poor” - Franck Nijimbere.pdf
LLMs for the “GPU-Poor” - Franck Nijimbere.pdfLLMs for the “GPU-Poor” - Franck Nijimbere.pdf
LLMs for the “GPU-Poor” - Franck Nijimbere.pdf
GDG Bujumbura
 
Explore The Machine Learning and TensorFlow
Explore The Machine Learning and TensorFlowExplore The Machine Learning and TensorFlow
Explore The Machine Learning and TensorFlow
MahaKhalidALhobishi
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
Lionel Briand
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabs
zekeLabs Technologies
 
Generative AI in CSharp with Semantic Kernel.pptx
Generative AI in CSharp with Semantic Kernel.pptxGenerative AI in CSharp with Semantic Kernel.pptx
Generative AI in CSharp with Semantic Kernel.pptx
Alon Fliess
 
Machine Learning for Capacity Management
 Machine Learning for Capacity Management Machine Learning for Capacity Management
Machine Learning for Capacity Management
EDB
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
Ramiro Aduviri Velasco
 

Similar to Implications of GPT-3 (20)

SPOTLIGHT IGNITE (10 MINUTES): THE FUTURE OF DEVELOPER TOOLS: FROM STACKOVERF...
SPOTLIGHT IGNITE (10 MINUTES): THE FUTURE OF DEVELOPER TOOLS: FROM STACKOVERF...SPOTLIGHT IGNITE (10 MINUTES): THE FUTURE OF DEVELOPER TOOLS: FROM STACKOVERF...
SPOTLIGHT IGNITE (10 MINUTES): THE FUTURE OF DEVELOPER TOOLS: FROM STACKOVERF...
 
Unleashing the Power of OpenAI GPT-3 in FME Data Integration Workflows
Unleashing the Power of OpenAI GPT-3 in FME Data Integration WorkflowsUnleashing the Power of OpenAI GPT-3 in FME Data Integration Workflows
Unleashing the Power of OpenAI GPT-3 in FME Data Integration Workflows
 
ChatGPT and OpenAI.pdf
ChatGPT and OpenAI.pdfChatGPT and OpenAI.pdf
ChatGPT and OpenAI.pdf
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learning
 
ChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps ProductivityChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps Productivity
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
 
Use Case Patterns for LLM Applications (1).pdf
Use Case Patterns for LLM Applications (1).pdfUse Case Patterns for LLM Applications (1).pdf
Use Case Patterns for LLM Applications (1).pdf
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or reality
 
Open, Secure & Transparent AI Pipelines
Open, Secure & Transparent AI PipelinesOpen, Secure & Transparent AI Pipelines
Open, Secure & Transparent AI Pipelines
 
Developing Apps with GPT-4 and ChatGPT_ Build Intelligent Chatbots, Content G...
Developing Apps with GPT-4 and ChatGPT_ Build Intelligent Chatbots, Content G...Developing Apps with GPT-4 and ChatGPT_ Build Intelligent Chatbots, Content G...
Developing Apps with GPT-4 and ChatGPT_ Build Intelligent Chatbots, Content G...
 
OpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and MisconceptionsOpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and Misconceptions
 
PyCon Korea - Real World Graphene
PyCon Korea - Real World GraphenePyCon Korea - Real World Graphene
PyCon Korea - Real World Graphene
 
Whats Next for Machine Learning
Whats Next for Machine LearningWhats Next for Machine Learning
Whats Next for Machine Learning
 
LLMs for the “GPU-Poor” - Franck Nijimbere.pdf
LLMs for the “GPU-Poor” - Franck Nijimbere.pdfLLMs for the “GPU-Poor” - Franck Nijimbere.pdf
LLMs for the “GPU-Poor” - Franck Nijimbere.pdf
 
Explore The Machine Learning and TensorFlow
Explore The Machine Learning and TensorFlowExplore The Machine Learning and TensorFlow
Explore The Machine Learning and TensorFlow
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabs
 
Generative AI in CSharp with Semantic Kernel.pptx
Generative AI in CSharp with Semantic Kernel.pptxGenerative AI in CSharp with Semantic Kernel.pptx
Generative AI in CSharp with Semantic Kernel.pptx
 
Machine Learning for Capacity Management
 Machine Learning for Capacity Management Machine Learning for Capacity Management
Machine Learning for Capacity Management
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 

Recently uploaded

Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
Globus
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
Vlad Stirbu
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
UiPathCommunity
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..
UiPathCommunity
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 

Recently uploaded (20)

Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 

Implications of GPT-3

  • 1. Implications of GPT-3 Raven Jiang raven@cs.stanford.edu
  • 2. Overview This document covers: 1. What is GPT-3? 2. How is GPT-3 different? 3. OpenAI’s API strategy 4. Potential commercial implications Details and charts are from the OpenAI paper and a talk given on 7/24 by Ben Mann, the second author. Please send feedback to Raven Jiang (raven@cs.stanford.edu) Disclaimer: I am not affiliated with OpenAI, nor an expert in deep learning. I possess practical knowledge of its implementation.
  • 3. 1. What is GPT-3? 2. How is GPT-3 different? 3. OpenAI’s API strategy 4. Potential commercial implications
  • 4. What is GPT-3? • Text generator deep learning model trained by OpenAI • Transformer architecture pioneered by Google • Other Transformer models: GPT-2, BERT, XLNet, and RoBERTa • Task agnostic • Unsupervised learning • 100x larger (more parameters) than its predecessor GPT-2 (2019) • Estimated to have cost $12 million in compute to train • Trained on text data from books and websites
  • 5. What does it do? • Seemingly many things • Translation • Write new poetry • Generate stories • Have a conversation • Answer questions • Generate working React code • Generate Figma designs • Magical VLOOKUP backed by the Internet • Maybe creativity is not hard for AI
  • 6. GPT-3 training data: Common Crawl 59%, WebText2 22%, Books1 8%, Books2 8%, Wikipedia 3% • Common Crawl is scraped web data filtered for some quality issues • Books1 and Books2 are mostly fiction • Books2 includes non-English content
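To make the mix on this slide concrete, the sketch below treats the five percentages as sampling weights and draws a source dataset for each training document. The helper names `mix_weights` and `sample_source`, and the idea of a simple weighted random draw, are assumptions for illustration only; this is not a description of how OpenAI actually assembled its training batches.

```python
import random

# Dataset shares as listed on the slide (percent of the training mix).
TRAINING_MIX = {
    "Common Crawl": 59,
    "WebText2": 22,
    "Books1": 8,
    "Books2": 8,
    "Wikipedia": 3,
}


def mix_weights(mix):
    """Normalize percentage shares into probabilities that sum to 1."""
    total = sum(mix.values())
    return {name: share / total for name, share in mix.items()}


def sample_source(mix, rng):
    """Pick one dataset name with probability proportional to its share.

    Hypothetical helper, not part of any real training pipeline.
    """
    names = list(mix)
    return rng.choices(names, weights=[mix[n] for n in names], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_source(TRAINING_MIX, rng) for _ in range(10_000)]
    for name in TRAINING_MIX:
        # Observed frequency should land close to the slide's percentage.
        print(name, draws.count(name) / len(draws))
```

With enough draws, the observed frequencies should approach the shares on the slide, which is all the percentages claim.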
  • 7. Transformer architecture
      Transformers are the state of the art for NLP neural networks.
      Transformer-based models
        • Examples: Google’s BERT, OpenAI’s GPT-3, Microsoft’s Turing-NLG
        • Task: Task-agnostic. The same model is successful at many different language tasks without additional training
        • Training: Unsupervised training. Model is trained with large collections of text without special annotations
      Older NLP neural network models
        • Examples: Google’s GNMT
        • Task: Trained for a specific task. Models are usually trained for a certain task and fine-tuned for a related task with additional training data (e.g. Transfer Learning)
        • Training: Supervised training. Trained with large quantities of input annotated with expected outputs that are usually human-generated
  • 8. Example translation workflow
      Goal 1: Find the French translation for “You pass butter.”
      Transformer
        1. Train on unrelated English and French text
        2. Query describes desired pattern:
           """Translate these sentences:
           Hello => Bonjour
           That is a cat => C'est un chat
           You pass butter =>"""
        3. Result:
           """Translate these sentences:
           Hello => Bonjour
           That is a cat => C'est un chat
           You pass butter => Tu passes du beurre"""
      Pre-Transformer Language Models
        1. Create dataset of English-French text examples
        2. Train on dataset
        3. Query: "You pass butter"
        4. Result: "Tu passes du beurre"
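The query in the Transformer workflow above is just text: an instruction, a few worked examples, and a final line left unfinished. A minimal sketch of assembling such a query follows. The function name `build_few_shot_prompt` and its parameters are hypothetical, and the sketch stops at producing the string; it does not call any model API, since access is gated and the request format is not described in these slides.

```python
def build_few_shot_prompt(instruction, examples, query, arrow=" => "):
    """Assemble an instruction, worked examples, and an unfinished final line.

    Hypothetical helper for illustration; not part of any OpenAI library.
    """
    lines = [instruction]
    for source, target in examples:
        lines.append(f"{source}{arrow}{target}")
    # The last line ends right after the arrow so the model fills in the rest.
    lines.append(f"{query}{arrow}".rstrip())
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate these sentences:",
    [("Hello", "Bonjour"), ("That is a cat", "C'est un chat")],
    "You pass butter",
)
print(prompt)
```

The same helper would work for any other pattern by swapping the instruction and examples, which is the practical meaning of "task-agnostic" here: the task lives in the query, not in the model.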
  • 9. Generalizability of Transformer models
      Goal 2: Tell some dad jokes
      Transformer
        1. Use the same model as the previous task
        2. Query describes new pattern:
           """Here are some great dad jokes:
           Q: How do you make a lemon drop?
           A: Let it fall.
           Q: What has ears but cannot hear?
           A: A cornfield.
           Q:"""
        3. Result:
           """Here are some great dad jokes:
           Q: How do you make a lemon drop?
           A: Let it fall.
           Q: What has ears but cannot hear?
           A: A cornfield.
           Q: How does a vampire start a letter?
           A. Dear blood."""
      Pre-Transformer Language Models
        1. Create/source a new annotated dataset suited for the new task
        2. Retrain the model either with Transfer Learning or from scratch
        3. Query
  • 10. 1. What is GPT-3? 2. How is GPT-3 different? 3. OpenAI’s API strategy 4. Potential commercial implications
  • 11. How is GPT-3 different? • It is huge. • 175 billion parameters • Its predecessor GPT-2 has 1.5 billion parameters • The previous record holder, Microsoft’s Turing-NLG, has 17 billion parameters • Innovation of scale, not technique • [Chart: Parameters (Billions), axis 0 to 200, for GPT-2 (2019/02), Turing-NLG (2020/02), and GPT-3 (2020/07)]
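The "100x larger" figure quoted earlier can be checked directly from the parameter counts on this slide. The short sketch below does the arithmetic; the dictionary and the `scale_factor` helper are just illustrative names.

```python
# Parameter counts in billions, as given on the slide.
PARAMS_BILLIONS = {"GPT-2": 1.5, "Turing-NLG": 17, "GPT-3": 175}


def scale_factor(larger, smaller):
    """Return how many times more parameters `larger` has than `smaller`."""
    return PARAMS_BILLIONS[larger] / PARAMS_BILLIONS[smaller]


print(round(scale_factor("GPT-3", "GPT-2"), 1))       # 116.7
print(round(scale_factor("GPT-3", "Turing-NLG"), 1))  # 10.3
```

So GPT-3 is roughly 117 times the size of GPT-2 and about 10 times the size of the previous record holder.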
  • 12. Power of scale • Scale made a dramatic difference in performance • Accuracy increased from 25% to 65% for a specific benchmarking task going from 13B parameters to 175B parameters
  • 13. Uncanny Valley • Participants asked to spot fake news generated by GPT-3 • More parameters = harder to spot • Very close to 50-50 accuracy at GPT-3 scale
  • 14. Returns to scale • Task performance appears to continue improving with scale • How will GPT-4 perform?
  • 15. Consequences of scale • Querying is extremely powerful • Unexpectedly good performance on a large variety of tasks • Compared to older task-specific models, API-only access is useful for a broader range of applications • Caveat 1: performance is probably still inferior to task-specific models • Caveat 2: performance may continue to improve with scale
  • 16. 1. What is GPT-3? 2. How is GPT-3 different? 3. OpenAI’s API strategy 4. Potential commercial implications
  • 17. OpenAI’s API strategy • Gated API access to selected partners • No access to underlying GPT-3 model and its trained weights • Turns NLP from an annotation/training problem into meta-programming • Designing queries to yield useful results on a range of language problems • Much friendlier paradigm for small teams and product-driven startups • MaaS (Model as a Service) may be a viable business (?) • Extremely large NLP models as OpEx instead of CapEx • No need to fine-tune models for problems with more training data • Concerns over AGI risk
  • 18. 1. What is GPT-3? 2. How is GPT-3 different? 3. OpenAI’s API strategy 4. Potential commercial implications
  • 19. Potential commercial implications • Access to GPT-3 (or future GPT-4) API accelerates go-to-market speed of a startup doing applied NLP • Build MVP using GPT-3 without investing in any training data or infrastructure • Switch to better performing fine-tuned models over time • Companies like Grammarly may face low-cost competitors • Building NLP-powered product features may be as simple as programming GPT-3 to answer the right questions • Caveat: Only if GPT-3 (or GPT-4) turns out to be Good Enough for these applications. Unclear without wider access to the API
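One way to read "build MVP using GPT-3, then switch to better performing fine-tuned models over time" is that the product code should depend only on a narrow text-in, text-out interface. The sketch below illustrates that idea under stated assumptions: `GrammarFeature` and both stub functions are hypothetical, and the stubs return canned text instead of calling any real API or model.

```python
from typing import Callable

# A backend is any function that maps a prompt string to generated text.
Backend = Callable[[str], str]


class GrammarFeature:
    """Hypothetical product feature that only knows the Backend signature."""

    def __init__(self, backend: Backend):
        self.backend = backend

    def correct(self, sentence: str) -> str:
        prompt = f"Correct the grammar of this sentence:\n{sentence} =>"
        return self.backend(prompt).strip()


def stub_hosted_model(prompt: str) -> str:
    """Stand-in for a hosted general-purpose model (no real API is called)."""
    return " She goes to school. "


def stub_fine_tuned_model(prompt: str) -> str:
    """Stand-in for an in-house fine-tuned model adopted later."""
    return "She goes to school."


feature = GrammarFeature(stub_hosted_model)   # MVP stage
print(feature.correct("She go to school."))
feature.backend = stub_fine_tuned_model       # later swap; feature code unchanged
print(feature.correct("She go to school."))
```

Keeping the dependency this narrow is a design choice rather than a requirement, but it makes the later switch a one-line change instead of a rewrite.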
  • 21. Conclusion • Task-agnostic NLP models that deliver acceptable performance may soon be available as a service • GPT-3 may be that model • Potential explosion of startups building MVPs on such an API • Investor warning: startups dependent on the API may lack expertise and tools to iterate off MVP • Happy to chat more: raven@cs.stanford.edu