Andrej Karpathy, a founding member of OpenAI, explains in "State of GPT" how GPT-style large language models are trained, and surveys the rapidly growing ecosystem around them. The process starts with pre-training on large datasets: raw text is tokenized into sequences of integers, which are then used to train the base model (a minimal tokenization sketch follows below). Karpathy also notes that LLaMA, despite containing fewer parameters than GPT-3, is the more powerful model because it was trained on far more tokens. The talk covers the training of Transformer models for language modeling and the evolution of base models since GPT-2. The full training pipeline consists of four stages: pre-training, supervised fine-tuning, reward modeling, and reinforcement learning. Karpathy then discusses improving performance at inference time through prompt engineering, including techniques such as self-consistency (sketched in the second example below). Finally, he addresses the limitations of LLMs, including biases and reasoning errors, and recommends using them in low-stakes applications with human oversight.
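To make the tokenization step concrete, here is a minimal sketch using OpenAI's tiktoken library. The library choice is an assumption for illustration; the talk describes the concept rather than prescribing a tool.

```python
# Minimal sketch of the tokenization step in pre-training: text is
# converted into integer token IDs before being fed to the model.
# tiktoken is assumed here for illustration only.
import tiktoken

# cl100k_base is the encoding used by newer OpenAI models.
enc = tiktoken.get_encoding("cl100k_base")

text = "State of GPT"
tokens = enc.encode(text)   # text -> list of integer token IDs
print(tokens)
print(enc.decode(tokens))   # round-trips back to the original text
```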
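And here is a hedged sketch of self-consistency: sample several answers to the same prompt at nonzero temperature, then take a majority vote over the final answers. The `sample_model` callable is a hypothetical stand-in for any LLM completion call; the voting logic is the technique itself.

```python
# Self-consistency sketch: draw several independent samples (each should
# use temperature > 0 so the reasoning paths differ), then return the
# most common final answer. `sample_model` is hypothetical.
import random
from collections import Counter
from typing import Callable

def self_consistent_answer(
    prompt: str,
    sample_model: Callable[[str], str],  # hypothetical: returns one sampled answer
    n_samples: int = 5,
) -> str:
    answers = [sample_model(prompt) for _ in range(n_samples)]
    # The most frequent answer across samples wins the vote.
    return Counter(answers).most_common(1)[0][0]

# Usage with a dummy sampler standing in for a real model call:
dummy = lambda p: random.choice(["42", "42", "41"])
print(self_consistent_answer("What is 6 * 7?", dummy))
```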