SlideShare a Scribd company logo
1 of 17
Download to read offline
LLMs for the “GPU-Poor”
Franck Nijimbere
Founder @GoAgentic
Intro to Large Language Models (LLMs)
An LLM is a type of neural network that
specializes in processing, understanding,
and generating human language.
Eg. llama-2-70 b
Intro to Large Language Models (LLMs)
There are two main components of an LLM:
1. the parameters: model weights
2. the code to run the parameters
Given a sequence of words, the LLM predicts the next
word.
● This requires parameters/ the network to learn a lot
about the world
● All of this info is compressed into the parameters
LLM Training: Step1: Model Pre-training
This involves:
1. Taking chunks of the internet collected
through crawling (~10TB of text)
2. Using a GPU cluster to compute
parameters (6,000 GPUs for 12 days,
~1e24 FLOPS)
3. Compressing these large amounts of
text with lossy compression (~140 GB
file)
** values for llama-2-70b, it’s important to
note that these values are ~10x worse than
current leading models*
To start, we create a base model by learning from a labeled dataset. This process is extremely expensive and usually
only done ~1/year.
LLM Training: Step2: Model Fine-tuning
This involves:
1. Writing labeling instructions
1. note: Labeling used to be done by
humans. Now, it is increasingly
completed by LLMs.
2. Creating a dataset of ~100k high quality
specific texts (ex: use ScaleAI)
3. Fine-tuning the base model on this dataset
(~1 day)
4. Evaluating and monitoring misbehaviors
E.g Assistant model like ChatGPT
- Fine-tune the model on Q/A <user: query> + <assistant:
answer> pair formatted data
We take our general base model and tune it to a specific domain and purpose. Unlike pre-training which requires billions of examples,
fine-tuning only requires a few hundred domain specific examples.
Model Training vs Model Fine-tuning
Priority Data Format
Stage 1 Prioritizes data quantity
(requires ~10,000 x more
than Stage 2)
general internet documents
Stage 2 Prioritizes data quality and
specificity
more specific text format and
contents
LLM Training: Step3: Model Inference
How does this work exactly?
The network “dreams” documents. Every-time it generates a new word, this word is fed back
in to generate the next word. This allows the network to iteratively build words, and then
sentences, and then paragraphs.
However, the model isn’t always correct. There is always a risk of hallucinations.
Hallucinations are generated text or responses that are incorrect, fictional, or misleading.
This is currently a big problem with LLMs and can be caused by:
1. The LLM was trained on outdated or incorrect information.
2. The LLM was trained on biased or discriminatory training data.
3. The LLM doesn’t have access to real-time, real-world information.
LLM scaling laws
We can significantly, and predictably improve LLM performance just by increasing
the # of parameters and the amount of training data.
This is very important because:
● It’s a very easy and clear path for improvement
● It’s why people are rushing to obtain more data and GPUs
GPU cost
NVIDIA A100 ~ $27K
GPU-Poor
GPU-Poor
On-device inference
The main goal of llama.cpp is to run the
LLaMA model using 4-bit integer
quantization on a MacBook
LLM Optimization: Model Pruning
For example, SparseGPT claims their algorithm can prune models by 50% without retraining.
LLM Optimization: Model Quantization
Quantization (post-training) — Normalize and round
the weights. No retraining is needed.
Mixed precision — Using a combination of lower
(e.g., float16) and higher (e.g., float32) precision
arithmetic to balance performance and accuracy.
LLM Optimization: Low-rank factorization
LoRa (Low-Rank Adaptation of Large Language Models) — A method to
reduce the model size and computational requirements by
approximating large matrices using low-rank decomposition.
Fine-tune LLMs for a few dollars
Final words
Lots of opportunities for the GPU-Poor:
- Restrict the domain (fine-tune models): model for kwatura, generating
Burundian proverbs, generating ibisokozo
- Dispatch multiple smaller models, each specialised in a sub-task, to improve
overall performance: GoAgentic.
- Improve optimization techniques for the GPU-poor.

More Related Content

Similar to LLMs for the “GPU-Poor” - Franck Nijimbere.pdf

Building Continuous Learning Systems
Building Continuous Learning SystemsBuilding Continuous Learning Systems
Building Continuous Learning SystemsAnuj Gupta
 
Implications of GPT-3
Implications of GPT-3Implications of GPT-3
Implications of GPT-3Raven Jiang
 
Everything you need to know about AutoML
Everything you need to know about AutoMLEverything you need to know about AutoML
Everything you need to know about AutoMLArpitha Gurumurthy
 
Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon Web Services
 
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...IEEEFINALYEARSTUDENTPROJECT
 
IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...
IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...
IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...IEEEFINALYEARSTUDENTPROJECTS
 
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...IEEEMEMTECHSTUDENTSPROJECTS
 
Hardback solution to accelerate multimedia computation through mgp in cmp
Hardback solution to accelerate multimedia computation through mgp in cmpHardback solution to accelerate multimedia computation through mgp in cmp
Hardback solution to accelerate multimedia computation through mgp in cmpeSAT Publishing House
 
"Running Open-Source LLM models on Kubernetes", Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes", Volodymyr TsapFwdays
 
Performance Comparison between Pytorch and Mindspore
Performance Comparison between Pytorch and MindsporePerformance Comparison between Pytorch and Mindspore
Performance Comparison between Pytorch and Mindsporeijdms
 
Challenges in Large Scale Machine Learning
Challenges in Large Scale  Machine LearningChallenges in Large Scale  Machine Learning
Challenges in Large Scale Machine LearningSudarsun Santhiappan
 
mapReduce for machine learning
mapReduce for machine learning mapReduce for machine learning
mapReduce for machine learning Pranya Prabhakar
 
BloombergGPT.pdfA Large Language Model for Finance
BloombergGPT.pdfA Large Language Model for FinanceBloombergGPT.pdfA Large Language Model for Finance
BloombergGPT.pdfA Large Language Model for Finance957671457
 
JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...
JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...
JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...IEEEGLOBALSOFTTECHNOLOGIES
 
A probabilistic approach to string transformation
A probabilistic approach to string transformationA probabilistic approach to string transformation
A probabilistic approach to string transformationIEEEFINALYEARPROJECTS
 
Big data 2.0, deep learning and financial Usecases
Big data 2.0, deep learning and financial UsecasesBig data 2.0, deep learning and financial Usecases
Big data 2.0, deep learning and financial UsecasesArvind Rapaka
 
Why is dev ops for machine learning so different
Why is dev ops for machine learning so differentWhy is dev ops for machine learning so different
Why is dev ops for machine learning so differentRyan Dawson
 
Automated Speech Recognition
Automated Speech Recognition Automated Speech Recognition
Automated Speech Recognition Pruthvij Thakar
 
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICSUSING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICSHCL Technologies
 

Similar to LLMs for the “GPU-Poor” - Franck Nijimbere.pdf (20)

Building Continuous Learning Systems
Building Continuous Learning SystemsBuilding Continuous Learning Systems
Building Continuous Learning Systems
 
Implications of GPT-3
Implications of GPT-3Implications of GPT-3
Implications of GPT-3
 
Everything you need to know about AutoML
Everything you need to know about AutoMLEverything you need to know about AutoML
Everything you need to know about AutoML
 
Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)
 
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
 
IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...
IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...
IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...
 
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
 
Hardback solution to accelerate multimedia computation through mgp in cmp
Hardback solution to accelerate multimedia computation through mgp in cmpHardback solution to accelerate multimedia computation through mgp in cmp
Hardback solution to accelerate multimedia computation through mgp in cmp
 
"Running Open-Source LLM models on Kubernetes", Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes", Volodymyr Tsap
 
Performance Comparison between Pytorch and Mindspore
Performance Comparison between Pytorch and MindsporePerformance Comparison between Pytorch and Mindspore
Performance Comparison between Pytorch and Mindspore
 
Challenges in Large Scale Machine Learning
Challenges in Large Scale  Machine LearningChallenges in Large Scale  Machine Learning
Challenges in Large Scale Machine Learning
 
mapReduce for machine learning
mapReduce for machine learning mapReduce for machine learning
mapReduce for machine learning
 
BloombergGPT.pdfA Large Language Model for Finance
BloombergGPT.pdfA Large Language Model for FinanceBloombergGPT.pdfA Large Language Model for Finance
BloombergGPT.pdfA Large Language Model for Finance
 
JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...
JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...
JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...
 
A probabilistic approach to string transformation
A probabilistic approach to string transformationA probabilistic approach to string transformation
A probabilistic approach to string transformation
 
Big data 2.0, deep learning and financial Usecases
Big data 2.0, deep learning and financial UsecasesBig data 2.0, deep learning and financial Usecases
Big data 2.0, deep learning and financial Usecases
 
Deep learning-practical
Deep learning-practicalDeep learning-practical
Deep learning-practical
 
Why is dev ops for machine learning so different
Why is dev ops for machine learning so differentWhy is dev ops for machine learning so different
Why is dev ops for machine learning so different
 
Automated Speech Recognition
Automated Speech Recognition Automated Speech Recognition
Automated Speech Recognition
 
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICSUSING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
 

More from GDG Bujumbura

Web au logiciel desktop avec Tauri - Don Nermed.pdf
Web au logiciel desktop avec Tauri - Don Nermed.pdfWeb au logiciel desktop avec Tauri - Don Nermed.pdf
Web au logiciel desktop avec Tauri - Don Nermed.pdfGDG Bujumbura
 
Unleashing the power of Unit Testing - Franck Ninsabira.pdf
Unleashing the power of Unit Testing - Franck Ninsabira.pdfUnleashing the power of Unit Testing - Franck Ninsabira.pdf
Unleashing the power of Unit Testing - Franck Ninsabira.pdfGDG Bujumbura
 
Transaction SQL - Jean Thierry.pptx
Transaction SQL - Jean Thierry.pptxTransaction SQL - Jean Thierry.pptx
Transaction SQL - Jean Thierry.pptxGDG Bujumbura
 
Science-Fiction - The forgotten art of designing better technologies - Josue....
Science-Fiction - The forgotten art of designing better technologies - Josue....Science-Fiction - The forgotten art of designing better technologies - Josue....
Science-Fiction - The forgotten art of designing better technologies - Josue....GDG Bujumbura
 
Remote Sensing for Land Cover Mapping in Google Earth Engine - HAMENYIMANA Is...
Remote Sensing for Land Cover Mapping in Google Earth Engine - HAMENYIMANA Is...Remote Sensing for Land Cover Mapping in Google Earth Engine - HAMENYIMANA Is...
Remote Sensing for Land Cover Mapping in Google Earth Engine - HAMENYIMANA Is...GDG Bujumbura
 
Les outils et compétences nécessaires pour le développement en remote - Ce...
Les outils et compétences nécessaires pour le développement en remote - Ce...Les outils et compétences nécessaires pour le développement en remote - Ce...
Les outils et compétences nécessaires pour le développement en remote - Ce...GDG Bujumbura
 
La diversité et la véracité de l'IA dans la vie de tous les jours avec un ...
La diversité et la véracité de l'IA dans la vie de tous les jours avec un ...La diversité et la véracité de l'IA dans la vie de tous les jours avec un ...
La diversité et la véracité de l'IA dans la vie de tous les jours avec un ...GDG Bujumbura
 
Google Authentication in Python - Destin.pdf
Google Authentication in Python - Destin.pdfGoogle Authentication in Python - Destin.pdf
Google Authentication in Python - Destin.pdfGDG Bujumbura
 
Comment creer de Applicartions Desktop avec Javascript - Bejamin Kinyamba.pdf
Comment creer de Applicartions Desktop avec Javascript - Bejamin Kinyamba.pdfComment creer de Applicartions Desktop avec Javascript - Bejamin Kinyamba.pdf
Comment creer de Applicartions Desktop avec Javascript - Bejamin Kinyamba.pdfGDG Bujumbura
 
Web au logiciel desktop avec Tauri - Don Nermed.pdf
Web au logiciel desktop avec Tauri - Don Nermed.pdfWeb au logiciel desktop avec Tauri - Don Nermed.pdf
Web au logiciel desktop avec Tauri - Don Nermed.pdfGDG Bujumbura
 
Senior Sebarundi @flutterfoward 2023 - Flutter Favorites.pdf
Senior Sebarundi @flutterfoward 2023 - Flutter Favorites.pdfSenior Sebarundi @flutterfoward 2023 - Flutter Favorites.pdf
Senior Sebarundi @flutterfoward 2023 - Flutter Favorites.pdfGDG Bujumbura
 
Road map to DevOps engineering - Elie Sirius
Road map to DevOps engineering -  Elie SiriusRoad map to DevOps engineering -  Elie Sirius
Road map to DevOps engineering - Elie SiriusGDG Bujumbura
 
How to be a self-taught programmer best practices - Edgar Eldy
How to be a self-taught programmer  best practices - Edgar EldyHow to be a self-taught programmer  best practices - Edgar Eldy
How to be a self-taught programmer best practices - Edgar EldyGDG Bujumbura
 
Women in Tech : The Community - Seilla Nkurunziza
Women in Tech : The Community - Seilla NkurunzizaWomen in Tech : The Community - Seilla Nkurunziza
Women in Tech : The Community - Seilla NkurunzizaGDG Bujumbura
 
Android et Minimalisme - Thomas Ezan
Android et Minimalisme - Thomas EzanAndroid et Minimalisme - Thomas Ezan
Android et Minimalisme - Thomas EzanGDG Bujumbura
 

More from GDG Bujumbura (15)

Web au logiciel desktop avec Tauri - Don Nermed.pdf
Web au logiciel desktop avec Tauri - Don Nermed.pdfWeb au logiciel desktop avec Tauri - Don Nermed.pdf
Web au logiciel desktop avec Tauri - Don Nermed.pdf
 
Unleashing the power of Unit Testing - Franck Ninsabira.pdf
Unleashing the power of Unit Testing - Franck Ninsabira.pdfUnleashing the power of Unit Testing - Franck Ninsabira.pdf
Unleashing the power of Unit Testing - Franck Ninsabira.pdf
 
Transaction SQL - Jean Thierry.pptx
Transaction SQL - Jean Thierry.pptxTransaction SQL - Jean Thierry.pptx
Transaction SQL - Jean Thierry.pptx
 
Science-Fiction - The forgotten art of designing better technologies - Josue....
Science-Fiction - The forgotten art of designing better technologies - Josue....Science-Fiction - The forgotten art of designing better technologies - Josue....
Science-Fiction - The forgotten art of designing better technologies - Josue....
 
Remote Sensing for Land Cover Mapping in Google Earth Engine - HAMENYIMANA Is...
Remote Sensing for Land Cover Mapping in Google Earth Engine - HAMENYIMANA Is...Remote Sensing for Land Cover Mapping in Google Earth Engine - HAMENYIMANA Is...
Remote Sensing for Land Cover Mapping in Google Earth Engine - HAMENYIMANA Is...
 
Les outils et compétences nécessaires pour le développement en remote - Ce...
Les outils et compétences nécessaires pour le développement en remote - Ce...Les outils et compétences nécessaires pour le développement en remote - Ce...
Les outils et compétences nécessaires pour le développement en remote - Ce...
 
La diversité et la véracité de l'IA dans la vie de tous les jours avec un ...
La diversité et la véracité de l'IA dans la vie de tous les jours avec un ...La diversité et la véracité de l'IA dans la vie de tous les jours avec un ...
La diversité et la véracité de l'IA dans la vie de tous les jours avec un ...
 
Google Authentication in Python - Destin.pdf
Google Authentication in Python - Destin.pdfGoogle Authentication in Python - Destin.pdf
Google Authentication in Python - Destin.pdf
 
Comment creer de Applicartions Desktop avec Javascript - Bejamin Kinyamba.pdf
Comment creer de Applicartions Desktop avec Javascript - Bejamin Kinyamba.pdfComment creer de Applicartions Desktop avec Javascript - Bejamin Kinyamba.pdf
Comment creer de Applicartions Desktop avec Javascript - Bejamin Kinyamba.pdf
 
Web au logiciel desktop avec Tauri - Don Nermed.pdf
Web au logiciel desktop avec Tauri - Don Nermed.pdfWeb au logiciel desktop avec Tauri - Don Nermed.pdf
Web au logiciel desktop avec Tauri - Don Nermed.pdf
 
Senior Sebarundi @flutterfoward 2023 - Flutter Favorites.pdf
Senior Sebarundi @flutterfoward 2023 - Flutter Favorites.pdfSenior Sebarundi @flutterfoward 2023 - Flutter Favorites.pdf
Senior Sebarundi @flutterfoward 2023 - Flutter Favorites.pdf
 
Road map to DevOps engineering - Elie Sirius
Road map to DevOps engineering -  Elie SiriusRoad map to DevOps engineering -  Elie Sirius
Road map to DevOps engineering - Elie Sirius
 
How to be a self-taught programmer best practices - Edgar Eldy
How to be a self-taught programmer  best practices - Edgar EldyHow to be a self-taught programmer  best practices - Edgar Eldy
How to be a self-taught programmer best practices - Edgar Eldy
 
Women in Tech : The Community - Seilla Nkurunziza
Women in Tech : The Community - Seilla NkurunzizaWomen in Tech : The Community - Seilla Nkurunziza
Women in Tech : The Community - Seilla Nkurunziza
 
Android et Minimalisme - Thomas Ezan
Android et Minimalisme - Thomas EzanAndroid et Minimalisme - Thomas Ezan
Android et Minimalisme - Thomas Ezan
 

Recently uploaded

Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITMgdsc13
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)Christopher H Felton
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationLinaWolf1
 
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Lucknow
 
Magic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMagic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMartaLoveguard
 
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts servicevipmodelshub1
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一Fs
 
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130  Available With RoomVIP Kolkata Call Girl Alambazar 👉 8250192130  Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Roomdivyansh0kumar0
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作ys8omjxb
 
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一Fs
 
Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Paul Calvano
 
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Dana Luther
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhimiss dipika
 
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls KolkataVIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkataanamikaraghav4
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Sonam Pathan
 
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一Fs
 
Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Excelmac1
 
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130  Available With RoomVIP Kolkata Call Girl Kestopur 👉 8250192130  Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Roomdivyansh0kumar0
 

Recently uploaded (20)

Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITM
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 Documentation
 
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
 
Magic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMagic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptx
 
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
 
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130  Available With RoomVIP Kolkata Call Girl Alambazar 👉 8250192130  Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
 
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
 
Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24
 
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
 
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
 
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhi
 
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls KolkataVIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
 
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
 
Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...
 
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130  Available With RoomVIP Kolkata Call Girl Kestopur 👉 8250192130  Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
 

LLMs for the “GPU-Poor” - Franck Nijimbere.pdf

  • 1. LLMs for the “GPU-Poor” Franck Nijimbere Founder @GoAgentic
  • 2. Intro to Large Language Models (LLMs) An LLM is a type of neural network that specializes in processing, understanding, and generating human language. Eg. llama-2-70 b
  • 3. Intro to Large Language Models (LLMs) There are two main components of an LLM: 1. the parameters: model weights 2. the code to run the parameters Given a sequence of words, the LLM predicts the next word. ● This requires parameters/ the network to learn a lot about the world ● All of this info is compressed into the parameters
  • 4. LLM Training: Step1: Model Pre-training This involves: 1. Taking chunks of the internet collected through crawling (~10TB of text) 2. Using a GPU cluster to compute parameters (6,000 GPUs for 12 days, ~1e24 FLOPS) 3. Compressing these large amounts of text with lossy compression (~140 GB file) ** values for llama-2-70b, it’s important to note that these values are ~10x worse than current leading models* To start, we create a base model by learning from a labeled dataset. This process is extremely expensive and usually only done ~1/year.
  • 5. LLM Training: Step2: Model Fine-tuning This involves: 1. Writing labeling instructions 1. note: Labeling used to be done by humans. Now, it is increasingly completed by LLMs. 2. Creating a dataset of ~100k high quality specific texts (ex: use ScaleAI) 3. Fine-tuning the base model on this dataset (~1 day) 4. Evaluating and monitoring misbehaviors E.g Assistant model like ChatGPT - Fine-tune the model on Q/A <user: query> + <assistant: answer> pair formatted data We take our general base model and tune it to a specific domain and purpose. Unlike pre-training which requires billions of examples, fine-tuning only requires a few hundred domain specific examples.
  • 6. Model Training vs Model Fine-tuning Priority Data Format Stage 1 Prioritizes data quantity (requires ~10,000 x more than Stage 2) general internet documents Stage 2 Prioritizes data quality and specificity more specific text format and contents
  • 7. LLM Training: Step3: Model Inference How does this work exactly? The network “dreams” documents. Every-time it generates a new word, this word is fed back in to generate the next word. This allows the network to iteratively build words, and then sentences, and then paragraphs. However, the model isn’t always correct. There is always a risk of hallucinations. Hallucinations are generated text or responses that are incorrect, fictional, or misleading. This is currently a big problem with LLMs and can be caused by: 1. The LLM was trained on outdated or incorrect information. 2. The LLM was trained on biased or discriminatory training data. 3. The LLM doesn’t have access to real-time, real-world information.
  • 8. LLM scaling laws We can significantly, and predictably improve LLM performance just by increasing the # of parameters and the amount of training data. This is very important because: ● It’s a very easy and clear path for improvement ● It’s why people are rushing to obtain more data and GPUs
  • 12. On-device inference The main goal of llama.cpp is to run the LLaMA model using 4-bit integer quantization on a MacBook
  • 13. LLM Optimization: Model Pruning For example, SparseGPT claims their algorithm can prune models by 50% without retraining.
  • 14. LLM Optimization: Model Quantization Quantization (post-training) — Normalize and round the weights. No retraining is needed. Mixed precision — Using a combination of lower (e.g., float16) and higher (e.g., float32) precision arithmetic to balance performance and accuracy.
  • 15. LLM Optimization: Low-rank factorization LoRa (Low-Rank Adaptation of Large Language Models) — A method to reduce the model size and computational requirements by approximating large matrices using low-rank decomposition.
  • 16. Fine-tune LLMs for a few dollars
  • 17. Final words Lots of opportunities for the GPU-Poor: - Restrict the domain (fine-tune models): model for kwatura, generating Burundian proverbs, generating ibisokozo - Dispatch multiple smaller models, each specialised in a sub-task, to improve overall performance: GoAgentic. - Improve optimization techniques for the GPU-poor.