2. Introduction
• We take inspiration from Li et al. (2018a); Aghajanyan et al. (2020) which
show that the learned over-parametrized models in fact reside on a low
intrinsic dimension.
• We hypothesize that the change in weights during model adaptation also
has a low “intrinsic rank”, leading to our proposed Low-Rank Adaptation
(LoRA) approach.
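To make the low-rank hypothesis concrete, here is a minimal PyTorch sketch of such an
adapted layer. The class name `LoRALinear` and the hyperparameter defaults are
illustrative assumptions, not the paper's reference implementation; the zero
initialization of B and the α/r scaling do follow the paper's description.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """h = W x + (alpha / r) * B A x, with W frozen and only A, B trained."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze the pre-trained W
        d_out, d_in = base.weight.shape
        # A gets a small random init and B starts at zero, so the update
        # delta_W = B A is zero at the start of training (as in the paper).
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)
```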
3. Introduction: Advantages
• A pre-trained model can be shared and used to build many small LoRA modules for
different tasks. We can freeze the shared model and efficiently switch tasks by
replacing the low-rank matrices, reducing the storage requirement and
task-switching overhead significantly.
• LoRA makes training more efficient and lowers the hardware barrier to entry by up
to 3 times when using adaptive optimizers: we only optimize the injected, much
smaller low-rank matrices instead of computing gradients or maintaining optimizer
states for most parameters.
• Our simple linear design allows us to merge the trainable matrices with the frozen
weights when deployed, introducing no inference latency compared to a fully
fine-tuned model, by construction (see the merge sketch after this list).
• LoRA is orthogonal to many prior methods and can be combined with many of them,
such as prefix-tuning.
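The merge and task-switch claims above can be sketched on top of the `LoRALinear`
layer defined earlier. Both helper names (`merge_lora`, `switch_task`) are
hypothetical; the merge itself is exactly the paper's W' = W + BA construction
(with the α/r scaling folded in).

```python
@torch.no_grad()
def merge_lora(layer: LoRALinear) -> nn.Linear:
    # Fold the low-rank update into the frozen weight: W' = W + (alpha/r) * B A.
    # The result is a plain nn.Linear, so inference costs exactly as much as
    # the original (or a fully fine-tuned) model.
    merged = nn.Linear(layer.base.in_features, layer.base.out_features,
                       bias=layer.base.bias is not None)
    merged.weight.copy_(layer.base.weight + layer.scaling * (layer.B @ layer.A))
    if layer.base.bias is not None:
        merged.bias.copy_(layer.base.bias)
    return merged

@torch.no_grad()
def switch_task(layer: LoRALinear, A_new: torch.Tensor, B_new: torch.Tensor) -> None:
    # Task switching: the shared frozen W stays in place; only the small
    # rank-r factors are replaced, keeping storage and switching overhead low.
    layer.A.copy_(A_new)
    layer.B.copy_(B_new)
```

Because the merge is exact, the base W can also be recovered by subtracting
(α/r)·BA before adding a different task's update, as the paper notes.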
13. Conclusion
• Proposes LoRA, a method for tuning large-scale models efficiently
• Unlike adapter-style methods, it introduces no inference latency
• Unlike prefix-tuning, there is no need to reduce the usable sequence length
• Assumes that the weight-update matrix has a low intrinsic rank
• The paper focuses on LMs, but in principle LoRA can be applied to any dense layer