This paper proposes LLaMA-Adapter, a lightweight method for efficiently fine-tuning the LLaMA language model into an instruction-following model. It prepends a small set of learnable adaption prompts to the word tokens at the higher transformer layers. To avoid disturbing the model early in training, it introduces zero-initialized attention with a learnable gating factor: the gate starts at zero, so instructional signals from the prompts are injected progressively while the pre-trained knowledge is preserved. Experiments show that LLaMA-Adapter generates high-quality responses comparable to those of fully fine-tuned models, and the approach extends to multi-modal reasoning tasks.
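The zero-initialized gating idea can be illustrated with a minimal sketch. This is not the paper's implementation: it is a simplified single-head attention in pure Python, without projection matrices or the tanh parameterization of the gate, and all names (`gated_attention`, `word_kv`, `prompt_kv`) are hypothetical. The key property it demonstrates is that the adaption-prompt scores get their own softmax scaled by a gate, so a gate of zero reduces exactly to vanilla attention over the word tokens.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gated_attention(query, word_kv, prompt_kv, gate):
    """Simplified single-head attention (hypothetical sketch).

    word_kv / prompt_kv: lists of (key, value) vector pairs.
    Word-token scores and adaption-prompt scores are softmaxed
    independently; the prompt weights are scaled by `gate`, which
    is initialized to zero so training starts from the frozen
    pre-trained attention.
    """
    d = len(query)
    word_w = softmax([dot(query, k) / math.sqrt(d) for k, _ in word_kv])
    prompt_w = [w * gate for w in
                softmax([dot(query, k) / math.sqrt(d) for k, _ in prompt_kv])]
    out = [0.0] * len(word_kv[0][1])
    pairs = list(zip(word_w, word_kv)) + list(zip(prompt_w, prompt_kv))
    for w, (_, v) in pairs:
        out = [o + w * vi for o, vi in zip(out, v)]
    return out
```

With `gate = 0.0` the prompt values contribute nothing, matching the pre-trained model's output; as the gate grows during fine-tuning, the instructional signal from the prompts is blended in gradually.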