What is QLoRA?
LLM Engineer Interview Questions: Fine-Tuning, LoRA, QLoRA, PEFT, and Instruction Tuning
Audio flashcard · Nortren
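QLoRA combines quantization with LoRA. The frozen base model is stored in 4-bit precision instead of 16-bit, cutting weight memory to roughly a quarter, while the LoRA adapters are trained in higher precision (typically bfloat16). Gradients flow through the quantized weights into the adapters; the 4-bit weights themselves are never updated. This makes it possible to fine-tune models with tens of billions of parameters on a single GPU; the paper reports fine-tuning a 65B model on one 48 GB GPU. To preserve quality, QLoRA introduced NF4 (4-bit NormalFloat) quantization and double quantization, which quantizes the quantization constants themselves.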
arxiv.org
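To make the memory claim concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the block sizes described in the QLoRA paper (one 32-bit constant per 64 weights, and with double quantization, 8-bit constants plus one 32-bit constant per 256 blocks). The 33B parameter count and the function names are illustrative choices, not part of any library, and the estimate covers base-model weights only, not activations, adapters, or optimizer state.

```python
def quant_constant_bits(block_size: int = 64,
                        double_quant: bool = True,
                        second_block_size: int = 256) -> float:
    """Extra bits per parameter spent on quantization constants."""
    if not double_quant:
        # One 32-bit scaling constant per block of weights.
        return 32 / block_size
    # Double quantization: 8-bit first-level constants, plus one 32-bit
    # second-level constant per `second_block_size` first-level constants.
    return 8 / block_size + 32 / (block_size * second_block_size)


def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


n_params = 33e9  # illustrative model size (assumption)

fp16_gb = weight_memory_gb(n_params, 16)
nf4_gb = weight_memory_gb(n_params, 4 + quant_constant_bits(double_quant=False))
nf4_dq_gb = weight_memory_gb(n_params, 4 + quant_constant_bits(double_quant=True))

print(f"16-bit weights:            {fp16_gb:.1f} GB")
print(f"4-bit, no double quant:    {nf4_gb:.1f} GB")
print(f"4-bit, with double quant:  {nf4_dq_gb:.1f} GB")
```

Under these assumptions, double quantization lowers the constant overhead from 0.5 bits to about 0.127 bits per parameter, which is small per weight but adds up at tens of billions of parameters.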