kaggle.com

QLoRA: Fine-Tuning a 7B Model on a 16GB GPU (It Shrank to 5.4GB in Front of Me) (opens in new tab)

Discussed on DEV

Series — Fine-Tuning, Smallest to Largest: QLoRA (7B) ← you are here In , LoRA let me fine-tune a 1.5B model by freezing it and training tiny adapters. But the frozen base still sat in memory in 16-bit (~3GB). Now I wanted to go to Qwen2.5-7B — and hit a wall that LoRA alone doesn't solve. The problem A 7B model is ~15GB in 16-bit precision. A free-tier T4 GPU has 16GB. It would barely load, with no room left to actually train. The QLoRA insight QLoRA asks the question that naturally follows ...

Read the original article
Sign in to keep reading the full article.

Keyboard Shortcuts

Navigation

Next / previous post
j/k
Open post
oorEnter
Preview post
v

Post Actions

Love post
a
Like post
l
Dislike post
d
Undo reaction
u
Save / unsave
s

Recommendations

Add interest / feed
Enter
Not interested
x

Go to

Home
gh
Interests
gi
Feeds
gf
Likes
gl
History
gy
Changelog
gc
Settings
gs
Discover
gb
Search
/

General

Show this help
?
Submit feedback
!
Close modal / unfocus
Esc

Press ? anytime to show this help