v2ex.com

How does Gemma4 12B run on 16GB of VRAM?

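Google has released a new model in the Gemma 4 family: 12B parameters, and judging by the description it is not MoE. On both HF and Kaggle the weights are in BF16, and the weight files come to roughly 23.9GB. Google's blog makes a point of saying "Laptop ready: Small enough to run locally with just 16GB of VRAM or unified memory." How does it manage to run in 16GB of VRAM? Or is it that the BF16 version cannot run and only an FP8-quantized one fits? But plenty of models can already run on a 16GB card once quantized like that, including many with larger parameter counts.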
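For reference, the arithmetic behind the question can be sketched as below. This is a rough weights-only estimate, not a statement of how Google arrived at the 16GB figure: it assumes a nominal 12 billion parameters, uses the usual bytes-per-parameter for each format, and ignores KV cache, activations, and runtime overhead. The names `BYTES_PER_PARAM`, `weight_gb`, and `fits_in_vram` are made up for illustration.

```python
# Weights-only memory estimate. Assumptions: nominal parameter count,
# no KV cache, no activations, no framework overhead.
BYTES_PER_PARAM = {
    "bf16": 2.0,  # 16 bits per parameter
    "fp8": 1.0,   # 8 bits per parameter
    "int4": 0.5,  # 4 bits per parameter, ignoring scale/zero-point metadata
}


def weight_gb(num_params: float, dtype: str) -> float:
    """Approximate weight size in GB (10^9 bytes) for a given format."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def fits_in_vram(num_params: float, dtype: str, vram_gb: float = 16.0) -> bool:
    """True if the weights alone are smaller than the VRAM budget."""
    return weight_gb(num_params, dtype) < vram_gb


if __name__ == "__main__":
    params = 12e9  # assumed: nominal 12B parameters
    for dtype in BYTES_PER_PARAM:
        size = weight_gb(params, dtype)
        verdict = "fits" if fits_in_vram(params, dtype) else "does not fit"
        print(f"{dtype}: ~{size:.1f} GB of weights, {verdict} in 16 GB")
```

At 2 bytes per parameter, 12B parameters come to about 24GB, which lines up with the ~23.9GB BF16 files and is well over 16GB. Under the same assumptions an 8-bit format needs about 12GB and a 4-bit format about 6GB for the weights alone, so the 16GB claim only works out with some form of quantization, and real headroom will be smaller once context and overhead are counted.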

