llama + spec: MTP Support by am17an · Pull Request #22673
github.com · 5 weeks ago · Hacker News, r/LocalLLaMA
Covered in 10 articles
Doubling Qwen3.6-27B on One RTX 3090: ollama → llama.cpp + MTP, Lever by Lever (35.7 → 80.2 tok/s)
dev.to · 3 days ago · DEV
Three Months of Speed-Up Experiments on a 3090 Ti: Autoregressive DFlash MTP for Qwen3.6-27B
dev.to · 3 weeks ago · DEV
froggeric/Qwen3.6-27B-MTP-GGUF
huggingface.co · 3 weeks ago · DEV
Used over a million tokens in three separate sessions to test Qwen 3.6 35b (new Multi-token Prediction version)
huggingface.co · 4 weeks ago · r/LocalLLaMA
Benchmarking llama.cpp's brand-new MTP support on Strix Halo
calebcoffie.com · 3 weeks ago · Hacker News
Multi Token Prediction in llama.cpp
am17an.bearblog.dev · 2 weeks ago
This Month in Agentic Coding – May 2026
agenticcodingweekly.com · 1 week ago · Hacker News
AI's Plummeting Prices Are a Software Story, Not a Hardware One
weightythoughts.com · 3 weeks ago · Hacker News
MTP Isn't Always a Win: 1.95x on My 3090, but Speculative Decoding Is Hardware-Dependent
bric.pe.kr · 4 days ago · DEV
In other languages
Qwen3.6 MTP weighs 0.3 GB more but gives a ~2x speedup: from 60 t/s to 130 t/s for Qwen3.6 27B, with no distortion
habr.com · 3 weeks ago