Speculative Decoding
Specific
LLM Inference, Token Generation, Draft Models
Scoured 44 posts in 5.2 ms
AdaPLD: Adaptive Retrieval and Reuse for Efficient Model-Free Speculative Decoding
⚡ Quantization · Content type: Academic · arxiv.org · 5 days ago
Less-relevant results
The economics of speculative decoding
📈 Algorithmic Trading · Content type: Blog · fergusfinn.com · 2 days ago · Hacker News
A system programmer’s guide to LLM inference
🔤 Tokenization · Content type: Blog · blog.xiangpeng.systems · 2 days ago · Hacker News
harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.
🤖 AI · Content type: Code · github.com · 4 days ago · Hacker News
MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 TPS
⚡ Quantization · Content type: Blog · mimo.xiaomi.com · 2 days ago · Hacker News, r/LocalLLaMA
Speculators v0.5.0: DFlash support and online training
💬 LLMs · developers.redhat.com · 6 days ago
GoCritic! Review: Eeny, Meeny, Miny, Moe! - GoCritic! - Anifilm Liberec 2026
🎮 Godot · cineuropa.org · 23 hours ago
China's Xiaomi MiMo Is Now 15X Faster Than ChatGPT and Claude (4 minute read)
💬 LLMs · Content type: News · decrypt.co · 2 days ago
BeeLlama.cpp DFlash on Strix Halo: 2.7x Gemma 31B, But MTP Is Still Faster
🎮 Game Engines · sleepingrobots.com · 3 days ago
Making LLMs faster and more efficient across multiple languages
💬 LLMs · techxplore.com · 6 days ago
Here's a llama.cpp CLI Command builder.
💬 LLMs · llamabuilding.com · 1 day ago · r/LocalLLaMA
Nutrient control enables metabolic reconstruction of L. rhamnosus GG and analysis of secretions
📡 Science Communication · Content type: Academic · biorxiv.org · 3 days ago
Xiaomi MiMo-V2.5-Pro Just Hit 1,000 Tokens Per Second!
💬 Natural Language Processing · gizchina.com · 1 day ago
ViaTunisia subsea segment reaches ready-for-service status
🎮 Game Design · Content type: News · computerweekly.com · 5 days ago
K-Forcing: Joint Next-K-Token Decoding via Push-Forward Language Modeling
💬 LLMs · Content type: Academic · arxiv.org · 19 hours ago
Photo Friday: Our valiant steeds
👁️ Computer Vision · ggwash.org · 5 days ago
Qwen 3.6 27B AutoRound GGUF, need your feedback
⚡ Quantization · huggingface.co · 1 day ago · r/LocalLLaMA
a local Windows app for interview prep and mock practice
📈 Optimization · ofarwise.com · 12 hours ago · Hacker News
defai-digital/ax-engine: Apple Silicon LLM runtime supporting Gemma 4 and Qwen 3.6 MTP modes
🤖 AI · Content type: Code · github.com · 22 hours ago · Hacker News
Jason McDonald
✍️ Prompt Engineering · theamericanscholar.org · 2 days ago