Speculative Decoding
LLM Inference, Token Generation, Draft Models
Jason McDonald
✍️
Prompt Engineering
theamericanscholar.org
·
2 days ago
B & S About Movies podcast Episode 140: The Sons of Hercules
📚
Speculative Fiction
bandsaboutmovies.com
·
5 days ago
Amy Adams Brings Dario Vitale’s Versace Style to ‘The Tonight Show’
📚
Factor
Content type:
News
wwd.com
·
1 day ago
the sissy boy
🛸
Science Fiction
Content type:
Blog
blog.hyeonje.website
·
3 hours ago
Barbara Gladstone Living Room
✨
Computer Graphics
greg.org
·
6 days ago
New rumour claims with '100%' confidence that AMD's next-gen Zen 6 desktop CPU will run at over 6.5 GHz
🎮
Handheld Gaming
Content type:
News
pcgamer.com
·
1 day ago
Nvidia Nemotron 3 Ultra
🎛️
Fine-tuning
research.nvidia.com
·
6 days ago
·
Hacker News
What Arm-based innovations happened in May 2026?
🤖
AI
Content type:
Blog
newsroom.arm.com
·
5 days ago
Review: The Boy with the Light-Blue Eyes - SXSW London 2026
🛸
Science Fiction
cineuropa.org
·
1 day ago
bigattichouse/packed-twin-inference: PTI achieves ~2× throughput using a single quantized model (Q5_K_M or better) by running 4 generation streams in one batched decode call. The GPU loads model weights once per step and produces 4 predictions simultaneously. KV cache overhead is ~0.8 GiB total for all 4 streams. No draft model. No quality loss
💬
LLMs
Content type:
Code
github.com
·
1 day ago
·
r/LocalLLaMA
Everyone’s a girl’s girl on TV. Until they’re not.
🕸️
Network Effects
Content type:
News
vox.com
·
4 hours ago
Making Local LLM Go Brrr
✍️
Prompt Engineering
seanpedersen.github.io
·
6 days ago
OpenAI S-1 🇺🇸, Siri AI 📱, Xiaomi Ultraspeed ⚡
🤖
AI
tldr.tech
·
1 day ago
If Vampire Survivors and Spelunky had a baby, it'd be Messhof's Blood Dungeon
🎮
Game Design
Content type:
News
engadget.com
·
4 days ago
3x Faster Search: Parallel Test-Time Scaling with Instructed-Retriever-1
✍️
Prompt Engineering
Content type:
Blog
databricks.com
·
6 days ago
MoQ GGUFs and GSQ: Low-Bit GGUFs Are About to Get Much Better
⚡
Quantization
Content type:
News
Content type:
Blog
kaitchup.substack.com
·
5 days ago
·
r/LocalLLaMA
Keats's Melancholy Ode
📖
Book recommendations
Content type:
News
Content type:
Blog
profadamroberts.substack.com
·
2 days ago
·
Substack
STYLING HACK: A Sculptural Vase Can Change Your Space…Let Us Prove It
🎮
Handheld Gaming
stylebyemilyhenderson.com
·
1 day ago
Hackers Exploit Critical Everest Forms Pro WordPress Plugin Flaw to Take Over Sites
🔐
Cryptography
thehackernews.com
·
5 days ago
Imbuing Large Language Models with Bidirectional Logic for Robust Chain Repair
🤖
AI
Content type:
Academic
arxiv.org
·
6 days ago