Scour
💾 Prompt Caching
Context Reuse, KV Cache, Inference Optimization, Token Efficiency
Scoured 26570 posts in 779.0 ms
Cache-aware disaggregated inference for up to 40% faster long-context LLM serving
together.ai · 1d · Discuss: Hacker News, r/LocalLLaMA
🔮 Prefetching
Garnix Blog: Forwardly-evaluated build systems
garnix.io · 20h · Discuss: Lobsters
🏗️ Build Systems
discord/twitch/kick/snapchat age verifier
age-verifier.kibty.town · 4h
🌐 ActivityPub Protocol
hit-box/hitbox: Highly customizable async caching framework for Rust - from in-memory to distributed solutions, designed for high-performance applications
github.com · 2d · Discuss: r/rust
🌐 Axum
A chatbot's worst enemy is page refresh
zknill.io · 16h · Discuss: Hacker News
🤖 Web Crawling Politeness
ESTAR: Early-Stopping Token-Aware Reasoning For Efficient Inference
arxiv.org · 1d
🧠 Inference Serving
Moltis: a personal AI assistant built in Rust
pen.so · 43m
🦋 Tauri
The LLM Context Tax: Best Tips for Tax Avoidance
nicolasbustamante.com · 13h · Discuss: Hacker News
💰 Tokenomics
cysqlite - a new sqlite driver
simonwillison.net · 14h
💾 SQLite
GRAIL Text Recognizer
jackschaedler.github.io · 6h
🔤 Font Rendering
Bitsum. Real-time CPU Optimization and Automation
bitsum.com · 10h
🔮 Prefetching
AI giants are racing to secure shrinking memory. It’s creating opportunities for startups
sifted.eu · 2d
🖥 GPUs
Proof-oriented Programming in F*
fstar-lang.org · 3h · Discuss: Lobsters
💻 Programming languages
Running Mistral-7B on Intel NPU - 12.6 tokens/s, zero CPU/GPU usage
github.com · 2h · Discuss: r/LocalLLaMA
🖥 GPUs
Concurrent vs. Parallel Execution in LLM API Calls: From an AI Engineer’s Perspective
pub.towardsai.net · 3d
🚀 Async Optimization
AFMTJ Model For In-Memory Computing (University of Arizona)
semiengineering.com · 1d
📦 In-process Databases
Model Context Protocol
developers.openai.com · 13h
📋 MCP
[AINews] Qwen Image 2 and Seedance 2
latent.space · 1d
🏗️ LLM Infrastructure
ManifoldKV: Training-Free KV Cache Compression via Euclidean Outlier Detection
arxiv.org · 2d
🔬 RaBitQ
Go - Unit & Integration Testing
linkedin.com · 11h · Discuss: r/programming
⚡ Comptime Programming