🚀 Model Serving
TorchServe, TensorFlow Serving, Inference Optimization, Batching
Scoured 1632 posts in 37.4 ms
shreyansh26/Speculative-Decoding: Speculative Decoding Implementations: EAGLE-3, Medusa-1, PARD, Draft Models, N-gram and Suffix Decoding from scratch
🔨 LLVM
github.com · 3d · r/LLM, r/LocalLLaMA
My Calculator Is a Transformer
🐍 Programming
sinclairs.gitlab.io · 3h · Hacker News, r/LocalLLaMA
Prefetching Weights in llama.cpp
🔨 LLVM
am17an.bearblog.dev · 2d
inclusionAI/Ling-2.6-1T
🐹 Go
huggingface.co · 23h · r/LocalLLaMA
Vibe Training - Auto Train a Small Language Model for Your Use Case
🤖 Transformers
diamantai.substack.com · 2d · Substack, r/LocalLLaMA
I Built a WebAssembly Runtime in 5 Days
🐹 Go
tingouw.com · 3h · Hacker News
Maybe I was too harsh on deep learning theory (three days ago)
🤖 Machine Learning
lesswrong.com · 10h
Lambda Calculus Benchmark for AI
🔄 Concurrency
victortaelin.github.io · 5d · Hacker News
LingBot-Map: Streaming 3D reconstruction with geometric context transformer
📓 Jupyter Notebooks
technology.robbyant.com · 2d · Hacker News
Lessons from Building an OTel Normalizer for GenAI (Part 1)
🛠️ Feature Engineering
groundcover.com · 12h · Hacker News
Scaling Pain of Coding Agent Serving: Lessons from Debugging GLM-5 at Scale
🐍 Programming
z.ai · 15h · Lobsters, Hacker News
Qwen 3.6-35B-A3B KV cache bench: f16 vs q8_0 vs turbo3 vs turbo4 from 0 to 1M context on M5 Max
🔨 LLVM
llmkube.com · 2d · r/LocalLLaMA
Vibin' With Erlang
🐹 Go
blog.whenhen.com · 6d · Lobsters
Granite 4.1: IBM's 8B Model Is Competing With Models Four Times Its Size
🛠️ Feature Engineering
firethering.com · 6h · Hacker News
Changes, New Features, and Fixes
🔨 LLVM
gcc.gnu.org · 5h · Hacker News, r/cpp
How we built ten custom subagents to tame a 500K-line Clojure codebase
🛠️ Feature Engineering
metabase.com · 2d · Hacker News, r/programming
Clojure is the future of AI coding, but you won't use it
🛠️ Feature Engineering
latypoff.com · 21h · Hacker News
vLLM-Lens: Fast Interpretability Tooling That Scales to Trillion-Parameter Models
🔨 LLVM
lesswrong.com · 6d
Sequoia Ascent 2026 summary
🛠️ Feature Engineering
karpathy.bearblog.dev · 1h
Letting AI play my game – building an agentic test harness to help play-testing
🤖 Transformers
blog.jeffschomay.com · 1d · Hacker News