🚀 Performance
Broad
Benchmarking, Profiling, Optimization, Bottlenecks
Scoured 21239 posts in 26.3 ms
Benchmarking Inference Engines on Agentic Workloads
🦙 Ollama · appliedcompute.com · 6d · Hacker News
Qwen 3.6-35B-A3B KV cache bench: f16 vs q8_0 vs turbo3 vs turbo4 from 0 to 1M context on M5 Max
💫 slick production values · llmkube.com · 1d · r/LocalLLaMA
vnmoorthy/pavo-bench: A 50K-turn voice pipeline benchmark and an 85K-param meta-controller that cuts P95 latency 10.3% and energy 71% vs fixed cloud. TMLR 2026.
🦙 Ollama · github.com · 15h · Hacker News
PEAKS No 42: The Open-Weight Uprising: GPT-5.5, Qwen Beats a 397B Giant, and Your Jira Data Is Now AI Training Fuel
🦙 Ollama · bogdandeac.com · 16h
Show HN: Utilyze, an open source GPU monitoring tool more accurate than nvtop
⚙ Laptop optimization · systalyze.com · 1d · Hacker News
DeepSeek-V4 on Day 0: From Fast Inference to Verified RL with SGLang and Miles
🧮 Vector Databases · lmsys.org · 3d · Hacker News
The World's Most Advanced Team Scheduling Software
💾 Local-First Software · teamcal.ai · 22h · Hacker News
Vibing, Harness and OODA loop
🦙 Ollama · architecture-weekly.com · 2d
GPT-5.5: Capabilities and Reactions
🦙 Ollama · thezvi.wordpress.com · 16h
Reimagining Kernel Generation at the PTX Layer: An LLM System Learning from DSLs to Outperform Them
🦙 Ollama · standardkernel.com · 1d · Hacker News
I Forked 4 cli coding agents to Run the Same Model. The scaffolding explained a 2x gap.
🔓 Open source software · charlesazam.com · 6d · Hacker News
local-first MCP code intelligence (and the runs we lose)
🦙 Ollama · sverklo.com · 1d · Hacker News
The Coding Assistant Breakdown: More Tokens Please
🦙 Ollama · newsletter.semianalysis.com · 4d · Hacker News
State of my AI-assisted development workflows
🦙 Ollama · handmadeoasis.com · 2d · Hacker News
Guess-Verify-Refine: Data-Aware Top-K for Sparse-Attention Decoding on Blackwell via Temporal Correlation
🧮 Vector Databases · arxiv.org · 2d · Hacker News
Mojo language, any hardware. Systems-level performance. Pythonic syntax
🕹️ PICO-8 · modular.com · 5d · Hacker News
My local setup Worklog
🦙 Ollama · allanatrix.bearblog.dev · 1d
Where Optimizations Come From
🧮 Vector Databases · NULL BITMAP by Justin Jaffray via buttondown.com · 1d · Hacker News
Fast Attention for Short Sequences
🧮 Vector Databases · blog.qwertyforce.dev · 3d · Hacker News
EsoBench: Learning a Novel Esolang via Iterative Execution Feedback
🦙 Ollama · caseys-evals.com · 2d · Hacker News