Scour
🚀 Performance
Broad
Benchmarking, Profiling, Optimization, Bottlenecks
Scoured 15807 posts in 16.1 ms
Scaling a Monolith to 1M LOC: 113 Pragmatic Lessons from Tech Lead to CTO
📋 due diligence
semicolonandsons.com · 5d · Lobsters, Hacker News
Inside the Together AI kernels team
🦙 Ollama
together.ai · 1d
Review: Measuring AI Ability to Complete Long Software Tasks
🦙 Ollama
emptysqua.re · 17h · Lobsters, Hacker News
Pure C implementation of the TurboQuant paper (ICLR 2026) for KV cache compression in LLM inference.
🦙 Ollama
github.com · 21h · r/LocalLLaMA
You guys seen this? 1-bit model with an MMLU-R of 65.7, 8B params
🕹️ PICO-8
huggingface.co · 1d · r/LocalLLaMA
The insert benchmark on a small server: Postgres 12.22 through 18.3
💫 slick production values
smalldatum.blogspot.com · 3d · smalldatum.blogspot.com
Pyre: New JIT Python interpreter written in Rust
🕹️ PICO-8
pyre-lang.org · 20h · Lobsters, Hacker News
Ring Attention & Sequence Sharding with shard_map
🧮 Vector Databases
rishirajacharya.com · 2d
Why Your Engineering Team Is Slow (It's the Codebase, Not the People)
🎯 strategies, tools, and mindset needed
piechowski.io · 2d · Lobsters, Hacker News
Presentations: How I Made Immer Twice as Fast: Performance Optimization in Practice
🎓 Masterclass
blog.isquaredsoftware.com · 6d
Announcing 1-bit Bonsai: The First Commercially Viable 1-bit LLMs
🦙 Ollama
prismml.com · 1d · Hacker News, r/LocalLLaMA, r/singularity
1 Million Tokens Per Second: Qwen 3.5 27B on GKE with B200 GPUs
💫 slick production values
medium.com · 6d · Hacker News, r/LocalLLaMA
Chinese Named Entity Recognition Model Selection for Small Pure CPU Environments
🦙 Ollama
lawtee.com · 1d
Discussion - The 9950X3D2 performance speculation thread - The most divisive CPU ever?
⬛ Ditherpunk
forums.anandtech.com · 2d
I spent 96 hours setting up dual DGX Sparks and a Mac Studio M3 Ultra for the same 397B model. Neither won.
💫 slick production values
alooftwaffle.substack.com · 5d · r/LocalLLaMA
Softwares & Services
🔍 Alternative Search
var.run · 2d
Strand PWA Runtime (Part 2.5)
🌐 Web Standards
kver.ca · 4d
The Future of Developer Compute
🔓 Open source software
austen.info · 5d
🎲 OpenRISC Multicore
⬛ Ditherpunk
stffrdhrn.github.io · 3d
Analyzing round trip query latency
🧮 Vector Databases
datadoghq.com · 6d · Lobsters, Hacker News