⚡ Performance Engineering
Optimization, Profiling, Benchmarking, Tuning
Scoured 177 posts in 8.2 ms
Rustinel: Open-source Endpoint Detection for Windows and Linux
🏗️
software engineering
Content type:
Blog
linuxtoday.com
·
2d
Google's latest DiffusionGemma open AI model comes with a 4x speed boost
🤖
AI
Content type:
News
arstechnica.com
·
5h
Google's new open model DiffusionGemma generates text from noise instead of word by word
🤖
AI
the-decoder.com
·
5h
[eCHO News] Episode #105: Cilium on VMware. eBPF NFS Flamegraphs
🏗️
software engineering
isovalent-9197153.hs-sites.com
·
5d
HFT Latency Monitoring with Probabilistic Calling Context
🏗️
software engineering
hftuniversity.com
·
1d
·
Hacker News
Supermicro and Arm advance compute for the agentic AI era
🏗️
software engineering
Content type:
Blog
newsroom.arm.com
·
8h
Building & Benchmarking: LLMs on a 16GB Jetson Orin NX for Hermes Agent
⚙️
platform engineering
Content type:
Blog
dnhkng.github.io
·
2d
Stop hand-tuning kernels: How Neuron Agentic Development accelerates AWS Trainium optimizations
🤖
AI
Content type:
Blog
aws.amazon.com
·
9h
The Edge LLM Offload Story
🤖
LLMs
semiengineering.com
·
6d
Location: Edmonton, Canada Remote: Yes Willing to relocate: Yes, within Canada T...
🌐
web development
Content type:
Discussion
news.ycombinator.com
·
3h
·
Hacker News
ARTA: Adaptive Reinforcement-Learning-Based Throttling Agent for RowHammer Vulnerabilities
🤖
AI
Content type:
Academic
arxiv.org
·
20h
Simplifying Weak Reference Processing in ZGC
🧠
computer science
inside.java
·
2h
On-device AI is a margin decision
🤖
AI
Content type:
Blog
ziraph.com
·
6h
·
Hacker News
The hidden bottleneck in LLM inference and the impact on MLPerf benchmarking
🤖
LLM
edn.com
·
6d
Can Java Microservices Be As Fast As Go? A 2026 Benchmark Update
🔵
golang
Content type:
Blog
medium.com
·
2d
🇳🇱 Go/Golang job: Senior Backend Engineer (Go) | Studio AI at Creative Fabrica (Amsterdam, Netherlands)
🏗️
software engineering
golangprojects.com
·
9h
bigattichouse/packed-twin-inference: PTI achieves ~2× throughput using a single quantized model (Q5_K_M or better) by running 4 generation streams in one batched decode call. The GPU loads model weights once per step and produces 4 predictions simultaneously. KV cache overhead is ~0.8 GiB total for all 4 streams. No draft model. No quality loss
🖥️
operating systems
Content type:
Code
github.com
·
1d
·
r/LocalLLaMA
[eCHO News] Episode #104: mTLS for Cilium. Lisp for eBPF
🖥️
operating systems
isovalent-9197153.hs-sites.com
·
5d
Intel is turning the wrong clock: The Core Ultra 7 265K shows why Arrow Lake loses more at NGU than D2D can recover
🖥️
operating systems
igorslab.de
·
1d
NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI
🤖
AI
Content type:
Blog
blogs.nvidia.com
·
8h