Performance

G.Skill explains how AMD EXPO ULL unlocks additional performance — expanded profiles allow memory makers to include subtiming tweaks for the first time

 🧠Memory Allocators  Content type: News
tomshardware.com

Benchmarking OpenZFS vs EXT4 for my NAS | Heitor's log

 🏠Self-Hosting
heitorpb.github.io

Records in Production: Where They Shine and Where They Silently Fail

 🧠Memory Management
javacodegeeks.com

Apple WWDC On-Device AI Deep Dive - Google Docs

 🤖AI Agents
gist.is · Hacker News

Intel is turning the wrong clock: The Core Ultra 7 265K shows why Arrow Lake loses more at NGU than D2D can recover

 🧠CPU Architecture
igorslab.de

ARTA: Adaptive Reinforcement-Learning-Based Throttling Agent for RowHammer Vulnerabilities

 ⏱️Tokio  Content type: Academic
arxiv.org

Elasticsearch simdvec deep-dive: Walking the memory tightrope to 2x better vector throughput

 🧠CPU Architecture  Content type: Blog
elastic.co

Why AI code optimization needs production-grounded benchmarks

 🖥️Systems Programming  Content type: Blog
datadoghq.com · Hacker News

NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI

 🤖AI Agents  Content type: Blog
blogs.nvidia.com

Now available: Amazon EC2 M9g and M9gd instances powered by new AWS Graviton5 processors

 🤖AI Agents  Content type: Blog
aws.amazon.com · Hacker News

MLPerf and the rise of latency-aware LLM benchmarking

 🧠AI Research
edn.com

HFT Latency Monitoring with Probabilistic Calling Context

 ⚙️Compilers

Two Leaps to 1000 Tokens/s on a 1T-Parameter Model: On Inference Systems, Execution Boundaries, and Co-Design

 🤖AI Agents  Content type: Blog
tilert.ai · Hacker News

The Inference Alpha: Maximizing Frontier Models on AMD

 📱Edge Computing  Content type: Blog
digitalocean.com

Why your database benchmarking data is probably wrong (and how I fixed mine)

 ⚙️Database Internals
developers.redhat.com

SanDisk's massive 8TB SD cards are finally close to launch

 🔐Hardware Security  Content type: News
techspot.com

Building & Benchmarking: LLMs on a 16GB Jetson Orin NX for Hermes Agent

 📱Edge AI  Content type: Blog
dnhkng.github.io

Tried to benchmark Google's new on-device dictation model and basically couldn't

 📱Edge AI
getonit.ai · Hacker News

Massive AI Storage Demand Creates a New Memory Wall

 📱Edge AI  Content type: News
eetimes.com

bigattichouse/packed-twin-inference: PTI achieves ~2× throughput using a single quantized model (Q5_K_M or better) by running 4 generation streams in one batched decode call. The GPU loads model weights once per step and produces 4 predictions simultaneously. KV cache overhead is ~0.8 GiB total for all 4 streams. No draft model. No quality loss

 🧠Memory Allocators  Content type: Code
github.com · r/LocalLLaMA
