Cache Optimization


Toward Intelligent Prefetching: A Survey on Complex Memory Access Prediction Techniques

 🏗️Hardware Architecture  Content type: Academic
arxiv.org

I wrote a simple multithreaded code in Rust, but the performance didn’t increase much with an…

 🦀rust  Content type: Blog
medium.com

Zero-Click HFP/A2DP Takeover via L2CAP Session Preemption

 🧠Memory Management

Massive AI Storage Demand Creates a New Memory Wall

 🔧Data Engineering  Content type: News
eetimes.com

The Return of Rigorous Full-System Timing Simulation

 🏗️Hardware Architecture
sigarch.org · Hacker News

Beyond the Memory Wall: The CPU Was Helping You All Along

 🏗️Hardware Architecture  Content type: Blog
prawns.dev · Hacker News

Two Leaps to 1000 Tokens/s on a 1T-Parameter Model: On Inference Systems, Execution Boundaries, and Co-Design

 🏗️Hardware Architecture  Content type: Blog
tilert.ai · Hacker News

Simplifying Weak Reference Processing in ZGC

 🧮Algorithms
inside.java

Release 0.17.6: Merge pull request #3782 from tigerbeetle/release-2026-06-05 · tigerbeetle/tigerbeetle

 🚀DevOps  Content type: Code
github.com

Lexar's "AI Storage Stick" Concept Calls for Treating M.2 NVMe SSDs Like Memory-expansion Cartridges

 🏗️Hardware Architecture
techpowerup.com

The Inference Alpha: Maximizing Frontier Models on AMD

 🏗️Hardware Architecture  Content type: Blog
digitalocean.com

[Dev Weekly #114] Google’s Gemma 4 Changes the Game | Ruby Performance Secrets Exposed | Trust Over Velocity - The Miners

 🤖Copilot  Content type: Blog
blog.codeminer42.com

How Will the AI IC Market Evolve Amid Rising Artificial Intelligence Adoption Through 2034?

 🤖AI  Content type: Blog

ScaleDisturb: Exploiting Temporal Asymmetry to Amplify Read Disturbance in Modern DRAM Chips

 🔍Memory Profilers  Content type: Academic
arxiv.org

Less-relevant results

Stop hand-tuning kernels: How Neuron Agentic Development accelerates AWS Trainium optimizations

 ☁️AWS Infrastructure  Content type: Blog
aws.amazon.com

dod hash-trie - gardenweb

 🧮Algorithms

New comment by Nya-kundi in "Ask HN: Who wants to be hired? (June 2026)"

 🦊GitLab

A Database You Can See

 ⚛️Atomic Databases  Content type: Blog
nockawa.github.io

bigattichouse/packed-twin-inference: PTI achieves ~2× throughput using a single quantized model (Q5_K_M or better) by running 4 generation streams in one batched decode call. The GPU loads model weights once per step and produces 4 predictions simultaneously. KV cache overhead is ~0.8 GiB total for all 4 streams. No draft model. No quality loss

 🏗️Hardware Architecture  Content type: Code
github.com · r/LocalLLaMA

The hidden bottleneck in LLM inference and the impact on MLPerf benchmarking

 📈Time Series
edn.com
