Cache Optimization, Memory Access Patterns, Hardware Prefetcher, Performance

Accelerating Long-Context Inference with Skip Softmax in NVIDIA TensorRT-LLM
developer.nvidia.com·6h
🧠LLM Inference
The Attention Hybrid MoE Architecture is the Future. Now, AI Labs Should Dedicate Resources to Improve Long Context Recall Capabilities.
reddit.com·17h·
Discuss: r/LocalLLaMA
🏗️LLM Infrastructure
Use GWP-ASan to detect exploits in production environments
blog.trailofbits.com·15h
🧠Memory Allocators
Intel's Cache Aware Scheduling Presentation At LPC 2025
phoronix.com·7h
⚙️Mechanical Sympathy
ByteDance Had Billion-Scale Vector Search Problem. It Solved with Hybrid Search
velodb.io·29m·
Discuss: Hacker News
🗂️Vector Indexes
Windows Exploitation Techniques: Winning Race Conditions with Path Lookups
projectzero.google·19h
🕳LLM Vulnerabilities
Emulating avx-512 intrinsics in Miri
tweedegolf.nl·17h
🔄SIMD Programming
The Big LLM Architecture Comparison
magazine.sebastianraschka.com·20h
🧠LLM Inference
Research Bits: Dec. 16
semiengineering.com·19h
💻Chips
Experiments with Memory Integrity Enforcement
octet-stream.net·22h·
Discuss: Hacker News
🧠Memory Allocators
How Uber Indexes Streaming Data with Pull-Based Ingestion in OpenSearch™
uber.com·13h
📥Feed Aggregation
How brain-inspired algorithms could drive down AI energy costs
techxplore.com·12h
📱Edge AI Optimization
Everybody Codes 2025 week 4
blog.firedrake.org·18h
🚀Async Optimization
koopman-checksum: a Rust implementation of Koopman checksums which provide longer Hamming-Distance 3 protection than Adler or Fletcher
reddit.com·23h·
Discuss: r/rust
🔒Borrow Checker
A proof of concept of a semistable C++ vector container
github.com·14h·
Discuss: Hacker News, r/cpp
🔓Lock-Free Structures
How WebSockets cost us $1M on our AWS bill
recall.ai·15h
📡Network Latency
Boost GPU Memory Performance with No Code Changes Using NVIDIA CUDA MPS
developer.nvidia.com·10h
🖥GPUs
Nemotron 3 Nano 30B is Amazing! (TLDR)
reddit.com·6h·
Discuss: r/LocalLLaMA
🔐Hardware Security
Lessons from building a content scanner for multiple social platforms
keywordspal.com·7h·
Discuss: Hacker News
🔎Meilisearch
Optimizing Semiconductor Defect Classification with Generative AI and Vision Foundation Models
developer.nvidia.com·1h
🔬Chip Fabrication