🐿️ Scour
🧠 unified memory

gpu localized ai

NVFP4 Trains with Precision of 16-Bit and Speed and Efficiency of 4-Bit
developer.nvidia.com·6d·
Discuss: Hacker News
💼ai-run businesses
Why I Ditched Malloc for AI Inference
gilli.dev·2d·
Discuss: Hacker News
🦀Rust
Huawei preps AI SSD to ease GPU memory bottlenecks
blocksandfiles.com·4d
🏢oxide computer
Dynamic KV Cache Scheduling in Heterogeneous Memory Systems for LLM Inference (Rensselaer Polytechnic Institute, IBM)
semiengineering.com·2d
🦀Rust
When it comes to running Ollama on your PC for local AI, one thing matters more than most — here's why
windowscentral.com·5d
🏢oxide computer
Fast CUDA DFloat11 decoding kernel
reddit.com·6d·
Discuss: r/LocalLLaMA
🦀Rust
MSNav: Zero-Shot Vision-and-Language Navigation with Dynamic Memory and LLM Spatial Reasoning
arxiv.org·5d
🤖agentic coding
AMD's Next-Gen UDNA: Four Die Sizes, One Potential 96-CU Flagship
techpowerup.com·3d·
Discuss: r/LocalLLaMA
🦀Rust
How AI Is Reshaping the Value of SSDs and DDR
dev.to·3d·
Discuss: DEV
💼ai-run businesses
Matrix Multiplication on Nvidia's Blackwell: Part 1 – Introduction
modular.com·1d·
Discuss: Hacker News
🦀Rust
LLM VRAM Usage Cut by 45x? What Jet-Nemotron Means for Local Users
hardware-corner.net·4d·
Discuss: Hacker News
🦀Rust
NVIDIA details Blackwell Ultra GB300: dual-die design, 208B transistors, up to 288GB HBM3E
tweaktown.com·4d
🏢oxide computer
Artificial neuron merges DRAM with MoS₂ circuits to better emulate brain-like adaptability
techxplore.com·19h
🏢oxide computer
Show HN: Paragon: A Go-native AI framework with WebGPU/Vulkan (no CUDA lock-in)
openfluke.com·4d·
Discuss: Hacker News
🦀Rust
Check this one out! Built my own AI second brain using Claude as the final boss dev (8 months journey)
oneeko.ai·4d·
Discuss: r/ClaudeAI
💼ai-run businesses
vLLM Performance Tuning: The Ultimate Guide to xPU Inference Configuration
cloud.google.com·5d
🏢oxide computer
Long Shot: augmenting COCONUT with a working memory
github.com·5d·
Discuss: r/LocalLLaMA
💼ai-run businesses
Designing AI factories: Purpose-built, on-prem GPU data centers
datasciencecentral.com·4d
💼ai-run businesses
Why are CUDA kernels hard to optimize?
johndcook.com·3d·
Discuss: Hacker News
🦀Rust
Everyone talks about AI “memory,” but nobody defines it.
threadreaderapp.com·3d
🤖agentic coding