Trying Out C++26 Executors
🌊Glommio
Strix Halo, Debian 13@6.16.12&6.17.8, Qwen3Coder-Q8 CTX<=131k, llama.cpp@Vulkan&ROCm, Power & Efficiency
📊Performance Tools
The Engineering Guide to Efficient LLM Inference: Metrics, Memory, and Mathematics
pub.towardsai.net·2d
📊Profile-Guided Optimization
EP190: Cloudflare vs. AWS vs. Azure
blog.bytebytego.com·16h
🔭Tracing
OSS Friday Update
🌊Glommio
How Google Does It: Building the largest known Kubernetes cluster, with 130,000 nodes
☸️Kubernetes
Show HN: Lite³ – A JSON-Compatible Zero-Copy Serialization Format in 9.3 KB of C
📄FlatBuffers
September 2024 Progress in Guaranteed Safe AI
lesswrong.com·2d
🧮SMT Solvers
20x Faster TRL Fine-tuning with RapidFire AI
huggingface.co·2d
📊Performance Tools
LLM APIs are a Synchronization Problem
🦙Ollama
I got frustrated with existing web UIs for local LLMs, so I built something different
🦙Ollama
Zoomer: Powering AI Performance at Meta’s Scale Through Intelligent Debugging and Optimization
engineering.fb.com·1d
🚀Performance