Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale
daft.ai·2h
Discuss: Hacker News
🔄 Concurrency
Why is AI Generated Rust slow when compared with Go/C#/Node/JavaScript
srid68.github.io·4h
Discuss: Hacker News
๐ŸŒWebAssembly
Don't let these 3 CPU specs trick you into paying more
xda-developers.com·1d
🔄 Concurrency
How to Build an Enterprise AI Benchmarking Framework?
dev.to·22h
Discuss: DEV
๐ŸŒWebAssembly
On Designing Low-Latency Systems for High-Traffic Environments
hackernoon.com·1d
🔄 Concurrency
Free Functions Don't Change Performance (Much)
16bpp.net·1d
Discuss: Hacker News, r/cpp
🔄 Concurrency
Running MiniMax-M2 locally - Existing Hardware Advice
reddit.com·3h
Discuss: r/LocalLLaMA
๐ŸŒWebAssembly
Inline vs. Pipeline Ray Tracing
evolvebenchmark.com·6h
Discuss: Hacker News
🔄 Concurrency
Lazy loading isn't the magic pill to fix AI Inference
tensorfuse-docs.mintlify.dev·5h
Discuss: Hacker News
๐ŸŒWebAssembly
Inside Pinecone: Slab Architecture
pinecone.io·3h
Discuss: Hacker News
๐Ÿ—„๏ธDatabase Design
Benchmarking the cost of Java's EnumSet - A Second Look
kinnen.de·38m
Discuss: r/programming
🔄 Concurrency
Essential Things to Know Before Upgrading Your Computer Memory
buysellram.com·1d
Discuss: Hacker News
๐ŸŒWebAssembly
Disciplined Biconvex Programming
arxiv.org·15h
🔄 Concurrency
Parallel achieves 70% accuracy on SEAL, benchmark for hard web research
parallel.ai·51m
Discuss: Hacker News
🔄 Concurrency
Attention Is All You Need for KV Cache in Diffusion LLMs
paperium.net·15h
Discuss: DEV
🔄 Concurrency
My First Multi-GPU Kernel: Writing All-to-All for AMD MI300X
gau-nernst.github.io·1d
Discuss: Hacker News
🔄 Concurrency
Algorithmic Complexity Reduction via Quantized State Space Search
dev.to·2h
Discuss: DEV
🔄 Concurrency
Balancing Cost, Power, and AI Performance
oreilly.com·1h
🔌 API Development
eBPF Tutorial by Example: Monitoring GPU Driver Activity with Kernel Tracepoints
dev.to·12h
Discuss: DEV
🔄 Concurrency
Moving past speculation: How deterministic CPUs deliver predictable AI performance
venturebeat.com·2d
🔄 Concurrency