Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale
daft.ai·2h·
Discuss: Hacker News
🤖AI
Dual 5090 work station for SDXL
reddit.com·8h·
Discuss: r/LocalLLaMA
🤖AI
Spiking Neural Networks: The Future of Brain-Inspired Computing
arxiv.org·1d
🤖AI
My First Multi-GPU Kernel: Writing All-to-All for AMD MI300X
gau-nernst.github.io·1d·
Discuss: Hacker News
🤖AI
Inside Pinecone: Slab Architecture
pinecone.io·3h·
Discuss: Hacker News
🏛️Software Architecture Patterns
Inline vs. Pipeline Ray Tracing
evolvebenchmark.com·6h·
Discuss: Hacker News
🤖AI
Lazy loading isn't the magic pill to fix AI Inference
tensorfuse-docs.mintlify.dev·5h·
Discuss: Hacker News
🏛️Software Architecture Patterns
Parallel achieves 70% accuracy on SEAL, benchmark for hard web research
parallel.ai·53m·
Discuss: Hacker News
🤖AI
NVIDIA Sends a Powerful GPU to Space
spectrum.ieee.org·1d·
🏛️Software Architecture Patterns
Dive into Systems
diveintosystems.org·1d·
Discuss: Hacker News
⚙️DevOps Practices
Geonum – geometric number library for unlimited dimensions with O(1) complexity
github.com·1d·
Discuss: Hacker News
💻Programming
r/mathematics
reddit.com·6h·
Discuss: r/mathematics
💻Programming
Scaling up Prime Video monitoring service reduced costs 90% (archive) (2023)
web.archive.org·21h·
Discuss: Hacker News
🏛️Software Architecture Patterns
Exploring a space-based, scalable AI infrastructure system design
research.google·3h·
Discuss: Hacker News
🏛️Software Architecture Patterns
Why is AI Generated Rust slow when compared with Go/C#/Node/JavaScript
srid68.github.io·4h·
Discuss: Hacker News
💻Programming
A hitchhiker's guide to CUDA programming
seanzhang.me·5d·
Discuss: Hacker News
🏛️Software Architecture Patterns
Building Yantra: A Visual Workflow Automation Engine
patali.dev·1d·
Discuss: Hacker News
⚙️DevOps Practices
Small Vs. Large Language Models
semiengineering.com·1d·
Discuss: Hacker News, r/LLM
🤖AI
ParallelMind Engine: First AI System with Parallel Logical Reasoning (202+ problems/sec)
github.com·2d·
Discuss: r/programming
🤖AI
KTransformers Open Source New Era: Local Fine-tuning of Kimi K2 and DeepSeek V3
reddit.com·7h·
Discuss: r/LocalLLaMA
🤖AI