Why vLLM is the best choice for AI inference today
developers.redhat.com·3d
🔄ONNX
Scaling Embeddings with Feast and KubeRay
feast.dev·3d
Discuss: Hacker News
🌐Distributed Computing
Keeping Linux Responsive - Taming the OOM Killer with EarlyOOM
dev.to·1d
Discuss: DEV
📊Profiling Tools
Inference Acceleration from the Ground Up
semiwiki.com·4d
🎯Tensor Cores
ClipTagger-12B VLM: Frame Captioning Tutorial
dev.to·7h
Discuss: DEV
🔄ONNX
FlashWorld: High-quality 3D Scene Generation within Seconds
paperium.net·11h
Discuss: DEV
⚡Flash Attention
Ubuntu Blog: Why we brought hardware-optimized GenAI inference to Ubuntu
ubuntu.com·3d
⚡ONNX Runtime
4x RTX 3090 Setup for Wan2.2-TI2V-5B (FP16)
textimage2video.py·4d
Discuss: r/LocalLLaMA
📈GPU Occupancy
Prediction: AMD Will Be Worth More Than Broadcom by 2030
fool.com·6h
🔍Nsight
NVIDIA and Samsung working even closer together, new semiconductor AI factory has 50,000+ GPUs
tweaktown.com·21h
🔍Nsight
A Practitioner's Guide to Kolmogorov-Arnold Networks
arxiviq.substack.com·5h
Discuss: Substack
📉Model Quantization
Cycle-accurate 6502 emulator as coroutine in Rust
github.com·1d
📊Profiling Tools
Platform generated AI slop at scale
markjgsmith.com·1h
🤖AI Coding Tools
OpenAI and NVIDIA Team Up for Massive AI Infrastructure Deployment
dev.to·7h
Discuss: DEV
🔗NCCL
Finetuning Open-source models with Opus, Sonnet 4.5 and Haiku 4.5
reddit.com·12h
Discuss: r/ClaudeAI
📉Model Quantization
Performance evaluation of image convolution with gradient filters in OpenCL
milania.de·4d
Discuss: Hacker News
🔢cuBLAS
GPU for Prodesk 400 G7 SFF
reddit.com·2h
Discuss: r/sffpc
🔧PTX