✨ Computer Graphics - jhcha.oyo · Scour

KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++

⚡Flash Attention Code

github.com··Hacker News

AgentCompile: An LLM-Guided Compiler for Direct CUDA Inference

🤖AI Academic

RenderLab – Prototype rendering techniques and renderers in the browser

🎮Game Development

pub.prklinteractive.com··Hacker News

1-bit and 1.58 bit LLM Benchmarking on Jetson Orin Nano Super | Bonsai LM

smolhub.com··r/LocalLLaMA

DeepSeekV4 1.6T Day 0 to Day 43 Performance Over Time - Huawei, GB300 NVL72, MI355X, B200

💬LLMs News

newsletter.semianalysis.com··Hacker News

Open source building blocks for computational design. Est. 2006

💻Programming Languages

thi.ng··Hacker News

An introduction to the Linux graphics stack

📚Speculative Fiction

Unsloth Gemma 4 QAT

⚡Quantization

Path-Traced Inverse Rendering with Global Illumination in 3D Gaussian Fields

🎮Game Development Academic

nex-agi/Nex-N2-mini • Huggingface

huggingface.co··r/LocalLLaMA

I stopped using most of Rust’s advanced features for my ML library

🤖AI Code

github.com··r/rust

APEX4: Efficient Pure W4A4 LLM Inference via Intra-SM Compute Rebalancing

🖥️GPU Programming Academic

sgl-project/sglang-omni: SGLang Omni: High-Performance Multi-Stage Pipeline Framework for Omni Models

⚡Hardware Acceleration Code

AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis

🤖AI Academic

arxiv.org··Hacker News

ASTRA-sim 3.0: Next-Level Distributed Machine Learning Simulations via High-Fidelity GPU and Infrastructure Modeling

🖥️GPU Programming Academic

Density Field State Space Models: 1-Bit Distillation, Efficient Inference, and Knowledge Organization in Mamba-2

🖥️GPU Programming Academic

Parallel Causal Associative Fields: Gated Sparse Memory for Long-Context Language Modeling

⚡Transformers Academic

Does anyone know what PCIe mode was used for these benchmarks?

💬LLMs Code

github.com··r/LocalLLaMA

Communication Strategy Selection for Multi-GPU 3D FDTD with Convolutional Perfectly Matched Boundary Layers

🖥️GPU Programming Academic

LLM-Based Porting of Optimized C++ to CUDA Through Deoptimization and Reoptimization

🖥️GPU Programming Academic
