⚡ CUDA - moyutianzun · Scour

SK Hynix and Nvidia co-developing memory and memory manufacturing tech

🔺Triton News

blocksandfiles.com·

NVIDIA’s New RTX Spark Superchip Changes Everything for On-the-Go 12K Video Editing and 3D Rendering

🤖agentic system

canonrumors.com·

On-device AI is a margin decision

⚡Inference Optimization Blog

ziraph.com··Hacker News

WSL 3 will finally let Linux apps use your GPU and NPU without the performance tax

xda-developers.com·

Apple WWDC Announces Privacy Leap Off a Cliff

⚡FlashAttention

flyingpenguin.com·

Google Joins Anthropic in Leasing Compute from SpaceX/xAI

🔺Triton Blog

512pixels.net·

Apple extends Private Cloud Compute to third-party data centers

🔲TPU Architecture

helpnetsecurity.com·

Location: Edmonton, Canada Remote: Yes Willing to relocate: Yes, within Canada T...

📊LLM Evaluation Discussion

news.ycombinator.com··Hacker News

Firefox 153 is finally ditching the Nvidia driver workaround Linux users have hated

🔺Triton News

xda-developers.com·

440 Power! Rare 1970 Plymouth Fury Gran Coupe

📊LLM Evaluation

barnfinds.com·

Data center infrastructure startup TensorWave raises $350M to help break Nvidia’s AI chip monopoly

siliconangle.com·

Unreleased RTX 3050 Ti engineering sample appears in photos and benchmarks — the RTX 3060 alternative that never happened

🔺Triton News

tomshardware.com·

AgentCompile: An LLM-Guided Compiler for Direct CUDA Inference

🔺Triton Academic

Veo with Anders Hellerup Madsen and Gorm Casper

corrode.dev··r/rust

AMD Radeon RX 9070 GRE vs. Nvidia GeForce RTX 5070

KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++

⚡FlashAttention Code

github.com··Hacker News

Introducing Piper: A Programmable Distributed Training System

🎭Mixture of Experts Academic Blog

syfi.cs.washington.edu··Hacker News

MTP Isn't Always a Win: 1.95x on My 3090, but Speculative Decoding Is Hardware-Dependent

⚡Inference Optimization Blog

bric.pe.kr··DEV

DiffusionGemma: The Developer Guide

💾KV Cache Blog

developers.googleblog.com·

BeeLlama.cpp DFlash on Strix Halo: 2.7x Gemma 31B, But MTP Is Still Faster

⚡Inference Optimization

sleepingrobots.com·
