⚡ Hardware Acceleration - nmarshall · Scour

KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++

🔥PyTorch Code

github.com··Hacker News

Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks

aarushgupta.io··Lobsters, Hacker News

AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis

🏗️AI Infrastructure Academic

arxiv.org··Hacker News

PCIe Benefits From AI, Despite Scaling Protocols

semiengineering.com·

Nvidia's RTX Spark is a developer's dream, but AMD's Ryzen AI Max+ is what most people actually need for local AI

xda-developers.com·

Full Context on a Vulkan-Only Strix Halo: The Decode-Drop Reproduces, but the Sweet Spot Moves

🎨Shader Programming

thefrontierlab.ai··Hacker News

Release TorchCodec 0.14: HDR Video Decoding for CPU & CUDA, and Fast Wav Decoder · meta-pytorch/torchcodec

🔥PyTorch Code

github.com··Hacker News

Google reportedly orders at least three million chips from Intel to arrive in 2028, as TSMC struggles to keep up with the AI boom

🖥computers News

··Hacker News

Jensen Huang Just Called This the Next Trillion-Dollar AI Chip Stock

🏗️AI Infrastructure

finance.yahoo.com·

geohot/fromthetransistor: From the Transistor to the Web Browser, a rough outline for a 12 week course

🔌FPGA Code

github.com··Hacker News

Mid-range GPUs have largely dodged the memory crisis, but not for much longer

🌟Ray Tracing

xda-developers.com·

I stopped using most of Rust’s advanced features for my ML library

🔥PyTorch Code

github.com··r/rust

🫧 AI Companies' Shared Destiny Recalls Dot-Com Bubble Memories

🏗️AI Infrastructure Discussion

bullbear.ninja··Hacker News

Unpacking AI: The Hardware Behind AI

🧠AI News

pathtostaff.com··Hacker News

EP217: Latency vs Throughput vs Bandwidth

🏗️System Design News Blog

blog.bytebytego.com·

AMD shipped Nvidia's new AI laptop over a year ago, and the software is finally catching up

🌟Ray Tracing

xda-developers.com·

LLM-Based Porting of Optimized C++ to CUDA Through Deoptimization and Reoptimization

🏗️AI Infrastructure Academic

The Edge LLM Offload Story

semiengineering.com·

zhongkaifu/TensorSharp: A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama/OpenAI-compatible HTTP APIs for programmatic access. It supports Windows/MacOS/Linux with full GPU capability

💻Local LLMs Code

github.com··Hacker News

CodegenBench: Can LLMs Write Efficient Code Across Architectures?

🔥PyTorch Academic

arxiv.org··Hacker News

