🎯 Tensor Cores - miterion · Scour

Deep Integration and the Convergence of Model Architecture and Hardware in AI

dev.to·1d·

Discuss: DEV

Can-t stop till you get enough

cant.bearblog.dev·1d·

Discuss: Hacker News

📜TorchScript

Beyond ImageNet: Understanding Cross-Dataset Robustness of Lightweight Vision Models

arxiv.org·3h

Attention Is All You Need for KV Cache in Diffusion LLMs

paperium.net·4h·

Discuss: DEV

🔲Loop Tiling

A Soft‑Fork Proposal for Blockchain‑Based Distributed AI Computation

hackernoon.com·21h

My First Multi-GPU Kernel: Writing All-to-All for AMD MI300X

gau-nernst.github.io·1d·

Discuss: Hacker News

🎯GPU Kernels

Gated DeltaNet (Linear Attention variant in Qwen3-Next and Kimi Linear)

sebastianraschka.com·1d·

Discuss: r/LLM

👁️Attention Optimization

Hybrid-Attention models are the future for SLMs

inference.net·6h·

Discuss: Hacker News

⚡Flash Attention

Co-Simulation Framework for Parallel DNN Execution on Chiplet-Based Systems (UW–Madison, Washington State)

semiengineering.com·11h

🌊CUDA Streams

Don't let these 3 CPU specs trick you into paying more

xda-developers.com·12h

⚡Flash Attention

I made a tensor runtime & inference framework in C (good for learning how inference works)

github.com·1d·

Discuss: r/C_Programming

📜TorchScript

On the Structure of Floating-Point Noise in Batch-Invariant GPU Matrix Multiplication

arxiv.org·3h

AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs

arxiv.org·3h

📊Gradient Accumulation

Transformer-Based Decoding in Concatenated Coding Schemes Under Synchronization Errors

arxiv.org·3h

⚡Flash Attention

Kimi Linear: An Expressive, Efficient Attention Architecture

arxiviq.substack.com·2d·

Discuss: Substack

🧩Attention Kernels

The Evolution of GPUs: How Floating-Point Changed Computing

dell.com·1d·

Discuss: Hacker News

Real-time stock volatility prediction with deep learning on a time-series DB

medium.com·34m·

Discuss: Hacker News

⚡ONNX Runtime

MobileNetV3 Paper Walkthrough: The Tiny Giant Getting Even Smarter

towardsdatascience.com·1d

📉Model Quantization

Neuromorphic Computing: Building Brain-Inspired Processors to Revolutionize Technology

thetasvibe.blogspot.com·1d

⚡Flash Attention

Design of quasi phase matching crystal based on differential gray wolf algorithm

arxiv.org·3h

🌐Distributed Computing
