Mixed Precision, FP16, WMMA, Matrix Multiplication, Deep Learning Acceleration

Kimi Linear: An Expressive, Efficient Attention Architecture
arxiviq.substack.com·11h·
Discuss: Substack
🧩Attention Kernels
Radxa Launches AICore DX-M1 Edge AI Accelerator with DeepX DX-M1 NPU
linuxgizmos.com·10h
🔧PTX
Quantum-Resistant Federated Learning with Homomorphic Encryption for Medical Imaging Diagnostics
dev.to·34m·
Discuss: DEV
🎓Model Distillation
A hitchhiker's guide to CUDA programming
seanzhang.me·2d·
Discuss: Hacker News
🎯GPU Kernels
AndesVL Technical Report: An Efficient Mobile-side Multimodal Large Language Model
paperium.net·9h·
Discuss: DEV
🏎️TensorRT
zFLoRA: Zero-Latency Fused Low-Rank Adapters
arxiv.org·2d
ONNX Runtime
TinyML is the most impressive piece of software you can run on any ESP32
xda-developers.com·1d
ONNX Runtime
Yes, you should understand backprop (2016)
karpathy.medium.com·4h·
Discuss: Hacker News
📊Gradient Accumulation
An intro to the Tensor Economics blog
lesswrong.com·3d
🏎️TensorRT
The Role of GPUs in Accelerating Deep Learning Training
acecloud.ai·2d·
Discuss: DEV
🔗NCCL
Inference Acceleration from the Ground Up
semiwiki.com·3d
🧠CPU Architecture
Squeezing AI into Tiny Spaces: The Integer Revolution
dev.to·2d·
Discuss: DEV
📉Model Quantization
The next RISC-V processor frontier: AI
edn.com·1d
🧠CPU Architecture
Duality-Based Fixed Point Iteration Algorithm for Beamforming Design in ISAC Systems
arxiv.org·2d
🔗Kernel Fusion
Best Digital Marketing Institute in Allahabad – Ndmit Prayagraj
ndmit.com·3h·
Discuss: Hacker News
🎓Model Distillation
Show HN: Fast-posit, sw implementation of posit arithmetic in Rust
github.com·2d·
Discuss: Hacker News
🔍Type Checkers
Review of Intel-based UP AI development kits – Part 1: Unboxing and first boot to Ubuntu Pro 24.04
cnx-software.com·1d
🔍Nsight
Physics informed machine learning based predictive control for intelligent operation of edge datacenters
sciencedirect.com·13h
🎓Model Distillation
Sparse Adaptive Attention “MoE”: How I Solved OpenAI’s $650B Problem With a £700 GPU
medium.com·4d·
Flash Attention
ViSurf: Visual Supervised-and-Reinforcement Fine-Tuning for Large Vision-and-Language Models
paperium.net·8h·
Discuss: DEV
🏎️TensorRT