Mixed Precision, FP16, WMMA, Matrix Multiplication, Deep Learning Acceleration

Radxa Launches AICore DX-M1 Edge AI Accelerator with DeepX DX-M1 NPU
linuxgizmos.com·2d
🔧PTX
Geonum – geometric number library for unlimited dimensions with O(1) complexity
github.com·12h·
Discuss: Hacker News
✂️CUTLASS
Understanding the Design of Optimizers with me
dev.to·22h·
Discuss: DEV
📊Gradient Accumulation
Kimi Linear: An Expressive, Efficient Attention Architecture
arxiviq.substack.com·2d·
Discuss: Substack
🧩Attention Kernels
Moving past speculation: How deterministic CPUs deliver predictable AI performance
venturebeat.com·1d
🧠CPU Architecture
T3: Test-Time Model Merging in VLMs for Zero-Shot Medical Imaging Analysis
arxiv.org·21h
🏎️TensorRT
We found embedding indexing bottleneck in the least expected place: JSON parsing
nixiesearch.substack.com·9h·
Discuss: Substack
🐕Ruff
A Thesis and Playbook for Edge AI
ondeviceguy.substack.com·15h·
Discuss: Substack
ONNX Runtime
Connectivity Structure and Dynamics of Nonlinear Recurrent Neural Networks
journals.aps.org·2h
📉Model Quantization
Generation at the Speed of Thought: Speculative Decoding
bittere.substack.com·1d·
Discuss: Substack
Flash Attention
New AI models Cursor and Cognition (Windsurf) built on Chinese base models
linkedin.com·1d·
Discuss: r/China
🤖AI Coding Tools
The best AI inference for your project. Blazing fast responses.
dev.to·2h·
Discuss: DEV
Flash Attention
I repurposed my old GPU for self-hosted AI and it changed my life
xda-developers.com·10h
🤖AI Coding Tools
LinEAS: End-to-end Learning of Activation Steering with a Distributional Loss
machinelearning.apple.com·1d
Flash Attention
InertialAR: Autoregressive 3D Molecule Generation with Inertial Frames
arxiv.org·21h
🏎️TensorRT
How We Built a Custom Vision LLM to Improve Document Processing at Grab
engineering.grab.com·2h
🏎️TensorRT
CHERIoT 1.0 Released
cheriot.org·9h
🔧PTX
The True Cost of AI Integrations: Comparing Performance and Pricing Models for C# Libraries
dev.to·1h·
Discuss: DEV
🤖AI Coding Tools
Unlocking AI Potential: Squeezing Giant Models into Tiny Spaces
dev.to·1d·
Discuss: DEV
📉Model Quantization