Mixed Precision, FP16, WMMA, Matrix Multiplication, Deep Learning Acceleration

GEN-0: SoTA 10B+ Foundation Model for Robotics with Harmonic Reasoning
generalistai.com·6h·
Discuss: Hacker News
📊Gradient Accumulation
How We Built a Custom Vision LLM to Improve Document Processing at Grab
engineering.grab.com·1d·
Discuss: Hacker News
🏎️TensorRT
Inference Acceleration from the Ground Up
semiwiki.com·6d
🧠CPU Architecture
Speech-DRAME: A Framework for Human-Aligned Benchmarks in Speech Role-Play
arxiv.org·21h
🔄ONNX
Spiking Neural Networks: The Next Leap in AI Power Efficiency by Arvind Sundararajan
dev.to·1h·
Discuss: DEV
ONNX Runtime
Real-time Semantic Segmentation for AR Glasses: Dynamic Occlusion Handling via Bayesian Fusion
dev.to·17h·
Discuss: DEV
🏎️TensorRT
Show HN: ReadMyMRI DICOM native preprocessor with multi model consensus/ML pipes
github.com·3h·
Discuss: Hacker News
🏎️TensorRT
flowengineR: A Modular and Extensible Framework for Fair and Reproducible Workflow Design in R
arxiv.org·21h
🔄ONNX
World Simulation with Video Foundation Models for Physical AI
arxiv.org·21h
🏎️TensorRT
Contrastive Knowledge Transfer and Robust Optimization for Secure Alignment of Large Language Models
arxiv.org·1d
🎓Model Distillation
Probabilistic Robustness for Free? Revisiting Training via a Benchmark
arxiv.org·21h
📊Gradient Accumulation
A Comparative Analysis of LLM Adaptation: SFT, LoRA, and ICL in Data-Scarce Scenarios
arxiv.org·21h
🎓Model Distillation
Scalable In-Memory Associative Processing for Graph Neural Network Inference
dev.to·2d·
Discuss: DEV
Flash Attention
Building WriteRight: My Journey Creating an AI Writing Assistant with Mastra
dev.to·1d·
Discuss: DEV
🤖AI Coding Tools
T3: Test-Time Model Merging in VLMs for Zero-Shot Medical Imaging Analysis
arxiv.org·1d
🏎️TensorRT
MISA: Memory-Efficient LLMs Optimization with Module-wise Importance Sampling
arxiv.org·21h
📊Gradient Accumulation
Beyond Bandwidth: AI's Quantum Leap in Image Transmission
dev.to·15h·
Discuss: DEV
Flash Attention
DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Models
arxiv.org·1d
🛠Ml-eng
VISAT: Benchmarking Adversarial and Distribution Shift Robustness in Traffic Sign Recognition with Visual Attributes
arxiv.org·1d
🧮cuDNN