📉 Model Quantization - miterion · Scour

Quantization-Aware Distillation

ternarysearch.blogspot.com·2d·Discuss: Hacker News

🎓Model Distillation

Regularized Calibration with Successive Rounding for Post-Training Quantization

arxiv.org·4d

🏎️TensorRT

Image Classification with Convolutional Neural Networks

dev.to·12h·Discuss: DEV

Main Content || Math ∩ Programming

jeremykun.com·1d

🔗Kernel Fusion

Tutorial – What is a variational autoencoder?

jaan.io·14h·Discuss: Hacker News

🏎️TensorRT

Autoregressive Image Generation with Masked Bit Modeling

arxiv.org·3h

⚡Flash Attention

Quantized Tensor Train Compression For Turbulent Flow Simulation: O(log N) Scaling with Reynolds-Independent Bond Dimension

zenodo.org·19h·Discuss: Hacker News

🏎️TensorRT

Faster AI Training Unlocked With New System For Massive Language Models

quantumzeitgeist.com·18h

🎯Tensor Cores

Writing a ONNX Neural Network Inference Engine from Scratch in C to run image classification with MobileNetV2

flexw.github.io·1d·Discuss: r/C_Programming

⚡ONNX Runtime

Guide: Getting started with choosing a Machine Learning CLIP Model for Smart Search · immich-app/immich

github.com·8h

👁️Attention Optimization

Expectation and Copysets

buttondown.com·13h·Discuss: Hacker News

SAE Feature Matchmaking (Layer-to-Layer)

lesswrong.com·3h

A Note on Flat Abstract Syntax Trees

gist.github.com·13h·Discuss: Hacker News

🔬Static Analysis

Manufacturing QMS Software

samrian.com·16h·Discuss: Hacker News

⏱️Benchmarking

Sense8 WorldToolKit Demo v1.01 : Sense8 : Free Download, Borrow, and Streaming

archive.org·9h

🏎️TensorRT

the mathematics of compression in database systems

bitsxpages.com·12h

📈Occupancy Optimization

Gated Attention & DeltaNets: The Missing Link for Long-Context AI

pub.towardsai.net·2h

👁️Attention Optimization

Scale LLM fine-tuning with Hugging Face and Amazon SageMaker AI

aws.amazon.com·15h

🎓Model Distillation

Geometrically Allocated Ads in AI Conversations

june.kim·5h·Discuss: Hacker News

🧩Attention Kernels

Automating Inference Optimizations with NVIDIA TensorRT LLM AutoDeploy

developer.nvidia.com·13h

🏎️TensorRT
