📉 Model Quantization - minezone · Scour

Quantization-Aware Distillation

ternarysearch.blogspot.com·2d·

Discuss: Hacker News

Image Classification with Convolutional Neural Networks

dev.to·20h·

Discuss: DEV

Large Language Models for Mortals book released

crimede-coder.com·1h·

Discuss: Hacker News

Automating Inference Optimizations with NVIDIA TensorRT LLM AutoDeploy

developer.nvidia.com·21h

🧩LLM Integration

the mathematics of compression in database systems

bitsxpages.com·20h

Main Content || Math ∩ Programming

jeremykun.com·1d

Quantized Tensor Train Compression For Turbulent Flow Simulation: O(log N) Scaling with Reynolds-Independent Bond Dimension

zenodo.org·1d·

Discuss: Hacker News

🚀Performance

Tutorial – What is a variational autoencoder?

jaan.io·23h·

Discuss: Hacker News

Your Agent Is Slow Because of Inference

futureagi.com·4d·

Discuss: DEV

🧩LLM Integration

What I've Learned From Digitizing 20 Million Historical Documents

noahdasanaike.github.io·1d·

Discuss: r/LocalLLaMA

🧲Vector Search & Embeddings

Fastfood: Approximate Kernel Expansions in Loglinear Time

paperium.net·2d·

Discuss: DEV

🗂️Vector Databases

Understanding LLM Inference Engines: Inside Nano-vLLM (Part 2)

neutree.ai·4d·

Discuss: Hacker News

🧩LLM Integration

From Pixels to Precision

dev.to·4h·

Discuss: DEV

The State of Agentic Graph RAG

localoptimumai.substack.com·4h·

Discuss: Substack

Geometrically Allocated Ads in AI Conversations

june.kim·13h·

Discuss: Hacker News

💸Affordable LLMs

Building a Production-Ready Claude Streaming API with Next.js Edge Runtime

bydaewon.gumroad.com·1d·

Discuss: DEV

MiRAGE: Open-source framework for multimodal RAG evaluation

news.ycombinator.com·53m·

Discuss: Hacker News

Expectation and Copysets

buttondown.com·21h·

Discuss: Hacker News

How I squeezed a BERT sentiment analyzer into 1GB RAM on a $5 VPS

mohammedeabdelaziz.github.io·3d·

Discuss: Hacker News

Tutorial on Agentic Engine

pori.vanangamudi.org·1d·

Discuss: r/LocalLLaMA
