⚙️ Finetuning LLMs faster with less memory - autocole · Scour

DeepSeek-V4-Flash makes LLM steering interesting again 🔵LLM frameworks and AI libraries for TypeScript

seangoedecke.com·4d·Lobsters, Hacker News

Pen and Paper LM 160: a language model small enough to run by hand 🦙Simple finetuning LLMs

maciej.bearblog.dev·6d·Hacker News

NGM: A Plug-and-Play Training-Free Memory Module for LLMs 🔵LLM frameworks and AI libraries for TypeScript

Benchmarking Subquadratic's latest model and SSA Kernel 🔥Svelte

appen.com·5d·Hacker News

MiniCPM-V 4.6: The 1.3B Model Running on Your Phone That Challenges Much Larger Rivals 🦙Simple finetuning LLMs

firethering.com·6d·Hacker News

How do I get the superfast DFlash / MTP tokens per second that I'm seeing on here? Dual 3090s 🔥Svelte

github.com·2d·r/LocalLLaMA

Flash PD-SSM: Memory-Optimized Structured Sparse State-Space Models 🔵LLM frameworks and AI libraries for TypeScript

LLM Memory Calculator: Online Estimators Miss 40% Usage 🦙Simple finetuning LLMs

tildalice.io·6d

(VBS-NN) ML – 512k context length pre-training on a 12GB GPU 📊Vector Databases

github.com·2d·Hacker News

Stage-adaptive Token Selection for Efficient Omni-modal LLMs 🦙Simple finetuning LLMs

SpecSA: Bridging Speculative Decoding and Sparse Attention for Efficient LLM Inference 🔵LLM frameworks and AI libraries for TypeScript

TIDE: Efficient and Lossless MoE Diffusion LLM Inference with I/O-aware Expert Offload 🔵LLM frameworks and AI libraries for TypeScript

PerfCodeBench: Benchmarking LLMs for System-Level High-Performance Code Optimization 🔥Svelte

Prompt Optimization for LLM Code Generation via Reinforcement Learning 🔄AI Pipeline design and techniques

Lossless Anti-Distillation Sampling 📊Vector Databases

The Silent Hyperparameter: Quantifying the Impact of Inference Backends on LLM Reproducibility 🔵LLM frameworks and AI libraries for TypeScript

A Few GPUs, A Whole Lotta Scale: Faithful LLM Training Emulation with PrismLLM 🔵LLM frameworks and AI libraries for TypeScript

Formal Skill: Programmable Runtime Skills for Efficient and Accurate LLM Agents 🔵LLM frameworks and AI libraries for TypeScript

PEEK: Context Map as an Orientation Cache for Long-Context LLM Agents 🔄AI Pipeline design and techniques

2.3x KV Cache Compression at 32k Context 🦙Simple finetuning LLMs

github.com·5d·Hacker News
