⚙️ Finetuning LLMs faster with less memory - autocole · Scour

How LLM Inference Works 🦙Simple finetuning LLMs

arpitbhayani.me·6d·Hacker News

Well! I Am Finally Optimistic About the SubTuringBradBot Project!: Monday MAMLMs 🦙Simple finetuning LLMs

braddelong.substack.com·2d·Substack

8 Architectural Pillars to Boost GenAI LLM Accuracy and Performance in Low Cost 🔵LLM frameworks and AI libraries for TypeScript

techcommunity.microsoft.com·4d

MoE LLMs Confront Real-World Hardware Noise 🦙Simple finetuning LLMs

startuphub.ai·6d

DynaTrain: Fast Online Parallelism Switching for Elastic LLM Training 🦙Simple finetuning LLMs

HWE Bench: A new unbounded Benchmark for LLMs (GPT 5.5 is on top) 🦙Simple finetuning LLMs

hwebench.com·4d·Hacker News

Faster, cheaper, just as smart: Improving the economics of LLM inference with speculative decoding 🔵LLM frameworks and AI libraries for TypeScript

DFlash: The Trick That Makes LLMs Stop Crawling One Token at a Time 🦙Simple finetuning LLMs

abvcreative.medium.com·4d

Show HN: GPT-2 inference in pure C#, 0 bytes allocated per token 🦀Rust language vector embeddings

github.com·2d·Hacker News

#1 on the leading AI memory benchmark using a smaller, cheaper model 🔄AI Pipeline design and techniques

exabase.io·4d·Hacker News

Fast B-Trees 📊Vector Databases

news.ycombinator.com·5d·Hacker News

OpenSQL: Data-Efficient Text-to-SQL for Open-Source LLMs via Synthesized Intermediate Supervision 🔵LLM frameworks and AI libraries for TypeScript

Ollama vs vLLM vs llama.cpp: Which Wins for Your Use Case 🦙Simple finetuning LLMs

tildalice.io·4d

Understanding Inference Scaling for LLMs: Bottlenecks, Trade-offs, and Performance Principles 🔵LLM frameworks and AI libraries for TypeScript

Faster Tokens Please 🔥Svelte

newsletter.semianalysis.com·6d·Hacker News

Slow Memory, Slow Conflict 🔥Svelte

Chasing Bit-Equality Through a Wasm Actor Runtime 🧩WASI

abacusnoir.com·3d

How do I get the superfast DFlash / MTP tokens per second that I'm seeing on here? Dual 3090s 🔥Svelte

github.com·2d·r/LocalLLaMA

MinIO Introduces MemKV for Petabyte-Scale AI Inference Memory 📊Vector Databases

storagereview.com·6d

DeepSeek-V4-Flash makes LLM steering interesting again 🔵LLM frameworks and AI libraries for TypeScript

seangoedecke.com·4d·Lobsters, Hacker News
