LocalLlama · Scour

Open-Source "GreenBoost" Driver Aims To Augment NVIDIA GPUs vRAM With System RAM & NVMe To Handle Larger LLMs

phoronix.com·7w·Hacker News, r/LocalLLaMA

I Ran Kotlin HumanEval on 11 Local LLMs. An 8GB Model Beat Several 30B Models

medium.com·7w·r/LocalLLaMA

ForwardCompatible/GestaltSyntax: An AI-native semantic compression syntax for natural language and code. Specification, examples, compression tests, and a reproducible debugging experiment with unexpected, suggestive results.

github.com·7w·r/LocalLLaMA

llama.cpp build b8338 adds OpenVINO backend + NPU support for prefill + kvcache

github.com·7w·r/LocalLLaMA

Cicikus v3 Prometheus 4.4B - An Experimental Franken-Merge for Edge Reasoning

huggingface.co·7w·Hacker News, r/LocalLLaMA

Cross-Lingual Acoustic Feature Database for Tabular ML and Emotion Recognition

huggingface.co·7w·r/LocalLLaMA

vLLM on Jetson Orin — pre-built wheel with Marlin GPTQ support (3.8x prefill speedup)

github.com·7w·r/LocalLLaMA

StepFun releases SFT dataset used to train Step 3.5 Flash

huggingface.co·7w·r/LocalLLaMA

Nvidia’s Nemotron 3 Super is a Bigger Deal Than You Think

signalbloom.ai·7w·Hacker News, r/LocalLLaMA

(Very) High-Quality Attention Coder-Next GGUFs

huggingface.co·7w·r/LocalLLaMA

How to host and run DeepSeek 671B in your house for under $2,000

thomasunise.com·7w·r/LocalLLaMA

jaden688/JL_Engine-local: JL Engine Local is a local-first runtime and UI stack for the JL Engine.

github.com·7w·r/LocalLLaMA, r/artificial

gkjuwon-ui/agisti: Recursive self-improvement framework for LLMs. No teacher, no RLHF — the model performs surgery on its own weights. Built by a 13-year-old.

github.com·7w·r/LocalLLaMA

https://github.com/mayocream/koharu I plan to add segment and inpaint features to Koharu... I learn Rust for 3 months, and it's my first Rust-written applic...

github.com·804w·Hacker News, r/LocalLLaMA, r/rust

The bias is not in what they say - it's in what they assume about you.

aibyshinde.substack.com·7w·r/LocalLLaMA

dealignai/Nemotron-3-Super-120B-A12B-4bit-MLX-CRACK-Uncensored

huggingface.co·7w·r/LocalLLaMA

50x Faster Post-Training

workshoplabs.ai·7w·Hacker News, r/LocalLLaMA

Fine-tuned Qwen 3.5 2B to beat same-quant 4B, 9B, 27B, and 35B on a real dictation cleanup task, full pipeline, code, and eval (RTX 4080 Super, under £1 compute...

github.com·7w·r/LocalLLaMA

Open source LLM compiler for models on Huggingface. 152 tok/s. 11.3W. 5.3B CPU instructions. mlx-lm: 113 tok/s. 14.1W. 31.4B CPU instructions on macbook M1 Pro.

github.com·7w·r/LocalLLM, r/LocalLLaMA

catcam/hads: Human-AI Document Standard — lightweight convention for AI-optimized technical documentation

github.com·7w·Hacker News, r/GithubCopilot, r/LocalLLaMA
