Scour
LocalLlama · reddit.com
Intel Arc Pro B70 preliminary testing results (includes some gaming) · forum.level1techs.com · 6w · r/LocalLLaMA
Intel Targets AI Workstations With Memory-Stuffed Arc Pro B70 and B65 GPUs · pcmag.com · 6w · r/LocalLLaMA
GolfStudent v2 14L: d=352, Value Residuals, GPTQ-lite, Schedule-Free, Muon+EMA by whitestone1121-web · Pull Request #604 · github.com · 6w · Hacker News, r/LocalLLaMA
Tesslate/OmniCoder-2-9B-GGUF · huggingface.co · 6w · r/LocalLLaMA
TurboQuant: Redefining AI efficiency with extreme compression · research.google · 6w · Lobsters, Hacker News, r/LocalLLaMA, r/artificial, r/programming
brjen/pytorch-memory-fix: Two environment variables that fix PyTorch/glibc memory creep on Linux. Zero code changes. Zero performance cost. · github.com · 6w · Hacker News, r/LocalLLaMA
[Security]: CRITICAL: Malicious litellm_init.pth in litellm 1.82.8 — credential stealer via PyPI supply chain · Issue #24512 · github.com · 6w · Hacker News, r/LocalLLaMA, r/Python, r/devops, r/selfhosted
meganoob1337/NoobScribe: A Whisper-compatible transcription API and Web UI for recording meetings, transcribing audio, and remembering speakers across sessions with diarization and speaker memory. · github.com · 6w · r/LocalLLaMA
Litellm 1.82.7 and 1.82.8 on PyPI are compromised, do not update! · futuresearch.ai · 6w · Lobsters, Hacker News, r/LocalLLaMA, r/Python, r/programming
Devstral-Small-2-24B fine-tuned on Claude 4.6 Opus reasoning traces [GGUF Q4+Q5] · huggingface.co · 6w · r/LocalLLaMA
darkc0de/Mistral-Small-4-119B-2603-heretic · huggingface.co · 6w · r/LocalLLaMA
Request: Training a pretrained, MoE version of Mistral Nemo · huggingface.co · 6w · r/LocalLLaMA
cenconq25/delta-compress-llm: Exploiting temporal coherence in LLM inference: delta encoding for KV cache compression and weight-skip prediction. Achieves F16-quality KV cache at Q4_0 compression ratios with zero perplexity loss on llama.cpp. · github.com · 6w · Hacker News, r/LocalLLaMA
n57d30top/graph-assist-npu-array-v1-direct-add-commit-add-hi-tap: Curated open-source export for the graph-assist NPU Array V1 direct-add-commit-add-hi-tap branch · github.com · 6w · r/LocalLLaMA
SirhanMacx/mcp-registry: Community registry for Model Context Protocol (MCP) servers — verified install commands, tool listings, structured metadata · github.com · 6w · Hacker News, r/LocalLLaMA, r/mcp
Qwen3.5-9B finetune/export with Opus 4.6 reasoning distillation + mixed extras · huggingface.co · 6w · r/LocalLLaMA
Do not use mixed KV cache quantization · blog.foodnik.app · 6w · r/LocalLLaMA
GitHub here. You can follow the build instructions below as well. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU or just want CPU inferen... · github.com · 162w · r/LocalLLaMA
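The truncated snippet above refers to llama.cpp's CMake build, where the `GGML_CUDA` option toggles the CUDA backend. A minimal sketch of that workflow, assuming the standard llama.cpp repository (the snippet's actual link target is cut off):

```shell
# Clone llama.cpp (assumed target of the truncated "GitHub here" link).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Configure the build. Swap -DGGML_CUDA=ON for -DGGML_CUDA=OFF
# if you don't have a GPU or just want CPU inference.
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
```

The resulting binaries (e.g. the main CLI and server) land under `build/bin/`.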
Green0-0/llm_datasets: A collection of high-quality Hugging Face datasets. · github.com · 6w · r/LocalLLaMA
Has anyone tried this? Flash-MoE: Running a 397B Parameter Model on a Laptop · github.com · 6w · r/LocalLLaMA