Scour
LocalLlama · reddit.com
Intel Arc Pro B70 preliminary testing results (includes some gaming) · forum.level1techs.com · 6w · r/LocalLLaMA
Intel Targets AI Workstations With Memory-Stuffed Arc Pro B70 and B65 GPUs · pcmag.com · 6w · r/LocalLLaMA
GolfStudent v2 14L: d=352, Value Residuals, GPTQ-lite, Schedule-Free, Muon+EMA by whitestone1121-web · Pull Request #604 · github.com · 6w · Hacker News, r/LocalLLaMA
Tesslate/OmniCoder-2-9B-GGUF · huggingface.co · 6w · r/LocalLLaMA
TurboQuant: Redefining AI efficiency with extreme compression · research.google · 6w · Lobsters, Hacker News, r/LocalLLaMA, r/artificial, r/programming
brjen/pytorch-memory-fix: Two environment variables that fix PyTorch/glibc memory creep on Linux. Zero code changes. Zero performance cost. · github.com · 6w · Hacker News, r/LocalLLaMA
[Security]: CRITICAL: Malicious litellm_init.pth in litellm 1.82.8 — credential stealer via PyPI supply chain · Issue #24512 · github.com · 6w · Hacker News, r/LocalLLaMA, r/Python, r/devops, r/selfhosted
meganoob1337/NoobScribe: A Whisper-compatible transcription API and Web UI for recording meetings, transcribing audio, and remembering speakers across sessions with diarization and speaker memory. · github.com · 6w · r/LocalLLaMA
Litellm 1.82.7 and 1.82.8 on PyPI are compromised, do not update! · futuresearch.ai · 6w · Lobsters, Hacker News, r/LocalLLaMA, r/Python, r/programming
Devstral-Small-2-24B fine-tuned on Claude 4.6 Opus reasoning traces [GGUF Q4+Q5] · huggingface.co · 6w · r/LocalLLaMA
darkc0de/Mistral-Small-4-119B-2603-heretic · huggingface.co · 6w · r/LocalLLaMA
Request: Training a pretrained, MoE version of Mistral Nemo · huggingface.co · 6w · r/LocalLLaMA
cenconq25/delta-compress-llm: Exploiting temporal coherence in LLM inference: delta encoding for KV cache compression and weight-skip prediction. Achieves F16-quality KV cache at Q4_0 compression ratios with zero perplexity loss on llama.cpp. · github.com · 6w · Hacker News, r/LocalLLaMA
n57d30top/graph-assist-npu-array-v1-direct-add-commit-add-hi-tap: Curated open-source export for the graph-assist NPU Array V1 direct-add-commit-add-hi-tap branch · github.com · 6w · r/LocalLLaMA
SirhanMacx/mcp-registry: Community registry for Model Context Protocol (MCP) servers — verified install commands, tool listings, structured metadata · github.com · 6w · Hacker News, r/LocalLLaMA, r/mcp
Qwen3.5-9B finetune/export with Opus 4.6 reasoning distillation + mixed extras · huggingface.co · 6w · r/LocalLLaMA
Do not use mixed KV cache quantization · blog.foodnik.app · 6w · r/LocalLLaMA
GitHub here. You can follow the build instructions below as well. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU or just want CPU inferen... · github.com · 162w · r/LocalLLaMA
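The truncated snippet above refers to llama.cpp's CMake build, where the `GGML_CUDA` option toggles the CUDA backend. A minimal sketch of that workflow, assuming the standard llama.cpp repository (the snippet's actual link target is cut off):

```shell
# Clone llama.cpp (assumed target of the truncated "GitHub here" link).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Configure the build. Swap -DGGML_CUDA=ON for -DGGML_CUDA=OFF
# if you don't have a GPU or just want CPU inference.
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
```

The resulting binaries (e.g. the main CLI and server) land under `build/bin/`.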
Green0-0/llm_datasets: A collection of high-quality Hugging Face datasets. · github.com · 6w · r/LocalLLaMA
Has anyone tried this? Flash-MoE: Running a 397B Parameter Model on a Laptop · github.com · 6w · r/LocalLLaMA