Scour
🧠 Local LLMs
Specific
local AI, self-hosted LLM, ollama, on-device inference
Scoured 411 posts in 8.2 ms
Trainable Smooth-Rotation Transforms with Learned Channel Scales for LLM Quantization
🤖 LLMs · Content type: Academic · arxiv.org · 20 hours ago
Local AI agents on Arduino UNO Q
🤖 Agents · Content type: Blog · blog.arduino.cc · 1 day ago
Running LLM Inference on Kubernetes: What It Actually Takes
📝 NLP · Content type: Blog · fairwinds.com · 5 days ago
andreyvgavrilov/food_database: AI agent to evaluate recipe nutrition
🤖 Agents · Content type: Code · github.com · 2 days ago · r/mcp
LM Link launches on iPhone, bringing local AI model access to iOS devices
🤖 LLMs · alternativeto.net · 5 days ago
Xiaomi MiMo-V2.5-Pro Just Hit 1,000 Tokens Per Second!
📝 NLP · gizchina.com · 1 day ago
Less-relevant results
Re-quantizing a local LLM 14x faster by skipping the tensors that didn't change
🔥 PyTorch · Content type: News, Blog · andreaborio.substack.com · 11 hours ago · Substack
Apples to Apples: MLX vs. Llama.cpp for Gemma 4 12B on an M1 16GB
🤖 Qwen · Content type: Blog · ziraph.com · 5 days ago · Hacker News
LC-QAT: Data-Efficient 2-Bit QAT for LLMs via Linear-Constrained Vector Quantization
🤖 LLMs · Content type: Academic · arxiv.org · 20 hours ago
China's Xiaomi MiMo Is Now 15X Faster Than ChatGPT and Claude (4 minute read)
🟣 Claude · Content type: News · decrypt.co · 2 days ago
How to Run Gemma 4 12B Locally - The Best AI For Consumer Laptops
🧠 OpenAI · Content type: Video · youtube.com · 6 days ago
TFLite Edge Model Quantizer Snippet
🔷 TensorFlow · itsevilduck.gumroad.com · 2 days ago · DEV
fix(memory-core): filter stale recall entries in REM harness preview · openclaw/openclaw@92418fc
👨‍💻 AI Coding · Content type: Code · github.com · 18 hours ago
A system programmer’s guide to LLM inference
📝 NLP · Content type: Blog · blog.xiangpeng.systems · 3 days ago · Hacker News
WWDC 2026: Foundation Models (& Anarlog)
♊ Gemini · skushagra.com · 2 days ago
LM Studio now lets you use your iPhone to talk to local models on your Mac
🧠 OpenAI · 9to5mac.com · 6 days ago · r/apple
Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks
🔬 Deep Learning · aarushgupta.io · 1 day ago · Lobsters, Hacker News
Running Qwen 35B MoE at 450k Context on a Single 32GB GPU
📝 NLP · local-llm.utop.workers.dev · 3 days ago · Hacker News
Information Bottleneck Meets Quantization: Finite Rate Analysis and Optimal Designs
🤖 LLMs · Content type: Academic · arxiv.org · 20 hours ago
"AI" Is Eating Platform Monopolist Free Cash Flow, Not the World: CHART OF THE DAY
🧠 OpenAI · Content type: News, Blog · braddelong.substack.com · 2 days ago · Substack