🤖 Local LLMs
Ollama, LLaMA, Mistral, On-device AI
Scoured 600 posts in 8.4 ms
Unsloth Gemma 4 QAT
🧠 Deep Learning · unsloth.ai · 4 days ago
defai-digital/ax-engine: Apple Silicon LLM runtime supporting Gemma 4 and Qwen 3.6 MTP modes
🧠 Deep Learning · Content type: Code · github.com · 15 hours ago · Hacker News
Ollama 0.30 delivers faster NVIDIA GPU performance and wider hardware support
🗂️ Obsidian · alternativeto.net · 2 days ago
Qwen 3.6 27B AutoRound GGUF, need your feedback
🚌 GTFS · huggingface.co · 21 hours ago · r/LocalLLaMA
Google Shrank Gemma 4 by 72% and Unsloth Fixed the 4-Bit Bug Nobody Else Caught on One 4090, and 4-Bit Shouldn’t Be This Good
🤖 AI Ethics · Content type: Blog · towardsai.net · 2 days ago
Improved performance and model support with GGUF
🧠 Deep Learning · Content type: Blog · ollama.com · 5 days ago
lightmetal: GPU LLM Inference From a Single Java 25 JAR
🧠 Deep Learning · Content type: Blog · adambien.blog · 1 day ago
Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency
🧠 Deep Learning · Content type: News, Blog · blog.google · 5 days ago · Hacker News
Google fills out the middle with the Gemma 4 12B
🧠 LLM Reasoning · jonpeddie.com · 1 day ago
The latest Gemma 4 models use a training trick to slash their on-device memory footprint
🧠 Deep Learning · androidauthority.com · 4 days ago
Integrate on-device AI models into your app using Core AI - WWDC26 - Videos
👁️ Computer Vision · developer.apple.com · 2 days ago · Hacker News
local llm on laptop 780M GPU using llama + gemma 4 qat
💾 Local-first Software · Content type: Blog · alper.bearblog.dev · 4 days ago
Apples to Apples: MLX vs. Llama.cpp for Gemma 4 12B on an M1 16GB
🥶 Cold Start Problem · Content type: Blog · ziraph.com · 4 days ago · Hacker News
Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM
🧠 Deep Learning · deemwar-products.github.io · 5 days ago · Hacker News
Google’s new Mac app keeps your AI chats off the internet
🗂️ Obsidian · cultofmac.com · 6 days ago
Google Gemma4 12B released
🧠 Deep Learning · Content type: Blog · medium.com · 6 days ago
Google makes Gemma 4 12B a local AI bet for startups
🧠 LLM Reasoning · startupfortune.com · 6 days ago
Gemma 4 12B: A unified, encoder-free multimodal model
🥶 Cold Start Problem · Content type: Discussion · news.ycombinator.com · 3 days ago · Hacker News
Google DeepMind releases Gemma 4 QAT, but Unsloth developer Daniel Han warns naive llama.cpp conversions suffer accuracy loss
🥶 Cold Start Problem · Content type: News · digg.com · 4 days ago
Using Scikit-LLM with Open-Source LLMs
📈 Data Science · machinelearningmastery.com · 6 days ago