🔓 Open Source AI
open source models, Hermes, Mistral, local LLM, Ollama
Scoured 728 posts in 10.2 ms
The latest Gemma 4 models use a training trick to slash their on-device memory footprint
🔔 Plan 9
androidauthority.com · 5 days ago
A new chapter of efficient foundation models for medical imaging
🖥️ Retro Computing
techcommunity.microsoft.com · 11 hours ago
Researchers Build Self-Replicating AI Worm That Operates Entirely on Local, Open-Weight Models
🔔 Plan 9
thehackernews.com · 1 day ago
Re-quantizing a local LLM 14x faster by skipping the tensors that didn't change
🖥️ Retro Computing
Content type: News, Blog
andreaborio.substack.com · 11 hours ago · Substack
Local LLMs, Buy a GPU, and the Case for Cognitive Security
🖥️ Retro Computing
briefing.forwardfuture.ai · 6 days ago
Google Shrank Gemma 4 by 72% and Unsloth Fixed the 4-Bit Bug Nobody Else Caught on One 4090, and 4-Bit Shouldn’t Be This Good
🖥️ Retro Computing
Content type: Blog
towardsai.net · 2 days ago
Google releases Gemma 4 12B with encoder-free multimodal architecture
🔔 Plan 9
4sysops.com · 1 day ago
A system programmer’s guide to LLM inference
🦬 Emacs
Content type: Blog
blog.xiangpeng.systems · 3 days ago · Hacker News
magenta/magenta-realtime: Magenta RealTime 2: An Open-Weights Live Music Model
♊ Gemini Protocol
Content type: Code
github.com · 17 hours ago
HNSW vs LSH: How Elasticsearch hits 0.99 recall@10 at 15,000 QPS — and what it costs
🖧 BSD
Content type: Blog
elastic.co · 2 days ago
local llm on laptop 780M GPU using llama + gemma 4 qat
🦬 Emacs
Content type: Blog
alper.bearblog.dev · 4 days ago
Optimal Post-Training Quantization Scales and Where to Find Them
🔌 Single-Board Computers
Content type: Academic
arxiv.org · 20 hours ago
China's Xiaomi MiMo Is Now 15X Faster Than ChatGPT and Claude (4 minute read)
♊ Gemini Protocol
Content type: News
decrypt.co · 2 days ago
Build a Medical Report Analyzer on Dedicated Inference with Python
♊ Gemini Protocol
digitalocean.com · 6 days ago
"North Mini Code"; open weights, 30B param, Canadian coding model
🔌 Single-Board Computers
Content type: Blog
cohere.com · 2 days ago · Hacker News
AI Serving Platform That Adapts to Your Model
🏠 Self-Hosting
Content type: Blog
databricks.com · 8 hours ago
DiffusionGemma
🖥️ Retro Computing
simonwillison.net · 4 hours ago
2x GH200 for LLM inference, Part 2: vLLM, DeepSeek V4 Flash, and MTP
🖥️ Retro Computing
Content type: Blog
dnhkng.github.io · 3 days ago
Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information
🖥️ Retro Computing
venturebeat.com · 2 days ago · Hacker News
WWDC 2026: Foundation Models (& Anarlog)
🔔 Plan 9
skushagra.com · 2 days ago