Scour
AI
Scoured 56 posts in 28.7 ms
SLUUG Talk: Demystifying Large Language Models on Linux
🤖 LLM · Content type: Code · github.com · 5 days ago · DEV · Cited by 1 article
LLM KV Cache Optimization, Open Model Evaluation, & Agent Engineering Skills for Local Deployment
🤖 LLM · Content type: Blog · dev.to · 5 hours ago · DEV
DiffusionGemma: 4x Faster Text Generation
🤖 LLM · Content type: News, Blog · blog.google · 2 days ago · Hacker News, r/LocalLLaMA, r/singularity · Cited by 21 articles
PyTorch from Scratch — Part 1: Tensors, Gradients & Activations
⚙ Rust · x.com · 6 days ago · DEV
Framework Desktop AMD 395+ (rdna 3.5) cannot run confyui err Fix 2026
⚡ Vite · Content type: Blog · runaihome.com · 5 days ago · DEV
Teaching a Reranker the Language of Security Tickets (+41% MRR@10)
🤖 LLM · linkedin.com · 6 days ago · DEV · Cited by 1 article
Three sleep intervals for three APIs: Steam 250ms, GitHub 100ms, HuggingFace none
🔄 TanStack Query · Content type: Reference · docs.github.com · 6 days ago · DEV · Cited by 1 article
FlashAttention Explained: The Optimization That Made Modern LLMs Practical
🤖 LLM · Content type: Blog · dev.to · 1 day ago · DEV
Stop Downloading 8GB Models on Every Pod Restart - Use OCI Object Storage as a Model Cache
🚀 DevOps · Content type: Blog · dev.to · 12 hours ago · DEV
Flowork: Self-Hosted AI Stack with Sovereign Agent OS and LLM Gateway
🚀 DevOps · Content type: Blog · dev.to · 1 day ago · DEV
Why JAX Is a Much Better Backend for Quantum Circuit Simulation Than PyTorch
🔶 Svelte · Content type: Code · github.com · 6 days ago · DEV
8GB to 70B: A Real Hardware Guide for Local LLMs
🤖 LLM · Content type: Blog · dev.to · 20 hours ago · DEV
Run Codex CLI with Local LLM - Gemma4 with llama.cpp on WSL2
🤖 LLM · Content type: Blog · dev.to · 1 day ago · DEV
Token Cost Optimization: How to Cut LLM Inference Spend Without Cutting Quality
🤖 LLM · Content type: Blog · dev.to · 9 hours ago · DEV
I Made Two AI Models Fight Each Other. They Agreed Way Too Much.
🤖 LLM · Content type: Blog · dev.to · 1 day ago · DEV
Local Ai Deployment Cost Analysis 2024
🤖 LLM · Content type: Blog · dev.to · 10 hours ago · DEV
RFC: pluggable publisher verification as a trust tier for community skills · Issue #40555 · NousResearch/hermes-agent
⚡ Vite · Content type: Discussion, Code · github.com · 6 days ago · DEV
Quantization formats compared: GGUF vs GPTQ vs AWQ vs NF4
🤖 LLM · Content type: Blog · dev.to · 2 days ago · DEV
Mixture of Experts (MoE): what it actually does under the hood, and when it pays off
🤖 LLM · Content type: Blog · dev.to · 1 hour ago · DEV
I Built a Python Agent That Uses a Vector DB as Memory, Not Retrieval
🤖 LLM · Content type: Blog · dev.to · 1 day ago · DEV