🤖 LLMs
Topics: large language models, GPT, Claude, AI models
Scoured 309 posts in 5.9 ms

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.
🤖 AI · Content type: Code · github.com · 5 days ago · Hacker News, r/LLM

Comprehensive evaluation of LLM capabilities for interpretation and analysis of genome-scale metabolic models in metabolic engineering
🤖 AI · Content type: Academic · biorxiv.org · 2 days ago

Fine-tuning Multi-modal LLMs with ART: Art-based Reinforcement Training
🤖 AI · Content type: Academic · arxiv.org · 15 hours ago

6. Air-Gapped Claude Code - The Claude Code SRE Handbook
⚙️ DevOps · har-ki.github.io · 2 hours ago · Hacker News

Intelligent inference scheduling with llm-d on Red Hat AI
🤖 AI · developers.redhat.com · 19 hours ago

Why LLMs (still) lack taste
⚙️ DevOps · beyondtheprior.com · 2 days ago · Hacker News

Claude vs GPT-4: Which AI API Is Better for Developers? (2026)
🐍 Python · kalyna.pro · 6 days ago · DEV

Ollama 0.30 GPU Boost: Faster local Qwen inference on NVIDIA
🔌 APIs · everylocalai.com · 22 hours ago · DEV

Timing Trick Cuts Energy Used in LLM Training by Up to 14 Percent
🏃 Running · Content type: News · spectrum.ieee.org · 1 day ago · Hacker News

Why Your LLM Gets Dumber With More Context
🤖 AI · siliconopera.com · 4 hours ago

What Ollama Reveals About Local AI, Agents, and Open Models
🤖 AI · Content type: Blog · odsc.medium.com · 20 hours ago

The smartest ChatGPT users are putting local AI in front of it — here's why
🤖 AI · tomsguide.com · 5 days ago

Fixing a stuck Ollama runner and building a GPU watchdog
⚙ System programming · patrickmccanna.net · 2 days ago · Hacker News

CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?
🤖 AI · uccl-project.github.io · 12 hours ago · Hacker News

Built and launched a research-reading and highlighting tool with Claude over a few months. Here are the things AI was surprisingly good (and bad) at.
🤖 AI · highlyt.app · 2 days ago · r/ClaudeAI

Improved performance and model support with GGUF
🤖 AI · Content type: Blog · ollama.com · 6 days ago

MCP Architecture Explained for Beginners: Why AI Needs a Structured Communication System
🤖 AI · Content type: Blog · medium.com · 7 hours ago

MTP Isn't Always a Win: 1.95x on My 3090, but Speculative Decoding Is Hardware-Dependent
🇬🇧 London Tech · Content type: Blog · bric.pe.kr · 2 days ago · DEV

Large companies can add a local LLM filter layer to considerably reducing their AI costs
🤖 AI · umrashrf.github.io · 5 days ago · Hacker News

AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support
🤖 AI · phoronix.com · 1 day ago · r/artificial