🧠 LLMs
Specific
large language models, GPT, Claude, inference, fine-tuning
Scoured 169 posts in 5.7 ms
harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.
💾
Caching
Content type:
Code
github.com
·
4d
4 days ago
·
Hacker News
Why LLMs (still) lack taste
🤖
AI Agents
beyondtheprior.com
·
2d
2 days ago
·
Hacker News
Inferoa AI harness claimed 90% cache savings. We ran it and measured 97.8%
💾
Caching
zozo123.github.io
·
18h
18 hours ago
·
Hacker News
NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI
⚡
Performance
Content type:
Blog
blogs.nvidia.com
·
13h
13 hours ago
DeepSeekV4 1.6T Day 0 to Day 43 Performance Over Time - Huawei, GB300 NVL72, MI355X, B200
⚡
Performance
Content type:
News
newsletter.semianalysis.com
·
1d
1 day ago
·
Hacker News
Build a Medical Report Analyzer on Dedicated Inference with Python
🐍
Python
digitalocean.com
·
6d
6 days ago
Get officially certified in Claude AI for just $19.99
✨
OpenAI
pcworld.com
·
21h
21 hours ago
A free diagnostic for the Claude Certified Architect exam
🤖
AI Agents
Content type:
Discussion
Content type:
Tutorial
claudecertifiedarchitects.com
·
1d
1 day ago
·
Hacker News
KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++
✨
OpenAI
Content type:
Code
github.com
·
4d
4 days ago
·
Hacker News
DiffusionGemma: 4x Faster Text Generation
⚡
Performance
Content type:
News
Content type:
Blog
blog.google
·
13h
13 hours ago
·
Hacker News, r/LocalLLaMA, r/singularity
google/gemma-4-31B-it · fix: chat template — null handling, reasoning preservation, turn-tag balance, input validation
🐍
Python
huggingface.co
·
2d
2 days ago
·
r/LocalLLaMA
AI 101: From Prompt Engineering to Skill Engineering
🤖
AI Agents
turingpost.com
·
8h
8 hours ago
Token4Token — pay-per-token inference on Gnosis + Swarm
✨
OpenAI
t4t.eth.link
·
1d
1 day ago
·
Hacker News
SLUUG Talk: Demystifying Large Language Models on Linux
🤖
AI Agents
Content type:
Code
github.com
·
4d
4 days ago
·
DEV
DiffusionGemma: The Developer Guide- Google Developers Blog
⚡
Performance
Content type:
Blog
developers.googleblog.com
·
1d
1 day ago
·
r/LocalLLaMA
The Anthropic leader who built Claude Code says he ditched prompting — now he just writes loops.
✨
OpenAI
thenewstack.io
·
12h
12 hours ago
Running LLM Inference on Kubernetes: What It Actually Takes
⚡
Performance
Content type:
Blog
fairwinds.com
·
5d
5 days ago
LLM Observability: What To Instrument and How To Act on It
⚡
Performance
Content type:
Blog
blog.n8n.io
·
2d
2 days ago
How we fight GPU scarcity without compromise
🤖
AI Agents
Content type:
Blog
equixly.com
·
5d
5 days ago
·
Hacker News
Less-relevant results
our workplace LLM mass delusion
🏢
Engineering Blogs
Content type:
Blog
blog.avas.space
·
18h
18 hours ago
·
Hacker News