AI Models
model launch, foundation model, AI release, LLM
Scoured 384 posts in 6.9 ms
harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.
🖥️ GPU · Content type: Code · github.com · 4 days ago · Hacker News
Comprehensive evaluation of LLM capabilities for interpretation and analysis of genome-scale metabolic models in metabolic engineering
⚗️ Metabolic Health · Content type: Academic · biorxiv.org · 1 day ago
LLM-as-a-Discriminator: When Synthetic Tables Still Look Real
📱 Consumer Hardware · Content type: Academic · arxiv.org · 17 hours ago
LLM Routing: From Strategy Selection to Production Architecture
⚡ AI Productivity · Content type: Blog · blog.n8n.io · 6 hours ago
lightmetal: GPU LLM Inference From a Single Java 25 JAR
🖥️ GPU · Content type: Blog · adambien.blog · 1 day ago
Claude vs GPT-4: Which AI API Is Better for Developers? (2026)
⚡ AI Productivity · kalyna.pro · 5 days ago · DEV
Initial impressions of Claude Fable 5
⚡ AI Productivity · simonwillison.net · 21 hours ago · Hacker News
Report: GKE Inference Gateway delivers up to 92% faster AI responses
🟢 Nvidia · Content type: Blog · cloud.google.com · 1 day ago · Hacker News
The biggest local LLM on your machine is useless if it can't call a single tool, no matter how many parameters it has
🖥️ GPU · xda-developers.com · 4 hours ago
Slack bot for the whole team, not per-seat
⚡ AI Productivity · Content type: Discussion · plugand.ai · 12 hours ago · Hacker News
Using Scikit-LLM with Open-Source LLMs
📊 Quant Trading · machinelearningmastery.com · 6 days ago
Claude Fable 5 is Mythos for the masses
⚡ AI Productivity · Content type: Blog · techzine.eu · 1 day ago
Google’s DiffusionGemma is 4x faster than its other Gemma models
🟢 Nvidia · thenewstack.io · 4 hours ago
Timing Trick Cuts Energy Used in LLM Training by Up to 14 Percent
🖥️ GPU · Content type: News · spectrum.ieee.org · 10 hours ago · Hacker News
know the mother tongue of your LLMs
📱 Consumer Hardware · mothertoken.inigoimaz.com · 1 day ago · Hacker News
MLPerf and the rise of latency-aware LLM benchmarking
⚡ AI Productivity · edn.com · 5 days ago
A Plea to the Labs: Let the Models Diagnose.
⚡ AI Productivity · Content type: Blog · tangent.bearblog.dev · 6 hours ago · Hacker News
You don't need Copilot for code completion, try this instead
⚡ AI Productivity · mistral.ai · 2 days ago · r/GithubCopilot
Google's new open model DiffusionGemma generates text from noise instead of word by word
🟢 Nvidia · the-decoder.com · 2 hours ago
DiffusionGemma: 4x Faster Text Generation
🟢 Nvidia · Content type: News, Blog · blog.google · 5 hours ago · Hacker News, r/LocalLLaMA, r/singularity