🧠 LLMs
Specific
large language models, GPT, Claude, inference
Scoured 944 posts in 12.9 ms
Why Your LLM Gets Dumber With More Context
🤖 AI Engineering · siliconopera.com · 1 day ago
Report: GKE Inference Gateway delivers up to 92% faster AI responses
🖥️ Backend Development · Content type: Blog · cloud.google.com · 3 days ago · Hacker News · Cited by 1 article
MTG Bench: Testing how well LLMs can play Magic
🤝 AI Agents · mtgautodeck.com · 1 day ago · Hacker News
Orchestrate your LLM pipeline. Locally
🤖 AI Engineering · llmforge.app · 1 day ago · Hacker News
Show HN: Ext-Infer
🔍 RAG · infer.displace.tech · 5 days ago · Hacker News · Cited by 2 articles
A Complete Beginner's Guide to Local LLM Inference
🔍 RAG · Content type: Blog · khnsakhnm.medium.com · 1 day ago
Comprehensive evaluation of LLM capabilities for interpretation and analysis of genome-scale metabolic models in metabolic engineering
🤖 AI Engineering · Content type: Academic · biorxiv.org · 3 days ago
CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?
🤖 AI Engineering · uccl-project.github.io · 1 day ago · Hacker News
LangChain Explained: Understanding Models, Prompts, Chains, Memory, Indexes, and Agents
🤖 AI Engineering · Content type: Blog · towardsai.net · 4 days ago
Timing Trick Cuts Energy Used in LLM Training by Up to 14 Percent
📐 System Design · Content type: News · spectrum.ieee.org · 2 days ago · Hacker News
lightmetal: GPU LLM Inference From a Single Java 25 JAR
🔍 RAG · Content type: Blog · adambien.blog · 3 days ago
Show HN: In-browser real LLM token counter and cost estimation
🖥️ Backend Development · holaclaw.ai · 1 day ago · Hacker News
A reporting checklist for large language models in behavioural science
🤝 AI Agents · Content type: Academic · nature.com · 3 days ago
Two old GPUs I salvaged are doing more AI work than a brand new $2000 card, and I won't be upgrading anytime soon
🛡️ AI Safety · xda-developers.com · 1 day ago
harmansingh4163-ai/ESP-32-s3-Story-maker-LLM: 15M/42M-param Llama split across two ESP32-S3s over 3 wires — too big for either chip alone. INT4, flash mmap, bit-exact verified.
📐 System Design · Content type: Code · github.com · 14 hours ago · Hacker News
Prompt Caching Explained: The AI Concept That Can Save Millions of Tokens
🔌 API Design · Content type: Blog · sweta-nit.medium.com · 1 day ago
Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation
📐 System Design · aermia.com · 5 days ago · Hacker News
A Plea to the Labs: Let the Models Diagnose.
🛡️ AI Safety · Content type: Blog · tangent.bearblog.dev · 2 days ago · Hacker News
Google's new open-weights model brings image-generation tricks to AI text generation
🤖 AI Engineering · Content type: News · theregister.com · 1 day ago · Hacker News
Why LLMs (still) lack taste
📐 System Design · beyondtheprior.com · 3 days ago · Hacker News