📊 LLM Evaluation
Benchmarks, Model Testing, Performance Metrics, HELM
Scoured 98 posts in 11.5 ms
Prompt Compression in Diffusion Large Language Models: Evaluating LLMLingua-2 on LLaDA
✍️ Prompt Engineering · arxiv.org · 2d
Market Trend Analysis: The Impact of Recent Advances on the Large Language Model Evaluation As A Service Market
⚙️ MLOps · openpr.com · 5d
Artificial Analysis
🤖 Agentic AI · dsebastien.net · 20h
May 20, 2026 (#4672)
🔄 DevOps · alvinashcraft.com · 17h
LLM Evaluation and AI Observability for Agent Monitoring
⚙️ MLOps · blog.jetbrains.com · 1d
Corbell-AI/evalmonkey: CLI for coding agents to benchmark & chaos test your AI Agents
⚙️ MLOps · github.com · 5d · Hacker News
Why does off-model SFT degrade capabilities?
⚙️ MLOps · lesswrong.com · 3h
Benchmarking LLMs for malware triage and static unpacking with Malcat
🦠 Malware Analysis · malcat.fr · 2d · r/Malware
Who Wins the Future: Chips vs Frontier LLMs
📱 Edge AI · medium.com · 23h · DEV
tokenspeed — feel LLM tokens-per-second
✍️ Prompt Engineering · mikeveerman.github.io · 57m
DreamFast/Qwen3.6-27B-Uncensored-HauhauCS-Aggressive-Safetensor-Benchmark
🐛 Fuzzing · huggingface.co · 3d
EvalHub: Because "looks good to me" isn't a benchmark
🚀 Performance Engineering · developers.redhat.com · 2d
Four-Tier Memory Hierarchy for LLM Reasoning (USC, UW)
✍️ Prompt Engineering · semiengineering.com · 10h
Mastering Agentic Techniques: AI Agent Evaluation
🤝 AI Agents · developer.nvidia.com · 1d
LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users
🐛 Fuzzing · lemmy.ml · 6d
HRM-Text
✍️ Prompt Engineering · sapient.inc · 1d · Hacker News
Context pruning: cut LLM tokens without losing quality (9 minute read)
📱 Edge AI · redis.io · 3d
Gemini's busy agentic day at Google I/O
📱 Edge AI · therundown.ai · 19h
Supersymmetric Digital Assets & AI Emergence
📱 Edge AI · qbc.network · 3d · Hacker News
Import AI 457: AI stuxnet; cursed Muon optimizer; and positive alignment
⚠️ AI Safety · jack-clark.net · 2d