Self-Improving LLM Agents at Test-Time
arxiv.org·18h
🆕New AI
2025-10-10 # LLMs Are Transpilers
alloc.dev·22h·
Discuss: Hacker News
🪄Prompt Engineering
Benchmarking LLM Inference on RTX 4090 / RTX 5090 / RTX PRO 6000 #2
reddit.com·4h·
Discuss: r/LocalLLaMA
๐Ÿ—๏ธLLM Infrastructure
Assuring Agent Safety Evaluations By Analysing Transcripts
lesswrong.com·12h
🛡️AI Safety
LLMs and reinforcement learning
sicpers.info·12h
🪄Prompt Engineering
InferenceMAX: Open-Source Inference Benchmarking
newsletter.semianalysis.com·23h·
Discuss: Hacker News
๐Ÿ—๏ธLLM Infrastructure
Supercharge your Enterprise BI: How to approach your migration to AI/BI
databricks.com·1h
🏗️Infrastructure Economics
OpenAI's inflated valuation, as I understand it
taloranderson.com·6h·
Discuss: Hacker News
📊Model Serving Economics
LLM-Based AI Agent That Automates The Transistor Sizing Process (Univ. of Edinburgh)
semiengineering.com·1h
🆕New AI
How leaderboards lost their spot as the best way to judge AI
platformer.news·21h
🆕New AI
MECE: The AI Principle You’ll Never Stop Using After Reading This
pub.towardsai.net·11h
🔍AI Interpretability
Evaluating Gemini 2.5 Deep Think's math capabilities
epoch.ai·8h·
Discuss: Hacker News
🧮SMT Solvers
Show HN: Comparegpt.io – Trustworthy Mode to reduce LLM hallucinations
news.ycombinator.com·21h·
Discuss: Hacker News
๐Ÿ—๏ธLLM Infrastructure
How To Measure AIโ€™s Organizational Impact
thenewstack.ioยท3h
๐Ÿ†•New AI
Chinese fintech giant Ant releases powerful AI model to rival DeepSeek, OpenAI
scmp.com·23h
🆕New AI
How different AI engines generate and cite answers
searchengineland.com·10h
📊Feed Optimization
LightReasoner: Can Small Language Models Teach Large Language Models Reasoning?
arxiv.org·18h
🧠LLM Inference
Measuring What Matters: The AI Pluralism Index
arxiv.org·18h
🔍AI Interpretability