allendowney.com·1d
Query Optimization
Logic Theorist: The program that rewrote the foundations of mathematics
bigthink.com·12h
🤖AI
Deep Reinforcement Learning Book
deepreinforcementlearningbook.org·5d
🔀Transformers
How to evaluate and benchmark Large Language Models (LLMs)
together.ai·1d
🤖AI
GDM: Consistency Training Helps Limit Sycophancy and Jailbreaks in Gemini 2.5 Flash
lesswrong.com·11h
🔧Feature Engineering
Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents
dev.to·1d
🤖AI
Disciplined Biconvex Programming
arxiv.org·23h
🤖AI
Algorithmic Assistance with Recommendation-Dependent Preferences
arxiv.org·23h
🤖AI
SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models
paperium.net·4d
🔀Transformers
AI and Behavioral Economics: Decoding Decision-Making in the Digital Age
dev.to·1d
🤖AI
Scaling Coding-Agent RL to 32x H100s. 160% Improvement on Stanford's TBench
github.com·1d
🤖AI
Trust in the Machine: Building Reputable Service Networks for AI Agents
dev.to·1d
🌐Distributed Systems
Process Bottleneck Breakthrough: AI-Powered Outcome Prediction
dev.to·1h
🔧Feature Engineering
Artificial intelligence: Nirvana or apocalypse?
mathscholar.org·6h
🤖AI
Unlock the Power of GANs: Train with Tiny Datasets!
dev.to·9h
🔀Transformers
The Evolution from RAG to Agentic RAG to Agent Memory
leoniemonigatti.com·15h
🤖AI
It Doesn’t Need to Be a Chatbot
towardsdatascience.com·1d
🔀Transformers
Hydra: Dual Exponentiated Memory for Multivariate Time Series Analysis
arxiv.org·23h
📈Time Series
WTF is Inverse Reinforcement Learning?
dev.to·2d
🤖AI