ddboline's Top Finds
Power Constrained Nonstationary Bandits with Habituation and Recovery Dynamics
arxiv.org·10h
🤖reinforcement learning
American Wind Farms
tech.marksblogg.com·8h·
Discuss: Hacker News
📊linear programming
The Orchestrator Pattern: Routing Conversations to Specialized AI Agents
dev.to·18h·
Discuss: DEV
🤖reinforcement learning
The Complexity Cliff: Why Reasoning Models Work Right Up Until They Don't
rewire.it·15h·
Discuss: Hacker News
🤖reinforcement learning
Rodrigo Girão Serrão: A generator, duck typing, and a branchless conditional walk into a bar
mathspp.com·1d
🦀Rust
Algorithmic Alchemy: Transmuting Dynamic Programming with Gradients by Arvind Sundararajan
dev.to·1d·
Discuss: DEV
🤖reinforcement learning
Learning Without Critics? Revisiting GRPO in Classical Reinforcement Learning Environments
arxiv.org·10h
🤖reinforcement learning
A Guide to My Organizational Workflow
cachestocaches.com·15h·
Discuss: Hacker News
🧩operations research
Harness the Power of Atlas Search and Vector Search with $RankFusion
mongodb.com·29m·
Discuss: Hacker News
🧩operations research
Explaining Human Choice Probabilities with Simple Vector Representations
arxiv.org·10h
🤖reinforcement learning
Reinforcement Learning: How Machines Learn to Make Smart Choices Like You Do
dev.to·1d·
Discuss: DEV
🤖reinforcement learning
Periodic Skill Discovery
arxiv.org·10h
🤖reinforcement learning
Dynamic Freight Route Optimization via Multi-Agent Reinforcement Learning with Adaptive Risk Aversion
dev.to·9h·
Discuss: DEV
🤖reinforcement learning
Mathematical exploration and discovery at scale
terrytao.wordpress.com·12h·
Discuss: Hacker News
🤖reinforcement learning
LazyLLM, Easiest and laziest way for building multi-agent LLMs applications
github.com·15h·
Discuss: Hacker News
🤖reinforcement learning
Optimal Boundary Control of Diffusion on Graphs via Linear Programming
arxiv.org·10h
📊linear programming
[Deep Dive] How We Solved Poker: From Academic Bots to Superhuman AI (1998-2025)
gist.github.com·13h·
Discuss: r/programming
🤖reinforcement learning
Stop vibe coding your unit tests
andy-gallagher.com·22h·
Discuss: Hacker News
🦀Rust