🐿️ Scour
🤖 reinforcement learning
Learning Real-World Acrobatic Flight from Human Preferences
arxiv.org·11h
🧩operations research
I reverse-engineered a bug in my PPO agent that gave it a 9x performance boost
theprincipledagent.com·23h
🦀Rust
On Zero-Shot Reinforcement Learning
arxiv.org·2d
🧩operations research
Linear Dynamics meets Linear MDPs: Closed-Form Optimal Policies via Reinforcement Learning
arxiv.org·1d
📊linear programming
LLM-Driven Intrinsic Motivation for Sparse Reward Reinforcement Learning
arxiv.org·11h
📊linear programming
StepWiser: Stepwise Generative Judges for Wiser Reasoning
arxiv.org·11h
🧩operations research
[P] AI Learns to play Sonic 2 Emerald Hill (Deep Reinforcement...
youtube.com·2d
🧩operations research
HAEPO: History-Aggregated Exploratory Policy Optimization
arxiv.org·11h
🧩operations research
History Rhymes: Accelerating LLM Reinforcement Learning with RhymeRL
arxiv.org·11h
🧩operations research
Reinforcement Learning-based Control via Y-wise Affine Neural Networks (YANNs)
arxiv.org·2d
🧩operations research
MUA-RL: Multi-turn User-interacting Agent Reinforcement Learning for agentic tool use
arxiv.org·11h
🧩operations research
RLMR: Reinforcement Learning with Mixed Rewards for Creative Writing
arxiv.org·11h
📊linear programming
Scalable Fairness Shaping with LLM-Guided Multi-Agent Reinforcement Learning for Peer-to-Peer Electricity Markets
arxiv.org·11h
📊linear programming
Introduction to Artificial Neural Networks – Part 1 (2013)
theprojectspot.com·11h
📊linear programming
Learning Interior Point Method for AC and DC Optimal Power Flow
arxiv.org·11h
📊linear programming
Deep learning reveals antibiotics in the archaeal proteome
nature.com·51m
🦀Rust
Collaborative-Online-Learning-Enabled Distributionally Robust Motion Control for Multi-Robot Systems
arxiv.org·1d
🧩operations research
The Lazy Genius Inside Your Chatbot: Meet MoD, the Art of Thinking Less but Smarter
dev.to·20h
🧩operations research
The Science of Intelligent Exploration: Why We Need Exploration in AI
richardcsuwandi.github.io·3d
🧩operations research
MAB Optimizer for Estimating Math Question Difficulty via Inverse CV without NLP
arxiv.org·11h
📊linear programming