🐿️ Scour
🤖 reinforcement learning
Reinforcement Learning for Target Zone Blood Glucose Control
arxiv.org·4h
🧩operations research
Actual LLM agents are coming
pleias.fr·3h·
Discuss: Hacker News
🧩operations research
Automated Generative Design Optimization via Hyperdimensional Feature Mapping and Bayesian Reinforcement Learning
dev.to·2h·
Discuss: DEV
🧩operations research
Backpropagating through a maze with candle and WASM
yberreby.com·1d·
Discuss: Hacker News
🧩operations research
AVATAR: Reinforcement Learning to See, Hear, and Reason Over Video
arxiv.org·1d
🧩operations research
Augmented Reinforcement Learning Framework For Enhancing Decision-Making In Machine Learning Models Using External Agents
arxiv.org·2d
🧩operations research
Sotopia-RL: Reward Design for Social Intelligence
arxiv.org·4h
🧩operations research
MARS: A Meta-Adaptive Reinforcement Learning Framework for Risk-Aware Multi-Agent Portfolio Management
arxiv.org·2d
🧩operations research
AI Plays Risk – Lessons from a silly benchmark
andreasthinks.me·18h·
Discuss: Hacker News
🧩operations research
Enhancing Vision-Language Model Training with Reinforcement Learning in Synthetic Worlds for Real-World Success
arxiv.org·4h
📊linear programming
From "Aha Moments" to Controllable Thinking: Toward Meta-Cognitive Reasoning in Large Reasoning Models via Decoupled Reasoning and Control
arxiv.org·4h
🧩operations research
Hyperproperty-Constrained Secure Reinforcement Learning
arxiv.org·3d
📊linear programming
GTPO: Trajectory-Based Policy Optimization in Large Language Models
arxiv.org·4h
📊linear programming
Adaptive Contact Force Control of Multi-Body Systems via Hybrid Model Predictive Control and Reinforcement Learning
dev.to·2d·
Discuss: DEV
🧩operations research
One Model, Any Scenario: End-to-End Locomotion from Vision
skild.ai·9h·
Discuss: Hacker News
🏃‍♀️running
Car Reinforcement Learning Training
github.com·14h·
Discuss: Hacker News
🧩operations research
Metaverse-Native Skill Validation & Dynamic Workforce Allocation via Federated Learning
dev.to·2h·
Discuss: DEV
🧩operations research
Frequency Point Game Environment for UAVs via Expert Knowledge and Large Language Model
arxiv.org·1d
📊linear programming
GeRe: Towards Efficient Anti-Forgetting in Continual Learning of LLM via General Samples Replay
arxiv.org·4h
📊linear programming
Predictive Fleet Optimization with Dynamic Resource Allocation via Bayesian Reinforcement Learning
dev.to·3h·
Discuss: DEV
🧩operations research