Enhancing Diffusion-based Restoration Models via Difficulty-Adaptive Reinforcement Learning with IQA Reward
arxiv.org·2h
🤖AI
Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents
💬Prompt Engineering
Logic-informed reinforcement learning for cross-domain optimization of large-scale cyber-physical systems
arxiv.org·2h
💬Prompt Engineering
Accelerated Dielectric Barrier Coating Optimization via Multi-Modal Data Fusion & Bayesian Hyperparameter Tuning
🧠Machine Learning
Equilibrium Policy Generalization: A Reinforcement Learning Framework for Cross-Graph Zero-Shot Generalization in Pursuit-Evasion Games
arxiv.org·2h
🤖AI
Augmenting learning in neuro-embodied systems through neurobiological first principles
arxiv.org·2h
🧠Cognitive Science
Dynamic Model Selection for Trajectory Prediction via Pairwise Ranking and Meta-Features
arxiv.org·2h
🧠Machine Learning
Post-training methods for language models
developers.redhat.com·28m
🗣️LLMs
Adaptive Human-Computer Interaction Strategies Through Reinforcement Learning in Complex
arxiv.org·1d
💬Prompt Engineering
Robust Single-Agent Reinforcement Learning for Regional Traffic Signal Control Under Demand Fluctuations
arxiv.org·2h
💬Prompt Engineering
Quadratic Direct Forecast for Training Multi-Step Time-Series Forecast Models
arxiv.org·2h
🧠Machine Learning
Token-Regulated Group Relative Policy Optimization for Stable Reinforcement Learning in Large Language Models
arxiv.org·2h
🗣️LLMs
Reinforcement Learning for Resource Allocation in Vehicular Multi-Fog Computing
arxiv.org·2h
💬Prompt Engineering
Improving the Robustness of Control of Chaotic Convective Flows with Domain-Informed Reinforcement Learning
arxiv.org·2h
🤖AI
Aligning LLM agents with human learning and adjustment behavior: a dual agent approach
arxiv.org·2h
💬Prompt Engineering
Self-Harmony: Learning to Harmonize Self-Supervision and Self-Play in Test-Time Reinforcement Learning
arxiv.org·2h
💬Prompt Engineering
Prefrontal inhibitory mechanisms associated with Putamen activity during valence learning revealed by multimodal fMRI-fMRS
nature.com·1d
🧠Cognitive Science
On the Fundamental Limitations of Decentralized Learnable Reward Shaping in Cooperative Multi-Agent Reinforcement Learning
arxiv.org·2h
💬Prompt Engineering
Uncertain node-state PI-DBN: A novel framework for predictive modeling of real-time blowout risk in deepwater drilling
sciencedirect.com·16h
🧠Machine Learning