🎮 Reinforcement Learning - saeedesmaili · Scour

TT-DAC-PS: Twin-Target Deterministic Actor-Critic with Policy Smoothing for Optimal Trade Execution

📈Optimization Academic

Less-relevant results

Import AI 460: Reward hacking society, RSI data from Anthropic; and RL-based quadcopter racing

🚀Bootstrapping

jack-clark.net

I got so mad at poke(rogue)like that I trained a RL agent to beat it for me

thiagolira.blot.im · Hacker News

Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning

🚀Bootstrapping Academic

China women’s volleyball team finish Nations League leg on a high after opening defeat

📈Economics News

Why LLMs (still) lack taste

beyondtheprior.com · Hacker News

Reinforcement Learning Disrupts Gradient-Based Adversarial Optimization

🔥PyTorch Academic

Reinforcement learning in linear embedding space unlocks generalizable control across soft robot configurations

🔢Embeddings Academic

Geometrically Averaged Hard Target Updates for Linear Q-Learning

📈Optimization Academic

OpenEnv is now owned by HF, Torch, Prime Intellect, Unsloth, Modal, Mercor, and more! Use it for training agents.

🤖Machine Learning Blog

huggingface.co · Hacker News, r/LocalLLaMA

Comp.compilers: Paper: MileStone: A Multi-Objective Compiler Phase Ordering Framework for Graph-based IR-Level Optimization

🕸️Knowledge Graphs

compilers.iecc.com

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

🤖LLM Academic

Combermere and Harrison College reach Under-15 basketball final

🔤Tokenization

CCKS: Consensus-based Communication and Knowledge Sharing

🧠Knowledge Management Academic

Central College News

📈Economics Academic

news.central.edu

Space-sampled Value Decay: Forgetting Mechanisms for Non-stationary Deep Reinforcement Learning

🔥PyTorch Academic

Risk Has an Owner, and It's Not the AI

🤖Automation Blog

aaddrick.com · Hacker News

Dmsh: A Multi-Agent Reinforcement Learning Framework for All-Quad Mesh Generation

📈Optimization Academic

What is MBPO? A Beginner’s Guide to Efficient Reinforcement Learning

🧠LLM Inference Blog

ujangriswanto08.medium.com

Geometry-Aware Reinforcement Learning for 2D Irregular Nesting

🔥PyTorch Academic
