🎯 Reinforcement Learning - tomas.burkert · Scour

Check out this article on Reinforcement Learning with R: Origins, Real-Life Applications, and Practical Implementation

dev.to·2d·

Discuss: DEV

💬Prompt Engineering

Optimistic Training and Convergence of Q-Learning -- Extended Version

arxiv.org·3d

💬Prompt Engineering

On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling

arxiv.org·1d

💬Prompt Engineering

Optimizing post-disaster road restoration with reinforcement learning: A traveler-behavior-aware approach

sciencedirect.com·10h

💬Prompt Engineering

Show HN: Fighting the War Against Expensive Reinforcement Learning

cadenza-landing-qtu7gbjwb-akshparekh123-3457s-projects.vercel.app·19h·

Discuss: Hacker News

💬Prompt Engineering

BetaZero V2: A Diffusion Model for Setting Boulder Problems

evmojo37.substack.com·3h·

Discuss: Substack

A Conceptual Framework for Exploration Hacking

lesswrong.com·10h

💬Prompt Engineering

Feedback Control for Computer Systems

janert.org·19h

🐚Shell Scripting

How to Leverage Explainable AI for Better Business Decisions

towardsdatascience.com·12h

💬Prompt Engineering

A training principle for drifting models

breno.bearblog.dev·15h

🧠Machine Learning

A multi-agent reinforcement learning approach to autonomous aircraft taxiing with taxiing time, fuel consumption, and emission optimization

sciencedirect.com·1d

💬Prompt Engineering

Show HN: A minimal online decision maker

decisionmaker.online·1d·

Discuss: Hacker News

🧠Cognitive Science

Optimal timing for superintelligence

feeds.feedblitz.com·2h

💬Prompt Engineering

Researchers propose a self-distillation fix for ‘catastrophic forgetting’ in LLMs

infoworld.com·16h

💬Prompt Engineering

The 4 Mixture of Experts Architectures: How to Train 100B Models at 10B Cost

pub.towardsai.net·14h

💬Prompt Engineering

v6 (Code 2 here) — Most complete architecture. This version is faster than my old v5, statistically correct, has all the advanced psychology/network features, and produces stunning visualizations

gist.github.com·8h·

Discuss: r/C_Programming

🧠Cognitive Science

Gibbs Measures from Deep Shaped Multilayer Perceptrons

link.aps.org·14h

Hybrid neural–cognitive models reveal how memory shapes human reward learning

nature.com·5d

🧠Cognitive Science

In defense of wasting time

fastcompany.com·7h

📵Digital Minimalism

A “Toolbox” Pipeline for Robots That See, Read, and Act

hackernoon.com·2h

💬Prompt Engineering
