🎯 Reinforcement Learning - tomas.burkert · Scour

Optimistic Training and Convergence of Q-Learning -- Extended Version

arxiv.org·4d

💬Prompt Engineering

Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning

arxiv.org·21h

💬Prompt Engineering

Architectural and Mathematical Foundations of Machine Learning: A Rigorous Synthesis of Theory, Geometry, and Implementation

chizkidd.github.io·2d·

Discuss: Hacker News

🧠Machine Learning

Brain game may reduce risk of Alzheimer’s and other dementias

krdo.com·10h

🧠Cognitive Science

The 4 Mixture of Experts Architectures: How to Train 100B Models at 10B Cost

pub.towardsai.net·1d

💬Prompt Engineering

MiniMaxAI/MiniMax-M2.5

huggingface.co·12h·

Discuss: Hacker News, r/LocalLLaMA

💬Prompt Engineering

A “Toolbox” Pipeline for Robots That See, Read, and Act

hackernoon.com·1d

💬Prompt Engineering

Multi objective optimization of a discrete fracture geothermal reservoir using Bi-LSTM network

sciencedirect.com·8h

💬Prompt Engineering

Scaling LLM Post-Training at Netflix

netflixtechblog.com·18h

💬Prompt Engineering

Shel-y/q-drift: Quantum-inspired CLI to analyze structural fragility and decision drift in distributed systems using Shannon Entropy and Signal Decay models.

github.com·18h·

Discuss: DEV

💬Prompt Engineering

Olmix: A framework for data mixing throughout LM development

allenai.org·10h

💬Prompt Engineering

A training principle for drifting models

breno.bearblog.dev·1d

🧠Machine Learning

The democratization of AI data poisoning and how to protect your organization

csoonline.com·15h

🧠Machine Learning

Generalized Lanczos method for systematic optimization of neural-network quantum states

link.aps.org·1d

💬Prompt Engineering

Memory and Learning layer be built in-house or bought externally?

medium.com·3d·

Discuss: Hacker News

💬Prompt Engineering

I gave my OpenClaw GTM assistant a brain. Here's what happened

shawnharris.com·5h·

Discuss: Hacker News

💬Prompt Engineering

Product Forecasting through Time Series Analysis (Modelling)

pub.towardsai.net·1d

📊Data Science

Hybrid neural–cognitive models reveal how memory shapes human reward learning

nature.com·6d

🧠Cognitive Science

Recursive self-improvement from AI models

marginalrevolution.com·3d·

Discuss: Hacker News

💬Prompt Engineering

GLM-5: Targeting complex systems engineering and long-horizon agentic tasks

news.ycombinator.com·7h·

Discuss: Hacker News

💬Prompt Engineering
