Research Areas in Evaluation and Guarantees in Reinforcement Learning (The Alignment Project by UK AISI)
lesswrong.com·22h
Research Areas in Information Theory and Cryptography (The Alignment Project by UK AISI)
lesswrong.com·22h
RLVMR: Reinforcement Learning with Verifiable Meta-Reasoning Rewards for Robust Long-Horizon Agents
arxiv.org·2d
Research Areas in Benchmark Design and Evaluation (The Alignment Project by UK AISI)
lesswrong.com·22h
Investigating the Invertibility of Multimodal Latent Spaces: Limitations of Optimization-Based Methods
arxiv.org·1d
I am worried about near-term non-LLM AI developments
lesswrong.com·1d
FovEx: Human-Inspired Explanations for Vision Transformers and Convolutional Neural Networks
arxiv.org·1d
DICOM De-Identification via Hybrid AI and Rule-Based Framework for Scalable, Uncertainty-Aware Redaction
arxiv.org·1d
Learning from Limited and Imperfect Data
arxiv.org·3d
SketchMind: A Multi-Agent Cognitive Framework for Assessing Student-Drawn Scientific Sketches
arxiv.org·1d