Exploring Safety Alignment Evaluation of LLMs in Chinese Mental Health Dialogues via LLM-as-Judge
arxiv.org·3d
MDK12-Bench: A Comprehensive Evaluation of Multimodal Large Language Models on Multidisciplinary Exams
arxiv.org·3d
An Iterative Reconstruction Method for Dental Cone-Beam Computed Tomography with a Truncated Field of View
arxiv.org·3d
Thoughts on extrapolating time horizons
lesswrong.com·3d
AI Safety at the Frontier: Paper Highlights, July '25
lesswrong.com·5d
Unequal Uncertainty: Rethinking Algorithmic Interventions for Mitigating Discrimination from AI
arxiv.org·3d
ARENA 5.0 Impact Report
lesswrong.com·4d