Introducing Project Telos: Modeling, Measuring, and Intervening on Goal-directed Behavior in AI Systems
lesswrong.com

Published on October 31, 2025 1:28 AM GMT

by Raghu Arghal, Fade Chen, Niall Dalton, Mario Giulianelli, Evgenii Kortukov, Calum McNamara, Angelos Nalmpantis, Moksh Nirvaan, and Gabriele Sarti
 

TL;DR

This is the first post in an upcoming series outlining Project Telos, which is being carried out as part of the Supervised Program for Alignment Research (SPAR). Our aim is to develop a methodological framework for detecting and measuring goals in AI systems.

In this initial post, we give some background on the project, discuss the results of our first round of experiments, and point to avenues we hope to explore in the coming months.

Understanding AI Goals

As AI systems become mo…
