I propose an approach for highly personalized LLMs, aimed at near-future productivity gains and at personal info/cybersecurity against increasingly powerful LLMs: in the spirit of uploading, they should try to emulate the user’s values and preferences in order to amplify the principal, not replace them. I discuss a package of techniques and proposals to accomplish such ‘guardian angels’: dynamic evaluation of LLMs combined with active learning and elicitation and heavy inner-monologue search/data-aug...

Covered in 3 articles

Surplus: for massive public good

lesswrong.com

Personal-Values Alignment Tech: Some Initial Motivations

blog.danielsosebee.com · Hacker News

In other languages

AI developers push for local agents as costs rise

kite.kagi.com