🎯 AI Alignment - faruk · Scour

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

🧠LLMs Academic

Mechanistic Interpretability: The Key to Trusting Agentic AI

🧠LLMs Discussion

bradenkelley.com·

The Ghost of Alignment — Why AI Should Never Fully Obey Humanity

📊AI Monitoring Blog

[Recorded talk] "AI Alignment Versus AI Ethical Treatment: 10 Challenges"

🧩Epistemics Blog

meditationsondigitalminds.substack.com··Substack

Sequent: scale and automation for higher confidence in alignment

lesswrong.com·

From oversight to coercion: How authoritarian governments are twisting AI safety to get tech companies to fall in line

theconversation.com·

Criti-hyping is the best thing that happened to Big Tech

📝Long-form Essays

reveriesofahuman.com·

Solsong Chord Updates

Controversial smut as an AI alignment issue

🧩Epistemics News Blog

thingofthings.substack.com··Substack

Why LLMs (still) lack taste

beyondtheprior.com··Hacker News

The crucial human component in computing and AI

🧩Epistemics Academic

Less-relevant results

Designer babies. Self-improving AI. Are we ready for either?

🧩Epistemics News

Is the Space Pope Reptilian?

🧩Epistemics News

tearsinrain.ai··Hacker News

Op Ed: Consultant Tony O’Connor On The Agentic Trojan Horse

📊AI Monitoring

thecompanydime.com·

Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond

turingpost.com·

Existential Indifference: Self-Nonpreservation as a Necessary Architectural Condition for Aligned Superintelligence (or: The Suicidal AI)

⚙️AI Infrastructure Academic

scMTG reconstructs single-cell temporal dynamics with Markov transition generators

🧠LLMs Academic

The Three Filters: Why Almost Every Plan to Survive ASI Fails Miserably

lesswrong.com·

Stack Overflow didn't just help AI learn to code

zozo123.github.io··Hacker News

Complete Drosophila Nervous System Mapped

⚙️AI Infrastructure

neurosciencenews.com·
