AI Alignment

How authoritarian governments twist AI safety to coerce tech companies to comply

 ⚖️AI Governance
fastcompany.com·

Agentic AI Governance: Designing for Accountability and Control | The JetBrains AI Blog

 ⚖️AI Governance  Content type: Blog
blog.jetbrains.com·

Models May Behave Worse When Eval Aware

 🛡LLM safety
lesswrong.com·

Existential Indifference: Self-Nonpreservation as a Necessary Architectural Condition for Aligned Superintelligence (or: The Suicidal AI)

 🛡LLM safety  Content type: Academic
arxiv.org·

CISO Forum Webinar Today: 2026 Mid-Year Review

 ⚖️AI Governance
securityweek.com·

AI Safety — Genuine or Performative?

 ⚖️AI Governance  Content type: Blog
medium.com·

The Best Politician In A Generation

 ⚖️AI Governance  Content type: News  Content type: Blog

What Will Canada’s AI Strategy Mean for Jobs and Safety?

 ⚖️AI Governance  Content type: News
thetyee.ca·

The crucial human component in computing and AI

 🛡LLM safety  Content type: Academic
news.mit.edu·

Industry Reactions Highlight the Growing Importance of AI Cybersecurity Governance

 ⚖️AI Governance  Content type: Blog
medium.com·

Hollow-house-institute/HHI_Runtime_Core: Single deployable execution-time governance runtime core for telemetry, replay, assurance, audit, authority, workflow, and observability.

 ⚖️AI Governance  Content type: Code
github.com·DEV

Anatomy of Post-Training: Using Interpretability to Characterize Data and Shape the Learning Signal

 🧬Embeddings  Content type: Academic
arxiv.org·

ML4Good Summer 2026 Bootcamps - Applications Open!

 ✍️Prompt Engineering
lesswrong.com·

CryoDiff: An uncertainty-aware diffusion model for Cryo-EM map enhancement

 🧬Embeddings  Content type: Academic
biorxiv.org·

Should LLM Agents Decide in Social Simulations? Comparing Finite-State and LLM-Based Decision Policies

 🤖AI  Content type: Academic
arxiv.org·

The Three Filters: Why Almost Every Plan to Survive ASI Fails Miserably

 🛡LLM safety
lesswrong.com·

Sparse probes and murky physics: a case study of interpretability challenges in a foundation model for continuum dynamics

 📊Model Evaluation  Content type: Academic
arxiv.org·

REACH: Interpretability-Driven Feature Identification and Architecture Compression for Multi-Channel Vehicular Channel Estimation

 🧬Embeddings  Content type: Academic
arxiv.org·

I Started an AI Safety Research Org and Think These 7 Things Matter

 ✍️Prompt Engineering
lesswrong.com·

Learnings from starting an AI safety research team

 🛡️Red Teaming
lesswrong.com·