🎯 AI Alignment - flicksinfants1y

Existential Indifference: Self-Nonpreservation as a Necessary Architectural Condition for Aligned Superintelligence (or: The Suicidal AI)

🛡LLM safety Academic

arxiv.org

CISO Forum Webinar Today: 2026 Mid-Year Review

⚖️AI Governance

securityweek.com

AI Safety — Genuine or Performative?

⚖️AI Governance Blog

medium.com

The Best Politician In A Generation

⚖️AI Governance News Blog

benthams.substack.com · Substack

What Will Canada’s AI Strategy Mean for Jobs and Safety?

⚖️AI Governance News

thetyee.ca

The crucial human component in computing and AI

🛡LLM safety Academic

news.mit.edu

Industry Reactions Highlight the Growing Importance of AI Cybersecurity Governance

⚖️AI Governance Blog

medium.com

Hollow-house-institute/HHI_Runtime_Core: Single deployable execution-time governance runtime core for telemetry, replay, assurance, audit, authority, workflow, and observability.

⚖️AI Governance Code

github.com · DEV

Anatomy of Post-Training: Using Interpretability to Characterize Data and Shape the Learning Signal

🧬Embeddings Academic

arxiv.org

ML4Good Summer 2026 Bootcamps - Applications Open!

✍️Prompt Engineering

lesswrong.com

CryoDiff: An uncertainty-aware diffusion model for Cryo-EM map enhancement

🧬Embeddings Academic

biorxiv.org

Should LLM Agents Decide in Social Simulations? Comparing Finite-State and LLM-Based Decision Policies

🤖AI Academic

arxiv.org

The Three Filters: Why Almost Every Plan to Survive ASI Fails Miserably

🛡LLM safety

lesswrong.com

Sparse probes and murky physics: a case study of interpretability challenges in a foundation model for continuum dynamics

📊Model Evaluation Academic

arxiv.org

REACH: Interpretability-Driven Feature Identification and Architecture Compression for Multi-Channel Vehicular Channel Estimation

🧬Embeddings Academic

arxiv.org

I Started an AI Safety Research Org and Think These 7 Things Matter

✍️Prompt Engineering

lesswrong.com

Learnings from starting an AI safety research team

🛡️Red Teaming

lesswrong.com

How authoritarian governments twist AI safety to coerce tech companies to comply

Agentic AI Governance: Designing for Accountability and Control | The JetBrains AI Blog

Models May Behave Worse When Eval Aware
