Alignment Research, Model Robustness, Adversarial Examples, Risk Assessment

Neural Logic Gates
blog.typeobject.com · 19h