Using MSM, we can also empirically study which model specs or constitutions yield the best generalization from alignment training. Specifying rules works to some extent, but explaining the values underlying those rules (or adding more detailed subrules) is even better.