Suggestions for improving debate protocols in AI safety
Overview

While many leading AI Safety researchers share an intuition that debate can be a powerful element of AI Safety measures, the nuances of debate protocols remain a less explored facet of the research. Competitive human debate offers a wealth of existing formats with distinct rules, which could inform future AI Safety implementations. The rules of competitive debate present ready-made alternative protocols to counteract observed model gaming behaviours and may present options to sub...