Finding duplicated code with tools from your CS course
purplesyringa.moe

November 17, 2025

Suppose that you’re writing a static analyzer and you want to write a diagnostic for match arms with equal bodies:

match number {
    1 => { // <--
        let x = 1;
        f(x)
    }
    2 => f(g(h())),
    3 => "",
    4 => { // <--
        let x = 1;
        f(x)
    }
    _ => unreachable!(),
}

Well, that looks simple enough: serialize each arm into a string and throw the strings into a hash map. Then someone renames a variable:

match number {
    1 => { // <--
        let x = 1;
        f(x)
    }
    2 => f(g(h())),
    3 => "",
    4 => { // <--
        let y = 1;
        f(y)
    }
    _ => unreachable!(),
}

Now the strings no longer match, but the arms are still clearly equivalent. Scary! It’s not immediately obvious how to handle this correctly, let alone efficiently.
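To make the failure concrete, here is a minimal sketch of the naive "serialize and hash" approach. It assumes each arm body has already been serialized to a string; a real analyzer would produce these from its AST. The function name `duplicate_arms` is hypothetical and exists only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical helper: groups arm indices by the exact text of their bodies.
/// Assumes each body is already serialized to a string.
/// Returns only the groups with more than one arm, sorted so the output is deterministic.
fn duplicate_arms(bodies: &[&str]) -> Vec<Vec<usize>> {
    // Map from serialized body to the indices of the arms that have it.
    let mut groups: HashMap<&str, Vec<usize>> = HashMap::new();
    for (index, body) in bodies.iter().enumerate() {
        groups.entry(*body).or_default().push(index);
    }

    // Keep only bodies shared by at least two arms.
    let mut duplicates: Vec<Vec<usize>> = groups
        .into_values()
        .filter(|arms| arms.len() > 1)
        .collect();
    duplicates.sort();
    duplicates
}
```

Given the bodies of the first example, this groups the first and last arms together (indices 0 and 3). Given the version with `y`, it reports nothing, because the comparison is purely textual and knows nothing about variable binding.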

It turns out that this problem has interesting connections to the theory of com…
