arXiv:2601.15774v1 Announce Type: new Abstract: Monolithic Firmware is widespread. Unsurprisingly, fuzz testing firmware is an active research field with new advances addressing the unique challenges in the domain. However, understanding and evaluating improvements by deriving metrics such as code coverage and unique crashes are problematic, leading to a desire for a reliable bug-based benchmark. To address the need, we design and build FirmReBugger, a holistic framework for fairly assessing monolithic firmware fuzzers with a realistic, diverse, bug-based benchmark. FirmReBugger proposes using bug oracles–C syntax expressions of bug descriptors–with an interpreter to automate analysis and accurately report on bugs discovered, discriminating between states of detected, triggered, reached …
arXiv:2601.15774v1 Announce Type: new Abstract: Monolithic Firmware is widespread. Unsurprisingly, fuzz testing firmware is an active research field with new advances addressing the unique challenges in the domain. However, understanding and evaluating improvements by deriving metrics such as code coverage and unique crashes are problematic, leading to a desire for a reliable bug-based benchmark. To address the need, we design and build FirmReBugger, a holistic framework for fairly assessing monolithic firmware fuzzers with a realistic, diverse, bug-based benchmark. FirmReBugger proposes using bug oracles–C syntax expressions of bug descriptors–with an interpreter to automate analysis and accurately report on bugs discovered, discriminating between states of detected, triggered, reached and not reached. Importantly, our idea of benchmarking does not modify the target binary and simply replays fuzzing seeds to isolate the benchmark implementation from the fuzzer while providing a simple means to extend with new bug oracles. Further, analyzing fuzzing roadblocks, we created FirmBench, a set of diverse, real-world binary targets with 313 software bug oracles. Incorporating our analysis of roadblocks challenging monolithic firmware fuzzing, the bench provides for rapid evaluation of future advances. We implement FirmReBugger in a FuzzBench-for-Firmware type service and use FirmBench to evaluate 9 state-of-the art monolithic firmware fuzzers in the style of a reproducibility study, using a 10 CPU-year effort, to report our findings.