An anonymous reader shared this report from It’s FOSS: *Jenny Guanni Qu, a researcher at [VC fund] Pebblebed, analyzed 125,183 bugs from 20 years of Linux kernel development history (on Git). The findings show that the average bug takes 2.1 years to find. [Though the median is 0.7 years, with the average possibly skewed by "outliers" discovered after years of hiding.] The longest-lived bug, a buffer overflow in networking code, went unnoticed for 20.7 years! [But 86.5% of bugs are found within five years.]
The research was carried out by relying on the Fixes: tag that is used in kernel development. Basically, when a commit fixes a bug, it includes a tag po…
An anonymous reader shared this report from It’s FOSS: *Jenny Guanni Qu, a researcher at [VC fund] Pebblebed, analyzed 125,183 bugs from 20 years of Linux kernel development history (on Git). The findings show that the average bug takes 2.1 years to find. [Though the median is 0.7 years, with the average possibly skewed by "outliers" discovered after years of hiding.] The longest-lived bug, a buffer overflow in networking code, went unnoticed for 20.7 years! [But 86.5% of bugs are found within five years.]
The research was carried out by relying on the Fixes: tag that is used in kernel development. Basically, when a commit fixes a bug, it includes a tag pointing to the commit that introduced the bug. Jenny wrote a tool that extracted these tags from the kernel’s git history going back to 2005. The tool finds all fixing commits, extracts the referenced commit hash, pulls dates from both commits, and calculates the time frame. As for the dataset, it includes over 125k records from Linux 6.19-rc3, covering bugs from April 2005 to January 2026. Out of these, 119,449 were unique fixing commits from 9,159 different authors, and only 158 bugs had CVE IDs assigned.
It took six hours to assemble the dataset, according to the blog post, which concludes that the percentage of bugs found within one year has improved dramatically, from 0% in 2010 to 69% by 2022. The blog post says this can likely be attributed to:
-
The Syzkaller fuzzer (released in 2015)
-
Dynamic memory error detectors like KASAN, KMSAN, KCSAN sanitizers
-
Better static analysis
-
More contributors reviewing code
But "We’re simultaneously catching new bugs faster AND slowly working through ~5,400 ancient bugs that have been hiding for over 5 years."
They’ve also developed an AI model called VulnBERT that predicts whether a commit introduces a vulnerability, claiming that of all actual bug-introducing commits, it catches 92.2%. "The goal isn’t to replace human reviewers but to point them at the 10% of commits most likely to be problematic, so they can focus attention where it matters..."