arXiv:2601.19909v1 Announce Type: new Abstract: Prime generation is a fundamental task in cryptography, number theory, and randomized algorithms. While the classical Sieve of Eratosthenes is simple and efficient in theory, its practical performance on modern central processing units is often limited by memory access inefficiencies. This paper introduces a cache-aware hybrid sieve that integrates segmentation, bit-packing, and cache-line-aligned block processing to optimize memory bandwidth and level one and level two cache locality. The proposed approach reduces memory usage by storing only odd numbers and using one bit per value. The sieve range is divided into cache-sized blocks to minimize cache misses, while primes up to the square root of the limit are reused across blocks. Experiment…
arXiv:2601.19909v1 Announce Type: new Abstract: Prime generation is a fundamental task in cryptography, number theory, and randomized algorithms. While the classical Sieve of Eratosthenes is simple and efficient in theory, its practical performance on modern central processing units is often limited by memory access inefficiencies. This paper introduces a cache-aware hybrid sieve that integrates segmentation, bit-packing, and cache-line-aligned block processing to optimize memory bandwidth and level one and level two cache locality. The proposed approach reduces memory usage by storing only odd numbers and using one bit per value. The sieve range is divided into cache-sized blocks to minimize cache misses, while primes up to the square root of the limit are reused across blocks. Experimental results demonstrate up to an eight times reduction in memory usage and runtime improvements of up to two point four times compared to the classical sieve and one point seven times compared to the segmented sieve. Benchmarks up to ten to the power of nine illustrate that architecture-aware algorithm design can yield substantial practical performance gains.