Unpacking Parquet: Explicit SIMD, Scalar Baselines, and What HotSpot Makes of Them

Covers JEP 454: Foreign Function & Memory API
Discussed on r/java

On the JVM, optimizing a hot kernel is not only about writing faster code: it is also about understanding how much the result depends on the machine code HotSpot derives from the scalar loop. Using Parquet bit-unpacking as a concrete case, the piece shows that a measured SIMD speedup depends on which scalar baseline C2 is handed, examines when explicit vectorization is actually justified, and explains why a more specialized scalar routine is not necessarily faster.
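To make "scalar baseline" concrete, here is a minimal sketch of two scalar unpackers for LSB-first bit-packed integers, the layout used by Parquet's RLE/bit-packing hybrid encoding. The class and method names (`BitUnpackSketch`, `unpackGeneric`, `unpackWidth4`) are hypothetical and not taken from the article or from any Parquet library; the code only illustrates what a generic and a width-specialized baseline might look like, not how the article's kernels are written.

```java
import java.util.Arrays;

// Hypothetical illustration only: names and structure are assumptions,
// not the article's code and not a Parquet library API.
final class BitUnpackSketch {

    // Generic baseline: unpacks `count` values of `bitWidth` bits (1..32),
    // packed LSB-first, by streaming bytes through a 64-bit accumulator.
    static void unpackGeneric(byte[] packed, int bitWidth, int count, int[] out) {
        long mask = (1L << bitWidth) - 1;
        long buffer = 0;       // pending bits, lowest bit is the next to consume
        int bitsInBuffer = 0;
        int pos = 0;
        for (int i = 0; i < count; i++) {
            while (bitsInBuffer < bitWidth) {
                buffer |= (packed[pos++] & 0xFFL) << bitsInBuffer;
                bitsInBuffer += 8;
            }
            out[i] = (int) (buffer & mask);
            buffer >>>= bitWidth;
            bitsInBuffer -= bitWidth;
        }
    }

    // Specialized baseline for bitWidth == 4: two values per byte, no accumulator.
    // Assumes `count` is even.
    static void unpackWidth4(byte[] packed, int count, int[] out) {
        for (int i = 0, p = 0; i < count; i += 2, p++) {
            int b = packed[p] & 0xFF;
            out[i] = b & 0x0F;      // low nibble comes first (LSB-first packing)
            out[i + 1] = b >>> 4;   // high nibble second
        }
    }

    public static void main(String[] args) {
        byte[] packed = {0x21, 0x43};
        int[] generic = new int[4];
        int[] special = new int[4];
        unpackGeneric(packed, 4, 4, generic);
        unpackWidth4(packed, 4, special);
        System.out.println(Arrays.toString(generic)); // prints [1, 2, 3, 4]
        System.out.println(Arrays.equals(generic, special)); // prints true
    }
}
```

Both methods produce the same output, but C2 may compile them to very different machine code. Which one is faster, and how either compares to an explicitly vectorized version, is something to measure (for example with JMH) rather than assume.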
