ZFS keeps advancing - and much of the progress comes from what we see in real-world customer environments. At Klara Inc., our engineering team doesn’t just maintain ZFS; they actively evolve it.
The features we delivered this year were shaped by the patterns, edge cases, and performance behaviours that surface across high-scale deployments: free-space fragmentation that impacts throughput, RAIDZ inefficiencies, flash hardware constraints, and the growing need for faster, more predictable recovery paths.
This overview covers the major ZFS capabilities our team built and upstreamed in 2025 - practical, production-ready improvements that boost performance, reliability, diagnostics, and flexibility across modern storage systems.
Fragmentation Profile Import/Export:
Test Using Real-world Fragmentation in Laboratory Conditions
Free-space fragmentation is a major factor in many ZFS performance issues, but recreating real fragmentation normally requires many hours of filling and rewriting large pools, and is never fully representative of the conditions created by real workloads. The new feature, Fragmentation Profile Import/Export, allows the fragmentation of a production pool to be mirrored to a test environment instantly. Applying real-world fragmentation to a test environment facilitates better testing and allows for long-term performance regression testing.
The feature adds two key tools:
- zdb export: captures the pool’s actual allocation map.
- zhack metaslab leak: applies that map to a test pool by creating matching raw allocations.
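A minimal sketch of how the two tools fit together, assuming the profile is passed through an ordinary file; the exact argument order and output handling may differ from the final upstreamed syntax:

```sh
# On the production system: capture the pool's actual allocation map as a profile.
zdb export prodpool > prodpool.fragmap            # assumed: profile written to stdout

# On the test system: recreate the same fragmentation on a freshly created pool
# by leaking matching raw allocations inside each metaslab.
zhack metaslab leak testpool < prodpool.fragmap   # assumed: profile read from stdin
```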
**Benefits:** Faster and more accurate performance testing, lower hardware requirements for reproducing production issues, and more faithful evaluation of new allocator algorithms. Organizations gain more predictable performance outcomes and shorter time-to-resolution when diagnosing fragmentation-related slowdowns.
Dynamic Gang Headers:
Leveraging Larger Sectors for Improved Efficiency
Gang blocks are used when ZFS cannot store a large block of data in a single contiguous region. On heavily fragmented or RAIDZ systems, ZFS will “gang” together multiple smaller segments of space to create the required allocation; previously, if this required more than three gang members, an additional layer of indirection was needed, resulting in very inefficient allocations. Klara’s updates improve both correctness and efficiency by fixing checksum edge cases and introducing flexible, dynamically sized gang headers that better match modern sector sizes. Larger allocations now require fewer levels of indirection, improving space efficiency.
The work includes two major improvements:
- Fixes to gang-block checksum handling and RAIDZ behavior
- Support for larger, variable-size gang headers to reduce space waste and create shallower gang trees
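As a rough illustration of why bigger headers flatten the gang tree, the arithmetic below uses assumed sizes (128 bytes per block pointer, a similar reservation for the header's own trailer) rather than authoritative on-disk constants:

```sh
# Children per gang header ~= (header size - trailer reservation) / block pointer size
blkptr=128      # assumed bytes per block pointer
reserve=128     # assumed bytes reserved for the header's checksum trailer
echo $(( (512       - reserve) / blkptr ))   # classic 512 B header -> 3 children
echo $(( (64 * 1024 - reserve) / blkptr ))   # 64 KiB header -> hundreds of children
```

With only three children per header, any allocation split into more pieces needs another level of gang headers; a header sized to a modern flash sector rarely does.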
**Benefits:** Reduced overall fragmentation improves the performance of writes and all future reads of the same data. Variable ganging also reduces space waste when ganging occurs, and provides more predictable performance on large RAIDZ deployments. Organizations gain more stable behavior under heavy fragmentation and lower overhead during complex allocations.
Large Labels:
Supporting Next-Generation Hardware
Klara’s redesign of the ZFS on-disk label expands the label area from 256 KiB to 256 MiB, enabling support for large-sector flash devices (64 KiB and beyond), while also providing a deeper rewind history, safer recovery after miswrites or corruption, and future flexibility for additional on-disk features. The larger label stores more uberblocks, separates reserved and best-effort entries, and improves how ZFS can recover from catastrophic human errors. It also lets each disk store the full pool configuration, which speeds up pool imports and improves diagnostics when disks are missing.
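For context, the contents of a device’s label can already be inspected with zdb, and that remains the natural way to examine the expanded label once a pool uses the new format (the device path below is a placeholder):

```sh
# Dump the ZFS label(s) stored on a device; repeating -l increases detail,
# e.g. to include the uberblock history.
zdb -l /dev/nvme0n1p1
```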
**Benefits:** Support for the next generation of large-scale flash storage devices, further improved protection against data loss, faster disaster recovery, and future-proofing the on-disk format for upcoming features.
Forced Export:
Immediate Recovery From Stalled Pools
When devices fail or pools become suspended, the pool can now be safely force-exported instead of hanging the system. Operations continue without a reboot, and failover systems can respond instantly.
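In day-to-day operation this builds on the familiar export path; a hedged example follows, with the caveat that the exact flag spelling for the new hard-force behaviour may differ from the long-standing -f option shown here:

```sh
# Force-export a pool whose devices have disappeared; with forced export this
# can now complete even while the pool is suspended, instead of blocking forever.
zpool export -f tank
```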
**Benefits:** Improved high availability, faster recovery, and dramatically fewer cluster-wide stalls — especially in multi-pool or HA environments.
Slow RAIDZ Sit-Out:
Stable Performance When Disks Slow Down
In RAIDZ configurations, a single slow or failing drive can bottleneck the entire vdev. Klara’s slow-RAIDZ sit-out logic detects these underperforming disks and temporarily removes them from read operations, preventing one outlier from dragging down overall system performance.
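As a purely illustrative check, assuming the feature exposes a vdev property named sit_out (the actual property and vdev names may differ), an operator could confirm whether a disk is currently being skipped for reads:

```sh
# Hypothetical query of an assumed 'sit_out' vdev property on one raidz2 vdev
# in pool 'tank'.
zpool get sit_out tank raidz2-0
```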
**Benefits:** More consistent performance during disk degradation, fewer latency spikes across production workloads, fewer outages caused by transient slowdowns, and less pressure to replace disks immediately — giving sysadmins smoother operations and organizations lower maintenance overhead.
Scrub Date Range:
Faster, Targeted Scrubs for Large Pools
Full ZFS scrubs process the entire pool, even when only a small portion of data has changed. Klara’s scrub date range enhancement adds the ability to scrub only the blocks modified within a defined time window, which significantly reduces scrub duration and system load.
Two new flags enable this behavior:
- -S <timestamp>: start time for the scrub
- -E <timestamp>: end time for the scrub
ZFS checks each block’s modification time and includes only those that fall inside the specified range.
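For example, to verify only the data written during a specific window (the accepted timestamp format is an assumption here; consult the zpool-scrub documentation that ships with the feature):

```sh
# Scrub only blocks born between the two timestamps, rather than the whole pool.
zpool scrub -S "2025-11-01" -E "2025-11-15" tank
```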
**Benefits:** Shorter maintenance windows, lower I/O pressure during scrubs, and more flexible integrity checks after specific events. Organizations gain tighter data-integrity cycles without the operational cost of full-pool scrubs.
High-Scale Performance Improvements (OpenZFS 2.4)
Featuring Fast Dedup Log pacing, Parallel ARC eviction, smarter allocation under heavy fragmentation, and a long-awaited fix for the Encryption + Send issue.
**Benefits:** More predictable performance under heavy load, smoother backups and replication, and reduced latency for large-memory and deduplicated environments.
Flush Error Propagation:
Reliable Write Guarantees for Critical Workloads
Historically, ZFS did not respond to errors raised when asking a disk to flush its cache, as there was no real recourse in that situation. However, if a write succeeded but the subsequent flush failed, the system still reported success. This created rare but serious cases where applications believed data was safely committed even when hardware issues prevented it from reaching stable storage. Klara’s flush error propagation fix ensures ZFS now detects and reports flush failures in the ZIL, so fsync() falls back to a full transaction-group sync rather than returning a false success.
The feature adds a flush-error flag inside ZIL operations, allowing ZFS to correctly treat a flush failure as a durability failure and react safely instead of silently continuing.
**Benefits:** More reliable fsync behavior, stronger durability guarantees for databases and logging-heavy applications, and clearer visibility into underlying hardware faults. Organizations gain more predictable write integrity and reduced risk of silent data loss under degraded conditions.
AnyRAID:
Storage Flexibility Without Hardware Lock-In
Deploying mixed-size disks is now not only possible, but practical. AnyRAID lets organizations grow storage incrementally, reuse partially degraded drives, and adapt to changing hardware without full rebuilds or matched replacement sets. AnyRAID also provides the building blocks to support “failed elements” within HDD and NVMe drives, where only part of a device has failed and the remainder is still usable.
**Benefits:** Lower hardware costs, easier incremental upgrades, fewer device replacements, and higher uptime across production environments — from media pipelines to large enterprise clusters.
Enhanced Failure Simulation:
More Accurate Device-Level Testing
ZFS’s no-op fault injections previously bypassed key I/O stages, which made them unsuitable for simulating failures that occur between a device’s write cache and permanent storage. This failure simulation update moves the injection point deeper into the pipeline, after device-health checks and queue processing, so simulated failures behave like real hardware faults.
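Fault injection in ZFS is driven through zinject; the sketch below shows the kind of device-level write failure the relocated injection point now models more faithfully (pool and device names are placeholders):

```sh
# Inject I/O errors on writes to one member device of pool 'tank', run the
# workload under test, then clear all injected faults.
zinject -d sdb -e io -T write tank
# ... exercise the pool ...
zinject -c all
```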
**Benefits:** More realistic failure testing, better insight into how pools respond to mid-path device errors, and improved validation of redundancy, I/O paths, and recovery logic. Organizations gain more reliable pre-deployment testing and higher confidence in failure-mode handling.
Building the Next Wave of ZFS Improvements
Across everything we delivered this year, our focus remained simple: make ZFS stronger for our customers and for the community that relies on it. The updates improve performance under fragmentation, increase RAIDZ efficiency, shorten recovery times, and give organizations clearer insight and more flexibility in how they manage their storage. Work at scale shows us where ZFS can improve, and we’ll keep strengthening it. We’re looking ahead to 2026 with more features already in progress and a clear goal: to upstream even more improvements that move ZFS forward.
If you’re looking to dive further into ZFS, ZFS Basecamp is a great start. And if your team needs help with ZFS development, tuning, or support, you can find us here.