Cloudflare Raves About Performance Gains After Rust Rewrite (cloudflare.com)
Posted by EditorDavid on Sunday November 02, 2025 @01:34AM from the blatant-latencies dept.
“We’ve spent the last year rebuilding major components of our system,” Cloudflare announced this week, “and we’ve just slashed the latency of traffic passing through our network for millions of our customers.” (There’s a 10 ms cut in the median time to respond, plus a 25% performance boost as measured by CDN performance tests.) They replaced a 15-year-old system named FL (where they run security and performance features), and “At the same time, we’ve made our system more secure, and we’ve reduced the time it takes for us to build and release new products.”
And yes, Rust was involved: We write a lot of Rust, and we’ve gotten pretty good at it... We built FL2 in Rust, on Oxy [Cloudflare’s Rust-based next generation proxy framework], and built a strict module framework to structure all the logic in FL2... Built in Rust, [Oxy] eliminates entire classes of bugs that plagued our Nginx/LuaJIT-based FL1, like memory safety issues and data races, while delivering C-level performance. At Cloudflare’s scale, those guarantees aren’t nice-to-haves, they’re essential. Every microsecond saved per request translates into tangible improvements in user experience, and every crash or edge case avoided keeps the Internet running smoothly. Rust’s strict compile-time guarantees also pair perfectly with FL2’s modular architecture, where we enforce clear contracts between product modules and their inputs and outputs...
It’s a big enough distraction from shipping products to customers to rebuild product logic in Rust. Asking all our teams to maintain two versions of their product logic, and reimplement every change a second time until we finished our migration was too much. So, we implemented a layer in our old NGINX and OpenResty based FL which allowed the new modules to be run. Instead of maintaining a parallel implementation, teams could implement their logic in Rust, and replace their old Lua logic with that, without waiting for the full replacement of the old system.
Over 100 engineers worked on FL2, and there was extensive testing, plus a fallback-to-FL1 procedure. But “We started running customer traffic through FL2 early in 2025, and have been progressively increasing the amount of traffic served throughout the year....”
As we described at the start of this post, FL2 is substantially faster than FL1. The biggest reason for this is simply that FL2 performs less work [thanks to filters controlling whether modules need to run]... Another huge reason for better performance is that FL2 is a single codebase, implemented in a performance focussed language. In comparison, FL1 was based on NGINX (which is written in C), combined with LuaJIT (Lua, and C interface layers), and also contained plenty of Rust modules. In FL1, we spent a lot of time and memory converting data from the representation needed by one language, to the representation needed by another. As a result, our internal measures show that FL2 uses less than half the CPU of FL1, and much less than half the memory. That’s a huge bonus: we can spend the CPU on delivering more and more features for our customers!
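To make the "filters controlling whether modules need to run" idea concrete, here is a minimal Rust sketch. It is not Cloudflare's code and does not use Oxy's API; the `Request` struct, the `Module` trait, the `Waf` and `ImageResize` modules, and `run_pipeline` are all hypothetical names invented for illustration. It only shows the general pattern of each module exposing a cheap predicate that is checked before any real work is done.

```rust
use std::collections::HashMap;

/// Hypothetical request view; the fields are illustrative assumptions.
pub struct Request {
    pub path: String,
    /// Per-customer feature switches (stand-in for real configuration).
    pub features: HashMap<String, bool>,
    /// Headers that modules may append to.
    pub headers: Vec<(String, String)>,
}

/// Hypothetical module contract: a cheap filter plus the actual work.
pub trait Module {
    fn name(&self) -> &'static str;
    /// Cheap predicate: does this module need to run for this request?
    fn applies(&self, req: &Request) -> bool;
    /// The real work; only called when `applies` returned true.
    fn run(&self, req: &mut Request);
}

/// Illustrative module that runs only when the customer enabled it.
pub struct Waf;

impl Module for Waf {
    fn name(&self) -> &'static str {
        "waf"
    }
    fn applies(&self, req: &Request) -> bool {
        req.features.get("waf").copied().unwrap_or(false)
    }
    fn run(&self, req: &mut Request) {
        req.headers.push(("x-waf".to_string(), "checked".to_string()));
    }
}

/// Illustrative module that runs only for image paths.
pub struct ImageResize;

impl Module for ImageResize {
    fn name(&self) -> &'static str {
        "image-resize"
    }
    fn applies(&self, req: &Request) -> bool {
        req.path.ends_with(".jpg") || req.path.ends_with(".png")
    }
    fn run(&self, req: &mut Request) {
        req.headers.push(("x-resized".to_string(), "true".to_string()));
    }
}

/// Runs every module whose filter matches and returns the names of the
/// modules that actually executed; skipped modules cost only the filter check.
pub fn run_pipeline(modules: &[Box<dyn Module>], req: &mut Request) -> Vec<&'static str> {
    let mut executed = Vec::new();
    for module in modules {
        if module.applies(req) {
            module.run(req);
            executed.push(module.name());
        }
    }
    executed
}
```

With both modules registered, a request for `/index.html` from a customer with `waf` enabled would run only `Waf`; `ImageResize` is skipped after its filter check, which is the "performs less work" effect the quoted post describes.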
Using our own tools and independent benchmarks like CDNPerf, we measured the impact of FL2 as we rolled it out across the network. The results are clear: websites are responding 10 ms faster at the median, a 25% performance boost. FL2 is also more secure by design than FL1. No software system is perfect, but the Rust language brings us huge benefits over LuaJIT. Rust has strong compile-time memory checks and a type system that avoids large classes of errors. Combine that with our rigid module system, and we can make most changes with high confidence...
We have long followed a policy that any unexplained crash of our systems needs to be investigated as a high priority. We won’t be relaxing that policy, though the main cause of novel crashes in FL2 so far has been due to hardware failure. The massively reduced rates of such crashes will give us time to do a good job of such investigations. We’re spending the rest of 2025 completing the migration from FL1 to FL2, and will turn off FL1 in early 2026. We’re already seeing the benefits in terms of customer performance and speed of development, and we’re looking forward to giving these to all our customers.
After that, when everything is modular, in Rust and tested and scaled, we can really start to optimize...!
Thanks to long-time Slashdot reader Beeftopia for sharing the article.