When Ryan Williams plugged a novel technique developed to test the bounds of a key part of complexity theory into his own framework, he found the result hard to believe. The Massachusetts Institute of Technology (MIT) computer scientist had expected to reduce the memory space needed by a vast range of algorithms, but not to the degree it did.

One longstanding assumption about the complexity of computer programs with predictable runtimes concerns how much memory they need relative to the number of steps, or time, they take to complete. Two complexity classes that theoreticians believe capture this relationship are P and PSPACE. The former is the collection of problems solvable in a polynomial number of steps, based on the number of data elements the algorithm needs to process; the latter is the collection of problems solvable with a polynomial amount of memory, with no limit on how long they take. Complexity theorists do not see the two classes as identical, but until now it was a reasonable assumption that they might not be all that different.

“Imagine a computation that produces a new bit of information in every step, based on the bits that it has computed so far. Over t steps of time, it may generate up to t new bits of information in order to reach the final result,” Williams explained. Take a large logic circuit where bits can flow arbitrarily from the inputs to a final set of output gates. It seems reasonable to assume you would need to keep most, if not all, of the intermediate states to be sure of computing the correct result. Conversely, other memory-efficient programs may need far more time to complete than will fit into P.

What Williams found shocking about his result, the first in decades to move the apparent bounds of PSPACE to a significant degree, is how big the difference seems to be. Rather than paring the total storage down to something like t/log t, which seemed the likely limit, he wound up with a bound on memory usage only a little more than the square root of t.
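The flavor of such a time-for-space trade can be seen in a much simpler, classic idea: checkpointing. The sketch below (illustrative only; it is not Williams' construction, and the `step` function is an arbitrary stand-in for one step of a computation) stores only about √t of the t intermediate states, recomputing the rest on demand from the nearest stored checkpoint.

```python
import math

def step(state):
    """One step of a toy deterministic computation (stand-in function)."""
    return (state * 31 + 7) % 1_000_003

def run_with_checkpoints(initial_state, t):
    """Simulate t steps but keep only ~sqrt(t) checkpoint states."""
    block = max(1, math.isqrt(t))
    checkpoints = {0: initial_state}
    state = initial_state
    for i in range(1, t + 1):
        state = step(state)
        if i % block == 0:
            checkpoints[i] = state  # at most ~sqrt(t) entries ever stored
    return state, checkpoints

def state_at(checkpoints, i):
    """Recover the state after step i by recomputing forward from the
    nearest earlier checkpoint -- extra time, far less memory."""
    base = max(j for j in checkpoints if j <= i)
    state = checkpoints[base]
    for _ in range(i - base):
        state = step(state)
    return state
```

Any intermediate state is still reachable, at the price of replaying up to √t steps; Williams' result achieves a far more general guarantee, but the same currency of recomputation for storage is being traded.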
“It seemed beyond belief that every complex computation could somehow be reimplemented to use a much smaller amount of space than time,” Williams said. The result may point the way to new approaches for determining what separates P and PSPACE.

To make the novel proof work, the MIT researcher began with logic-type circuits, processing them on a multitape Turing machine. The idea was to build an approach in which the algorithm resorts to recomputing intermediate results, rather than storing every partial result for possible later use. He then conceptually divided the tapes into blocks of execution, each block corresponding to a small tree of a larger branching program. The sub-trees connect when a result in one block affects another, building a chain of execution that passes the minimum amount of data needed between blocks. The multitape Turing machine made this an effective strategy because it is possible to organize the processing into blocks without demanding the program explicitly store its progress. For that reason, only some types of problem with highly predictable execution paths are likely to see the space savings of this technique if it is implemented on more conventional computers that offer random memory access.

Into this framework Williams plugged a breakthrough result for evaluating tree-like circuits published last year by independent researcher James Cook and Ian Mertz, a postdoctoral researcher at Charles University in the Czech Republic. The algorithm they worked on was originally devised as a strategy for separating P from the smaller LOGSPACE complexity class: the class of problems solvable with memory that grows only logarithmically with the number of data inputs. Ironically, the pair’s work in effect pushed the two classes closer together, in contrast to Williams’ result with P versus PSPACE. Tree evaluation starts from the leaves and works out the value at each node progressively until it reaches the root.
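The problem being evaluated can be pictured with a naive sketch like the one below (illustrative only: this plain recursion still needs memory proportional to the tree's depth, which is exactly the barrier Cook and Mertz's catalytic techniques break; the tuple encoding and `add_mod5` operation are arbitrary choices).

```python
def tree_value(node):
    """Evaluate a tree of operations bottom-up.

    node is ("leaf", value) or ("op", fn, left, right).  No table of
    intermediate node values is kept: each subtree's value exists only
    while its parent needs it, and would simply be recomputed if it
    were ever needed again.
    """
    if node[0] == "leaf":
        return node[1]
    _, fn, left, right = node
    return fn(tree_value(left), tree_value(right))

# Example: a depth-2 tree whose internal nodes apply addition mod 5.
add_mod5 = lambda a, b: (a + b) % 5
tree = ("op", add_mod5,
        ("op", add_mod5, ("leaf", 1), ("leaf", 2)),
        ("op", add_mod5, ("leaf", 4), ("leaf", 3)))
```

Here the root combines (1+2) mod 5 and (4+3) mod 5; the open question was how little working memory an evaluator can get away with as such trees grow deep.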
It seemed reasonable, when Cook’s father and colleagues devised tree evaluation, that the memory needed for it would comprise at least as many slots as the tree has levels, plus an additional set of slots to hold temporary values from the evaluation of each sub-branch. If that had been the case, the problem would not fit into LOGSPACE. Cook and Mertz did not reduce the space usage that far, but further improvement would need to remove just one small, logarithmic factor to get it past the threshold.

“At a very high level, we know that LOGSPACE and PSPACE are different classes and that P is sandwiched in the middle,” said Ramprasad Saptharishi, a faculty member at the Tata Institute of Fundamental Research in India. “In a sense, anything that seems to push P closer to LOGSPACE in our mental picture also pushes P and PSPACE further apart.”

Cook and Mertz built their novel approach using ideas from the relatively unexplored world of catalytic computing, developed a little over a decade ago by a team led by Harry Buhrman, who headed the algorithms and complexity group at the Centrum Wiskunde & Informatica (CWI) in the Netherlands before joining quantum-computing startup Quantinuum as chief scientist in 2023. Catalytic computing lets programs employ memory already used for another purpose, as long as they make no permanent changes to the values in the borrowed memory. Using this catalytic tape can lead to dramatic reductions in memory usage. A key element of catalytic computing, known as a register program, harks back to a classic optimization for swapping two registers without a third, using a sequence of exclusive-OR operations. Using reversible operations for values not needed long-term, and tactically recalculating them when needed, delivered the groundbreaking result.

The work underlines one of the social aspects of mathematics and theoretical computer science: ordering can be important.
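The XOR register swap the article refers to, together with the rule that borrowed memory must be returned untouched, can be sketched in a few lines. This is a toy illustration of the discipline, not Buhrman and colleagues' actual construction; the `borrow_cell` helper is a hypothetical name for the pattern.

```python
def xor_swap(a, b):
    """Classic register swap using XOR alone -- no third register."""
    a ^= b   # a = a0 ^ b0
    b ^= a   # b = b0 ^ (a0 ^ b0) = a0
    a ^= b   # a = (a0 ^ b0) ^ a0 = b0
    return a, b

def borrow_cell(cell, x, y):
    """Write into a borrowed cell holding unknown caller data, then undo
    every write in reverse order so the cell is returned exactly as found.
    """
    cell[0] ^= x   # reversible write on top of whatever is there
    cell[0] ^= y   # (useful reversible work would happen in between)
    cell[0] ^= y   # undo, last write first...
    cell[0] ^= x   # ...and the borrowed contents are intact
```

Because every operation is its own inverse, the cell's original value cancels out of the final state without the program ever needing to know what that value was.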
Williams argues that without the Cook and Mertz work, his own advance might not have had nearly the same startling outcome; it might even have suggested there was limited value in probing how far the upper bounds of PSPACE could go. Instead, it seems there may be scope to go even further, though this may fall well short of the dream of a proof that definitively separates P from PSPACE. “I would not be surprised if someone gets an even better result than square-root t,” said Cook. Mertz added, “I expect to see progress with both catalytic and non-catalytic techniques.”

Applications are perhaps more remote. Memory limits do not affect most programmers the way they did in the early days of computing, when they used all manner of tricks to make an algorithm fit into just a few kilobytes of RAM. However, Williams said the techniques might prove to be a way for programmers to add a little more life to aging machines. “This result shows that we could teach old hardware new tricks. Every time-efficient computation can be reimplemented to run on computers with significantly less memory, even if the re-implementation takes longer to complete,” Williams said.

Another memory-restricted environment is quantum computing, which has obvious commonalities with catalytic techniques: the domain uses reversible operations extensively, and qubits are difficult to implement at scale. In more recent research, Mertz has investigated the prospects for similar catalytic techniques in that space with Buhrman and several other researchers. So far, the effort has encountered more limitations than in the classical world, not least because of the difficulty of maintaining the catalytic tape’s integrity in a setting where error correction is a major challenge. “All we could do is say how not-powerful it is,” Mertz said of the findings so far.
It may even be the case, he added, that catalytic computing turns out to be strong enough in the classical domain to simulate operations in the quantum world efficiently.

The growing energy demand of computing may provide a different incentive for borrowing register programs from catalytic computing. Main-memory accesses are expensive not just in terms of latency, but also power consumption. Restricting simple algorithms to small areas of local memory allocated to other, temporarily inactive tasks may deliver power savings compared to letting temporary results spill out to DRAM over power-hungry transfers. However, one major problem lies in the technique’s “disregard for memory safety,” Saptharishi pointed out.

Catalytic computing’s future could lie in another property that is helping to shed light on other problems in theoretical computer science: how to take algorithms that rely on randomness and build deterministic versions of them. In more recent work, Mertz, together with Aryan Agarwala of the Max Planck Institute for Informatics in Germany, began a derandomization of bipartite matching, a key algorithm used for scheduling tasks that today relies on non-deterministic approaches. Their catalytic approach harnesses the randomness buried in any prefilled memory: compress whatever data is amenable to compression, and the remaining contents have no discernible pattern and are effectively random. The work revealed an algorithm that packs the memory needed into a far smaller number of bits than expected. It points to the possibility that a fully derandomized algorithm would need only a logarithmic number of bits for a problem that, like tree evaluation, was once thought to need far more memory. Saptharishi says this type of work could mark the first step in taking bipartite matching and other algorithms towards full derandomization.
A curious aspect of catalytic computing lies in the apparent dissimilarity of its two key techniques. “In terms of interplay, they seem misaligned with each other,” Mertz noted. “One is oblivious to the contents. And the other depends on the contents of the catalytic tape.” However, Cook added, most work so far that uses the embedded randomness also relies on register-program techniques. “We’ve been trying to figure out whether these are the only two things we can do. So far, we don’t know any algorithms that look substantively different from either of them,” Mertz said.

For the moment, catalytic computing remains a small corner of theoretical computer science research. Yet recent results, not least the two breakthrough papers, have brought more collaboration to the field. “Interest has exploded in the past two years,” said Mertz. “But we need to find practical uses at some point.” At the same time, the interest in this work may help accelerate progress in separating P from PSPACE, as well as shedding light on other complexity classes, as computer scientists gain a better understanding of how much they can cut memory overhead, if not time.

Further Reading

Buhrman, H., Cleve, R., Koucký, M., Loff, B., and Speelman, F. Computing with a full memory: catalytic space. Proceedings of the 46th ACM Symposium on Theory of Computing (STOC 2014), pp. 857–86

Cook, J., Li, J., Mertz, I., and Pyne, E. The structure of catalytic space: capturing randomness and time via compression. Electronic Colloquium on Computational Complexity, Report No. 106 (2024)

Cook, J. and Mertz, I. Tree Evaluation Is in Space O(log n · log log n). Proceedings of the 56th ACM Symposium on Theory of Computing (STOC 2024), pp. 1268–1278

Williams, R. Simulating time with square-root space. Proceedings of the 57th ACM Symposium on Theory of Computing (STOC 2025)