Introduction
Two main processing architectures currently dominate commercial computing: von Neumann architectures, comprising the majority of computing devices and now spread like dust in every corner of the planet; and tensor processors such as GPUs and TPUs, which have seen a rapid rise in use for computation since 2010.
Inspiration for computing systems came early from human and animal nervous systems. McCulloch and Pitts1 proposed simplified logical units that communicate with binary values, inspired directly by activity in the nervous system. Subsequent to his state-machine model of computation which underlies modern procedural programming, Turing proposed self-modifying neurally-inspired computing systems2. Von Neumann himself was fascinated by information processing in the brain3, and embarked on a research program to define principles for self-constructing computational machines (automata)4, predating the concept of DNA as a computational substrate.
Von Neumann highlighted the need to program a device in terms of the fundamental computational functions it can perform3. In the case of tensor processors these are matrix multiplications; in the case of CPUs these are primarily basic arithmetic and branching operations. In the case of Neuromorphic (NM) technologies, the basic computational elements are directly inspired by biological nervous systems—temporal integration and dynamics; binary or low-bit-depth multiplication; and thresholding and binary communication between elements.
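To make these primitives concrete, the following is a minimal sketch, in Python, of a discrete-time leaky integrate-and-fire (LIF) layer built only from the operations listed above: leaky temporal integration, event-gated weighting, and threshold-and-spike communication. All names and parameter values are illustrative and do not correspond to any particular NM device.

```python
import numpy as np

def lif_step(v, spikes_in, weights, tau=20.0, v_thresh=1.0, dt=1.0):
    """One discrete-time step of a leaky integrate-and-fire (LIF) layer.

    Illustrates the three Neuromorphic primitives described above:
    temporal integration with leaky dynamics, low-precision weighting of
    binary events, and thresholding into binary output spikes.
    """
    # Temporal integration: membrane potential decays towards zero
    v = v * np.exp(-dt / tau)

    # Event-gated weighting: weights are only accumulated where binary
    # input spikes occurred, rather than in a dense multiply-accumulate
    v = v + weights.T @ spikes_in

    # Thresholding and binary communication: emit a spike and reset
    spikes_out = (v >= v_thresh).astype(float)
    v = v - v_thresh * spikes_out
    return v, spikes_out

# Example: 16 input channels projecting to 8 LIF neurons
rng = np.random.default_rng(0)
weights = rng.normal(scale=0.5, size=(16, 8))
v = np.zeros(8)
for t in range(100):
    spikes_in = (rng.random(16) < 0.05).astype(float)  # sparse binary input
    v, spikes_out = lif_step(v, spikes_in, weights)
```

Because the weight application is gated by sparse binary events rather than performed as a dense multiply-accumulate, activity-dependent sparsity translates directly into reduced computation.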
These technologies comprise a third computational architecture, in addition to CPUs and tensor processors, and several ongoing commercial and research projects are taking steps to bring NM technologies to widespread use. How can we learn from the wild success stories of tensor processors (particularly NVIDIA’s GPUs) and older CPUs, as they rose to dominate the computing landscape? And how can we adapt their success to NM processors?
While the workings of our own brains are fundamentally parallel and convoluted, and remain obscure, we often use simple linear sequences of instructions when explaining a task to others. Consequently, Turing/Jacquard-inspired procedural programming models came to dominate digital computing, bringing with them von Neumann fetch-and-execute processing architectures (Fig. 1a). These systems provide a straightforward recipe for general-purpose computing—if you can describe step-by-step what you want a computer to do, then you can code and deploy it. This success came at the expense of analog computers, which had been highly important for scientific and war-time computations until around the middle of the 20th Century. Alternative massively-parallel multi-processor architectures fell out of favour in the 1990s, partially due to the difficulty in developing applications for these systems5,6.
Fig. 1: Programming models for von Neumann, tensor processor, and Neuromorphic hardware architectures.
a von Neumann processors such as CPUs and MCUs use a number of programming models, all of which compile to a predominantly serial, fetch-and-execute approach (small-scale multi-processing and optimised pre-fetch notwithstanding). Programming these systems requires decomposing a desired task into a set of explicit procedures for a CPU to execute. b Tensor processors such as GPUs, TPUs and NPUs have achieved great success for general-purpose applications, by adopting an example-driven, supervised machine-learning (ML) programming model based on gradient-based parameter optimisation (green box). This is supported by APIs (yellow box) which efficiently implement the ML programming model on tensor processors. c Neuromorphic architectures are increasingly adopting a similar programming model, based on methods developed for deep learning (green arrow). At the same time, new software APIs for Neuromorphic application development are making efficient use of the SW tools for deep learning (yellow arrow), which permit rapid parallelised training of NM architectures on commodity tensor processors (dashed arrow).
In the late 1990s and early 2000s the consumer graphics hardware market emerged primarily to satisfy the computational needs of home gaming. But computational scientists soon realised that the parallel vector processing cores of GPUs could be applied efficiently to data-parallel problems. Neural Network (NN) researchers found that GPUs could dramatically accelerate NN computations, permitting larger and more complex architectures to be trained faster than ever before. Since 2010 the rise of deep neural networks has driven and been driven by a concomitant rise in tensor processor architectures, with enormous commercial success in particular for the largest GPU manufacturer NVIDIA. This shift back to parallel processing was directly enabled by the success of deep learning in providing a programming model for tensor computations — the ability to map almost any arbitrary use case to tensor computing, via a data-driven machine-learning approach7 (Fig. 1b). Similarly, the commercial success of NVIDIA was driven by their foresight in providing a software Application Programming Interface (API; i.e. CUDA) for their GPUs8, which allowed this new programming model to be mapped efficiently to their hardware. In a beneficial feedback process, availability of GPUs and increasing demand for neural network processing have driven development of both. Formerly designed to accelerate graphics workloads, more recent high-end GPUs are now explicitly designed for tensor-based computational tasks. Coupled with the provision of simple, high-level APIs embodying the deep learning programming model (i.e. Tensorflow/Keras, Pytorch), tensor processors are accessible to developers in a way that earlier multi-processing architectures were not.
In this perspective we outline the early forays of Neuromorphic hardware towards commercial use; describe the new steps the field has made in the last few years; compare and contrast the current design alternatives in nascent commercial Neuromorphic hardware; and sketch a path that we believe will lead to widespread consumer adoption of Neuromorphic technology. We focus primarily on processors making use of sparse event-driven communication with limited precision, with or without analog computing elements, and incorporating temporal dynamics as an integral aspect of computation. We include some discussion of compute-in-memory architectures, where these overlap with our focus. This covers digital implementations of spiking neurons and some analog compute-in-memory arrays, along with traditional sub-threshold silicon Neuromorphic processors.
Early steps for Neuromorphic sensing and processing
The starting line for Neuromorphic engineering lies in the 1960s in California, when the neurobiophysicist Max Delbrück buttonholed the analog circuit designer Carver Mead9. Intrigued by similarities between the biophysics of synapses in the nervous system and the behaviour of silicon transistors, Mead designed circuits which accurately simulated aspects of neuronal dynamics using the subthreshold analog dynamics of transistors and the physics of the silicon substrate. In the 1980s this approach crystallised into a direct examination of computation in neural-like substrates, following intense discussions between Mead, John Hopfield and Richard Feynman10. The researchers who arranged themselves around that line of enquiry used the industrial tools of VLSI and CMOS fabrication to develop analog circuits for neurons11, synapses12, and various biological sensory organs13,14. From the beginning, the commercial benefits of the NM approach were touted—to harness the extreme power efficiency of biological nervous systems, and usher in a new class of ultra-low-power computing and sensing systems15.
Early commercial success for Neuromorphic technologies came on the sensory side. Inspired by the vertebrate retina, graded analog circuits for signal processing led to early capacitive touchpads and movement sensors for optical mice16,17. High-efficiency vision sensors were developed and commercialised into non-frame-based, low-latency cameras18, with several commercial entities investing in this approach for industrial and consumer machine vision applications as of 2024.
Similar commercial success on the computing front has taken longer. Several large industrial concerns have produced neurally-inspired event-driven processors—notably Qualcomm’s Zeroth19, IBM’s TrueNorth20 and Intel’s Loihi21. Samsung invested in event-based Neuromorphic vision sensing through their collaboration with startup iniVation22. However, with the exception of Intel’s Loihi, and IBM’s work on memristive elements for in-memory computing, these companies have moved away from Neuromorphic computational architectures in the direction of more traditional CPUs and tensor processors.
A key challenge for research into Neuromorphic computation has been uncertainty about precisely which aspects of biological neural computation are crucial. Early contributions from industry imported this uncertainty by offering high flexibility—and therefore complexity—in chip architecture.
Likewise, methods for configuring spiking neural network chips have been driven by academic research, in the absence of a powerful, general-purpose programming model.
Neuromorphic programming models
Until very recently, deploying an application to a spiking Neuromorphic processor required one or more PhDs’ worth of effort. Designing a spiking neural network (SNN) application is not as simple as writing some lines of procedural code, as there is no straightforward way to compile a procedural language to the spiking neurons that comprise the computational units of a spiking Neuromorphic processor. This lack of an accessible programming model has stood in the way of NM processors as general-purpose computing solutions.
Consequently, a number of methods for configuring spiking neural networks were proposed to fill this gap23.
Hand wiring and spike-based learning
For many applications, hand-wired networks have been used24. Taking inspiration from synaptic plasticity in biological neural tissue, spike-based learning rules have been built into neuromorphic hardware25,26,27. However, spike-timing-dependent plasticity (STDP) and other synapse- and neuron-local learning rules are not sufficient to produce arbitrary applications for Neuromorphic processors. Instead, these learning rules are generally coupled with hand-designed networks, to solve particular restricted classes of problems.
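As an illustration of what "synapse- and neuron-local" means in practice, below is a minimal sketch of a pair-based STDP rule using exponential spike traces. The trace formulation and parameter values are generic textbook choices, not those of any particular hardware implementation cited above.

```python
import numpy as np

def stdp_step(w, pre_spikes, post_spikes, pre_trace, post_trace,
              a_plus=0.01, a_minus=0.012, tau=20.0, dt=1.0, w_max=1.0):
    """One step of a pair-based STDP rule with exponential traces.

    Pre-before-post pairings potentiate a synapse; post-before-pre
    pairings depress it. All quantities are local to the synapse and
    the two neurons it connects.
    """
    decay = np.exp(-dt / tau)
    pre_trace = pre_trace * decay + pre_spikes     # recent presynaptic activity
    post_trace = post_trace * decay + post_spikes  # recent postsynaptic activity

    # Potentiation: a postsynaptic spike reads the presynaptic trace
    dw = a_plus * np.outer(pre_trace, post_spikes)
    # Depression: a presynaptic spike reads the postsynaptic trace
    dw -= a_minus * np.outer(pre_spikes, post_trace)

    w = np.clip(w + dw, 0.0, w_max)
    return w, pre_trace, post_trace
```

The rule updates each weight using only quantities available at that synapse, which is what makes it attractive for on-chip learning but also why it cannot, on its own, implement an arbitrary application.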
Random architectures and dynamics
Several SNN configuration methods are based around random recurrent networks and random dynamics28,29. These networks project a low-dimensional input into a high-dimensional space, and over a rich random temporal basis formed by the complex dynamics of spiking networks. In theory, training these networks then becomes a linear regression problem to tune a readout layer as desired, and these networks are universal function approximators28,30. In practice, guaranteeing an expressive temporal basis, and maintaining stable-but-not-too-stable recurrent network dynamics, are extremely challenging. Due to their highly redundant architecture, random networks represent an inefficient use of resources for a given application. Computational efficiency can be improved by tuning network dynamics to follow a teacher dynamical system31,32,33. These methods are effective, but apply best to smooth regression tasks. At the same time, the existence of a teacher dynamical system is required—which may not be feasible for some applications. Recurrent networks with engineered temporal basis functions, such as Legendre Memory Units, may offer a workable and more efficient alternative to random architectures34,35,36.
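For intuition, the following sketch shows the structure of the random-network-plus-trained-readout approach in its simplest rate-based (echo-state) form: the recurrent weights stay fixed and random, and only a linear readout is fitted by ridge regression. A spiking implementation replaces the tanh units with spiking neurons, but the readout training step is the same in spirit. All sizes and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_res, n_steps = 1, 200, 1000

# Fixed random input and recurrent weights; only the readout is trained
w_in = rng.normal(scale=0.5, size=(n_res, n_in))
w_rec = rng.normal(scale=1.0 / np.sqrt(n_res), size=(n_res, n_res))

u = rng.uniform(-1, 1, size=(n_steps, n_in))   # low-dimensional input
y_target = np.roll(u[:, 0], 5)                 # e.g. a 5-step delayed copy of the input

# Run the random network to collect a high-dimensional temporal basis
x = np.zeros(n_res)
states = np.zeros((n_steps, n_res))
alpha = 0.3                                    # leak factor
for t in range(n_steps):
    x = (1 - alpha) * x + alpha * np.tanh(w_in @ u[t] + w_rec @ x)
    states[t] = x

# Training reduces to (ridge) linear regression on the readout weights
ridge = 1e-4
w_out = np.linalg.solve(states.T @ states + ridge * np.eye(n_res),
                        states.T @ y_target)
y_pred = states @ w_out
```

The practical difficulties noted above show up here as choices of the recurrent weight scale and leak factor: too little gain and the temporal basis is uninformative, too much and the dynamics become unstable.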
Modular networks
Some researchers have proposed computational units, composed of spiking subnetworks, which can be combined to build larger computing systems. Such modules may be based on attractor dynamics37; tuning-curve-based dynamical encoding38; or on winner-take-all competitive networks39,40. When these modules are combined in defined ways, they can be used to implement particular classes of applications such as state machines or dynamical systems. These methods could be considered as precursors to large-scale programmable functional spiking neural networks41,42.
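As a toy illustration of one such module, the sketch below implements a simple rate-based winner-take-all circuit through self-excitation and lateral inhibition; it is intended only to show the principle, not the specific spiking implementations in the works cited above, and all parameters are illustrative.

```python
import numpy as np

def wta_step(r, drive, excitation=0.5, inhibition=1.5, dt=0.1):
    """One Euler step of a simple rate-based winner-take-all module.

    Each unit excites itself and inhibits all others; with sufficiently
    strong inhibition, only the most strongly driven unit stays active.
    """
    lateral = excitation * r - inhibition * (r.sum() - r)
    dr = -r + np.maximum(drive + lateral, 0.0)   # rectified rate dynamics
    return r + dt * dr

r = np.zeros(4)
drive = np.array([0.9, 1.0, 0.4, 0.2])           # unit 1 receives the strongest drive
for _ in range(300):
    r = wta_step(r, drive)
# After settling, activity is concentrated on the most strongly driven unit (index 1)
```

Modules of this kind act as decision or selection elements; wiring several of them together with defined connections is what allows state machines and similar structures to be built from spiking subnetworks.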
Gradient-based machine learning optimisation
The methodological advance that is currently driving heightened interest in SNNs for computation is the realisation that data-centric, machine-learning (ML) approaches can be used to map tasks onto neuromorphic primitives. The modern approach to map a task onto a NM processor specifies an input-output mapping via a dataset or task simulation, combined with a value function that correlates with success in the desired task (Fig. 1c). Task definitions and datasets can be re-used from deep learning systems, as the steps are almost identical (Fig. 1b). Loss-gradient-based optimisation techniques can then be applied to deep SNNs, given some relatively simple adjustments. Since 2016, theoretical advances have shown that surrogate gradients43,44,45, back- and forward-propagation through time46,47 and spike-based direct gradient computation48,49 can open up SNNs to gradient-based methods. This new approach is highly similar to modern deep learning training methods, and permits SNNs to make direct use of software and hardware acceleration for training deep networks (Fig. 1).
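The key technical trick is to keep the hard spiking threshold in the forward pass while substituting a smooth surrogate derivative in the backward pass, so that standard automatic differentiation and backpropagation-through-time apply. The sketch below shows one way this might look in PyTorch, using a fast-sigmoid surrogate; the exact surrogate shape and neuron model vary across the cited works, and all names and parameters here are illustrative.

```python
import torch

class SpikeSurrogate(torch.autograd.Function):
    """Heaviside spike in the forward pass; smooth surrogate in the backward pass."""

    @staticmethod
    def forward(ctx, v_minus_thresh):
        ctx.save_for_backward(v_minus_thresh)
        return (v_minus_thresh > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        # Fast-sigmoid surrogate derivative (one common choice among several)
        surrogate = 1.0 / (1.0 + 10.0 * v.abs()) ** 2
        return grad_output * surrogate

spike = SpikeSurrogate.apply

def lif_forward(inputs, weights, tau=20.0, v_thresh=1.0):
    """Unroll an LIF layer over time; autograd then performs
    backpropagation-through-time with the surrogate in place of the
    non-differentiable threshold."""
    v = torch.zeros(inputs.shape[1], weights.shape[1])
    decay = torch.exp(torch.tensor(-1.0 / tau))
    out = []
    for x_t in inputs:                   # inputs: (time, batch, n_in)
        v = v * decay + x_t @ weights
        s = spike(v - v_thresh)
        v = v - v_thresh * s             # soft reset
        out.append(s)
    return torch.stack(out)
```

Because the unrolled network is an ordinary computational graph, the same optimisers, loss functions and accelerators used for deep learning can be applied directly, which is exactly the re-use of tooling indicated by the yellow and dashed arrows in Fig. 1c.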
The impact of gradient-based training for SNNs and Neuromorphic architectures has been enormous. It is now possible to train and deploy very large SNNs on tasks that were previously in the realms of fantasy for spiking networks, for example large language models50 and autonomous driving systems51. Gradient-based approaches can also be used to mitigate the effects of mismatch in mixed-signal and analog compute architectures52,53. This application-building approach is highly transferable to even very complex HW architectures54,55.
For the first time, NM compute architectures have a general-purpose programming model, permitting developers to map arbitrary tasks on to NM processors. This new approach makes deploying applications to NM hardware more straightforward, faster and more convenient than at any point in the past.
How to: popularise a new computing architecture
Achieving commercial success for NM processors requires satisfying a small constellation of conditions. The remaining barriers to commercialisation are not technical, but relate to softer aspects of widespread uptake. Here the main competitors are not alternative Neuromorphic processor designs, but are instead commodity devices that currently occupy the application niches NM companies would like to access. In the low-resource TinyML market, these are low-power microcontrollers, low-power DSPs, FPGAs and other ASICs. In the high-resource datacenter ML market, these competitors are predominantly tensor processors and other NN accelerators.
Market interest
For commercial uptake, market interests are primary—the direct benefits in terms of cost, power and capability that will justify a device manufacturer buying an NM processor over commodity devices. These business requirements should focus the attention of the Neuromorphic community towards applications where NM processors show a clear advantage.
Historically, some NM processor designs and systems have explicitly targeted large brain simulations, on the order of millions of neurons56,57,58. As a potential avenue for commercialising NM technology, this use case suffers from several drawbacks. Firstly, the global market size for brain simulation machines is currently very small, which is underlined by the necessity for research funding for these projects. Secondly, the paucity of neuroscientific knowledge about precisely how organic brains are connected at a single-neuron and -synapse level points to uncertainty about how to usefully configure such large simulations. Although the sales margin for large-scale Neuromorphic supercomputers is high, these use-cases will not lead to widespread commercial adoption of Neuromorphic technologies.
In contrast, ML inference tasks are increasingly moving to portable consumer devices such as smartwatches, phones and headsets, as well as in-home devices such as smart speakers and cameras. Independence from cloud processing brings consumer advantages for privacy, data security, and robustness to network interference. Moving inference to the edge introduces application constraints for low-power, low-resource, real-time operation—known in the market as TinyML. TinyML is receiving significant attention from processor manufacturers, who are competing to obtain low-power operation at reasonable inference speeds. These constraints are ideal for NM processor technology, especially devices for which low-power operation is a key design goal (see Box). Concretely, this points to a niche for NM processors in low-power sensory processing and human-computer interaction (HCI), such as audio and visual wake-phrase detection, and voice and gesture interaction for smart home consumer appliances. NM technologies can also provide general situational awareness for consumer devices, for example ambient audio analysis, presence detection, and vision- or motion-based human behaviour understanding.
Ease of use and interfacing
For engineers and application developers, questions of adoption come down to flexibility and ease of use: How straightforward is integration between a NM processor and the rest of a system? How steep is the learning curve for a new application developer? What restrictions exist in terms of applications which can be deployed? How reliable and robust will the applications be when deployed at scale? This last question is particularly pointed for mixed-signal and analog compute-in-memory architectures.
The NM community must ensure that answers to these questions lead commercial design engineers to favour NM technologies, in spite of the mature market dominance of von Neumann and tensor processor devices.
Since the design of NM processors has been primarily driven by the research community, relatively little attention has been paid to system-level design considerations. For commercial adoption, interfacing a NM processor with commodity hardware must be frictionless. This implies the NM community will need to adopt standard digital interfaces for configuration, data input and output; standard voltage rails used in consumer devices; compact packaging and pinouts; etc.
Adoption of the deep-learning inspired programming model and associated APIs solves the questions of flexibility and generality, while at the same time easing the learning curve for current ML engineers to adopt NM processors. Familiarity is a valuable resource — when application design for an NM processor looks and feels very similar to that required for a tensor processor, it will be easier for engineers to advocate for NM processors in their systems. The NM community is already working to provide easy-to-use, easy-to-learn software toolchains, but a critical mass of NM application developers has yet to form. Community engagement in terms of tutorials, educational workshops and training will drive wider developer interest and adoption of NM programming.
Reliability and robustness
Neuromorphic hardware has had to work to overcome negative perceptions in terms of the expressive power of SNNs59. At the same time, the reliability and robustness of mixed-signal NM devices, either compute-in-memory or traditional neuromorphic, have been undermined by mismatch and analog device non-idealities. These issues give rise to an image problem for Neuromorphic computation, which the field needs to overturn. Consumer device companies are used to a model where a single application is deployed at scale, identically, to a large number of devices. For the current breed of analog / compute-in-memory devices, non-idealities and mismatch impose a degree of calibration and per-device application fine-tuning which complicates mass deployment. However, the new breed of simpler, inference-only NM processors, based on standard spiking neuron modules and fabricated in standard digital CMOS, completely solves these questions of reliability and robustness.
Standardisation and benchmarks
Standardisation has come slowly to neuromorphic engineering, but a crop of new open-source software frameworks is already coalescing around the deep-learning programming model60,61,62,63,64. These tools thematically extend earlier pushes for NM APIs, such as PyNN65, which provided portability between NM simulators and some processors, but which did not address the absence of a programming model. Several projects for interoperability and benchmarking with broad community support are also underway66,67,68, enabling application developers to easily pick and choose NM systems for deployment. Recent work has shown success in porting network architectures between seven software toolchains and four hardware platforms, requiring only a few lines of code to deploy a single trained network to disparate NM processors66.
While the motivation for the neuromorphic approach has always highlighted energy efficiency, it is only recently that head-to-head benchmarks between NM and other processing architectures have been available. For performing SNN inference, NM processors perhaps unsurprisingly show advantages over other computational architectures69, with inference energy improvements of 280–21000 × vs GPU on an SNN simulation task; 3–50 × vs an edge ANN accelerator on the same task; and 75 × vs Raspberry Pi on a Sudoku benchmark.
On general physical simulation with neural networks70, analog compute-in-memory (CIM) NM processors exhibit inference energy advantages over existing commercial devices, with improvements of 150–1300 × vs GPU and 1500–3500 × vs CPU, on particle physics and condensed matter physics simulation tasks.
On dedicated deep learning tasks, CIM processors show distinct improvements over state-of-the-art NN accelerators with traditional architectures, in terms of energy efficiency (10–1000 × improvement) and compute density (around 10 ×, depending on architecture), on ImageNet, CIFAR10 and LLM inference tasks71,72.
When performing inference with neurally-inspired algorithms73, SNN processors show dynamic energy improvements of 4.2–225 × vs desktop GPU; 380 × vs an edge mobile GPU; and 12 × vs a desktop processor on an MNIST image reconstruction task.
On a range of sensory processing tasks (gesture recognition; audio keyword spotting; cue integration; and MNIST image recognition), NM processors show power efficiency benefits over other low-power inference ASIC architectures, with continuous power improvements of 4–1700 ×, when normalised to processor node74,75.
It is in real-time temporal signal analysis tasks that NM processors truly show off their energy benefits. On a spoken keyword spotting benchmark, where each architecture performs inference on the same benchmark dataset but using a HW-appropriate network, the neuromorphic approach brings power advantages of several orders of magnitude over GPUs and CPUs, and even over mobile GPU architectures76,77,78.
Benchmarks that highlight the strengths of NM processors, on industry-standard tasks that can be compared against commodity devices, are particularly important to drive commercial uptake. We recommend that the NM community engages strongly with industry-standard benchmarks, ensuring that edge inference tasks which can show the benefits of the NM approach are included, and quantitatively illustrating those benefits. Similarly, establishing hardware standards, and adopting standard computational units such as the leaky-integrate-and-fire neuron (LIF), will signal to industry that the NM processor field is maturing commercially.
NVIDIA and CUDA
The crucial lesson for Neuromorphic hardware commercialisation is the conjunction of factors that led to the dramatic success of NVIDIA’s tensor processors. Deep ANNs and CNNs have existed in a consistent form for decades, but training very large networks had been computationally infeasible. Likewise, gradient-based training methods have existed for a considerable time, but without robust software toolchain support or hardware acceleration. GPUs from competitors to NVIDIA existed contemporaneously, but all were difficult to configure for strictly computational operation. In 2004, NVIDIA (along with DARPA, ATI, IBM and SONY) supported a prototype C language extension, Brook, for stream-based processing, with compilation support for several GPU architectures79. In 2006 NVIDIA released the Tesla GPU architecture, explicitly tailored to support general-purpose computation in addition to graphical processing80. Brook evolved into CUDA, which was officially released alongside the compute-tailored Tesla cores, and was promoted by NVIDIA as a general-purpose parallel computing API8. Soon after, CUDA and NVIDIA GPUs were adopted by researchers working to scale up neural network applications81,82, paving the way for the CUDA-accelerated CNN architecture used by AlexNet in 201283, and the start of the Deep Learning boom.
With CUDA, NVIDIA provided an early, simple API for general-purpose computing on their tensor processors, which was taken up by deep learning toolchains and therefore by deep learning engineers. Faster training powered by HW acceleration, and the ability to build larger models in a reasonable time, have delivered huge leaps in ML application performance. Increased interest in ML/AI applications has driven enormous concomitant interest and investment in training ML engineers, as well as enormous demand for ML training hardware supplied predominantly by NVIDIA.
In other words, it was not until the advent of simple programming models and tools, simultaneous with straightforward API access to hardware, that tensor processors reached widespread adoption.
The NM community can actively drive this convergence for Neuromorphic processors. We recommend that the NM community, including funding bodies and researchers, support and consolidate the current crop of open-source tools so they reach industry-standard quality and usability.
Commercial concepts for NM computing devices
NM computing hardware is undergoing a commercial renaissance, with numerous companies fabricating processors that can reasonably claim the title of Neuromorphic (see Fig. 2). These offerings vary in their approach to hardware design and application targeting84,85 (see Box).
Fig. 2: Timeline of recent commercial NM compute firms and new hardware.
Large industry players (top) have committed efforts to NM compute hardware for some time. From around 2015, increased interest in NM compute as a commercial prospect has seen a cluster of new commercial startups and spinoffs emerge. Shown here is commercially available and announced hardware, focussing on NM compute. University research chips are not shown. Note that IBM’s NorthPole is a near-memory efficient sparse tensor processor, rather than an SNN accelerator or compute-in-memory (CIM) processor117.
There has been significant overlap between research into analog compute-in-memory (CIM) architectures and event-driven computation, as the analog computing approach of mixed-signal spiking neuromorphic processors complements the analog signals generated by CIM crossbars. Combining these technologies can avoid the need for analog-digital converters on the output of a crossbar.
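To see why this pairing is natural, consider the idealised numerical sketch below: the crossbar performs a matrix-vector product through Ohm's and Kirchhoff's laws, and integrate-and-fire circuits on each column convert the resulting analog currents directly into events, removing the need for an explicit analog-digital converter. Device values, thresholds and time steps are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
n_rows, n_cols = 64, 16

# Idealised crossbar: weights stored as device conductances (in siemens)
conductances = rng.uniform(1e-6, 1e-4, size=(n_rows, n_cols))

def crossbar_mvm(voltages, g):
    """Analog matrix-vector product: column currents I = G^T V
    (Ohm's law per device, Kirchhoff's current law per column)."""
    return g.T @ voltages

def integrate_and_fire(currents, charge, q_thresh=1e-9, dt=1e-6):
    """Integrate column currents on capacitors and emit a spike when a
    threshold charge is reached, instead of digitising with an ADC."""
    charge = charge + currents * dt
    spikes = charge >= q_thresh
    charge = np.where(spikes, charge - q_thresh, charge)
    return charge, spikes.astype(float)

charge = np.zeros(n_cols)
for t in range(1000):
    v_in = 0.2 * (rng.random(n_rows) < 0.1)   # sparse input encoded as voltage pulses
    currents = crossbar_mvm(v_in, conductances)
    charge, spikes = integrate_and_fire(currents, charge)
```

In this arrangement the column output is a sparse event stream whose rate reflects the weighted sum, which is exactly the interface a downstream spiking layer expects.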
Analog CIM crossbars have been proposed using a range of technologies which are compatible with CMOS logic fabrication, encompassing flash-based memory86, capacitor-based approaches71, floating-gate transistors87, ferro-electric RAM88, resistive RAM89,90,91,92,93, Phase-Change Memory52,94,95, as well as other approaches which are more difficult to integrate into the VLSI fabrication process.
CIM approaches show