ICs and SoCs are utilizing a range of processing elements that allow them to optimize current workloads while hedging their bets for the future.
What used to be a simple choice between an ASIC, FPGA, or DSP, has evolved into a mix of processor types and architectures, including varying levels of programmability and customization. Speed is essential, but technology is evolving so quickly that the optimal solution today may be obsolete by the time a chip reaches manufacturing. If a new AI model, memory standard, or other type of technology upgrade comes along, programmable components offer a simpler solution compared to a costly silicon re-spin. This could even include swapping in a programmable chiplet.
The ability to reprogram or reconfigure chips in the field enables designers to re-allocate workloads and provide consumers with hardware upgrades, without requiring them to purchase an expensive new device. FPGAs (field-programmable gate arrays) and DSPs (digital signal processors) are two of the most common types of programmable components, but there are others, as well.
“The easy example is the graphics processing unit,” said Andy Nightingale, vice president of product management and marketing at Arteris. “For a long time, it’s been the massively parallel, programmable bunch of GPUs that you can do stuff with. It might not be the most optimal, but it’s the closest thing to an FPGA in terms of how you drive something with software versus hardware elements.”
While GPUs are highly programmable, they’re also extremely power-hungry, which is why designers try other solutions for embedded AI applications. One commonly used option is a relatively fixed-function neural processing unit (NPU) with a programmable DSP.
“Nvidia’s GPUs employ the CUDA C++ programming dialect and a thread-based, warp-based programming model that relies on hardware-heavy cache memory systems,” said Steve Roddy, chief marketing officer at Quadric. “That gives the programmer freedom to ignore how data is mapped to memory and just lets the hardware handle the details. DSPs are also C and C++ programmable, but much lower power than GPUs because DSPs typically employ local SRAM, not local cache memory, and DMA (direct memory access) movement of data instead of endless cache line fetches. DSPs are limited in AI performance because they don’t do matrix math efficiently and have limited data parallelism. NPUs are great at the matrix math at the heart of AI, but they lack programmability. Our GPNPU (general-purpose neural processing unit) blends the matrix efficiency of the NPU with the low-power programmability of the DSP to create an optimal embedded AI processor.”
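The power difference Roddy describes comes down to data movement: staging operand tiles in small local buffers via DMA instead of letting a cache hierarchy fetch lines on demand. A minimal Python sketch of that DSP/NPU-style dataflow (NumPy arrays stand in for hardware; the function name and tile size are illustrative, not any vendor's API):

```python
import numpy as np

def tiled_matmul(a, b, tile=4):
    """Matrix multiply with explicit tiling, mimicking a DSP/NPU-style
    dataflow: operand tiles are copied into small 'local' buffers
    (standing in for on-chip SRAM filled by DMA) before the
    multiply-accumulate runs on them."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((m, n))
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            for p in range(0, k, tile):
                # "DMA" step: stage operand tiles in local buffers.
                a_loc = a[i:i + tile, p:p + tile].copy()
                b_loc = b[p:p + tile, j:j + tile].copy()
                # Multiply-accumulate on the staged tiles only.
                out[i:i + tile, j:j + tile] += a_loc @ b_loc
    return out
```

The point of the sketch is that every data movement is explicit and scheduled by the programmer (or compiler), which is what lets a DSP or NPU skip the power cost of cache tag lookups and speculative line fills.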
Synaptics’ latest embedded AI processors feature Arm CPUs and MCUs with the Helium DSP extension, and Google’s RISC-V-based Coral NPU. And Blaize uses proprietary, programmable graph streaming processors (GSPs) that leverage Arteris’ network-on-chip IP for multimodal AI applications.
Data centers have several programmable options, as well. “Data processing units (DPUs) are smart network interfaces that you can use to route packets from different parts of the system,” said Nightingale. “There are P4-programmable switches in the data center space. They’re network switches with programmable packet processing pipelines. Then there are reconfigurable arrays. Coarse-grained reconfigurable arrays (CGRAs) perform certain functions. They’re software-driven reconfiguration at a higher level of abstraction than FPGAs, so they give you the balance of flexibility with efficiency and AI inference in pipelines.”
CGRAs are an emerging technology that could sit somewhere between FPGAs and GPUs, allowing for a more blended or balanced approach. “This is probably the most interesting one that’s coming out of the mix,” said Nightingale. “They’re still in the experimental phase, because there’s technology that’s reached an arbitrary baseline that you can rely on, and then there are new technologies that have shown promise but haven’t been thoroughly tested yet. They could come in and be little game changers in their own right, in their own space. I’m still advocating for a blend of FPGA, GPU, and XPU for specific tasks. It’s probably the best mix for what we’re looking at.”
CGRAs and field programmable analog arrays (FPAAs) extend the flexibility of reconfigurable computing beyond traditional digital logic. “The market is in the early stages, and questions remain about the scale and maturity of market demand,” said Venkat Yadavalli, head of the Business Management Group at Altera. “This is true especially as it relates to ecosystem support, toolchain maturity, and integration with existing FPGA and ASIC design flows.”
Programmability vs. reconfigurability vs. customization

Chips can be programmable or reconfigurable, or both, as is the case with FPGAs.
“There’s programmability in that the entire hardware itself is programmable, which means I can completely change the design on an FPGA,” said Nandan Nayampally, chief commercial officer at Baya Systems. “The next layer says, ‘I’ve got all these components. I’ve got all these fabric connections, but I can configure bandwidth that goes to this, latency that goes to this.’ I can prioritize. That’s also programmability, but it’s somewhat limited, because you’re not completely changing the functionality. You’re partitioning and provisioning differently.”
For example, some CPUs are programmable and configurable, but only to a certain extent. “In the general space of programmability, which you can see at a CPU level, there are programmable CPUs built on ISAs like RISC-V, and architecturally, certain other things are coming in,” said Altera’s Yadavalli. “FPGAs give you the kind of extreme flexibility that you need, where you can bring in and implement different kinds of workloads. If you take a RISC-V, it allows you to do some kind of appliance-related configurability, where you are just making some basic decisions to help another chip to work, or it could be a RISC-V processor with some limited programmability.”
Extreme programmability remains the domain of FPGAs. “Because you can change I/Os, you can change the fabric, you can change everything along the way to suit your needs,” said Yadavalli. “Other types of programmability are more explored in a very limited, targeted manner. If there is a standard product, is there anything I can provide to add additional flexibility? It’s a little more configurable, but not fully programmable.”
Chips can also be customized via power infrastructure. “There are a couple of ways of handling customization,” said Mo Faisal, CEO of Movellus. “One is that every chip is customized, and it’s got custom power grids and optimization. If you look at a million chips that are packaged with a million different packages, they’re all unique. However, you can actually make the power infrastructure a bit more programmable so that it matches the different packages, because every package will have a different resonance. If I have enough programmability, I can tune out some of the variation of the packages. That could be a significant gain because the package causes the droops, which sets your Vmin, which sets your power, which sets your cooling and everything.”
Impact of AI and rising analog on DSPs

Modern SoCs are evolving rapidly, and one significant shift is the growing amount of analog content they need to handle, which places an extra burden on DSPs.
“These chips aren’t just digital anymore — they now include RF, mixed-signal, and sensor interfaces for things like 5G, automotive radar, and IoT devices,” said Amol Borkar, senior director of product management and marketing, head of computer vision/AI products at Cadence. “That sounds great for functionality, but it also means DSPs are dealing with signals that are far from perfect. Real-world analog signals come with noise, distortion, and variability, so DSPs must work harder to clean them up. This has led to a big push for smarter calibration and compensation algorithms.”
Because of this, the role of DSPs has expanded. “They’re not just crunching numbers. They’re doing analog-aware processing,” said Borkar. “Think adaptive filtering to reduce interference, linearization for RF power amplifiers, and correcting errors from ADCs and DACs. All of this adds complexity, so DSP architectures are becoming more parallel and often include specialized accelerators to keep up with performance demands.”
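Adaptive filtering of the kind Borkar mentions is commonly built on the least-mean-squares (LMS) algorithm, where the filter weights adjust on every sample to track changing conditions. A minimal sketch (the function name, tap count, and step size are illustrative, not any vendor's API):

```python
import numpy as np

def lms_filter(x, d, taps=8, mu=0.01):
    """Least-mean-squares adaptive FIR filter: the weights w adjust on
    every sample so the filter output y tracks the desired signal d
    given the reference input x (e.g., cancelling interference)."""
    w = np.zeros(taps)
    y = np.zeros(len(x))
    e = np.zeros(len(x))
    for n in range(taps - 1, len(x)):
        xn = x[n - taps + 1:n + 1][::-1]  # most recent sample first
        y[n] = w @ xn                     # filter output
        e[n] = d[n] - y[n]                # error drives adaptation
        w += 2 * mu * e[n] * xn           # LMS weight update
    return y, e, w
```

Because the weights keep updating, the same loop also handles slow drift from temperature or component aging, which is what makes the approach attractive for analog-aware processing.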
Digitally controlled analog gives DSPs more programmability. “These days, if you have a basic backbone of a data flow, the DSP is analog, but then you tap off some signals along the way and convert them to digital,” said Marc Swinnen, director of product marketing at Ansys, part of Synopsys. “Now you can do a full digital analysis of all sorts of mathematical algorithms and all the software coding, and do a lot of digital thinking about it. When you think about what the feedback signal should be, you translate that back into analog and feed back the analog signal. That’s called digitally-controlled analog, and that inserts programmability, software, and digital circuits in the feedback flow. It’s not as fast or as elegant, but it’s more programmable, and the software is more controllable.”
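The feedback flow Swinnen describes can be reduced to a toy loop: sample the analog output, compute a digital correction, convert it back, repeat. In the sketch below, a scalar gain stands in for the whole analog plant and converter chain, and all names and numbers are illustrative:

```python
def digital_feedback(setpoint, plant_gain=0.8, ki=0.5, steps=50):
    """Toy model of digitally controlled analog: the analog output is
    sampled (ADC), a digital integrator computes the correction word,
    and the result is fed back through a DAC to steer the analog
    output toward the setpoint."""
    code = 0.0  # digital correction word (DAC input)
    out = 0.0   # "analog" output being regulated
    for _ in range(steps):
        sample = out                 # ADC: digitize the analog output
        error = setpoint - sample    # digital math on the sample
        code += ki * error           # integrator accumulates correction
        out = plant_gain * code      # DAC + analog plant response
    return out
```

The software-visible part is the integrator line, which is exactly where the programmability Swinnen mentions gets inserted: the control law can be changed in code without touching the analog path.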
Looking ahead, some interesting trends are emerging. “AI is now starting to play a big role in solving the challenges that come with more analog content in SoCs,” said Borkar. “Traditionally, DSPs relied on fixed models to correct imperfections in analog signals, but those models often fall short when real-world conditions change. This is where AI shines. Machine learning can learn from actual device behavior and adjust calibration dynamically, predicting non-linearities in ADCs or RF paths and applying corrections on the fly.”
AI also makes DSPs more adaptive. “Instead of using static filters or equalizers, AI-driven algorithms can continuously optimize themselves as conditions shift — whether it’s temperature changes, aging components, or interference,” said Borkar. “That’s especially important for systems like 5G radios or automotive sensors, where the environment is never constant.”
Others agree that the future holds a mix of classic and AI approaches. “We had a discussion with some customers on the automotive side about what the DSP will continue to do and what AI will take over,” said Andy Heinig, head of the Department for Efficient Electronics at Fraunhofer IIS’ Engineering of Adaptive Systems Division. “For example, if you have a radar, then you have three different FFTs (fast Fourier transforms). There are some approaches to replace that with AI, but we are quite sure that a classical FFT algorithm is much better in terms of power efficiency, because you can highly optimize that, and you need very large networks to have the same accuracy. Also, it’s much more deterministic, more explainable. We see the solution is to have FFTs, and then, for object recognition, for example, on top of that, an AI will work. But to replace everything that a classical DSP can do, we don’t see it.”
When doing signal conditioning, it makes sense to do the first steps on classical DSP algorithms with some FFTs. “We will see a short trend where everything is replaced by AI, but it will definitely come back to this hybrid approach,” said Heinig. “To find the right way — what is classic, what is AI — will take some iteration.”
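The classical front end of the hybrid Heinig describes can be sketched in a few lines: a windowed FFT turns a radar beat signal into a spectral peak, and everything downstream of the spectrum is where the AI stage would sit. All parameters below are synthetic, and the simple argmax stands in for that downstream stage:

```python
import numpy as np

def range_spectrum(beat, n_fft=256):
    """Classical DSP front end for an FMCW-style radar: a windowed FFT
    of the beat signal turns target delay into a spectral peak."""
    win = np.hanning(len(beat))
    return np.abs(np.fft.rfft(beat * win, n_fft))

# Synthetic beat signal: one target placed exactly on FFT bin 40.
fs, n = 1000.0, 256
t = np.arange(n) / fs
beat = np.cos(2 * np.pi * (40 * fs / n) * t)
spec = range_spectrum(beat)
detected_bin = int(np.argmax(spec))  # downstream stage (e.g., AI object
                                     # recognition) would consume the
                                     # whole spectrum, not just the peak
```

The hybrid argument is visible in the split: the FFT is cheap, deterministic, and explainable, so only the harder recognition problem on top of the spectrum is handed to a learned model.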
DSP slices and AI engines in FPGAs

The DSPs built into FPGAs are reconfigurable blocks that have evolved to be much more efficient, capable of handling both fixed- and floating-point operations, as well as AI and machine learning workloads, noted Altera’s Yadavalli.
In addition to DSPs, many modern FPGAs have AI engines, which are VLIW (very long instruction word), SIMD (single instruction, multiple data) processors. One advantage of this is to enable FPGAs to perform digital signal processing in line with the data. “As opposed to a discrete DSP and then an FPGA that’s doing data acquisition from ADCs or DACs, you now have DSP slices or AI engines,” said Yadavalli. “That’s something new that we’ve driven over the last few years, bringing in those vector compute engines onto the single device.”
A vector processing unit (VPU) is similar to a GPU, with cores that are doing the processing. “Or like an x86, you have your core architecture,” said Yadavalli. “This is a different architecture with a different instruction set, but that’s optimized for linear algebra and matrix math.”
The AI engines take on some traditional DSP workloads, but not all. “Multiply-accumulates are useful for a lot of different operations, so programmable logic still has DSP slices,” said Rob Bauer, senior manager of product marketing, adaptive and embedded group at AMD. “But now we’ve added this AI engine array to the device to do the heavy lifting on these compute-intensive loads, such as channelizers, FFTs, FIR filters, and we have a handful of use cases. The aerospace/defense and test/measurement markets have taken that up in spades because of those advantages.”
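The multiply-accumulate that Bauer says a DSP slice exists to do is easy to see in a direct-form FIR filter, one of the workloads he names. A sketch (illustrative pure Python, not AMD's toolflow; a real slice would run one MAC per tap per clock in hardware):

```python
def fir_filter(x, coeffs):
    """Direct-form FIR filter written as explicit multiply-accumulates,
    the primitive a DSP slice implements in hardware."""
    taps = len(coeffs)
    out = []
    for n in range(len(x)):
        acc = 0.0  # accumulator register
        for k in range(taps):
            if n - k >= 0:
                acc += coeffs[k] * x[n - k]  # one MAC per tap
        out.append(acc)
    return out
```

Every output sample is a chain of MACs over the most recent inputs, which is why the same slice hardware serves FIR filters, channelizers, and the matrix math in AI workloads alike.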

Fig. 1: An adaptive SoC with AI engines that integrate DSP capabilities. Source: AMD
Integrating the ADCs and DACs into the same die as the FPGA is important from an RF test perspective. “It allows lower latency for testing these systems,” said Bauer. “You don’t have a discrete data converter or even a discrete chiplet, as you see with some other approaches. When you have a chiplet and then you have an FPGA, you’ve still got to move the data between the two. When you have the ADCs right there in the same die as the programmable logic, it brings some significant advantages.”
Chiplets or eFPGA for flexibility

For new, unknown, and evolving use cases, programmable chips allow engineers to update a deployed configuration. Chiplets also offer a solution.
“Chiplets are being built with technologies on them that would help the use case where things are changing a lot, where you could effectively swap in a chiplet that had a new protocol or standard in it,” said Arteris’ Nightingale. “That offsets some of the benefits of FPGAs, because you can say, ‘On my next spin, I can build an SoC with multiple chiplets on, and we’ll replace one of the chiplets with a security chiplet that we’ve just upgraded, and everything else stays the same.’ However, there’s a trade-off between the power/performance advantage and the prototyping aspect. Maybe prototype first in an FPGA and then swap it out. Chiplets are definitely going to be part of the equation. Chiplets give you more time or more variables. You could even think of having a chiplet with an FPGA on it that you are going to swap out for an optimized processing unit.”
This means a chiplet with an FPGA on it can be reprogrammed, but the rest of the SoC doesn’t have to be validated again because just one part is being changed.
Embedded FPGAs are another solution, but they come with an area tax because of the reconfigurability circuitry. “For someone who’s used to designing ASIC gates in the smallest amount of area, put that same design in an FPGA, and it becomes a lot bigger,” said Andy Jaros, vice president of IP sales at QuickLogic. “Designers need to think judiciously. ‘I’m only going to put it where it’s really important.’ It would also impact cost. We’re seeing things like I/O flexibility, because nobody wants to tape out another chip. Or if a new data center comes on and they change the backplane a little bit, they don’t want to tape out the ASIC, so they put an embedded FPGA in that space to be able to connect to different data centers or backplanes.”
Alternatively, eFPGAs offer flexibility to the unknown. “It’s unknown now, but it will be known in the future,” said Hezi Saar, executive director, product management, mobile, automotive, and consumer IP at Synopsys. “Once it’s known, you need to be fast to market. And to be reliable, you need to be low power. I believe there’s some functionality that you can do with eFPGA, but it’s not a solution for all. The market is very agitated because of those unknowns. They have to make decisions, and they go with it, but they need to have a backup plan A, B, and C, where something changes, and how they respond to that. Based on what I’ve seen, they’re just going to do silicon quicker. It’s easier on the mobile side, because they can do it quickly. But in new markets like robotics, it’s a bit more challenging, especially if they say, ‘Okay, I need to move from LPDDR5 to LPDDR6. Then the foundry is changing the node from A to B. This one is no longer available to me. How fast can I move to the next one?’”
Memory architecture is a key differentiator between programmable and fixed logic. “ASICs can be architected with custom memory hierarchies to meet targeted AI workload demands, while FPGAs offer a flexibility that allows them to be applied to a range of use cases,” said Steve Woo, distinguished inventor and fellow at Rambus. “This tradeoff between generality and performance can impact efficiency, especially as models grow and memory bandwidth becomes a limiting factor.”
Conclusion

In these fast-evolving times of AI spreading everywhere, the rise of robotics, and the future demands of 6G, programmability enables companies to keep up with technology trends and consumer demands, even if it sacrifices some of the efficiency of an ASIC.
“The way I like to describe it is that products are becoming software-defined, AI-powered, silicon-enabled,” said Michael Munsey, vice president of semiconductor industry for Siemens EDA. “If you have software, you need a semiconductor. Software has to run somewhere. But the things that have flipped over time were that typically when you did product design, the semiconductor would be developed first, then the software development would start later on, since so much of the functionality and features are being defined by software. Now, software is being developed a lot earlier, plus companies are looking to monetize after product release updates. Basically, they want to add functionality via software and update the product via software. All those decisions have to now be made architecturally.”
The hardware needs to be able to support the changes in the software, however. Take the iPhone, for example. “When iOS 16 came out, and you installed it on your phone, you ended up with a better phone because you had better noise cancellation in the microphone,” said Munsey. “You took better pictures, because those were all software and DSP algorithms that could be updated. The battery lasted longer because they tweaked the power profile of the chips, and they made the lifetime of the battery last longer by updating the battery management system. You could not do that unless you architected the whole product to be able to be software updatable. Now you have companies like Tesla, which are offering software upgrades to add more functionality to the cars, so you can see this could be a path forward. That’s why companies are investing more in compilation technologies, because that software has to now be developed even before the semiconductor is available to do the co-design.”
Related Reading

FPGAs Find New Workloads In The High-Speed AI Era
Growing use cases include life science AI, reducing memory and I/O bottlenecks, data prepping, wireless networking, and as insurance for evolving protocols.