High-bandwidth memory stands at the forefront of multiple technology developments as a critical enabler of AI, but it is one of the most difficult modules to manufacture. Leading HBM device makers and foundries must simultaneously handle multi-layer chip stacking, die warpage, and shorter product lifecycles that are shrinking from two years down to just one.
But perhaps the most formidable challenge is introduced through the tightening size and pitch of through-silicon vias (TSVs) and microbumps, whose yield depends on the rapid resolution of defects for each generation of high bandwidth memory. Defect counts surge with the thousands of interconnects that must be flawlessly processed. These trends are pushing inspection tools to their absolute limit.
“Small bumps are the problem, not the large bumps,” said Alex Tokar, worldwide application and sales manager, X-ray division at Bruker. “X-ray imaging can inspect the bumps as well as the underbump metallization for defects and inconsistencies.”
HBM utilizes more data paths to achieve the high bandwidth needed, but with significantly smaller bump pitches than traditional ball grid arrays (BGAs) in flip chip packaging. For the current generation devices, HBM3E features 30- to 20-micron bumps, but HBM4 will likely scale down to the 10-micron level. “HBM is a key technology driving scaling. For HBM4, some customers are moving to a bump height of only 10 microns,” said Damon Tsai, head of product marketing, inspection products at Onto Innovation.
Challenges of scaling copper bumps
In order to stack 16 chips under the height of a single wafer, each wafer backside must be thinned down dramatically, to as thin as 20 microns. Inspection technology of the backside is used in production to ensure flatness across the 300mm wafer. At the same time, the three HBM chipmakers (SK hynix, Samsung, and Micron) are evaluating the inevitable move to hybrid bonding.
“We see so-called hybrid-hybrid bonding as a possible way of transitioning from microbumps to hybrid bonding,” Tsai said. “Here, two wafers use hybrid bonding, accessing the benefits of much shorter interconnects and signal delay, while the next level uses microbumping.”
Tsai noted that warpage is becoming a bigger issue as wafers are thinned down further. “HBM companies are starting to think about wafer-to-wafer bonding, because after thinning, the wafer level is much easier to handle than the die level.”
Another factor that degrades yield, reliability, and performance is inconsistent bump height (poor coplanarity), which can result from plating non-uniformity and process variability. A lack of coplanarity also affects surrounding areas, inducing mechanical stress, interconnect fatigue, or thermal cycling failures. Latent defects that escape detection during manufacturing can lead to inconsistent contact, degrading signal integrity, power delivery, and reliability. Such misalignment causes opens and shorts during flip-chip bonding. [1]
Given the breadth of these challenges, IC manufacturers typically focus on identifying issues after the plating step and before the reflow step (see figure 1). After copper and tin cap electroplating, confocal laser inspection outperforms white light inspection, because white light reflecting off the rough metal surface produces measurement noise.
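To make the coplanarity idea concrete, here is a minimal Python sketch that summarizes bump-height uniformity for one die. It treats coplanarity as the simple peak-to-valley spread of measured heights and flags bumps that sit far from the median. The function name, the bump IDs, the height values, and the 0.5-micron tolerance are all hypothetical and chosen only for illustration. Production tools typically fit a reference plane first, which this sketch does not do.

```python
from statistics import median

def coplanarity_report(heights_um, tolerance_um):
    """Summarize bump-height uniformity for one die.

    heights_um: mapping of bump id -> measured height in microns.
    tolerance_um: hypothetical allowed deviation from the median height.
    """
    values = list(heights_um.values())
    ref = median(values)
    # Coplanarity here is the simple peak-to-valley spread of bump heights.
    coplanarity = max(values) - min(values)
    # Bumps far from the median are candidates for bonding problems
    # such as the opens and shorts mentioned in the text.
    outliers = {
        bump: round(h - ref, 3)
        for bump, h in heights_um.items()
        if abs(h - ref) > tolerance_um
    }
    return {"median_um": ref, "coplanarity_um": coplanarity, "outliers": outliers}

# Hypothetical measurements for five bumps (microns).
heights = {"A1": 10.1, "A2": 9.9, "A3": 10.0, "A4": 8.6, "A5": 10.2}
report = coplanarity_report(heights, tolerance_um=0.5)
print(report)
```

In this made-up data set, only bump A4 falls outside the tolerance, so it would be the one to review before reflow.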
Fig. 1: Coherent laser measurements overcome the roughness of as-deposited copper pillars, while white light illumination suffers from scattering noise. After reflow, inline process control and AI analytics enable 3D anomaly detection across thousands of bumps. Source: Onto Innovation
The construction of 3D bump images benefits from using multiple cameras at different angles. “In the advanced packaging space, devices were laterally based for so long,” said John Hoffman, director of research and development at Nordson Test & Measurement. “Now the challenges are vertical, as well. So the relevant metrics relate to the quality and size of the vias and bumps and how the stacking is being performed.”
“When you go with stacking, coplanarity is a critical requirement, so you really want tight control of the flatness,” said Hoffman. “And customers want different capabilities when they stack one device on top of another. So as they’re developing the process they are evaluating ‘do I have to measure every single part and look at the warpage and map these things together or can I control things really well enough so that things are really flat?’”
Additionally, flexibility is necessary to accommodate different stages of development and to ramp to high-volume manufacturing. “A flexible tool can provide a profile of your part,” Hoffman said. “Some customers want to match the curvature of the different parts so that the die stacking is successful.”
Why are HBMs so tough to make?
In recent months, SK hynix, Samsung, and Micron have seen soaring demand for HBM. These modules are positioned along the shoreline of AI processors (XPUs) or ASICs in data servers. Because the physical shoreline space is limited, HBM makers must stack multiple DRAMs: 16 layers with a logic base controller, which could grow to 20 layers in HBM4.
To facilitate manufacturing integration, JEDEC standards restrict the total height of the memory module to 775 microns in HBM4, roughly the thickness of a prime silicon wafer. TSVs and copper microbumps vertically interconnect the devices down to the purpose-built interposers, which enable bandwidths far exceeding those delivered by memories like DDR4 and GDDR5.
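A back-of-the-envelope calculation shows why a fixed height limit forces thinner dies as layers are added. The Python sketch below assumes a deliberately simple stack model: one base die, plus a number of DRAM dies that each sit on one bonded interconnect layer. Only the 775-micron budget comes from the text. The base-die thickness and bond-line thickness are hypothetical placeholders, and real stacks include other contributions that are ignored here.

```python
def max_die_thickness_um(total_um, n_dram, base_um, gap_um):
    """Per-DRAM-die thickness that fits a stack-height budget.

    Model (an assumption): one base die, then n_dram DRAM dies, each
    sitting on one bonded interconnect layer of thickness gap_um.
    """
    remaining = total_um - base_um - n_dram * gap_um
    if remaining <= 0:
        raise ValueError("base die and bond layers already exceed the budget")
    return remaining / n_dram

BUDGET_UM = 775.0   # module height limit cited in the text
BASE_UM = 60.0      # hypothetical base-die thickness
GAP_UM = 10.0       # hypothetical bond-line thickness per layer

for layers in (12, 16, 20):
    t = max_die_thickness_um(BUDGET_UM, layers, BASE_UM, GAP_UM)
    print(f"{layers} layers -> at most {t:.1f} um per DRAM die")
```

Under these assumed values, the allowable die thickness drops steadily as the layer count rises, which is consistent with the aggressive wafer thinning described earlier.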
Microbumps play a critical role in enabling HBM structures by serving as interconnects between dies, and between dies and interposers or substrates. These bumps need to be uniform in height, properly aligned, and defect-free. Microbumps also serve to dissipate heat in multi-die stacks, so the ability to assemble denser, smaller bumps also improves heat dissipation from the HBM modules.
To reach acceptable yield in the competitive market for HBM, chipmakers are optimizing 3D inspection approaches that can illuminate critical bump defects such as voids, poorly aligned pads, and solder extrusion. Automated optical inspection (AOI) with laser triangulation methods can provide bump height and coplanarity measurements, while X-ray inspection tools are well-suited to measuring hidden bump characteristics. Similarly, acoustic inspection tools are actively being improved to illuminate any voids in metal interconnects, a growing problem in microbumps, fine redistribution layers, and other interconnects.
Such defects can occur anywhere among the thousands of copper microbumps connecting stacked chips, which become harder to inspect after the mass reflow or thermocompression bonding (TCB) step. Both Samsung and Micron use thermal compression with non-conductive film (TC-NCF) to bond their microbumps, while SK hynix employs a mass reflow molded underfill (MR-MUF) approach.
Mass reflow is the most mature and the least expensive solder flow option. In general, mass reflow is used whenever possible. Thermocompression and R-LAB (reverse laser-assisted bonding) are both process enhancements to conventional MR that better manage warpage from die-to-die and within the package. Because TCB uses high pressure and high temperature, TC-NCF may not scale as well as MR-MUF approaches.
For HBMs, characterizing and eliminating defects within an acceptable timeframe requires compiling data at a scale that is only possible by combining human expertise with AI data crunching. Add to that the critical choice between hybrid bonding and microbumps, and HBM is truly working at the edge of what is possible in fabricating high-yielding electrical connections.
Interconnect density intensifies with the adoption of hybrid bonding for bond pad interconnects, with less room for error across the wafer and the challenge of detecting particles or micro voids at the Cu-Cu bond pad interface. Undetected voids can lead to open electrical connections, resulting in yield loss, and in extreme cases, wafer breakage.
“Current acoustic technologies are limited to ≥10µm void size sensitivity and require water immersion, which adds the risk of bonded wafer contamination, delamination, and corrosion,” said Onto Innovation’s Tsai. The company is testing an immersion-free opto-acoustic solution at a customer site aimed at detecting ever smaller voids without the risks associated with water immersion.
The cost of bump technologies is lower than that of hybrid bonding, but only if yield levels can be maintained as bumps scale below 20 microns. Microbumps face pitch limitations, especially below 10µm, due to challenges in plating uniformity and solder reflow.
The transition from copper pillar bump manufacturing to hybrid bonding depends on the limitations of bumps with scaling, as well as the ease with which front-end wafer-to-wafer bonding can be implemented. “Because some memory makers also have front-end capability, it’s not as difficult to implement wafer-to-wafer bonding as it could be for other companies,” added Tsai.
AI: Making sense of inspection data
Microbumps suffer from a variety of defect types, including solder necking, head-in-pillow, and partial cracks. A head-in-pillow defect is a latent defect where one ball appears to sink into the one below, resembling a head on a pillow. Once subjected to thermal and/or mechanical stress, a fault can occur. X-ray imaging can be instrumental in identifying marginal failures like head-in-pillow caused by bump misalignment from wafer tilt.
Fig. 2: Gradual transition from good bumps to non-wet defect caused by die tilt. Source: Bruker
It’s critical to catch these marginal defects using early inspection steps. Because an electrical connection still exists, a marginal bump will typically pass testing, though it may show up as a high-resistance connection. Later in the process, or when HBMs are in the field, the introduction of mechanical or thermal stress can lead to complete failure (an open, non-wet joint). Enabling real-time defect analysis and feedback to process equipment for statistical process control effectively reduces the cycle time of yield loss assessments.
Whether microbump defects are collected using optical (AOI), X-ray, or acoustic inspection, compiling the massive amounts of data from the 3D bump assembly can significantly benefit from AI processing, which can improve the speed and accuracy of automated 3D defect detection methods. Metrology suppliers often partner with customers to aid in inspection/metrology data analysis, though the arrangement varies by customer.
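The statistical process control idea can be sketched with a basic Shewhart-style check: derive control limits from known-good baseline data, then flag new measurements that fall outside them. The Python example below is a minimal illustration only. The monitored metric (a per-wafer mean bump height in microns), the data values, and the 3-sigma limit are assumptions, not a description of any vendor's feedback loop.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits from in-control baseline data."""
    center = mean(baseline)
    sigma = stdev(baseline)
    return center - k * sigma, center, center + k * sigma

def out_of_control(samples, limits):
    """Indices of samples falling outside the control limits."""
    lower, _, upper = limits
    return [i for i, x in enumerate(samples) if x < lower or x > upper]

# Hypothetical per-wafer mean bump heights (microns) from a stable process.
baseline = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 9.9]
limits = control_limits(baseline)

# Hypothetical new wafers; the third one drifts high.
new_wafers = [10.0, 10.1, 10.9, 9.9]
flagged = out_of_control(new_wafers, limits)
print(flagged)
```

A flagged index would be the trigger for feedback to the process tool, which is where the cycle-time savings come from.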
“That becomes a big question,” said Nordson’s Hoffman. “‘How do we inject our data into their factory automation systems? Or does it work better as standalone capability?’ We support all kinds of different use cases.”
A big part of using AI for metrology data analytics involves developing a multi-view approach to increase the robustness of defect detection. For example, A*STAR researcher Richard Chang and colleagues recently demonstrated how 3D X-ray scanning data, together with convolutional neural networks and vision transformers, improved detection accuracy and enabled comprehensive reports on >400 bumps in only 5 minutes. [2]
The researchers applied both deep learning and large language models to enable better defect detection accuracy, along with the assignment of fewer labels. “We detect individual HBMs consisting of multiple components along with providing a full report for the scanned chip providing 3D metrology information, details on defects, and possible root cause analysis for the defects,” stated the authors. The approach focuses on four features: bond-line thickness, solder-to-void ratio, pad misalignment, and solder extrusion. It also measures copper pillar width and height.
A*STAR’s segmentation method identifies defects from four regions of the bump (copper pillar, copper pad, void, and solder) based on transversal views, sagittal views, and compiled 3D views of bump arrays. Generative AI reports then highlight potential failure patterns and suggest corrective actions. The approach also prioritizes critical defects based on historical and real-time data.
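As a rough illustration of how a four-region segmentation can feed bump metrology, the Python sketch below counts voxels per region in a tiny labeled volume and derives a void fraction for the solder joint. This is not the A*STAR implementation. The label encoding, the nested-list volume layout, and the void-fraction metric (a stand-in related to the solder-to-void ratio) are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical label encoding for a segmented bump volume.
LABELS = {0: "background", 1: "copper_pillar", 2: "copper_pad", 3: "void", 4: "solder"}

def region_volumes(volume):
    """Count voxels per region in a 3D label volume (nested lists, z/y/x)."""
    counts = Counter(v for plane in volume for row in plane for v in row)
    return {name: counts.get(label, 0) for label, name in LABELS.items()}

def void_fraction(volumes):
    """Share of the solder joint (solder + void voxels) occupied by voids."""
    joint = volumes["solder"] + volumes["void"]
    return volumes["void"] / joint if joint else 0.0

# Toy 3x2x3 volume: pillar on top, solder with one void voxel, pad below.
volume = [
    [[1, 1, 0], [1, 1, 0]],
    [[4, 4, 0], [4, 3, 0]],
    [[2, 2, 0], [2, 2, 0]],
]
vols = region_volumes(volume)
print(vols, void_fraction(vols))
```

Per-bump numbers like these are the kind of structured input that a report generator could then summarize across an array of bumps.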
A key role for GenAI lies in quickly creating reports based on queries. This enables root cause analysis back to the tool or process that caused the fault. “We then propose a fully automated system that, with one click, is able to provide a comprehensive report of the entire 3D scan consisting of more than 400 bumps in under 5 minutes,” stated the authors.
At the individual tool level, another area where AI can help is in separating measurement noise from the desired image, a result of the various film interfaces in stacked chips. In X-ray inspection, adjustments in the imaging process reduce noise and optimize image clarity. This may involve adjusting energy settings, power levels, or exposure times.
Control of the X-ray dose is especially critical because X-rays potentially can damage sensitive devices, a top-line concern for memory chips because such damage can degrade device performance and reliability.
Conclusion
With the transition to HBM4, chipmakers are facing multiple challenges associated with scaling copper microbumps to 10 microns in size, deciding when and how to migrate from microbumps to hybrid bonding, and choosing the best methods to analyze reams of data streaming from automated optical inspection, X-ray defect inspection, or acoustic inspection methods.
Microbumps suffer from a variety of defect types, including pad misalignment, solder necking, head-in-pillow, and partial cracks. The great challenge of inspecting and controlling thousands of bumps between die and from die to substrate lies in analyzing thousands of images in a reasonable timeframe. Defects such as solder extrusion and voids need to be traced back to their source for rapid prevention during manufacturing.
A*STAR researchers suggested the following remedies: [2]
- Solder extrusion defects occur due to excessive solder paste application, improper reflow temperature profile or insufficient solder mask coverage. Manufacturers should optimize solder paste volume control, adjust reflow temperature profiles and ensure proper solder mask coverage.
- Pad misalignment defects arise due to improper alignment during die placement, PCB warpage, or inaccuracies during the stencil design. Manufacturers should implement high-accuracy placement techniques, ensure PCB flatness during assembly, and use accurate stencil alignment for consistent solder deposition.
As HBM vendors approach the transition to HBM4, having an analytics framework and inspection methodology in place to detect the myriad bump defects and defect patterns from development to high-volume manufacturing will be key to producing modules with high yield.
References
- https://semiengineering.com/interconnect-innovations-in-high-bandwidth-memory-part-2/
- R. Chang et al., “Efficient Visual Inspection Framework of High-Bandwidth Memory Bumps with Generative and Deep Learning AI,” 2025 IEEE 75th Electronic Components and Technology Conference (ECTC), Dallas, TX, USA, 2025, pp. 920-926, doi: 10.1109/ECTC51687.2025.00161.