Main
MRI enables three-dimensional (3D) imaging of the human brain in vivo with millimetre resolution. Neuroimaging packages like FreeSurfer4, FSL5 and SPM6 enable large-scale studies with thousands of MRI scans. A core component of these packages is digital atlases: reference 3D brain images that comprise image intensities, neuroanatomical labels or both. (We note that the cerebral cortex is often modelled with specific atlases defined on surface coordinate systems rather than 3D images.) Atlases enable comparison of different subjects in a common coordinate frame (CCF). When they include neuroanatomical labels, atlases also provide prior spatial information for analyses such as automated image segmentation7.
Most volumetric atlases are built by averaging in vivo MRI scans from many subjects. However, their resolution (roughly 1 mm) is insufficient to study brain subregions with different function and connectivity8. Ex vivo MRI yields roughly 100-μm resolution9,10,11,12 but still fails to visualize cytoarchitecture. Histology is a microscopic modality that addresses this issue. Earlier versions of histological atlases were printed and comprised a small number of sections13. Subsequent efforts combined serial histology with image registration to produce 3D histological atlases14. These were mapped to in vivo scans of living subjects by means of intermediate 3D MRI templates (for example, the Montreal Neurological Institute (MNI) atlas15) or directly with Bayesian methods.
Earlier 3D histological atlases modelled only one brain region (for example, thalamus, basal ganglia16,17,18). More recent efforts targeted the whole brain. BigBrain1 comprises more than 7,000 histological sections of a single brain, but without labels. Its follow-up, Julich-Brain2, aggregates data from 23 individuals, with community-sourced labels for 248 cytoarchitectonic areas mapped to MNI space—albeit with limited accuracy and only partial subcortical labelling19. The Allen reference brain3 has comprehensive anatomical annotations but only on a sparse set of sections of a single specimen. The Allen MNI template is a labelling of the MNI atlas with the Allen anatomical protocol, but with a fraction of the labels and less accurate delineations owing to limited resolution and contrast. The Ahead brains20 comprise quantitative MRI and registered 3D histology for two separate specimens, but labels are available for only a few dozen structures and are automated rather than manual. Further details on these atlases can be found in the ‘Extended Introduction’ in the Supplementary Information.
Although existing histological atlases provide exquisite 3D cytoarchitectural maps and some degree of MRI–histology integration, there are at present neither (1) datasets with densely labelled 3D histology of the whole brain nor (2) probabilistic atlases built from such datasets, which would enable analyses such as Bayesian segmentation or CCF mapping of the whole brain.
To address these issues, we present NextBrain, a densely labelled probabilistic atlas of the human brain built from histology images. We used custom artificial-intelligence-enabled registration and segmentation methods to assemble 3D reconstructions of multimodal serial histology of five human half brains, semi-automatically segment them into 333 ROIs and average the labels into the probabilistic atlas. NextBrain is open source and includes the atlas, a companion Bayesian segmentation method, the data (with an online visualization tool) and ground truth delineations for a 100-μm isotropic ex vivo scan12.
Densely labelled 3D histology of five human hemispheres
The NextBrain workflow is summarized in Fig. 1 and detailed in Methods. The first result of the pipeline (Fig. 1a–g) is a multimodal dataset with human hemispheres from five donors (three right, two left), including half cerebellum and brainstem. Each of the five cases comprises accurately aligned high-resolution ex vivo MRI, serial histology with hematoxylin and eosin (H&E) and Luxol fast blue (LFB) stains, and dense ground truth segmentations of 333 cortical and subcortical brain ROIs.
Fig. 1: NextBrain workflow.
a, Photograph of formalin-fixed hemisphere (lateral view). b, High-resolution (400 μm) ex vivo MRI scan, FreeSurfer segmentation and extracted pial surface (parcellated with FreeSurfer). Left, sagittal slice of MRI. Centre, corresponding FreeSurfer segmentation. Right, 3D rendering of reconstructed and parcellated pial surface. c, Tissue slabs and blocks, before and after paraffin embedding. Left, blocked coronal slice of the cerebrum. Right, blockface photo of a cerebral block. d, Histology: coronal section of cerebrum stained with LFB (left) and H&E (right). e, Artificial-intelligence-assisted labelling of 333 ROIs on LFB. Left, cerebrum; centre, brainstem; right, cerebellum28. f, Initialization of affine alignment of tissue blocks using a custom registration algorithm that minimizes overlap and gaps between blocks. g, Refinement of registration with histology and nonlinear transform24,25. Reconstructed coronal slice of LFB (left), H&E (middle) and labels (right), overlaid on MRI, after nonlinear registration with artificial intelligence and robust Bayesian refinement. h, Orthogonal slices of our 3D probabilistic atlas. Left, sagittal; middle, coronal; right, axial. Each voxel is painted with a linear combination of the colours of the labels, weighted by their probabilities. i, Automated Bayesian segmentation of an in vivo scan into 333 ROIs using the atlas. The atlas can also be used for segmenting ex vivo MRI and as CCF for population analyses.
Aligning the histology of a case is analogous to solving a 2,000-piece jigsaw puzzle in 3D, with the ex vivo MRI as reference (similar to the image on the box cover), and with pieces that are deformed by sectioning and mounting on glass slides—with occasional tissue folding or tearing. This problem falls outside the scope of existing intermodality registration techniques21, including slice-to-volume22 and 3D histology reconstruction methods14, which do not have to address the joint constraints of thousands of sections acquired in non-parallel planes as part of different blocks.
Instead, we solve this challenging problem with a custom, state-of-the-art image registration framework (Fig. 2), which includes three components specifically developed for this project: (1) a differentiable regularizer that minimizes overlap of different blocks and gaps in between23, (2) an artificial intelligence registration method that uses contrastive learning to provide highly accurate alignment of corresponding brain tissue across MRI and histology24 and (3) a Bayesian refinement technique based on Lie algebra that guarantees the 3D smoothness of the reconstruction across modalities, even in the presence of outliers due to tissue folding and tearing25. We note that this is an evolution of our previously presented pipeline26, which incorporates the aforementioned contrastive artificial intelligence method and jointly optimizes the affine and nonlinear transforms to achieve a 32% reduction in registration error (details below).
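The idea behind component (1) can be conveyed with a deliberately simplified sketch: if each tissue block contributes a soft occupancy map, the maps should sum to one everywhere inside the brain mask, so a squared penalty on the deviation from one simultaneously discourages overlaps (sum > 1) and gaps (sum < 1). This toy NumPy version illustrates only the principle; the published regularizer23 operates on transformed block masks and is differentiable with respect to the block transforms.

```python
import numpy as np

def overlap_gap_penalty(block_masks, brain_mask):
    # Soft occupancy of all blocks should sum to 1 inside the brain mask:
    # values > 1 indicate overlap between blocks, values < 1 indicate gaps.
    occupancy = block_masks.sum(axis=0)
    residual = (occupancy - 1.0) * brain_mask
    return float(np.mean(residual ** 2))

# Two blocks tiling a 1D 'brain' perfectly: zero penalty
brain = np.ones(10)
blocks = np.stack([np.r_[np.ones(5), np.zeros(5)],
                   np.r_[np.zeros(5), np.ones(5)]])
print(overlap_gap_penalty(blocks, brain))  # 0.0
```

Shifting either block so that the two masks overlap (or leave a gap) makes the penalty strictly positive, which is what drives the affine initialization towards a tight, non-overlapping mosaic.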
Fig. 2: 3D histology reconstruction of Case 1.
a, Coronal slice of 3D reconstruction; boundaries between blocks are noticeable from uneven staining. Joint registration minimizes overlap and gaps between blocks (this reconstructed slice comprises four different blocks). b, Accurate intermodality registration with artificial intelligence techniques. Registered MRI, LFB and H&E histology of a block, with tissue boundaries (traced on LFB) overlaid. c, Orthogonal view of reconstruction, which is smooth thanks to the Bayesian refinement and avoids gaps and overlaps thanks to the regularizer. d, Visualization of 3D landmark registration errors for Case 1. Left, visualization of landmarks. Right, histogram, mean and s.d. of error magnitude for this case, compared with our previous pipeline. Error (mean ± s.d.): 1.27 ± 0.59 mm. Error26: 1.42 ± 0.72 mm. See Table 1 and Extended Data Figs. 1, 2, 3 and 4 for results on the other cases.
Qualitatively, it is apparent from Fig. 2 that a very high level of accuracy is achieved for the spatial alignment, despite the non-parallel sections and distortions in the raw data. The regularizer effectively aligns the block boundaries in 3D without gaps or overlap (Fig. 2a–c), with minor discontinuities across blocks (for example, in the temporal lobe). When the segmentations of different blocks are combined (Fig. 2a, right), the result is a smooth mosaic of ROI labels.
The artificial-intelligence-enabled registration across MRI and histological stains is exemplified in Fig. 2b. Overlaying the main ROI contours on the different modalities shows the highly accurate alignment of the three modalities (MRI, H&E, LFB) even in convoluted regions of the cortex and the basal ganglia. The mosaic of modalities also highlights the accurate alignment at the substructural level: for example, subregions of the hippocampus.
Figure 2c shows the 3D reconstruction in orientations orthogonal to the main plane of sectioning (coronal). This illustrates not only the lack of gaps and overlaps between blocks but also the smoothness that is achieved within blocks. This is thanks to the Bayesian refinement algorithm, which combines the best features of (1) methods that align each section independently (high fidelity to the reference, but jagged reconstructions) and (2) methods that align sections to their neighbours (smooth reconstructions, but with a ‘banana effect’: that is, straightening of curved structures).
To quantitatively evaluate the 3D reconstruction accuracy, we used 250 manually placed pairs of landmarks to compute registration errors (50 landmarks per case); landmarks are known to be a better proxy for registration error than label overlap metrics27. Table 1 displays means and standard deviations of the registration error for each of the five cases, comparing our method with our previous pipeline26. Histograms and 3D visualizations of the errors for individual cases can be found in Fig. 2d and in Extended Data Figs. 1d, 2d, 3d and 4d. Our method yields an average error of 0.99 mm (s.d., 0.51 mm; standard error, 0.03 mm), which is a considerable reduction with respect to ref. 26, which yielded 1.44 mm (s.d., 0.58 mm; standard error, 0.04 mm). The difference between the two methods is strongly significant: P values computed with a non-parametric paired Wilcoxon test were under 0.001 for all cases, and the P value for all 250 landmarks was P < 10−21; see details in Table 1. The spatial distribution of the error is further visualized with kernel regression in Extended Data Fig. 5, which shows that this distribution is fairly uniform: that is, there is no obvious consistent pattern across cases.
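The landmark-based evaluation can be sketched as follows, with synthetic displacement residuals standing in for the real landmark pairs (all numbers here are illustrative, not the values reported in Table 1):

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
n = 50  # landmark pairs per case

# Synthetic 3D residual displacements (mm) for the new pipeline, and the
# same landmarks with a systematically larger error standing in for the
# previous pipeline.
err_new = np.linalg.norm(rng.normal(0.0, 0.6, size=(n, 3)), axis=1)
err_old = err_new + np.abs(rng.normal(0.4, 0.2, size=n))

print(f"new: {err_new.mean():.2f} +/- {err_new.std():.2f} mm")
print(f"old: {err_old.mean():.2f} +/- {err_old.std():.2f} mm")

# Non-parametric paired test on per-landmark error magnitudes
stat, p = wilcoxon(err_new, err_old)
print(f"paired Wilcoxon P = {p:.1e}")
```

Because the test is paired and rank-based, it is insensitive to the skewed distribution of error magnitudes, which is why it is preferred over a paired t-test here.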
Our pipeline is widely applicable as it produces accurate 3D reconstructions from blocked tissue in standard-sized cassettes, sectioned with a standard microtome. The computer code and aligned dataset are freely available in our public repository. For educational and data inspection purposes, we have built an online visualization tool for the multimodality data, which is available at https://github-pages.ucl.ac.uk/BrainAtlas.
Supplementary Video 1 illustrates the aligned data, which include (1) MRI at 400-μm isotropic resolution, (2) aligned H&E and LFB histology digitized at 4-μm resolution (with 250-μm or 500-μm spacing, depending on the brain location) and (3) ROI segmentations, obtained with a semi-automated artificial intelligence method28. The ROIs comprise 34 cortical labels (following the Desikan–Killiany atlas29) and 299 subcortical labels (following different atlases for different brain regions; Methods and Supplementary Information). This public dataset enables researchers worldwide to conduct their own studies not only in 3D histology reconstruction but also other fields, such as high-resolution segmentation of MRI or histology30, MRI-to-histology and histological stain-to-stain image translation31, deriving MRI signal models from histology32 and many others.
A next-generation probabilistic atlas of the human brain
The labels from the five human hemispheres were coregistered and merged into a probabilistic atlas. This was achieved with a method that alternately registers the volumes to the estimate of the template and updates the template by means of averaging33. The registration method is diffeomorphic34 to ensure preservation of the neuroanatomic topology (for example, ROIs do not split or disappear in the deformation process). Crucially, we use an initialization based on the MNI template, which serves two important purposes: preventing biases towards any of the cases (which would happen if we initialized with one of them) and ‘centring’ our atlas on a well-established CCF computed from 305 subjects, which largely mitigates our relatively low number of cases. Because the MNI template is a greyscale volume, the first iteration of atlas building uses registrations computed with the ex vivo MRI scans. Subsequent iterations register labels directly with a metric based on the probability of the discrete labels according to the atlas33.
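Once the cases are co-registered, the label-averaging step that yields the probabilistic atlas reduces, conceptually, to a voxel-wise frequency count of the warped discrete label maps. A minimal sketch (ignoring the diffeomorphic registration machinery, and using tiny hypothetical label maps):

```python
import numpy as np

def average_labels(label_maps, n_labels):
    # One-hot encode each co-registered discrete label map and average:
    # the result holds, per voxel, the observed frequency of every label.
    probs = np.zeros(label_maps[0].shape + (n_labels,))
    for lab in label_maps:
        probs += np.eye(n_labels)[lab]
    return probs / len(label_maps)

# Two toy 2x2 'cases' with labels in {0, 1, 2}
maps = [np.array([[0, 1], [1, 2]]), np.array([[0, 1], [2, 2]])]
atlas = average_labels(maps, 3)
print(atlas[1, 0])  # labels 1 and 2 each seen once at this voxel: [0. 0.5 0.5]
```

In the real pipeline this averaging is interleaved with re-registration to the evolving template, so the probabilities sharpen as the alignment improves.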
Figure 3 shows close-ups of orthogonal slices of the atlas, which models voxel-wise probabilities for the 333 ROIs on a 0.2-mm isotropic grid. The resolution and detail of the atlas represent a substantial advance with respect to the SAMSEG atlas35 now in FreeSurfer (Fig. 3a), which covers all brain regions but models only 13 brain ROIs at 1-mm resolution. The figure also shows roughly corresponding slices of the manual labelling of the MNI atlas with the simplified Allen protocol3. Compared with NextBrain, this labelling is not probabilistic and does not include many histological boundaries that are invisible on the MNI template (for example, hippocampal subregions, in violet). For this reason, it has only 138 ROIs, whereas NextBrain has 333.
Fig. 3: NextBrain probabilistic atlas.
a, Comparison with whole brain atlases. Portions of the NextBrain probabilistic atlas (which has 333 ROIs), the SAMSEG atlas in FreeSurfer35 (13 ROIs) and the manual labels of MNI based on the Allen atlas3 (138 ROIs). b, Close-up of three orthogonal slices of NextBrain. The colour coding follows the convention of the Allen atlas3, where the hue indicates the structure (for example, purple is thalamus, violet is hippocampus, green is amygdala) and the saturation is proportional to neuronal density. The colour of each voxel is a weighted sum of the colour corresponding to the ROIs, weighted by the corresponding probabilities at that voxel. The red lines separate ROIs on the basis of the most probable label at each voxel, thus highlighting boundaries between ROIs of similar colour; we note that the jagged boundaries are a common discretization artefact of probabilistic atlases in regions where two or more labels mix continuously: for example, the two layers of the cerebellar cortex.
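The colour-blending convention described in the caption can be sketched directly: each voxel's RGB value is the vector of label probabilities multiplied by a per-label colour table. A hypothetical three-label example (the colour assignments below are illustrative, not the actual Allen table):

```python
import numpy as np

def blend_colours(probs, colour_table):
    # probs: (X, Y, L) label probabilities; colour_table: (L, 3) RGB rows.
    # Each voxel's colour is the probability-weighted sum of label colours.
    return probs @ colour_table

probs = np.array([[[0.5, 0.5, 0.0]]])      # one voxel: half label 0, half label 1
colours = np.array([[255.0, 0.0, 0.0],     # label 0: red (hypothetical)
                    [0.0, 0.0, 255.0],     # label 1: blue
                    [0.0, 255.0, 0.0]])    # label 2: green
print(blend_colours(probs, colours))       # a 50/50 red-blue mix
```

This blending is what produces the smooth colour gradients in regions where labels mix continuously, and it is why the red maximum-probability boundaries are overlaid separately to distinguish ROIs of similar hue.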
A comparison between labelled sections of the printed atlas by ref. 13 and roughly equivalent sections of the Allen reference brain and NextBrain is included in the Supplementary Information. The agreement between the three atlases is generally good, especially for the outer boundaries of the whole structures: for example, the whole hippocampus, amygdala or thalamus. Mild differences can be found in the delineation of substructures, both cortical and subcortical (for example, subdivision of the accumbens), mainly due to (1) the forced choice of applying arbitrary anatomical criteria in both atlases because of lack of contrast in smaller regions, (2) different anatomical definitions and (3) the probabilistic nature of NextBrain. We emphasize that these differences are not exclusive to NextBrain, as they are also present between Mai–Paxinos and Allen.
Close-ups of NextBrain slices centred on representative brain regions are shown in Fig. 3b, with boundaries between the ROIs (computed from the maximum likelihood segmentation) overlaid in red. These highlight the anatomical granularity of the new atlas, with dozens of subregions for areas such as the thalamus, hippocampus, amygdala, midbrain and so on. An overview of the complete atlas is shown in Supplementary Video 2, which illustrates the atlas construction procedure and flies through all the slices in axial, coronal and sagittal view.
The probabilistic atlas is freely available as part of our segmentation module distributed with FreeSurfer. The maximum likelihood and colour-coded probabilistic maps (as in Fig. 3) can also be downloaded separately from our public repository for quick inspection and educational purposes. Developers of neuroimaging methods can freely capitalize on this resource, for example, by extending the atlas through combination with other atlases or manually tracing new labels; or by designing their own segmentation methods using the atlas. Neuroimaging researchers can use the atlas for fine-grained automated segmentation (as shown below) or as a highly detailed CCF for population analyses.
Segmentation of ultra-high-resolution ex vivo MRI
One of the new analyses that NextBrain enables is the automated fine-grained segmentation of ultra-high-resolution ex vivo MRI. Because motion is not a factor in ex vivo imaging, very long MRI scanning times can be used to acquire data at resolutions that are infeasible in vivo. One example is the publicly available 100-μm isotropic whole brain presented in ref. 12, which was acquired in a 100-hour session on a 7-T MRI scanner. Such datasets have huge potential in mesoscopic studies connecting microscopy with in vivo imaging36.
Volumetric segmentation of ultra-high-resolution ex vivo MRI can be highly advantageous in neuroimaging in two different ways: first, by supplementing such scans (like the 100-micron brain) with neuroanatomical information that augments their value as atlases (for example, as CCFs or for segmentation purposes37); and second, by enabling analyses of ex vivo MRI datasets at scale (for example, volumetry or shape analysis).
Dense manual segmentation of these datasets is practically infeasible, as it entails manually tracing ROIs on over 1,000 slices. Moreover, one typically seeks to label these images at a higher level of detail than in vivo (that is, more ROIs of smaller sizes), which exacerbates the problem. One may use semi-automated methods like the artificial-intelligence-assisted technique we used to build NextBrain (see the previous section), which limits the manual segmentation to one in every N slices28 (N = 4 in this work). However, such a strategy only ameliorates the problem to a certain degree, as tedious manual segmentation is still required for a significant fraction of slices.
A more appealing alternative is thus automated segmentation. However, existing approaches have limitations, as they either (1) were designed for 1-mm in vivo scans and do not capitalize on the increased resolution of ex vivo MRI18,35 or (2) use neural networks trained with ex vivo scans but with a limited number of ROIs because of the immense labelling effort that is required to generate the training data30.
This limitation is circumvented by NextBrain: as a probabilistic atlas of neuroanatomy, it can be combined with well-established Bayesian segmentation methods (which are adaptive to MRI contrast) to segment ultra-high-resolution ex vivo MRI scans into 333 ROIs. We have released in FreeSurfer an implementation that segments full brain scans in about 1 h, using a desktop equipped with a graphics processing unit.
To quantitatively evaluate the segmentation method, we have created a gold standard segmentation of the public 100-micron brain12, which we are publicly releasing as part of NextBrain. To make this burdensome task practical and feasible, we simplified it in five ways: (1) downsampling the data to 200-μm resolution, (2) labelling only one hemisphere, (3) using the same semi-automated artificial intelligence method as in NextBrain for faster segmentation, (4) using FreeSurfer to automatically subdivide the cerebral cortex and (5) labelling only a subset of 98 visible ROIs (Supplementary Videos 3 and 4). Even with these simplifications, labelling the scan took more than 100 h of manual tracing effort.
We compared the gold standard labels with the automated segmentations produced by NextBrain using Dice overlap scores. Because the gold standard has fewer ROIs (particularly in the brainstem), we (1) clustered the ROIs in the automated segmentation that correspond with the ROIs in the gold standard and (2) used a version of NextBrain in which the brainstem ROIs are simplified to better match those of the gold standard (with 264 labels instead of 333). The results are shown in Extended Data Table 1. As expected, there is a clear link between size and Dice. Larger ROIs like the cerebral white matter or cortex have Dice around 0.9. The smaller ROIs have lower Dice, but very few are below 0.4—which is enough to localize ROIs. We note that the median Dice (0.667) is comparable with that reported by other Bayesian segmentation methods for brain subregions38.
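The Dice comparison against the coarser gold standard requires first clustering the fine-grained automated labels into the gold-standard protocol. A toy sketch (the label values and cluster mapping below are hypothetical):

```python
import numpy as np

def dice(a, b):
    # Dice overlap between two binary masks
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def clustered_dice(auto, gold, cluster):
    # Map fine-grained automated labels onto the coarser gold-standard
    # protocol, then compute Dice per gold-standard ROI (0 = background).
    remapped = np.vectorize(cluster.get)(auto)
    return {int(roi): dice(remapped == roi, gold == roi)
            for roi in np.unique(gold) if roi != 0}

auto = np.array([[1, 2], [2, 3]])        # hypothetical fine labels
gold = np.array([[10, 10], [10, 20]])    # hypothetical coarse gold labels
cluster = {1: 10, 2: 10, 3: 20}          # fine -> coarse mapping
print(clustered_dice(auto, gold, cluster))  # both ROIs match exactly here
```

Clustering before scoring ensures that disagreements in protocol granularity (for example, in the brainstem) are not mistaken for segmentation errors.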
Sample slices and their corresponding automated and manual segmentations are shown in Fig. 4. The exquisite resolution and contrast of the dataset enables our atlas to accurately delineate a large number of ROIs with very different sizes, including small nuclei and subregions of the hippocampus, amygdala, thalamus, hypothalamus, midbrain and so on. Differences in label granularity aside, the consistency between the automated and gold standard segmentation is qualitatively very strong.
Fig. 4: NextBrain segmentation of ultra-high-resolution MRI.
Automated Bayesian segmentation of publicly available ultra-high-resolution ex vivo brain MRI12 using the simplified version of NextBrain, and comparison with the gold standard (only available for the right hemisphere). We show two coronal, sagittal and axial slices. The MRI was resampled to 200-μm isotropic resolution for processing. As in previous figures, the segmentation uses the Allen colour map3 with boundaries overlaid in red. We note that the manual segmentation uses a coarser labelling protocol.
This is a highly comprehensive dense segmentation of a human brain MRI scan. As ex vivo datasets with tens of scans become available30,39 (for example, https://dandiarchive.org/dandiset/000026), our tool has great potential in augmenting mesoscopic studies of the human brain. Moreover, the labelled MRI that we are releasing is also valuable for other neuroimaging studies, for example, for training or evaluating segmentation algorithms; for ROI analysis in the high-resolution ex vivo space; or for volumetric analysis by means of registration-based segmentation.
Fine-grained analysis of in vivo MRI
NextBrain can also be used to automatically segment in vivo MRI scans at the resolution of the atlas (200-μm isotropic), yielding an extremely high level of detail. Scans used in research typically have isotropic resolution with voxel sizes ranging from 0.7 mm to 1.2 mm and therefore do not show all ROI boundaries with as much detail as ultra-high-resolution ex vivo MRI. However, many boundaries are still visible, including the external boundaries of brain structures (hippocampus, thalamus and so on) and some internal boundaries: for example, between the anteromedial and lateral posterior thalamus40. Bayesian segmentation capitalizes on these visible boundaries and combines them with the prior knowledge encoded in the atlas to produce the full subdivision—albeit with lower reliability for the indistinct boundaries10. A sample segmentation is shown in Fig. 1i.
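The core idea of Bayesian segmentation, combining the atlas prior with an intensity likelihood, can be illustrated at a single voxel. This sketch uses a fixed Gaussian likelihood per label; in practice the Gaussian parameters are estimated from the scan itself, which is what makes the method adaptive to MRI contrast (all numbers below are hypothetical):

```python
import numpy as np
from scipy.stats import norm

def voxel_posterior(intensity, prior, means, stds):
    # Toy per-voxel Bayesian segmentation: posterior is proportional to
    # likelihood times atlas prior. prior: atlas probabilities at this
    # voxel; means/stds: per-label Gaussian intensity model.
    likelihood = norm.pdf(intensity, loc=means, scale=stds)
    post = likelihood * prior
    return post / post.sum()

# A voxel whose intensity fits label 1 better, while the atlas favours label 0:
post = voxel_posterior(intensity=80.0,
                       prior=np.array([0.7, 0.3]),
                       means=np.array([100.0, 80.0]),
                       stds=np.array([15.0, 15.0]))
print(post)  # the visible intensity boundary pulls probability towards label 1
```

Where a boundary is invisible (identical intensity statistics on both sides), the likelihood is uninformative and the posterior falls back on the atlas prior, which is exactly the behaviour described above for indistinct internal boundaries.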
Evaluation of segmentation accuracy
We evaluated the in vivo segmentation quantitatively in two experiments. First, we downsampled the ex vivo MRI scan from the previous section to 1-mm isotropic resolution (that is, the standard resolution of in vivo scans), segmented it at 200-μm resolution and computed Dice scores with the high-resolution reference. The results are displayed in Extended Data Table 1. The median Dice is 0.590, which is 0.077 lower than at 200 μm but still fair for such small ROIs38. Moreover, most Dice scores remain over 0.4, as for the ultra-high resolution, hinting that the priors can successfully provide a rough localization of internal boundaries, given the more visible external boundaries.
In a second experiment, we analysed the Dice scores produced by NextBrain in OpenBHB41, a public meta-dataset with roughly 1-mm isotropic T1-weighted scans of more than 3,000 healthy individuals acquired at more than 60 sites. Using FreeSurfer 7.0 as a silver standard, we computed Dice scores for our segmentations at the level of whole regions: that is, the level of granularity provided by FreeSurfer. Although these scores cannot assess segmentation accuracy at the subregion level, they do enable evaluation on a much larger multisite cohort, as well as comparison with the Allen MNI template—the only competing histological (or rather, histology-inspired) atlas that can segment the whole brain in vivo. The results (Extended Data Fig. 6) show that (1) NextBrain consistently outperforms the Allen MNI template, as expected from the fact that one atlas is probabilistic whereas the other is not; (2) NextBrain yields Dice scores in the range expected from Bayesian segmentation methods35—despite using only five cases, thanks to the excellent generalization ability of generative models42; and (3) despite being built from a set of older subjects, our mitigation strategy (anchoring NextBrain on MNI and using highly generalizable Bayesian segmentation) enables NextBrain to produce segmentations that are consistently accurate throughout the lifespan, as opposed to the Allen MNI template, which has a strong negative correlation between age and performance: r = −0.274, P < 10−55, compared with NextBrain (r = 0.046, P = 0.009). Please see Extended Data Fig. 6b,c for further details.
Application to Alzheimer’s disease classification
To further compare NextBrain with the Allen MNI template, we used an Alzheimer’s disease classification task based on linear discriminant analysis (LDA) of ROI volumes (corrected by age and intracranial volume). Using a simple linear classifier on a task where strong differences are expected allows us to use classification accuracy as a proxy for the quality of the input features: that is, the ROI volumes derived from the automated segmentations. To enable direct comparison, we used a sample of 383 subjects from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset43 (168 Alzheimer’s disease, 215 controls) that we used in previous publications10,11,40.
Using the ROI volumes estimated by FreeSurfer 7.0 (which do not include subregions) yields an area under the receiver operating characteristic curve (AUROC) equal to 0.911, with classification accuracy of 85.4% at its elbow. The Allen MNI template exploits subregion information to achieve AUROC = 0.929 and 86.9% accuracy. The increased segmentation accuracy and granularity of NextBrain enables it to achieve AUROC = 0.953 and 90.3% accuracy—with a significant increase in AUROC with respect to the Allen MNI template (P = 0.01 for a DeLong test). This AUROC is also higher than those of specific ex vivo atlases we have presented in the previous work10,11,40—which range from 0.830 to 0.931.
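The classification experiment can be sketched with synthetic ROI volumes. Under a shared identity covariance, LDA reduces to projecting onto the difference of class means, and AUROC can be computed via the rank-sum identity; all numbers below are illustrative, not the ADNI results:

```python
import numpy as np

def auroc(scores, labels):
    # AUROC via the rank-sum (Mann-Whitney U) identity
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)
y = np.r_[np.ones(168), np.zeros(215)].astype(int)  # 168 AD, 215 controls
X = rng.normal(size=(383, 20))                      # corrected ROI volumes (synthetic)
X[y == 1, :5] -= 0.8                                # simulated atrophy in 5 ROIs

# LDA with a shared identity covariance: project onto the difference of means
w = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
print(f"AUROC = {auroc(X @ w, y):.3f}")
```

The intuition carried over from the paper is that better and finer-grained segmentations yield more discriminative volume features, which shows up directly as a higher AUROC for the same simple linear classifier.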
Application to fine-grained signature of ageing
We performed Bayesian segmentation with NextBrain on 705 subjects (aged 36–90, mean 59.6 years) from the Ageing HCP dataset44, which comprises high-quality in vivo scans at 0.8-mm resolution. We computed the volumes of the ROIs for every subject, corrected them for total intracranial volume (by division) and sex (by regression) and computed their Spearman correlation with age. We used the Spearman rather than Pearson correlation because, being rank-based, it is a better model for ageing trajectories as they are known to be nonlinear for wide age ranges45,46.
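The volume-correction and correlation pipeline can be sketched as follows, with a synthetic ROI volume that shrinks nonlinearly with age (all effect sizes are hypothetical):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n = 705
age = rng.uniform(36, 90, n)
sex = rng.integers(0, 2, n).astype(float)
icv = rng.normal(1500.0, 120.0, n)           # intracranial volume (ml)

# Synthetic ROI volume with ICV, sex and a nonlinear (sqrt) age effect
vol = 0.004 * icv + 0.1 * sex - 0.1 * np.sqrt(age - 35) + rng.normal(0, 0.05, n)

v = vol / icv                                   # correct for ICV by division
v = v - np.polyval(np.polyfit(sex, v, 1), sex)  # regress out sex
rho, p = spearmanr(v, age)
print(f"Spearman rho = {rho:.2f}, P = {p:.1e}")
```

Because the simulated trajectory is monotone but nonlinear, the rank-based Spearman correlation recovers the ageing effect cleanly, which is the motivation given above for preferring it over the Pearson correlation.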
The result of this analysis is a highly comprehensive map of regional ageing of the human brain (Fig. 5a and Extended Data Fig. 7a; see also full trajectories for select ROIs in Extended Data Fig. 8). Cortically, we found significant negative correlations with age in the prefrontal cortex (marked with ‘a’ in Fig. 5a) and insula (b), whereas the temporal (c) and parahippocampal cortices (d) did not yield significant correlation; this is consistent with findings from studies of cortical thickness47,48.