Introduction
Recent advances in artificial intelligence, particularly in deep learning, have provided unprecedented opportunities for studying brain function and neurodegenerative disease. Within this context, convolutional neural networks (CNNs) have emerged as especially valuable tools for exploring visual processing, as they share remarkable architectural and functional similarities with the biological visual system in humans1,2,3,4. Like the hierarchical organization of the ventral visual stream5, CNNs process visual information through successive layers of increasing abstraction, essentially transforming low-level visual features into complex object representations. These parallels extend from early visual processing, where CNN filters mirror the receptive fields found in V1, to higher-order visual areas, where both systems develop specialized feature detectors for complex objects like faces and places6,7.
The similarities between artificial and biological visual systems extend beyond architectural resemblance. Previous studies have, for example, shown that CNNs trained on natural images develop internal representations that correlate strongly with neural activity patterns throughout the primate visual hierarchy8,9,10. This brain-model correspondence has been demonstrated through various methodologies, including representational similarity analysis11 and neural predictivity studies, in which CNN activations successfully predict neural responses in areas V4 and IT12,13.
These parallels between CNNs and the biological visual system have supported the establishment of artificial neural networks as powerful in silico model systems, not only for studying normal visual processing but also for studying its pathologies14,15. More precisely, by manipulating network architectures and training regimes and by introducing controlled perturbations, it is now possible to probe fundamental questions about visual computation that would be difficult to address in real biological systems. Of particular interest within this context is the simulation of posterior cortical atrophy (PCA), a distinct form of neurodegeneration characterized by progressive deterioration of higher-order visual processing functions16. PCA, often referred to as the visual variant of Alzheimer’s disease, is a condition marked by atrophy of the posterior brain regions17. This pattern of neurodegeneration leads to a characteristic onset of visual processing deficits: patients initially experience difficulties with complex visual tasks, followed by progressive impairment of more basic visual functions18. The preservation of memory and insight during the early stages of PCA, combined with progressive visual dysfunction, creates a unique opportunity for studying targeted cognitive interventions in an in silico PCA model.
Several recent works15,19,20 have successfully simulated PCA in CNNs, each modeling different key aspects of the disease’s progression. The success of these models in replicating disease presentation (e.g., synaptic atrophy and cognitive decline) suggests that CNNs can indeed serve as valuable computational models for studying both basic PCA disease mechanisms and potential interventions. In this work, we investigate whether computational models of progressive neurodegeneration can provide insights into the effectiveness of different cognitive intervention strategies. Through systematic evaluation of various retraining approaches in an in silico CNN-based model of PCA, we aim to understand fundamental principles that might inform the development of therapeutic interventions for preserving visual cognition in patients with PCA.
While existing literature provides some insights into cognitive intervention approaches for early-onset Alzheimer’s disease or mild cognitive impairment, there remains a significant gap in our understanding of their fundamental efficacy in delaying functional visual decline21,22,23,24. This is especially true for PCA, as the time to diagnosis is often delayed by misdiagnoses of PCA as an ocular condition18. Although many studies have been conducted in humans, only minimal research provides foundational evidence for the potential of cognitive therapy techniques. Current cognitive interventions for patients with neurodegenerative diseases typically focus on compensatory strategies and environmental modifications rather than direct remediation of visual processing deficits25. While some literature provides insights into precision-based cognitive therapy26,27, we believe that our in silico model of disease progression may provide an avenue to both streamline and discover more effective cognitive interventions, specifically for patients with PCA.
In this work, we used a compressed VGG19 deep learning model trained on CIFAR100 for object recognition as an in silico model of the intact visual cortex. The function of the intact network is to correctly identify what object is depicted in a 32 × 32 pixel image from a predefined list of 100 objects (CIFAR100). Next, we simulate atrophy in the model by decaying and freezing synaptic weights following the methodology described by Moore et al.28. The novelty of the present work comes from the investigation of three distinct retraining strategies designed to preserve visual cognition and function. To perform this analysis, an intervention step is interleaved with the model injury step, where the model is retrained on selected images according to one of three specific retraining strategies. These strategies are inspired by both clinical cognitive intervention approaches (accuracy-based retraining) and machine learning principles (entropy-based retraining). The retraining strategies explored in this work are outlined as follows, visualized in Fig. 1, and illustrated in a short code sketch after the list:
1. The first strategy provides a baseline for comparison: retraining data are randomly sampled from a dataset that was held out from initial model training. We test different parameter combinations by adjusting the number of retraining images.
2. The second strategy adopts a more targeted approach, focusing on maintaining high object classification accuracy across all classes by preferentially sampling retraining data from object categories where performance has degraded due to injury. In this strategy, class-wise accuracies are assessed after each progression of disease. Retraining data are then sampled according to the inverse distribution of class accuracy, so that object classes on which the model performs worse are more prevalent in the retraining data.
3. Our final strategy draws from active learning principles and uses entropy-based uncertainty measures to identify and address areas of model confusion. In humans, a surrogate of this “uncertainty” could, for example, be the delay in completing a classification or recognition task. This retraining strategy aims to maintain low entropy, or uncertainty, in the model. To do so, after each disease step, the entropy of the model’s predictions is analyzed to identify which samples in the training data are associated with the highest levels of uncertainty. The retraining set is then curated with held-out images most similar to these uncertain samples.
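For concreteness, the following minimal Python sketch illustrates how the three data-selection rules above could be implemented. It is not the authors’ released code: the function names, the class-balanced sampling in strategy 1, and the pixel-space MSE matching in strategy 3 are illustrative assumptions based on the descriptions above and in Fig. 1.

```python
# Illustrative sketch of the three retraining-data selection strategies (assumed
# implementation; variable and function names are hypothetical).
import numpy as np
import torch
import torch.nn.functional as F

def select_random(holdout_labels, n_images, n_classes=100, rng=None):
    """Strategy 1: a class-balanced random sample from the held-out pool."""
    rng = rng or np.random.default_rng(0)
    per_class = n_images // n_classes
    idx = []
    for c in range(n_classes):
        pool = np.where(holdout_labels == c)[0]
        idx.append(rng.choice(pool, size=min(per_class, len(pool)), replace=False))
    return np.concatenate(idx)

def select_by_accuracy(class_accuracies, holdout_labels, n_images, rng=None):
    """Strategy 2: sample classes inversely proportional to their current accuracy."""
    rng = rng or np.random.default_rng(0)
    inv = 1.0 - np.asarray(class_accuracies) + 1e-8   # worse classes -> larger weight
    probs = inv / inv.sum()
    counts = rng.multinomial(n_images, probs)         # images requested per class
    idx = []
    for c, k in enumerate(counts):
        pool = np.where(holdout_labels == c)[0]
        idx.append(rng.choice(pool, size=min(k, len(pool)), replace=False))
    return np.concatenate(idx)

def select_by_entropy(model, train_images, holdout_images, n_images, device="cpu"):
    """Strategy 3: find the most 'uncertain' training images (highest softmax entropy,
    H = -sum p log p) and retrieve their nearest held-out images by pixel-wise MSE."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(train_images.to(device)), dim=1)
        entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=1)
    uncertain = torch.topk(entropy, k=n_images).indices.cpu()
    flat_holdout = holdout_images.flatten(1)
    chosen = []
    for i in uncertain:                               # nearest held-out image per uncertain sample
        d = ((flat_holdout - train_images[i].flatten()) ** 2).mean(dim=1)
        chosen.append(int(torch.argmin(d)))
    return sorted(set(chosen))                        # duplicates collapse, so the set may be smaller
```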
Fig. 1: Pipeline of model degeneration and the three retraining techniques.
Beginning with an intact network, a group of synaptic weights is decayed and frozen. Then, the model is evaluated on both object recognition accuracy and the entropy of the softmax probabilities. Next, the remaining healthy synapses of the model are retrained according to retraining schema 1, 2, or 3. The retrained model is once again evaluated on the same metrics, and the process is repeated. For the first retraining strategy (randomly selected data, or the null hypothesis), the healthy synapses of the model are retrained on a randomly selected, balanced subset of held-out data (previously unseen during training). For the class-wise accuracy retraining (2), the model is retrained on a held-out set of data curated such that the number of examples per class is inversely proportional to the model’s class-wise accuracy. Finally, during the entropy-based retraining (3), the model is retrained on held-out data that is closest (lowest mean squared error (MSE)) to the images that exhibit the highest uncertainty at inference time.
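As a rough outline of the decay-retrain cycle shown in Fig. 1, the sketch below zeroes out an additional fraction of each layer’s weights at every step and excludes them from subsequent gradient updates. The exact decay rule of Moore et al.28 may differ (e.g., in how the decayed weights are chosen and scaled), so this is an assumption-laden illustration rather than the actual implementation.

```python
# Minimal sketch of one decay -> retrain cycle (assumed decay rule: a random 10% of all
# weights is set to zero and frozen at each step, purely for illustration).
import torch

def decay_and_freeze(model, frozen_masks, fraction=0.10, generator=None):
    """Zero out an additional `fraction` of each weight tensor and mark those entries frozen."""
    for name, param in model.named_parameters():
        if "weight" not in name:
            continue
        mask = frozen_masks.setdefault(name, torch.zeros_like(param, dtype=torch.bool))
        candidates = torch.nonzero(~mask.flatten()).squeeze(1)      # still-healthy synapses
        n_new = int(fraction * param.numel())
        pick = candidates[torch.randperm(len(candidates), generator=generator)[:n_new]]
        mask.view(-1)[pick] = True
        with torch.no_grad():
            param.view(-1)[pick] = 0.0                              # "atrophied" synapses
    return frozen_masks

def retrain_step(model, frozen_masks, batch, loss_fn, optimizer):
    """One retraining step in which frozen (decayed) synapses receive no updates."""
    images, labels = batch
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    for name, param in model.named_parameters():
        if name in frozen_masks and param.grad is not None:
            param.grad[frozen_masks[name]] = 0.0                    # block gradients to frozen weights
    optimizer.step()
    with torch.no_grad():                                           # momentum/weight decay could still
        for name, param in model.named_parameters():                # nudge frozen weights; re-zero them
            if name in frozen_masks:
                param[frozen_masks[name]] = 0.0
    return loss.item()
```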
These strategies represent different theoretical approaches to cognitive interventions, each with distinct implications for clinical practice. The random sampling approach serves as a control condition, or null hypothesis, analogous to general cognitive stimulation without specific targeting or intervention focus. The accuracy-based strategy mirrors targeted cognitive training programs that focus on strengthening specific weakened abilities29. The entropy-based approach represents a potential program that focuses on diminishing confusion between objects25. Testing these targeted strategies in silico could inform the development of adaptive cognitive intervention protocols that respond to individual patterns of cognitive decline and enhance desired patient outcomes.
To evaluate these approaches, we employ both traditional metrics of machine learning model performance (object recognition accuracy) and more recently proposed geometric analyses of the network’s internal representations. For the geometric analyses, we use the object manifold framework previously established30,31 and examine how the geometry and dimensionality of these manifolds evolve under the combined influences of network damage and rehabilitation. We hypothesize that these geometric changes serve as meaningful indicators of cognitive changes induced by progressive disease load, potentially providing insights into the mechanistic basis of both deterioration and cognitive intervention in visual processing networks (Fig. 2).
Fig. 2: Schematic illustration of the geometrical changes to object manifolds within healthy, diseased, and rehabilitated neural networks.
A In the healthy model, object categories (e.g., flowers, cars, trees) maintain linear separability in the representational space, enabling accurate classification. B Following simulated neurodegeneration, object manifolds exhibit increased dimensionality and reduced linear separability. This inseparability of manifold structure is associated with impaired classification performance, as depicted by the model’s confusion between the flower and tree categories. C Strategic retraining interventions help to restore the geometric properties of object manifolds that lead to easier separability, partially recovering distinct categorical representations and reducing the dimensionality. This geometric restoration correlates with improved classification performance, suggesting a potential computational mechanism for cognitive rehabilitation in neurodegenerative conditions.
Results
Differences in accuracy with different retraining schemas
Following the methods described in Moore et al.28, the CNN model was evaluated over N = 10 iterations of synaptic decay and retraining. Quantitative results were analyzed at each iteration of decay and retraining. The results of the retrained model are reported at each 10% interval of synaptic decay, while the decayed model results are displayed as the intermediate points between each iteration (Fig. 3). The initial, intact model (a compressed VGG19 for image/object classification) was pretrained on ImageNet and then fine-tuned on CIFAR100, where it achieved an accuracy of 69.5% in object classification. The implementation of progressive synaptic decay in our CNN model revealed distinct patterns of accuracy degradation across the different retraining strategies. Figure 3 presents the comparative model accuracy trajectories of the three retraining/intervention approaches: randomly selected data (null hypothesis), accuracy-based retraining, and entropy-based retraining, evaluated across varying levels of synaptic deterioration and retraining dataset sizes (500, 1000, and 3000 images). For 500 retraining images (Fig. 3A), accuracy-based retraining demonstrated superior performance compared to the null- and entropy-based retraining regimes between 10 and 60% of the synapses being decayed. Statistical analysis revealed a significant benefit of accuracy-based retraining (p < 0.05) compared with the other two methods, particularly in the intermediate stages of decay (10–40%; p = [4.29E−2, 2.61E−6, 1.09E−5, 1.22E−5] between accuracy-based and null retraining at each retrained point), although it also statistically outperformed entropy-based retraining up to 60% decay (p = [1.30E−6, 9.52E−8, 2.67E−5, 1.87E−4, 4.53E−5, 9.32E−5] for each retrained point between 10 and 60% decay). After 70% decay, the model was generally unable to recover any meaningful function with retraining. Figure 3B shows the results for 1000 retraining images across the three strategies, which follow a similar trend: accuracy-based retraining generally outperformed the null- and entropy-based strategies. Additionally, we found that the null retraining strategy outperformed the entropy-based strategy by a statistically significant margin at multiple progressive iterations of synaptic decay, including the retrained model states at 20, 30, 50, and 60% (p = [2.89E−2, 1.27E−3, 3.82E−2, 1.70E−4]). This phenomenon was also replicated for the 3000-image dataset. For nearly all retrained model states (10–70% decay), accuracy- and null-based retraining allowed the model to retain significantly higher levels of accuracy than the entropy-based regime. The differences between the three strategies diminished when the retraining subset contained a larger proportion of the total images available in the holdout set.
Fig. 3: Model accuracy as network synapses are progressively decayed and retrained.
Results are reported as the average over ten different seeds ±SD. Model performance is evaluated in a cyclic pattern: after each 10% increment of synaptic decay is applied, we first measure the model’s degraded performance (shown in between the 10% increments), then conduct retraining, and measure the retrained performance. Thus, the reported metrics show alternating patterns of decline (immediately after decay) and potential recovery (following retraining) throughout disease progression. The accuracies of the three different retraining strategies are shown for varying amounts of retraining data. A Accuracy-based retraining consistently outperforms the null hypothesis and entropy-based retraining when using a pool of 500 retraining images. B Accuracy-based retraining continues to retain the highest levels of accuracy under degeneration with 1000 retraining images. C With 3000 retraining images, accuracy-based retraining and the null hypothesis (randomly selected images) converge to similar levels of accuracy, but both consistently outperform the entropy-based retraining.
Generally, the impact of the size of the retraining dataset manifested primarily in the stability of the performance trajectories rather than in absolute accuracy levels. While larger retraining datasets (3000 images) produced smoother degradation curves, the fundamental patterns of relative strategy effectiveness remained consistent across all dataset sizes. Statistical significance of performance differences between all strategies (indicated by asterisks in Fig. 3) is notable in Fig. 3A, B (500 and 1000 retraining images), particularly in the critical 30–60% decay range. This performance trend persists in Fig. 3C (3000 retraining images), although significant differences between the null- and accuracy-based retraining diminish.
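The specific statistical test behind the significance markers is not stated in this excerpt; purely as an illustration, the sketch below compares two strategies’ per-seed retrained accuracies at each decay level with a two-sided independent t-test (an assumed choice of test).

```python
# Hedged sketch of per-decay-level significance testing between two retraining strategies.
import numpy as np
from scipy.stats import ttest_ind

def compare_strategies(acc_a, acc_b, alpha=0.05):
    """acc_a, acc_b: arrays of shape (n_seeds, n_decay_levels) with retrained accuracies."""
    acc_a, acc_b = np.asarray(acc_a), np.asarray(acc_b)
    pvals = np.array([ttest_ind(acc_a[:, j], acc_b[:, j]).pvalue
                      for j in range(acc_a.shape[1])])
    return pvals, pvals < alpha   # p-value and significance flag per decay level
```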
Object manifold geometries within decaying networks
Additionally, we investigated the geometric properties of object manifolds during progressive neurodegeneration. Following the theoretical framework established by Chung et al.32, we examined three key geometric metrics—capacity, correlation, and dimensionality—across different layers of the network under varying levels of synaptic decay (Fig. 4). These properties were specifically investigated because the same analysis can, in principle, be extended and translated to biological neural activity, providing a unifying framework for examining representation and information processing in both biological and in silico systems. Within this context, some previous work analyzing neural data and deep language models has found similar and predictive geometric properties between the two33. Therefore, geometric properties of manifolds arising from neural data (e.g., functional magnetic resonance imaging (fMRI), electroencephalography) could serve as valuable biomarkers that can be modeled computationally. While our geometric results represent a single experimental seed due to the computational constraints of storing weights across multiple retraining iterations, they provide valuable insights into the geometric transformations of neural representations under the degeneration and rehabilitation scenarios.
Fig. 4: Geometric analysis of object manifolds in all convolutional and linear layers at various levels of network decay and retraining.
The top row displays the capacity of the model, or the linear separability of the manifolds through the layers as the networks are decayed. Accuracy-based retraining shows higher levels of model capacity in deep layers for synaptic decay levels of 20–60%. The middle row displays the correlation between manifolds. As the disease progresses, correlations increase in all layers. The boundaries between manifolds become less separable and are, therefore, more highly correlated between distinct objects. The final row depicts the dimensionality of manifolds. Accuracy-based retraining does marginally better at dimensionality reduction in later layers.
Capacity, which quantifies the network’s ability to maintain separable object representations, revealed distinct patterns across retraining strategies. In all three conditions (null, accuracy-based, and entropy-based retraining), capacity showed a characteristic increase in deeper layers of the network, particularly after layer 20 (layer_20_Conv2d in Fig. 4). However, the magnitude of this increase varied with synaptic decay levels. At 20% decay, all strategies maintained relatively similar capacity profiles, but marked divergences emerged at higher decay levels (40–80%). Notably, accuracy-based retraining demonstrated higher levels of capacity in deeper layers even under severe degeneration (60–80% decay), suggesting more robust maintenance of object discriminability.
Correlation patterns across layers provided insights into the network’s representational similarity between object manifolds. All retraining strategies resulted in a general trend of slightly increasing correlations in deeper layers, consistent with the emergence of more abstract object representations. The accuracy-based retraining strategy showed a particularly interesting behavior, maintaining lower correlation values in later layers under 20–60% synaptic decay compared to null-based retraining. Typically, intact “healthy” networks tend to decorrelate manifolds along the hierarchy of layers. This has been interpreted as an “improvement of neural code for objects”30 that assists network capacity. As more damage is incurred, the model fails to decorrelate and relies more on correlations between manifolds to represent objects, especially in the case of null-based retraining.
The dimensionality analysis revealed further differences between retraining strategies, particularly in how they affect representational compression in deeper network layers. During moderate levels of synaptic decay (20–40%), accuracy-based retraining demonstrated a slightly more pronounced pattern of dimensionality reduction in deep layers compared to both the null and entropy-based approaches. This enhanced compression aligns with theoretical principles previously established32, where optimal object recognition is associated with progressive dimensionality reduction through the processing hierarchy.
The accuracy-based retraining strategy appears to better maintain this beneficial compression characteristic, particularly in layers beyond layer 22 (layer_22_Conv2d in Fig. 4). This finding is especially notable when considered alongside the capacity metrics, where accuracy-based retraining also showed superior performance. This suggests that accuracy-based retraining better preserves the network’s ability to form compact, discriminative representations of object categories despite ongoing synaptic degradation. Taken together, the accuracy-based retraining technique shows superior performance in terms of accuracy and geometric properties of object manifolds.
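The capacity, correlation, and dimensionality values in Fig. 4 follow the replica mean-field manifold framework of Chung et al.32. As a simplified, hedged stand-in, the snippet below computes two related proxies (participation-ratio dimensionality of a class manifold and the mean correlation between class centroids) from layer activations grouped by class; it is not the replica-theoretic calculation and it omits capacity.

```python
# Simplified proxies for manifold dimensionality and center correlation (not the replica
# mean-field quantities used in the paper; for illustration only).
import numpy as np

def manifold_dimensionality(acts):
    """Participation ratio of one class manifold: (sum lambda_i)^2 / sum lambda_i^2,
    where lambda_i are eigenvalues of the within-class covariance."""
    centered = acts - acts.mean(axis=0, keepdims=True)           # acts: (n_stimuli, n_features)
    eig = np.linalg.svd(centered, compute_uv=False) ** 2 / max(len(acts) - 1, 1)
    return eig.sum() ** 2 / (np.square(eig).sum() + 1e-12)

def center_correlations(acts_by_class):
    """Mean absolute correlation between class-manifold centroids."""
    centers = np.stack([a.mean(axis=0) for a in acts_by_class])  # (n_classes, n_features)
    corr = np.corrcoef(centers)
    off_diag = corr[~np.eye(len(centers), dtype=bool)]
    return np.abs(off_diag).mean()
```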
Discussion
While cognitive interventions and occupational therapy approaches have been widely studied in neurodegenerative conditions, empirical and analytical evidence for their efficacy has remained limited34. Our in silico investigation provides novel insights into how targeted cognitive training could be optimized and personalized in the future, particularly through the lens of representational dynamics in neural networks. The geometric principles that we observed in our computational model have parallels in biological neural recordings—similar manifold analyses can be performed on population-level neural activity measured through techniques, such as fMRI and multi-electrode recordings35. The superior performance of accuracy-based retraining across multiple computational metrics suggests that targeted intervention strategies, focused on strengthening specifically degraded capabilities, may offer advantages over more general approaches to cognitive rehabilitation. These computational findings could guide the development of biomarkers based on neural population geometry, potentially enabling more precise monitoring of disease progression and therapeutic effectiveness through neuroimaging and electrophysiological measurements. This is especially evident in our geometric analyses, where accuracy-based retraining helped to maintain optimal manifold properties, namely, the maintenance of linear separability between object categories, while preserving beneficial dimensionality reduction in deeper network layers (Fig. 4).
Although the different retraining strategies led to significantly different effects, the modest size of these differences, particularly the surprisingly robust performance of the null approach (random sampling), reflects important principles with biological parallels. More precisely, the random sampling strategy’s effectiveness demonstrates that even non-targeted cognitive stimulation can provide substantial benefits during neurodegeneration. This finding also aligns with clinical observations that general cognitive engagement can help maintain function in neurodegenerative conditions36. The random sampling approach still provides the network with diverse visual experiences that help to maintain function through synaptic plasticity in undamaged regions of the visual system. The limited differences between strategies may also reflect fundamental constraints of our retraining protocol, including that all strategies sample from the same held-out pool of images, which potentially limits the possibility of substantial differences between the approaches. In particular, since this held-out pool contains only 5000 images, the sampled retraining sets overlap increasingly as more images are used for retraining. As synaptic damage becomes severe (>60% decay), all strategies converge toward poor performance, reflecting the biological reality that there are limits to compensation when neurodegeneration becomes extensive. The modeling results suggest an “intervention window” during moderate degeneration (20–60% decay) where targeted interventions show the clearest benefits.
Our work also explored the relationship between dimensionality reduction and computational efficiency. This relationship reflects fundamental principles of neural information processing, where successive layers should transform high-dimensional sensory inputs into more compact, semantically meaningful representations. The fact that accuracy-based retraining maintains this characteristic under moderate degeneration (20–40% decay) suggests that it may better preserve the network’s ability to form efficient, discriminative representations. This preservation of representational efficiency is crucial, as it indicates that the network maintains its capacity to extract and organize relevant visual features despite ongoing neural degradation. Such maintained organizational principles could parallel the mechanisms by which successful cognitive interventions help patients retain functional visual processing capabilities even as neurodegeneration progresses. The overall increase in manifold dimensionality that we uncovered in the degenerating networks is corroborated by work studying the dimensionality and noise of neural population responses in biologically degenerating systems36,37,38. Recent computational studies of Alzheimer’s disease have found that affected brains show altered low-dimensional manifold structure and, thus, compromised information processing39. These results suggest that as degeneration occurs, compressed object representations become noisier and, thus, less discriminable. Therefore, based on the results of our work, we posit that future intervention strategies can be developed to target the size and separability of manifolds of neural population responses to more effectively retain cognition during the presence and progression of neurodegenerative disease. The in silico model developed in this work enables this analysis and can help identify ideal intervention techniques.
While this work represents an early step toward computational frameworks that may eventually inform digital twin approaches for cognitive rehabilitation, several important limitations must be acknowledged when interpreting these results. These limitations span from the fundamental abstractions inherent in CNN-based modeling of biological vision to the specific challenges in accurately representing PCA pathology. A primary consideration is the remaining discernible differences in information processing between CNNs and the biological visual system40. More precisely, the learning mechanisms in CNNs, particularly backpropagation, batch normalization, and softmax activations, represent significant abstractions from real biological learning processes. Despite these architectural and computational differences, CNNs have demonstrated remarkable capacities to predict neural activation patterns in the primate visual cortex, outperforming other computational models of the visual system8,9,41. This predictive power suggests that while CNNs may diverge from and simplify biochemical processes, they capture important computational principles of hierarchical visual processing. Our simulation of PCA progression through synaptic decay presents additional limitations. While we model neurodegeneration through uniform weight degradation, the actual pathology of PCA involves complex interactions between tau protein accumulation, synaptic dysfunction, and cellular death16. The posterior-to-anterior progression characteristic of PCA17 suggests that a more sophisticated model of layer-specific and direction-biased degeneration might better capture disease dynamics. Our current approach, while useful for understanding general principles of hierarchical visual system degradation, also does not fully capture the spatiotemporal patterns of tau pathology and subsequent neurodegeneration. Finally, our restricted held-out set of 5000 images may not provide sufficient semantic diversity for optimal implementation of active learning strategies, particularly in identifying genuinely similar exemplars for targeted retraining. This limitation could be addressed in future work through the use of larger and more diverse datasets, such as ImageNet, that better capture the complexity of real-world visual processing demands42. Additionally, while softmax entropy is widely used as a measure of model uncertainty43, it does come with potential drawbacks. More precisely, softmax entropy can underestimate model uncertainty because neural networks tend to produce over-confident predictions, yielding low entropy even for “uncertain” inputs44. This can occur even for incorrect classifications or out-of-distribution inputs. The limited sample pool and the limitations of softmax entropy likely lead to a less meaningful selection of images in the active learning pipeline, which is reflected in the underperformance of this strategy. Despite these limitations, our computational framework provides valuable insights into potential mechanisms of cognitive rehabilitation in neurodegenerative conditions. The superior performance of targeted retraining strategies, even within our simplified model, suggests promising directions for developing more effective cognitive interventions in patients with PCA.
An additional consideration is our current focus on the visual processing hierarchy. PCA pathology is known to occur within a broader brain network where posterior regions are extensively connected to temporal, parietal, and subcortical areas. Incorporating these broader network effects could potentially improve model accuracy and realism by accounting for compensatory mechanisms from healthy brain regions, network-level intervention strategies, and more realistic patterns of pathology spread through connected brain areas. For example, previous research has revealed deficits in working memory in patients with PCA, compared to the typical loss of episodic memory that accompanies Alzheimer’s disease45. Thus, augmenting a CNN with a form of computational memory (e.g., Hopfield networks) could allow for a more sophisticated model of the true integration between working memory and visual processing. However, our CNN-based approach provides a validated model, specifically of the hierarchical visual processing stream affected in PCA, and expanding this setup to whole-brain modeling would require different computational architectures that might obscure the specific principles we aim to understand about visual system rehabilitation in this work. Therefore, we view our current work as establishing the feasibility of computational intervention testing, with broader network modeling representing a valuable future extension of this framework.
This parallel tracking of disease progression and intervention response provides a unique platform for the future development and refinement of therapeutic approaches before clinical implementation. In principle, future iterations of such computational frameworks, when validated against patient data and initialized with individual biomarkers46 and cognitive profiles, may contribute to the development of personalized modeling approaches47. However, significant methodological advances and clinical validation would be required before such applications become feasible. Nevertheless, computational models of neurodegeneration may eventually enable the exploration of counterfactual scenarios: what would the outcome be if different interventions were attempted at different time points? What if the disease progression followed a different trajectory? These questions, difficult or impossible to address in clinical settings, become much more tractable in the computational domain. This capability is particularly valuable for understanding the timing and efficacy of various therapeutic interventions, potentially leading to more personalized and effective treatment strategies. While our current implementation has notable limitations in fully capturing the biological complexity of PCA, the framework established here demonstrates the potential of in silico approaches for systematically evaluating cognitive intervention strategies. As computational methods continue to evolve and our understanding of biological disease mechanisms deepens, these approaches hold significant promise for revolutionizing the diagnosis, treatment, and management of neurodegenerative conditions. Through the iterative refinement of such computational frameworks, we move closer to developing more effective, personalized therapeutic strategies that can address the complex challenges posed by progressive cognitive decline.
Methods
Model architecture
Our investigation employed a highly optimized variant of the VGG19 architecture15,19,20, which was selected for its demonstrably high similarity to human brain processing as quantified by Brain-Score41. The network architecture comprises 44 distinct operational layers organized into 5 blocks, each containing convolutional layers and ReLU activations, interspersed with 6 pooling layers (5 max pooling and 1 adaptive average pooling), and culminating in 3 final linear layers, including a softmax classifier. To better approximate biological constraints and mitigate spurious effects from overparameterization, we implemented a magnitude-based, structured pruning technique on all intermediate layers, following Fang et al.48,49. This compression results in a model retaining only 18.4% of the original weights while maintaining a high accuracy on object classification of CIFAR100 (69.5%, compared to an accuracy of 71.4% for the unpruned model).
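The pruning follows the magnitude-based structured approach of refs. 48,49 (DepGraph), which also propagates pruning decisions through dependent layers. The toy sketch below only illustrates the core idea of ranking convolutional filters by L1 magnitude and masking the weakest ones; it should not be mistaken for the actual compression procedure.

```python
# Illustrative L1-magnitude filter masking (a real structured pruner would also remove the
# masked filters and rewire the following layer).
import torch
import torch.nn as nn

def prune_filters_by_l1(conv: nn.Conv2d, keep_ratio: float = 0.5) -> torch.Tensor:
    """Zero out the (1 - keep_ratio) fraction of output filters with the smallest L1 norm."""
    with torch.no_grad():
        l1 = conv.weight.abs().sum(dim=(1, 2, 3))             # one score per output filter
        n_keep = max(1, int(keep_ratio * conv.out_channels))
        keep = torch.topk(l1, n_keep).indices
        mask = torch.zeros(conv.out_channels, dtype=torch.bool)
        mask[keep] = True
        conv.weight[~mask] = 0.0
        if conv.bias is not None:
            conv.bias[~mask] = 0.0
    return mask                                               # True for retained filters
```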
Initial training and dataset partitioning
We utilized the CIFAR100 dataset50, which consists of 60,000 32 × 32 color images across 100 object classes. Departing from traditional training approaches, we strategically withheld 10% of the standard training set (5000 images withheld) as a pool to sample retraining images from. This design choice enables subsequent targeted refinement while maintaining a substantial initial training corpus. The model was pre-trained on ImageNet and fine-tuned on our CIFAR100 subset using stochastic gradient descent with momentum η = 0.9, implementing a scheduled learning rate progression (0.01, 0.001, 0.0001) over 100 epochs with a batch size of 128.
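A minimal sketch of this fine-tuning setup is given below. The uncompressed torchvision VGG19 is used as a stand-in for the pruned model, and the epochs at which the learning rate drops are assumed (they are not specified in the text).

```python
# Hedged sketch of the fine-tuning setup: CIFAR100 with a 10% held-out retraining pool,
# SGD with momentum 0.9, learning rate 0.01 -> 0.001 -> 0.0001, 100 epochs, batch size 128.
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, models, transforms

transform = transforms.Compose([transforms.ToTensor()])
full_train = datasets.CIFAR100("./data", train=True, download=True, transform=transform)

# withhold 10% of the training set (5000 images) as the pool of retraining images
n_holdout = int(0.1 * len(full_train))
train_set, holdout_pool = random_split(
    full_train, [len(full_train) - n_holdout, n_holdout],
    generator=torch.Generator().manual_seed(0))
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)

# ImageNet-pretrained VGG19 as a stand-in for the compressed model used in the paper
model = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
model.classifier[-1] = torch.nn.Linear(4096, 100)             # 100 CIFAR100 classes

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[34, 67], gamma=0.1)
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(100):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss_fn(model(images), labels).backward()
        optimizer.step()
    scheduler.step()
```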
Model degeneration and retraining
Model weights were decayed and frozen according to the methods described in Moore et al.28. To model the progressive nature of PCA, we implemented a systematic weight decay mechanism that simulates the spatiotemporal progression of the synaptic deg