Main
The early infant microbiome assembles via intricate and partially stochastic microbial acquisitions that have the mother as the primary source and other family members as additional ones1,2,3,4,5. The infant microbiome then evolves during the following few years with complex dynamics that later result in a more stable adult-like microbiome6. Although early family-to-baby microbiome strain transmission has been quite extensively investigated1,2,3,4,7, later infant developmental stages, including those involving interaction with other peers in social contexts, have received very little attention.
As person-to-person intra-generational microbiome transmission has recently been shown to be extensive and to shape the personal microbiome make-up8, we predicted that early social contexts such as nurseries might exert a large impact on infant microbiomes via baby-to-baby transmission. Beyond work on pathogen spreading9,10 and linked immune competence development11,12,13, microbiome investigations in nurseries have been limited to observing increased microbial diversity among attendees14. This leaves a major gap in the understanding of the dynamics of human microbiome maturation during the key first 1,000 days of life15.
Here we present microTOUCH-baby, a strain-resolved longitudinally dense metagenomic study modelling interpersonal gut microbiome transmission between babies attending the nursery for the first time and their close contacts, including family members and nursery educators.
The microTOUCH-baby study
We set up the microTOUCH-baby cohort to study the dynamics of microbiome development and transmission among babies of about 1 year of age and their network of close social interactions (Methods). Participants included 43 babies attending the first year of nursery (median age at nursery admission 10 months), 7 co-living siblings, 39 mothers and 30 fathers of the babies, and 5 pets from the participants’ houses, as well as 10 nursery educators (134 volunteers in total; Fig. 1a and Supplementary Table 1). Baby participants were enrolled from three public nurseries in Trento (Italy). Babies spent on average 8 hours per weekday (after the ‘settling-in period’; Methods) in the nursery, with limited activities and spaces shared between the two classes in the same nursery, which are followed by different educational staff.
Fig. 1: The microTOUCH-baby study and species-level microbiome configurations before and after nursery attendance.
a, microTOUCH-baby study design and overview. b, Species-level microbiome composition overview of the microTOUCH-baby cohort during first term (principal coordinate analysis on Jaccard dissimilarity, n = 646). Samples are coloured by host categories and shapes indicate the nursery. Baby samples’ colour intensity is according to time point (from initial T01 to final T15). c,d, Average SGB richness across all timepoints and for all individuals in each family member category having versus not having a sibling (c) and having versus not having a pet (d). e,f, Change in alpha-diversity (SGB richness; e) and beta-diversity (Jaccard dissimilarity; f) across participant types between the beginning and the end of the first term, with n indicating the number of individual–individual pairs. Beta-diversity refers to the all-versus-all within-nursery dissimilarities. In the box plots (c–f), box edges show the lower and upper quartiles, the centre line indicates the median, and whiskers extend to the most extreme data point within 1.5× the interquartile range (IQR). P values are reported where statistically significant (two-sided Mann–Whitney U-test in c and d and two-sided Wilcoxon signed-rank tests for e and f); all other comparisons are non significant.
Sampling started before the beginning of the first term (T01), hence before participants from different families had any nursery-related contact among them, and ended after the Christmas nursery closure (Fig. 1a). During nursery attendance, we collected stool samples of the babies on a weekly basis, whereas educators and parents were less densely sampled (Methods). For all participants in group 1 of nursery A, sample collection continued through the second term. Two additional follow-up samples were collected for all participants at nursery year’s conclusion (TA) and at the end of the summer break (TB) (Fig. 1a).
Overall, we collected and metagenomically sequenced 1,013 microbiome samples (average sequencing depth 15.61 Gbp; Methods). Host metadata information included exact age, past and current host-health data, antibiotic exposures, maternal delivery information (Methods, Fig. 1a, and Supplementary Tables 2 and 3), and diet questionnaires (Methods). Metagenomes were processed via the MetaPhlAn 4 computational tool16 to generate taxonomic profiles at species-level genome bin (SGB)17 resolution (Supplementary Table 4 and Extended Data Fig. 1a), including yet-to-characterize species (that is, unknown SGBs accounting for 46.37% of total SGBs). We then used the StrainPhlAn 4 computational tool16,18 to generate strain-level phylogenies for 311 known SGBs and 201 unknown SGBs that were used to infer microbiome strain transmission8 (Methods).
Compositional baby microbiome landscape
We first observed expected microbiome structures1,19,20, with large compositional divergence between adults and babies (Fig. 1b, Extended Data Figs. 1b, 2 and 3, and Supplementary Table 5), age-dependent differences in babies (Extended Data Fig. 4a–e), and diet-dependent microbial stratification in adults (Extended Data Fig. 4f), but not in babies after accounting for age (Supplementary Table 6). Interestingly, at T01 (median age 10 months), the impact of maternal intrapartum antibiotic prophylaxis against group B Streptococcus and of mode of delivery on alpha-diversity was already not detected as statistically significant (Mann–Whitney U-test, n = 37, U = 137, P = 0.68 and n = 37, U = 109, P = 0.89, respectively; Extended Data Fig. 5a–d).
Some compositional patterns were suggestive of a role of microbiome transmission. Babies having a sibling had, for example, an overall higher SGB richness compared with babies without brothers and sisters (n = 40, U = 271, P = 0.012; Fig. 1c and Supplementary Table 6), further supporting previous observations21,22 and suggesting that siblings may provide important sources for infant microbiome enrichment. In contrast, babies with pets showed lower overall SGB richness (n = 40, U = 61, P = 0.012; Fig. 1d), but significance was lost after adjusting for age (Supplementary Table 6). Babies’ alpha-diversity increased during the 3 months of nursery attendance (Fig. 1e), and although the total pool of microbial species detected among babies in the nursery did not change noticeably throughout the study (Extended Data Fig. 5e), the inter-baby beta-diversity decreased significantly (7% average decrease; Wilcoxon signed-rank test n = 116 baby pairs, W = 1,026, P = 7.0 × 10−11; Fig. 1f). As overall this might be indicative of baby microbiome convergence influenced by inter-individual transmission, we performed strain-level transmission analysis to investigate this hypothesis.
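The two diversity measures used above can be illustrated with a minimal Python sketch. It assumes that SGB richness is the count of SGBs with non-zero relative abundance in a sample and that Jaccard dissimilarity is computed on SGB presence/absence; the function names, SGB identifiers and abundance values are hypothetical and are not taken from the study data.

```python
def richness(profile):
    """Count the SGBs detected (relative abundance > 0) in one sample."""
    return sum(1 for abundance in profile.values() if abundance > 0)


def jaccard_dissimilarity(profile_a, profile_b):
    """1 - |A intersect B| / |A union B| on SGB presence/absence."""
    present_a = {sgb for sgb, ab in profile_a.items() if ab > 0}
    present_b = {sgb for sgb, ab in profile_b.items() if ab > 0}
    union = present_a | present_b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(present_a & present_b) / len(union)


# Hypothetical profiles: SGB id -> relative abundance (%)
baby_1 = {"sgb_a": 1.2, "sgb_b": 0.0, "sgb_c": 3.4}
baby_2 = {"sgb_a": 0.4, "sgb_b": 2.1}

print(richness(baby_1), richness(baby_2))
print(jaccard_dissimilarity(baby_1, baby_2))
```

A decrease in the average pairwise dissimilarity between babies over time, as reported above, corresponds to this quantity shrinking across baby pairs.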
Mapping strain-sharing in the nursery
Extending our StrainPhlAn-based validated pipeline8 (Methods), we defined a strain-sharing event as the identification of the same strain (that is, differing by a genetic distance lower than the pre-computed optimal species-specific threshold distinguishing between inter- and intra-individual genetic distance distributions) in different microbiome samples. Strain-sharing rates (SSRs) are computed as the number of strains shared between a pair of microbiome samples over the number of species with profiled strains present in both samples (Methods). Applied to the task of inferring mother–baby transmission, the pipeline estimated a 50% median SSR for babies at the beginning of the study, which is highly consistent with previous results irrespective of population (Extended Data Fig. 5f).
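The SSR definition can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study pipeline: the per-SGB genetic distances and species-specific thresholds are taken as precomputed inputs, and every name and number below is hypothetical.

```python
def strain_sharing_rate(typed_a, typed_b, distances, thresholds):
    """SSR between two samples.

    typed_a, typed_b: sets of SGB ids with a profiled strain in each sample.
    distances: SGB id -> genetic distance between the two samples' strains.
    thresholds: SGB id -> species-specific same-strain distance threshold.
    Returns None when no SGB is strain-typed in both samples.
    """
    common = typed_a & typed_b
    if not common:
        return None
    # A strain counts as shared when its distance is below the threshold.
    shared = [sgb for sgb in common if distances[sgb] < thresholds[sgb]]
    return len(shared) / len(common)


# Hypothetical inputs for one pair of samples
typed_a = {"sgb_a", "sgb_b", "sgb_c", "sgb_d", "sgb_e"}
typed_b = {"sgb_a", "sgb_b", "sgb_c", "sgb_d"}
distances = {"sgb_a": 0.001, "sgb_b": 0.05, "sgb_c": 0.0, "sgb_d": 0.03}
thresholds = {"sgb_a": 0.01, "sgb_b": 0.02, "sgb_c": 0.005, "sgb_d": 0.03}

ssr = strain_sharing_rate(typed_a, typed_b, distances, thresholds)
print(ssr)
```

In this toy pair, two of the four SGBs typed in both samples fall below their thresholds, so the SSR is 0.5; `sgb_d` sits exactly at its threshold and is not counted because the definition requires a strictly lower distance.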
Overall, we captured over 9.47 million instances of the same SGB typed at the strain level in different samples (including those from the same participant and from different participants), with a total of 5.97% of cases in which the same strain of the SGB was present, resulting in 565,258 detected strain-sharing events (Supplementary Tables 7 and 8). Within-individual strain-sharing accounted for 27.9% of the total (157,599 events, with 99% likelihood of samples from the same individual sharing at least 1 strain, and 87% at least 5) but also strain-sharing between different individuals in the same family was very high (51,483 events, 9.1% of the total, with 86% likelihood of sharing at least 1 strain, 47% at least 5), with rarer between-family strain-sharing instances at T01 (46% likelihood of sharing at least 1 strain and 3% at least 5; Extended Data Fig. 5g). Although most strain-sharing over the first term was observed among individuals from different families (356,176 events, 63%), this reflected the >75-times greater number of between-family comparison pairs; after normalizing for the number of comparisons, one order of magnitude fewer strains were shared between families versus within family (0.7 versus 7.9 strains shared per sample pair; Supplementary Table 7). The 0.7 average strains shared by unrelated individuals represent the cohort’s microbiome sharing background, including untraced social interaction before T01, clonal strains spreading into the nursery-associated local community and possible false-positive instances, among other factors.
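As a consistency check on the figures quoted above, the breakdown of strain-sharing events can be recomputed directly from the reported counts. The short Python sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Counts reported in the text
total_events = 565_258
events = {
    "within_individual": 157_599,
    "within_family": 51_483,
    "between_family": 356_176,
}

# The three categories partition the detected strain-sharing events.
assert sum(events.values()) == total_events

# Percentage of all events in each category
shares = {name: round(100 * n / total_events, 1) for name, n in events.items()}
print(shares)

# Number of same-SGB comparisons implied by the 5.97% same-strain fraction
implied_comparisons_millions = round(total_events / 0.0597 / 1e6, 2)
print(implied_comparisons_millions)

# Per-pair normalization: strains shared within versus between families
fold_difference = round(7.9 / 0.7, 1)
print(fold_difference)
```

The recomputed shares (27.9%, 9.1% and 63.0%) match the reported values, the implied number of comparisons is about 9.47 million, and the roughly 11-fold gap between 7.9 and 0.7 strains per sample pair is the "one order of magnitude" difference described above.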
Tracking multi-host strain transmission
As a representative example of the combined capabilities of our study design and metagenomic pipeline to trace complex strain transmission chains, we illustrate the interpersonal transfer of a nursery-acquired strain of Akkermansia muciniphila (SGB9226) in group 1 of nursery B. A strain from this species was first introduced in the nursery group by a baby (B05) who probably obtained it from their mother, passed to another baby (B06), and was then found in that baby’s mother (M06) and father (F06), in the latter replacing another A. muciniphila strain (Fig. 2a). A. muciniphila strains contain CRISPR arrays that can be used as unique genetic tags for strains23,24, and these further confirmed A. muciniphila strain identity across volunteers (Methods). Metagenomic assembly also validated such transmission patterns for the limited number of strains (8 out of 19 StrainPhlAn-positive samples) that could be reconstructed into draft genomes of sufficient quality, with high genomic similarity between assemblies from samples with the same strain according to StrainPhlAn (pairwise average nucleotide identity (ANI) 99.97%, which aligned with same-strain boundaries independently estimated elsewhere25,26). We note that the missing detection of A. muciniphila strains (grey circles in Fig. 2a) was overall consistent with the absence of the species as shown in a high-sensitivity, SGB-specific polymerase chain reaction (PCR) (Methods and Extended Data Fig. 6a). Within this example, only one sample that lacked a metagenomic strain profile was PCR-positive at the SGB level, consistent with a non-zero relative abundance (0.04%) in its MetaPhlAn profile (B05_T08; Extended Data Fig. 6b); it is thus the single case in Fig. 2a of SGB9226 falling below the limit of detection for strain profiling. Another transmission chain example involved Alistipes finegoldii (SGB2301) and included an educator (Extended Data Fig. 6c), further showing the potential of our approach to recapitulate microbial transmission in nurseries.
Fig. 2: Inter-individual strain transmission and nursery spreading during the first term.
a, Strain-level profiling for A. muciniphila SGB9226 (left) uncovers the chain of transmission events of one strain of this species in group 1 of nursery B (right). Participant types are identified by shape (mother, diamond; baby, circle; father, square) containing participant identifiers composed of the first letter indicating participant type (M, mother; B, baby; F, father) and the family number; family relations are also highlighted by same-colour filling. On the right, each circle represents a sample collected from the participants depicted, with colour filling indicating the identity of the strain of A. muciniphila detected in the sample (except grey, used to indicate that the SGB was not detected/typable at the strain level) and arrows indicating the most likely transmission event. The light orange and grey circle identifies the SGB9226-positive sample (B05_T08) in which a strain could not be profiled by StrainPhlAn. The identification of shared CRISPR spacers of the target strain of A. muciniphila (orange circles) across different samples is indicated by an asterisk. b, Strains present at most in one baby before nursery admission (T01) and spreading to other participants in the same nursery, reaching ≥50% prevalence in the following time points, until T15. Left and right y axes show the proportion and number of babies in which the outbreaker strain was detected, respectively. The left y axis also refers to the proportion of babies in which the SGB was detected (that is, their prevalence in the nursery). S. intestinalis, Sellimonas intestinalis.
We also explored potential gut microbiome transmission between household pets and their families. Anecdotally (given the only five pets considered), we overall identified a low total number of pet–human strain-sharing events, with intra-family pet–baby strain-sharing significantly higher than inter-family (Fisher’s exact test, n = 211, P = 0.005; Extended Data Fig. 6d,e). Strains found to be transmitted between babies and pets belonged to human-associated species that had also been previously detected in pet gut microbiomes (Faecalimonas umbilicata, Ruminococcus gnavus, Clostridium sp. AT4 and Phocaeicola vulgatus27,28,29,30), indicating they may be ecologically fit to overcome host-species boundaries.
Strain-spreading patterns in the nursery
We then examined the changes in the collective composition of the human microbiome in nurseries. First, we found the overall pool of distinct strains to decrease over time (that is, average nursery strain heterogeneity decreasing from 0.91 at T01 to 0.77 at T15, Mann–Whitney U-test, n = 454, U = 34,312, P = 1.3 × 10−11). Considering that the total reservoir of microbial species did not increase (Extended Data Fig. 5e), this indicates that some strains within the same species may have spread among babies and prevailed over other strains initially present (Extended Data Fig. 7a).
We then focused on strains that showed efficient spreading within a nursery. We found 8 cases of strains initially detected in no more than one baby before nursery start (T01) reaching ≥50% prevalence afterwards (Fig. 2b). Among these, a Streptococcus gallolyticus (nursery A) and a Bifidobacterium pseudocatenulatum (nursery B) strain were introduced in the nursery after approximately the first month of attendance and progressively spread to seven and eight babies, respectively (Fig. 2b). Although S. gallolyticus spread appeared to dwindle after reaching the maximum diffusion, B. pseudocatenulatum presence was steadily detected, consistent with the high prevalence of the Bifidobacterium genus in the infant population2. Other cases of bacterial strain diffusion involved Escherichia coli and Veillonella dispar in nursery B, and Clostridium innocuum in nursery C, which was possibly limited in its spread by other conspecific strains and niche preemption dynamics31.
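The selection criterion for efficiently spreading strains (detected in no more than one baby at T01, then reaching at least 50% prevalence) can be expressed as a simple filter. The Python sketch below is only an illustration: it assumes prevalence is computed over a fixed number of babies in the nursery, and the data structure, function name and strain labels are hypothetical.

```python
def find_spreading_strains(detections, n_babies, baseline="T01", min_prevalence=0.5):
    """Return strains seen in at most one baby at baseline that later reach
    the minimum prevalence among the babies of a nursery.

    detections: strain id -> {time point -> set of baby ids carrying the strain}
    n_babies: number of babies in the nursery (assumed denominator)
    """
    spreading = []
    for strain, by_time in detections.items():
        if len(by_time.get(baseline, set())) > 1:
            continue  # already in two or more babies before nursery start
        peak = max(
            (len(babies) for tp, babies in by_time.items() if tp != baseline),
            default=0,
        )
        if peak / n_babies >= min_prevalence:
            spreading.append(strain)
    return sorted(spreading)


# Hypothetical detections in a nursery of 10 babies
detections = {
    "strain_x": {"T01": {"b1"}, "T08": {"b1", "b2", "b3"},
                 "T15": {"b1", "b2", "b3", "b4", "b5"}},
    "strain_y": {"T01": {"b1", "b2"},
                 "T15": {"b1", "b2", "b3", "b4", "b5", "b6", "b7"}},
    "strain_z": {"T01": set(), "T15": {"b4", "b5", "b6"}},
}

spreaders = find_spreading_strains(detections, n_babies=10)
print(spreaders)
```

Only `strain_x` passes: `strain_y` was already present in two babies at T01, and `strain_z` never reaches 50% prevalence.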
Baby microbiomes built via transmission
Quantification of strains shared between babies attending the same nursery over time revealed they had, on average, more shared strains at the end of the first term than before nursery admission (the average number of strains shared with any other baby was 2.5 at T01 and 7.2 at T15, or 8.8 at T15 when disregarding strains already present at T01 and only for babies with samples available at both time points; Fig. 3a). Accordingly, whereas at T01 baby strain-sharing relations were not recapitulating nursery attendance, at T15 they clustered consistently with it (Fig. 3b, Supplementary Table 9 and Methods). We thus found strong evidence of quantitatively relevant acquisition of nursery-specific microbial profiles by babies, occurring via inter-individual strain transmission even in the relatively short time frame of the first nursery term.
Fig. 3: Strain-sharing across hosts before, during and after the first term of nursery attendance.
a, Average number of strains shared between each baby and other participants at T01 and T15. The triangles under the boxes report the average number of strains shared between the baby and any participant in ‘family’ (mother, father, sibling) or ‘nursery’ (other babies, educator). b, Average number of shared strains between baby pairs in the same versus different nursery; P values (two-sided permutation test for means; Methods) for intra- versus inter-nursery comparisons are shown in italic in the circle. The statistics for a and e are in Supplementary Tables 9 and 10. Networks are built on strain-sharing matrices among all babies at T01 and T15. c, Strain replacement rate (one minus the SSR) between initial and final time points. P values are reported where statistically significant (two-sided Mann–Whitney U-tests); all other pairwise comparisons are non significant. In the box plots c–e, box edges show the lower and upper quartiles, the centre line indicates the median, and whiskers extend to the most extreme data point within 1.5× the IQR. d, Baby–baby SSR and average number of strains shared throughout the first term. In d and e, statistical significance asterisks refer to the highest significant P value adjusted for multiple comparisons (two-sided permutation test for medians; Methods) for the set of comparisons indicated in the legend, with **P < 0.01 and ***P < 0.001. Left and right y axes indicate average strains shared and average common SGBs. e, Baby–baby SSR and average number of strains shared at T01, at the beginning and the end of the second term (T15 and TA), and after the summer break (TB), across all babies in all nurseries. At the top, P values are reported where statistically significant (two-sided Wilcoxon signed-rank test evaluating longitudinal SSR for paired baby–baby pairs attending the same nursery).
Investigating longitudinal gut microbiome changes, babies showed the lowest rate of SGB retention (defined as the Jaccard similarity between samples from initial and final time points of the same individual; Extended Data Fig. 7b) and the highest rate of strain replacement (defined as 1 − SSR) among the retained SGBs (Fig. 3c) compared with adults. A median 44.4% of the retained SGBs in babies showed baseline strain replacement during the 5 months of the study. In contrast, all other participants replaced a much lower fraction of strains in their gut (medians below 11.1%), with strain replacement rates correlated although non-significantly with age among non-baby participants (Spearman’s test, n = 68, ρ = 0.22, P = 0.071; Extended Data Fig. 7c). This reflects the expected high plasticity of the infant gut microbiome with its rapidly evolving ecosystem and limited colonization resistance6,32.
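The two longitudinal measures defined above can be sketched as follows. The sketch assumes that strain identity has already been resolved, so each SGB carries a simple strain label per time point; in practice identity is decided by the species-specific genetic-distance thresholds. All names and labels are hypothetical.

```python
def sgb_retention(initial_sgbs, final_sgbs):
    """Jaccard similarity between the SGB sets of the first and last sample."""
    union = initial_sgbs | final_sgbs
    return len(initial_sgbs & final_sgbs) / len(union) if union else 0.0


def strain_replacement_rate(initial_strains, final_strains):
    """1 - SSR over the SGBs retained between the two time points.

    initial_strains, final_strains: SGB id -> strain label.
    Returns None when no SGB is retained.
    """
    retained = initial_strains.keys() & final_strains.keys()
    if not retained:
        return None
    same = sum(1 for sgb in retained if initial_strains[sgb] == final_strains[sgb])
    return 1.0 - same / len(retained)


# Hypothetical strain labels for one individual at the first and last time point
first = {"sgb_a": "s1", "sgb_b": "s2", "sgb_c": "s3", "sgb_d": "s4"}
last = {"sgb_a": "s1", "sgb_b": "s9", "sgb_c": "s3", "sgb_e": "s5"}

retention = sgb_retention(set(first), set(last))
replacement = strain_replacement_rate(first, last)
print(retention, replacement)
```

Here three of five SGBs are retained (retention 0.6), and one of the three retained SGBs carries a different strain at the end, giving a replacement rate of one third.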
To assess the extent to which nursery attendance affects microbiome assembly in babies via microbiome transmission, we quantified and compared the SSR between pairs of babies within the same group or nursery, and across different nurseries at each time point (Fig. 3d). Strain sharing among babies in the same nursery group was significantly higher after approximately only 1 month of nursery attendance compared with babies from different nurseries (median SSR 8.3% versus 0% at T04; permutation test for medians, n = 249, P = 0.001). This is all the more noteworthy in view of the first 2 weeks of the nursery’s ‘settling-in period’ during which babies attend discontinuously and for shorter periods. In addition, at the end of the first term (T15), the SSR in the same nursery group reached an average of 20.2%, significantly higher than the SSR between babies attending different nurseries (4.6%; permutation test for medians, n = 312, P < 0.001) and higher than the SSR among babies attending the same nursery but in different groups (16.1%; permutation test for medians, n = 122, P = 0.079, significant at T08 P = 0.026, T10 P < 0.001 and T13 P = 0.001).
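The comparisons above rely on permutation tests for medians, whose exact implementation is described in the Methods. As a generic illustration only, a two-sided permutation test on the difference of medians can be sketched with the Python standard library; the number of permutations, the seed, the pseudocount correction and the SSR values below are all assumptions made for this sketch, and it omits the multiple-comparison adjustment.

```python
import random
from statistics import median


def permutation_test_medians(group_a, group_b, n_permutations=9999, seed=0):
    """Two-sided permutation P value for the difference in group medians."""
    rng = random.Random(seed)
    observed = abs(median(group_a) - median(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(median(pooled[:n_a]) - median(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated P value is never exactly zero
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical SSR values for baby pairs in the same versus different nurseries
same_nursery = [0.20, 0.25, 0.18, 0.22, 0.30, 0.15, 0.21, 0.19, 0.24, 0.26]
different_nursery = [0.0, 0.0, 0.05, 0.0, 0.02, 0.0, 0.04, 0.0, 0.03, 0.0]

p_value = permutation_test_medians(same_nursery, different_nursery)
print(p_value)
```

Because the test shuffles group labels rather than assuming a distribution, it suits SSR values, which are bounded and often zero-inflated, as the 0% median between nurseries at T04 suggests.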
By extending the investigation to the second term of nursery, we found the baby–baby SSR within the same nursery (regardless of group) to reach a median 33.3% at the end of school year (TA; versus median 17.9% at T15; Wilcoxon signed-rank test, n = 58, W = 86, P = 6.2 × 10−9; Fig. 3e), with a progressive increase occurring during the whole second term, as observed for the class that was densely sampled over such a period (group 1 of nursery A; Extended Data Fig. 7d). Although the baby–baby SSR decreased during the summer break (TB), it remained significantly higher compared with post-Christmas-break levels (T15; median 23.7% at TB versus 17.9% at T15; Wilcoxon signed-rank test, n = 31, W = 68, P = 2.0 × 10−4). These results highlight that social relations outside of the household and continued spatial proximity are key determinants of infant microbiome transmission and development at levels that are substantially higher than what was recently observed for adults8.
Nursery strains match family contribution
The parent–baby SSR at T01 averaged 37.3% for mothers and 19.6% for fathers, consistent with available reports1,4,8,33,34,35. Such patterns persisted throughout the first term (Fig. 4a). The contribution of sibling strains to the baby was even higher (average SSR 56.2%; Fig. 4b). As expected, strain transmission between babies and individuals from different families remained negligible throughout the first term (Fig. 4a,b), a testament to the reliability of the strain-transmission-inference approach.
Fig. 4: Dynamics of strain transmission during the first term of nursery.
a, SSR and average number of strains shared between pairs of babies and parents (at T01) from the same versus different families at each time point. In a and b, statistical significance asterisks refer to two-sided permutation tests for medians (Methods) adjusted for multiple comparisons for same family versus different family across each family member type, with ***P < 0.001. Exact P values for a–e are provided in Supplementary Table 11. In all box plots, box edges indicate the lower and upper quartiles, the centre line represents the median, and whiskers extend to the most extreme data point within 1.5× the IQR. b, Strain-sharing between pairs of babies and siblings (T01) from the same versus different families at each baby time point. c, Proportion of strains acquired from group versus family, and corresponding cumulative relative abundance (bottom). For each baby time point, comparisons were performed against past or contemporaneous samples of the family and the nursery group (Methods). Statistical significance asterisks refer only to the proportion of strains acquired from the same group versus the family. In c–e, the two-sided Mann–Whitney U-test was used, with *P < 0.05, **P < 0.01 and ***P < 0.001. d, Association between having a sibling and the number of strains acquired from the nursery group. e, Breakdown between acquisition of new SGBs typed at the strain level and strain replacement for the strains acquired from the nursery (top) and association between either means of strain acquisition and having a sibling (bottom). Statistical significance asterisks refer to the comparison between SGB acquisition from nursery for babies with versus without siblings. f, Number of strains either donated (dark green) or acquired (light green) by each baby over the first term (left y axis), and ratio of donated strains to acquired strains (dashed line; right y axis).
To establish the relative contribution of strain transmission from the nursery with respect to strain transmission from the family, we computed, for each baby, the proportion of strains in the baby microbiome that were exclusively shared with, and hence putatively acquired from, either family members or other babies in the nursery group (Methods) and we refer to it as ‘proportion of strains acquired’. We found that the proportion of strains acquired from the nursery group—but not of strains acquired from the family—changed significantly over time. The proportion of strains acquired from family members fluctuated from an average of 24.0% per baby at T01 to 20.0% at the end of the first term of nursery (Wilcoxon signed-rank test, n = 25, W = 112, P = 0.18; Extended Data Fig. 7e), whereas those putatively acquired from the nursery group increased from an average of 6.5% to 28.4% at the end of the first term (Wilcoxon signed-rank test, n = 25, W = 0, P = 6.0 × 10−8; Extended Data Fig. 7e), significantly surpassing the proportion of strains acquired from the family (Mann–Whitney U-test, n = 52, U = 463, P = 0.023; Fig. 4c). This indicates that after only 3 months of nursery attendance, babies had proportionally more strains acquired from nursery peers than from their family.
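The ‘proportion of strains acquired’ can be illustrated with set operations. This Python sketch is a simplification under stated assumptions: it takes each source as a single pooled set of strains and ignores the restriction to past or contemporaneous samples described in the Methods; the function name and strain labels are hypothetical.

```python
def acquisition_proportions(baby_strains, family_strains, nursery_strains):
    """Fractions of a baby's strains shared exclusively with the family,
    exclusively with the nursery group, or with both."""
    total = len(baby_strains)
    if total == 0:
        return {"family": 0.0, "nursery": 0.0, "both": 0.0}
    in_family = baby_strains & family_strains
    in_nursery = baby_strains & nursery_strains
    return {
        "family": len(in_family - in_nursery) / total,
        "nursery": len(in_nursery - in_family) / total,
        "both": len(in_family & in_nursery) / total,
    }


# Hypothetical strain sets for one baby at one time point
baby = {f"s{i}" for i in range(10)}          # s0 ... s9
family = {"s0", "s1", "s2"}                  # pooled family members
nursery = {"s2", "s3", "s4", "s5", "s99"}    # pooled nursery group peers

proportions = acquisition_proportions(baby, family, nursery)
print(proportions)
```

In this toy case 20% of the baby's strains are exclusive to the family, 30% are exclusive to the nursery group and 10% are shared with both, the last category corresponding to strains whose source cannot be attributed to a single context.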
A similar trend was observed when quantifying the relative abundance of strains acquired from either the family or the nursery group (Fig. 4c). Family contribution diminished over time (from an average 33.2% at T01 to 20.6% at T15; Wilcoxon signed-rank test, n = 25, W = 72, P = 0.014; Extended Data Fig. 7f) whereas the contribution from the nursery group greatly expanded (reaching an average of 39.6% at T15 from a starting 10.2%; Wilcoxon signed-rank test, n = 25, W = 18, P = 1.5 × 10−5; Extended Data Fig. 7f). Strains shared with both family and group also increased significantly (from average 0.9% to 8.5%; Wilcoxon signed-rank test, n = 25, W = 0, P = 4.4 × 10−4; Extended Data Fig. 7f), probably reflecting reciprocal transmission between family and nursery (Fig. 2a). Overall, this suggests that, by the end of the first term, the nursery collectively contributes more than the family to the strain composition of the gut microbiome of babies (39.6% versus 20.6% at T15; Mann–Whitney U-test, n = 52, U = 479, P = 0.01; Extended Data Fig. 7f).
Long-term nursery effect on transmission
The extended longitudinal analysis of group 1 of nursery A revealed that the proportion of strains acquired from nursery peers continued to gradually increase during the second term (Extended Data Fig. 8a). Samples from all babies across nurseries at year-end (TA) confirmed comparable contributions of family and nursery to the baby (17.6% median proportion of strains acquired from nursery versus 15% from family; Mann–Whitney U-test, n = 19, U = 218, P = 0.29; Extended Data Fig. 8b) that non-significantly tended toward a greater family contribution after summer nursery closure (8.7% median proportion of strains acquired from nursery versus 16.7% from family; Mann–Whitney U-test, n = 17, U = 122, P = 0.43; Extended Data Fig. 8b).
Babies showed lower strain retention and higher strain replacement across the summer break (that is, between TA and TB) compared with adults, despite no differences in the carriage of SGBs typed at the strain level (Extended Data Fig. 8c–f). Interestingly, family-acquired strains were significantly more retained and less replaced in babies over the summer break than nursery-acquired strains (Wilcoxon signed-rank test, n = 11, W = 5, P = 0.019 and P = 0.022 respectively; Extended Data Fig. 8g,h), suggesting that continuous seeding linked to continued contact is a factor behind long-term colonization.
Siblings affect baby strain acquisition
Predicting a potential role of siblings in the transmission patterns, we found that at T01, babies showed a higher SSR with their siblings (average 52.3%) than with their fathers (24.9%; Mann–Whitney U-test, n = 36, U = 147, P = 0.026) as well as with their mothers, although non-significantly (46.1%; Mann–Whitney U-test, n = 36, U = 120, P = 0.47; Extended Data Fig. 8i). Of note, an average of 10.4 strains were shared exclusively with siblings at T01, whereas only 2.0 and 2.4 were shared exclusively with the mother or the father (Extended Data Fig. 8j), possibly reflecting closer intestinal ecology, physical interaction and development stage, which are probably some of the same factors leading to the higher nursery strain acquisition observed in our cohort.
We further observed that having a sibling was associated with babies acquiring significantly fewer strains from their nursery group compared with babies without a sibling at T15 (Mann–Whitney U-test, n = 28, U = 117, P = 0.004; Fig. 4d). Although causality cannot be inferred, this might be linked to early acquisition from siblings ‘saturating’ the overall strain acquisition potential, which would be in line with babies with a sibling having higher alpha-diversity (Fig. 1c) and acquiring fewer new SGBs than only-children (Fig. 4e). Notably, although all babies both spread and acquired strains in the nursery, the ratio between acquired and donated strains varied widely between babies (Fig. 4f).
The most-transmissible species
We next assessed species-level transmissibility by counting the number of strain-sharing events for each SGB in our cohort over the total potential number of strain-sharing events (Methods). Microeukaryotic taxa were not found to be abundant enough in babies to try to infer transmission, with Blastocystis, the most common human gut microeukaryote36, identified in 9.18% of the samples but never in babies (Supplementary Table 12). Focusing thus on prokaryotic taxa, out of the 64 SGBs with highest transmissibility (henceforward ‘T’) over all participant categories (Extended Data Fig. 9a and Supplementary Table 13), many known SGBs encompassed aerotolerant (S. gallolyticus, Rothia mucilaginosa and B. pseudocatenulatum) and spore-forming species (for example, Tyzzerella nexilis and Clostridium fessum). We also identified the spore-forming Clostridioides difficile among the most-transmissible SGBs between baby–baby pairs only (T = 0.38, prevalence in babies 24% and in adults 0%), in line with widespread carriage in asymptomatic babies37,38. Exceptions to this trend were prevalent non-sporulating human gut anaerobes (such as Blautia wexlerae and Faecalibacterium prausnitzii).
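The transmissibility score T described at the start of this section is a per-SGB ratio of observed to potential strain-sharing events. The Python sketch below illustrates that ratio under one assumption: a potential event is any pair of samples in which the SGB was strain-typed in both. The exact set of pairs considered is defined in the Methods, and the record format and SGB labels here are hypothetical.

```python
from collections import defaultdict


def transmissibility(pair_records):
    """Per-SGB fraction of potential strain-sharing events that were observed.

    pair_records: iterable of (sgb_id, same_strain) tuples, one per sample
    pair in which the SGB was strain-typed in both samples.
    """
    events = defaultdict(int)
    potential = defaultdict(int)
    for sgb, same_strain in pair_records:
        potential[sgb] += 1
        if same_strain:
            events[sgb] += 1
    return {sgb: events[sgb] / potential[sgb] for sgb in potential}


# Hypothetical pair records for two SGBs
records = (
    [("sgb_a", True)] * 3 + [("sgb_a", False)] * 5
    + [("sgb_b", True)] * 1 + [("sgb_b", False)] * 3
)

scores = transmissibility(records)
print(scores)
```

With these records `sgb_a` scores 0.375 (3 of 8 pairs) and `sgb_b` scores 0.25 (1 of 4 pairs). Restricting the records to one pair type, such as baby–baby pairs, gives category-specific scores like the one reported for C. difficile.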
SGB transmissibility correlated with SGB prevalence in both adults (Spearman’s test, n = 461, ρ = 0.35, Padj = 9.8 × 10−14) and babies (Spearman’s test, n = 461, ρ = 0.40, Padj = 1.2 × 10−17; Extended Data Fig. 9b,c). The highest transmissibility scores were highlighted for SGBs shared in baby–siblings pairs, namely, A. finegoldii, Bacteroides ovatus and Bacteroides caccae, the butyrate-producing Roseburia intestinalis and Agathobaculum butyriciproducens39,[40](https://www.nature.com/articles/s41586-025-099