Data Descriptor
Published: 03 February 2026
Scientific Data, Article number: (2026)
We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.
Abstract
A better understanding is still needed of how abiotic and biotic factors affect the functional structure and composition of biological assemblages, given that living organisms constantly interact with their environment and with each other. Here, we present a comprehensive dataset of 31 functional traits of bacteria, compiled from BacDive, a bacterial diversity meta-database, and from the rrnDB and genomesizeR datasets. This updated version of the BactoTraits dataset offers more traits for more strains (97,721 strains with at least one trait described) and makes the associated R scripts available to the scientific community. The traits cover physiological characteristics, metabolic processes, genome properties and biotope preferences. They can be extended to whole bacterial communities through the taxonomic affiliations obtained from standard high-throughput 16S rRNA gene amplicon sequencing. Because this taxonomic affiliation is based on the regularly updated SILVA database, combinations of weighted mean trait profiles of bacterial communities can be studied at different taxonomic levels. BactoTraits can be used, for example, to improve predictions of ecological responses to natural or anthropogenic pressures and to support biomonitoring, management and conservation strategies. The R scripts and the encoded BactoTraits dataset are available at: https://doi.org/10.24396/ORDAR-182.
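As an illustration of what a weighted mean trait profile is, the following minimal Python sketch computes an abundance-weighted mean for each trait column of a community. It is not taken from the authors' R scripts. The genus names, abundances and trait values are invented placeholders, and renormalising over the taxa that have a trait value (so that unmatched taxa are ignored) is an assumption made here for simplicity.

```python
def weighted_mean_profile(abundances, traits):
    """Abundance-weighted mean of each trait column for one community.

    abundances: {taxon: relative abundance}
    traits:     {taxon: {column: value or None}}
    Taxa with no trait record, or a missing value, are skipped and the
    weights are renormalised over the remaining taxa (an assumption).
    """
    sums, weights = {}, {}
    for taxon, abundance in abundances.items():
        for column, value in traits.get(taxon, {}).items():
            if value is None:
                continue
            sums[column] = sums.get(column, 0.0) + abundance * value
            weights[column] = weights.get(column, 0.0) + abundance
    return {c: sums[c] / weights[c] for c in sums if weights[c] > 0}


# Hypothetical community: GenusC has no entry in the trait table.
abundances = {"GenusA": 0.6, "GenusB": 0.3, "GenusC": 0.1}
traits = {
    "GenusA": {"gram_stain_positive": 1.0, "gram_stain_negative": 0.0},
    "GenusB": {"gram_stain_positive": 0.0, "gram_stain_negative": 1.0},
}

profile = weighted_mean_profile(abundances, traits)
for column, value in sorted(profile.items()):
    print(column, round(value, 3))
# gram_stain_negative 0.333
# gram_stain_positive 0.667
```

Other ways of handling unmatched taxa are possible, for example reporting the unassigned abundance as a separate share, and the appropriate choice depends on the analysis.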
Data availability
The encoded BactoTraits dataset (ref. 28) is available at: https://doi.org/10.24396/ORDAR-182. The database was updated on January 28, 2026, and is available at three taxonomic levels: (1) strain level (BACTOTRAITS_database_2026-01-28; each row represents a specific BacDive ID), (2) species level (BACTOTRAITS_database_2026-01-28_SPECIESLVL; each row describes a specific species) and (3) genus level (BACTOTRAITS_database_2026-01-28_GENUSLVL; each row refers to a specific genus). The strain-level version (i.e., BACTOTRAITS_database_2026-01-28) is the most detailed: it includes all information relating to the sequence accession number and the corresponding NCBI tax ID, with multiple matches separated by “|”. In all three datasets, each trait is split into as many columns as it has modalities, and each column name gives the trait name followed by the modality name, separated by an underscore (e.g., “gram_stain_positive” or “gram_stain_negative”).
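The naming conventions described above can be handled as in the following Python sketch, which is independent of the authors' R scripts. The dataset names and the “gram_stain” columns come from the text. The identifier column "bacdive_id", the accession strings and the helper names are hypothetical. Because a trait name can itself contain underscores, the sketch matches column names against a user-supplied list of trait names instead of splitting on the first or last underscore.

```python
LEVEL_SUFFIX = {"strain": "", "species": "_SPECIESLVL", "genus": "_GENUSLVL"}


def dataset_name(level, release="2026-01-28"):
    """Dataset name for a taxonomic level (no file extension is assumed)."""
    return f"BACTOTRAITS_database_{release}{LEVEL_SUFFIX[level]}"


def split_trait_column(column, trait_names):
    """Split 'trait_modality' into (trait, modality) using known trait names."""
    # Longest names first, so 'gram_stain' wins over a hypothetical 'gram'.
    for trait in sorted(trait_names, key=len, reverse=True):
        prefix = trait + "_"
        if column.startswith(prefix) and len(column) > len(prefix):
            return trait, column[len(prefix):]
    return None  # not a trait column, e.g. an identifier column


def group_modalities(columns, trait_names):
    """Map each trait to the list of modalities found among the columns."""
    groups = {}
    for column in columns:
        parsed = split_trait_column(column, trait_names)
        if parsed is not None:
            trait, modality = parsed
            groups.setdefault(trait, []).append(modality)
    return groups


def split_multi(value, sep="|"):
    """Split a multiple-match field (accession numbers, NCBI tax IDs)."""
    return [part.strip() for part in value.split(sep) if part.strip()]


# Hypothetical header and field value, for illustration only.
columns = ["bacdive_id", "gram_stain_positive", "gram_stain_negative"]
print(dataset_name("genus"))
print(group_modalities(columns, ["gram_stain"]))
print(split_multi("ACC0001|ACC0002"))
```

The list of trait names would need to be taken from the dataset documentation. Guessing it from the column names alone is ambiguous whenever trait or modality names contain underscores.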
Code availability
Analyses and figures were produced using the R software (ref. 34; R-4.5.2) with the following packages: BacDive (0.8.0), tidyverse (2.0.0), rrapply (1.2.8), stringr (1.6.0), purrr (1.2.0), readr (2.1.6), progress (1.2.3), conflicted (1.2.0) and genomesizeR (1.0.0.0002). All these libraries and their respective dependencies are provided in the “LIBRARY” folder to ensure future compatibility. The R project and associated scripts are available at: https://doi.org/10.24396/ORDAR-182.
References
1. Violle, C. et al. Let the concept of trait be functional! Oikos 116, 882–892, https://doi.org/10.1111/j.0030-1299.2007.15559.x (2007).
2. Pey, B. et al. A Thesaurus for Soil Invertebrate Trait-Based Approaches. PLoS ONE 9, e108985, https://doi.org/10.1371/journal.pone.0108985 (2014).
3. Krause, S. et al. Trait-based approaches for understanding microbial biodiversity and ecosystem functioning. Front. Microbiol. 5, https://doi.org/10.3389/fmicb.2014.00251 (2014).
4. Echenique-Subiabre, I. et al. Traits determine dispersal and colonization abilities of microbes. Appl Environ Microbiol 91, e02055–24, https://doi.org/10.1128/aem.02055-24 (2025).
5. De Deyn, G. B., Cornelissen, J. H. C. & Bardgett, R. D. Plant functional traits and soil carbon sequestration in contrasting biomes. Ecology Letters 11, 516–531, https://doi.org/10.1111/j.1461-0248.2008.01164.x (2008).
6. Hedde, M., Van Oort, F. & Lamy, I. Functional traits of soil invertebrates as indicators for exposure to soil disturbance. Environmental Pollution 164, 59–65, https://doi.org/10.1016/j.envpol.2012.01.017 (2012).
7. Cébron, A. et al. BactoTraits – A functional trait database to evaluate how natural and man-induced changes influence the assembly of bacterial communities. Ecological Indicators 130, 108047, https://doi.org/10.1016/j.ecolind.2021.108047 (2021).
8. Edwards, K. F., Litchman, E. & Klausmeier, C. A. Functional traits explain phytoplankton community structure and seasonal dynamics in a marine ecosystem. Ecology Letters 16, 56–63, https://doi.org/10.1111/ele.12012 (2013).
9. Martini, S. et al. Functional trait‐based approaches as a common framework for aquatic ecologists. Limnology & Oceanography 66, 965–994, https://doi.org/10.1002/lno.11655 (2021).
10. Festjens, F. et al. Functional trait responses to different anthropogenic pressures. Ecological Indicators 146, 109854, https://doi.org/10.1016/j.ecolind.2022.109854 (2023).
11. Verberk, W. C. E. P., Van Noordwijk, C. G. E. & Hildrew, A. G. Delivering on a promise: integrating species traits to transform descriptive community ecology into a predictive science. Freshwater Science 32, 531–547, https://doi.org/10.1899/12-092.1 (2013).
12. Meyer, A. et al. Morphological vs. DNA metabarcoding approaches for the evaluation of stream ecological status with benthic invertebrates: Testing different combinations of markers and strategies of data filtering. Molecular Ecology 30, 3203–3220, https://doi.org/10.1111/mec.15723 (2021).
13. Monjot, A. et al. Functional diversity of microbial eukaryotes in a meromictic lake: Coupling between metatranscriptomic and a trait‐based approach. Environmental Microbiology 25, 3406–3422, https://doi.org/10.1111/1462-2920.16531 (2023).
14. Beck, M., Billoir, E., Floury, M., Usseglio-Polatera, P. & Danger, M. A 34-year survey under phosphorus decline and warming: Consequences on stoichiometry and functional trait composition of freshwater macroinvertebrate communities. Science of the Total Environment 858, 159786, https://doi.org/10.1016/j.scitotenv.2022.159786 (2023).
15. Söhngen, C., Bunk, B., Podstawka, A., Gleim, D. & Overmann, J. BacDive – the Bacterial Diversity Metadatabase. Nucleic Acids Research 42, D592–D599, https://doi.org/10.1093/nar/gkt1058 (2014).
16. Söhngen, C. et al. BacDive – The Bacterial Diversity Metadatabase in 2016. Nucleic Acids Research 44, D581–D585, https://doi.org/10.1093/nar/gkv983 (2016).
17. Reimer, L. C. et al. BacDive in 2019: bacterial phenotypic data for High-throughput biodiversity analysis. Nucleic Acids Research 47, D631–D636, https://doi.org/10.1093/nar/gky879 (2019).
18. Schober, I. et al. BacDive in 2025: the core database for prokaryotic strain data. Nucleic Acids Research 53, D748–D756, https://doi.org/10.1093/nar/gkae959 (2025).
19. Barberán, A., Caceres Velazquez, H., Jones, S. & Fierer, N. Hiding in Plain Sight: Mining Bacterial Species Records for Phenotypic Trait Information. mSphere 2, e00237–17, https://doi.org/10.1128/mSphere.00237-17 (2017).
20. Madin, J. S. et al. A synthesis of bacterial and archaeal phenotypic trait data. Sci Data 7, https://doi.org/10.1038/s41597-020-0497-4 (2020).
21. Cébron, A., Borreca, A., Beguiristain, T., Biache, C. & Faure, P. Taxonomic and functional trait-based approaches suggest that aerobic and anaerobic soil microorganisms allow the natural attenuation of oil from natural seeps. Sci Rep 12, https://doi.org/10.1038/s41598-022-10850-4 (2022).
22. Djemiel, C. et al. Inferring microbiota functions from taxonomic genes: a review. GigaScience 11, 1–30, https://doi.org/10.1093/gigascience/giab090 (2022).
23. Stoddard, S. F., Smith, B. J., Hein, R., Roller, B. R. K. & Schmidt, T. M. rrnDB: improved tools for interpreting rRNA gene abundance in bacteria and archaea and a new foundation for future development. Nucleic Acids Research 43, D593–D598, https://doi.org/10.1093/nar/gku1201 (2015).
24. Mercier, C., Elleouet, J., Garrett, L. & Wakelin, S. A. genomesizeR: An R package for genome size prediction. bioRxiv, https://doi.org/10.1101/2024.09.08.611926 (2024).
25. NCBI: The National Center for Biotechnology Information (2025).
26. Aßhauer, K. P., Wemheuer, B., Daniel, R. & Meinicke, P. Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data. Bioinformatics 31, 2882–2884, https://doi.org/10.1093/bioinformatics/btv287 (2015).
27. Douglas, G. M. et al. PICRUSt2 for prediction of metagenome functions. Nat Biotechnol 38, 685–688, https://doi.org/10.1038/s41587-020-0548-6 (2020).
28. Cébron, A. et al. BactoTraits. https://doi.org/10.24396/ORDAR-182 (2025).
29. Chevenet, F., Dolédec, S. & Chessel, D. A fuzzy coding approach for the analysis of long-term ecological data. Freshwater Biology 31, 295–309, https://doi.org/10.1111/j.1365-2427.1994.tb01742.x (1994).
30. Quast, C. et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Research 41, D590–D596, https://doi.org/10.1093/nar/gks1219 (2012).
31. Yilmaz, P. et al. The SILVA and “All-species Living Tree Project (LTP)” taxonomic frameworks. Nucleic Acids Research 42, D643–D648, https://doi.org/10.1093/nar/gkt1209 (2014).
32. Borcard, D., Gillet, F. & Legendre, P. Numerical Ecology with R (Springer, New York, 2011).
33. Beauchard, O., Veríssimo, H., Queirós, A. M. & Herman, P. M. J. The use of multiple biological traits in marine community ecology and its potential in ecological indicator development. Ecological Indicators 76, 81–96, https://doi.org/10.1016/j.ecolind.2017.01.011 (2017).
34. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/ (2025).
Acknowledgements
Financial support for the present study came from the APR IMPACTS 2020 for the DiagnoTraits project (2172D0218-A). This work was also supported by the French national program EC2CO (Ecosphère Continentale et Côtière) through the DiagnoBactO project (AT MICROBIOME 2024).
Author information
Authors and Affiliations
Université de Lorraine, CNRS, LIEC, F-57000, Metz, France
Vincent Laderriere, Philippe Usseglio-Polatera & Florence Maunoury-Danger
Université de Lorraine, CNRS, LIEC, F-54000, Nancy, France
Aurélie Cébron
Authors
- Vincent Laderriere
- Philippe Usseglio-Polatera
- Florence Maunoury-Danger
- Aurélie Cébron
Contributions
A.C., P.U.P. and F.M.D. developed the idea and the data collection framework. V.L. compiled most of the data and structured the dataset, building on the previous work of A.C., P.U.P. and F.M.D. All authors contributed to adding and verifying the information included in the dataset. V.L. wrote the R scripts and the first draft of the manuscript and designed the figures. All authors contributed to proofreading the manuscript.
Corresponding author
Correspondence to Aurélie Cébron.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Laderriere, V., Usseglio-Polatera, P., Maunoury-Danger, F. et al. BactoTraits: a trait database for exploring functional diversity of bacterial communities. Sci Data (2026). https://doi.org/10.1038/s41597-026-06652-2
Received: 15 May 2025
Accepted: 19 January 2026
Published: 03 February 2026
DOI: https://doi.org/10.1038/s41597-026-06652-2