Data Descriptor
Published: 21 January 2026
Scientific Data, Article number: (2026)
We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.
Abstract
The narrow-leaf bur-reed (Sparganium angustifolium Michx.), a member of the Typhaceae family, is an ecologically important species in temperate Northern Hemisphere aquatic ecosystems. As a phylogenetically basal taxon within the order Poales, S. angustifolium is critical for understanding evolutionary adaptations to aquatic environments. While molecular mechanisms of aquatic adaptation have been extensively studied in Alismatales, genomic resources for basal Poales remain scarce. Here, we present the first chromosome-level genome assembly of S. angustifolium, comprising 486.52 Mb with 99.94% coverage across 15 chromosomes. The assembly demonstrates high continuity, with both contig and scaffold N50 values reaching 33.25 Mb. Genome annotation identified 23,767 protein-coding genes and revealed that repetitive sequences represent 70.16% of the genome. Our findings provide valuable genomic resources for comparative studies of plant adaptation to aquatic environments across different evolutionary lineages.
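The N50 continuity metric quoted above can be illustrated with a minimal sketch. It assumes the common definition (the length of the sequence at which the running total, summed from longest to shortest, first reaches half of the assembly size). The function name `n50` and the toy lengths are hypothetical and are not taken from the S. angustifolium assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down; N50 is the length at which
    # the running sum first reaches half of the total assembly size.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths in Mb (illustration only, not the real chromosomes).
toy_lengths = [40, 35, 30, 25, 20]
print(n50(toy_lengths))  # 35: 40 + 35 = 75 reaches half of the 150 total
```

The same function applies to contig lengths or scaffold lengths; only the input list changes.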
Data availability
The raw sequence data have been deposited in the Genome Sequence Archive at the National Genomics Data Center under accession number CRA027021 [45]. The survey, HiFi, and Hi-C reads have been deposited in the European Nucleotide Archive under accession numbers ERR15844965 [46] and ERR15844971 [47] (survey reads); ERR15838631 [48], ERR15838536 [49], and ERR15838531 [50] (HiFi reads); and ERR15842105 [51] and ERR15844973 [52] (Hi-C reads). The genome assembly has been deposited in the Genome Warehouse under accession number GWHGEEK00000000.1 [53] and in the European Nucleotide Archive under accession number GCA_977063505 [54]. The genome annotation files are available in the Figshare database (https://doi.org/10.6084/m9.figshare.29375852) [55].
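As a convenience, the read accessions can be turned into resolvable links. The sketch below takes the run accessions as they appear in the European Nucleotide Archive entries of the reference list and builds the same identifiers.org URL form used there. The dictionary name `ENA_RUNS` and the helper `run_url` are hypothetical, and the code only formats strings; it does not contact any archive.

```python
# Run accessions as listed in the reference list entries for the
# European Nucleotide Archive, grouped by library type.
ENA_RUNS = {
    "survey": ["ERR15844965", "ERR15844971"],
    "HiFi": ["ERR15838631", "ERR15838536", "ERR15838531"],
    "Hi-C": ["ERR15842105", "ERR15844973"],
}


def run_url(accession):
    """Build the identifiers.org URL form used in the reference list."""
    return f"https://identifiers.org/insdc.sra:{accession}"


# Print one link per run, labelled with its library type.
for library, runs in ENA_RUNS.items():
    for accession in runs:
        print(library, run_url(accession))
```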
Code availability
All bioinformatics tools and software used for genome assembly, annotation, and data analysis in this study were operated strictly according to their official user manuals, with no custom code employed. Software versions and parameters are comprehensively documented in the Methods section.
References
The Angiosperm Phylogeny Group. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG III. Bot. J. Linn. Soc. 161, 105–121, https://doi.org/10.1111/j.1095-8339.2009.00996.x (2009).
Cook, C. D. K. & Nicholls, M. S. A monographic study of the genus Sparganium. Part 1: Subgenus Xanthosparganium. Bot. Helv. 96, 213–267, https://doi.org/10.5169/seals-67202 (1986).
Chen, L. Y. et al. Phylogenomic analyses of Alismatales shed light into adaptations to aquatic environments. Mol. Biol. Evol. 39, msac079, https://doi.org/10.1093/molbev/msac079 (2022).
Ma, X. et al. Seagrass genomes reveal ancient polyploidy and adaptations to the marine environment. Nat. Plants 10, 240–255, https://doi.org/10.1038/s41477-023-01608-5 (2024).
Givnish, T. J. et al. Assembling the tree of the monocotyledons: Plastome sequence phylogeny and evolution of Poales. Ann. Mo. Bot. Gard. 97, 584–616, https://doi.org/10.3417/2010023 (2010).
Zou, Y. et al. Genomic analysis of the emergent aquatic plant Sparganium stoloniferum provides insights into its clonality, local adaptation and demographic history. Mol. Ecol. Res. 23, 1868–1879, https://doi.org/10.1111/1755-0998.13850 (2023).
Widanagama, S. D., Freeland, J. R., Xu, X. & Shafer, A. B. A. Genome assembly, annotation, and comparative analysis of the cattail Typha latifolia. G3-Genes Genom. Genet. 12, jkab401, https://doi.org/10.1093/g3journal/jkab401 (2022).
Liao, Y. et al. Chromosome-level genome and high nitrogen stress response of the widespread and ecologically important wetland plant Typha angustifolia. Front. Plant Sci. 14, 1138498, https://doi.org/10.3389/fpls.2023.1138498 (2023).
Belton, J. M. et al. Hi-C: a comprehensive technique to capture the conformation of genomes. Methods 58, 268–276, https://doi.org/10.1016/j.ymeth.2012.05.001 (2012).
Marcais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770, https://doi.org/10.1093/bioinformatics/btr011 (2011).
Liu, B. et al. Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects. Preprint at https://arxiv.org/abs/1308.2012 (2013).
Cheng, H. et al. Haplotype-resolved assembly of diploid genomes without parental data. Nat. Biotechnol. 40, 1332–1335, https://doi.org/10.1038/s41587-022-01261-x (2022).
Li, H. New strategies to improve minimap2 alignment accuracy. Bioinformatics 37, 4572–4574, https://doi.org/10.1093/bioinformatics/btab705 (2021).
Guan, D. et al. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics 36, 2896–2898, https://doi.org/10.1093/bioinformatics/btaa025 (2020).
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at https://arxiv.org/abs/1303.3997 (2013).
Zhu, K. et al. Advancing chromosomal-scale, haplotype-resolved genome assembly: beading with Hi-C data. Adv. Biotechnol. 2, 28, https://doi.org/10.1007/s44307-024-00035-7 (2024).
Servant, N. et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 16, 259, https://doi.org/10.1186/s13059-015-0831-x (2015).
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95, https://doi.org/10.1126/science.aal3327 (2017).
Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101, https://doi.org/10.1016/j.cels.2015.07.012 (2016).
Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645, https://doi.org/10.1101/gr.092759.109 (2009).
Saha, S., Bridges, S., Magbanua, Z. V. & Peterson, D. G. Empirical comparison of ab initio repeat finding programs. Nucleic Acids Res. 36, 2284–2294, https://doi.org/10.1093/nar/gkn064 (2008).
Bao, W., Kojima, K. K. & Kohany, O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob. DNA 6, 1–6, https://doi.org/10.1186/s13100-015-0041-9 (2015).
Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580, https://doi.org/10.1093/nar/27.2.573 (1999).
Kim, D. et al. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915, https://doi.org/10.1038/s41587-019-0201-4 (2019).
Trapnell, C. et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515, https://doi.org/10.1038/nbt.1621 (2010).
Trapnell, C. et al. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat. Biotechnol. 31, 46–53, https://doi.org/10.1038/nbt.2450 (2013).
Stanke, M., Diekhans, M., Baertsch, R. & Haussler, D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24, 637–644, https://doi.org/10.1093/bioinformatics/btn013 (2008).
Johnson, A. D. et al. SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap. Bioinformatics 24, 2938–2939, https://doi.org/10.1093/bioinformatics/btn564 (2008).
Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879, https://doi.org/10.1093/bioinformatics/bth315 (2004).
Lomsadze, A., Burns, P. D. & Borodovsky, M. Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm. Nucleic Acids Res. 42, e119, https://doi.org/10.1093/nar/gku557 (2014).
Keilwagen, J. et al. Using intron position conservation for homology-based gene prediction. Nucleic Acids Res. 44, e89, https://doi.org/10.1093/nar/gkw092 (2016).
Keilwagen, J. et al. Combining RNA-seq data and homology-based gene prediction for plants, animals and fungi. BMC Bioinformatics 19, 1–12, https://doi.org/10.1186/s12859-018-2203-5 (2018).
Zou, Y. The chromosome-level genome of Sparganium stoloniferum. figshare. Dataset. https://doi.org/10.6084/m9.figshare.21407208.v1 (2022).
Chen, L. Dataset: Genome assemblies of Typha angustifolia. figshare. Dataset. https://doi.org/10.6084/m9.figshare.25769703.v3 (2024).
NCBI GenBank https://identifiers.org/ncbi/insdc:LODP00000000.1 (2016).
NCBI GenBank https://identifiers.org/ncbi/insdc:JAQOTM000000000.1 (2024).
Haas, B. J. et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666, https://doi.org/10.1093/nar/gkg770 (2003).
Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402, https://doi.org/10.1093/nar/25.17.3389 (1997).
Zdobnov, E. M. & Apweiler, R. InterProScan - an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17, 847–848, https://doi.org/10.1093/bioinformatics/17.9.847 (2001).
Cantalapiedra, C. P. et al. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829, https://doi.org/10.1093/molbev/msab293 (2021).
Huerta-Cepas, J. et al. eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic Acids Res. 47, D309–D314, https://doi.org/10.1093/nar/gky1085 (2019).
Bu, D. et al. KOBAS-i: intelligent prioritization and exploratory visualization of biological functions for gene enrichment analysis. Nucleic Acids Res. 49, W317–W325, https://doi.org/10.1093/nar/gkab447 (2021).
Chan, P. P. et al. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res. 49, 9077–9096, https://doi.org/10.1093/nar/gkab688 (2021).
Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935, https://doi.org/10.1093/bioinformatics/btt509 (2013).
NGDC Genome Sequence Archive https://ngdc.cncb.ac.cn/gsa/browse/CRA027021 (2025).
European Nucleotide Archive https://identifiers.org/insdc.sra:ERR15844965 (2025).
European Nucleotide Archive https://identifiers.org/insdc.sra:ERR15844971 (2025).
European Nucleotide Archive https://identifiers.org/insdc.sra:ERR15838631 (2025).
European Nucleotide Archive https://identifiers.org/insdc.sra:ERR15838536 (2025).
European Nucleotide Archive https://identifiers.org/insdc.sra:ERR15838531 (2025).
European Nucleotide Archive https://identifiers.org/insdc.sra:ERR15842105 (2025).
European Nucleotide Archive https://identifiers.org/insdc.sra:ERR15844973 (2025).
NGDC Genome Warehouse https://bigd.big.ac.cn/gwh/Assembly/98122/show (2025).
European Nucleotide Archive https://identifiers.org/insdc.gca:GCA_977063505 (2025).
Shi, X., Xue, J. & Xu, X. Chromosome-level genome assembly of narrow-leaf bur-reed (Sparganium angustifolium Michx., Typhaceae). figshare. Dataset. https://doi.org/10.6084/m9.figshare.29375852 (2025).
Pedersen, B. S. & Quinlan, A. R. Mosdepth: quick coverage calculation for genomes and exomes. Bioinformatics 34, 867–868, https://doi.org/10.1093/bioinformatics/btx699 (2018).
Faust, G. G. & Hall, I. M. SAMBLASTER: fast duplicate marking and structural variant read extraction. Bioinformatics 30, 2503–2505, https://doi.org/10.1093/bioinformatics/btu314 (2014).
Zeng, X. et al. Chromosome-level scaffolding of haplotype-resolved assemblies using Hi-C data without reference genomes. Nat. Plants 10, 1184–1200, https://doi.org/10.1038/s41477-024-01755-3 (2024).
Acknowledgements
This study was supported by the National Water Pollution Control and Treatment Science and Technology Major Project, China (No. 2015ZX07503005). The calculations in this paper were performed using the supercomputing system at the Supercomputing Center of Wuhan University.
Author information
Authors and Affiliations
National Field Station of Freshwater Ecosystem of Liangzi Lake, College of Life Sciences, Wuhan University, Wuhan, 430072, China
Xiang Shi & Xinwei Xu
Key Laboratory of Vegetation and Environmental Change, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
Jianhua Xue
China National Botanical Garden, Beijing, 100093, China
Jianhua Xue
Authors
- Xiang Shi
- Jianhua Xue
- Xinwei Xu
Contributions
X.X. designed the research; J.X. carried out the field collections; X.S. carried out the experiments and performed the data analysis; X.S., J.X. and X.X. wrote and revised the manuscript. All authors read and approved the manuscript.
Corresponding author
Correspondence to Xinwei Xu.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Shi, X., Xue, J. & Xu, X. Chromosome-level genome assembly of narrow-leaf bur-reed (Sparganium angustifolium Michx., Typhaceae). Sci Data (2026). https://doi.org/10.1038/s41597-026-06640-6
Received: 04 July 2025
Accepted: 15 January 2026
Published: 21 January 2026
DOI: https://doi.org/10.1038/s41597-026-06640-6