Synthetic RNA-seq cohorts for data sharing: a discovery-aware benchmark at transcriptome scale
Background: Sharing patient-level gene expression data is essential for translational discovery but carries documented re-identification risks. Bulk RNA-seq count matrices can retain genotypic signals, and paired clinical metadata compounds this risk through quasi-identifier matching. Synthetic RNA-seq cohorts offer a complementary path for privacy-preserving data sharing, but the field lacks a multi-axis benchmark that probes biological fidelity and empirical privacy risk at transcriptome scale. H...