ParaDISM: Precise mapping of short reads to genes with highly homologous regions
Background: Genes with highly similar genomic copies (paralogs, tandem duplications and pseudogenes) pose a major challenge for short-read high-throughput sequencing (srHTS). High sequence similarity makes it difficult to unambiguously identify the sequences of origin of short reads. This results in misalignment artifacts, which can propagate through bioinformatic pipelines and increase error rates in variant calling.
Results: We present ParaDISM, a pipeline that refines standard alignments to i...