Our journey through low-complexity regions
lh3.github.io

29 September 2025

Those who often inspect structural variants (SVs) have probably noticed that many of them fall in low-complexity regions (LCRs) or tandem repeats, and that these SVs are challenging to call. How many, and how challenging? I could not find a good answer in the literature, so I decided to work with Qian (Alvin) Qin to measure it ourselves.

To start with, we needed to identify LCRs in the human genome. This turned out to be a non-trivial problem. At first, we attempted to use published tandem repeat catalogs. However, because they target short-read tandem repeat calling, existing catalogs contain millions of fragmented regions that do not look highly repetitive. Given the large differences between these catalogs, we did not know how to filter them. Another practical issue is that these catalogs are defined on GRCh38 only and are hard to be repro…
