Polyploids, organisms with more than two copies of their genome, are ubiquitous in the plant kingdom, predominant in agriculture, and important drivers of evolution. Biologists have thus spent significant resources studying these organisms. I was surprised, then, to find that very little work had been done to characterize and estimate linkage disequilibrium (LD) in polyploids.
LD is the statistical association between two alleles at different loci. It is used all over the place in computational biology and population genetics, but the literature is pretty lacking in describing LD in polyploids. So in Gerard (2021a), I provided a series of characterizations of LD in polyploids. I discussed measures that are both "haplotypic" (alleles are on the same haplotype) and "composite" (alleles may be on separate haplotypes). The haplotypic measures are old hat; the composite measures are the real novelty. They are useful because you only need genotype (dosage) information to estimate them, which is a bonus for polyploids since haplotypic phase is really hard to come by with current technologies.
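To make the haplotypic case concrete, here is a minimal sketch of the textbook LD coefficient D and its squared-correlation form r² for two biallelic loci, computed from haplotype and allele frequencies. The function name and the example frequencies are made up for illustration. This is not code from ldsep, and it does not cover the composite measures, whose exact definitions are in Gerard (2021a).

```python
def haplotypic_ld(p_ab, p_a, p_b):
    """Return (D, r2) for two biallelic loci.

    p_ab: frequency of haplotypes carrying both allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    # D is the excess of the AB haplotype over what independence predicts.
    d = p_ab - p_a * p_b
    # r2 standardizes D by the allele-frequency variances at both loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


# Hypothetical frequencies: AB haplotypes are more common than p_a * p_b.
d, r2 = haplotypic_ld(p_ab=0.4, p_a=0.5, p_b=0.5)
print(d, r2)
```

Note that the inputs here are haplotype frequencies, which is exactly the information that is hard to get in polyploids.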
But actually estimating LD in polyploids is tricky because of genotype uncertainty. In polyploids, it is much harder to be sure of your genotypes because of the many different types of heterozygosity, as well as data quirks like allele bias and overdispersion, which are more significant in polyploids. If you aren't sure of your genotypes, and you use estimated genotypes to calculate LD, the LD estimate will be strongly biased toward 0. Statisticians call this "attenuation", and it's a known (120-year-old) result of measurement error.
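Attenuation is easy to see in a toy simulation. The sketch below draws true dosages at two linked loci in an octoploid, then miscalls each genotype by one copy with some probability, and compares the squared dosage correlation before and after. Everything here is an assumption for illustration: the haplotype frequencies, the crude plus-or-minus-one error model, and the function names are mine, and none of this is the model or the estimator from Gerard (2021a, 2021b).

```python
import random

PLOIDY = 8  # octoploid


def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def miscall(dosage, rng, error_rate):
    """Toy error model: shift the dosage by +/-1 with probability error_rate."""
    if rng.random() < error_rate:
        dosage += rng.choice((-1, 1))
    return min(PLOIDY, max(0, dosage))  # keep the call inside 0..PLOIDY


def simulate(n=5000, error_rate=0.5, seed=1):
    """Return (r2 from true dosages, r2 from miscalled dosages)."""
    rng = random.Random(seed)
    # Hypothetical haplotypes (A allele, B allele) and their frequencies,
    # chosen so that the haplotypic r2 is 0.36.
    haps = [(1, 1), (1, 0), (0, 1), (0, 0)]
    freqs = [0.4, 0.1, 0.1, 0.4]
    true_a, true_b, obs_a, obs_b = [], [], [], []
    for _ in range(n):
        # Each individual carries PLOIDY independently drawn haplotypes.
        draw = rng.choices(haps, weights=freqs, k=PLOIDY)
        ga = sum(h[0] for h in draw)
        gb = sum(h[1] for h in draw)
        true_a.append(ga)
        true_b.append(gb)
        obs_a.append(miscall(ga, rng, error_rate))
        obs_b.append(miscall(gb, rng, error_rate))
    return pearson_r2(true_a, true_b), pearson_r2(obs_a, obs_b)


r2_true, r2_obs = simulate()
print("r2 from true dosages:     ", round(r2_true, 3))
print("r2 from miscalled dosages:", round(r2_obs, 3))
```

Because the errors at the two loci are independent of each other, they add noise to each dosage without adding any shared signal, so the correlation computed from the miscalled genotypes shrinks toward 0.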
In Gerard (2021b), I came up with a way to get rid of this bias that's also super fast. Below are LD estimates (y-axis) for a simulated octoploid species, where the true LD is the red dashed line. Blue uses estimated genotypes, black is the maximum likelihood estimate (MLE) from Gerard (2021a), and orange is my new approach from Gerard (2021b).
The software that implements these approaches is on CRAN: https://cran.r-project.org/package=ldsep
Gerard, D. (2021a). Pairwise Linkage Disequilibrium Estimation for Polyploids. Molecular Ecology Resources (in press), p. 1--19. doi:10.1111/1755-0998.13349 | bioRxiv:2020.08.03.234476
Gerard, D. (2021b). Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty. bioRxiv, p. 1--22. doi:10.1101/2021.02.08.430270