R/mom.R
ldest_comp.Rd
Estimates the composite LD between two loci, using either genotype estimates or genotype likelihoods. The resulting measures of LD are generalizations of Burrows' "composite" LD measure.
ldest_comp(
ga,
gb,
K,
pen = 1,
useboot = TRUE,
nboot = 50,
se = TRUE,
model = c("norm", "flex")
)
ga
One of two possible inputs:
A vector of counts, containing the genotypes for each individual at the first locus. When type = "comp", the vector of genotypes may be continuous (e.g. the posterior mean genotype).
A matrix of genotype log-likelihoods at the first locus. The rows index the individuals and the columns index the genotypes. That is, ga[i, j] is the genotype log-likelihood of individual i for genotype j-1.
gb
One of two possible inputs:
A vector of counts, containing the genotypes for each individual at the second locus. When type = "comp", the vector of genotypes may be continuous (e.g. the posterior mean genotype).
A matrix of genotype log-likelihoods at the second locus. The rows index the individuals and the columns index the genotypes. That is, gb[i, j] is the genotype log-likelihood of individual i for genotype j-1.
K
The ploidy of the species. Assumed to be the same for all individuals.
pen
The penalty to be applied to the likelihood. You can think of this as the prior sample size. Should be greater than 1. Does not apply if model = "norm", type = "comp", and using genotype likelihoods. Also does not apply when type = "comp" and using genotypes.
useboot
Should we use bootstrap standard errors (TRUE) or not (FALSE)? Only applicable if using genotype likelihoods and model = "flex".
nboot
The number of bootstrap iterations to use if useboot = TRUE. Only applicable if using genotype likelihoods and model = "flex".
se
A logical. Should we calculate standard errors (TRUE) or not (FALSE)? Calculating standard errors can be very slow when type = "comp", model = "flex", and using genotype likelihoods. Otherwise, standard error calculations should be fast.
model
Should we assume the class of joint genotype distributions is from the proportional bivariate normal (model = "norm") or from the general categorical distribution (model = "flex")? Only applicable if using genotype likelihoods.
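The layout of the genotype log-likelihood inputs described above can be sketched as follows. This is a minimal illustration on hypothetical toy data: the dosage values, the normal error model, and the uniform prior used for the posterior mean are all assumptions of the sketch, not something the function requires.

```r
# Hypothetical toy data: 3 individuals, ploidy K = 4, so K + 1 = 5 columns.
K <- 4
dosage <- c(0, 2, 4)  # assumed "true" genotypes, for illustration only

# Row i holds the log-likelihoods of individual i; column j is genotype j - 1.
ga <- t(sapply(dosage, stats::dnorm, x = 0:K, sd = 1, log = TRUE))
stopifnot(nrow(ga) == length(dosage), ncol(ga) == K + 1)

# The log-likelihood that individual 2 has genotype 3 sits in column 3 + 1.
ga[2, 3 + 1]

# Posterior mean genotype under a uniform prior (an assumption of this sketch):
# exponentiate stably, normalise each row, then average the dosages 0..K.
post <- exp(ga - apply(ga, 1, max))
post <- post / rowSums(post)
pm <- as.vector(post %*% (0:K))
pm
```

A continuous vector such as `pm` is the kind of input the vector form of ga or gb accepts in place of integer genotype counts.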
A vector with some or all of the following elements:
D
The estimate of the LD coefficient.
D_se
The standard error of the estimate of the LD coefficient.
r2
The estimate of the squared Pearson correlation.
r2_se
The standard error of the estimate of the squared Pearson correlation.
r
The estimate of the Pearson correlation.
r_se
The standard error of the estimate of the Pearson correlation.
Dprime
The estimate of the standardized LD coefficient. When type = "comp", this corresponds to the standardization where we fix allele frequencies.
Dprime_se
The standard error of Dprime.
Dprimeg
The estimate of the standardized LD coefficient. This corresponds to the standardization where we fix genotype frequencies.
Dprimeg_se
The standard error of Dprimeg.
z
The Fisher-z transformation of r.
z_se
The standard error of the Fisher-z transformation of r.
p_ab
The estimated haplotype frequency of ab. Only returned if estimating the haplotypic LD.
p_Ab
The estimated haplotype frequency of Ab. Only returned if estimating the haplotypic LD.
p_aB
The estimated haplotype frequency of aB. Only returned if estimating the haplotypic LD.
p_AB
The estimated haplotype frequency of AB. Only returned if estimating the haplotypic LD.
q_ij
The estimated frequency of genotype i at locus 1 and genotype j at locus 2. Only returned if estimating the composite LD.
n
The number of individuals used to estimate pairwise LD.
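To make the relationship between r and z concrete, here is a small sketch of the Fisher-z transformation and how it can be used to build an approximate interval for r. The numeric values are hypothetical stand-ins for the r and r_se elements, and the delta-method formula for the standard error is an assumption of this sketch; the function may compute z_se differently.

```r
# Hypothetical values standing in for the r and r_se elements of the output.
r <- 0.3
r_se <- 0.1

# Fisher-z transformation: z = atanh(r) = 0.5 * log((1 + r) / (1 - r)).
z <- atanh(r)

# Delta-method standard error of z (an assumption of this sketch).
z_se <- r_se / (1 - r^2)

# Approximate 95% interval on the z scale, mapped back to the r scale.
ci_z <- z + c(-1, 1) * stats::qnorm(0.975) * z_se
ci_r <- tanh(ci_z)
ci_r
```

Working on the z scale keeps the back-transformed interval inside (-1, 1), which a symmetric interval built directly around r does not guarantee.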
set.seed(1)
n <- 100 # sample size
K <- 6 # ploidy
## generate some fake genotypes when LD = 0.
ga <- stats::rbinom(n = n, size = K, prob = 0.5)
gb <- stats::rbinom(n = n, size = K, prob = 0.5)
head(ga)
#> [1] 2 3 3 5 2 5
head(gb)
#> [1] 3 3 2 6 3 2
## generate some fake genotype likelihoods when LD = 0.
gamat <- t(sapply(ga, stats::dnorm, x = 0:K, sd = 1, log = TRUE))
gbmat <- t(sapply(gb, stats::dnorm, x = 0:K, sd = 1, log = TRUE))
head(gamat)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] -2.918939 -1.418939 -0.9189385 -1.4189385 -2.918939 -5.4189385 -8.918939
#> [2,] -5.418939 -2.918939 -1.4189385 -0.9189385 -1.418939 -2.9189385 -5.418939
#> [3,] -5.418939 -2.918939 -1.4189385 -0.9189385 -1.418939 -2.9189385 -5.418939
#> [4,] -13.418939 -8.918939 -5.4189385 -2.9189385 -1.418939 -0.9189385 -1.418939
#> [5,] -2.918939 -1.418939 -0.9189385 -1.4189385 -2.918939 -5.4189385 -8.918939
#> [6,] -13.418939 -8.918939 -5.4189385 -2.9189385 -1.418939 -0.9189385 -1.418939
head(gbmat)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] -5.418939 -2.918939 -1.4189385 -0.9189385 -1.418939 -2.918939 -5.4189385
#> [2,] -5.418939 -2.918939 -1.4189385 -0.9189385 -1.418939 -2.918939 -5.4189385
#> [3,] -2.918939 -1.418939 -0.9189385 -1.4189385 -2.918939 -5.418939 -8.9189385
#> [4,] -18.918939 -13.418939 -8.9189385 -5.4189385 -2.918939 -1.418939 -0.9189385
#> [5,] -5.418939 -2.918939 -1.4189385 -0.9189385 -1.418939 -2.918939 -5.4189385
#> [6,] -2.918939 -1.418939 -0.9189385 -1.4189385 -2.918939 -5.418939 -8.9189385
## Composite LD with genotypes
ldout1 <- ldest_comp(ga = ga,
                     gb = gb,
                     K = K)
head(ldout1)
#> D D_se r2 r2_se r r_se
#> 0.0044612795 0.0221319876 0.0004064944 0.0040307019 0.0201617053 0.0999593506
## Composite LD with genotype likelihoods
ldout2 <- ldest_comp(ga = gamat,
                     gb = gbmat,
                     K = K,
                     se = FALSE,
                     model = "flex")
head(ldout2)
#> D D_se r2 r2_se r r_se
#> 0.008882068 NA 0.018772196 NA 0.137011665 NA
## Composite LD with genotype likelihoods and proportional bivariate normal
ldout3 <- ldest_comp(ga = gamat,
                     gb = gbmat,
                     K = K,
                     model = "norm")
head(ldout3)
#> D D_se r2 r2_se r r_se
#> 0.02582342 0.01216745 0.67448602 0.26764405 0.82127098 0.16294503
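As a rough sanity check on the genotype-based call, the composite measures can be approximated with plain sample moments. This sketch assumes that D is the genotype covariance divided by the ploidy and that r is the Pearson correlation of the genotypes; this page does not confirm those definitions, and the function's estimates may differ slightly (for example in the denominator used for the covariance). The names D_mom, r_mom, and r2_mom are introduced here for illustration only.

```r
# Regenerate the same fake genotypes as in the examples above.
set.seed(1)
n <- 100 # sample size
K <- 6   # ploidy
ga <- stats::rbinom(n = n, size = K, prob = 0.5)
gb <- stats::rbinom(n = n, size = K, prob = 0.5)

# Assumed moment-based definitions (not confirmed by this page):
D_mom <- stats::cov(ga, gb) / K  # covariance scaled by the ploidy
r_mom <- stats::cor(ga, gb)      # Pearson correlation of the genotypes
r2_mom <- r_mom^2                # squared correlation

c(D = D_mom, r2 = r2_mom, r = r_mom)
```

Comparing these values with the D, r2, and r elements of ldout1 is a quick way to see whether the moment-based reading of the composite measures holds for a given data set.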