Simulator for genotype likelihoods.

Uses the updog R package for simulating read counts and generating genotype log-likelihoods.

simgl(
  nvec,
  rdepth = 10,
  od = 0.01,
  bias = 1,
  seq = 0.01,
  ret = c("gl", "gp", "all"),
  est = FALSE,
  ...
)

Arguments

nvec: The genotype counts. nvec[k] contains the number of individuals with genotype k-1.
rdepth: The read depth. Lower means more uncertain.
od: The overdispersion parameter. Higher means more uncertain.
bias: The allele bias parameter. Further from 1 means more bias. Must greater than 0.
seq: The sequencing error rate. Higher means more uncertain.
ret: The return type. Should we just return the genotype likelihoods ("gl"), just the genotype posteriors ("gp"), or the entire updog output ("all")
est: A logical. Estimate the updog likelihood parameters while genotype (TRUE) or fix them at the true values (FALSE)? More realistic simulations would set this to TRUE, but it makes the method much slower.
...: Additional arguments to pass to flexdog_full().

Value

By default, a matrix. The genotype (natural) log likelihoods. The rows index the individuals and the columns index the dosage. So gl[i,j] is the genotype log-likelihood for individual i at dosage j - 1.

Author

David Gerard

Examples

set.seed(1)
simgl(c(1, 2, 1, 0, 0), model = "norm", est = TRUE)
#>             [,1]      [,2]      [,3]       [,4]      [,5]
#> [1,]  -0.0527924 -4.370039 -9.577024 -17.181178 -43.85351
#> [2,] -17.2757020 -1.821025 -1.731916  -4.231294 -19.60222
#> [3,]  -6.2869622 -1.796049 -4.883182 -10.467745 -32.96015
#> [4,] -13.4332263 -1.434198 -2.403487  -5.939004 -23.82247
simgl(c(1, 2, 1, 0, 0), model = "norm", est = FALSE)
#>              [,1]      [,2]      [,3]       [,4]      [,5]
#> [1,]  -0.09618742 -2.798880 -6.516204 -12.512266 -31.31682
#> [2,]  -4.94789634 -1.315725 -3.006407  -7.128083 -22.82756
#> [3,]  -4.94789634 -1.315725 -3.006407  -7.128083 -22.82756
#> [4,] -10.22692932 -1.899873 -1.614191  -3.834738 -16.08930