I have put an R package on GitHub that implements some of my ideas on removing unwanted variation.
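The setting is the linear model
Y = XB + ZA + E,
where Y is the response matrix (for example, gene expression), X contains the observed covariates, Z contains hidden covariates, B and A are the corresponding coefficient matrices, and E is noise.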
Not accounting for the hidden covariates, Z, can reduce power and
result in poor control of the false discovery rate. The vicar package
provides a suite of functions to adjust for hidden confounders, both
when one has access to control genes and when one does not.
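To make the cost of ignoring Z concrete, here is a small simulation sketch. It is not part of vicar and uses only the Python standard library. The assumptions are mine, chosen for illustration: a single observed covariate, a single hidden covariate correlated with it, unit-variance Gaussian noise, and B = 0 for every gene, so any nonzero estimate is pure error. The function names are hypothetical.

```python
import random

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def simulate(n=200, p=50, seed=1):
    """Draw Y = XB + ZA + E with one observed and one hidden covariate, B = 0."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]        # hidden covariate Z
    x = [zi + 0.5 * rng.gauss(0, 1) for zi in z]   # observed covariate X, correlated with Z
    a = [rng.gauss(0, 1) for _ in range(p)]        # effects A of the hidden covariate
    # B = 0 for every gene, so the XB term vanishes.
    Y = [[z[i] * a[j] + rng.gauss(0, 1) for j in range(p)] for i in range(n)]
    return x, z, Y

def slope_ignoring_z(x, y):
    """Least-squares coefficient of y on x alone (the model has no intercept)."""
    return dot(x, y) / dot(x, x)

def slope_adjusting_for_z(x, z, y):
    """Coefficient on x from the least-squares fit of y on both x and z."""
    sxx, szz, sxz = dot(x, x), dot(z, z), dot(x, z)
    return (szz * dot(x, y) - sxz * dot(z, y)) / (sxx * szz - sxz ** 2)

x, z, Y = simulate()
columns = [[row[j] for row in Y] for j in range(len(Y[0]))]  # one list per gene

naive = [slope_ignoring_z(x, y) for y in columns]
oracle = [slope_adjusting_for_z(x, z, y) for y in columns]

# The true effect is 0 for every gene, so the mean absolute estimate is the mean error.
naive_err = sum(abs(b) for b in naive) / len(naive)
oracle_err = sum(abs(b) for b in oracle) / len(oracle)
print(f"mean absolute error ignoring Z:      {naive_err:.3f}")
print(f"mean absolute error adjusting for Z: {oracle_err:.3f}")
```

The second fit is an oracle, since in practice Z is not observed. Confounder adjustment methods try to get close to it by estimating Z from the data.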
backwash can adjust for hidden
confounding when one does not have access to control genes. They do so
via non-parametric empirical Bayes methods that use the powerful
methodology of Adaptive SHrinkage (Stephens 2016) within the
factor-augmented regression framework described in Wang et al. (2015).
backwash is a slightly more Bayesian version of
When one has control genes, many approaches are available, including RUV2 (Gagnon-Bartsch and Speed 2012), RUV4 (Gagnon-Bartsch, Jacob, and Speed 2013), and CATE (Wang et al. 2015). This package adds to the field of confounder adjustment with control genes by
Many of these ideas are described in Gerard and Stephens (2017).
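The basic control-gene idea can be sketched in a few lines. The following is a toy illustration in the spirit of RUV2: estimate the hidden covariate by factor analysis on the control genes (where the effect of X is known to be zero), then include that estimate in the regression for the remaining genes. It is not the vicar implementation, and it rests on simplifying assumptions of my own: one observed covariate, exactly one hidden factor whose number is treated as known, and a first principal component (computed by power iteration) standing in for a full factor analysis. All function names are hypothetical.

```python
import random

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def first_factor(Yc, iters=50):
    """Leading left singular vector of Yc (n x m), by power iteration on Yc Yc^T."""
    n, m = len(Yc), len(Yc[0])
    u = [1.0] * n
    for _ in range(iters):
        w = [sum(Yc[i][j] * u[i] for i in range(n)) for j in range(m)]  # Yc^T u
        u = [dot(Yc[i], w) for i in range(n)]                           # Yc w
        norm = dot(u, u) ** 0.5
        u = [ui / norm for ui in u]
    return u

def coef_on_x(x, z, y):
    """Coefficient on x from the least-squares fit of y on both x and z."""
    sxx, szz, sxz = dot(x, x), dot(z, z), dot(x, z)
    return (szz * dot(x, y) - sxz * dot(z, y)) / (sxx * szz - sxz ** 2)

rng = random.Random(7)
n, p, n_ctl = 100, 200, 50                        # genes 0..49 are control genes
z = [rng.gauss(0, 1) for _ in range(n)]           # hidden covariate, never used for fitting
x = [zi + 0.5 * rng.gauss(0, 1) for zi in z]      # observed covariate, correlated with z
a = [rng.gauss(0, 1) for _ in range(p)]           # effects of the hidden covariate
b = [0.0 if j < n_ctl else rng.gauss(0, 1) for j in range(p)]  # controls have no x effect
Y = [[x[i] * b[j] + z[i] * a[j] + rng.gauss(0, 1) for j in range(p)] for i in range(n)]

# Step 1: estimate the hidden covariate from the control genes only.
z_hat = first_factor([row[:n_ctl] for row in Y])

# Step 2: for each non-control gene, regress on x with and without the estimate.
naive_err = adjusted_err = 0.0
for j in range(n_ctl, p):
    y = [row[j] for row in Y]
    naive_err += abs(dot(x, y) / dot(x, x) - b[j])
    adjusted_err += abs(coef_on_x(x, z_hat, y) - b[j])
naive_err /= p - n_ctl
adjusted_err /= p - n_ctl

print(f"mean absolute error, no adjustment:    {naive_err:.3f}")
print(f"mean absolute error, control-gene fit: {adjusted_err:.3f}")
```

The estimated factor is only identified up to sign and scale, which does not matter here because only the coefficient on x is reported.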
Gagnon-Bartsch, Johann A, and Terence P Speed. 2012. Using Control Genes to Correct for Unwanted Variation in Microarray Data. Biostatistics 13 (3). Biometrika Trust: 539–52. doi:10.1093/biostatistics/kxr034.
Gagnon-Bartsch, Johann, Laurent Jacob, and Terence Speed. 2013. Removing Unwanted Variation from High Dimensional Data with Negative Controls. Technical Report 820, Department of Statistics, University of California, Berkeley. http://statistics.berkeley.edu/tech-reports/820.
Gerard, David, and Matthew Stephens. 2017. Unifying and Generalizing Methods for Removing Unwanted Variation Based on Negative Controls. ArXiv Preprint ArXiv:1705.08393. https://arxiv.org/abs/1705.08393.
Stephens, Matthew. 2016. False Discovery Rates: A New Deal. Biostatistics. doi:10.1093/biostatistics/kxw041.
Wang, Jingshu, Qingyuan Zhao, Trevor Hastie, and Art B Owen. 2015. Confounder Adjustment in Multiple Hypothesis Testing. ArXiv Preprint ArXiv:1508.04178. https://arxiv.org/abs/1508.04178.