Here is a list of the equations I want you to have memorized off the top of your head.
Sample Mean: \(\bar{Y} = \frac{1}{n}\sum_{i=1}^nY_i\).
Sample variance: \(s_y^2 = \frac{1}{n-1}\sum_{i=1}^n(Y_i - \bar{Y})^2\).
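A quick numerical sketch of these two formulas on made-up numbers (assuming numpy is available); note the \(1/(n-1)\) divisor, which corresponds to `ddof=1` in numpy:

```python
import numpy as np

# Made-up sample to check the sample mean and sample variance formulas.
y = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = len(y)

ybar = y.sum() / n                      # sample mean: (1/n) * sum(Y_i)
s2 = ((y - ybar) ** 2).sum() / (n - 1)  # sample variance: 1/(n-1) divisor

# numpy agrees when told to use the sample (not population) divisor.
assert np.isclose(ybar, y.mean())
assert np.isclose(s2, y.var(ddof=1))
```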
SLR Model: \(Y_i = \beta_0 + \beta_1 X_i + \epsilon_i\), \(E[\epsilon_i] = 0\), \(var(\epsilon_i) = \sigma^2\), \(cor(\epsilon_i, \epsilon_j) = 0\) for \(i\neq j\).
Normal SLR Model: \(Y_i = \beta_0 + \beta_1 X_i + \epsilon_i\), \(\epsilon_i \overset{iid}{\sim} N(0, \sigma^2)\).
OLS Objective: \(\sum_{i=1}^n\left[Y_i - (\beta_0 + \beta_1X_i)\right]^2\)
SLR OLS estimates: \(\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X}\), \(\hat{\beta}_1 = cor(X, Y)\frac{sd(Y)}{sd(X)}\).
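A sketch of the closed-form estimates on hypothetical data, checked against a direct least-squares solve on the design matrix \([1, X]\):

```python
import numpy as np

# Hypothetical data to illustrate the closed-form SLR estimates.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

r = np.corrcoef(x, y)[0, 1]                 # sample correlation cor(X, Y)
b1 = r * y.std(ddof=1) / x.std(ddof=1)      # slope: cor(X, Y) * sd(Y)/sd(X)
b0 = y.mean() - b1 * x.mean()               # intercept: Ybar - b1 * Xbar

# Same answer as least squares on the design matrix [1, X].
X = np.column_stack([np.ones_like(x), x])
b_ls = np.linalg.lstsq(X, y, rcond=None)[0]
assert np.allclose([b0, b1], b_ls)
```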
Fitted values: \(\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i\)
Residuals: \(e_i = Y_i - \hat{Y}_i\).
Properties of fitted regression line: (i) mean of residuals is 0, (ii) mean of observed values equals mean of fitted values, (iii) residuals are uncorrelated with predictors, (iv) residuals are uncorrelated with fitted values, and (v) regression line always goes through mean \((\bar{X}, \bar{Y})\).
MSE: \(MSE = \frac{1}{n-p}\sum_{i=1}^n\left[Y_i - \hat{Y}_i\right]^2\), where \(p=2\) in SLR.
\(t\)-statistic: \(t^* = \frac{\hat{\beta}_1}{s(\hat{\beta}_1)}\), which follows a \(t_{n-p}\) distribution under the null that \(\beta_1 = 0\).
Get two-sided \(p\)-value manually via \(2 * pt(q = -abs(t^*), df = n - p)\), where \(p=2\) in SLR.
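The same computation can be mirrored in Python (assuming scipy is available, with `scipy.stats.t` playing the role of R's `pt`); the data here are made up, and \(s(\hat{\beta}_1) = \sqrt{MSE/S_{xx}}\) is the standard SLR formula:

```python
import numpy as np
from scipy import stats

# Slope t-test by hand; mirrors R's 2 * pt(q = -abs(t*), df = n - p).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 2.1, 2.6, 4.3, 4.9, 6.4])
n, p = len(x), 2

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
mse = resid @ resid / (n - p)
se_b1 = np.sqrt(mse / ((x - x.mean()) ** 2).sum())  # s(b1) = sqrt(MSE / Sxx)

t_star = beta[1] / se_b1
p_value = 2 * stats.t.cdf(-abs(t_star), df=n - p)   # two-sided p-value
assert 0 < p_value < 1
```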
Confidence interval = estimate \(\pm\) multiplier \(\times\) standard error
\(SSE = \sum_{i=1}^n(Y_i - \hat{Y}_i)^2\), with \(df\) of \(n - p\)
\(SSR = \sum_{i=1}^n(\hat{Y}_i - \bar{Y})^2\), with \(df\) of \(p - 1\)
\(SSTO = \sum_{i=1}^n(Y_i - \bar{Y})^2\), with \(df\) of \(n-1\)
\(SSTO = SSE + SSR\)
\(F^* = \frac{[SSE(R) - SSE(F)]/(df_R - df_F)}{SSE(F)/df_F}\), which follows an \(F(df_R - df_F, df_F)\) distribution under the null of the reduced model.
\(R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO}\)
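A numerical check of the decomposition \(SSTO = SSE + SSR\) and both forms of \(R^2\) on made-up data (the identity holds for OLS fits that include an intercept):

```python
import numpy as np

# Made-up data; fit SLR and verify SSTO = SSE + SSR and R^2 identities.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
y = np.array([1.5, 2.2, 4.1, 4.0, 5.9, 6.1, 8.2])

X = np.column_stack([np.ones_like(x), x])
yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]

sse = ((y - yhat) ** 2).sum()       # error sum of squares
ssr = ((yhat - y.mean()) ** 2).sum()  # regression sum of squares
ssto = ((y - y.mean()) ** 2).sum()  # total sum of squares

assert np.isclose(ssto, sse + ssr)
assert np.isclose(ssr / ssto, 1 - sse / ssto)   # two forms of R^2 agree
```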
Interpretations:
Bonferroni corrected \(p\)-values: unadjusted \(p\)-value \(\times\) number of tests.
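A minimal sketch of the correction on hypothetical p-values; the adjusted values are capped at 1 (as R's `p.adjust` does), since a probability cannot exceed 1:

```python
import numpy as np

# Bonferroni: multiply each unadjusted p-value by the number of tests,
# capping at 1.
p_raw = np.array([0.004, 0.03, 0.2])   # hypothetical unadjusted p-values
m = len(p_raw)                         # number of tests
p_bonf = np.minimum(p_raw * m, 1.0)
```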
Matrix formulation of regression: \(Y = X\beta + \epsilon\), OLS solution \(\hat{\beta} = (X^\top X)^{-1}X^\top Y\), hat matrix \(H = X(X^\top X)^{-1}X^\top\) with \(\hat{Y} = HY\).
Multiple linear regression model: \[\begin{align} Y_i &= \beta_0 + \beta_1X_{i1} + \beta_2X_{i2} + \cdots + \beta_{p-1}X_{i,p-1} + \epsilon_i\\ E[\epsilon_i] &= 0\\ var(\epsilon_i) &= \sigma^2\\ cov(\epsilon_i, \epsilon_j) &= 0 \text{ for all } i \neq j \end{align}\]
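A sketch of the matrix-form OLS solution \(\hat{\beta} = (X^\top X)^{-1}X^\top Y\) on simulated data with known coefficients:

```python
import numpy as np

# Simulated multiple regression with true coefficients (1, 2, -0.5).
rng = np.random.default_rng(0)
n = 50
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.1, size=n)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n), x1, x2])
betahat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'Y

# With small noise, the estimates land close to the truth.
assert np.allclose(betahat, [1.0, 2.0, -0.5], atol=0.1)
```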
Quadratic Regression.
Indicator variables, how they show up in a design matrix.
Interaction Effects.
The error sum of squares given predictors \(X_1, X_2, \ldots, X_{p-1}\). \[ SSE(X_1,X_2,\ldots,X_{p-1}) = \sum_{i=1}^n\left[Y_i - (\hat{\beta}_0 + \hat{\beta}_1X_{i1} + \hat{\beta}_2X_{i2} + \cdots + \hat{\beta}_{p-1}X_{i,p-1})\right]^2 \]
The extra sum of squares \[\begin{align} SSR(X_1|X_2) &= SSE(X_2) - SSE(X_1, X_2) = SSR(X_1, X_2) - SSR(X_2)\\ SSR(X_2|X_1) &= SSE(X_1) - SSE(X_1, X_2) = SSR(X_1, X_2) - SSR(X_1)\\ SSR(X_1, X_2|X_3) &= SSE(X_3) - SSE(X_1, X_2, X_3) = SSR(X_1, X_2, X_3) - SSR(X_3)\\ \text{ etc...} \end{align}\]
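A numerical check of the extra-sums-of-squares identities on simulated data (the helper `sse` is a hypothetical convenience, not part of any library):

```python
import numpy as np

# Simulated data with two predictors.
rng = np.random.default_rng(1)
n = 40
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 0.5 + x1 + 0.3 * x2 + rng.normal(size=n)
ssto = ((y - y.mean()) ** 2).sum()

def sse(*cols):
    """SSE from regressing y on an intercept plus the given predictors."""
    X = np.column_stack([np.ones(n), *cols])
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return ((y - yhat) ** 2).sum()

def ssr(*cols):
    return ssto - sse(*cols)

# SSR(X1|X2) computed both ways agrees:
assert np.isclose(sse(x2) - sse(x1, x2), ssr(x1, x2) - ssr(x2))
# Sequential decomposition: SSR(X1, X2) = SSR(X1) + SSR(X2|X1).
assert np.isclose(ssr(x1, x2), ssr(x1) + (sse(x1) - sse(x1, x2)))
```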
Decomposing sums of squares (with corresponding degrees of freedom), e.g. \[ SSR(X_1, X_2, X_3) = SSR(X_1) + SSR(X_2|X_1) + SSR(X_3|X_1, X_2), \] with \(df\) of \(3 = 1 + 1 + 1\).
Type I versus Type II sums of squares.
How the \(F\)-test can be used for different hypothesis tests.
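One sketch of the nested-model \(F\)-test using the \(F^*\) formula above: reduced model with \(X_1\) only versus full model with \(X_1\) and \(X_2\), on simulated data (assuming scipy is available, with `scipy.stats.f` in the role of R's `pf`):

```python
import numpy as np
from scipy import stats

# Simulated data where beta2 really is nonzero.
rng = np.random.default_rng(2)
n = 30
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 1.5 * x2 + rng.normal(size=n)

def fit_sse(X):
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return ((y - yhat) ** 2).sum()

X_red = np.column_stack([np.ones(n), x1])        # reduced: H0 says beta2 = 0
X_full = np.column_stack([np.ones(n), x1, x2])   # full model
df_r, df_f = n - 2, n - 3

f_star = ((fit_sse(X_red) - fit_sse(X_full)) / (df_r - df_f)) \
         / (fit_sse(X_full) / df_f)
p_value = stats.f.sf(f_star, df_r - df_f, df_f)  # upper-tail probability
```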
Adjusted coefficient of multiple determination: \[\begin{align} R^2_a = 1 - \left(\frac{n-1}{n-p}\right)\frac{SSE}{SSTO}. \end{align}\]
Coefficients of partial determination \[\begin{align} R^2_{Y1|23} &= \frac{SSR(X_1|X_2, X_3)}{SSE(X_2, X_3)}\\ R^2_{Y2|13} &= \frac{SSR(X_2|X_1, X_3)}{SSE(X_1, X_3)}\\ R^2_{Y3|12} &= \frac{SSR(X_3|X_1, X_2)}{SSE(X_1, X_2)}\\ R^2_{Y4|123} &= \frac{SSR(X_4|X_1, X_2, X_3)}{SSE(X_1, X_2, X_3)}\\ &\text{etc...} \end{align}\]
\(Z\)-score \[ Z_i = \frac{X_i - \bar{X}}{s_x} \]
AIC: Akaike’s Information Criterion \[ AIC = n\log\left(\frac{SSE}{n}\right) + 2p \]
BIC: Bayesian Information Criterion \[ BIC = n\log\left(\frac{SSE}{n}\right) + \log(n)p \]
Mallows \(C_p\): \[ C_p = p + (n-p)\frac{\hat{\sigma}^2 - \hat{\sigma}^2_{full}}{\hat{\sigma}^2_{full}}, \] where \(\hat{\sigma}^2 = SSE/(n-p)\) is the candidate model's MSE and \(\hat{\sigma}^2_{full}\) is the MSE of the model with all predictors.
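The three criteria above, written as small helper functions (hypothetical names); a built-in sanity check is that for the full model itself \(\hat{\sigma}^2 = \hat{\sigma}^2_{full}\), so \(C_p = p\):

```python
import numpy as np

# Model-selection criteria, directly from the formulas above.
def aic(sse, n, p):
    return n * np.log(sse / n) + 2 * p

def bic(sse, n, p):
    return n * np.log(sse / n) + np.log(n) * p

def mallows_cp(sse, n, p, sigma2_full):
    sigma2 = sse / (n - p)              # candidate model's MSE
    return p + (n - p) * (sigma2 - sigma2_full) / sigma2_full

# Hypothetical full-model fit: for it, C_p reduces to exactly p.
n, p_full, sse_full = 50, 4, 92.0
sigma2_full = sse_full / (n - p_full)
assert np.isclose(mallows_cp(sse_full, n, p_full, sigma2_full), p_full)
```

Note that AIC and BIC share the \(n\log(SSE/n)\) term and differ only in the penalty: \(2p\) versus \(\log(n)p\), so BIC penalizes extra parameters more heavily whenever \(n > e^2 \approx 7.4\).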
The leverage value: \(h_{ii}\) is the \(i\)th diagonal element of the hat matrix.
The studentized residual \[ r_i = \frac{e_i}{\sqrt{MSE(1 - h_{ii})}} \]
Cook’s Distance: \[ D_i = \frac{\sum_{j=1}^n\left(\hat{Y}_j - \hat{Y}_{j(i)}\right)^2}{pMSE} \]
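The three diagnostics can be computed directly from the hat matrix on simulated data; the Cook's distance line below uses the algebraically equivalent form \(D_i = \frac{r_i^2}{p}\cdot\frac{h_{ii}}{1 - h_{ii}}\):

```python
import numpy as np

# Simulated SLR data for computing the diagnostics by hand.
rng = np.random.default_rng(3)
n, p = 25, 2
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix H = X (X'X)^{-1} X'
h = np.diag(H)                          # leverage values h_ii
e = y - H @ y                           # residuals
mse = (e @ e) / (n - p)

r = e / np.sqrt(mse * (1 - h))          # studentized residuals
D = r ** 2 * h / (p * (1 - h))          # Cook's distance (equivalent form)

assert np.isclose(h.sum(), p)           # trace(H) = number of coefficients
assert np.all(D >= 0)
```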
What are good values of leverage, studentized residuals, and Cook's distance?