Measures of Central Tendency

What are the 4 measures of variation?

When the data distribution is symmetrical, then

mean ≠ median

mean = median

mode = midrange = midhinge

mode ≠ midrange ≠ midhinge
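A quick numeric check can help when choosing among the options above. The sketch below uses two made-up samples (the values are illustrative assumptions, not data from these notes) and compares the mean, median and midrange of a symmetric sample with those of a skewed one.

```python
from statistics import mean, median

def midrange(values):
    """Midpoint between the smallest and largest value."""
    return (min(values) + max(values)) / 2

# Hypothetical samples chosen for illustration only.
symmetric = [1, 2, 3, 4, 5, 6, 7]    # mirror image around 4
skewed = [1, 2, 3, 4, 5, 6, 70]      # one large value pulls the mean up

sym_mean, sym_median, sym_midrange = mean(symmetric), median(symmetric), midrange(symmetric)
skew_mean, skew_median = mean(skewed), median(skewed)

print("symmetric:", sym_mean, sym_median, sym_midrange)
print("skewed:   ", skew_mean, skew_median)
```

In the symmetric sample the mean, median and midrange coincide; in the skewed sample the mean moves away from the median.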

What is the empirical rule?

For a normal distribution, about 68% of values lie within $$1\sigma$$ of the mean

about 95% within $$2\sigma$$

about 99.7% within $$3\sigma$$
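These percentages can be recovered from the standard normal cumulative distribution function. The following is a minimal sketch using Python's standard library; the helper name `fraction_within` is an illustrative assumption.

```python
from statistics import NormalDist

def fraction_within(k):
    """Probability that a normal variable falls within k standard deviations of its mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Coverage for 1, 2 and 3 standard deviations.
coverage = {k: fraction_within(k) for k in (1, 2, 3)}
for k, frac in coverage.items():
    print(f"within {k} sigma: {frac:.4f}")
```

The rule applies to data that are approximately normally distributed; for strongly skewed data these fractions can differ noticeably.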

What is a Type I error?

rejecting a null hypothesis that is actually true (a false positive)

What is a Type II error?

incorrectly retaining a null hypothesis that is actually false (a false negative)
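Both error types can be illustrated with a small simulation. The sketch below assumes a two-sided z-test of the null hypothesis "mean = 0" with known standard deviation; the sample size, significance level, alternative mean of 0.5 and random seed are all arbitrary choices made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

ALPHA = 0.05      # significance level (assumed)
N = 30            # observations per simulated experiment (assumed)
SIGMA = 1.0       # known population standard deviation (assumed)
TRIALS = 2000     # number of simulated experiments
CRITICAL_Z = NormalDist().inv_cdf(1 - ALPHA / 2)

def rejects_null(true_mean, rng):
    """Draw one sample and run a two-sided z-test of H0: mean = 0."""
    sample = [rng.gauss(true_mean, SIGMA) for _ in range(N)]
    z = mean(sample) / (SIGMA / sqrt(N))
    return abs(z) > CRITICAL_Z

rng = random.Random(0)

# Type I error: H0 is true (mean really is 0) but the test rejects it.
type_i_rate = sum(rejects_null(0.0, rng) for _ in range(TRIALS)) / TRIALS

# Type II error: H0 is false (mean is 0.5) but the test fails to reject it.
type_ii_rate = sum(not rejects_null(0.5, rng) for _ in range(TRIALS)) / TRIALS

print(f"Type I rate:  {type_i_rate:.3f}")
print(f"Type II rate: {type_ii_rate:.3f}")
```

The Type I rate should land near the chosen significance level, while the Type II rate depends on the sample size and on how far the true mean is from the null value.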

What is $$R^2$$?

The coefficient of determination. Multiplied by 100, it gives the percentage of variation in the outcome that is explained by the model
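As a worked illustration, the quantity can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The observed and predicted values below are made up for the example (the predictions come from the least-squares line y = 2.2 + 0.6x fitted to x = 1..5), and the function name `r_squared` is an illustrative assumption.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total

# Hypothetical outcome values and the model's predictions for them.
observed = [2.0, 4.0, 5.0, 4.0, 5.0]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

r2 = r_squared(observed, predicted)
print(f"R^2 = {r2:.2f}, i.e. {100 * r2:.0f}% of the variation explained")
```

Here the total sum of squares is 6 and the residual sum of squares is 2.4, so the model accounts for 60% of the variation in the outcome.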

What are the 4 basic assumptions when performing multiple regression?

• No (perfect) multicollinearity: there should be no perfect linear relationship between two or more of the predictors. The variance inflation factor (VIF) can be used to assess multicollinearity: it indicates which independent variable(s) contribute to it by measuring how strongly each predictor is linearly related to the others. A common rule of thumb treats a VIF greater than 10 as a sign of problematic multicollinearity, making that variable a candidate for removal.

• Normally distributed errors: it is assumed that the residuals in the model are normally distributed with a mean of 0, i.e. they are most often zero or close to zero, and only rarely far from zero in either direction

• Homoscedasticity: at each level of the predictor variable(s), the variance of the residual terms should be constant

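• Linearity: the outcome is assumed to have a straight-line (linear) relationship with each independent variable, so that including each independent variable preserves the straight-line assumption of multiple regression analysis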
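The multicollinearity check above can be sketched numerically. In general the VIF of predictor j is 1 / (1 - R_j²), where R_j² comes from regressing predictor j on all the other predictors. The sketch below covers only the special case of two predictors, where R_j² reduces to the squared Pearson correlation between them; the data and the function names are illustrative assumptions.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical predictor values.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2_nearly_collinear = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]   # roughly 2 * x1
x2_unrelated = [3.0, 1.0, 4.0, 1.0, 5.0, 2.0]

vif_high = vif_two_predictors(x1, x2_nearly_collinear)
vif_low = vif_two_predictors(x1, x2_unrelated)

print(f"nearly collinear pair: VIF = {vif_high:.1f}")
print(f"unrelated pair:        VIF = {vif_low:.2f}")
```

With more than two predictors, each R_j² has to come from a full auxiliary regression, which is usually left to a statistics package rather than computed by hand.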