What are explanatory variables
They are inputs into a model that are expected to influence the response variable.
What are response variables
They are outputs from the model that are likely to be affected by the explanatory variables
What are categorical values and non-categorical values
Categorical variables are explanatory variables whose values (levels) are distinct categories that often cannot be given any natural ordering or score, e.g. gender, which takes the value male or female.
Non-categorical variables take numerical values, e.g. age.
What are the drawbacks of using a normal model for linear regression
CLAN
How does GLMs address the problems for the normal model for linear regression
What is the purpose of the link function
What does the deviance of a model compare
It compares the observed value yi to the fitted value μi
In essence, the deviance is a measure of how much the fitted values differ from the observations
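As a rough illustration of this comparison, the sketch below computes the deviance for a Poisson model, where it takes the form 2 x sum of [y log(y/mu) - (y - mu)]. The Poisson choice, the function name and the sample numbers are assumptions made for the example, not taken from these notes.

```python
import math

def poisson_deviance(observed, fitted):
    """Deviance of a Poisson model: 2 * sum(y*log(y/mu) - (y - mu))."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        # The y*log(y/mu) term is taken as 0 when y == 0 (its limiting value).
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total

# Hypothetical observed counts and two sets of fitted means.
y = [2, 0, 5, 3]
close_fit = [2.1, 0.2, 4.8, 2.9]
poor_fit = [4.0, 3.0, 1.0, 6.0]

print(poisson_deviance(y, close_fit))
print(poisson_deviance(y, poor_fit))
```

The fitted values that sit closer to the observations give the smaller deviance, and fitted values equal to the observations give a deviance of zero.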
What does the chi-squared statistic measure
This measures whether the inclusion of one or more additional explanatory variables in a model improves the fit significantly
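A minimal sketch of such a test, assuming two nested models that differ by exactly one parameter and a known scale parameter (so the drop in deviance is approximately chi-squared with 1 degree of freedom). The function name and the deviance figures are hypothetical.

```python
import math

def p_value_one_extra_parameter(deviance_small, deviance_large):
    """Approximate p-value for the drop in deviance from adding one parameter.

    For 1 degree of freedom the chi-squared upper tail probability
    at x equals erfc(sqrt(x / 2)).
    """
    drop = deviance_small - deviance_large
    if drop < 0:
        raise ValueError("the larger model should not have the higher deviance")
    return math.erfc(math.sqrt(drop / 2.0))

# Hypothetical deviances for a model without and with the extra variable.
p = p_value_one_extra_parameter(52.3, 45.1)
print(p < 0.05)  # True means the extra variable improves the fit significantly at the 5% level
```

For more than one additional parameter the same idea applies with the degrees of freedom equal to the number of extra parameters, but the tail probability then needs a general chi-squared function.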
What can be used to measure the uncertainty in the parameter estimators used in GLM
The Cramér-Rao lower bound
What are deviance residuals
They are a measure of the distance between the actual observations and the fitted values
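For a concrete picture, the sketch below computes deviance residuals for a Poisson model: each one is the signed square root of that observation's contribution to the deviance. The Poisson choice, the function name and the numbers are assumptions for illustration.

```python
import math

def poisson_deviance_residuals(observed, fitted):
    """sign(y - mu) * sqrt(2 * (y*log(y/mu) - (y - mu))) for each observation."""
    residuals = []
    for y, mu in zip(observed, fitted):
        # y*log(y/mu) is taken as 0 when y == 0 (its limiting value).
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        unit_deviance = 2.0 * (log_term - (y - mu))
        sign = 1.0 if y >= mu else -1.0
        # max() guards against tiny negative values caused by rounding.
        residuals.append(sign * math.sqrt(max(unit_deviance, 0.0)))
    return residuals

# Hypothetical observed counts and fitted means.
residuals = poisson_deviance_residuals([2, 0, 5], [3.0, 0.5, 4.0])
print(residuals)
```

The squares of these residuals add up to the deviance of the model.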
What are standardised Pearson residuals
They are the differences between the observed responses and the predicted values, adjusted for the standard deviation of the predicted value and the leverage of the observed response
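One common way to write this is (y - mu) / sqrt(scale x V(mu) x (1 - h)), where V is the variance function and h is the leverage. The sketch below follows that form; the default Poisson variance function V(mu) = mu, the function name and the inputs are assumptions for illustration, and the leverages are simply supplied rather than derived from a fitted model.

```python
import math

def standardised_pearson_residuals(observed, fitted, leverages,
                                   variance=lambda mu: mu, scale=1.0):
    """(y - mu) / sqrt(scale * V(mu) * (1 - h)) for each observation."""
    return [
        (y - mu) / math.sqrt(scale * variance(mu) * (1.0 - h))
        for y, mu, h in zip(observed, fitted, leverages)
    ]

# Hypothetical data: the same raw difference with low and high leverage.
residuals = standardised_pearson_residuals(
    observed=[6, 6], fitted=[4.0, 4.0], leverages=[0.0, 0.75]
)
print(residuals)
```

The same raw difference gives a larger standardised residual when the leverage is higher.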
What is Cook’s distance used for
It is used to estimate the influence of a data point on the model results
A Cook’s distance of 1 or more is usually considered to merit closer examination in the analysis
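A minimal sketch for the simplest case, a normal linear regression with one explanatory variable, where Cook's distance for point i is (e_i^2 / (p s^2)) x h_i / (1 - h_i)^2 with p = 2 parameters. The data are made up, and restricting to this simple model is an assumption made to keep the example short.

```python
def cooks_distances(x, y):
    """Cook's distance for each point of a straight-line least squares fit."""
    n = len(x)
    p = 2  # intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual variance estimate on n - p degrees of freedom.
    s2 = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage of each point in simple linear regression.
    leverages = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (p * s2)) * h / (1.0 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]

# Hypothetical data: the last point is far from the others and off the trend.
x = [1, 2, 3, 4, 5, 10]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 2.0]
distances = cooks_distances(x, y)
flagged = [i for i, d in enumerate(distances) if d >= 1.0]
print(flagged)
```

Only the points in `flagged` reach the threshold of 1 and would be looked at more closely.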
What is Aliasing
Aliasing occurs when there is a linear dependency among the observed covariates, i.e. one covariate may be identical to some linear combination of other covariates
What is intrinsic aliasing
This occurs because of dependencies inherent in the definition of the covariates.
These intrinsic dependencies arise most commonly whenever categorical variables are included in the model.
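The gender example can be used to show this. In the sketch below (made-up data, hypothetical variable names), a model with an intercept and one indicator column per level is intrinsically aliased, because the two indicator columns always add up to the intercept column.

```python
# Hypothetical sample of a two-level categorical variable.
levels = ["male", "female", "female", "male"]

intercept = [1] * len(levels)
is_male = [1 if g == "male" else 0 for g in levels]
is_female = [1 if g == "female" else 0 for g in levels]

# The intercept column equals is_male + is_female in every row,
# so one of the three columns carries no extra information.
aliased = all(c == m + f for c, m, f in zip(intercept, is_male, is_female))
print(aliased)
```

The usual remedy is to drop one indicator column, so that its level becomes the base level absorbed into the intercept.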
What is Extrinsic aliasing
It arises when the dependency among the covariates results from the nature of the data itself, rather than from the inherent properties of the covariates.
What is an interaction term
It is used where the pattern in the response variable is better modelled by including extra parameters for each combination of two or more factors
An interaction exists when the effect of one factor varies depending on the value of another factor
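A small sketch of what the extra parameter does, using two hypothetical 0/1 factors (male and smoker) and made-up coefficients: with the interaction term included, the effect of being a smoker on the linear predictor is different for each gender.

```python
def linear_predictor(is_male, is_smoker, coef):
    """Linear predictor with main effects and a male:smoker interaction."""
    return (coef["intercept"]
            + coef["male"] * is_male
            + coef["smoker"] * is_smoker
            + coef["male:smoker"] * is_male * is_smoker)

# Hypothetical coefficients, chosen only for illustration.
coef = {"intercept": 1.0, "male": 0.5, "smoker": 0.25, "male:smoker": 0.125}

smoker_effect_female = linear_predictor(0, 1, coef) - linear_predictor(0, 0, coef)
smoker_effect_male = linear_predictor(1, 1, coef) - linear_predictor(1, 0, coef)
print(smoker_effect_female, smoker_effect_male)
```

If the interaction coefficient were zero, the two effects would be equal and the main effects alone would be enough.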
What are the assumptions of the classical normal model for linear regression
What 2 properties do the members of the exponential family have
What is special about the Tweedie distribution
How is the number of degrees of freedom calculated
It is the number of observations less the number of parameters
What are nested models
Two models are nested if one model contains explanatory variables that are a subset of the explanatory variables in the other model
What is the equation of the Akaike Information Criterion and what does it measure
AIC = -2 x log-likelihood + 2 x number of parameters
It looks at the trade-off of the likelihood of a model against the number of parameters: the lower the AIC, the better the model
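A short sketch of the comparison, using made-up log-likelihoods and parameter counts for two hypothetical models:

```python
def aic(log_likelihood, n_parameters):
    """AIC = -2 * log-likelihood + 2 * number of parameters."""
    return -2.0 * log_likelihood + 2.0 * n_parameters

# Hypothetical fits: model B has the higher likelihood but uses two more parameters.
aic_a = aic(log_likelihood=-120.5, n_parameters=3)
aic_b = aic(log_likelihood=-119.75, n_parameters=5)

preferred = "A" if aic_a < aic_b else "B"
print(aic_a, aic_b, preferred)
```

Here the small gain in likelihood for model B does not outweigh the penalty for its extra parameters, so model A has the lower AIC.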
What does the deviance residual measure
It measures the distance between the actual observation and fitted value
What is the difference between correlations and interactions