Variables
Types of variable
A variable is either
- continuous
- discrete or
- categorical (qualitative).
Categorical variables come in three types:
- Binary (two mutually exclusive categories: male or female)
- Nominal
- Ordinal
Nominal variables (for example hair color or gender) don't have a natural order; ordinal variables (for example school grades) do.
The number of children in a household is a discrete variable: there cannot be 2.3 children.
Continuous variables come in two types:
- Ratio (has a true zero point, so ratios are meaningful: a height of 2 m is twice a height of 1 m)
- Interval (has no true zero point: temperature in °C is an interval variable because 2 °C is not twice as hot as 1 °C)
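The difference between interval and ratio scales can be checked with a little arithmetic. The sketch below is only an illustration: the temperatures and heights are made-up values, and `celsius_to_kelvin` is a helper introduced here. It shows that a ratio of Celsius values changes when the unit changes, while differences of temperatures and ratios of heights do not.

```python
def celsius_to_kelvin(c: float) -> float:
    """Convert degrees Celsius to kelvin (Kelvin has a true zero point)."""
    return c + 273.15


# Interval scale: the zero of the Celsius scale is arbitrary,
# so the ratio of two Celsius values has no physical meaning.
ratio_celsius = 20.0 / 10.0
ratio_kelvin = celsius_to_kelvin(20.0) / celsius_to_kelvin(10.0)

# Differences are meaningful on an interval scale: they agree in both units.
diff_celsius = 20.0 - 10.0
diff_kelvin = celsius_to_kelvin(20.0) - celsius_to_kelvin(10.0)

# Ratio scale: height has a true zero, so the ratio survives a change of unit.
ratio_cm = 180.0 / 90.0
ratio_m = 1.80 / 0.90

print("ratio in Celsius:", ratio_celsius)
print("ratio in Kelvin: ", round(ratio_kelvin, 4))
print("difference in Celsius / Kelvin:", diff_celsius, round(diff_kelvin, 6))
print("height ratio in cm / m:", ratio_cm, ratio_m)
```

The Celsius ratio is 2, but the same two temperatures expressed in kelvin give a ratio of only about 1.035, which is why "twice as hot" is not a valid statement on an interval scale.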
Independent variable
German: erklärende/unabhängige/exogene Variable, Prädiktorvariable
Goes on the x-axis (abscissa)
Also known as:
- explanatory variable
- feature (machine learning and pattern recognition)
- input variable
- predictor variable
- regressor
- covariate
- control variable (the econometrics term for a covariate)
- controlled variable
- manipulated variable
- exposure variable
- risk factor (medical statistics)
Dependent variable
German: interessierende/endogene Variable, Zielvariable
Goes on the y-axis (ordinate)
Also known as:
- response variable
- regressand
- predicted variable
- measured variable
- explained variable
- experimental variable
- responding variable
- outcome variable
- output variable
- label
Over- and underfitted models
An overfitted model contains more parameters than can be justified by the data.
An underfitted model cannot adequately capture the underlying structure of the data, for example when fitting a linear model to non-linear data.
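Both failure modes can be reproduced on tiny data sets. The following is a minimal sketch in plain Python, not a recommended workflow: all data values are made up, and `fit_line`, `fit_interpolant` and `mse` are helper names introduced here. A straight line is fitted to data from y = x² (underfitting), and a degree-4 interpolating polynomial is fitted to five noisy observations of y = x (overfitting).

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns a prediction function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def fit_interpolant(xs, ys):
    """Lagrange polynomial through every training point (degree len(xs) - 1)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mse(model, xs, ys):
    """Mean squared error of a prediction function on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Underfitting: a straight line fitted to data generated from y = x**2.
quad_x = [-2.0, -1.0, 0.0, 1.0, 2.0]
quad_y = [x ** 2 for x in quad_x]
underfit_train_mse = mse(fit_line(quad_x, quad_y), quad_x, quad_y)

# Overfitting: the true relation is y = x, observed with made-up noise of +/-0.5.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.5, 0.5, 2.5, 2.5, 4.5]
test_x = [0.5, 1.5, 2.5, 3.5]
test_y = test_x[:]  # noise-free held-out points on y = x

line = fit_line(train_x, train_y)
wiggly = fit_interpolant(train_x, train_y)

print("underfit line, train MSE:", underfit_train_mse)
print("line:        train", mse(line, train_x, train_y), "test", mse(line, test_x, test_y))
print("interpolant: train", mse(wiggly, train_x, train_y), "test", mse(wiggly, test_x, test_y))
```

On this data the interpolating polynomial has zero training error but a held-out error of about 0.35, whereas the two-parameter line has a larger training error and a held-out error of about 0.01. The line fitted to the quadratic data has a large error even on the points it was trained on, which is the mark of underfitting.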
Techniques to reduce overfitting:
- model comparison
- cross-validation
- regularization
- early stopping
- pruning
- Bayesian priors
- dropout
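Two of these techniques, regularization and cross-validation, can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: it uses a one-parameter ridge (L2-penalized) regression without intercept, made-up data that is already centred, and the hypothetical helper names `ridge_slope` and `loo_cv_error`. For simplicity the training folds are not re-centred after a point is left out.

```python
def ridge_slope(xs, ys, lam):
    """Slope b minimising sum((y - b*x)**2) + lam * b**2 (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def loo_cv_error(xs, ys, lam):
    """Leave-one-out cross-validation: mean squared error on the held-out points."""
    errors = []
    for i in range(len(xs)):
        # Fit on every point except the i-th, then predict the i-th.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        b = ridge_slope(train_x, train_y, lam)
        errors.append((ys[i] - b * xs[i]) ** 2)
    return sum(errors) / len(errors)


# Made-up, already centred data (both means are 0), so no intercept is needed.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-2.5, -0.5, 0.5, 0.5, 2.0]

for lam in (0.0, 10.0, 90.0):
    print("lambda:", lam,
          "slope:", ridge_slope(xs, ys, lam),
          "LOO-CV error:", loo_cv_error(xs, ys, lam))
```

The penalty strength `lam` shrinks the slope towards zero (from 1.0 without a penalty to about 0.1 at `lam = 90` on this data). Comparing the cross-validation error across candidate values is one way to choose the penalty; which value wins depends on the data.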