In general, a dataset contains two types of variables:
1) Real valued (ex.: 1,2,3,4)
2) Categorical (ex.: Blood Group)
Categorical variables have two sub-types:
a) Ordinal (ex.: Age group 25-55, or Grades A-D)
b) Non-ordinal (ex.: Male/Female, Plant/Animal)
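Because ordinal categories carry a natural order, they can be mapped to numbers that preserve it. The following is a minimal sketch of that idea in Python, using the "Grades A-D" example. The names GRADE_ORDER, GRADE_TO_SCORE and encode_grades are hypothetical, and the equally spaced scores 1 to 4 are an assumption rather than the only reasonable scale.

```python
# Assumed ordering of the grades, from lowest to highest.
GRADE_ORDER = ["D", "C", "B", "A"]

# Map each grade to an integer rank that preserves the order (D=1 ... A=4).
GRADE_TO_SCORE = {grade: rank + 1 for rank, grade in enumerate(GRADE_ORDER)}


def encode_grades(grades):
    """Replace each ordinal grade label with its integer rank."""
    return [GRADE_TO_SCORE[g] for g in grades]


scores = encode_grades(["A", "C", "D", "B"])
print(scores)  # [4, 2, 1, 3]
```

A mapping like this would make no sense for a non-ordinal variable such as Male/Female, since any numbers assigned to those labels would imply an order that does not exist.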
It is comparatively easy to build a linear regression model around real-valued and ordinal categorical variables, because their values have a meaningful order. Non-ordinal categorical variables, however, pose a problem for a regression model. This is because the regression equation,
y = a0 + a1X1 + a2X2 + ...
treats every X as a number, and any numbers assigned to non-ordinal categories would be arbitrary, so the equation cannot rely on them to predict output values. Nonetheless, non-ordinal categorical variables can always be transformed into real-valued or binary forms so that they can be used in a regression problem.
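One common binary form is a set of 0/1 indicator variables, one per category (often called one-hot or dummy encoding). The sketch below illustrates this for the "Blood Group" example in plain Python. The function names, the category list and the coefficient values are all hypothetical and chosen only for illustration. In practice the coefficients would be estimated from data.

```python
# Hypothetical category list for the "Blood Group" variable.
BLOOD_GROUPS = ["A", "B", "AB", "O"]


def one_hot(value, categories):
    """Return a list of 0/1 indicators, one per category, for `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


def predict(x1, blood_group, a0, a1, group_coeffs):
    """Evaluate y = a0 + a1*X1 + a2*X2 + ... with indicators as X2, X3, ..."""
    # Drop the first category ("A") as the baseline so the indicators
    # are not perfectly collinear with the intercept a0.
    indicators = one_hot(blood_group, BLOOD_GROUPS)[1:]
    return a0 + a1 * x1 + sum(c * x for c, x in zip(group_coeffs, indicators))


encoded = one_hot("AB", BLOOD_GROUPS)
print(encoded)  # [0, 0, 1, 0]

# Made-up coefficients: intercept, slope for X1, and one per non-baseline group.
y = predict(2.0, "AB", a0=1.0, a1=0.5, group_coeffs=[0.2, -0.3, 0.1])
```

Each indicator now enters the equation as an ordinary numeric X, and its coefficient is the shift in y for that category relative to the baseline category.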