Using the Equation to Make Predictions
Example: Using the same temperature, mowing, and thirst example, if the temperature were 90 degrees Fahrenheit and someone mowed for one and a half hours, how much water would you estimate that they would want to drink during three hours outside?
Once you have the multiple regression equation, it is relatively straightforward to make predictions. In this example, the regression equation was given by
Water = -122 + 1.51*Temperature + 12.5*Mowing Time
Now set the temperature to 90 degrees and the mowing time to 1.5 hours, as given in the example.
Water = -122 + 1.51*90 + 12.5*1.5 = 32.65 oz
We estimate that the person in the example would drink a little less than 33 ounces.
Warning: Make certain that the units in the problem match the units in the regression equation. For example, if someone mowed for 30 minutes, you would need to change that value to 0.5 hours (half an hour) and substitute 0.5, not 30, into the regression equation.
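The substitution above can be sketched in a few lines of Python. This is only an illustration: the coefficients are copied from the regression equation in this section, while the function names predict_water and minutes_to_hours are hypothetical helpers introduced here.

```python
# Coefficients copied from the regression equation in this section.
INTERCEPT = -122.0     # ounces
B_TEMPERATURE = 1.51   # ounces per degree Fahrenheit
B_MOWING = 12.5        # ounces per hour of mowing


def predict_water(temperature_f, mowing_hours):
    """Estimated ounces of water wanted during three hours outside."""
    return INTERCEPT + B_TEMPERATURE * temperature_f + B_MOWING * mowing_hours


def minutes_to_hours(minutes):
    """Convert minutes to hours so the input matches the equation's units."""
    return minutes / 60.0


# The example from the text: 90 degrees F, 1.5 hours of mowing.
estimate = predict_water(90, 1.5)
print(f"Estimated water: {estimate:.2f} oz")

# A mowing time given in minutes must be converted before substituting.
half_hour_estimate = predict_water(90, minutes_to_hours(30))
print(f"Estimate for 30 minutes of mowing: {half_hour_estimate:.2f} oz")
```

Keeping the unit conversion in its own small function makes it harder to substitute 30 where 0.5 is required.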
Interpretation of the Multiple Regression Equation
The interpretation of a multiple regression equation is similar, but not identical, to the interpretation of a simple regression equation. In multiple regression there can still be an intercept, but in order to talk about the coefficient of a variable as a "slope" we have to hold every other variable in the equation constant.
Example: Recall for the temperature, mowing, water example, the multiple regression equation is
Water = -122 + 1.51*Temperature + 12.5*Mowing Time
The intercept represents how much water a person would drink when he or she mowed for zero hours and the temperature was zero degrees (F). We should immediately notice that an input of zero degrees (F) is outside the range of summertime temperatures used to develop the regression equation. In addition, the intercept value is -122 ounces, which does not make sense for this problem and is a signal to check whether the inputs fall within the prediction region.
Interpreting the coefficient for Temperature. When the mowing time is held constant, for every one degree (F) increase in the temperature the amount of water consumed increases by 1.51 oz. Equivalently, for every 10 degree increase in the temperature, the amount of water consumed increases by about 15 ounces (15.1 oz).
Interpreting the coefficient for Mowing Time. When the temperature is held constant, for every one hour increase in mowing time the amount of water consumed increases by 12.5 oz.
From this example, we see that in terms of thirst, each additional hour of mowing (12.5 oz) has an effect closer to a ten degree increase in temperature (15.1 oz) than to a one degree increase (1.51 oz). It is always important to check that the coefficients match our knowledge of the situation, and these results seem reasonable when thinking about real-life situations.
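The "hold the other variable constant" reading of each coefficient can be checked numerically. The sketch below is illustrative only: it reuses the coefficients from the regression equation, and the baseline of 90 degrees and 1 hour of mowing is an arbitrary choice made here, not a value from the example.

```python
def predict_water(temperature_f, mowing_hours):
    """Regression equation from this section (ounces of water)."""
    return -122.0 + 1.51 * temperature_f + 12.5 * mowing_hours


# Arbitrary baseline chosen for illustration (not from the source data).
base_temp, base_hours = 90, 1.0

# Raise temperature by one degree while mowing time stays fixed.
one_degree_effect = predict_water(base_temp + 1, base_hours) - predict_water(base_temp, base_hours)

# Raise temperature by ten degrees while mowing time stays fixed.
ten_degree_effect = predict_water(base_temp + 10, base_hours) - predict_water(base_temp, base_hours)

# Add one hour of mowing while temperature stays fixed.
one_hour_effect = predict_water(base_temp, base_hours + 1) - predict_water(base_temp, base_hours)

print(f"+1 degree F : {one_degree_effect:.2f} oz")
print(f"+10 degrees F: {ten_degree_effect:.2f} oz")
print(f"+1 hour     : {one_hour_effect:.2f} oz")
```

Because the equation is linear, these differences do not depend on the baseline: any other starting temperature or mowing time gives the same changes.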
Interpretation of the Measures of the Strength of the Association for Multiple Regression
Measuring the strength of the association for multiple regression is again similar to, but slightly different from, simple linear regression. One measure of association carried over from simple linear regression is R-squared; the capital R indicates multiple input variables. In multiple regression this measure should not be used in isolation. R-squared should be used in conjunction with R-squared adjusted (also called adjusted R-squared).
Example: For the temperature, mowing, thirst example, we have the following:
R-Sq = 99.4%
R-Sq(adj) = 99.0%
S = 1.245
R-Sq = 99.4%
R-squared is called the Coefficient of Multiple Determination and tells the percent of the variance in the dependent variable that can be explained by all of the independent variables taken together. Thus, 99.4% of the variation in a person's thirst when they have been outside for three hours can be explained by the temperature and the length of time they are mowing the grass. Clearly this is a strong relationship.
R-Sq(adj) = 99.0%
R-squared adjusted, or adjusted R-squared, is a version of R-squared that has been adjusted for the number of predictors in the model. R-squared tends to overestimate the strength of the association, especially if the model has more than one independent variable. Thus, it is important to look at the adjusted R-squared, which compensates for the number of variables in the model. We can use adjusted R-squared to compare this model with other models that have two independent variables to see which has the stronger association. In this example, the adjusted R-squared (99.0%) is only slightly lower than the R-squared value (99.4%). In some cases, the difference between these two measures is much more dramatic, and additional diagnostic measures may then be needed. (See Multiple Regression Diagnostics)
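To see how the adjustment depends on the number of predictors, the sketch below uses the common textbook formula, adjusted R-squared = 1 - (1 - R-squared)(n - 1)/(n - p - 1), where n is the number of observations and p is the number of predictors. This section states neither the formula nor the sample size, so both are assumptions here; the values of n in the loop are hypothetical and are not taken from the example's data.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Common textbook adjustment of R-squared for the number of predictors."""
    degrees_of_freedom = n_observations - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / degrees_of_freedom


r_sq = 0.994   # R-Sq reported in this section
p = 2          # temperature and mowing time

# Hypothetical sample sizes: the section does not report n.
for n in (6, 10, 30):
    adj = adjusted_r_squared(r_sq, n, p)
    print(f"n = {n:2d}: adjusted R-squared = {adj:.4f}")
```

The penalty shrinks as the sample grows relative to the number of predictors, which is why the gap between the two measures can be small in one data set and dramatic in another.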
Now let's learn to compute these measures.