Data Analysis Project for Multiple Regression
Select four independent variables and one dependent variable of your choice. They should all be quantitative.
For this multiple regression project, “physics” is the dependent variable, while “gender,” “iq,” “ei,” “vsat,” and “msat” are the independent variables.
Run a standard or simultaneous regression with this set of variables from your previous assignment.
The results of the standard regression run in SPSS:
Conducted in Excel:
Then run a hierarchical regression in which you arbitrarily specify an order of entry for the four variables.
The results of the hierarchical regression run in SPSS. I used gender as the first variable in the order of entry for the four variables.
Results
How does the equation in the last step of your hierarchical analysis (with all four variables entered) compare with your standard regression?
Draw overlapping circle diagrams to illustrate how the variance is partitioned among the X predictor variables in each of these two analyses; indicate (by giving a numerical value) what percentage of variance is attributed (uniquely) to each independent variable.
In other words, how does the variance partitioning in these two analyses differ? Which is the more conservative approach?
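One way to see how the two partitionings differ is to compute both on the same data. The sketch below is only an illustration on synthetic data: the predictors `x1` to `x4`, the helper `r_squared`, and the entry order are assumptions, not the assignment's variables or its SPSS output. It credits each predictor with its unique variance, as a standard regression does, and with its increment in R² at the step where it enters, as a hierarchical regression does.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an OLS fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

# Synthetic, correlated predictors (illustrative only, not the assignment data).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
x3 = 0.4 * x2 + rng.normal(size=n)
x4 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3, x4])
y = 0.5 * x1 + 0.3 * x2 + 0.2 * x3 + 0.1 * x4 + rng.normal(size=n)

r2_full = r_squared(X, y)

# Standard regression: each predictor is credited only with its unique variance,
# i.e. the drop in R^2 when it alone is removed (its squared part correlation).
unique = [r2_full - r_squared(np.delete(X, j, axis=1), y) for j in range(4)]

# Hierarchical regression: each predictor is credited with the R^2 increment
# at the step where it enters (assumed order here: x1, x2, x3, x4).
r2_steps = [r_squared(X[:, : k + 1], y) for k in range(4)]
increments = np.diff([0.0] + r2_steps)

print("unique (standard):       ", np.round(unique, 3))
print("increments (hierarchical):", np.round(increments, 3))
print("full-model R^2:          ", round(r2_full, 3))
```

The hierarchical increments always sum to the full-model R², and the last variable entered receives the same credit in both analyses. The unique contributions need not sum to R² when the predictors are correlated, because shared variance is not assigned to any single predictor.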
Look at the equations for Steps 1 and 2 of your hierarchical analysis.
Calculate the difference between the R² values for these two equations; show that this equals one of the squared part correlations (which one?) in your output.
Evaluate the statistical significance of this change in R² between Steps 1 and 2. How does this F compare to the t statistic for the slope of the predictor variable that entered the model at Step 2?
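The relationships asked about in Steps 1 and 2 can be checked numerically. The following is a minimal sketch on synthetic data, assuming a hypothetical two-step model in which `x1` enters at Step 1 and `x2` at Step 2; the helper `ols` and all variable names are illustrative and do not come from the assignment's data set. It computes the change in R², the squared part correlation of the Step 2 predictor, the F for the change, and the t for the Step 2 slope.

```python
import numpy as np

def ols(X, y):
    """Return coefficients, their standard errors, and R^2 (intercept added)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    df_resid = len(y) - A.shape[1]
    sigma2 = resid @ resid / df_resid
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return coef, se, r2

# Synthetic data (illustrative only).
rng = np.random.default_rng(1)
n = 150
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 0.4 * x1 + 0.3 * x2 + rng.normal(size=n)

_, _, r2_step1 = ols(x1[:, None], y)                       # Step 1: x1 only
coef2, se2, r2_step2 = ols(np.column_stack([x1, x2]), y)   # Step 2: x1 and x2

# Part (semipartial) correlation of x2: correlate y with the part of x2
# that cannot be predicted from x1.
A1 = np.column_stack([np.ones(n), x1])
x2_resid = x2 - A1 @ np.linalg.lstsq(A1, x2, rcond=None)[0]
sr_x2 = np.corrcoef(y, x2_resid)[0, 1]

# F for the R^2 change when one predictor is added (1 numerator df).
k2 = 2  # number of predictors in the Step 2 model
f_change = (r2_step2 - r2_step1) / ((1.0 - r2_step2) / (n - k2 - 1))
t_x2 = coef2[2] / se2[2]  # t for the slope of x2 in the Step 2 model

print("R^2 change:", r2_step2 - r2_step1, " sr^2 for x2:", sr_x2 ** 2)
print("F change:  ", f_change, " t^2 for x2: ", t_x2 ** 2)
```

When a single predictor is added at a step, the F for the R² change has one numerator degree of freedom and equals the square of that predictor's t statistic in the larger model.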
Consider the predictor variable that you entered in the first step of your hierarchical analysis.
How does your evaluation of the variance due to this variable differ when you look at it in the hierarchical analysis compared with the way you look at it in the standard analysis?
In which analysis does this variable look more “important,” that is, appear to explain more variance? Why?
Now consider the predictor variable that you entered in the last (fourth) step of your hierarchical analysis and compare your assessment of this variable in the hierarchical analysis (in terms of proportions of explained variance) with your assessment of this variable in the standard analysis.
Look at the values of R and F for the overall model as they change from Steps 1 through 4 in the hierarchical analysis. How do R and F change in this case as additional variables are added in?
In general, does R tend to increase or decrease as additional variables are added to a model? In general, under what conditions does F tend to increase or decrease as variables are added to a model?
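The behavior of the overall F can be explored with the usual formula F = (R²/k) / ((1 − R²)/(N − k − 1)). The sketch below uses made-up R² values and a hypothetical sample size purely for illustration; none of the numbers come from the SPSS output for this project.

```python
def overall_f(r2, k, n):
    """Overall F for a model with k predictors, n cases, and squared multiple R of r2."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

n = 100  # hypothetical sample size

f_step1 = overall_f(0.30, 1, n)        # one predictor in the model
f_weak_add = overall_f(0.301, 2, n)    # second predictor adds almost nothing to R^2
f_strong_add = overall_f(0.60, 2, n)   # second predictor adds a great deal to R^2

print(round(f_step1, 2), round(f_weak_add, 2), round(f_strong_add, 2))
```

With these hypothetical values, F falls when the added predictor barely raises R², because the explained variance is now divided by a larger k and one error degree of freedom is lost, and F rises when the added predictor raises R² substantially. R itself cannot go down when a predictor is added to a least-squares model.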
Suppose you had done a statistical (data-driven) regression (using the forward method of entry) with this set of four predictor variables. Which (if any) of the four predictor variables would have entered the equation and which one would have entered first? Why?
If you had done a statistical regression (using a method of entry such as forward) rather than a hierarchical (user-determined order of entry), how would you change your significance testing procedures? Why?
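To make the forward method concrete, here is a rough sketch of a forward-entry procedure. It is not SPSS's implementation: the function `forward_select`, the fixed `f_to_enter` cutoff of 4.0, the predictor names, and the synthetic data are all assumptions made for illustration. At each step it adds the candidate that produces the largest increase in R² and stops when the F for that increase falls below the cutoff.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an OLS fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def forward_select(X, y, names, f_to_enter=4.0):
    """Forward entry: repeatedly add the predictor with the largest R^2 gain."""
    n = len(y)
    chosen, r2_current = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        # R^2 obtained by adding each remaining candidate to the current model.
        trial = {j: r_squared(X[:, chosen + [j]], y) for j in remaining}
        best = max(trial, key=trial.get)
        k = len(chosen) + 1  # predictors in the model if `best` enters
        f_inc = (trial[best] - r2_current) / ((1.0 - trial[best]) / (n - k - 1))
        if f_inc < f_to_enter:  # hypothetical entry cutoff, not an SPSS default
            break
        chosen.append(best)
        remaining.remove(best)
        r2_current = trial[best]
    return [names[j] for j in chosen]

# Synthetic data (illustrative only).
rng = np.random.default_rng(2)
n = 200
X = rng.normal(size=(n, 4))
y = 0.8 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=n)
names = ["x1", "x2", "x3", "x4"]

entered = forward_select(X, y, names)
print("order of entry:", entered)
```

At the first step, the predictor that enters is the one with the largest absolute zero-order correlation with the dependent variable. Because each step picks the best of several candidates, the nominal p values are optimistic; one commonly suggested adjustment is a more conservative entry criterion, for example dividing alpha by the number of candidate predictors.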