1. Describe the data, using summary statistics and graphs, as appropriate.
2. Calculate the pairwise correlation coefficients between Wage and each of the other variables. Test the statistical significance of each correlation coefficient.
3. Consider the two variables Gender and Size. Compute the pairwise correlation between them and test the significance of the correlation coefficient. Now take the two values of Gender, group the Size variable into 3 intervals (of 20 observations each), and construct a contingency table. Using the contingency table, perform a test for the presence of association between the two variables. Compare and discuss the results of the contingency table analysis against those of the correlation analysis.
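A minimal sketch in Python of the contingency-table step in question 3, offered as one possible approach rather than the required method. It assumes Gender and Size are held in two plain lists of equal length; the function names tercile_labels and chi_square_stat are hypothetical and chosen for illustration only. The association test used here is Pearson's chi-square test.

```python
from collections import Counter


def tercile_labels(values):
    """Label each value 0, 1 or 2 by rank, giving three equal-sized groups.

    Assumes the number of values is a multiple of 3 (e.g. 60 -> 3 x 20).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    group_size = max(1, len(values) // 3)
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = min(rank // group_size, 2)
    return labels


def chi_square_stat(rows, cols):
    """Pearson chi-square statistic for two categorical sequences."""
    n = len(rows)
    observed = Counter(zip(rows, cols))   # cell counts; missing cells count as 0
    row_totals = Counter(rows)
    col_totals = Counter(cols)
    stat = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat
```

With a 2 x 3 table the statistic has (2 - 1)(3 - 1) = 2 degrees of freedom, so at the 5% level it would be compared with the chi-square critical value of 5.991. A call might look like chi_square_stat(gender, tercile_labels(size)), where gender and size stand for the data columns.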
4. Consider the two variables Age and Edu and test the null hypothesis that the two variables have equal variance.
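For question 4, one common choice is the variance-ratio F test. The Python sketch below computes the test statistic and its degrees of freedom. The function name variance_ratio is hypothetical. Note the assumption: the classical F test treats the two samples as independent and normally distributed, which is a simplification here because Age and Edu are measured on the same workers.

```python
import statistics


def variance_ratio(x, y):
    """Return (F, df_num, df_den), with the larger sample variance on top.

    Under H0 (equal variances) F follows an F(df_num, df_den) distribution,
    assuming independent, normally distributed samples.
    """
    var_x = statistics.variance(x)   # sample variance, divisor n - 1
    var_y = statistics.variance(y)
    if var_x >= var_y:
        return var_x / var_y, len(x) - 1, len(y) - 1
    return var_y / var_x, len(y) - 1, len(x) - 1
```

Because the larger variance is placed in the numerator, a two-sided test at level a compares F with the upper a/2 critical value of the F distribution.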
5. Estimate a regression model of the form:
Wage_i = α + β_1 Age_i + β_2 Edu_i + β_3 Gender_i + β_4 Size_i + u_i
where the i subscript corresponds to worker i. Interpret the coefficients that you obtain, and comment on their economic and statistical significance.
6. Interpret the R^2 statistic from the regression and test whether it is statistically significant.
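The significance test in question 6 is usually carried out as the overall F test of the null hypothesis that all slope coefficients are zero. The Python sketch below computes that statistic from R-squared, the sample size and the number of regressors (excluding the intercept). The function name overall_f_statistic is hypothetical, and the sample size of 60 in the usage note is inferred from the "3 intervals of 20 observations each" in question 3.

```python
def overall_f_statistic(r_squared, n_obs, n_regressors):
    """F statistic for H0: all slope coefficients equal zero.

    F = (R^2 / k) / ((1 - R^2) / (n - k - 1)), with k regressors
    (intercept excluded) and n observations.
    """
    df_num = n_regressors
    df_den = n_obs - n_regressors - 1
    f_stat = (r_squared / df_num) / ((1.0 - r_squared) / df_den)
    return f_stat, df_num, df_den
```

For the four-regressor model with 60 observations, the call would be overall_f_statistic(r2, 60, 4), where r2 stands for the estimated R-squared, and the result is compared with an F(4, 55) critical value.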
7. Re-estimate the model after adding the square of Age (Age^2), and comment on any changes to the results and the goodness of fit:
Wage_i = α + β_1 Age_i + β_2 Age_i^2 + β_3 Edu_i + β_4 Gender_i + β_5 Size_i + u_i
8. Estimate a (partial) log-version of the regression model of the form: