Step 1:
Get to know your data. Look at the variables and see what they are measuring and what types of data you have to analyse.
Step 2:
Plan and describe how you will summarise the socio-demographic and general health characteristics of the sample (descriptive statistics). Think about the following points: What summary statistics will you use for which kinds of data? What data will you put in tables and/or graphs? How will you assess the suitability of each of these methods? What assumptions are they based on? How will you treat each variable?
What are the socio-demographic, health and lifestyle characteristics of your sample participants? Describing your sample is the first part of your analysis and comes first in the report results. Think about why it is important to get a description of the sample before you present results from hypothesis testing.
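As a starting point, the sketch below shows one way to produce the two kinds of summary discussed above: mean/SD and median/IQR for a continuous variable, and counts with percentages for a categorical variable. It uses only the Python standard library. The records and the variable names (`sex`, `age`, `sbp`) are hypothetical and are not taken from the assignment dataset.

```python
from collections import Counter
from statistics import mean, median, quantiles, stdev

# Hypothetical records for illustration only; not the assignment data.
sample = [
    {"sex": "female", "age": 34, "sbp": 118},
    {"sex": "male", "age": 45, "sbp": 131},
    {"sex": "female", "age": 52, "sbp": 140},
    {"sex": "male", "age": 29, "sbp": 115},
    {"sex": "female", "age": 61, "sbp": 152},
    {"sex": "male", "age": 38, "sbp": 124},
    {"sex": "female", "age": 47, "sbp": 129},
    {"sex": "male", "age": 55, "sbp": 138},
]


def describe_continuous(values):
    """Return mean/SD and median/IQR so either pair can be reported."""
    q1, _, q3 = quantiles(values, n=4)  # lower and upper quartiles
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),  # sample standard deviation (n - 1)
        "median": median(values),
        "iqr": (q1, q3),
    }


def describe_categorical(values):
    """Return (count, percentage) for each category."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, 100 * n / total) for category, n in counts.items()}


age_summary = describe_continuous([row["age"] for row in sample])
sex_summary = describe_categorical([row["sex"] for row in sample])
print(age_summary)
print(sex_summary)
```

Mean and SD are usually reported for roughly symmetric variables, and median and IQR for skewed ones, so it helps to compute both before deciding which to present.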
Step 3:
Develop a series of hypotheses that can be tested using the different types of statistical tests below. What would be the hypotheses? What tests will you do and why?
Explain why the statistical techniques that you will use are appropriate. (Hint: weeks 3-10.) If you decide to create new variables, describe how you will do so and why you chose each method. (Hint: see the information in weeks 1-4.)
Select two categorical variables that are of interest to you and perform an appropriate univariate statistical test. Explain why the statistical test that you have used is appropriate, show the results and report your conclusion. Repeat this using either two new categorical variables or a new outcome (dependent) variable with the same potential explanatory (independent) variable.
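For two categorical variables, one commonly used option is the Pearson chi-square test of independence. The sketch below assumes a 2x2 table and is not necessarily the test the course expects for your chosen variables. The counts and the variable labels (smoker, hypertension) are hypothetical, and no continuity correction is applied.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns the test statistic, the p-value for 1 degree of freedom,
    and the expected counts under independence.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value, expected


# Hypothetical counts: rows = smoker (yes, no), columns = hypertension (yes, no).
observed = [[30, 20], [20, 30]]
statistic, p_value, expected = chi_square_2x2(observed)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
print("expected counts:", expected)
```

The expected counts are returned so that the usual assumption can be checked: when any expected count is small (a common rule of thumb is below 5), Fisher's exact test is generally preferred.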
Select a variable with two or three categories and investigate how the values of another continuous (scale) variable differ between categories. You may choose to create a new variable with two, three, or more categories from an existing continuous or categorical variable. (Example: blood pressure and gender or blood pressure and BMI categorised as normal, overweight, obese.)
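For a continuous variable compared across two categories, the sketch below computes a two-sample t statistic. It uses Welch's unequal-variance form, which is a choice made here for illustration. The course may expect Student's t test, or one-way ANOVA when there are three categories. The blood pressure values are hypothetical. The function returns only the statistic and its degrees of freedom, so the p-value would still need to come from a t table or your statistics package.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    # Squared standard error of each group mean.
    se2_a = variance(group_a) / na
    se2_b = variance(group_b) / nb
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1)
    )
    return t, df


# Hypothetical systolic blood pressure readings (mmHg) for two groups.
sbp_men = [120, 125, 130, 135, 140]
sbp_women = [110, 115, 120, 125, 130]
t_stat, df = welch_t(sbp_men, sbp_women)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```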
Repeat this with another pair of variables, if possible one that leads to a nonparametric test. (Hint: explore the data to look for skewed distributions of a continuous variable.)
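When the continuous variable is clearly skewed and there are two groups, the Mann-Whitney U test is one nonparametric option. The sketch below is a simplified illustration: it assigns average ranks to ties but applies no tie or continuity correction, and it uses a normal approximation, so the p-value is only approximate in small samples. The data (weekly alcohol units in two groups) are hypothetical.

```python
import math
from statistics import NormalDist


def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U with a two-sided normal-approximation p-value."""
    pooled = sorted(list(group_a) + list(group_b))
    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    na, nb = len(group_a), len(group_b)
    rank_sum_a = sum(rank_of[v] for v in group_a)
    u_a = rank_sum_a - na * (na + 1) / 2
    u = min(u_a, na * nb - u_a)  # report the smaller of the two U values
    mu = na * nb / 2
    sigma = math.sqrt(na * nb * (na + nb + 1) / 12)
    z = (u - mu) / sigma  # z <= 0 because u is the smaller U
    p_value = 2 * NormalDist().cdf(z)
    return u, p_value


# Hypothetical right-skewed data: weekly alcohol units in two groups.
units_group_1 = [0, 1, 1, 2, 3, 4]
units_group_2 = [5, 6, 8, 12, 20, 35]
u, p_value = mann_whitney_u(units_group_1, units_group_2)
print(f"U = {u}, approximate p = {p_value:.4f}")
```

For three or more groups, the Kruskal-Wallis test plays the corresponding role.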
Perform a multiple linear regression analysis to find those independent variables (continuous and categorical) that are significantly related to systolic blood pressure at the 5% significance level.
Use the Enter method to add potential risk factors.
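The sketch below illustrates what fitting a multiple linear regression involves, assuming that the Enter method means all predictors are entered in a single step (as in SPSS). It estimates ordinary least squares coefficients by solving the normal equations and does not compute standard errors or p-values, so significance at the 5% level would still need to be read from your statistics package. The data are hypothetical and noise-free, and the predictor names (age, BMI, a 0/1 smoker indicator) are assumptions made for illustration.

```python
def fit_ols(rows, outcomes):
    """Least-squares coefficients [intercept, b1, b2, ...].

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    Fails with a division by zero if predictors are perfectly collinear.
    """
    x = [[1.0] + list(row) for row in rows]  # prepend the intercept column
    k = len(x[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in x) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(x, outcomes))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical predictors: (age in years, BMI, smoker coded 0/1).
predictors = [
    (30, 22, 0), (40, 25, 1), (50, 30, 0),
    (60, 28, 1), (35, 35, 1), (55, 24, 0),
]
# Hypothetical outcome built as 90 + 0.5*age + 1.0*BMI + 5*smoker.
sbp = [127.0, 140.0, 145.0, 153.0, 147.5, 141.5]

coefficients = fit_ols(predictors, sbp)
for name, value in zip(["intercept", "age", "bmi", "smoker"], coefficients):
    print(f"{name}: {value:.3f}")
```

Categorical predictors with more than two levels would need to be recoded as a set of 0/1 dummy variables before being passed to a function like this.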
Data Analysis
Structure this section with headings for each of the assignment instructions.
In this section you will state your hypotheses that address each of the assignment questions.
Then describe the methods of analysis. Make sure you cover the points listed below as appropriate.
What are the variables? Describe how you checked the data and, if you created new variables, how you did so.
How were the variables analysed?
What measures of central tendency and dispersion?
What other descriptive statistics were used – frequencies, proportions, etc.?
What statistical test did you use with what variables?
What assumptions had to be met in order to use the stated tests? What did you do if assumptions were not met?