Project Goal: To explore confidence intervals and hypothesis testing for categorical data.
Directions Overview: You will be asked to research and write a report about a qualitative variable from an identified population.
In your report, be sure to use full sentences, and justify your answers using specific details or values as appropriate. The following worksheet steps are provided to you to gather the required information to write your Final Report as detailed in Report Specifications.
Identify a qualitative variable that you would like to research, and the population that you will be studying.
Note: You will eventually be determining whether items from your population possess (yes/no) the characteristic you are researching.
What is the Qualitative Variable?
Whether a teenager texts while driving a vehicle (yes/no).
What is the Population?
All teenagers in the U.S. who drive frequently.
Find a published figure for the population proportion. For example, according to the Pew Research Center, 69% of US adults say they use Facebook. (https://www.pewresearch.org/fact-tank/2021/06/01/facts-about-americans-and-facebook).
According to data from the 2015 Youth Risk Behavior Survey, 38% of teenagers admitted to texting while driving a vehicle. (https://www.aafp.org/news/health-of-the-public/20180928textndrive.html)
Take a sample of n=30 from your identified population and create a table recording the results. Include the table below (enter the individual responses).
Yes Yes No Yes No No
No Yes No Yes No Yes
No No Yes Yes No Yes
No Yes No No No No
Yes No No Yes No No
Identify and explain at least one type of bias that could be present in your research.
Using technology, create a bar chart for your qualitative variable. Be sure to include all appropriate labels in your bar chart. Include a screen clip of your bar chart here.
What is the sample proportion? Use appropriate notation.
The sample proportion is p-hat = 12/30 = 0.40.
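As a quick check on the arithmetic, the sample proportion can be recomputed from the 30 recorded responses. This is only a minimal sketch: the variable names (responses, yes_count, p_hat) are illustrative choices, and the responses are transcribed from the table above.

```python
# Responses transcribed from the results table (5 rows of 6 answers).
responses = """
Yes Yes No Yes No No
No Yes No Yes No Yes
No No Yes Yes No Yes
No Yes No No No No
Yes No No Yes No No
""".split()

n = len(responses)                   # sample size
yes_count = responses.count("Yes")   # respondents who text while driving
p_hat = yes_count / n                # sample proportion

print(f"n = {n}, yes = {yes_count}, p-hat = {p_hat:.2f}")
```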
Using 1-2 sentences, compare the sample proportion to your identified population proportion.
The sample proportion (0.40) is slightly higher than the population proportion (0.38). A difference this small is consistent with ordinary sampling variability: the standard error for a sample of n = 30 is about 0.0886 (square root of (0.38)(1-0.38)/30).
Using the identified population proportion and the Central Limit Theorem for Proportions, manually calculate the mean and the standard error (i.e., the standard deviation of the sampling distribution) of the sample proportion for the various sample sizes given below.
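Reminder: According to the Central Limit Theorem for Proportions, the mean of the sample proportions equals p, and the standard error equals the square root of p(1-p)/n.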
Sample size (n)   Mean   Standard error
50                0.38   0.0686
200               0.38   0.0343
800               0.38   0.0172
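The manual calculations can be double-checked with a few lines of code. The sketch below assumes the published proportion p = 0.38 and the three sample sizes named in the follow-up question (50, 200, 800). The helper name standard_error is an illustrative choice, not part of the worksheet.

```python
import math

def standard_error(p, n):
    """Standard error of the sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

p = 0.38  # published population proportion

for n in (50, 200, 800):
    # The mean of the sampling distribution is p itself.
    print(f"n = {n:3d}  mean = {p:.2f}  standard error = {standard_error(p, n):.4f}")
```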
What do you notice happens to the standard error as the sample size increases from 50 to 200, and from 200 to 800?
Hint: does the standard error change in a pattern? Make at least two observations.
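As the sample size increases, the standard error decreases, so sample proportions from larger samples cluster more tightly around the population proportion. As the sample size increases from 50 to 200, the standard error falls from 0.0686 to 0.0343, and increasing the sample size from 200 to 800 brings it down to 0.0172. In both cases the sample size is multiplied by four and the standard error is cut roughly in half, because the standard error is proportional to 1 divided by the square root of n.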
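The pattern in the standard errors can be made explicit by taking ratios. This is a small sketch under the same assumption as before (p = 0.38), and the function name standard_error is again only illustrative. Because the standard error depends on the square root of n, multiplying n by four should divide it by two.

```python
import math

def standard_error(p, n):
    """Standard error of the sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

p = 0.38
ratio_50_to_200 = standard_error(p, 200) / standard_error(p, 50)
ratio_200_to_800 = standard_error(p, 800) / standard_error(p, 200)

# Each ratio equals sqrt(1/4), regardless of the value of p.
print(f"SE(200)/SE(50)  = {ratio_50_to_200:.3f}")
print(f"SE(800)/SE(200) = {ratio_200_to_800:.3f}")
```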
Using technology, and the identified population proportion, create the sampling distribution for each of the three sample sizes. Include a screen clip of each of these sampling distributions. They will be required for your Final Report.
In a paragraph (3-4 sentences), describe the shape of the three sampling distributions. Hint: Do they appear to agree with the Central Limit Theorem for Proportions?
As we increase the sample size, the sampling distribution more closely resembles the normal curve predicted by the Central Limit Theorem for Proportions. The first distribution, built from the smallest sample size, clearly fits the normal curve least well. The second is better, with only minor deviations from the normal curve. The same deviations appear in the third sampling distribution, but it is much more in line with the Central Limit Theorem.
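One way to see this behavior without a graphing tool is to simulate the sampling distributions directly. The sketch below is not the technology used for the screen clips. It assumes p = 0.38, uses 2,000 simulated samples per sample size (an arbitrary choice), and fixes a random seed so the run is repeatable. The function name simulate_sample_proportions is hypothetical.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, reps, rng):
    """Draw `reps` samples of size n and return each sample's proportion of 'yes'."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(0)  # fixed seed, chosen only for repeatability
p = 0.38

for n in (50, 200, 800):
    props = simulate_sample_proportions(p, n, 2000, rng)
    theoretical_se = math.sqrt(p * (1 - p) / n)
    print(
        f"n = {n:3d}  simulated mean = {statistics.mean(props):.3f}  "
        f"simulated SD = {statistics.stdev(props):.4f}  "
        f"theoretical SE = {theoretical_se:.4f}"
    )
```

If the Central Limit Theorem for Proportions holds, the simulated means should sit near 0.38 and the simulated standard deviations should track the theoretical standard errors more closely as n grows.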
Using the Empirical Rule and the population proportion, what values of the sample proportion would be considered unusual? Is your sample proportion usual or unusual? Explain your answer. Be sure to show all calculations.
Under the Empirical Rule, about 68% of observations fall within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations. In this report, observations within three standard deviations of the mean are considered usual, while observations more than three standard deviations from the mean are considered unusual.
Our population proportion (p) was 0.38. Our sample proportion (p-hat) was 0.40 taken from a sample of n = 30.
One standard deviation = square root of (0.38)(1-0.38)/30 = 0.0886.
Three standard deviations = 0.0886 x 3 = 0.2659 (computed from the unrounded standard deviation, 0.08862).
All values of 0.38 plus or minus 0.2659 (0.11 through 0.65) are considered usual values, while values outside of this range are considered unusual.
Under the Empirical Rule, our sample proportion of 0.40 is considered usual.
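The calculation above can be reproduced in code as a check. This sketch assumes the values stated in this report (p = 0.38, p-hat = 0.40, n = 30) and the report's own three-standard-deviation cutoff for "usual" values. All variable names are illustrative.

```python
import math

p = 0.38      # published population proportion
p_hat = 0.40  # sample proportion
n = 30        # sample size

se = math.sqrt(p * (1 - p) / n)  # one standard deviation of the sampling distribution
lower = p - 3 * se               # lower bound of the "usual" range
upper = p + 3 * se               # upper bound of the "usual" range
is_usual = lower <= p_hat <= upper

# How many standard errors the sample proportion sits from p.
z = (p_hat - p) / se

print(f"standard error = {se:.4f}")
print(f"usual range    = {lower:.2f} through {upper:.2f}")
print(f"p-hat is {z:.2f} standard errors from p, usual = {is_usual}")
```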
Write a Final Report for this project. Use the following guidelines to structure your report:
Introduction: Set the stage for your report by providing an overview of the qualitative variable from your identified population. Include the identified population proportion and cite the source. (1-2 paragraphs)
Describe the process: How did you go about deciding how to take a sample from your population, what kind of sample did you use, and what bias(es) do you think might be present? (1-2 paragraphs)
Results: Summarize the results from your research and discuss how close your population proportion was to your sample proportion. Include any graphs that support your analysis. (2-4 paragraphs)
Conclusion: Provide an overview of the research you conducted. Include 2-3 questions related to your research you would like to investigate further. Include reflections about the project that would add insight. (1 paragraph)