Chapter 1. Confidence Intervals for Proportions

Introduction

Statistical Applets

Set the underlying population proportion p, the desired confidence level, and the sample size n with the sliders, then click SAMPLE to take a sample. On the right you'll see a large dot representing the sample proportion (i.e., the proportion of people in the sample who said they plan to vote for candidate A), and the lines on each side of this dot span the confidence interval. Click SAMPLE 25 to take 25 samples all at once. The green line indicates the population proportion p. Intervals that contain p ("hits") will be colored gray; "misses" will be colored red. Click on any confidence interval to reveal the sample proportion and the range of the confidence interval.

Click the "Quiz Me" button to complete the activity.

This applet simulates the process of taking an election poll involving two candidates. Some proportion p of the population plans to vote for candidate A. If we sample n people from the population, the proportion of people in the sample who say they plan to vote for candidate A is our best estimate of the population proportion. We can then construct a level C confidence interval, where C is the probability that an interval constructed this way will include the true value of p. More specifically, the confidence interval is calculated as the sample proportion ± z* times the standard deviation of the sample proportion, where z* is the critical value of z that has (1-C)/2 of the normal distribution to the right of the value, and the standard deviation is estimated as $\sqrt{\hat{p}(1-\hat{p})/n}$, with $\hat{p}$ denoting the sample proportion.
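The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the applet's own code: the function name `proportion_ci` and the poll counts (130 of 250) are hypothetical, and the sketch assumes the standard deviation is estimated from the sample proportion using the usual large-sample formula.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Return (p_hat, lower, upper) for a large-sample interval for a proportion."""
    p_hat = successes / n
    # z* leaves (1 - C)/2 of the standard normal distribution to its right.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Estimated standard deviation of the sample proportion (assumed formula).
    std_dev = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * std_dev
    return p_hat, p_hat - margin, p_hat + margin


# Hypothetical poll: 130 of 250 respondents plan to vote for candidate A.
p_hat, lower, upper = proportion_ci(130, 250)
print(f"p_hat = {p_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

For this hypothetical poll the interval runs from roughly .458 to .582, so it includes values on both sides of .50.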

Question 1.1

Excluding voters for third-party candidates, approximately 52% of the voters in Virginia's 2012 presidential election voted for Barack Obama, with the other 48% voting for Mitt Romney. By the standards of modern presidential politics, the vote in Virginia that year was not particularly close: Obama fairly easily won more than 50% of the votes, claiming Virginia's 13 electoral college votes.

Use this applet to simulate what might have happened in 25 pre-election polls, with each poll including a sample size of 250 people. First click RESET, then set the Population Proportion to .52, the sample size to 250, and click SAMPLE 25.

Inspect the 95% confidence intervals of these 25 simulated polls. How many of them include the true population proportion of .52?

Make sure you actually run the simulation as directed: Click RESET (if necessary), then set the Population Proportion to .52, the Confidence Level to 95, the Sample Size to 250, and click SAMPLE 25. Then enter the number of "Hits" indicated on the left side of the graph.

Number of polls that include the true population proportion.
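For readers without access to the applet, the same experiment can be approximated with a short simulation. This is a rough sketch under stated assumptions rather than a reproduction of the applet: the function name `simulate_polls` and the `seed` parameter are hypothetical, each respondent is modeled as an independent draw, and the interval uses the large-sample formula with the sample proportion.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_polls(p=0.52, n=250, polls=25, confidence=0.95, seed=None):
    """Simulate polls; return a list of (p_hat, lower, upper, hit) tuples."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    results = []
    for _ in range(polls):
        # Each respondent independently favors candidate A with probability p.
        votes_for_a = sum(rng.random() < p for _ in range(n))
        p_hat = votes_for_a / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        lower, upper = p_hat - margin, p_hat + margin
        # A "hit" is an interval that contains the true proportion p.
        results.append((p_hat, lower, upper, lower <= p <= upper))
    return results


results = simulate_polls(seed=1)
hits = sum(hit for _, _, _, hit in results)
print(f"{hits} of {len(results)} intervals contain p = 0.52")
```

The count of hits will vary from run to run (or from seed to seed), just as it does each time you click SAMPLE 25 in the applet.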

Question 1.2

Note that since we defined the Population Proportion in terms of the percentage of voters who voted for Obama, this question, about the number of polls where Romney was projected to get more than 50% of the votes, asks you to count how many of the polls showed a proportion less than .50. This question does not involve confidence intervals, so ignore the "Percent Hit" statistic on the left.
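The count described in this note can also be sketched in code. The snippet below is an illustration only: the function name `count_polls_below_half` and the `seed` parameter are hypothetical, and respondents are again modeled as independent draws with probability .52 of favoring Obama.

```python
import random


def count_polls_below_half(p=0.52, n=250, polls=25, seed=None):
    """Count simulated polls whose sample proportion falls below .50."""
    rng = random.Random(seed)
    count = 0
    for _ in range(polls):
        p_hat = sum(rng.random() < p for _ in range(n)) / n
        # A sample proportion under .50 projects a win for the other candidate.
        if p_hat < 0.50:
            count += 1
    return count


wrong_calls = count_polls_below_half(seed=1)
print(f"{wrong_calls} of 25 simulated polls show a proportion below .50")
```

Note that the comparison uses only the sample proportion itself; the confidence interval plays no role in this count.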

Question 1.3

This experiment shows that even with a relatively large sample size, it's impossible to know for certain whether a sample reflects the true underlying population distribution. First of all, we have to remember that, by definition, a 95% confidence interval will fail to include the true population proportion 5% of the time. 5% may not seem all that likely, but when many samples are drawn (as is the case in a presidential election year, when many, many polls are taken), some samples will inevitably miss the mark. Furthermore, in cases when a certain value serves as a hard threshold (such as in an election, where getting 50.01% of the vote wins and getting 49.99% loses), this exercise clearly shows that samples will "call" the result "wrong" even more than 5% of the time.
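A quick back-of-the-envelope calculation makes both points concrete. The sketch below rests on assumptions that are not part of the applet: the 25 polls are treated as independent, each interval is taken to have exactly 95% coverage, and the sample proportion is approximated by a normal distribution with mean p and standard deviation sqrt(p(1-p)/n).

```python
from math import sqrt
from statistics import NormalDist

p, n, polls, confidence = 0.52, 250, 25, 0.95

# Chance that at least one of the 25 intervals misses p,
# assuming independent polls and exact 95% coverage.
p_any_miss = 1 - confidence ** polls

# Normal approximation to the chance that a single poll's
# sample proportion falls below the .50 threshold.
std_dev = sqrt(p * (1 - p) / n)
p_wrong_call = NormalDist(mu=p, sigma=std_dev).cdf(0.50)

print(f"P(at least one miss in {polls} polls) ~ {p_any_miss:.2f}")
print(f"P(a single poll shows p_hat < .50) ~ {p_wrong_call:.2f}")
```

Under these assumptions, the chance of at least one miss among 25 polls is about .72, and roughly one poll in four (about .26) would show the eventual winner below .50, far more often than the 5% miss rate of the intervals.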