Chapter 1. Correlation and Regression


Statistical Applets

Click on the graphing area to create a scatterplot of data points. Click a previously added point to remove it, or drag the point to move it. The correlation coefficient for the data you enter is shown on the left. Use the checkboxes to show the least-squares regression line for your data, the mean values of X and Y, and the residual for each data point.

Click "Draw your own line" to select starting and ending points for your own line on the plot. The "relative sum of squares" for your line, as compared to the least-squares regression line, will then be calculated and shown.
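The applet does not spell out how the "relative sum of squares" is computed. The sketch below assumes it is the ratio of the sum of squared vertical distances for your line to the same sum for the least-squares regression line. The function names and the sample points are hypothetical and chosen only for illustration.

```python
def line_through(p, q):
    """Slope and intercept of the line through two clicked points (assumes it is not vertical)."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def least_squares_line(points):
    """Slope and intercept of the least-squares regression line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sum_squared_residuals(points, slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in points)


# Hypothetical data and a hand-drawn line from (1, 1) to (5, 5), i.e. y = x.
points = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
own_slope, own_intercept = line_through((1.0, 1.0), (5.0, 5.0))
ls_slope, ls_intercept = least_squares_line(points)

own_ss = sum_squared_residuals(points, own_slope, own_intercept)
ls_ss = sum_squared_residuals(points, ls_slope, ls_intercept)

# Assumed definition: ratio of the two sums of squares (an assumption, not the applet's documented formula).
relative_ss = own_ss / ls_ss
print(own_ss, ls_ss, relative_ss)
```

Under this assumed definition the ratio can never fall below 1, because no line has a smaller sum of squared vertical distances than the least-squares regression line.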

Click the "Quiz Me" button to complete the activity.

A scatterplot displays the form, direction, and strength of the relationship between two quantitative variables. Straight-line (linear) relationships are particularly important because a straight line is a simple pattern that is quite common. The correlation measures the direction and strength of the linear relationship. The least-squares regression line is the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible (these vertical distances, from each data point to the least-squares regression line, are called the residual values).
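As a rough illustration of these definitions, the following sketch computes the correlation, the least-squares regression line, and the residuals for a small made-up data set. The data values and function names are assumptions for this example and are not taken from the applet.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def correlation(xs, ys):
    """Correlation coefficient r of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Slope and intercept minimizing the sum of squared vertical distances."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean of X, mean of Y).
    intercept = my - slope * mx
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Vertical distance (observed minus predicted) for each data point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical data points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

r = correlation(xs, ys)
slope, intercept = least_squares_line(xs, ys)
res = residuals(xs, ys, slope, intercept)
print(r, slope, intercept, res)
```

The residuals of a least-squares regression line sum to zero, which is a quick sanity check on the fit.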

This applet lets you explore how the correlation and the least-squares regression line change as points are added to or removed from a scatterplot.

Question 1.1

To answer this question you must add points to the scatterplot above so that the correlation is between 0.5 and 0.7.

Question 1.2

Now add another data point in the upper-left corner of the scatterplot. This causes the correlation coefficient to ______. Then click and drag this point down to the lower-left corner of the scatterplot. As you move the point down, the correlation coefficient ______.


Question 1.3

Since the least-squares regression line in the original scatterplot has a positive slope, a new data point in the upper-left corner of the scatterplot is an outlier that "pulls" the left end of the regression line up, flattening the line and reducing the correlation coefficient. However, as this point moves down toward the original regression line, it becomes less of an outlier, and the correlation coefficient increases.
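The effect described here can be checked numerically. The sketch below uses a hypothetical set of five points with a positive trend, adds a sixth point at the left edge (x = 0), and recomputes the correlation as that point is moved from the top (y = 10) to the bottom (y = 0). The coordinates are assumptions chosen for illustration, not values from the applet.

```python
from math import sqrt


def correlation(points):
    """Correlation coefficient r for a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxy / sqrt(sxx * syy)


# Hypothetical original scatterplot with a positive trend.
base = [(3.0, 3.0), (4.0, 5.0), (5.0, 4.0), (6.0, 6.0), (7.0, 7.0)]
r_base = correlation(base)

# Add one point at the left edge and drag it from the top to the bottom.
heights = [10.0, 8.0, 6.0, 4.0, 2.0, 0.0]
r_by_height = [correlation(base + [(0.0, y)]) for y in heights]

r_upper_left = r_by_height[0]   # extra point in the upper-left corner
r_lower_left = r_by_height[-1]  # extra point in the lower-left corner

print("original r:", round(r_base, 3))
for y, r in zip(heights, r_by_height):
    print("extra point at (0, %g): r = %.3f" % (y, r))
```

For this particular data set, the upper-left point drives the correlation well below its original value, and the correlation rises steadily as the point is dragged down toward the trend of the other points.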