If the correlation coefficient is zero, what will the slope of a linear regression line be?

The correlation coefficient r is directly related to the coefficient of determination r²: up to its sign, r is the square root of r². If r² is expressed in decimal form, e.g. 0.39 or 0.87, then all we have to do to obtain the magnitude of r is to take the square root of r²:

\[r= \pm \sqrt{r^2}\]

The sign of r depends on the sign of the estimated slope coefficient b1:

  • If b1 is negative, then r takes a negative sign.
  • If b1 is positive, then r takes a positive sign.

That is, the estimated slope and the correlation coefficient r always share the same sign. Furthermore, because r² is always a number between 0 and 1, the correlation coefficient r is always a number between -1 and 1.
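As a quick check of the square-root relationship, the following R sketch fits a simple regression and recovers r from r² and the sign of b1. It reuses the small x and y vectors from the no-intercept example further down; the variable names are only illustrative.

```r
# Small example data (same vectors as the no-intercept example in this document)
x <- c(72, 72, 70, 69, 72, 64, 72, 69, 69, 67, 70, 76, 71, 72)
y <- c(6, 11, 7, 10, 12, 19, 10, 16, 9, 8, 5, 4, 7, 10)

fit <- lm(y ~ x)                 # simple linear regression with an intercept
b1  <- unname(coef(fit)["x"])    # estimated slope
r2  <- summary(fit)$r.squared    # coefficient of determination

# r is the square root of r^2, carrying the sign of the slope
r_from_r2 <- sign(b1) * sqrt(r2)

print(c(r_from_r2 = r_from_r2, cor = cor(x, y)))
```

The two printed values should agree, and both are negative because the fitted slope is negative.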

One advantage of r is that it is unitless, allowing researchers to make sense of correlation coefficients calculated on different data sets with different units. The "unitless-ness" of the measure can be seen from an alternative formula for r, namely:

\[r=\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2\sum_{i=1}^{n}(y_i-\bar{y})^2}}\]

If x is the height of an individual measured in inches and y is the weight of the individual measured in pounds, then the units for the numerator are inches × pounds. Similarly, the quantity under the square root in the denominator has units of inches² × pounds², so the units for the denominator are also inches × pounds. Because they are the same, the units in the numerator and denominator cancel each other out, yielding a "unitless" measure.
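One way to see the unitlessness numerically is to compute r from the formula above, convert the units, and compute it again. This is only a sketch: the heights and weights below are made-up illustrative values, and `pearson_r` is a hypothetical helper name, not a built-in function.

```r
# Hypothetical helper implementing the sum-of-products formula for r
pearson_r <- function(x, y) {
  dx <- x - mean(x)
  dy <- y - mean(y)
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}

# Made-up illustrative data: heights in inches, weights in pounds
height_in <- c(62, 65, 67, 68, 70, 72, 74)
weight_lb <- c(120, 140, 150, 155, 172, 180, 205)

# Convert to centimetres and kilograms
height_cm <- height_in * 2.54
weight_kg <- weight_lb * 0.45359237

r_imperial <- pearson_r(height_in, weight_lb)
r_metric   <- pearson_r(height_cm, weight_kg)

print(c(imperial = r_imperial, metric = r_metric))
```

Rescaling x and y multiplies the numerator and the denominator by the same factor, so the two values should match each other and the result of the built-in `cor()`.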

Another formula for r that you might see in the regression literature is one that illustrates how the correlation coefficient r is a function of the estimated slope coefficient b1:

\[r=\frac{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}{\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}\times b_1\]
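To see where this version comes from, recall the usual least-squares estimate of the slope (stated here as an assumption, since this section does not write it out):

\[b_1=\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\]

Multiplying and dividing the sum-of-products formula for r by the sum of squares of x then gives:

\[r=\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}=\frac{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}{\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}\times\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}=\frac{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}{\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}\times b_1\]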

We are readily able to see from this version of the formula that:

  • The estimated slope b1 of the regression line and the correlation coefficient r always share the same sign, because the factor multiplying b1 is a ratio of square roots and is therefore positive.
  • The correlation coefficient r is a unitless measure: b1 carries the units of y per unit of x, while the ratio multiplying it carries the units of x per unit of y, so the units cancel.
  • If the estimated slope b1 of the regression line is 0, then the correlation coefficient r must also be 0.
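The slope-based formula can be checked numerically. The sketch below uses the small x and y vectors from the no-intercept example in this document; `sxx` and `syy` are just illustrative names for the two sums of squares.

```r
x <- c(72, 72, 70, 69, 72, 64, 72, 69, 69, 67, 70, 76, 71, 72)
y <- c(6, 11, 7, 10, 12, 19, 10, 16, 9, 8, 5, 4, 7, 10)

b1  <- unname(coef(lm(y ~ x))["x"])   # estimated slope
sxx <- sum((x - mean(x))^2)           # sum of squares of x about its mean
syy <- sum((y - mean(y))^2)           # sum of squares of y about its mean

# r as a positive scale factor times the slope
r_from_slope <- sqrt(sxx) / sqrt(syy) * b1

print(c(b1 = b1, r_from_slope = r_from_slope, cor = cor(x, y)))
```

Because the scale factor `sqrt(sxx) / sqrt(syy)` is positive, `r_from_slope` has the same sign as `b1`, and it would be exactly 0 if `b1` were 0.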

That's enough with the formulas! As always, we will let statistical software such as R or Minitab do the dirty calculations for us. For the skin cancer mortality and latitude example (skincancer.txt), the correlation between skin cancer mortality and latitude is -0.825. The order in which you specify the variables doesn't matter, so the correlation between latitude and skin cancer mortality is also -0.825.

What does this correlation coefficient tell us? That is, how do we interpret the Pearson correlation coefficient r? In general, there is no nice practical operational interpretation for r as there is for r². You can only use r to make a statement about the strength of the linear relationship between x and y. Specifically:

  • If r = -1, then there is a perfect negative linear relationship between x and y.
  • If r = 1, then there is a perfect positive linear relationship between x and y.
  • If r = 0, then there is no linear relationship between x and y.

All other values of r tell us that the relationship between x and y is not perfect. The closer r is to 0, the weaker the linear relationship. The closer r is to -1, the stronger the negative linear relationship. And, the closer r is to 1, the stronger the positive linear relationship. As is the case for the r² value, what is deemed a "large" correlation coefficient r value depends greatly on the research area.
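The three reference values can be reproduced with small constructed examples. In this sketch all data are artificial: two exact straight lines and one exact parabola chosen to be symmetric about x = 0.

```r
x <- -3:3

y_pos  <- 2 * x + 1      # exact increasing line
y_neg  <- -0.5 * x + 4   # exact decreasing line
y_quad <- x^2            # exact curve, symmetric about x = 0

r_pos  <- cor(x, y_pos)
r_neg  <- cor(x, y_neg)
r_quad <- cor(x, y_quad)

print(c(r_pos = r_pos, r_neg = r_neg, r_quad = r_quad))
```

The parabola case is a reminder that r = 0 only rules out a linear relationship: here y is completely determined by x, yet the correlation is zero.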

So, what does the correlation of -0.825 between skin cancer mortality and latitude tell us? It tells us:

  • The relationship is negative. As the latitude increases, the skin cancer mortality rate decreases (linearly).
  • The relationship is quite strong (since the value is pretty close to -1).

In general, there does not appear to be any fixed relationship between the sign of the correlation coefficient and the sign of the slope in a no-intercept model. As the OP notes, the correlation coefficient has the same sign as the usual regression slope when the intercept is included, but the slope in the no-intercept model can have the opposite sign. Here is an example:

x <- c(72, 72, 70, 69, 72, 64, 72, 69, 69, 67, 70, 76, 71, 72)
y <- c(6, 11, 7, 10, 12, 19, 10, 16, 9, 8, 5, 4, 7, 10)
cor(x, y)                    # Negative
fit <- lm(y ~ x)
summary(fit)                 # Negative and significant slope
fit.noint <- lm(y ~ x - 1)
summary(fit.noint)           # Positive and significant slope

The correlation and the ordinary slope are negative (and "significant"), while the no-intercept slope is positive (and "significant").
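The sign flip can also be checked against the closed-form least-squares slopes. With an intercept, the slope is the sum of centered cross-products divided by the centered sum of squares of x; through the origin, it is sum(x*y) / sum(x^2), which cannot be negative when all x and y values are positive, as they are here. The names `slope_int` and `slope_noint` are illustrative only.

```r
x <- c(72, 72, 70, 69, 72, 64, 72, 69, 69, 67, 70, 76, 71, 72)
y <- c(6, 11, 7, 10, 12, 19, 10, 16, 9, 8, 5, 4, 7, 10)

# Closed-form slope with an intercept: centered cross-products
slope_int <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

# Closed-form slope through the origin: raw cross-products
slope_noint <- sum(x * y) / sum(x^2)

# Cross-check against lm()
lm_int   <- unname(coef(lm(y ~ x))["x"])
lm_noint <- unname(coef(lm(y ~ x - 1))["x"])

print(c(with_intercept = slope_int, no_intercept = slope_noint))
```

For these data the slope is about -0.89 with the intercept and about 0.13 without it.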

The difference is easily explained by the fact that the no intercept model actually forces the intercept to be zero, which can be seen graphically as follows:

slope.noint <- fit.noint$coefficients
plot(x, y, main = "Regression Fits With and Without Intercept")
abline(lsfit(x, y), lwd = 2)
abline(0, slope.noint, lwd = 2, col = "red")
legend("topright", c("Fit with Int", "Fit w/o Int"), lwd = 2, col = c("black", "red"))

To see more clearly what is going on, it is helpful to expand the axes so that the origin (0,0) is included. The no intercept line must pass through the origin.

plot(x, y, xlim = c(0, 100), ylim = c(0, 20), main = "Regression Fits With and Without Intercept")
abline(lsfit(x, y), lwd = 2)
abline(0, slope.noint, lwd = 2, col = "red")
legend("topleft", c("Fit with Int", "Fit w/o Int"), lwd = 2, col = c("black", "red"))

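The resulting graph shows the difference more clearly: the red no-intercept line is forced through the origin and rises toward the cloud of points, while the black line with an intercept slopes downward through them.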

What is the slope of a linear regression if the correlation coefficient is zero?

The slope of the regression line is zero. A zero correlation does not imply that the intercept is zero, and it is the opposite of a perfect linear relationship, which requires r = ±1.

When the correlation coefficient is zero, what is the nature of the regression lines?

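If the correlation coefficient r_xy = 0, then the two lines of regression are perpendicular to each other: the regression of y on x is the horizontal line y = ȳ and the regression of x on y is the vertical line x = x̄, so each line is parallel to one of the coordinate axes.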
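A quick numerical look at the two regression lines when r = 0 can help here. The four data points below are artificial, constructed so that the correlation is zero; the sketch fits both the regression of y on x and the regression of x on y.

```r
# Artificial data constructed so that cor(x, y) is zero
x <- c(1, 2, 3, 4)
y <- c(2, 5, 5, 2)

fit_yx <- lm(y ~ x)   # regression of y on x
fit_xy <- lm(x ~ y)   # regression of x on y

slope_yx     <- unname(coef(fit_yx)["x"])
slope_xy     <- unname(coef(fit_xy)["y"])
intercept_yx <- unname(coef(fit_yx)["(Intercept)"])
intercept_xy <- unname(coef(fit_xy)["(Intercept)"])

print(c(r = cor(x, y), slope_yx = slope_yx, slope_xy = slope_xy))
print(c(intercept_yx = intercept_yx, intercept_xy = intercept_xy))
```

Both slopes are zero up to rounding error, so the first fit is the line y = mean(y) and the second is the line x = mean(x). Drawn in the same (x, y) plane, one is horizontal and the other is vertical.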

What happens if the slope of the regression line is 0?

If the slope is zero, the predicted y does not change as x changes, so the regression line is horizontal. Vertical lines are problematic in that there is no change in x, so their slope is undefined.

What does a slope coefficient of zero mean?

If the slope is negative, then there is a negative linear relationship, i.e., as one variable increases, the other decreases. If the slope is 0, then as one variable increases, the predicted value of the other remains constant, i.e., there is no linear predictive relationship.
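What "no predictive relationship" means for a fitted line can be sketched as follows. The five data points are artificial, chosen so that the least-squares slope is zero; the point of the example is that every fitted value then equals the mean of y, whatever x is.

```r
# Artificial data chosen so that the least-squares slope is zero
x <- c(1, 2, 3, 4, 5)
y <- c(3, 1, 4, 1, 3)

fit    <- lm(y ~ x)
slope  <- unname(coef(fit)["x"])
preds  <- unname(fitted(fit))        # predicted y at each observed x
r2_fit <- summary(fit)$r.squared     # share of variation explained by x

print(c(slope = slope, mean_y = mean(y), r2 = r2_fit))
print(preds)
```

With a zero slope, knowing x does not improve on simply predicting mean(y) for every observation, which is why r² is also zero (up to rounding error).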
