- Exactly –1. A perfect downhill (negative) linear relationship
- –0.70. A strong downhill (negative) linear relationship
- –0.50. A moderate downhill (negative) relationship
- –0.30. A weak downhill (negative) linear relationship
- 0. No linear relationship
- +0.30. A weak uphill (positive) linear relationship
- +0.50. A moderate uphill (positive) relationship
- +0.70. A strong uphill (positive) linear relationship
- Exactly +1. A perfect uphill (positive) linear relationship

If the scatterplot doesn’t show that there’s at least somewhat of a linear relationship, the correlation doesn’t mean much. Why measure the amount of linear relationship if there isn’t much of one? However, you can think of this idea of no linear relationship in two ways: 1) If no relationship at all exists, calculating the correlation doesn’t make sense because correlation only applies to linear relationships, and 2) If a strong relationship exists but it’s not linear, the correlation may be misleading, because in some cases a strong curved relationship exists. That’s why it’s critical to check out the scatterplot first.
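
To make the point about curved relationships concrete, here is a minimal Python sketch. The helper `pearson_r` and the small data set are illustrative assumptions, not taken from the book; the helper uses the standard Pearson formula. In the made-up data, y is completely determined by x, yet the correlation comes out as 0 because the downhill and uphill halves of the curve cancel.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect curved (quadratic) relationship, symmetric around x = 0
xs = [-3, -2, -1, 0, 1, 2, 3]
ys_curve = [x ** 2 for x in xs]

# Hypothetical data: a perfect straight line, for comparison
ys_line = [2 * x + 1 for x in xs]

r_curve = pearson_r(xs, ys_curve)  # no linear trend, despite a perfect relationship
r_line = pearson_r(xs, ys_line)    # perfect uphill line

print("curved:", r_curve)
print("line:  ", r_line)
```

A correlation near 0 here does not mean "no relationship"; it only means "no linear relationship", which is exactly why the scatterplot comes first.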
Scatterplots with correlations of a) +1.00; b) –0.50; c) +0.85; and d) +0.15
The above figure shows examples of what various correlations look like, in terms of the strength and direction of the relationship. Figure (a) shows a correlation of nearly +1, Figure (b) shows a correlation of –0.50, Figure (c) shows a correlation of +0.85, and Figure (d) shows a correlation of +0.15. Comparing Figures (a) and (c), you see Figure (a) is nearly a perfect uphill straight line, and Figure (c) shows a very strong uphill linear pattern (but not as strong as Figure (a)). Figure (b) is going downhill, but the points are somewhat scattered in a wider band, showing a linear relationship is present, but not as strong as in Figures (a) and (c). Figure (d) doesn’t show much of anything happening (and it shouldn’t, since its correlation is very close to 0).
Many folks make the mistake of thinking that a correlation of –1 is a bad thing, indicating no relationship. Just the opposite is true! A correlation of –1 means the data are lined up in a perfect straight line, the strongest negative linear relationship you can get. The “–” (minus) sign just happens to indicate a negative relationship, a downhill line.
How close is close enough to –1 or +1 to indicate a strong enough linear relationship? Most statisticians like to see correlations beyond at least +0.5 or –0.5 before getting too excited about them. Don’t expect a correlation to always be 0.99, however; remember, these are real data, and real data aren’t perfect.
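
As a rough illustration of how the guide values can be applied, the sketch below maps a correlation to the wording used in this article. The function name `describe_correlation` and the rule of picking the closest guide value are assumptions made for this example; the article itself gives only the guide values and their labels, not cutoffs between them.

```python
# Guide values and labels copied from the article.
# ASSUMPTION: a value between two guide values gets the label of the closer one.
GUIDE = [
    (-0.70, "strong downhill (negative) linear relationship"),
    (-0.50, "moderate downhill (negative) relationship"),
    (-0.30, "weak downhill (negative) linear relationship"),
    (0.00, "no linear relationship"),
    (0.30, "weak uphill (positive) linear relationship"),
    (0.50, "moderate uphill (positive) relationship"),
    (0.70, "strong uphill (positive) linear relationship"),
]

def describe_correlation(r):
    """Return the article's wording for the guide value closest to r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and +1")
    # "Perfect" is reserved for exactly -1 or +1, as in the article
    if r == 1.0:
        return "perfect uphill (positive) linear relationship"
    if r == -1.0:
        return "perfect downhill (negative) linear relationship"
    _, label = min(GUIDE, key=lambda item: abs(item[0] - r))
    return label

for r in (-0.50, 0.85):
    print(r, "->", describe_correlation(r))
```

The labels are a reading aid only; a value such as 0.6 sits between "moderate" and "strong", and where to draw that line is a judgment call.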
About This Article
This article is from the book:
- Statistics For Dummies
About the book author:
Deborah J. Rumsey, PhD, is an Auxiliary Professor and Statistics Education Specialist at The Ohio State University. She is the author of Statistics For Dummies, Statistics II For Dummies, Statistics Workbook For Dummies, and Probability For Dummies.
- X causes Y
- Y causes X
- X causes Y and Y causes X
- Some third variable Z causes X and Y
- The correlation is a coincidence; there is no causal relationship between X and Y.
- The more firemen that are fighting a fire, the bigger the fire is going to be.
The actual causation is Y → X: The bigger the fire is, the more firemen are necessary to fight it.
- For a gas, an increase in pressure causes an increase in temperature.
This is Gay-Lussac's law for an ideal gas held at constant volume. In fact X → Y and Y → X. The causation works in both directions: an increase in either temperature or pressure causes an increase in the other.
- Children that sleep with the light on are likely to develop nearsightedness later in life.
This result was published on May 13, 1999, in the journal Nature. In fact, a follow-up study showed that Z → X and Z → Y. There is a strong link between parental nearsightedness and child nearsightedness. Also, nearsighted parents were more likely to leave the light on in a child's room.
- Women that take hormone replacement therapy (HRT) are less likely to have coronary heart disease.
At first glance X → Y, but after controlling for the third variable socio-economic group, the opposite effect was found: women that take HRT were more likely to develop heart disease.
- As ice cream sales increase, the rate of drowning deaths increases.
This is also a case of Z → X and Z → Y. Both events depend on the season of the year. In the summer months, ice cream sales increase; drowning deaths also increase because more people go swimming.
- Piracy causes global warming.
It is true that both piracy and global warming have increased over the past several decades, but this is just a coincidence. There is no causal relationship. Another explanation is that both result from a common third factor: population increase.
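
The "Z → X and Z → Y" pattern in several of these examples can be sketched with a small simulation. Everything here is an assumption made for illustration: the data are randomly generated rather than real, and the names `pearson_r` and `partial_r` are helpers defined for this sketch. The partial correlation uses the standard first-order formula for the correlation between X and Y after the linear effect of Z is removed.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r(xs, ys, zs):
    """Correlation of X and Y after controlling for Z (first-order partial correlation)."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Simulated (not real) data: Z drives both X and Y; X and Y never influence each other.
rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]      # e.g. the season
x = [zi + rng.gauss(0, 0.5) for zi in z]     # e.g. ice cream sales
y = [zi + rng.gauss(0, 0.5) for zi in z]     # e.g. drowning deaths

r_raw = pearson_r(x, y)            # strong, even though X does not cause Y
r_controlled = partial_r(x, y, z)  # close to 0 once Z is accounted for

print("raw correlation:       ", round(r_raw, 3))
print("controlling for Z:     ", round(r_controlled, 3))
```

In this simulation the raw correlation is strong while the controlled one is close to 0, which is the signature of a common cause rather than a direct causal link.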
- X must precede Y in time,
- the causation must be plausible,
- common causes from other variables are controlled for.
- Example 1: There is a perfect quadratic relationship between x and y, but the correlation is -0.368. A quadratic relationship between x and y means that there is an equation y = ax² + bx + c that allows us to compute y from x. a, b, and c must be determined from the dataset.
- Example 2: Without the outlier, the correlation is 1; with the outlier the correlation is 0.514.
- Example 3: Without the outlier, the correlation is 1; with the outlier the correlation is 0.522.
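
The outlier effect in Examples 2 and 3 can be reproduced in spirit with a few lines of Python. The data sets behind those examples are not given here, so the points below are hypothetical and will not match the 0.514 or 0.522 figures; `pearson_r` is a helper written for this sketch using the standard formula.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: ten points lying exactly on the line y = 2x
xs = list(range(1, 11))
ys = [2 * x for x in xs]

# The same points plus one hypothetical outlier far from the line
outlier = (20, 0)
xs_out = xs + [outlier[0]]
ys_out = ys + [outlier[1]]

r_without = pearson_r(xs, ys)        # perfect line, so r is 1
r_with = pearson_r(xs_out, ys_out)   # a single point pulls r far below 1

print("without outlier:", round(r_without, 3))
print("with outlier:   ", round(r_with, 3))
```

One extreme point out of eleven is enough to wipe out most of the linear relationship, which is why a scatterplot should be inspected before a correlation is reported.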
- Compute the correlation between meter and kilo.
- Create a scatterplot with a linear regression line (linear trend line) of meter (x-variable) and kilo (y-variable).
- Repeat steps 1 and 2 after omitting the point that represents William Perry.