Let's take a look at some examples so we can get some practice interpreting the coefficient of determination r² and the correlation coefficient r.
Example 1. How strong is the linear relationship between temperatures in Celsius and temperatures in Fahrenheit? Here's a plot of an estimated regression equation based on n = 11 data points:

Statistical software reports that r² = 100% and r = 1.000. Both measures tell us that there is a perfect linear relationship between temperature in degrees Celsius and temperature in degrees Fahrenheit. We know that the relationship is perfect, namely that Fahrenheit = 32 + 1.8 × Celsius. It should be no surprise then that r² tells us that 100% of the variation in temperatures in Fahrenheit is explained by the temperature in Celsius.

Example 2. How strong is the linear relationship between the number of stories a building has and its height? One would think that as the number of stories increases, the height would increase, but not perfectly. Some statisticians compiled data on a set of n = 60 buildings reported in the 1994 World Almanac (bldgstories.txt). Statistical software reports r² = 90.4% and r = 0.951 and produced the following plot:

The positive sign of r tells us that the relationship is positive (as the number of stories increases, height increases), as we expected. Because r is close to 1, it tells us that the linear relationship is very strong, but not perfect. The r² value tells us that 90.4% of the variation in the height of the building is explained by the number of stories in the building.

Example 3. How strong is the linear relationship between the age of a driver and the distance the driver can see? If we had to guess, we might think that the relationship is negative: as age increases, the distance decreases.
A research firm (Last Resource, Inc., Bellefonte, PA) collected data on a sample of n = 30 drivers (signdist.txt). Statistical software reports that r² = 64.2% and r = -0.801 and produced the following output:

The negative sign of r tells us that the relationship is negative (as driving age increases, seeing distance decreases), as we expected. Because r is fairly close to -1, it tells us that the linear relationship is fairly strong, but not perfect. The r² value tells us that 64.2% of the variation in the seeing distance is explained by the age of the driver.

Example 4. How strong is the linear relationship between the height of a student and his or her grade point average? Data were collected on a random sample of n = 35 students in a statistics course at Penn State University (heightgpa.txt). Statistical software reports that r² = 0.3% and r = -0.053 and produced the following output:

Because r is quite close to 0, it suggests (not surprisingly, I hope) that there is next to no linear relationship between height and grade point average. Indeed, the r² value tells us that only 0.3% of the variation in the grade point averages of the students in the sample can be explained by their height. In short, we would need to identify another more important variable, such as number of hours studied, if predicting a student's grade point average is important to us.

Published on April 22, 2022 by Shaun Turney. Revised on September 14, 2022.

The coefficient of determination is a number between 0 and 1 that measures how well a statistical
model predicts an outcome. The coefficient of determination is often written as R², which is pronounced as “r squared.” For simple linear regressions, a lowercase r is usually used instead (r²).

The coefficient of determination (R²) measures how well a
statistical model predicts an outcome. The outcome is represented by the model’s dependent variable. The lowest possible value of R² is 0 and the highest possible value is 1. Put simply, the better a model is at making
predictions, the closer its R² will be to 1. More technically, R² is a measure of goodness of fit. It is the proportion of variance in the dependent variable that is explained by the model.

Graphing your linear regression data usually gives you a good clue as to whether its R² is high or low. For example, the graphs below show two sets of simulated data:

You can see in the first dataset that when the R² is high, the observations are close to the model’s
predictions. In other words, most points are close to the line of best fit:

In contrast, you can see in the second dataset that when the R² is low, the observations are far from the model’s predictions. In other words, many points are far from the line of best fit:

Calculating the coefficient of determination

You can choose between two formulas to calculate the coefficient of determination (R²) of a simple linear regression. The first formula is specific to simple linear regressions, and the second formula can be used to calculate the R² of many types of statistical models.

Formula 1: Using the correlation coefficient

R² = (r)²

Where r = Pearson correlation coefficient

Example: Calculating R² using the correlation coefficient

You are studying the relationship between heart rate and age in children, and you find that the two variables have a negative Pearson correlation:
This value can be used to calculate the coefficient of determination (R²) using Formula 1:
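As a minimal sketch of Formula 1 in Python, the snippet below squares a Pearson correlation to obtain R². The function name `r_squared_from_r` and the value r = -0.5 are illustrative assumptions; they are not figures from the heart rate study.

```python
def r_squared_from_r(r: float) -> float:
    """Return R² for a simple linear regression given Pearson's r (Formula 1)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson's r must lie between -1 and 1")
    return r * r


# Hypothetical correlation, chosen only for illustration.
example_r = -0.5
example_r_squared = r_squared_from_r(example_r)
print(f"r = {example_r}, R² = {example_r_squared}")  # r = -0.5, R² = 0.25
```

Note that the negative sign disappears after squaring, so R² describes only how strong the linear relationship is, not its direction.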
Formula 2: Using the regression outputs

R² = 1 − RSS / TSS

Where:
RSS = sum of squared residuals
TSS = total sum of squares
These values can be used to calculate the coefficient of determination (R²) using Formula 2:
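The following Python sketch illustrates the regression-output approach under stated assumptions. It takes R² = 1 − RSS / TSS, where RSS is the sum of squared residuals and TSS is the total sum of squares (the standard definition, assumed here to be what Formula 2 refers to). The function name and both datasets are made up for illustration; the Celsius values in particular are an assumption, since the 11 data points from Example 1 are not listed.

```python
def r_squared_from_fit(x, y):
    """Fit y = intercept + slope * x by least squares; return (R², r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # RSS: squared gaps between observed values and the fitted line.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # TSS: squared gaps between observed values and their mean.
    tss = syy

    r_squared = 1 - rss / tss
    r = sxy / (sxx * syy) ** 0.5  # Pearson correlation, for comparison
    return r_squared, r


# Made-up data: hours studied and exam score.
hours = [1, 2, 3, 4, 5, 6]
score = [52, 55, 61, 70, 68, 79]
r_squared, r = r_squared_from_fit(hours, score)
print(f"R² from RSS and TSS = {r_squared:.4f}")
print(f"r squared           = {r ** 2:.4f}")

# A perfect linear relationship, as in Example 1 (assumed Celsius values).
celsius = list(range(0, 55, 5))  # 11 points: 0, 5, ..., 50
fahrenheit = [32 + 1.8 * c for c in celsius]
perfect_r_squared, perfect_r = r_squared_from_fit(celsius, fahrenheit)
print(f"Celsius vs. Fahrenheit: R² = {perfect_r_squared:.4f}")
```

For a simple linear regression the two printed values for the made-up study data agree, which is why either formula can be used in that setting.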
Interpreting the coefficient of determination

You can interpret the coefficient of determination (R²) as the proportion of variance in the dependent variable that is predicted by the statistical model. Another way of thinking of it is that the R² is the proportion of variance that is shared between the independent and dependent variables. You can also say that the R² is the proportion of variance “explained” or “accounted for” by the model. The proportion that remains (1 − R²) is the variance that is not predicted by the model. If you prefer, you can write the R² as a percentage instead of a proportion. Simply multiply the proportion by 100.

R² as an effect size

Lastly, you can also interpret the R² as an effect size: a measure of the strength of the relationship between the dependent and independent variables. Psychologist and statistician Jacob Cohen (1988) suggested the following rules of thumb for simple linear regressions:
Be careful: the R² on its own can’t tell you anything about causation.

Example: Interpreting R²

A simple linear regression that predicts students’ exam scores (dependent variable) from their study time (independent variable) has an R² of .71. From this R² value, we know that:
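As a small worked sketch of the arithmetic behind this example, the Python snippet below converts R² = .71 into the percentage of variance predicted, the percentage not predicted (1 − R²), and the magnitude of the correlation. The variable names are illustrative. Taking the square root is valid here only because the model is a simple linear regression, and it gives the size of r, not its sign.

```python
import math

r_squared = 0.71  # R² from the exam score example

explained_pct = r_squared * 100          # share of variance predicted by the model
unexplained_pct = (1 - r_squared) * 100  # share of variance not predicted
abs_r = math.sqrt(r_squared)             # magnitude of r only; the sign is not recoverable

print(f"Explained: {explained_pct:.0f}%")      # Explained: 71%
print(f"Unexplained: {unexplained_pct:.0f}%")  # Unexplained: 29%
print(f"|r| = {abs_r:.3f}")                    # |r| = 0.843
```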
Studying longer may or may not cause an improvement in the students’ scores. Although this causal relationship is very plausible, the R² alone can’t tell us why there’s a relationship between students’ study time and exam scores. For example, students might find studying less frustrating when they understand the course material well, so they study longer.

Reporting the coefficient of determination

If you decide to include a coefficient of determination (R²) in your research paper, dissertation or thesis, you should report it in your results section. You can follow these rules if you want to report statistics in APA Style:
Frequently asked questions about the coefficient of determination
Is the correlation coefficient the same as the coefficient of determination?
No. The Pearson correlation coefficient (r) measures the strength and direction of the linear relationship between two variables, whereas the coefficient of determination (R²) measures how well a statistical model predicts an outcome.
Is correlation equal to R²?
No. The correlation, denoted by r, measures the strength and direction of the linear association between two variables. r is always between -1 and 1 inclusive. In a simple linear regression, the R-squared value, denoted by R², is the square of the correlation.
Can both the coefficient of determination and correlation be negative?
No. The correlation can be negative, but because the coefficient of determination is the result of squaring the correlation coefficient, it cannot be negative. (Even if the correlation is negative, squaring it gives a non-negative number.)
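The squaring relationship can be checked against the four examples at the start of this page. The sketch below, in Python, squares each reported r and compares the result with the reported r². The labels and variable names are illustrative, and because the reported r values are rounded to three decimals, agreement is only expected to one decimal place of the percentage.

```python
# (label, reported r, reported r² in percent) from Examples 1 to 4
reported = [
    ("Celsius vs. Fahrenheit", 1.000, 100.0),
    ("Stories vs. height", 0.951, 90.4),
    ("Age vs. seeing distance", -0.801, 64.2),
    ("Height vs. GPA", -0.053, 0.3),
]

computed = []
for label, r, reported_pct in reported:
    computed_pct = round(r ** 2 * 100, 1)  # square r, express as a percentage
    computed.append(computed_pct)
    print(f"{label}: r = {r:+.3f}, r² = {computed_pct}% (reported {reported_pct}%)")

# Squaring never produces a negative value, whatever the sign of r.
all_non_negative = all(pct >= 0 for pct in computed)
print("All r² values non-negative:", all_non_negative)
```

Two of the four correlations are negative, yet every squared value is non-negative, which is the point made in the answer above.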
What does it mean if the coefficient of determination R² is equal to?
The coefficient of determination is a number between 0 and 1 that measures how well a statistical model predicts an outcome.
...
Coefficient of Determination (R²) | Calculation & Interpretation.. |