If the coefficient of determination is a positive value, what can be said about the regression equation?

What Is the Coefficient of Determination?

The coefficient of determination is a statistical measurement that examines how differences in one variable can be explained by the difference in a second variable, when predicting the outcome of a given event. In other words, this coefficient, which is more commonly known as R-squared (or R2), assesses how strong the linear relationship is between two variables, and is heavily relied on by researchers when conducting trend analysis. To cite an example of its application, this coefficient may contemplate the following question: if a woman becomes pregnant on a certain day, what is the likelihood that she would deliver her baby on a particular date in the future? In this scenario, this metric aims to calculate the correlation between two related events: conception and birth.

Key Takeaways

  • The coefficient of determination is a complex idea centered on the statistical analysis of models for data.
  • The coefficient of determination measures how much of the variability in one factor can be explained by its relationship to another factor.
  • This coefficient is commonly known as R-squared (or R²), and is sometimes referred to as the "goodness of fit."
  • This measure is represented as a value between 0.0 and 1.0. A value of 1.0 indicates a perfect fit, meaning the model reproduces the observed data exactly, while a value of 0.0 indicates that the model explains none of the variability in the data.

Understanding the Coefficient of Determination

The coefficient of determination measures how much of the variability in one factor can be explained by its relationship to another related factor. This measure, known as the "goodness of fit," is represented as a value between 0.0 and 1.0. A value of 1.0 indicates a perfect fit, meaning the model reproduces the observed data exactly, while a value of 0.0 indicates that the model explains none of the variability in the data. A value of 0.20, for example, suggests that 20% of the variance in the dependent variable is explained by the independent variable, while a value of 0.50 suggests that 50% is explained, and so forth.

Graphing the Coefficient of Determination

On a graph, the goodness of fit measures the distance between a fitted line and all of the data points that are scattered throughout the diagram. A tight set of data will have a regression line that's close to the points and a high level of fit, meaning that the distance between the line and the data is small. Although a good fit has an R² close to 1.0, this number alone cannot determine whether the data points or predictions are biased. It also doesn't tell analysts whether the coefficient of determination value is intrinsically good or bad. It is at the discretion of the user to evaluate the meaning of this value, and how it may be applied in the context of future trend analyses.

Published on April 22, 2022 by Shaun Turney. Revised on September 14, 2022.

The coefficient of determination is a number between 0 and 1 that measures how well a statistical model predicts an outcome.

Interpreting the coefficient of determination

Coefficient of determination (R²) | Interpretation
0 | The model does not predict the outcome.
Between 0 and 1 | The model partially predicts the outcome.
1 | The model perfectly predicts the outcome.

The coefficient of determination is often written as R², which is pronounced as “r squared.” For simple linear regressions, a lowercase r is usually used instead (r²).

What is the coefficient of determination?

The coefficient of determination (R²) measures how well a statistical model predicts an outcome. The outcome is represented by the model’s dependent variable.

The lowest possible value of R² is 0 and the highest possible value is 1. Put simply, the better a model is at making predictions, the closer its R² will be to 1.

Example: Coefficient of determination

Imagine that you perform a simple linear regression that predicts students’ exam scores (dependent variable) from their time spent studying (independent variable).
  • If the R² is 0, the linear regression model doesn’t allow you to predict exam scores any better than simply estimating that everyone has an average exam score.
  • If the R² is between 0 and 1, the model allows you to partially predict exam scores. The model’s estimates are not perfect, but they’re better than simply using the average exam score.
  • If the R² is 1, the model allows you to perfectly predict anyone’s exam score.

More technically, R² is a measure of goodness of fit. It is the proportion of variance in the dependent variable that is explained by the model.
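The three cases above can be checked numerically. The following is a minimal sketch, assuming made-up study hours and exam scores (they are not data from this article) and hypothetical helper names `fit_line` and `r_squared`. It computes R² as one minus the ratio of the squared residuals to the squared deviations from the mean, then compares a fitted line, a mean-only guess, and a perfect guess.

```python
# Hypothetical data: hours studied (x) and exam scores (y); not from the article.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
scores = [52.0, 55.0, 61.0, 70.0, 68.0, 78.0]


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(y, predictions):
    """Proportion of the variance in y that the predictions explain."""
    mean_y = sum(y) / len(y)
    rss = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - rss / tss


intercept, slope = fit_line(hours, scores)
line_predictions = [intercept + slope * h for h in hours]
mean_predictions = [sum(scores) / len(scores)] * len(scores)

r2_line = r_squared(scores, line_predictions)   # partial prediction
r2_mean = r_squared(scores, mean_predictions)   # no better than the average
r2_perfect = r_squared(scores, scores)          # every score predicted exactly

print(f"fitted line:     R^2 = {r2_line:.2f}")
print(f"mean-only guess: R^2 = {r2_mean:.2f}")
print(f"perfect guess:   R^2 = {r2_perfect:.2f}")
```

Guessing the average for everyone gives an R² of 0 and reproducing every score gives an R² of 1, which matches the first and third bullets. The fitted line lands somewhere in between.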

Graphing your linear regression data usually gives you a good clue as to whether its R² is high or low. For example, the graphs below show two sets of simulated data:

  • The observations are shown as dots.
  • The model’s predictions (the line of best fit) are shown as a black line.
  • The distances between the observations and their predicted values (the residuals) are shown as purple lines.

You can see in the first dataset that when the R² is high, the observations are close to the model’s predictions. In other words, most points are close to the line of best fit:

[Figure: simulated data with a high R²; most points lie close to the line of best fit]

Note: In a simple linear regression, the coefficient of determination is never negative, even when the correlation is negative, because it is the square of the correlation coefficient.

In contrast, you can see in the second dataset that when the R² is low, the observations are far from the model’s predictions. In other words, when the R² is low, many points are far from the line of best fit:

[Figure: simulated data with a low R²; many points lie far from the line of best fit]

Calculating the coefficient of determination

You can choose between two formulas to calculate the coefficient of determination (R²) of a simple linear regression. The first formula is specific to simple linear regressions, and the second formula can be used to calculate the R² of many types of statistical models.

Formula 1: Using the correlation coefficient

Formula 1:

$$R^2 = (r)^2$$

Where r = Pearson correlation coefficient

Example: Calculating R² using the correlation coefficient

You are studying the relationship between heart rate and age in children, and you find that the two variables have a negative Pearson correlation, r.

This value can be used to calculate the coefficient of determination (R²) using Formula 1: squaring r gives the R², which is positive even though r itself is negative.
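As an illustration of Formula 1, here is a minimal sketch that computes a Pearson correlation and squares it. The ages and heart rates are invented for the example (they are not the values from the original study example), and `pearson_r` is a hypothetical helper written out by hand so that the block needs nothing beyond the standard library.

```python
import math

# Hypothetical measurements: ages in years and heart rates in beats per minute.
ages = [2, 4, 6, 8, 10, 12]
heart_rates = [110, 104, 98, 95, 88, 85]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(ages, heart_rates)
r_squared = r ** 2  # Formula 1: square the correlation coefficient

print(f"r   = {r:.2f}")          # negative: heart rate falls as age rises
print(f"R^2 = {r_squared:.2f}")  # positive, because a square cannot be negative
```

The sign of r is lost when it is squared, so the R² alone does not say whether the relationship is increasing or decreasing.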

Formula 2: Using the regression outputs

Formula 2:

$$R^2 = 1 - \frac{RSS}{TSS}$$

Where:

  • RSS = sum of squared residuals
  • TSS = total sum of squares

Example: Calculating R² using regression outputs

As part of performing a simple linear regression that predicts students’ exam scores (dependent variable) from their study time (independent variable), you calculate the sum of squared residuals (RSS) and the total sum of squares (TSS).

These values can be used to calculate the coefficient of determination (R²) using Formula 2: divide the RSS by the TSS, then subtract the result from 1.
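The arithmetic in Formula 2 can be sketched as follows. The RSS and TSS values here are hypothetical round numbers chosen to make the calculation easy to follow, not the values from the original example, and `r_squared_from_sums` is an illustrative helper name.

```python
def r_squared_from_sums(rss: float, tss: float) -> float:
    """Formula 2: R^2 = 1 - RSS / TSS."""
    if tss <= 0:
        raise ValueError("TSS must be positive (the outcome must vary)")
    if rss < 0:
        raise ValueError("RSS is a sum of squares and cannot be negative")
    return 1 - rss / tss


# Hypothetical regression outputs (assumed for illustration only).
rss = 25.0   # sum of squared residuals
tss = 100.0  # total sum of squares

r2 = r_squared_from_sums(rss, tss)
print(f"R^2 = {r2:.2f}")  # prints "R^2 = 0.75"
```

With these assumed values, a quarter of the total variation is left in the residuals, so the model accounts for the remaining three quarters.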

Interpreting the coefficient of determination

You can interpret the coefficient of determination (R²) as the proportion of variance in the dependent variable that is predicted by the statistical model.

Another way of thinking of it is that the R² is the proportion of variance that is shared between the independent and dependent variables.

You can also say that the R² is the proportion of variance “explained” or “accounted for” by the model. The proportion that remains (1 − R²) is the variance that is not predicted by the model.

If you prefer, you can write the R² as a percentage instead of a proportion. Simply multiply the proportion by 100.

R² as an effect size

Lastly, you can also interpret the R² as an effect size: a measure of the strength of the relationship between the dependent and independent variables. Psychologist and statistician Jacob Cohen (1988) suggested the following rules of thumb for simple linear regressions:

R² as an effect size

Minimum coefficient of determination (R²) value | Effect size interpretation
.01 | Small
.09 | Medium
.25 | Large
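The rules of thumb in the table can be expressed as a small lookup. This is only a sketch: the function name `cohen_effect_size` is hypothetical, and the "negligible" label for values below .01 is an assumption added here, since the table gives no label for that range.

```python
def cohen_effect_size(r2: float) -> str:
    """Label an R^2 value using the minimum thresholds from the table above."""
    if not 0 <= r2 <= 1:
        raise ValueError("R^2 is expected to lie between 0 and 1")
    if r2 >= 0.25:
        return "large"
    if r2 >= 0.09:
        return "medium"
    if r2 >= 0.01:
        return "small"
    # Assumption: the table does not name effects below .01.
    return "negligible"


for value in (0.005, 0.05, 0.09, 0.71):
    print(f"R^2 = {value}: {cohen_effect_size(value)}")
```

Each threshold is a minimum, so a value exactly at .09 is already labelled medium.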

Be careful: the R² on its own can’t tell you anything about causation.

Example: Interpreting R²

A simple linear regression that predicts students’ exam scores (dependent variable) from their study time (independent variable) has an R² of .71. From this R² value, we know that:
  • 71% of the variance in students’ exam scores is predicted by their study time
  • 29% of the variance in students’ exam scores is unexplained by the model
  • The students’ study time has a large effect on their exam scores

Studying longer may or may not cause an improvement in the students’ scores. Although this causal relationship is very plausible, the R² alone can’t tell us why there’s a relationship between students’ study time and exam scores.

For example, students might find studying less frustrating when they understand the course material well, so they study longer.

Reporting the coefficient of determination

If you decide to include a coefficient of determination (R²) in your research paper, dissertation or thesis, you should report it in your results section. You can follow these rules if you want to report statistics in APA Style:

  • You should use “r²” for statistical models with one independent variable (such as simple linear regressions). Use “R²” for statistical models with multiple independent variables.
  • You don’t need to provide a reference or formula since the coefficient of determination is a commonly used statistic.
  • You should italicize r² and R² when reporting their values (but don’t italicize the ²).
  • You shouldn’t include a leading zero (a zero before the decimal point) since the coefficient of determination can’t be greater than one.
  • You should round the value to two decimal places.
  • Very often, the coefficient of determination is provided alongside related statistical results, such as the F value, degrees of freedom, and p value.

Example: Reporting r² in APA Style

Students’ exam scores were predicted by their study time, r² = .71, F(1,32) = 7.33, p = .003.
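The formatting rules above (symbol choice, no leading zero, two decimal places) can be sketched as a small helper. The function name `format_apa_r2` and its `multiple_predictors` parameter are hypothetical. Italics cannot be represented in a plain string, so that rule is left to the word processor.

```python
def format_apa_r2(value: float, multiple_predictors: bool = False) -> str:
    """Format a coefficient of determination following the rules listed above."""
    if not 0 <= value <= 1:
        raise ValueError("the coefficient of determination must be between 0 and 1")
    # Lowercase r for one independent variable, uppercase R for several.
    symbol = "R²" if multiple_predictors else "r²"
    rounded = f"{value:.2f}"  # two decimal places
    if rounded.startswith("0"):
        rounded = rounded[1:]  # drop the leading zero: "0.71" becomes ".71"
    return f"{symbol} = {rounded}"


print(format_apa_r2(0.71))                            # r² = .71
print(format_apa_r2(0.5, multiple_predictors=True))   # R² = .50
```

A value of exactly 1 keeps its leading digit ("1.00"), since there is no zero to drop.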

Sources in this article

This Scribbr article

Turney, S. (September 14, 2022). Coefficient of Determination (R²) | Calculation & Interpretation. Scribbr. Retrieved October 24, 2022, from https://www.scribbr.com/statistics/coefficient-of-determination/

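If the coefficient of determination is a positive value, what does that say about the regression equation?

Answer and Explanation: If the coefficient of determination is a positive value, then the regression equation could have either a positive or a negative slope. In a simple linear regression, R² is the square of the correlation coefficient, so it is positive whether the relationship is increasing or decreasing.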

How is the coefficient of determination calculated from the regression?

The coefficient of determination can also be found with the following formula: R² = MSS/TSS = (TSS − RSS)/TSS, where MSS is the model sum of squares (also known as ESS, or explained sum of squares), which is the sum of the squared differences between each prediction from the linear regression and the mean of that variable; TSS is the ...
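The equality MSS/TSS = (TSS − RSS)/TSS can be checked numerically. The sketch below assumes an ordinary least squares line with an intercept, fitted to made-up data. The variable names are illustrative and the numbers are not from any source quoted here.

```python
# Hypothetical observations; any data set with varying x and y would do.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least squares fit of y = intercept + slope * x.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predictions = [intercept + slope * xi for xi in x]

tss = sum((yi - mean_y) ** 2 for yi in y)                    # total sum of squares
mss = sum((pi - mean_y) ** 2 for pi in predictions)          # model (explained) sum of squares
rss = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))  # sum of squared residuals

r2_from_mss = mss / tss
r2_from_rss = (tss - rss) / tss

print(f"MSS / TSS         = {r2_from_mss:.6f}")
print(f"(TSS - RSS) / TSS = {r2_from_rss:.6f}")
```

The two results agree up to floating-point rounding because, for a least squares line with an intercept, the total sum of squares splits into the model and residual parts (TSS = MSS + RSS).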

What does a positive coefficient of determination indicate?

A positive coefficient of determination does not indicate the direction of the relationship, because R² cannot be negative in a simple linear regression. The direction is given by the sign of the correlation coefficient (or of the slope): a positive correlation means that y generally increases with x, and a negative correlation means that y generally decreases as x increases.

Is the coefficient of determination the same as regression?

No. The coefficient of determination (R² or r-squared) is a statistical measure computed for a regression model. It gives the proportion of variance in the dependent variable that can be explained by the independent variable.