Which of the following is found by subtracting the smallest number in a data set from the largest?

How to Find the Range of a Data Set | Formula & Examples

Published on September 11, 2020 by Pritha Bhandari. Revised on September 25, 2020.

In statistics, the range is the spread of your data from the lowest to the highest value in the distribution. It is a commonly used measure of variability.

Along with measures of central tendency, measures of variability give you descriptive statistics for summarizing your data set.

The range is calculated by subtracting the lowest value from the highest value. While a large range means high variability, a small range means low variability in a distribution.

Calculate the range

The formula to calculate the range is:

R = H – L

  • R = range
  • H = highest value
  • L = lowest value

The range is the easiest measure of variability to calculate. To find the range, follow these steps:

  1. Order all values in your data set from low to high.
  2. Subtract the lowest value from the highest value.

This process is the same regardless of whether your values are positive or negative, or whole numbers or fractions.
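The two steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a standard library routine: the function name data_range is an assumption made for this example, and the ages are the ones used in the article's own worked example.

```python
def data_range(values):
    """Return the range R = H - L of a non-empty list of numbers."""
    if not values:
        raise ValueError("data_range() needs at least one value")
    ordered = sorted(values)                   # step 1: order from low to high
    lowest, highest = ordered[0], ordered[-1]  # L and H
    return highest - lowest                    # step 2: subtract L from H


ages = [37, 19, 31, 29, 21, 26, 33, 36]
print(data_range(ages))             # 18
print(data_range([-2.5, 0.5, 4]))   # 6.5 (negatives and fractions work the same way)
```

Sorting is not strictly required (max(values) - min(values) gives the same answer), but it mirrors the manual procedure.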

Range example: Your data set is the ages of 8 participants.

Participant  1   2   3   4   5   6   7   8
Age          37  19  31  29  21  26  33  36

First, order the values from low to high to identify the lowest value (L) and the highest value (H).

Age  19  21  26  29  31  33  36  37

Then subtract the lowest from the highest value.

R = H – L

R = 37 – 19 = 18

The range of our data set is 18 years.

How useful is the range?

The range generally gives you a good indicator of variability when you have a distribution without extreme values. When paired with measures of central tendency, the range can tell you about the span of the distribution.

But the range can be misleading when you have outliers in your data set. One extreme value in the data will give you a completely different range.

Range example with an outlier: One value in your data set is replaced with an outlier.

Age  19  21  26  29  31  33  36  61

Using the same calculation, we get a very different result this time:

R = H – L

R = 61 – 19 = 42

With an outlier, our range is now 42 years.

In the example above, the range indicates much more variability in the data than there actually is. Although we have a large range, most values are actually clustered around a clear middle.

Because only two numbers are used, the range is easily influenced by outliers. It can’t tell you about the shape of the distribution of values on its own.

To get a clear idea of your data’s variability, the range is best used in combination with other measures of variability like interquartile range and standard deviation.
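As a rough sketch of that combination, the snippet below computes the range, the interquartile range and the sample standard deviation for the two age data sets using Python's standard statistics module. The helper name spread_summary is an assumption made for this example. statistics.quantiles is used with its default "exclusive" interpolation, which is only one of several quartile conventions, so other tools may report slightly different quartiles.

```python
import statistics


def spread_summary(values):
    """Return range, interquartile range and sample standard deviation."""
    # Default method="exclusive"; other quartile conventions can differ slightly.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(values),
    }


ages = [19, 21, 26, 29, 31, 33, 36, 37]
ages_with_outlier = [19, 21, 26, 29, 31, 33, 36, 61]

for label, data in [("original", ages), ("with outlier", ages_with_outlier)]:
    print(label, spread_summary(data))
```

Under this convention the range jumps from 18 to 42 while the interquartile range is the same for both data sets, because the outlier replaces only the largest value and the quartiles do not depend on it.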

4.5 Measures of dispersion

4.5.1 Calculating the range and interquartile range

To calculate the range, you need to find the largest observed value of a variable (the maximum) and subtract the smallest observed value (the minimum). The range only takes into account these two values and ignores the data points between the two extremities of the distribution. It's used as a supplement to other measures, but it is rarely used as the sole measure of dispersion because it’s sensitive to extreme values.

The interquartile range and semi-interquartile range give a better idea of the dispersion of data. To calculate these two measures, you need to know the values of the lower and upper quartiles. The lower quartile, or first quartile (Q1), is the value under which 25% of data points are found when they are arranged in increasing order. The upper quartile, or third quartile (Q3), is the value under which 75% of data points are found when arranged in increasing order. The median is considered the second quartile (Q2). The interquartile range is the difference between upper and lower quartiles. The semi-interquartile range is half the interquartile range.

When the data set is small, it is simple to identify the values of quartiles. Let’s look at an example.

Example 1 – Range and interquartile range of a data set

Find the quartiles of this data set: 6, 47, 49, 15, 43, 41, 7, 39, 43, 41, 36.

You first need to arrange the data points in increasing order. As you do so, you can give them a rank to indicate their position in the data set. Rank 1 is the data point with the smallest value, rank 2 is the data point with the second-lowest value, etc.

Table 4.5.1.1
Rank of data points

Rank  Value
1  6
2  7
3  15
4  36
5  39
6  41
7  41
8  43
9  43
10  47
11  49

Then you need to find the rank of the median to split the data set in two. As we have seen in the section on the median, if the number of data points is odd, the rank of the median is

(n + 1) ÷ 2 = (11 + 1) ÷ 2 = 6

The rank of the median is 6, which means there are five points on each side.

Then you need to split the lower half of the data in two again to find the lower quartile. The lower quartile will be the point of rank (5 + 1) ÷ 2 = 3. The result is Q1 = 15. The second half must also be split in two to find the value of the upper quartile. The rank of the upper quartile will be 6 + 3 = 9. So Q3 = 43.

Once you have the quartiles, you can easily measure the spread. The interquartile range will be Q3 - Q1, which gives 28 (43-15). The semi-interquartile range is 14 (28 ÷ 2) and the range is 43 (49-6).
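The procedure used in this example (split the ordered data at the median, then take the median of each half) can be sketched in Python as follows. The function name quartiles_by_halves is an assumption made for this illustration. When the number of data points is odd, the median itself is left out of both halves, as in the walk-through above.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Needs at least two values so that each half is non-empty.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                                    # points below the median
    upper = ordered[half + 1:] if n % 2 else ordered[half:]   # points above the median
    return median(lower), median(ordered), median(upper)


data = [6, 47, 49, 15, 43, 41, 7, 39, 43, 41, 36]
q1, q2, q3 = quartiles_by_halves(data)
iqr = q3 - q1

print(q1, q2, q3)              # 15 41 43
print(iqr, iqr / 2)            # 28 14.0
print(max(data) - min(data))   # 43
```

Spreadsheets and statistical packages often use interpolation-based quartile definitions instead, so their results may differ slightly from this hand method.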

For larger data sets, you can use the cumulative relative frequency distribution to help identify the quartiles or, even better, the basic statistics functions available in a spreadsheet or statistical software that give results more easily.

What happens when the data set includes a data point whose value is considered extreme compared to the rest of the distribution?

Example 2 – Range and interquartile range in presence of an extreme value

Find the range and interquartile range of the data set of example 1, to which a data point of value 75 was added.

The range would now be 69 (75-6). The median would be the mean of the values of the data point of rank 12 ÷ 2 = 6 and the data point of rank (12 ÷ 2) + 1 = 7. Because it falls between ranks 6 and 7, there are six data points on each side of the median. The lower quartile is the mean of the values of the data point of rank 6 ÷ 2 = 3 and the data point of rank (6 ÷ 2) + 1 = 4. The result is (15 + 36) ÷ 2 = 25.5. The upper quartile is the mean of the values of the data point of rank 6 + 3 = 9 and the data point of rank 6 + 4 = 10, which is (43 + 47) ÷ 2 = 45. The interquartile range is 45 - 25.5 = 19.5.

In summary, the range went from 43 to 69, an increase of 26 compared to example 1, just because of a single extreme value. The more robust interquartile range went from 28 to 19.5, a decrease of only 8.5.

The second example demonstrated that the interquartile range is more robust than the range when the data set includes a value considered extreme. It’s not a perfect measure, though. In this example, we might have expected that when adding an extreme value, the measure of dispersion would increase, but the opposite happened because there was a great difference between the values of data points of ranks 3 and 4.

The five-value series formed by the minimum, the three quartiles and the maximum is often referred to as “the five-number summary.” It is a well-known way to summarize data sets. In the following section on box and whisker plots, we will see a useful method to visualize this five-number summary.
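As an illustration, the five-number summary of both example data sets can be computed with a short Python sketch. The function name five_number_summary is an assumption made for this example, and the quartiles follow the same median-of-halves convention as the worked examples, which is not the only convention in use.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, the median itself belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


example_1 = [6, 47, 49, 15, 43, 41, 7, 39, 43, 41, 36]
example_2 = example_1 + [75]   # example 1 plus the extreme value

for name, data in [("example 1", example_1), ("example 2", example_2)]:
    low, q1, q2, q3, high = five_number_summary(data)
    print(name, (low, q1, q2, q3, high), "range:", high - low, "IQR:", q3 - q1)
```

The printed ranges (43 and 69) and interquartile ranges (28 and 19.5) match the values worked out by hand in examples 1 and 2.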

Date modified: 2021-09-02

Which of the following provides a measure of spread?

The variance and the standard deviation are measures of the spread of the data around the mean.

Which of the following represents the most frequently occurring score?

Mode. The mode is defined as the most frequently occurring score. If the data are arranged in a frequency distribution similar to illustration 4, then the mode is easy to identify.

What do you call the number which appears most often in a given set of scores or data?

The mode is the value that appears most frequently in a data set.

What is the middle value in a set of data?

Arrange the data from least to greatest or greatest to least; the median is the data value in the middle. If there is an even number of data values in the set, the median is the mean of the two middle values.