When data is symmetric or approximately symmetric, why is the mean preferred to the median?

No doubt you have been told otherwise, but mean $=$ median does not imply symmetry.

There's a measure of skewness based on mean minus median (the second Pearson skewness), but, like any of the common skewness measures, it can be 0 when the distribution is not symmetric.
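To make this concrete, here is a minimal sketch in Python. The data set is made up for illustration: its mean and median coincide, so the second Pearson skewness is 0, yet the sample is plainly lopsided and its moment-based skewness is negative.

```python
import statistics

# Hypothetical sample chosen so that mean == median but the shape is lopsided.
data = [-4, 0, 0, 1, 3]

mean = statistics.fmean(data)
median = statistics.median(data)
sd = statistics.pstdev(data)

# Second Pearson skewness: 3 * (mean - median) / standard deviation.
pearson2 = 3 * (mean - median) / sd

# Moment-based skewness: average cubed deviation divided by sd cubed.
moment_skew = sum((x - mean) ** 3 for x in data) / len(data) / sd ** 3

print(mean == median)    # True
print(pearson2 == 0)     # True
print(moment_skew < 0)   # True: the left tail is longer
```

The distances below the median (4 and 0) do not match the distances above it (1 and 3), so no mirror image exists even though mean minus median is zero.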

Similarly, the relationship between mean and median doesn't necessarily imply a similar relationship between the midhinge ($(Q_1+Q_3)/2$) and median. They can suggest opposite skewness, or one may equal the median while the other doesn't.
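A small made-up sample shows how the two comparisons can disagree. This is only a sketch: quartile definitions vary between packages, and the one assumed here is linear interpolation over the sample (Python's `method="inclusive"`).

```python
import statistics

# Hypothetical sample: one large value on the right, a stretched lower half.
data = [0, 1, 2, 9, 10, 10, 11, 12, 100]

mean = statistics.fmean(data)
median = statistics.median(data)

# Quartiles by linear interpolation over the sample ("inclusive" method).
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
midhinge = (q1 + q3) / 2

print(mean > median)      # True: mean minus median suggests right skew
print(midhinge < median)  # True: midhinge minus median suggests left skew
```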

One way to investigate symmetry is via a symmetry plot*.

If $Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)}$ are the ordered observations from smallest to largest (the order statistics), and $M$ is the median, then a symmetry plot plots $Y_{(n)}-M$ vs $M-Y_{(1)}$, $Y_{(n-1)}-M$ vs $M-Y_{(2)}$, and so on.
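The coordinates of such a plot are easy to compute by hand. The sketch below is not Minitab's implementation; `symmetry_pairs` is a hypothetical helper that returns each point as (distance below the median, distance above the median), which you could then pass to any plotting tool.

```python
import statistics

def symmetry_pairs(values):
    """Return (M - Y_(i), Y_(n+1-i) - M) for i = 1 .. n // 2."""
    y = sorted(values)
    n = len(y)
    m = statistics.median(y)
    return [(m - y[i], y[n - 1 - i] - m) for i in range(n // 2)]

# Perfectly symmetric sample: every point lies on the line y = x.
print(symmetry_pairs([1, 2, 3, 4, 5]))   # [(2, 2), (1, 1)]

# Right-skewed sample: upper distances exceed lower distances.
print(symmetry_pairs([1, 2, 3, 5, 12]))  # [(2, 9), (1, 2)]
```

Points above the line $y = x$ indicate a longer right tail; points below it indicate a longer left tail.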

* Minitab can do those. Indeed I raise this plot as a possibility because I've seen them done in Minitab.

Here are four examples:

$\hspace{6cm} \textbf{Symmetry plots}$

[Figure: four symmetry plots, one per sample.]

(The actual distributions were, left to right, top row first: Laplace, Gamma(shape=0.8), beta(2,2) and beta(5,2). The code is Ross Ihaka's.)

With heavy-tailed symmetric examples, it's often the case that the most extreme points can be very far from the line; you would pay less attention to the distance from the line of one or two points as you near the top right of the figure.

There are, of course, other plots (I mentioned the symmetry plot not out of any particular advocacy for it, but because I knew it was already implemented in Minitab). So let's explore some others.

Here are the corresponding skewplots that Nick Cox suggested in comments:

$\hspace{6cm} \textbf{Skewness plots}$

[Figure: four skewness plots, one per sample.]

In these plots, a trend up would indicate a typically heavier right tail than left and a trend down would indicate a typically heavier left tail than right, while symmetry would be suggested by a relatively flat (though perhaps fairly noisy) plot.

Nick suggests that this plot is better (specifically "more direct"). I am inclined to agree; the interpretation of the plot seems consequently a little easier, though the information in the corresponding plots is often quite similar (after you subtract the unit slope in the first set, you get something very like the second set).
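The text does not spell out the construction of the skewness plot, so the following is a sketch under an assumed definition: for each pair of order statistics, plot the midsummary minus the median against the spread of the pair. The helper name `skewness_pairs` is hypothetical.

```python
import statistics

def skewness_pairs(values):
    """Return (spread, mid - median) for each pair of order statistics.

    Assumed definition: spread = Y_(n+1-i) - Y_(i) and
    mid = (Y_(i) + Y_(n+1-i)) / 2; the source does not state it.
    """
    y = sorted(values)
    n = len(y)
    m = statistics.median(y)
    pairs = []
    for i in range(n // 2):
        lo, hi = y[i], y[n - 1 - i]
        pairs.append((hi - lo, (lo + hi) / 2 - m))
    return pairs

print(skewness_pairs([1, 2, 3, 4, 5]))   # [(4, 0.0), (2, 0.0)]: flat
print(skewness_pairs([1, 2, 3, 5, 12]))  # [(11, 3.5), (3, 0.5)]: rises with spread
```

Under this definition the vertical coordinate is half the difference between the upper and lower distances used in the symmetry plot, since $(Y_{(n+1-i)}-M)-(M-Y_{(i)}) = 2\,(\text{mid}_i - M)$. That is one way to see why removing the unit slope from the first set of plots gives something close to the second.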

[Of course, none of these things will tell us that the distribution the data were drawn from is actually symmetric; we get an indication of how near-to-symmetric the sample is, and so to that extent we can judge if the data are reasonably consistent with being drawn from a near-symmetrical population.]

What Is Symmetrical Distribution?

A symmetrical distribution occurs when values at equal distances above and below the center appear with equal frequency; often the mean, median, and mode all occur at the same point. If a line were drawn through the middle of the graph, it would reveal two sides that mirror one another.

In graphical form, symmetrical distributions may appear as a normal distribution (i.e., bell curve). Symmetrical distribution is a core concept in technical trading as the price action of an asset is assumed to fit a symmetrical distribution curve over time.

Symmetrical distributions can be contrasted with asymmetrical distributions, which are probability distributions that exhibit skewness or other irregularities in their shape.

Key Takeaways

  • A symmetrical distribution is one where splitting the data down the middle produces mirror images.
  • Bell curves are a commonly-cited example of symmetrical distributions.
  • Having a symmetrical distribution is useful for analyzing data and making inferences based on statistical techniques.
  • In finance, data-generating processes with symmetrical distributions can help inform trading decisions.
  • Real-world price data, however, tend to exhibit asymmetrical qualities such as right-skewness.

What Does a Symmetrical Distribution Tell You?

Symmetrical distributions are used by traders to establish the value area for a stock, currency, or commodity on a set time frame. This time frame can be intraday, such as 30-minute intervals, or it can be longer-term using sessions or even weeks and months. A bell curve can be drawn around the price points hit during that time period and it is expected that most of the price action—approximately 68% of price points—will fall within one standard deviation of the center of the curve. The curve is applied to the y-axis (price) as it is the variable whereas time throughout the period is simply linear. So the area within one standard deviation of the mean is the value area where price and the actual value of the asset are most closely matched.
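The 68% figure holds for a normal distribution specifically. As a quick check, the probability of falling within $k$ standard deviations of the mean can be computed from the error function (the function name below is illustrative):

```python
import math

def normal_prob_within(k):
    """P(|Z| < k) for a standard normal Z, via the error function."""
    return math.erf(k / math.sqrt(2))

print(round(normal_prob_within(1), 4))  # 0.6827
print(round(normal_prob_within(2), 4))  # 0.9545
```

For price data that are not normally distributed, the share of points inside one standard deviation can differ from this value.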

If the price action takes the asset price out of the value area, then it suggests that price and value are out of alignment. If the breach is to the bottom of the curve, the asset is considered to be undervalued. If it is to the top of the curve, the asset is considered to be overvalued. The assumption is that the asset will revert to the mean over time. When traders speak of reversion to the mean, they are referring to the symmetrical distribution of price action over time that fluctuates above and below the average level.

The central limit theorem states that the distribution of sample means approximates a normal distribution (i.e., becomes symmetric) as the sample size becomes larger, regardless of the shape of the population distribution (provided it has finite variance), including asymmetric ones.
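A small simulation can illustrate this. The sketch below assumes an exponential population (strongly right-skewed) and compares the skewness of individual draws with the skewness of means of 50 draws. The seed, sample sizes, and helper names are arbitrary choices for the example.

```python
import random
import statistics

def moment_skewness(xs):
    """Average cubed deviation from the mean, divided by sd cubed."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum((x - mean) ** 3 for x in xs) / len(xs) / sd ** 3

def sample_means(sample_size, reps, rng):
    """Means of `reps` independent samples from an Exponential(1) population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(reps)
    ]

rng = random.Random(0)
skew_n1 = moment_skewness(sample_means(1, 5000, rng))    # raw draws
skew_n50 = moment_skewness(sample_means(50, 5000, rng))  # means of 50 draws

# The means of larger samples are much closer to symmetric.
print(skew_n1, skew_n50)
```

In theory the skewness of the mean of $n$ exponential draws is $2/\sqrt{n}$, so it shrinks toward 0 as $n$ grows; the simulated values should be roughly in line with that.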

Example of How Symmetrical Distribution Is Used

Symmetrical distribution is most often used to put price action into context. The further the price action wanders from the value area one standard deviation on each side of the mean, the greater the probability that the underlying asset is being under or overvalued by the market. This observation will suggest potential trades to place based on how far the price action has wandered from the mean for the time period being used. On larger time scales, however, there is a much greater risk of missing the actual entry and exit points.

[Figure omitted. Image by Julie Bang © Investopedia 2019]

Symmetrical Distributions vs. Asymmetrical Distributions

The opposite of symmetrical distribution is asymmetrical distribution. A distribution is asymmetric if its two halves are not mirror images of each other; in most cases this shows up as skew. A skewed distribution is either left-skewed or right-skewed. A left-skewed distribution, also known as a negatively skewed distribution, has a longer left tail. A right-skewed distribution, or positively skewed distribution, has a longer right tail. Determining whether the skew is positive or negative is important when analyzing a data set because it affects data distribution analysis. A log-normal distribution is a commonly cited asymmetrical distribution featuring right skew.
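As an illustration, the sign of the moment-based skewness coefficient separates the two cases. The samples below are made up, and the moment coefficient is only one of several skewness measures in use.

```python
import statistics

def moment_skewness(xs):
    """Average cubed deviation from the mean, divided by sd cubed."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum((x - mean) ** 3 for x in xs) / len(xs) / sd ** 3

right_skewed = [1, 2, 2, 3, 10]              # long right tail
left_skewed = [-x for x in right_skewed]     # mirror image: long left tail
symmetric = [1, 2, 3, 4, 5]

print(moment_skewness(right_skewed) > 0)  # True: positive skew
print(moment_skewness(left_skewed) < 0)   # True: negative skew
print(moment_skewness(symmetric))         # 0.0
```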

Skewness is often an important component of a trader’s analysis of a potential investment return. A symmetrical distribution of returns is evenly distributed around the mean. An asymmetric distribution with a positive right skew indicates that historical returns that deviated from the mean were primarily concentrated on the bell curve’s left side.

Conversely, a negative left skew shows historical returns deviating from the mean concentrated on the right side of the curve.

[Figure: Normal vs. skewed distributions. Image by Sabrina Jiang © Investopedia 2020]

Limitations of Using Symmetrical Distributions

A common investment refrain is that past performance does not guarantee future results; however, past performance can illustrate patterns and provide insight for traders looking to make a decision about a position. Symmetrical distribution is a general rule of thumb, but no matter the time period used, there will often be periods of asymmetrical distribution on that time scale. This means that, although the bell curve will generally return to symmetry, there can be periods of asymmetry that establish a new mean for the curve to center on. This also means that trading based solely on the value area of a symmetrical distribution can be risky if the trades are not confirmed by other technical indicators.

What Is the Relationship Between Mean, Median, and Mode in a Symmetrical Distribution?

In a symmetrical, unimodal distribution such as the normal distribution (bell curve), all three of these descriptive statistics take the same value. The mean and median also coincide in other symmetric distributions, such as the uniform distribution (where all values in the range are equally likely, so the density is a horizontal line and there is no single mode) or the binomial distribution with success probability 0.5, which counts successes across trials that each have one of two outcomes (e.g., zero or one, yes or no, true or false).

On rare occasions, a symmetrical distribution may have two modes (neither of which is the mean or median), for instance one that looks like two identical hilltops equidistant from the center.
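Two small made-up samples illustrate both situations: a symmetric sample with one peak, where all three statistics agree, and a symmetric sample with two peaks, where the mean and median agree but neither mode matches them.

```python
import statistics

unimodal = [1, 2, 3, 3, 3, 4, 5]   # symmetric, single peak at 3
bimodal = [1, 1, 2, 3, 4, 5, 5]    # symmetric, peaks at 1 and 5

for data in (unimodal, bimodal):
    print(
        statistics.fmean(data),      # mean
        statistics.median(data),     # median
        statistics.multimode(data),  # all most-frequent values
    )
# 3.0 3 [3]
# 3.0 3 [1, 5]
```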

Is the Median Symmetric?

The median is the point with 50% of data values above it and 50% below it. Thus it is the mid-point of the data. In a symmetrical distribution, the median is also the center of symmetry: the two halves on either side of it are mirror images. This is not the case for an asymmetric distribution.

What Is the Shape of a Frequency Distribution?

The "shape" of the frequency distribution of data is simply its graphical representation (e.g. as a bell curve, etc.). Visualizing the shape of the data can help analysts quickly understand if it is symmetrical or not.

What Is Symmetric vs. Asymmetric Data?

Symmetric data is observed when values occur with matching frequencies at equal distances above and below the center. Asymmetric data, on the other hand, may have skewness or other irregularities, so that the two sides of the distribution do not mirror each other.

When data is symmetric, why is the mean preferred to the median?

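Of the three measures of central tendency, the mean is most heavily influenced by any outliers or skewness. In a symmetrical, unimodal distribution, the mean, median, and mode are all equal, so they estimate the same center. In these cases, the mean is often the preferred measure of central tendency because it uses every observation and, for roughly normal data, varies less from sample to sample than the median.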
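One way to see the practical advantage of the mean for symmetric, roughly normal data is to compare how much the two statistics vary across repeated samples. The simulation below is a sketch that assumes a standard normal population; the seed, sample size, and number of repetitions are arbitrary.

```python
import random
import statistics

rng = random.Random(1)
sample_size, reps = 25, 4000

means, medians = [], []
for _ in range(reps):
    sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# Both statistics are centered near 0, but the mean is less variable.
var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)
print(var_mean, var_median)
print(var_mean < var_median)  # True for normal data
```

This ordering depends on the population. For symmetric but heavy-tailed distributions such as the Laplace, the median can be the less variable of the two.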

Why is the mean preferred over the median?

It's best to use the mean when the distribution of the data values is symmetrical and there are no clear outliers. It's best to use the median when the distribution of data values is skewed or when there are clear outliers.

Do you use mean or median for symmetric?

We have two measurements for the center of a distribution: the mean and the median. We use the mean as the measure of center when a distribution is symmetric (for example, bell-shaped or normal). We use the median when the distribution is skewed.

Why is the median sometimes preferred?

For example, the median is often used as a measure of central tendency for income distributions, which are generally highly skewed. Because the median depends only on the middle one or two values of the ordered data, it is largely unaffected by extreme outliers or non-symmetric distributions of scores.
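A short example with made-up incomes (in thousands) shows the effect of a single extreme value on each statistic:

```python
import statistics

incomes = [30, 35, 40, 45, 50]     # hypothetical incomes, in thousands
with_outlier = incomes + [1000]    # add one very high earner

print(statistics.fmean(incomes), statistics.median(incomes))            # 40.0 40
print(statistics.fmean(with_outlier), statistics.median(with_outlier))  # 200.0 42.5
```

The mean jumps from 40 to 200, while the median moves only from 40 to 42.5.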