Weighted Average

Weighted averages appear more often than one might realize. That said, there is often confusion about what they are and how, or when, to use them. This article will provide a brief overview of weighted averages in the hope of alleviating any confusion or misunderstandings that might exist.


Review: The Simple Average (or Mean)

The concept of an average, or mean, is familiar to most, if not all. For a given set of numeric items, the average is found by taking their sum and then dividing by the number of items. That is, given \(n\) items, \(x_{1}, x_{2}, \dots , x_{n}\), the average (commonly denoted \(\overline{x}\) and read as “\(x\) bar”) is:

\[ \overline{x} = \frac{\sum_{i=1}^{n} x_{i}}{n} \]

Consider the following example. There are seven (7) items, 62.5, 75, 80, 72.5, 79, 83 and 76. These are the \(x_{i}\)’s and \(n = 7\). The average, \(\bar{x}\), then, is:

\[ \begin{align} \overline{x} &= \frac{\sum_{i=1}^{7} x_{i}}{7} \\ \overline{x} &= \frac{62.5 + 75 + 80 + 72.5 + 79 + 83 + 76}{7} \\ \overline{x} &= \frac{528}{7} \\ \overline{x} &\approx 75.43 \end{align} \]
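As a quick check of the arithmetic above, here is a minimal Python sketch of the simple average. The names `simple_average` and `scores` are illustrative choices made for this sketch, not something defined elsewhere in this article.

```python
# The seven items from the example above.
scores = [62.5, 75, 80, 72.5, 79, 83, 76]


def simple_average(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("cannot average an empty list")
    return sum(values) / len(values)


mean = simple_average(scores)
print(round(mean, 2))  # 75.43
```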

There are many ways to interpret what this number means. Some might refer to it as being akin to the center of mass. Others might interpret this number as the expected value of whatever these seven numbers represent with all things being equal. If, for example, these seven numbers represent house prices in a given neighborhood, then, with all things being equal, each house price in the neighborhood would be 75.43. In many ways, the interpretation of the average is contextually dependent.

The important idea for this discussion is that each of these seven numbers (whatever they represent) contributes the same amount to the average. Any one particular value is no more, or no less, important than any other number.


The Weighted Average

The above discussion on the simple average presented the idea that in the simple average each item contributes the same amount to the average. Returning to the seven item example, if each item contributes the same amount to the average then any one number would contribute one seventh to the average. Algebraically, this is readily seen:

\[ \begin{align} \overline{x} &= \frac{\sum_{i=1}^{7} x_{i}}{7} \\ \overline{x} &= \frac{62.5 + 75 + 80 + 72.5 + 79 + 83 + 76}{7} \\ \overline{x} &= \frac{1}{7}(62.5) + \frac{1}{7}(75) + \frac{1}{7}(80) + \frac{1}{7}(72.5) + \frac{1}{7}(79) + \frac{1}{7}(83) + \frac{1}{7}(76) \\ \overline{x} &\approx 8.93 + 10.71 + 11.43 + 10.36 + 11.29 + 11.86 + 10.86 \\ \overline{x} &\approx 75.43 \end{align} \]

The amount that each item contributes to the average is called the weight. The interpretation of a weight can vary depending on the context. Weights are often thought of in terms of importance or relative value, but the best way, I believe, to think about a weight is in terms of contribution.

It would be instructive to take a closer look at the equation

\[ \begin{align} \overline{x} &= \frac{1}{7}(62.5) + \frac{1}{7}(75) + \frac{1}{7}(80) + \frac{1}{7}(72.5) + \frac{1}{7}(79) + \frac{1}{7}(83) + \frac{1}{7}(76) \\ \end{align} \]

The first thing to note is, as mentioned, that the value \(\frac{1}{7}\) is the weight applied to each \(x_{i}\), which we can then refer to as \(w_{i}\) (the weight applied to the \(i^{th}\) element). The second thing to note is that, in this example, the weights sum to one. This is not always the case, as will be discussed later, but for now suffice it to say that the weights will sum to some value. So, putting these two things together (there is a weight associated with each item, and the weights sum to some number), the definition of a weighted average becomes:

\[ \overline{x} = \frac{\sum_{i=1}^{n}w_{i}x_{i}}{\sum_{i=1}^{n} w_{i}} \]

One nice consequence of the weights summing to one is that the equation

\[ \overline{x} = \frac{\sum_{i=1}^{n}w_{i}x_{i}}{\sum_{i=1}^{n} w_{i}} \]

simplifies quite nicely to

\[ \overline{x} = \sum_{i=1}^{n}w_{i}x_{i} \]
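The general definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `weighted_average` and its error handling are choices made for this sketch, not part of any particular library.

```python
def weighted_average(values, weights):
    """Return sum(w_i * x_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


scores = [62.5, 75, 80, 72.5, 79, 83, 76]

# Equal weights of 1/7 reproduce the simple average.
equal_weights = [1 / 7] * len(scores)
print(round(weighted_average(scores, equal_weights), 2))  # 75.43

# Any other set of equal weights gives the same result, because
# dividing by the sum of the weights rescales them.
print(round(weighted_average(scores, [1] * len(scores)), 2))  # 75.43
```

Dividing by the sum of the weights inside the function means the caller does not need to care whether the weights already sum to one.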


“Classic” Weighted Average Example

An example that most can relate to is how the final grade for a course is computed. Assume that the seven numbers listed above (62.5, 75, 80, 72.5, 79, 83 and 76) represent grades received during the semester. Each number is a percentage. Assume further that the first four numbers, 62.5, 75, 80 and 72.5, represent the grades for the four quizzes given during the semester. The next two numbers, 79 and 83, represent the grades for the two tests given during the semester. And the last number, 76, is the grade on the final exam. At the beginning of the semester, the instructor outlined that quizzes would be worth 30% of the final grade, tests would be worth 40% and the final would be worth 30%.

What does all this mean? These “worths” are the weights applied to each individual category (quizzes, tests and final exam). It is instructive to explore what this really means in terms of contribution.

Assume that each quiz counts the same as any other quiz and each test counts the same as any other test (which is usually the case). Since there are four quizzes, then each quiz grade contributes 0.3/4 towards the final grade (where 0.3 is the decimal representation of 30%). Since there are two tests, each test grade would contribute 0.4/2 towards the final grade. And since there is only one final, the final exam grade would contribute 0.3 towards the final grade. The computation of the final grade for the course would be the weighted average of all the grades received.

\[ \begin{align} \overline{x} &= \frac{0.3}{4}(62.5) + \frac{0.3}{4}(75) + \frac{0.3}{4}(80) + \frac{0.3}{4}(72.5) + \frac{0.4}{2}(79) + \frac{0.4}{2}(83) + \frac{0.3}{1}(76) \\ \overline{x} &= 76.95 \end{align} \]

The first thing to notice is that this weighted average differs from the simple average of 75.43 seen above. This is because of the contribution each grade made to the overall average. A second thing to note is that the weights sum to one. Hence, the computation uses the simplified version of the weighted average:

\[ \overline{x} = \sum_{i=1}^{n}w_{i}x_{i} \]

where, again, \(w_{i}\) is the weight associated with the \(i^{th}\) element, \(x_{i}\).
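The per-item weighting described above can be checked with a short Python sketch. The variable names are illustrative; the per-item weights follow the stated assumption that every quiz counts equally and every test counts equally.

```python
quizzes = [62.5, 75, 80, 72.5]
tests = [79, 83]
final_exam = [76]

grades = quizzes + tests + final_exam

# Spread each category weight evenly over the items in that category:
# 0.3/4 per quiz, 0.4/2 per test, 0.3/1 for the final exam.
weights = (
    [0.3 / len(quizzes)] * len(quizzes)
    + [0.4 / len(tests)] * len(tests)
    + [0.3 / len(final_exam)] * len(final_exam)
)

# The weights sum to 1 (up to floating-point rounding), so the
# simplified form sum(w_i * x_i) applies.
final_grade = sum(w * x for w, x in zip(weights, grades))

print(round(sum(weights), 10))  # 1.0
print(round(final_grade, 2))    # 76.95
```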

This method of computing the final grade directly applies the weighted average concept. It is worth taking a closer inspection of what is going on here. Consider the algebraic expression for \(\overline{x}\) given above and perform some algebraic manipulation:

\[ \begin{align} \overline{x} &= \frac{0.3}{4}(62.5) + \frac{0.3}{4}(75) + \frac{0.3}{4}(80) + \frac{0.3}{4}(72.5) + \frac{0.4}{2}(79) + \frac{0.4}{2}(83) + \frac{0.3}{1}(76) \\ \overline{x} &= \frac{0.3}{4}\left(62.5 + 75 + 80 + 72.5\right) + \frac{0.4}{2}\left(79 +83\right) + 0.3(76) \\ \overline{x} &= 0.3\left(\frac{62.5 + 75 + 80 + 72.5}{4}\right) + 0.4\left(\frac{79 +83}{2}\right) + 0.3(76) \\ \end{align} \]

Something interesting results. What is within the parentheses are the averages of the quizzes, tests and final, respectively.

\[ \begin{align} \overline{x} &= 0.3\left(\frac{62.5 + 75 + 80 + 72.5}{4}\right) + 0.4\left(\frac{79 +83}{2}\right) + 0.3(76) \\ \overline{x} &= 0.3 \left(\overline{Q}\right) + 0.4 \left(\overline{T}\right) + 0.3 \left(\overline{F}\right) \end{align} \]

where the average of the quizzes is:

\[ \begin{align} \overline{Q} &= \left(\frac{62.5 + 75 + 80 + 72.5}{4}\right) \\ \overline{Q} &= 72.5 \end{align} \]

The average of the tests is:

\[ \begin{align} \overline{T} &= \left(\frac{79 +83}{2}\right) \\ \overline{T} &= 81 \end{align} \]

and (obviously) the average of the final exam is, \(\overline{F} = 76\).

The final grade would be:

\[ \begin{align} \overline{x} &= 0.3 \left(72.5\right) + 0.4 \left(81\right) + 0.3 \left(76\right) \\ \overline{x} &= 76.95 \end{align} \]

which is what was obtained above. The expression for \(\overline{x}\) in this case can be written as

\[ \overline{x} = \frac{\sum_{j=1}^{k}w_{j} x_{j}}{\sum_{j=1}^{k} w_{j}} \]

which looks exactly like the general expression for a weighted average given above, with one exception: the individual items now represent averages. To reflect this, the indices were purposely changed. Instead of using \(i\) to index all the items, \(j\) is used to index the averages of certain groupings of the original elements, and \(k\) represents the number of averages. In this example, \(k = 3\), for there are three averages, one for each grade type (quizzes, tests and final). The \(x_{j}\)’s are the three respective averages and the \(w_{j}\)’s are the three weights applied to the respective averages. Further, in this example, the weights sum to 1, so the computation of the weighted average simplifies to

\[ \overline{x} = \sum_{j=1}^{k}w_{j} x_{j} \]

This might seem confusing but it is really not. This example showed how given some number of items (in this case four quizzes, two tests and one final) the final grade could be computed by applying a unique weight to each item. However, it was shown how it is possible that individual items can be grouped together and weights then applied to their respective averages. This latter case is conceptually the same as the computation of any weighted average except that the items going into the computation represent averages instead of specific data values.
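The equivalence of the two routes (weighting each item versus weighting each group average) can be sketched in Python as follows. The `categories` dictionary and the helper `mean` are hypothetical names introduced only for this illustration.

```python
def mean(values):
    """Simple average of a non-empty list."""
    return sum(values) / len(values)


# Each category maps to (category weight, list of grades).
categories = {
    "quizzes": (0.3, [62.5, 75, 80, 72.5]),
    "tests": (0.4, [79, 83]),
    "final": (0.3, [76]),
}

# Route 1: apply each category weight to that category's average.
by_group = sum(weight * mean(scores) for weight, scores in categories.values())

# Route 2: spread each category weight evenly over its items and
# apply the resulting per-item weight to every individual grade.
by_item = sum(
    (weight / len(scores)) * score
    for weight, scores in categories.values()
    for score in scores
)

print(round(by_group, 2), round(by_item, 2))  # 76.95 76.95
```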


Normalization: When Weights Do Not Sum to 1

In the above examples, the weights summed to 1. It is not a requirement that the weights sum to 1. Consider this example. Suppose there are three items, 36.8, 26.7 and 40.7. The simple average, \(\bar{x}\), is:

\[ \begin{align} \overline{x} &= \frac{\sum_{i=1}^{3} x_{i}}{3} \\ \overline{x} &= \frac{36.8 + 26.7 + 40.7}{3} \\ \overline{x} &= \frac{104.2}{3} \\ \overline{x} &\approx 34.73 \end{align} \]

Now for whatever reasons, the weights assigned to each are 310, 112 and 34, respectively. The weighted average is:

\[ \begin{align} \overline{x} &= \frac{\sum_{i=1}^{3}w_{i}x_{i}}{\sum_{i=1}^{3} w_{i}} \\ \overline{x} &= \frac{310(36.8) + 112(26.7) + 34(40.7)}{310 + 112 + 34} \\ \overline{x} &= \frac{11408 + 2990.4 + 1383.8}{456} \\ \overline{x} &= \frac{15782.2}{456} \\ \overline{x} &\approx 34.61 \end{align} \]

First notice that, as expected, the weighted average is different from the simple average (significantly more weight was given to 36.8 than to the others, especially relative to the high number 40.7). But, more importantly for this discussion, also notice that the sum of the weights is 456, not 1. This is not an issue in the weighted average computation, but should weights be assigned such that they sum to 1, an obvious benefit would be that the general expression

\[ \overline{x} = \frac{\sum_{i=1}^{n}w_{i}x_{i}}{\sum_{i=1}^{n} w_{i}} \]

simplifies quite nicely to

\[ \overline{x} = \sum_{i=1}^{n}w_{i}x_{i} \]

How can the originally assigned weights, which do not sum to 1, be manipulated so that the new weights do sum to 1? The process of normalization can be used. Looking more closely at the computation of the weighted average, it can be algebraically manipulated as follows:

\[ \begin{align} \overline{x} &= \frac{310(36.8) + 112(26.7) + 34(40.7)}{310 + 112 + 34} \\ \overline{x} &= \frac{310(36.8) + 112(26.7) + 34(40.7)}{456} \\ \overline{x} &= \frac{310}{456}(36.8) + \frac{112}{456}(26.7) + \frac{34}{456}(40.7) \\ \overline{x} &= 0.6798246(36.8) + 0.245614(26.7) + 0.0745614(40.7) \\ \overline{x} &= 25.01754 + 6.557895 + 3.034649 \\ \overline{x} &\approx 34.61 \end{align} \]

This weighted average is the same as the one above. This process is termed normalization. Each originally assigned weight was normalized relative to the total weight by determining the proportion that each assigned weight represents of the total weight. Mathematically, this can be expressed as:

\[ w_{i}^{'} = \frac{w_{i}}{\sum_{j=1}^{n} w_{j}} \]

where \(w_{i}^{'}\) is the normalized weight for the \(i^{th}\) original assigned weight. For \(i = 2\), for example, the original weight was \(w_{2} = 112\) and the normalized weight is \(w_{2}^{'} = \frac{112}{456}\). These normalized weights will always sum to 1; that is, \(\sum_{i=1}^{n} w_{i}^{'} = 1\). In this example:

\[ \begin{align} \sum_{i=1}^{3} w_{i}^{'} &= \frac{310}{456} + \frac{112}{456} + \frac{34}{456} \\ \sum_{i=1}^{3} w_{i}^{'} &= \frac{310 + 112 + 34}{456} \\ \sum_{i=1}^{3} w_{i}^{'} &= \frac{456}{456} \\ \sum_{i=1}^{3} w_{i}^{'} &= 1 \\ \end{align} \]

Hence, when the assigned weights do not sum to one, the general equation for the weighted average

\[ \overline{x} = \frac{\sum_{i=1}^{n}w_{i}x_{i}}{\sum_{i=1}^{n} w_{i}} \]

can be manipulated to be:

\[ \overline{x} = \frac{\sum_{i=1}^{n}w_{i}^{'} x_{i}}{\sum_{i=1}^{n} w_{i}^{'}} \]

where \(w_{i}^{'}\) are the normalized weights. And since the sum of the normalized weights is 1, this simplifies quite nicely to

\[ \overline{x} = \sum_{i=1}^{n}w_{i}^{'} x_{i} \]

Using this normalization process for this example now produces (using the decimal representation of the normalized weights):

\[ \begin{align} \overline{x} &= \frac{0.6798246(36.8) + 0.245614(26.7) + 0.0745614(40.7)}{0.6798246 + 0.245614 + 0.0745614} \\ \overline{x} &= \frac{25.01754 + 6.557895 + 3.034649}{1} \\ \overline{x} &\approx 34.61 \end{align} \]
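The normalization step can also be sketched in Python. The function name `normalize` is an illustrative choice for this sketch; it simply divides each weight by the total weight, as in the definition of \(w_{i}^{'}\) above.

```python
values = [36.8, 26.7, 40.7]
raw_weights = [310, 112, 34]


def normalize(weights):
    """Divide each weight by the total so the results sum to 1."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return [w / total for w in weights]


normalized = normalize(raw_weights)

# General form: divide by the sum of the raw weights.
general = sum(w * x for w, x in zip(raw_weights, values)) / sum(raw_weights)

# Simplified form: normalized weights already sum to 1.
simplified = sum(w * x for w, x in zip(normalized, values))

print([round(w, 7) for w in normalized])        # [0.6798246, 0.245614, 0.0745614]
print(round(general, 2), round(simplified, 2))  # 34.61 34.61
```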


Weighted Averages Applied To A Target Population

To understand this idea, consider this example. In a given school (high school or college) there are two sections of an introductory statistics class. One section is taught by Professor Smith, the second by Professor Jones. The number of students in Professor Smith’s class is 32 while Professor Jones has only 20 students. For this introductory class a department final exam is issued. The dean of the department, Dean Wacamole, is interested in what the overall average grade was for the final exam.

Here are the final exam scores for Professor Smith’s and Professor Jones’ classes (note: the NAs mean that no data exist, which is understandable because there are fewer students in Professor Jones’ class):

Introductory Statistics Final Exam Scores
Prof Smith Prof Jones
80 85
62 93
83 77
95 52
43 56
55 72
82 78
78 88
72 90
76 80
81 75
80 71
89 67
91 63
90 69
63 44
70 91
75 85
77 80
82 75
70 NA
84 NA
89 NA
90 NA
73 NA
77 NA
86 NA
88 NA
67 NA
81 NA
64 NA
77 NA

One option to satisfy the Dean’s request is for each professor to submit their grades, from which an overall average grade can be computed. In this case that would be

\[ \overline{x} = \frac{\sum_{i=1}^{52} x_{i}}{52} \]

where 52 is the total number of grades (32 from the first class, 20 from the second) and \(x_{i}\) are the individual grades. This would produce the overall average of \(\overline{x} \approx 76.173\).

However, common procedure is that each professor submit only their respective class’ final exam average and the number of students in their class. Professor Smith reports an average of 77.1875 with 32 students while Professor Jones reports an average of 74.55 with 20 students. The Dean proceeds to compute the overall average as

\[ \begin{align} \overline{x} &= \frac{77.1875 + 74.55}{2} \\ \overline{x} &= \frac{151.7375}{2} \\ \overline{x} &= 75.86875 \end{align} \]

Notice that this average, 75.86875, is not the same as the first average computed above (76.173) using all the scores (which given the problem statement is the correct answer).

What’s going on?

The method that the Dean used to get an average final exam score was averaging the averages, each of which was computed relative to the size of its own class. The fact that there were different class sizes was not factored in. There were 32 students in Professor Smith’s class versus 20 in Professor Jones’ class. This means that the contribution to the overall average is greater from Professor Smith’s class than from Professor Jones’. In other words, the class sizes are weights, hence a weighted average needs to be used as follows:

\[ \begin{align} \overline{x} &= \frac{\sum_{i=1}^{2}w_{i}x_{i}}{\sum_{i=1}^{2} w_{i}} \\ \overline{x} &= \frac{32(77.1875) + 20 (74.55)}{32 + 20} \\ \overline{x} &= \frac{2470 + 1491}{52} \\ \overline{x} &= \frac{3961}{52} \\ \overline{x} &= 76.173 \end{align} \]

This agrees with the average found by summing all the exam scores and dividing by 52. Notice that this computation did not use the normalization process, in which each assigned weight is divided by the total weight so that the weights sum to 1. Applying the normalization process yields:

\[ \begin{align} \overline{x} &= \frac{32}{52} (77.1875) + \frac{20}{52} (74.55) \\ \overline{x} &= 0.6153846 (77.1875) + 0.3846154 (74.55) \\ \overline{x} &= 47.5 + 28.67308 \\ \overline{x} &= 76.173 \end{align} \]

As expected, the same result is obtained.
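The three computations in this example (pooling all scores, averaging the two averages, and weighting the averages by class size) can be compared in a short Python sketch. The scores are copied from the table above; the variable names and the helper `mean` are illustrative.

```python
# Final exam scores from the table above.
smith = [80, 62, 83, 95, 43, 55, 82, 78, 72, 76, 81, 80, 89, 91, 90, 63,
         70, 75, 77, 82, 70, 84, 89, 90, 73, 77, 86, 88, 67, 81, 64, 77]
jones = [85, 93, 77, 52, 56, 72, 78, 88, 90, 80,
         75, 71, 67, 63, 69, 44, 91, 85, 80, 75]


def mean(values):
    """Simple average of a non-empty list."""
    return sum(values) / len(values)


smith_avg = mean(smith)  # reported as 77.1875
jones_avg = mean(jones)  # reported as 74.55

# Simple average over all 52 individual scores.
pooled = mean(smith + jones)

# The Dean's approach: average of the two class averages.
naive = (smith_avg + jones_avg) / 2

# Weighted average using the class sizes as weights.
n_smith, n_jones = len(smith), len(jones)
weighted = (n_smith * smith_avg + n_jones * jones_avg) / (n_smith + n_jones)

print(round(pooled, 3))    # 76.173
print(round(naive, 5))     # 75.86875
print(round(weighted, 3))  # 76.173
```

Only the weighted version reproduces the pooled average, because it restores the class sizes that the two reported averages had divided out.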

Refraining from any comments on the Dean’s lack of understanding or the relatively low final exam average, this example shows two things. First, when attempting to compute an overall average from averages of subsets of the total population (here the total population is the students in both classes), it is critical to know the contribution each subset makes relative to the total population, thus dictating the use of weighted averages. Second, the same result can be found by using a simple average if given the entire population data, which is often not the case for a variety of reasons (not the least of which might be the infeasibility of obtaining the entire population data).

Notice that the word population was used. This was quite intentional. It is often the case, especially in statistics, that large population datasets are summarized by averages of different categories within the population and the proportion of the total population each category represents. An example might be that the average age of all males and of all females in the US (collected perhaps from census data) is reported along with the percentage of the total population that each group represents. Should one want to know the average age of all persons, a weighted average can be used as described in the final exam example.

One final note: this population idea, though not necessarily obvious, is consistent with the classic example used above for computing the overall grade for a course. The total population was all graded material, which was broken down into three categories or sub-populations (quizzes, tests and final exam), and an average was computed for each. Knowing the proportion of the total population each represented (quizzes 30%, tests 40% and final exam 30%) allowed for the computation of an overall grade using a weighted average.
