
29/08/2019 descriptive statistics - How to 'sum' a standard deviation? - Cross Validated

How to 'sum' a standard deviation?


Asked 7 years, 4 months ago Active 1 year, 1 month ago Viewed 223k times

I have a monthly average for a value and a standard deviation corresponding to that average. I
am now computing the annual average as the sum of monthly averages. How can I represent the
standard deviation for the summed average?

For example considering output from a wind farm:

Month        MWh    StdDev
January       927      333
February     1234      250
March        1032      301
April         876      204
May           865      165
June          750      263
July          780      280
August        690       98
September     730       76
October       821      240
November      803      178
December      850      250

We can say that in the average year the wind farm produces 10,358 MWh, but what is the
standard deviation corresponding to this figure?

standard-deviation descriptive-statistics

edited Apr 5 '12 at 6:34 asked Apr 4 '12 at 15:22


klonq
505 2 6 9

3 A discussion following a now-deleted reply noted a possible ambiguity in this question: do you seek the
SD of the monthly averages or do you want to recover the SD of all the original values from which those
averages were constructed? That reply also correctly pointed out that if you want the latter, you will need
the numbers of values involved in each one of the monthly averages. – whuber ♦ Apr 4 '12 at 17:37

1 A comment to another deleted reply pointed out that it is strange to compute an average as a sum: surely
you mean that you are averaging the monthly averages. But if what you want is to estimate the average of
all the original data, then such a procedure is not usually a good one: a weighted average is needed. And
of course it's not possible to give a good answer to your question about the "SD for the summed average"
until it is clear what the "summed average" is and what it is intended to represent. Please clarify that for us.
– whuber ♦ Apr 4 '12 at 21:40

@whuber I have added an example to clarify. Mathematically I believe that the sum of averages is equal to
the monthly average times 12. – klonq Apr 5 '12 at 6:37

2 Yes, klonq, that is a very reasonable request. However, these replies were deleted by their owner, not by
the community. To preserve their value, I have attempted here to relay (my take on) the key ideas arising in
those replies and their comments. BTW, your recent edits are quite helpful: people like to see example data.
– whuber ♦ Apr 5 '12 at 14:33
https://stats.stackexchange.com/questions/25848/how-to-sum-a-standard-deviation
1 Welcome to the site, @Hayden. This isn't an answer to the OP's question. Please only use the "Your
Answer" field to provide answers. If you have a follow-up question, click the [ASK QUESTION] at the top
& ask it there, then we can help you properly. Since you are new here, you may want to take our tour,
which contains information for new users. – gung ♦ May 30 '14 at 4:07

4 Answers

Short answer: You average the variances; then you can take square root to get the average
standard deviation.
Example

Month         MWh     StdDev   Variance
==========    =====   ======   ========
January         927      333     110889
February       1234      250      62500
March          1032      301      90601
April           876      204      41616
May             865      165      27225
June            750      263      69169
July            780      280      78400
August          690       98       9604
September       730       76       5776
October         821      240      57600
November        803      178      31684
December        850      250      62500
==========    =====   ======   ========
Total         10358              647564
÷12             863      232      53964

And then the average standard deviation is sqrt(53,964) = 232
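
The table's arithmetic can be reproduced with a short Python sketch. The variable names are illustrative only, and the calculation uses nothing beyond the twelve monthly standard deviations listed above:

```python
import math

# Monthly standard deviations (MWh) from the table above, January to December.
monthly_sd = [333, 250, 301, 204, 165, 263, 280, 98, 76, 240, 178, 250]

# Square each standard deviation to get the monthly variances.
variances = [sd ** 2 for sd in monthly_sd]

total_variance = sum(variances)                      # bottom "Total" row
average_variance = total_variance / len(variances)   # "÷12" row
average_sd = math.sqrt(average_variance)             # root of the average variance

print(total_variance, round(average_variance), round(average_sd, 1))
```

Note that the result is the square root of the mean variance, not the plain arithmetic mean of the twelve standard deviations.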

From Sum of normally distributed random variables:

If $X$ and $Y$ are independent random variables that are normally distributed (and therefore
also jointly so), then their sum is also normally distributed

...the sum of two independent normally distributed random variables is normal, with its mean
being the sum of the two means, and its variance being the sum of the two variances

And from Wolfram Alpha's Normal Sum Distribution:

Amazingly, the distribution of a sum of two normally distributed independent variates $X$ and
$Y$ with means and variances $(\mu_x, \sigma_x^2)$ and $(\mu_y, \sigma_y^2)$, respectively is another normal
distribution

which has mean

$$\mu_{X+Y} = \mu_x + \mu_y$$

and variance

$$\sigma_{X+Y}^2 = \sigma_x^2 + \sigma_y^2$$

For your data:

sum: 10,358 MWh
variance: 647,564
standard deviation: 804.71 ( sqrt(647564) )

So to answer your question:

How to 'sum' a standard deviation?


You sum them quadratically:

s = sqrt(s1^2 + s2^2 + ... + s12^2)

Conceptually you sum the variances, then take the square root to get the standard deviation.
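
A short derivation shows why the variances add. It is a sketch in notation introduced here, not taken from the answer: $X_1, \dots, X_{12}$ stand for the monthly outputs treated as random variables, and $s_i$ for their standard deviations. The step that drops the covariance terms is an assumption that the months are uncorrelated. Normality is not needed for it.

```latex
\begin{aligned}
\operatorname{Var}\!\Big(\sum_{i=1}^{12} X_i\Big)
  &= \sum_{i=1}^{12} \operatorname{Var}(X_i)
   + 2 \sum_{i<j} \operatorname{Cov}(X_i, X_j) \\
  &= \sum_{i=1}^{12} s_i^2
   \qquad \text{when } \operatorname{Cov}(X_i, X_j) = 0 \text{ for all } i \neq j .
\end{aligned}
```

If the months are positively correlated, the covariance terms are positive and the quadrature sum understates the annual standard deviation.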

Because I was curious, I wanted to know the average monthly mean power and its standard
deviation. Working backwards, we need 12 identical normal distributions which:

sum to a mean of 10,358
sum to a variance of 647,564

That would be 12 average monthly distributions of:

mean of 10,358/12 = 863.17
variance of 647,564/12 = 53,963.67
standard deviation of sqrt(53963.67) = 232.3

We can check our monthly average distributions by adding them up 12 times, to see that they
equal the yearly distribution:

Mean: 863.17 × 12 ≈ 10,358 (correct)
Variance: 53,963.67 × 12 ≈ 647,564 (correct)
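
A quick simulation offers an informal cross-check of these figures. The sketch below assumes independent, normally distributed months (the same assumption as in the quotations above), draws twelve values per simulated year from the average monthly distribution, and looks at the spread of the yearly totals. The seed, the number of trials, and all names are arbitrary choices made for illustration:

```python
import random
import statistics

MONTHLY_MEAN = 10358 / 12           # about 863.17 MWh
MONTHLY_SD = (647564 / 12) ** 0.5   # about 232.3 MWh
TRIALS = 20_000                     # arbitrary number of simulated years

rng = random.Random(42)  # fixed seed so the run is repeatable

# Each simulated year is the sum of 12 independent monthly draws.
yearly_totals = [
    sum(rng.gauss(MONTHLY_MEAN, MONTHLY_SD) for _ in range(12))
    for _ in range(TRIALS)
]

sim_mean = statistics.fmean(yearly_totals)
sim_sd = statistics.stdev(yearly_totals)

# Expect values close to 10358 and 804.7, up to sampling noise.
print(round(sim_mean), round(sim_sd))
```

Agreement here only confirms the arithmetic under the independence assumption. It says nothing about whether real months are independent.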

Note: I'll leave it to someone with a knowledge of the esoteric LaTeX math to convert my
formula images, and formula code, into Stack Exchange formatted formulas.

Edit: I moved the short, to-the-point answer up top, because I needed to do this again today and
wanted to double-check that I average the variances.

edited May 12 '17 at 9:06 answered Apr 18 '12 at 2:48


Tim ♦ Ian Boyd
64.1k 11 142 242 949 9 13

3 This all seems to assume the months are uncorrelated - have you made that assumption explicit
anywhere? Also, why do we need to bring in the normal distribution? If we're only talking about variance
then that seems unnecessary - for example, see my answer here – Macro Jul 25 '12 at 12:26

1 @Macro Because I think better in pictures and it makes everything easier to understand. – Ian Boyd Jul
25 '12 at 19:57

2 @Macro Also, I believe this question started on the (now defunct) stats.stackexchange site. A wall of
formulas is less accessible than simpler, graphical, less rigorous treatments. – Ian Boyd Jul 26 '12 at
13:45

2 I doubt this is correct. Imagine two data sets with only a single measurement each. The variance of
each set is 0, but the set of both measurements has a variance greater than 0 if the data points differ. –
Njol Oct 10 '16 at 15:58

1 @Njol, I think that's why we assume all variables have a normal distribution. And we can do it here,
because we are talking about physical measurements. In your example both variables are not normally distributed.
– tworec Apr 26 '17 at 12:56

This is an old question but the accepted answer is not actually correct or complete. The user
wants to calculate the standard deviation over 12 months of data where the mean and standard
deviation are already calculated for each month. Assuming that the number of samples in each
month is the same, it is possible to calculate the sample mean and variance over the year
from each month's data. For simplicity assume that we have two sets of data:

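$$X = \{x_1, \dots, x_N\}, \qquad Y = \{y_1, \dots, y_N\}$$

with known values of sample mean and sample variance, $\mu_x$, $\mu_y$, $\sigma_x^2$, $\sigma_y^2$.

Now we want to calculate the same estimates for

$$Z = \{x_1, \dots, x_N, y_1, \dots, y_N\}$$

Consider that $\mu_x$, $\sigma_x^2$ are calculated as:

$$\mu_x = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \sigma_x^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu_x^2$$

To estimate mean and variance over the total set we need to calculate:

$$\mu_z = \frac{\sum_{i=1}^{N} x_i + \sum_{i=1}^{N} y_i}{2N} = \frac{\mu_x + \mu_y}{2}$$

which is given in the accepted answer. For variance however the story is different:

$$\begin{aligned}
\sigma_z^2 &= \frac{\sum_{i=1}^{N} x_i^2 + \sum_{i=1}^{N} y_i^2}{2N} - \mu_z^2 \\
           &= \frac{1}{2}\left(\sigma_x^2 + \mu_x^2 + \sigma_y^2 + \mu_y^2\right) - \left(\frac{\mu_x + \mu_y}{2}\right)^2 \\
           &= \frac{1}{2}\left(\sigma_x^2 + \sigma_y^2\right) + \left(\frac{\mu_x - \mu_y}{2}\right)^2
\end{aligned}$$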

So if you have the variance over each subset and you want the variance over the whole set, then
you can average the variances of the subsets only if they all have the same mean. Otherwise you
need to add the variance of the subset means.
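
As a sketch of this rule for any number of equal-sized subsets, the following Python snippet combines per-subset means and variances and compares the result with a direct calculation on the pooled data. The data values and the function name `combine_equal_subsets` are made up for illustration, and population (divide-by-N) variances are assumed throughout:

```python
import statistics

def combine_equal_subsets(means, variances):
    """Mean and population variance of the union of equal-sized subsets."""
    overall_mean = statistics.fmean(means)
    # Average within-subset variance plus the variance of the subset means.
    overall_var = statistics.fmean(variances) + statistics.pvariance(means)
    return overall_mean, overall_var

# Hypothetical data: two subsets of equal size.
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 12.0, 14.0, 16.0]

mean_z, var_z = combine_equal_subsets(
    [statistics.fmean(x), statistics.fmean(y)],
    [statistics.pvariance(x), statistics.pvariance(y)],
)

# Direct calculation on the pooled data, for comparison.
direct_var = statistics.pvariance(x + y)
print(mean_z, var_z, direct_var)
```

With subsets of unequal size, both the average of the variances and the variance of the means would need to be weighted by subset size, which this sketch does not do.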

Let's say that over the first half of the year we produce exactly 1000 MWh per day and in the
second half we produce 2000 MWh per day. Then the mean energy production is 1000 in the first
half and 2000 in the second, and the variance is 0 for both halves. Now
there are two different things that we may be interested in:

1-We want to calculate the variance of energy production over the whole year: then by
averaging the two variances we arrive at zero, which is not correct since the energy per day over
the whole year is not constant. In this case we need to add the variance of the means of
the subsets. Mathematically, in this case the random variable of interest is energy production per
day. We have sample statistics over subsets and we want to calculate the sample statistics over
a longer time.

2-We want to calculate the variance of energy production per year: in other words, we are
interested in how much energy production changes from one year to another. In this case
averaging the variances leads to the correct answer, which is 0, since in each year we are
producing exactly 1500 MWh per day on average. Mathematically, in this case the random variable of
interest is the average of energy production per day, where the averaging is done over the whole
year.
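
The two cases can be illustrated numerically. The sketch below assumes a simplified 364-day year split into two 182-day halves, and three identical years. Those numbers are chosen only to make the halves equal and do not come from the question:

```python
import statistics

HALF = 182   # assumed days per half-year (simplification)
YEARS = 3    # assumed number of identical years

one_year = [1000] * HALF + [2000] * HALF   # daily production, MWh

# Case 1: variance of daily production over the whole year.
daily_var = statistics.pvariance(one_year)
mean_of_half_variances = statistics.fmean(
    [statistics.pvariance(one_year[:HALF]), statistics.pvariance(one_year[HALF:])]
)

# Case 2: variance of the yearly average across years.
yearly_means = [statistics.fmean(one_year) for _ in range(YEARS)]
between_year_var = statistics.pvariance(yearly_means)

print(daily_var, mean_of_half_variances, between_year_var)
```

Averaging the half-year variances gives 0, while the direct daily variance is 250,000 (a standard deviation of 500 MWh per day), which is exactly the squared half-difference of the two means. The between-year variance is 0 because every simulated year is identical.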

edited Apr 26 '17 at 16:37 answered Apr 26 '17 at 15:48


Hooman
535 4 10

I believe what you may be really interested in, though, is the standard error rather than the
standard deviation.

The standard error of the mean (SEM) is the standard deviation of the sample mean's estimate
of a population mean, and that will give you a measure of how good your yearly MWh estimate
is.

It's very easy to compute: if you used samples to obtain your monthly MWh averages and
standard deviations, you would just compute the standard deviation as @IanBoyd suggested and
normalize it by the square root of the total size of your sample. That is,

$$\mathrm{SEM} = \frac{s}{\sqrt{n}}$$

where $s$ is that standard deviation and $n$ is the total sample size.
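
In code, the usual standard-error formula divides the standard deviation by the square root of the sample size. The sample size used below is hypothetical, since the question does not say how many observations lie behind each monthly figure, and the function name is likewise just an illustration:

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)

# Hypothetical: a standard deviation of 232.3 MWh estimated from 30 observations.
print(round(standard_error(232.3, 30), 1))
```

The standard error shrinks as the sample grows, whereas the standard deviation of the underlying values does not.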

edited Apr 11 '15 at 17:45 answered Apr 11 '15 at 17:33


Matteo
111 4

I'd like to stress again that part of the accepted answer is incorrect. The wording of the
question leads to confusion.

The question gives an average and standard deviation for each month, but it's unclear what kind of subset is
used. Is it the average of one wind turbine across the whole farm, or the daily average of the whole farm?
If it's the daily average for each month, you can't add up the monthly averages to get the annual
average, because they do not have the same denominator. If it's the per-unit average, the question
should state

We can say that in the average year each turbine in the wind farm produces 10,358 MWh,...

Instead of

We can say that in the average year the wind farm produces 10,358 MWh,...

Furthermore, the standard deviation or variance is measured against the set's own
average. It does NOT contain any information regarding the average of the whole set.

The image is not necessarily exact, but it conveys the general idea. Let's imagine the output
of one wind farm as in the image. As you can see, the "local" variance has nothing to do with the
"global" variance, no matter how you add or multiply those. You cannot predict the variance of
the year using the variances of the two half-years. So, in the accepted answer, while the sum calculation is
correct, the division by 12 to get the monthly number means nothing. Of the three sections,
the first and the last are wrong; the second is right.

Again, it's a very wrong application; please do not follow it or it will get you into trouble. Just
calculate the standard deviation for the whole data set, using the total yearly or monthly output of each unit as data points,
depending on whether you want a yearly or a monthly number; that should be the correct answer. You
probably want something like this. These are my randomly generated numbers. If you have the data,
the result in cell O2 should be your answer.
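
A minimal sketch of that direct approach follows, using randomly generated numbers in the same spirit as the answer. The number of units, the value range, and the seed are all invented for illustration and have no connection to the spreadsheet the answer refers to:

```python
import random
import statistics

rng = random.Random(0)   # fixed seed, arbitrary
UNITS, MONTHS = 5, 12    # hypothetical farm of 5 units

# output[u][m]: made-up monthly output (MWh) of unit u in month m.
output = [[rng.uniform(600, 1300) for _ in range(MONTHS)] for _ in range(UNITS)]

# Yearly figure: one data point per unit, its total over the 12 months.
yearly_totals = [sum(unit) for unit in output]
yearly_sd = statistics.stdev(yearly_totals)

# Monthly figure: every unit-month is a data point.
monthly_values = [value for unit in output for value in unit]
monthly_sd = statistics.stdev(monthly_values)

print(round(statistics.fmean(yearly_totals)), round(yearly_sd), round(monthly_sd))
```

Working from the raw data points avoids having to decide how subset variances should be combined.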


answered Jun 13 '17 at 9:08


Tam Le
3 2
