
I found that when I use Excel to calculate the standard deviation, the result is not the same as the one calculated in DataGraph, but it is the same as the “s” that DataGraph calls the “Unbiased estimate of the standard deviation”. I don’t understand statistics, but after checking some websites and my previous data, I think DataGraph may be wrong. I hope you can give me an answer. Thank you.
First let me explain what we do in DataGraph …
In DataGraph, s is the sample standard deviation and σ is the population standard deviation. The population standard deviation is often simply called the standard deviation, and s is the estimate of σ computed from a sample (DataGraph labels it the “Unbiased estimate of the standard deviation”). Use s when your data are a sample drawn from a larger population, and σ when your data cover the entire population. Which one you use depends on the data and the analysis.
I found two basic examples here that demonstrate the difference:
https://en.wikipedia.org/wiki/Standard_deviation
The first example is a sample of a larger population, so ‘s’ is used.
The second covers the entire population, so ‘σ’ is used.
Note that the values in these graphs were calculated in DataGraph and match the values shown on the wiki page. We have also verified our calculations with other stats software (e.g., R).
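The same kind of cross-check can be done in other tools. Below is a minimal sketch in Python using the standard library `statistics` module. The eight data values are an assumption chosen for illustration (a small data set with mean 5), not numbers taken from the DataGraph example file.

```python
import statistics

# Illustrative data set (an assumption for this sketch): eight values with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population standard deviation (sigma): treats the data as the entire population.
sigma = statistics.pstdev(data)

# Sample standard deviation (s): treats the data as a sample of a larger population.
s = statistics.stdev(data)

print(f"sigma = {sigma:.4f}")
print(f"s     = {s:.4f}")
```

For this data, σ is 2 and s is about 2.138, so picking the wrong one of the two gives a visibly different number on a small data set.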
Based on that, the equations in DataGraph are correctly implemented. If you’re interested, here are the equations:
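For reference, the standard textbook definitions of the two quantities can be written as follows. The notation is an assumption for this sketch rather than a copy of DataGraph's own figure: x_i are the n data values and x̄ is their mean (for the population case the mean is often written μ).

```latex
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```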
Also, we just uploaded an example file with these graphs/equations to the online examples, which you can get here …
In Excel, they used to have the function STDEV, which calculated s.
https://support.office.com/enus/article/STDEVfunction51FECAAA231E4BBB923033650A72C9B0
Excel now has separate functions, STDEV.S and STDEV.P.
https://support.office.com/enus/article/stdevsfunction7d69cf970c1f4acfbe27f3e83904cc23
I assume this change was made in Excel to avoid confusion about which one was being calculated. If you were using STDEV, then you were calculating s. This is consistent with your comparison between DataGraph and Excel.
Note that the equations for s and σ are similar; the only difference is the denominator, which is n − 1 for s and n for σ. As a result, s is always greater than σ for the same set of data (unless all values are identical, in which case both are zero). As n increases, s approaches the value of σ. Thus, for large sample sizes (n > 30), s and σ are approximately the same.
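As a rough illustration of how the gap between the two closes as n grows, here is a small Python sketch. The function names are hypothetical and written only for this example; they are not DataGraph or Excel functions. Because both formulas share the same sum of squared deviations, their ratio depends only on n.

```python
import math

def population_sd(values):
    """sigma: square root of the mean squared deviation (denominator n)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def sample_sd(values):
    """s: same sum of squares, but with denominator n - 1 (needs n >= 2)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample standard deviation needs at least two values")
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

def ratio_s_to_sigma(n):
    """For any data set of size n with nonzero spread, s / sigma = sqrt(n / (n - 1))."""
    return math.sqrt(n / (n - 1))

# Print how much larger s is than sigma for a few sample sizes.
for n in (2, 5, 10, 30, 100):
    print(f"n = {n:3d}: s / sigma = {ratio_s_to_sigma(n):.4f}")
```

At n = 30 the ratio is about 1.017, so s exceeds σ by roughly 1.7%, which is why the two are usually treated as interchangeable for large samples.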
Sounds like you may have been choosing σ instead of s, but hopefully this helps explain each one and why you would pick one versus another.
Thank you so much. Before, I didn’t understand the definitions of “population standard deviation” and “sample standard deviation”. In my work, I always use STDEV in Excel. When I read the DataGraph manual (“Global variable”, page 21), I found a table that lists sigma as “standard deviation” and s as “Unbiased estimate of the standard deviation”, so I thought sigma was the STDEV I had been using. Now I understand. I think it would be better to change the definition from “Standard deviation” to “population standard deviation”. Of course, maybe that is just because I am a layman.
You’re welcome. Glad that was helpful.
We will add those descriptions in the documentation.
