Focal Point
[SOLVED] Standard Deviation Revisited

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/7971057331/m/5097046666

January 12, 2014, 06:59 AM
Danny-SRL
I had to do some STDEV calculations for a customer.
See what happens when the field is an integer:
  
TABLE FILE CAR
SUM ASQ.SALES AVE.SALES CNT.SALES
COMPUTE STD=((ASQ.SALES-AVE.SALES *AVE.SALES) **.5) /CNT.SALES ;
END

And when the field is decimal:
  
DEFINE FILE CAR
SALES/D7=SALES;
END
TABLE FILE CAR
SUM ASQ.SALES AVE.SALES CNT.SALES
COMPUTE STD=((ASQ.SALES-AVE.SALES *AVE.SALES) **.5) /CNT.SALES ;
END



Daniel
In Focus since 1982
wf 8.202M/Win10/IIS/SSA - WrapApp Front End for WF

January 12, 2014, 03:45 PM
j.gross
I know the AVE. operator on an integer field yields an integer field. Whether or not the same is true of ASQ., that's enough to ruin your STD's day.
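
For anyone who wants to see the size of the effect outside FOCUS, here is a minimal Python sketch. It assumes the integer AVE. result simply drops the fractional part (an assumption for illustration, not something verified in this thread), and the values are made up rather than taken from the CAR file.

```python
# Made-up integer data; not the SALES values from the CAR file.
values = [3, 4, 4, 5, 8]
cnt = len(values)

asq = sum(v * v for v in values) / cnt   # mean of squares: 130 / 5
ave = sum(values) / cnt                  # true mean: 24 / 5 = 4.8

# Variance term with the mean kept as a decimal.
exact_var = asq - ave * ave

# Same term when the mean is held in an integer field.
# ASSUMPTION: the fractional part is dropped (4.8 becomes 4).
ave_as_integer = int(ave)
truncated_var = asq - ave_as_integer * ave_as_integer

print(f"decimal mean: {ave}, variance term: {exact_var:.2f}")
print(f"integer mean: {ave_as_integer}, variance term: {truncated_var:.2f}")
```

Losing 0.8 from the mean more than triples the variance term here, because the error is squared before it is subtracted.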


- Jack Gross
WF through 8.1.05
January 13, 2014, 01:21 AM
Danny-SRL
Jack,
Right as usual!
Coming back to the calculation of the STD: after looking through the documentation and consulting with my colleagues, it seems that the formula I wrote down is erroneous.
The following matches what Excel provides with its STDEV function.
Using LET (notice the # for a new line; apparently LET is limited in length!):
  
LET
STD = COMPUTE STD_<1>= # (((ASQ.<1> - AVE.<1> * AVE.<1>) * (CNT.<1>)/(CNT.<1> - 1))) ** 0.5;;
END

Using DEFINE FUNCTION:
  
DEFINE FUNCTION STDV/D15.2 (VAR1/D12.2,AQ/D12.2,AV/D12.2,CQ/I9)
SVAR/D12.2 =(AQ-(AV * AV))
* (CQ /(CQ -1)) ;
STDV/D15.2=SQRT(SVAR);
END
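
As a cross-check outside FOCUS, the same arithmetic can be sketched in Python. This is only an illustration: the function name stdev_from_aggregates and the sample values are hypothetical, and statistics.stdev (which divides by N-1, as Excel's STDEV does) is used as the reference.

```python
import math
import statistics

def stdev_from_aggregates(asq, ave, cnt):
    """Sample standard deviation from the three aggregates used above:
    asq = mean of squares (ASQ.), ave = mean (AVE.), cnt = count (CNT.)."""
    if cnt < 2:
        raise ValueError("need at least two observations")
    population_variance = asq - ave * ave
    # Rescale from the 1/N form to the 1/(N-1) form.
    return math.sqrt(population_variance * cnt / (cnt - 1))

# Made-up data for illustration only.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
cnt = len(values)
ave = sum(values) / cnt
asq = sum(v * v for v in values) / cnt

result = stdev_from_aggregates(asq, ave, cnt)
reference = statistics.stdev(values)
print(result, reference)
```

The two printed numbers should agree to within floating-point rounding.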



Daniel
In Focus since 1982
wf 8.202M/Win10/IIS/SSA - WrapApp Front End for WF

January 16, 2014, 10:00 AM
j.gross
Dan --

Perhaps both formulae are "correct". I didn't analyze the entire expressions, but one obvious difference is (N) vs (N-1) in the denominator.


As I recall, there are two definitions [differing only by the appearance of 1/N or 1/(N-1) in the formula] for the variance of a set of real values.

The one with 1/N is the definition of population variance, for a finite population of equi-probable outcomes.

The one with 1/(N-1) is the statistic, computed from N observations from a normal distribution, that serves as an unbiased statistical estimate of the variance of the underlying distribution.

The square root of the first is then the definition of standard deviation in that context; while sqrt of the second is a (slightly biased) estimate of the population std dev.
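
A small Python sketch of the two definitions, using made-up values and the standard library's statistics.pstdev (1/N) and statistics.stdev (1/(N-1)) as references. It also evaluates the expression from the first post, which divides the square root by CNT, so the three results can be compared side by side.

```python
import math
import statistics

# Made-up data for illustration only.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(values)
ave = sum(values) / n
asq = sum(v * v for v in values) / n

# 1/N form: population standard deviation.
population_sd = math.sqrt(asq - ave * ave)

# 1/(N-1) form: square root of the unbiased variance estimate.
sample_sd = math.sqrt((asq - ave * ave) * n / (n - 1))

# Expression from the first post: the square root divided by the count.
first_post_value = math.sqrt(asq - ave * ave) / n

print("population:", population_sd, statistics.pstdev(values))
print("sample:    ", sample_sd, statistics.stdev(values))
print("first post:", first_post_value)
```

On this data the first-post expression comes out as the population figure divided by N, so it matches neither of the two definitions.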

- Jack