Focal Point
[SOLVED] Benford's Law and FOCUS

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/7971057331/m/8301019331

October 25, 2005, 02:40 PM
saltrogge
[SOLVED] Benford's Law and FOCUS
Has anyone implemented anything in FOCUS or webFocus using Benford's Law? Benford's law states that certain digits are more likely to occur in certain positions within a random number. For example, the number 1 has the highest probability of showing up in the first position of a number, and the number 9 has the least probability of showing up in the first position.

This law is often used in financial analysis to detect fraudulent/fictitious transactions. If anyone has done anything like this could you let me know. Thanks

Stephen Altrogge

This message has been edited. Last edited by: <Kathryn Henning>,
October 31, 2005, 03:34 PM
Kerry
Hi Stephen,

Can you please be more specific on how you would like to use this in your application? An example will be great for understaning.

Cheers,

Kerry
October 31, 2005, 08:20 PM
Tony A
Stephen,

Do you mean in working out the probability that a series of numbers fall within Benford's law? Where, for the first significant digit of a given number 'n', the probability 'P', equates to Log10(1+1/n)

Shouldn't be too hard to manage given the formula.
November 01, 2005, 01:44 PM
saltrogge
Tony A,

Yes that is exactly what I mean. I'm working with large volumes of transactions in flat files and am trying to determine the best way to implement the formula you mentioned. Any ideas?

Stephen
November 15, 2005, 08:40 AM
Tony A
Hi Stephen,

Sorry it's taken a while to get back to you but try this for an example of your application. It uses the GGSALES file and shows that the figures are likely to be suspicious? Smiler

-SET &ECHO='ALL';
DEFINE FILE GGSALES
DOLLARS_D/D12 = DOLLARS;
DOLLARS_A/A1 = LJUST(12,FTOA(DOLLARS_D,'(D12)','A12'),'A1');
DOLLAR_CNT/I9 = 1;
END
TABLE FILE GGSALES
SUM DOLLAR_CNT
BY DOLLARS_A
ON TABLE SAVE AS X FORMAT ALPHA
END
-RUN
-SET &Sample = &RECORDS;

TABLE FILE GGSALES
SUM DOLLAR_CNT
COMPUTE BENFORDS/D12.5 = DECODE DOLLARS_A ('1' 0.3010 '2' 0.1761 '3' 0.1249
'4' 0.0969 '5' 0.0792 '6' 0.0669 '7' 0.0580 '8' 0.0512 '9' 0.0458 ELSE 0);
COMPUTE RATIO/D12 = BENFORDS * &Sample;
BY DOLLARS_A
ON TABLE SUBTOTAL
END



In FOCUS
since 1986
WebFOCUS Server 8.2.01M, thru 8.2.07 on Windows Svr 2008 R2  
WebFOCUS App Studio 8.2.06 standalone on Windows 10 
November 15, 2005, 08:48 AM
Tony A
Sorry, forgot to give an explanation! Frowner

Firstly, in the defines, I convert the number in which I'm interested into Decimial notation as it is one of the formats required for FTOA.

Next I convert the number using FTOA (as opposed to EDIT which will show leading zeroes) and left justify the result. By setting the output format to A1 I effectively strip off the first significant digit.

The final define is just to show that I'm counting the rows and can be achieved many different ways.

The first table request is purely to dump the number of rows into a variable that I can use in the final calculations. I called it &Sample to depict that it represents the sample size.

In the final table request I cheated and used a decode to get the Benford's Law value for each digit as the only function close to a LOG is only for natural logs and not Log base 10 (which is what I needed).

The RATIO field shows what portion of the sample size is forecast by Benford's Law.

Hopes this helps in your quest!

T



In FOCUS
since 1986
WebFOCUS Server 8.2.01M, thru 8.2.07 on Windows Svr 2008 R2  
WebFOCUS App Studio 8.2.06 standalone on Windows 10 
November 15, 2005, 08:52 AM
Tony A
And the result -
PAGE 1
DOLLARS_A DOLLAR_CNT BENFORDS RATIO
  18 .00000 0
1 2017 .30100 1,299
2 582 .17610 760
3 230 .12490 539
4 245 .09690 418
5 224 .07920 342
6 226 .06690 289
7 269 .05800 250
8 249 .05120 221
9 257 .04580 198
TOTAL 4317 1.00000 4,317




In FOCUS
since 1986
WebFOCUS Server 8.2.01M, thru 8.2.07 on Windows Svr 2008 R2  
WebFOCUS App Studio 8.2.06 standalone on Windows 10 
November 21, 2005, 02:41 PM
saltrogge
Tony:

Thanks for your help. I think I may be able to put some of this into use!

Stephen A
November 21, 2005, 02:56 PM
Tony A
Stephen,

You are most welcome and I must admit I enjoyed (OK I'm a sad case) coming up with the solution but then I enjoy those problems that tax the mind the most!!

Try the table in a graph instead, it gives a better "perspective" of the results!!

T



In FOCUS
since 1986
WebFOCUS Server 8.2.01M, thru 8.2.07 on Windows Svr 2008 R2  
WebFOCUS App Studio 8.2.06 standalone on Windows 10 
January 18, 2014, 05:48 PM
David Briars
Ah, another great vintage Focal Point thread!

Here is our model, similar to the one in this post, with a few modifications.
DEFINE FILE GGSALES
 DOLLARS_D/D12 = DOLLARS;
 DOLLARS_A/A1  = LJUST(15,FTOA(DOLLARS_D,'(D12)','A15'),'A1');
 DOLLARS_I/I1  = EDIT(DOLLARS_A);
 ONE/I1        = 1;
END
-*
GRAPH FILE GGSALES
-* y axis values.
SUM     CNT.DOLLARS_A/I5C                                             NOPRINT
        TOT.ONE/I5C                                                   NOPRINT
-* Series 0.
COMPUTE ACTUAL_RATIO/D7.5 = CNT.DOLLARS_A / TOT.ONE;                  AS 'Actual'
        AVE.DOLLARS_I                                                 NOPRINT
-* Series 1.
COMPUTE EXPECTED_RATIO/D7.5 = LOG( 1 + 1 / AVE.DOLLARS_I ) / 2.30259; AS 'Benford Expected'
-* x axis values.
BY      DOLLARS_A                                                     AS 'First Position of Dollar'
-*
ON GRAPH SET LOOKGRAPH VBAR
ON GRAPH SET AUTOFIT ON
ON GRAPH PCHOLD FORMAT JSCHART
-*
ON GRAPH SET GRAPHSTYLE *
setY1ScaleMax(.5);
setTitleString("Reported/Recorded Amounts");
setSubtitleString("Actual v. Benford Expected Digital Frequency");
setY1TitleString("Frequency");
setTextRotation(getO1Label(),0);
setSeriesType(0,1);
setSeriesType(1,2);
setMarkerSizeDefault(160);
setMarkerShape(getSeries(1),10);
setSeriesLineWidthDefault(20);
ENDSTYLE
ON GRAPH SET STYLE *
 INCLUDE = ENInformationBuilders_Light1, $
 TYPE=REPORT, TITLETEXT='Actual v. Benford Expected Digital Frequency', $
ENDSTYLE
END
-EXIT 


Indeed, per chart, the sales figures in GGSALES were probably human made. This probably was actually the case, unless there really was a Gotham Grinds coffee shop chain. :-)

This message has been edited. Last edited by: David Briars,