Has anyone implemented anything in FOCUS or webFocus using Benford's Law? Benford's law states that certain digits are more likely to occur in certain positions within a random number. For example, the number 1 has the highest probability of showing up in the first position of a number, and the number 9 has the least probability of showing up in the first position.
This law is often used in financial analysis to detect fraudulent/fictitious transactions. If anyone has done anything like this could you let me know. Thanks
Stephen AltroggeThis message has been edited. Last edited by: <Kathryn Henning>,
October 31, 2005, 03:34 PM
Kerry
Hi Stephen,
Can you please be more specific on how you would like to use this in your application? An example will be great for understaning.
Cheers,
Kerry
October 31, 2005, 08:20 PM
Tony A
Stephen,
Do you mean in working out the probability that a series of numbers fall within Benford's law? Where, for the first significant digit of a given number 'n', the probability 'P', equates to Log10(1+1/n)
Shouldn't be too hard to manage given the formula.
November 01, 2005, 01:44 PM
saltrogge
Tony A,
Yes that is exactly what I mean. I'm working with large volumes of transactions in flat files and am trying to determine the best way to implement the formula you mentioned. Any ideas?
Stephen
November 15, 2005, 08:40 AM
Tony A
Hi Stephen,
Sorry it's taken a while to get back to you but try this for an example of your application. It uses the GGSALES file and shows that the figures are likely to be suspicious?
-SET &ECHO='ALL'; DEFINE FILE GGSALES DOLLARS_D/D12 = DOLLARS; DOLLARS_A/A1 = LJUST(12,FTOA(DOLLARS_D,'(D12)','A12'),'A1'); DOLLAR_CNT/I9 = 1; END TABLE FILE GGSALES SUM DOLLAR_CNT BY DOLLARS_A ON TABLE SAVE AS X FORMAT ALPHA END -RUN -SET &Sample = &RECORDS;
WebFOCUS Server 8.2.01M, thru 8.2.07 on Windows Svr 2008 R2
WebFOCUS App Studio 8.2.06 standalone on Windows 10
November 15, 2005, 08:48 AM
Tony A
Sorry, forgot to give an explanation!
Firstly, in the defines, I convert the number in which I'm interested into Decimial notation as it is one of the formats required for FTOA.
Next I convert the number using FTOA (as opposed to EDIT which will show leading zeroes) and left justify the result. By setting the output format to A1 I effectively strip off the first significant digit.
The final define is just to show that I'm counting the rows and can be achieved many different ways.
The first table request is purely to dump the number of rows into a variable that I can use in the final calculations. I called it &Sample to depict that it represents the sample size.
In the final table request I cheated and used a decode to get the Benford's Law value for each digit as the only function close to a LOG is only for natural logs and not Log base 10 (which is what I needed).
The RATIO field shows what portion of the sample size is forecast by Benford's Law.
Hopes this helps in your quest!
T
In FOCUS since 1986
WebFOCUS Server 8.2.01M, thru 8.2.07 on Windows Svr 2008 R2
WebFOCUS App Studio 8.2.06 standalone on Windows 10
November 15, 2005, 08:52 AM
Tony A
And the result -
PAGE 1
DOLLARS_A
DOLLAR_CNT
BENFORDS
RATIO
18
.00000
0
1
2017
.30100
1,299
2
582
.17610
760
3
230
.12490
539
4
245
.09690
418
5
224
.07920
342
6
226
.06690
289
7
269
.05800
250
8
249
.05120
221
9
257
.04580
198
TOTAL
4317
1.00000
4,317
In FOCUS since 1986
WebFOCUS Server 8.2.01M, thru 8.2.07 on Windows Svr 2008 R2
WebFOCUS App Studio 8.2.06 standalone on Windows 10
November 21, 2005, 02:41 PM
saltrogge
Tony:
Thanks for your help. I think I may be able to put some of this into use!
Stephen A
November 21, 2005, 02:56 PM
Tony A
Stephen,
You are most welcome and I must admit I enjoyed (OK I'm a sad case) coming up with the solution but then I enjoy those problems that tax the mind the most!!
Try the table in a graph instead, it gives a better "perspective" of the results!!
T
In FOCUS since 1986
WebFOCUS Server 8.2.01M, thru 8.2.07 on Windows Svr 2008 R2
WebFOCUS App Studio 8.2.06 standalone on Windows 10
January 18, 2014, 05:48 PM
David Briars
Ah, another great vintage Focal Point thread!
Here is our model, similar to the one in this post, with a few modifications.
DEFINE FILE GGSALES
DOLLARS_D/D12 = DOLLARS;
DOLLARS_A/A1 = LJUST(15,FTOA(DOLLARS_D,'(D12)','A15'),'A1');
DOLLARS_I/I1 = EDIT(DOLLARS_A);
ONE/I1 = 1;
END
-*
GRAPH FILE GGSALES
-* y axis values.
SUM CNT.DOLLARS_A/I5C NOPRINT
TOT.ONE/I5C NOPRINT
-* Series 0.
COMPUTE ACTUAL_RATIO/D7.5 = CNT.DOLLARS_A / TOT.ONE; AS 'Actual'
AVE.DOLLARS_I NOPRINT
-* Series 1.
COMPUTE EXPECTED_RATIO/D7.5 = LOG( 1 + 1 / AVE.DOLLARS_I ) / 2.30259; AS 'Benford Expected'
-* x axis values.
BY DOLLARS_A AS 'First Position of Dollar'
-*
ON GRAPH SET LOOKGRAPH VBAR
ON GRAPH SET AUTOFIT ON
ON GRAPH PCHOLD FORMAT JSCHART
-*
ON GRAPH SET GRAPHSTYLE *
setY1ScaleMax(.5);
setTitleString("Reported/Recorded Amounts");
setSubtitleString("Actual v. Benford Expected Digital Frequency");
setY1TitleString("Frequency");
setTextRotation(getO1Label(),0);
setSeriesType(0,1);
setSeriesType(1,2);
setMarkerSizeDefault(160);
setMarkerShape(getSeries(1),10);
setSeriesLineWidthDefault(20);
ENDSTYLE
ON GRAPH SET STYLE *
INCLUDE = ENInformationBuilders_Light1, $
TYPE=REPORT, TITLETEXT='Actual v. Benford Expected Digital Frequency', $
ENDSTYLE
END
-EXIT
Indeed, per chart, the sales figures in GGSALES were probably human made. This probably was actually the case, unless there really was a Gotham Grinds coffee shop chain. :-)This message has been edited. Last edited by: David Briars,