Focal Point
[Solved] Report Duplicate Records

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/7971057331/m/2067066426

October 01, 2012, 02:00 PM
Greg
[Solved] Report Duplicate Records
The how-to article to find duplicate records is 404.

https://techsupport.informatio...om/sps/81071020.html

I need a report that locates duplicate records in which four fields are the same. If a record contains the same values in f3,f4,f5 and f6 Then print the ENTIRE Record (24 fields). With something incase no records are found (the desired response):

IF COUNT EQ 0 Print "No Duplicates Found"


TABLE FILE ELAPSED_TIME_CSV
BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F3
BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F4
BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F5
BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F6  


I grab the four fields from the source and sort.

Not sure if I should COUNT, HOLD then do a MATCH for WHERE TOTAL GT 1?

F3 is an employee id and each emplyee ID shows up 14 times in the flat file we are reporting against (once for each day of the 2 week pay period). But if these 4 fields match, the record is a dup and can not be imported and needs to be corrected. The payroll system will not takes duplicate records and just rejects them both. Leaving the employee with no hours worked for that day.

Am I in the right direction here?

This message has been edited. Last edited by: Greg,


prod: WF 7.7.03 platform IIS on Windows 2007, databases: Oracle, , MSSQL

October 01, 2012, 02:28 PM
rfbowley
Since it appears all 4 fields are in the same record, why not:

 
WHERE ((F3 NE F4) AND (F4 NE F5) AND (F5 NE F6))
 


Granted, there will not be a 'No Duplicates Found' response without a little more coding, but, no record that has the same value in all 4 fields will get through.


Robert F. Bowley Jr.
Owner
TaRa Solutions, LLC

In WebFOCUS since 2001
October 01, 2012, 02:49 PM
Danny-SRL
Greg,
You say:
quote:
I need a report that locates duplicate records in which four fields are the same. If a record contains the same values in f3,f4,f5 and f6 Then print the ENTIRE Record (24 fields).


Do you mean that f3=f4=f5=f6?
Or do you mean that you have 2 or more records with the same values of f3, f4, f5, f6?


Daniel
In Focus since 1982
wf 8.202M/Win10/IIS/SSA - WrapApp Front End for WF

October 01, 2012, 02:57 PM
Dan Satchell
A multi-verb request like this might work, assuming duplicate values occur across records and not within records.

TABLE FILE ELAPSED_TIME_CSV
 COUNT ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F3 NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F3    NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F4    NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F5    NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F6    NOPRINT
-*
 PRINT *
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F3    NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F4    NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F5    NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F6    NOPRINT
 WHERE TOTAL COUNT GT 1 ;
END
-RUN
-IF (&LINES GT 0) GOTO EXIT ;
-TYPE No Duplicates Found.
-EXIT



WebFOCUS 7.7.05
October 01, 2012, 03:06 PM
Greg
Do you mean that f3=f4=f5=f6?
Or do you mean that you have 2 or more records with the same values of f3, f4, f5, f6?[/QUOTE]

Records with the same values in all four fields.

not f3=f4=f5=f6

ex:

F3 F4 F5 F6
115262 0 2012-09-28 1
115262 0 2012-09-28 1

This would be a dup and end up on the report.


prod: WF 7.7.03 platform IIS on Windows 2007, databases: Oracle, , MSSQL

October 01, 2012, 03:10 PM
Greg
quote:
Originally posted by Dan Satchell:
A multi-verb request like this might work, assuming duplicate values occur across records and not within records.

TABLE FILE ELAPSED_TIME_CSV
 COUNT ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F3 NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F3    NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F4    NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F5    NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F6    NOPRINT
-*
 PRINT *
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F3    NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F4    NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F5    NOPRINT
 BY ELAPSED_TIME_CSV.ELAPSED_TIME_CSV.F6    NOPRINT
 WHERE TOTAL COUNT GT 1 ;
END
-RUN
-IF (&LINES GT 0) GOTO EXIT ;
-TYPE No Duplicates Found.
-EXIT


This code returns a page number and the names of all the fields in the table:

PAGE 1

F1 F2 F3 F4 F5X F6 F7 F8 F9 F10 F11X F12 F13 F14 F15 F16 F17 F18 F19 F20 F21 F22 F23 F24 F25


prod: WF 7.7.03 platform IIS on Windows 2007, databases: Oracle, , MSSQL

October 01, 2012, 03:25 PM
Dan Satchell
If you get an empty report, you could try: SET EMPTYREPORT = OFF. Otherwise, you might HOLD the output, then test &LINES to see whether to print the results or display a No Duplicates message.


WebFOCUS 7.7.05
October 01, 2012, 03:34 PM
Greg
The falt file I am reporting against has 2 duplicates records in it for testing.

These are not being reported with the above code.


prod: WF 7.7.03 platform IIS on Windows 2007, databases: Oracle, , MSSQL

October 01, 2012, 10:51 PM
jimster06
I remember the LAST function which is shown in this example.
It would allow you to compare f3 to LAST f3, f4 to LAST f4, etc. and select where there is equality in each of the four fields between two rows.


jimster06
DevStu WF 7.6.11
W7
HTML, PDF, EXL2K
October 02, 2012, 03:36 AM
Danny-SRL
Greg,

Using the famous CAR file:
  
-* File greg3.fex
TABLE FILE CAR
SUM CNT.SALES NOPRINT
BY COUNTRY NOPRINT
BY CAR NOPRINT
PRINT COUNTRY CAR RCOST DCOST SALES
BY COUNTRY NOPRINT
BY CAR NOPRINT
IF TOTAL CNT.SALES GT 1
END

will give only the duplicates.


Daniel
In Focus since 1982
wf 8.202M/Win10/IIS/SSA - WrapApp Front End for WF

October 02, 2012, 07:53 AM
George Patton
How about concatenating the 4 separate fields into a single field? It then become easy to COUNT the number of times a particular record occurs. If it's more than one just keep a single instance in the output file.


WebFOCUS 7.7.05 Windows, Linux, DB2, IBM Lotus Notes, Firebird, Lotus Symphony/OpenOffice. Outputs PDF, Excel 2007 (for OpenOffice integration), WP
October 02, 2012, 08:33 AM
Greg
quote:
Originally posted by Danny-SRL:
Greg,

Using the famous CAR file:
  
-* File greg3.fex
TABLE FILE CAR
SUM CNT.SALES NOPRINT
BY COUNTRY NOPRINT
BY CAR NOPRINT
PRINT COUNTRY CAR RCOST DCOST SALES
BY COUNTRY NOPRINT
BY CAR NOPRINT
IF TOTAL CNT.SALES GT 1
END

will give only the duplicates.


This worked after I added sales as a by field.


prod: WF 7.7.03 platform IIS on Windows 2007, databases: Oracle, , MSSQL