Focal Point
[SOLVED] How to filter Duplicate Records

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/7971057331/m/5357049676

June 26, 2015, 12:46 AM
CSun
[SOLVED] How to filter Duplicate Records
Howdy All



So, I'm new to WebFOCUS and programming in general, basic training is in progress.

That being said, this is the second time I have come across the following scenario, and have not been able to find a solution (or a concise way to state the problem):



I have a table that lists employee education information as follows:



Employee_ID

Employee_Name

Degree

University_Name

Date_Awarded

(About 50 other fields relating to education and training info).



The question I am asking is: What are the email addresses and degrees of employees who attended "UCLA"?



The client wants the report in the following format (I am joining to a table showing active employee data by Employee ID to get email addresses):



Employee_ID

Email Address

Degree,

Date Awarded





The problem is, my report keeps giving me duplicates for alums of multiple programs (An employee that attended undergrad, grad and certification will have three different records).



For some reason 'Distinct' and 'Last" eliminate some of the dupes, but not all.



Any direction would be appreciated.

This message has been edited. Last edited by: CSun,


WF 7.7.03 on Win7/Tomcat
June 26, 2015, 03:16 AM
Neelima
Could use SET command

SET ALL=OFF


WebFocus 8104,8204
Excel/PDF/HTML/HTMLTABLE/XML/ALPHA/GIF file/GRAPH/Active technologies
June 26, 2015, 04:33 AM
Martin vK
Put all fields as BY's and you will not have duplicates in the report.

Nevertheless you could still get multiple records for employees that attended multiple programs in the same university. In that case you do need to use MAX or similar for last Date Awarded. I am not sure how to determine which is last Degree. Depends on your data.

Martin.


WebFocus 8206M, iWay DataMigrator, Windows, DB2 Windows V10.5, MS SQL Server, Azure SQL, Hyperstage, ReportCaster
June 26, 2015, 07:34 AM
George Patton
My guess that the reason you are getting duplicates in some cases but not others is that the university names are not entered in a consistent manner. So UCLA is not the same as UCLA_ (where _ represents a space at the end).

What you should do is initially create a table that is sorted by university, then degree etc. That should immediately identify if you have incorrectly entered data.

Once you are sure you have good data, HOLD this initial pass and then re-sort using the distinct attribute.


WebFOCUS 7.7.05 Windows, Linux, DB2, IBM Lotus Notes, Firebird, Lotus Symphony/OpenOffice. Outputs PDF, Excel 2007 (for OpenOffice integration), WP
June 26, 2015, 10:55 AM
mmally
The problem is the data you are trying to combine. The data allows 2 places for duplicates Degree and Date Awarded. One way to get a clean list would be to build a HOLD file with EmpID, Email, and LAST(Date Awarded). Then join the hold file back to the employee table on ID and Date, then bring in Degree and you will get the last degree.


WebFOCUS 8
Windows, All Outputs
June 29, 2015, 06:26 PM
CSun
Thanks, for the quick responses on this.

Printing all of the fields as BY's still showed several duplicates. The only accurate way to get last degree was to separate them into levels (BS, MS, etc.).

Mmally's idea of creating a Hold file with ID and date worked perfectly.

Thanks again!


WF 7.7.03 on Win7/Tomcat