Focal Point
[CLOSED] Maintain: Duplicate records validation

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/7971057331/m/7227010376

October 09, 2014, 02:41 AM
Nasar
[CLOSED] Maintain: Duplicate records validation
Hi,

I have an application, where we need to validate for duplicate records.
Example, there are 5 columns in a table A,B,C,D,E having values 1,2,3,4,5 for the columns then this same combination of 1,2,3,4,5 can NOT be entered again. That is the combination values for these columns should be unique.
Is there a ways to do this in maintain.

Any sample code/logic would be highly beneficial.

Thanks,
Nasar

This message has been edited. Last edited by: <Kathryn Henning>,


8.06
Windows
October 09, 2014, 03:10 AM
Alan B
From your question,I assume that it is not valid to have these columns as key fields, which would be normal.

So basically there are 3 options:

1. If the data is already in a stack, use STACK COPY with WHERE clause to check if the values already exist.
2. Use a FOR ALL NEXT with WHERE clause against the data table and check the returned stack.
3. Create and EX a focexec that contains a DB_LOOKUP with the column values passed to it and check the returned stack.

The choice of which to use depends on the data volumes, source type and indices. TO help with which one is best for you, mfd, volumes and data type are required.


Alan.
WF 7.705/8.007
October 09, 2014, 03:31 AM
Nasar
Thanks Alan for prompt reply.
Yes we can not keep the columns as keys.
The data volumes is high (more than 1 million rows and around 20+ columns to be validated).
I tried the first approach that you suggested, but that had very slow performance.
Please suggest which one to opt.
I am not sure about the 2nd and 3rd approach that you suggested. Please elaborate.
A short code/logic would be highly helpful.

The datatype is a mix of INT and VARCHAR
Thanks,
nasar


8.06
Windows
October 09, 2014, 07:43 AM
Maintain Wizard
I think in this case I would use next with where. Since the fields are NOT key fields it will be a little slower, but it should be ok. Assume a Master with:

Key0
Field1
Field2
Field3
Field4
Field5

After retrieving the values in a stack (STK1) I would do:

Reposition Key0
Stack clear STK2
Next Key0 into STK2
where Field1 = STK1.Field1
and Field2 = STK1.Field2
and Field3 = STK1.Field3
and Field4 = STK1.Field4
and Field5 = STK1.Field5

If STK2.FOCCOUNT = 1 then REJECT

You don't need a FOR ALL NEXT since if you hit even one record you want to reject. See if this is good for you.

Mark
October 12, 2014, 06:26 AM
Alan B
So I was wondering if there would be any performance benefit when using different approaches, and the answer is no.

I set up a MySQL table with 21 columns and 10 million rows and no indices.

Used 3 methods, TABLE, Maintain NEXT and DB_LOOKUP. Performance was equal for all approaches, which in real terms was bad, all running at about 66 seconds to find a single record with 20 screening conditions.

However, adding 2 medium cardinality indices brought the time down to below 1 second for each of the 3 methods used.

So for simplicity I would use the NEXT keyfield, as Mark suggested, but ensure that the DB has indices to help performance.


Alan.
WF 7.705/8.007