[SOLVED] How to create a random subset of a file using the function RDUNIF or PRDUNI
Hi All,
I'm creating a new thread to better reflect the subject that is being discussed. This discussion started from another topic named "Current Date Calculation". This discussion is going to be about creating a random subset of a file.
I'm going to copy some information from the other discussion and then add to it later.
Jim
WebFocus 8.201M, Windows, App Studio
October 20, 2010, 12:07 PM
jfr99
This was from Hayley:
This is related in a way to your current day topic. As I need to create a hold file of randomly retrieved records, I thought I could use the lowest microseconds position in the timestamp as part of some formula to do it.
Is there some way I can access the current timestamp, convert it to A26 and parse out the value in that final position? I have tried the HYYMDm function --- only with CAPS OFF in my profile --- and I can only get the smart date back, but I can't convert it successfully.
BUT, if there is a better way to get the hold file of randomly selected records, please let me know. I have a deadline of 2 weeks for the Excel file output.
October 20, 2010, 12:08 PM
jfr99
This was from me:
In looking at your first post.......here is an example of printing 3 random cars from the CAR file using a random number function. Don't know if this will be of any use, but here it is:
-*
DEFINE FILE CAR
CAR_RAND/D12.8 WITH CAR = RDUNIF(CAR_RAND);
END
-*
TABLE FILE CAR
PRINT CAR
BY CAR_RAND
ON TABLE HOLD
END
-*
TABLE FILE HOLD
PRINT *
WHERE RECORDLIMIT EQ 3
END
I've used this method to pull a random subset from a file and it has worked for me.
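As a hedged variation on the example above, the subset size can be supplied through a Dialogue Manager variable instead of being hard-coded. This is only a sketch: `&NREC` is a hypothetical variable name that does not appear in the original post, and everything else is the same request as above.

```focus
-* &NREC is a hypothetical name for the subset size (default 3)
-DEFAULT &NREC = 3
-*
-* Attach a random number to every CAR record
DEFINE FILE CAR
CAR_RAND/D12.8 WITH CAR = RDUNIF(CAR_RAND);
END
-*
-* Sorting by the random key puts the HOLD file in random order
TABLE FILE CAR
PRINT CAR
BY CAR_RAND
ON TABLE HOLD
END
-*
-* Keep only the first &NREC records of the shuffled HOLD file
TABLE FILE HOLD
PRINT CAR
WHERE RECORDLIMIT EQ &NREC
END
```

Dialogue Manager substitutes `&NREC` before the request runs, so with the default the last request reads `WHERE RECORDLIMIT EQ 3`, exactly as in the original.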
October 20, 2010, 12:10 PM
jfr99
This is Hayley's last response:
Thank you, Jim. Using the RDUNIF function worked perfectly for me, too. As expected, I got different results from the 2 consecutive times I ran the test batch query. And since all I need to do is get one set of randomly selected projects one time within the program, this is fine.
However, there is one detail that needs to be shared for anyone else whose specs might not be the same. If I ran the function twice in the same program, using 2 different defined random number names, I got the SAME results, where I would have expected an entirely different set with the different field names.
This is my base file.
PROJ  NAME
----  ----
28    PROJECT 28
49    PROJECT 49
63    PROJECT 63
74    PROJECT 74
321   PROJECT 321
502   PROJECT 502
DISPLAY RANDOM FILE 2
RAND_02    PROJ  NAME
-------    ----  ----
.16259415  28    PROJECT 28
.21508715  49    PROJECT 49
.24673269  502   PROJECT 502
.30672254  321   PROJECT 321
.69007673  63    PROJECT 63
.98673983  74    PROJECT 74
*************************** Bottom of Data *
October 20, 2010, 12:18 PM
jfr99
Hi Hayley,
You are correct, this function will create reproducible random numbers. I also found another function named PRDUNI which has another parameter that is a seed number. If you change this seed number, then you get different sets of random numbers.
Here is another example using the CAR file that shows different uses of both RDUNIF and PRDUNI:
-*****************************************
-** THIS USES FUNCTION = RDUNIF         **
-*****************************************
-* CREATE HOLD FILE FROM CAR
TABLE FILE CAR
PRINT
COMPUTE CNTR/I5 = CNTR + 1;
BY CAR
ON TABLE HOLD
END
-* PRINT HOLD FILE WITH 1 RANDOM NUMBER
DEFINE FILE HOLD
RAND_01/D12.8 = RDUNIF(RAND_01);
END
-*
TABLE FILE HOLD
PRINT CNTR CAR RAND_01
END
-* PRINT HOLD FILE WITH 2 RANDOM NUMBERS
DEFINE FILE HOLD
RAND_11/D12.8 = RDUNIF(RAND_11);
RAND_12/D12.8 = RDUNIF(RAND_12);
END
-*
TABLE FILE HOLD
PRINT CNTR CAR RAND_11 RAND_12
END
-* PRINT HOLD FILE WITH 3 RANDOM NUMBERS
DEFINE FILE HOLD
RAND_21/D12.8 = RDUNIF(RAND_21);
RAND_22/D12.8 = RDUNIF(RAND_22);
RAND_23/D12.8 = RDUNIF(RAND_23);
END
-*
TABLE FILE HOLD
PRINT CNTR CAR RAND_21 RAND_22 RAND_23
END
-*****************************************
-** THIS USES FUNCTION = PRDUNI         **
-** THE FIRST PARM IS A SEED NUMBER     **
-*****************************************
-* PRINT HOLD FILE WITH 1 RANDOM NUMBER
DEFINE FILE HOLD
RAND_01/D12.8 = PRDUNI(40, RAND_01);
END
-*
TABLE FILE HOLD
PRINT CNTR CAR RAND_01
END
-* PRINT HOLD FILE WITH 2 RANDOM NUMBERS
DEFINE FILE HOLD
RAND_11/D12.8 = PRDUNI(50, RAND_11);
RAND_12/D12.8 = PRDUNI(50, RAND_12);
END
-*
TABLE FILE HOLD
PRINT CNTR CAR RAND_11 RAND_12
END
-* PRINT HOLD FILE WITH 3 RANDOM NUMBERS
DEFINE FILE HOLD
RAND_21/D12.8 = PRDUNI(60, RAND_21);
RAND_22/D12.8 = PRDUNI(60, RAND_22);
RAND_23/D12.8 = PRDUNI(60, RAND_23);
END
-*
TABLE FILE HOLD
PRINT CNTR CAR RAND_21 RAND_22 RAND_23
END
I hope this can maybe help others......I think I learned something today.
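For anyone who wants to combine the two ideas, here is a minimal sketch of the random-subset request driven by PRDUNI with a seed that can be changed per run. It assumes PRDUNI behaves with a substituted seed exactly as it does with the literal seeds shown above. `&SEED` and `&NREC` are hypothetical variable names, not something from the earlier posts.

```focus
-* &SEED and &NREC are hypothetical names.
-* Change &SEED to draw a different subset; keep it fixed to repeat one.
-DEFAULT &SEED = 40
-DEFAULT &NREC = 3
-*
-* Attach a seeded random number to every CAR record
DEFINE FILE CAR
CAR_RAND/D12.8 WITH CAR = PRDUNI(&SEED, CAR_RAND);
END
-*
-* Sorting by the random key puts the HOLD file in random order
TABLE FILE CAR
PRINT CAR
BY CAR_RAND
ON TABLE HOLD
END
-*
-* Keep only the first &NREC records of the shuffled HOLD file
TABLE FILE HOLD
PRINT CAR
WHERE RECORDLIMIT EQ &NREC
END
```

Dialogue Manager substitutes the variables before the request runs, so with the defaults the DEFINE line becomes `PRDUNI(40, CAR_RAND)`, the same call used in the example above.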