I have set up the following code, which creates a list of weeks for each record in the file FORPROCESSING. An example of the before/after data is below the code.
For the data I'm testing at the moment, FORPROCESSING has 708 records, which become 83,016 in MYAPPEND. This part of my code takes around 30 minutes to run. I don't know whether it takes this long simply because it has to loop 83,016 times, or whether appending to the file is slowing it down.
Any suggested alternatives or improvements please?
-SET &MY_APPEND_FILE = &&OTHERPARMHERE || '\' || 'MYAPPEND_FILE.FTM';
FILEDEF MYAPPEND DISK &MY_APPEND_FILE (APPEND
-RUN
TABLE FILE BEFOREPROCESSING
PRINT STT_WEEK_DT
      PROCESS_WEEKS
BY MY_ID AS FILE_MY_ID
ON TABLE HOLD AS FORPROCESSING
END
-RUN
-NEXTRECORD
-READFILE FORPROCESSING
-IF &IORETURN NE 0 GOTO READ_DONE;
-SET &REF_DT = &STT_WEEK_DT;
-SET &THE_WEEK_COUNTER = 1;
-REPEAT MYLOOP &PROCESS_WEEKS TIMES
DEFINE FILE CAR
WEEK_DT/DMYY = &REF_DT;
FIRST_WEEK_DT/DMYY = '&STT_WEEK_DT';
WEEK_COUNTER/I8 = &THE_WEEK_COUNTER;
MY_ID/A11 = '&FILE_MY_ID';
END
-RUN
TABLE FILE CAR
PRINT CAR NOPRINT
      FIRST_WEEK_DT
      WEEK_DT
      WEEK_COUNTER
BY MY_ID
WHERE CAR EQ 'AUDI'
ON TABLE HOLD AS MYAPPEND
END
-RUN
-* We can't increment &REF_DT directly, as Dialogue Manager doesn't know it's a date,
-* so we use a dummy file to increment it and read it back.
DEFINE FILE CAR
OLD_REF_DT/DMYY = &REF_DT;
NEW_REF_DT/DMYY = OLD_REF_DT + 7;
END
-RUN
TABLE FILE CAR
PRINT NEW_REF_DT
      CAR NOPRINT
ON TABLE HOLD AS TEMPDATE
END
-RUN
-* Read the file back and set &REF_DT to the newly incremented reference date
-READFILE TEMPDATE
-SET &REF_DT = &NEW_REF_DT;
-SET &THE_WEEK_COUNTER = &THE_WEEK_COUNTER + 1;
-* End of the loop controlled by &PROCESS_WEEKS
-MYLOOP
-GOTO NEXTRECORD
-READ_DONE
Contents of FORPROCESSING
MY_ID STT_WEEK_DT PROCESS_WEEKS
1235 31/01/2020 5
5678 14/02/2020 3
Contents of MYAPPEND
ID FIRST_WEEK_DT WEEK_DT WEEK_COUNTER
1235 31/01/2020 31/01/2020 1
1235 31/01/2020 07/02/2020 2
1235 31/01/2020 14/02/2020 3
1235 31/01/2020 21/02/2020 4
1235 31/01/2020 28/02/2020 5
5678 14/02/2020 14/02/2020 1
5678 14/02/2020 21/02/2020 2
5678 14/02/2020 28/02/2020 3
How long does a single loop take ?
Does it get longer after each loop ?
Is REF_DT from a table or is it just calc'ed ?
What you have could be moved to just some Dialogue Manager code.
I would be inclined to calculate the end date from the start date and number of weeks, hold and conditionally join to a dates table containing all the weeks within the date range that is required.
It will cut out the recursive looping and make things easier to maintain in future.
Keep it simple!!
I created this for a colleague recently for a very similar requirement -
-DEFAULTH &STARTYEAR=0, &ENDYEAR=0
EX -LINES * EDAPUT MASTER,CUSTDATA,C,MEM
FILENAME=CUSTDATA, SUFFIX=FOC
SEGNAME=SEG01, SEGTYPE=S1
FIELDNAME=CUSTOMER_NO, ALIAS=CUSTNO, USAGE=I11L, ACTUAL=I11, TITLE='Customer No', $
FIELDNAME=STARTYEAR, ALIAS=STYEAR, USAGE=I4, ACTUAL=I4, TITLE='Start Year', $
FIELDNAME=ENDYEAR, ALIAS=ENYEAR, USAGE=I4, ACTUAL=I4, TITLE='End Year', $
FIELDNAME=SOMEDATA, ALIAS=MYDATA, USAGE=A30, ACTUAL=A30, TITLE='Some Data', $
EDAPUT*
CREATE FILE CUSTDATA
MODIFY FILE CUSTDATA
FREEFORM CUSTOMER_NO STARTYEAR ENDYEAR SOMEDATA
MATCH CUSTOMER_NO
ON MATCH REJECT
ON NOMATCH INCLUDE
DATA
1,2011,2012,Something lies within 1,$
2,2012,2014,Something lies within 2,$
3,2013,2016,Something lies within 3,$
4,2014,2018,Something lies within 4,$
5,2015,2020,Something lies within 5,$
6,2016,2022,Something lies within 6,$
END
-RUN
-* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-* Dummy data complete, now to process it into multiple rows
-* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
FILEDEF YEARS DISK YEARS.FTM
-RUN
EX -LINES * EDAPUT MASTER,YEARS,C,MEM
FILENAME=YEARS, SUFFIX=FIX
SEGNAME=SEG01, SEGTYPE=S2
FIELDNAME=YEAR, ALIAS=YEAR, USAGE=I4, ACTUAL=A4, TITLE='Year', $
EDAPUT*
TABLE FILE CUSTDATA
SUM MIN.STARTYEAR MAX.ENDYEAR
ON TABLE HOLD AS TEMPHLD1
END
-RUN
-READFILE TEMPHLD1
-REPEAT :Loop FOR &Year FROM &STARTYEAR TO &ENDYEAR;
-WRITE YEARS &Year
-:Loop
JOIN CLEAR *
JOIN FILE YEARS AT YEAR TO ALL
     FILE CUSTDATA AT STARTYEAR AS J1
     WHERE YEAR FROM STARTYEAR TO ENDYEAR;
END
TABLE FILE YEARS
PRINT SOMEDATA
BY CUSTNO
BY YEAR
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
INCLUDE = enwarm, $
ENDSTYLE
END
A little more to help in your understanding of why your process is inefficient.
If you run your code (with the necessary corrections), you get the "job log" as follows -
0 NUMBER OF RECORDS IN TABLE= 2 LINES= 2
0 NUMBER OF RECORDS IN TABLE= 1 LINES= 1
0 NUMBER OF RECORDS IN TABLE= 10 LINES= 10
0 NUMBER OF RECORDS IN TABLE= 1 LINES= 1
0 NUMBER OF RECORDS IN TABLE= 10 LINES= 10
0 NUMBER OF RECORDS IN TABLE= 1 LINES= 1
0 NUMBER OF RECORDS IN TABLE= 10 LINES= 10
0 NUMBER OF RECORDS IN TABLE= 1 LINES= 1
0 NUMBER OF RECORDS IN TABLE= 10 LINES= 10
0 NUMBER OF RECORDS IN TABLE= 1 LINES= 1
0 NUMBER OF RECORDS IN TABLE= 10 LINES= 10
0 NUMBER OF RECORDS IN TABLE= 1 LINES= 1
0 NUMBER OF RECORDS IN TABLE= 10 LINES= 10
0 NUMBER OF RECORDS IN TABLE= 1 LINES= 1
0 NUMBER OF RECORDS IN TABLE= 10 LINES= 10
0 NUMBER OF RECORDS IN TABLE= 1 LINES= 1
0 NUMBER OF RECORDS IN TABLE= 10 LINES= 10
0 NUMBER OF RECORDS IN TABLE= 8 LINES= 8
0 HOLDING HTML FILE ON PC DISK ...
You will see that the multiple processes result in the data being handled many times over to get just 8 resultant records. That's where it is inefficient, and I would wager that your actual scenario isn't just 2 input records over a range of 10 potential weeks?!
Using a version of my code that reflects yours, you get the following "job log" -
0 NUMBER OF RECORDS IN TABLE= 2 LINES= 1
0 NUMBER OF RECORDS IN TABLE= 8 LINES= 8
0 HOLDING HTML FILE ON PC DISK ...
It produces exactly the same results as yours, but with far more efficient data handling.
My ID  Start Week  Week Date   Week No.  Week Counter
1234   31/01/2020  31/01/2020  4         1
1234   31/01/2020  07/02/2020  5         2
1234   31/01/2020  14/02/2020  6         3
1234   31/01/2020  21/02/2020  7         4
1234   31/01/2020  28/02/2020  8         5
5678   14/02/2020  14/02/2020  6         1
5678   14/02/2020  21/02/2020  7         2
5678   14/02/2020  28/02/2020  8         3
How about using a Cartesian product to avoid looping?
There is a good discussion of this technique here: Adding more records
-* This hold file can be created once and kept on baseapp.
-* It assumes you will never have more than 52 PROCESS_WEEKS.
TABLE FILE syscolum
PRINT NAME NOPRINT
COMPUTE ONE/I11 = 1;
COMPUTE WEEK_COUNTER/I11 = WEEK_COUNTER + 1;
WHERE RECORDLIMIT IS 52
ON TABLE SET HOLDLIST PRINTONLY
ON TABLE HOLD AS baseapp/counter FORMAT FOCUS INDEX ONE
END
-* Making a forprocessing file. Note you will need a ONE field with a value of 1 for all rows.
TABLE FILE counter
PRINT
 COMPUTE ONE          /I11  = DECODE WEEK_COUNTER ( 1 1 2 1 );
 COMPUTE MY_ID        /I11  = DECODE WEEK_COUNTER ( 1 1235 2 5678 );
 COMPUTE STT_WEEK_DT  /DMYY = DECODE WEEK_COUNTER ( 1 '31/01/2020' 2 '14/02/2020' );
 COMPUTE PROCESS_WEEKS/I11  = DECODE WEEK_COUNTER ( 1 5 2 3 );
WHERE RECORDLIMIT IS 2
ON TABLE SET HOLDLIST PRINTONLY
ON TABLE HOLD AS forprocessing FORMAT FOCUS
END
-* This join accomplishes the Cartesian product.
JOIN FILE forprocessing AT ONE
  TO ALL FILE counter AT ONE AS J1
  WHERE WEEK_COUNTER LE PROCESS_WEEKS;
END
-* This output represents the MYAPPEND file.
TABLE FILE forprocessing
PRINT STT_WEEK_DT
 COMPUTE WEEK_DT/DMYY = STT_WEEK_DT + (WEEK_COUNTER - 1) * 7;
 WEEK_COUNTER
BY MY_ID
END
Win10 / IE11
AHTML EXL2K PDF
@db, also known as a conditional join
Doesn't actually require the host file to have a matching column. So you could omit the ONE column and change the join to -
JOIN FILE forprocessing AT MY_ID
  TO ALL FILE counter AT ONE AS J1
  WHERE WEEK_COUNTER LE PROCESS_WEEKS;
END
I would also prefer to change the join to a many-to-one - which would also give the same results. But that's just me
@T, ah yes, thank you. So then neither ONE is required.
Based on your comment, the new and improved code is:
-* This hold file can be created once and kept on baseapp.
-* It assumes you will never have more than 52 PROCESS_WEEKS.
TABLE FILE syscolum
PRINT NAME NOPRINT
COMPUTE WEEK_COUNTER/I11 = WEEK_COUNTER + 1;
WHERE RECORDLIMIT IS 52
ON TABLE SET HOLDLIST PRINTONLY
ON TABLE HOLD AS baseapp/counter
END
-* Making a forprocessing file.
TABLE FILE counter
PRINT
 COMPUTE MY_ID        /I11  = DECODE WEEK_COUNTER ( 1 1235 2 5678 );
 COMPUTE STT_WEEK_DT  /DMYY = DECODE WEEK_COUNTER ( 1 '31/01/2020' 2 '14/02/2020' );
 COMPUTE PROCESS_WEEKS/I11  = DECODE WEEK_COUNTER ( 1 5 2 3 );
WHERE RECORDLIMIT IS 2
ON TABLE SET HOLDLIST PRINTONLY
ON TABLE HOLD AS forprocessing
END
-* This conditional join accomplishes the Cartesian product.
JOIN FILE forprocessing AT MY_ID
  TO ALL FILE counter AT WEEK_COUNTER AS J1
  WHERE WEEK_COUNTER LE PROCESS_WEEKS;
END
-* This output represents the MYAPPEND file.
TABLE FILE forprocessing
PRINT STT_WEEK_DT
 COMPUTE WEEK_DT/DMYY = STT_WEEK_DT + (WEEK_COUNTER - 1) * 7;
 WEEK_COUNTER
BY MY_ID
END
I suppose there is a many-to-many relationship between the two. Either one can be the left table, but either way you need the TO ALL keyword. At least I couldn't get it to work with TO UNIQUE.
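Reversed, with counter as the left table, it would look something like this (sketch only - the conditional WHERE does the same filtering either way):

-* Sketch: same conditional join with counter as the host/left table.
-* TO ALL is still required; the WHERE clause restricts each counter row
-* to the forprocessing rows with enough PROCESS_WEEKS.
JOIN FILE counter AT WEEK_COUNTER
  TO ALL FILE forprocessing AT MY_ID AS J1
  WHERE WEEK_COUNTER LE PROCESS_WEEKS;
END
TABLE FILE counter
PRINT STT_WEEK_DT
 COMPUTE WEEK_DT/DMYY = STT_WEEK_DT + (WEEK_COUNTER - 1) * 7;
 WEEK_COUNTER
BY MY_ID
END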
I appreciate your input, I learned something.
Win10 / IE11
AHTML EXL2K PDF
@ DB, I try and learn something new every day - and then I would like to say my day is done and go home, but it doesn't happen
@ iDuncanW, I was giving this a little more thought last night and wanted to make a few points on other things to be aware of.
We are showing examples using a FOCUS table as the main processor; conditionally joining this to the date file is fine because both are contained on the server. Your scenario might be a little different, and you need to give thought to where your data source(s) are for the final piece of reporting(?).
Is/are your data source(s) RDBMS based? If so, then you need to consider the join and the efficiency that would result. Remember to use SQL tracing where it is relevant. Search the Forum for SQLTRACE. Adjust the location of the held data (search on FORMAT SAMEDB) as necessary.
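For example, the usual starting point looks something like this (a sketch from memory, so verify on your release - MYTABLE and MYHOLD are just placeholder names):

-* Sketch: show the generated SQL in the report output
SET TRACEUSER = ON
SET TRACEOFF = ALL
SET TRACEON  = STMTRACE//CLIENT
-RUN
TABLE FILE MYTABLE
SUM SOMEFIELD
BY SOMEKEY
-* FORMAT SAMEDB writes the hold back into the same RDBMS,
-* so a subsequent join can stay within the database
ON TABLE HOLD AS MYHOLD FORMAT SAMEDB
END
-RUN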
I modified my example to show how to capture times at particular points in your code. This can be used to identify which sections of large code are consuming too much time - something that you already have done!!
-* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-* Firstly create the Dummy data
-* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-DEFAULTH &STWEEK=0, &ENWEEK=0, &WEEKS=0
EX -LINES * EDAPUT MASTER,CUSTDATA,C,MEM
FILENAME=CUSTDATA, SUFFIX=XFOC
SEGNAME=SEG01, SEGTYPE=S1
FIELDNAME=MY_ID, ALIAS=MYID, USAGE=I11, ACTUAL=I11, TITLE='My ID', $
FIELDNAME=STT_WEEK_DT, ALIAS=STWEEK, USAGE=DMYY, ACTUAL=DMYY, TITLE='Start Week', $
FIELDNAME=PROCESS_WEEKS, ALIAS=PRCWKS, USAGE=I4, ACTUAL=I4, TITLE='Proc. Weeks', $
-* I add a define into the synonym so that it retains the define in the subsequent conditional join
-* and so that it can be used within the WHERE clause
DEFINE END_WEEK_DT/DMYY = DTADD(STT_WEEK_DT, WEEK, PROCESS_WEEKS);, TITLE='End Week', $
EDAPUT*
CREATE FILE CUSTDATA
MODIFY FILE CUSTDATA
FREEFORM MY_ID STT_WEEK_DT PROCESS_WEEKS
MATCH MY_ID
ON MATCH REJECT
ON NOMATCH INCLUDE
DATA
-* Mimic 708 rows of data to be processed
-REPEAT :Data FOR &Cntr FROM 1 TO 354;
1234&Cntr.EVAL,31012020,5,$
5678&Cntr.EVAL,14022020,3,$
-:Data
END
-RUN
-* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-* Dummy data complete, now to process it into multiple rows
-* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-* Get the time at start
-SET &Time_001 = HHMMSS('A8');
FILEDEF WEEKDTS DISK WEEKDTS.FTM
-RUN
EX -LINES * EDAPUT MASTER,WEEKDTS,C,MEM
FILENAME=WEEKDTS, SUFFIX=FIX
SEGNAME=SEG01, SEGTYPE=S1
FIELDNAME=WEEKDT, ALIAS=WEEKDT, USAGE=DMYY, ACTUAL=A8DMYY, TITLE='Week Date', $
EDAPUT*
TABLE FILE CUSTDATA
SUM MIN.STT_WEEK_DT AS STWEEK
    MAX.END_WEEK_DT AS ENWEEK
COMPUTE WEEKS/I11 = DTDIFF(MAX.END_WEEK_DT, MIN.STT_WEEK_DT, WEEK) - 1;
ON TABLE SET HOLDLIST PRINTONLY
ON TABLE SET ASNAMES ON
ON TABLE HOLD AS TEMPHLD1
END
-RUN
-READFILE TEMPHLD1
-REPEAT :Loop FOR &WeekNo FROM 0 TO &WEEKS;
-SET &WEEKDT = DATECVT(DTADD(DATECVT(&STWEEK.QUOTEDSTRING, 'A8DMYY', 'YYMD'), WEEK, &WeekNo), 'YYMD', 'A8DMYY');
-WRITE WEEKDTS &WEEKDT
-:Loop
JOIN CLEAR *
JOIN FILE WEEKDTS AT WEEKDT TO ALL
     FILE CUSTDATA AT STT_WEEK_DT AS J1
     WHERE WEEKDT FROM STT_WEEK_DT TO END_WEEK_DT;
END
TABLE FILE WEEKDTS
PRINT COMPUTE WEEKNO/I4 = DTPART(WEEKDT, WEEK); AS 'Week No.'
      COMPUTE WEEKCNTR/I11 = DTDIFF(WEEKDT, STT_WEEK_DT, WEEK) + 1; AS 'Week Counter'
BY MY_ID
BY STT_WEEK_DT
BY WEEKDT
ON TABLE HOLD
ON TABLE SET PAGE NOLEAD
ON TABLE SET BYDISPLAY ON
ON TABLE SET STYLE *
INCLUDE = enwarm, $
ENDSTYLE
END
-RUN
-* Get the time at end
-SET &Time_002 = HHMMSS('A8');
-* Display the times to calculate the time taken
-? &Time
Thanks for the responses.
Note: I just added to my original post as I realised I hadn't put in the contents of the MYAPPEND file, but it looks like it was understood anyway
I don't know how long a single loop takes, and I'm not sure how to work it out when running the full list, as it loops over 80,000 times and blows the log file limit.
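(Perhaps I could bracket a single pass with HHMMSS - an untested sketch:)

-* Untested sketch: capture the time before and after one pass of the loop body
-SET &T_START = HHMMSS('A8');
-* ... one iteration of the -REPEAT body goes here ...
-SET &T_END = HHMMSS('A8');
-TYPE single pass: started &T_START ended &T_END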
&REF_DT - this is initially set after the -READFILE FORPROCESSING and takes the value from STT_WEEK_DT. It gets updated after the -READFILE TEMPDATE
Tony A & dbeagan
Thank you for the really detailed responses. I have a much lower experience level than both of you from the looks of things.
Tony, I struggled to understand how the dummy data was being created, but I'm guessing that was for demonstration and I would create a CUSTDATA file using my actual data?
I tried creating the YEARS file, but I haven't constructed a file like that before. I attempted what I thought was similar but got garbage values in my YEARS file. I wasn't sure how I would do the same thing with weeks, as I don't think I can increment them during the loop, so I thought I'd have a try at dbeagan's approach.
I created the counter file OK, but I have no idea what the syscolum file is. There will be some records where I may have as many as 500 PROCESS_WEEKS. Can I still use this file and increase the record limit, or should I just read any of our data files that have at least that many records?
I then created the forprocessing file using real data and just added the ONE field.
I was then able to do the Cartesian product/conditional join. I have never used that method before, so I need to read up on it some more, but I think I understand it.
Thanks again both. I still have much work to do, as the MYAPPEND file with the list of weeks for each ID is essentially a timeline. I then have to add multiple items. For example, each ID can have a number of payments over its life, but not necessarily in each week. I will need to create a file that lists each week a payment has occurred and then combine it with the MYAPPEND file.
I was using MATCH FILE to combine them, but I think I buggered something up, as I had it working but now it just gets stuck and the job does nothing. I'm guessing I can probably use some sort of join based on WEEK_DT and MY_ID.
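Roughly what I was attempting was along these lines (an untested sketch - PAYMENTS and PAYMENT_AMT are made-up names for my payments data):

-* Untested sketch: merge the weekly timeline with the payments by MY_ID and WEEK_DT
MATCH FILE MYAPPEND
PRINT FIRST_WEEK_DT
      WEEK_COUNTER
BY MY_ID
BY WEEK_DT
RUN
FILE PAYMENTS
SUM PAYMENT_AMT
BY MY_ID
BY WEEK_DT
-* OLD-OR-NEW keeps timeline weeks with no payment as well as matched weeks
AFTER MATCH HOLD AS TIMELINE OLD-OR-NEW
END
-RUN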
I just realised that my profile had our old version of WebFOCUS; we are now on 8.2. It's updated now.
We don't have your data so both DB and I had to create some.
Mine is achieved using CUSTDATA. You should use your original data file (BEFOREPROCESSING).
I am guessing that you are Australia based? If your Company needed to, they could contact the local branch and talk to the Professional Services team. They could supply consultancy services to help you build your experience and get you to an experienced level quicker than trying to find your way on your own.
There are also Customer events such as User Groups and Summits which might prove useful for you and your Company. They are free-to-attend events AFAIK and are good in many ways - just ask the general membership on the Forum!
However you choose to continue, I wish you good luck
Thanks again Tony.
It took me a while to respond to the original replies; I had left the thread open at the posts up to Feb 26 11:33am and have just read the later ones. I certainly have a lot to learn. My experience is with programming and extracting the data; I don't have a background in databases, but yes, our data is RDBMS based. I've learnt that on 8.2 we are Oracle based, and we need to try to keep joins as either Oracle-to-Oracle or FOCUS hold file to FOCUS hold file. If we try to join Oracle to FOCUS on larger amounts of data, performance is not good.
This line alone should help me immensely as I hadn't explored using functions in -SET commands before.
I'm having trouble getting it to work, though. In the inner DATECVT, the value is resolving to '0', so I get an error from the DTADD function. I know &MYDATE.QUOTEDSTRING is '01022018', so I'm not sure why it ends up as '0'.
DEFINE FILE CAR
MYDATE/DMYY = '01/02/2018';
END
-RUN
TABLE FILE CAR
PRINT CAR NOPRINT
      MYDATE
WHERE RECORDLIMIT EQ 1
ON TABLE HOLD AS DATETEST
END
-RUN
-READFILE DATETEST
-SET &MYDATE_CVT = DATECVT(&MYDATE.QUOTEDSTRING, 'A8DMYY', 'YYMD');
-TYPE my date converted = &MYDATE_CVT
Yes I am Australia based and will certainly explore my development options.
Thanks for the well wishes
Duncan
I get "my date converted = 42766" from your example.
Are you not getting that?
No, if I use -SET I get "my date converted = 0"
If I use DATECVT in a DEFINE like the one below, MYDATE_CVT is 2018/02/01
DEFINE FILE CAR
MYDATE/A8 = '01022018';
MYDATE_CVT/YYMD = DATECVT(MYDATE, 'A8DMYY', 'YYMD');
END
-RUN
TABLE FILE CAR
PRINT CAR NOPRINT
      MYDATE
      MYDATE_CVT
WHERE RECORDLIMIT EQ 1
END
-RUN
-EXIT
Not sure what is happening for you, but if I use the following code I get the output below.
DEFINE FILE CAR
MYDATE_A8/A8 = '01022018';
MYDATE_CVT/YYMD = DATECVT(MYDATE_A8, 'A8DMYY', 'YYMD');
MYDATE/DMYY = '01/02/2018';
END
-RUN
TABLE FILE CAR
PRINT CAR NOPRINT
      MYDATE_A8
      MYDATE_CVT
      MYDATE
WHERE READLIMIT EQ 1
ON TABLE HOLD AS DATETEST FORMAT ALPHA
END
-RUN
-READFILE DATETEST
-SET &MYDATE_CVTA8 = DATECVT(&MYDATE.QUOTEDSTRING, 'A8DMYY', 'YYMD');
-TYPE my date = &MYDATE
-TYPE converted in DEFINE = &MYDATE_CVT
-TYPE converted IN DM = &MYDATE_CVTA8
my date = 01022018
converted in DEFINE = 20180201
converted IN DM = 42766
Nice to see another Aussie, iDuncanW