Using Aggregate in ETL
 
Silver Member
posted
I have a source file of transactions. The two key fields are “job nbr” and “activity date”. I need to grab only the most current record, and the source file is in no particular order. Example:

Source file:

JOB_NBR   ACTIVITY_DATE   FIELDA   FIELDB   FIELDC
12345     20060128        RTM      GATE     ONLINE
12345     20060124        MTO      CHDR     ONLINE
12345     20060201        RTM      GATE     ONLINE

I only want the 3rd record, with activity date 20060201, because it has the most current information. This record will be passed on to the destination file to update any existing record, or be inserted if there is no match.

I’ve been trying various methods using Aggregate in ETL but cannot get the results I need.

I can write an RPC to write what I need to a Focus file and use that as my source but wanted to know if it can be done in ETL.

Thanks,
Jim
 
Posts: 43 | Location: San Jose, California | Registered: July 18, 2005
Platinum Member
posted
What version are you using?

There is no straightforward way to do it in the column options. You would have to override the generated SQL with a correlated subquery (the subquery reads the same source table again):

SELECT JOB_NBR, ACTIVITY_DATE, ..........
FROM TABLEA
WHERE ACTIVITY_DATE = (SELECT MAX(B.ACTIVITY_DATE)
                       FROM TABLEA B
                       WHERE B.JOB_NBR = TABLEA.JOB_NBR)

It will work for non-relational sources too, but you pay a price in performance.
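
To see it end to end, here is a minimal, self-contained version of that query against the sample rows from the first post (the SOURCE_TX table name is made up for illustration, and the SQL is generic rather than Data Migrator-specific; it runs as-is in SQLite):

CREATE TABLE SOURCE_TX (
    JOB_NBR        INTEGER,
    ACTIVITY_DATE  INTEGER,   -- stored as YYYYMMDD
    FIELDA         TEXT,
    FIELDB         TEXT,
    FIELDC         TEXT
);

INSERT INTO SOURCE_TX VALUES (12345, 20060128, 'RTM', 'GATE', 'ONLINE');
INSERT INTO SOURCE_TX VALUES (12345, 20060124, 'MTO', 'CHDR', 'ONLINE');
INSERT INTO SOURCE_TX VALUES (12345, 20060201, 'RTM', 'GATE', 'ONLINE');

-- Correlated subquery: keep only the most recent row per job number.
SELECT A.*
FROM SOURCE_TX A
WHERE A.ACTIVITY_DATE = (SELECT MAX(B.ACTIVITY_DATE)
                         FROM SOURCE_TX B
                         WHERE B.JOB_NBR = A.JOB_NBR);

-- Returns only: 12345 | 20060201 | RTM | GATE | ONLINE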


WF7.1.4 Prod/Test, MRE, self serve, DM
 
Posts: 176 | Location: Desplaines | Registered: August 05, 2004
Silver Member
posted
I'm using ETL 5.33.
 
Posts: 43 | Location: San Jose, California | Registered: July 18, 2005
Silver Member
posted
I think I may have found a method in ETL.

I'm using the SORT option from SELECT COLUMNS of the Data Flow, sorting by JOB NUMBER (ASC) and ACTIVITY DATE (ASC).

Take the 3 records in my example and sort on job number and activity date; the 3 records then go to the destination file in that order. If the job number doesn't exist in the destination, the 1st sample record is inserted; the 2nd then updates it, and the 3rd updates it again, leaving the most current record for that job number in the destination file.

I tested it and it works. I spot-checked several records in the destination file, and only the most current record (latest activity date) was there.
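
To sketch why this works outside of ETL (generic SQL, runnable in SQLite; INSERT OR REPLACE here merely stands in for Data Migrator's insert/update key matching, which it only approximates):

-- Destination keyed on JOB_NBR; each arriving record inserts or replaces.
CREATE TABLE DEST (
    JOB_NBR        INTEGER PRIMARY KEY,
    ACTIVITY_DATE  INTEGER,
    FIELDA         TEXT,
    FIELDB         TEXT,
    FIELDC         TEXT
);

-- Records arrive sorted by JOB_NBR, ACTIVITY_DATE ascending:
INSERT OR REPLACE INTO DEST VALUES (12345, 20060124, 'MTO', 'CHDR', 'ONLINE'); -- no match: inserted
INSERT OR REPLACE INTO DEST VALUES (12345, 20060128, 'RTM', 'GATE', 'ONLINE'); -- match: replaces the 0124 row
INSERT OR REPLACE INTO DEST VALUES (12345, 20060201, 'RTM', 'GATE', 'ONLINE'); -- match: replaces the 0128 row

SELECT * FROM DEST;   -- only the 20060201 row remains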

Thanks much,
Jim
 
Posts: 43 | Location: San Jose, California | Registered: July 18, 2005
Platinum Member
posted
There is a performance price you are paying: you most probably are using key-matching logic on the target. Key-matching logic issues a read for every record going to the target, and this slows down the load.

However, if performance is not important, then what you are doing is fine.
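
If larger files make that a problem, one possible middle ground (just a sketch, reusing the hypothetical SOURCE_TX and DEST tables from the earlier examples) is to combine both suggestions in this thread: filter the source down to the latest record per job with the correlated subquery first, so the key-matched load only sees one record per job number:

-- Only one row per JOB_NBR ever reaches the keyed destination.
INSERT OR REPLACE INTO DEST
SELECT A.*
FROM SOURCE_TX A
WHERE A.ACTIVITY_DATE = (SELECT MAX(B.ACTIVITY_DATE)
                         FROM SOURCE_TX B
                         WHERE B.JOB_NBR = A.JOB_NBR);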


WF7.1.4 Prod/Test, MRE, self serve, DM
 
Posts: 176 | Location: Desplaines | Registered: August 05, 2004
Silver Member
posted
I am using key-matching logic, and yes, I agree with the performance concern. However, the transaction file is only about 20,000 records and takes about 5 minutes or less to load. I will have to keep the performance issue in mind when we start doing this with much larger transaction files.

Thank you for your advice.
Jim
 
Posts: 43 | Location: San Jose, California | Registered: July 18, 2005


