Presumably the load type is set to "Slowly Changing Dimensions", so for each input row a SELECT with a WHERE condition on the logical key values is issued to see if that key already exists in the target. As Jessica suggests, indexes on those columns would help.
Alternatively, a single key column may also help. But what data type is "Company", and the other key columns? If they are each 4-byte integers you wouldn't see much benefit; but if Company is a 64-character string, and so is "Program", then you would.
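A composite index on the logical key columns lets each of those per-row existence checks use an index seek instead of a full table scan. A minimal sketch, assuming a dimension table called dim_table with four logical key columns (all names here are hypothetical; substitute your own):

```sql
-- Hypothetical names: replace dim_table and the column list with your
-- actual dimension table and its four logical key columns.
CREATE INDEX ix_dim_logical_key
    ON dim_table (company, program, key_col3, key_col4);
```

With an index like this in place, the lookup that SCD processing issues for every input row can be satisfied from the index rather than by scanning the whole table.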
Also there are performance improvements to SCD processing in 7.6.10 gen 363 or later which should be available for download shortly.
Hi all, first, thanks all for replying. I currently have no indexes relating to the flow in question. My target is set to "Slowly Changing", and my logical keys, Company for example, are not very large in size/type (Int4). In my target, the first field is my surrogate key, and yes, it already existed. I have that field left as "blank" as the surrogate, followed by 4 fields that are set to "Logical Key".
Since my original posting, I have modified my source/target tables and created a unique ID, basically a CONCAT of those four logical fields, and then used that new field as my one "Logical Key" field. This did not improve run time; I tried running it again yesterday, and it was still running after 5 hours before I killed it.
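For reference, the single concatenated key described above could be built roughly like this in MySQL (hypothetical table and column names; a delimiter guards against ambiguous concatenations):

```sql
-- Hypothetical sketch: one concatenated logical key from four columns.
-- The '|' delimiter prevents e.g. ('AB','C') colliding with ('A','BC').
SELECT CONCAT(company, '|', program, '|', key_col3, '|', key_col4) AS logical_key
  FROM source_table;
```

Note that unless the new concatenated column is itself indexed in the target, this alone won't speed up the per-row lookups, which is consistent with the run time not improving.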
I have also tried setting the "Commit every rows" field to 100000 instead of the default.
I am not looking to load all 64,000 rows, only those I have modified.
For example, I changed one field in my source where a Type II field would be affected. This should end-date the pre-loaded key record and create a new key record with an open end date. I also changed a record for a Type I field; this should only update a few records. Lastly, I changed a source record for both a Type I field and a Type II field; I would expect a new record to be created, plus the Type I fields updated on the other records with the same logical key.
That said, I would only expect 2 new records, and fewer than 5 others to be updated. The read seems to be taking all the time.
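As a rough illustration of what the Type II change described above should do in the target (a sketch only, with hypothetical table, column, and key values, not DataMigrator's actual generated SQL):

```sql
-- 1) End-date the currently active record for that logical key:
UPDATE dim_table
   SET end_date = CURRENT_DATE, active_flag = 0
 WHERE company = 'ACME' AND program = 'P1' AND active_flag = 1;

-- 2) Insert the new version of the record with an open end date:
INSERT INTO dim_table (company, program, type2_col, begin_date, end_date, active_flag)
VALUES ('ACME', 'P1', 'new value', CURRENT_DATE, NULL, 1);
```

The writes themselves are tiny (2 inserts, a handful of updates), which is why the read side finding which rows changed is where all the time is going.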
We're also using a generic JDBC connection to get the source data from an IS400 box, and the MySQL JDBC connection to load the target on a MySQL db.
Thanks again for your feedback.
WebFocus Developer Studio v. 7610 MySQL/Linux All different outputs including map technology and using the DMC for our ETL process as well
We usually find it very helpful to use the dimension both as a target and as a source in the same data flow.
We join the dimension with the original source object on the natural/logical key, with an outer join on the original source.
In the SQL filter we only allow data through from the sources where dimension.active_flag IS NULL, or (dimension.active_flag = 1 AND ((dimension_column_1 <> source_column_1) OR (dimension_column_2 <> source_column_2) OR (dimension_column_n <> source_column_n))).
This way, the only records that are bounced against the dimension are those where at least one of the Type I or Type II columns has changed, or where it is a new record.
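Put together, the flow described above amounts to something like the following (hypothetical table and column names; the dimension is left-outer-joined back to the source so that only new or changed rows survive the filter):

```sql
SELECT src.*
  FROM source_table src
  LEFT OUTER JOIN dim_table dim
    ON  dim.company = src.company
    AND dim.program = src.program            -- ...plus the other logical key columns
 WHERE dim.active_flag IS NULL               -- no match: a brand-new record
    OR (dim.active_flag = 1
        AND (   dim.column_1 <> src.column_1
             OR dim.column_2 <> src.column_2
             OR dim.column_n <> src.column_n));  -- some Type I/II column changed
```

Only the rows returned here need to be bounced against the dimension, which keeps the expensive per-row SCD lookups limited to the handful of records that actually changed.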