Focal Point
Data Migrator - Target Transformation

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/1381057331/m/4821080332

June 18, 2007, 09:04 AM
AlainD
Data Migrator - Target Transformation
Hi,

We are starting to use Data Migrator to develop the ETL that will populate our new data warehouse. I am currently using the Data Migrator included in Developer Studio until the installation of the iWay server is completed.

I want to load a dimension (SCD Type 1) that has a surrogate key (the primary key) and a natural key (the primary key of the source system). In the target transformation I want to be able, for each record, to first verify whether the record already exists in the target based on the natural key rather than the surrogate key (primary key). Second, if the record exists (per the natural key), I want to verify that the non-key columns have changed before performing the update.

I tried to use the Key Matching Logic, but it matches on the primary key.
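Roughly, the per-record logic I'm after is the following (sketched here in Python against SQLite just to illustrate; the table and column names like dim_customer, cust_sk, and cust_id are made up, not Data Migrator syntax):

```python
import sqlite3

# Hypothetical SCD1 dimension: surrogate key cust_sk, natural key cust_id,
# one non-key attribute (name). All names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (
        cust_sk INTEGER PRIMARY KEY,   -- surrogate key
        cust_id INTEGER UNIQUE,        -- natural key from the source system
        name    TEXT                   -- non-key column
    );
    INSERT INTO dim_customer (cust_sk, cust_id, name) VALUES (1, 100, 'Acme');
""")

def scd1_load(conn, cust_id, name):
    """Look up by natural key; insert if absent, update only if a non-key
    column actually changed (SCD Type 1), otherwise do nothing."""
    row = conn.execute(
        "SELECT cust_sk, name FROM dim_customer WHERE cust_id = ?", (cust_id,)
    ).fetchone()
    if row is None:
        conn.execute("INSERT INTO dim_customer (cust_id, name) VALUES (?, ?)",
                     (cust_id, name))
        return "inserted"
    if row[1] != name:
        conn.execute("UPDATE dim_customer SET name = ? WHERE cust_id = ?",
                     (name, cust_id))
        return "updated"
    return "unchanged"

print(scd1_load(conn, 100, 'Acme'))      # unchanged: same non-key data
print(scd1_load(conn, 100, 'Acme Inc'))  # updated: name changed
print(scd1_load(conn, 200, 'Globex'))    # inserted: new natural key
```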

Thanks,
June 19, 2007, 09:54 AM
Jessica Bottone
Alain, if I'm following you correctly, what you want to do has to be done before you get to the target side of the ETL data flow. The type of data screening you are looking for cannot be done in a target transformation. On the source side of the ETL data flow, I would do a left outer join from your source to the target on the natural key fields, then in the SQL portion of the data flow, add filtering to pull only the data where the non-key fields between the two tables are different. If you're comfortable writing your own code, another way to approach this would be to do the same thing I described here in a Data Migrator stored procedure, hold the results in a HOLD file, then use the HOLD file as your source. If you're interested in that approach, I can give you more info on it. Good luck.
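To illustrate the join-plus-filter I mean (with hypothetical table and column names, not your actual schema; sketched in Python against SQLite just so it runs):

```python
import sqlite3

# src = source-system table, tgt = warehouse dimension. Left-outer-join source
# to target on the natural key (cust_id) and keep only rows that are new or
# whose non-key columns differ. Names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (cust_id INTEGER, name TEXT);
    CREATE TABLE tgt (cust_sk INTEGER PRIMARY KEY, cust_id INTEGER, name TEXT);
    INSERT INTO src VALUES (100, 'Acme Inc'), (200, 'Globex'), (300, 'Initech');
    INSERT INTO tgt VALUES (1, 100, 'Acme'), (2, 300, 'Initech');
""")

# Rows worth loading: natural key absent from target, or present with changes.
changed = conn.execute("""
    SELECT s.cust_id, s.name
    FROM src s
    LEFT OUTER JOIN tgt t ON t.cust_id = s.cust_id
    WHERE t.cust_id IS NULL        -- new natural key
       OR t.name <> s.name         -- non-key column changed
    ORDER BY s.cust_id
""").fetchall()
print(changed)   # [(100, 'Acme Inc'), (200, 'Globex')]
```

Row 300 drops out because its non-key data is unchanged, which is exactly the screening you described.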


Data Migrator 5.3, 7.1, 7.6
WebFOCUS 7.1, 7.6, 7.7
SQL Server, Oracle, DB2
Windows
June 19, 2007, 03:23 PM
AlainD
Jessica, thanks for your feedback. We are just starting to use Data Migrator here, and I am the volunteer opening the way with ETLs. Our source and target databases are both Oracle 9i. For now, I am just trying to load a few dimensions; later, I will have to load the atomic fact tables and the aggregated fact tables. Joining the source and the target on the source side should work well for the small dimensions. For the bigger ones, I am interested in your approach using a DM stored procedure. If you can give me more info on it, it would be much appreciated.
Thanks.

PS. I will try the first technique you described (left outer join) on a small dimension and let you know how it went.

...I tried it, and the outer join does not work in my case. The source table and the target table are in two different Oracle databases. When DM builds the SQL for the join, it sends it to the source database for execution, and the target table is not found there. So I tried the following instead: the first data flow extracts the data from the source table and saves it to a flat file; the second data flow does the left outer join between the flat file and the target table, applies the filter, and populates the target table. It is working, but is it a good solution? Thanks.
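In other words, the two flows do something like this (sketched in Python with SQLite and a CSV file standing in for DM and the flat file; all names are made up):

```python
import csv
import os
import sqlite3
import tempfile

# Flow 1: extract the source rows to a flat (CSV) file.
src_rows = [(100, 'Acme Inc'), (200, 'Globex')]
path = os.path.join(tempfile.mkdtemp(), "dim_extract.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(src_rows)

# Flow 2: stage the flat file next to the target table and join locally,
# so no cross-database SQL join is ever generated.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tgt (cust_id INTEGER, name TEXT)")
conn.execute("INSERT INTO tgt VALUES (100, 'Acme')")
conn.execute("CREATE TEMP TABLE staged (cust_id INTEGER, name TEXT)")
with open(path, newline="") as f:
    for cust_id, name in csv.reader(f):
        conn.execute("INSERT INTO staged VALUES (?, ?)", (int(cust_id), name))

# Keep only new rows or rows whose non-key columns changed.
to_load = conn.execute("""
    SELECT s.cust_id, s.name FROM staged s
    LEFT OUTER JOIN tgt t ON t.cust_id = s.cust_id
    WHERE t.cust_id IS NULL OR t.name <> s.name
    ORDER BY s.cust_id
""").fetchall()
print(to_load)   # [(100, 'Acme Inc'), (200, 'Globex')]
```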

June 20, 2007, 10:07 AM
Jessica Bottone
Alain, I don't believe that having those two Oracle tables in two different databases should have mattered. I'd like to see the synonyms for both of them (master file description and access file description) as well as the properties for the data adapters. I would have thought that Data Migrator would have seen that they were in two different databases and not generated SQL trying to join them together directly. Just as with a DB2 source and an Oracle source, I would have expected it to send one request to get the data from one, send a separate request to get the data from the other, and then do the join itself back on the server side. Would someone from IBI like to comment on this?

As to what you said about using flat files and whether that was a good solution: if you don't mind two steps and having to create a permanent flat file, I think it's a perfectly fine solution. Some people prefer splitting that kind of thing out and making permanent files that stick around; others do not. As long as it runs well for you, go with what you prefer. What I referred to was doing basically exactly what you did, but in one step: where you have a flat file that's permanent, the way I was describing would have used temporary files that go away when the job is done. If you're interested, send me your email address and I'll send you a write-up on using stored procedures. Then you can decide which way you like better.


Data Migrator 5.3, 7.1, 7.6
WebFOCUS 7.1, 7.6, 7.7
SQL Server, Oracle, DB2
Windows
June 21, 2007, 05:23 PM
AlainD
Just to let you know that I retried the same join that was not working and, without making any change, it is now working. As I mentioned previously, I am currently using the Data Migrator included in Dev Studio until our technical team has time to install the iWay server and clients. It looks like the Dev Studio DM might not be reliable; I don't know. Other things that should normally work are also not working. I will wait until I have iWay before I continue.