Focal Point
[CLOSED] DataMigrator ETL without the 'T'

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/7971057331/m/7137062886

November 07, 2017, 03:33 PM
Doug
[CLOSED] DataMigrator ETL without the 'T'
I have a requirement where I need to use DM to create a TABLE from a VIEW. Simple enough, however, the view does not require any transformations as they've all been done within the view.

DM is not producing any rows in my table unless I add every field in the view for a transformation, even if that transformation really doesn't do anything.

What's the missing piece here? Do I really need to do the transformation even if it's not required? Or, would I be better off just doing a TABLE FILE from the view and APP HOLDing the results?

Thanks in advance, Doug

This message has been edited. Last edited by: FP Mod Chuck,




   In FOCUS Since 1983 ~ from FOCUS to WebFOCUS.
   Current: WebFOCUS Administrator at FIS Worldpay | 8204, 8206
November 08, 2017, 10:55 AM
Clif
You do need "transformations" in the target object for an existing table for every field you want to write but it's the degenerate case where you just want a simple assignment.

If the names are the same in the source and target you can just click the automap button.


N/A
November 08, 2017, 12:43 PM
Doug
Thanks Clif, I'll check out "the automap button" and check the results...
November 09, 2017, 11:12 AM
Doug
Thanks Again Clif,

The Automapping did work. However, it's taking over 1/2 an hour to process 169000 rows, 84 columns, in DM. A select * in SQL server pulls everything in LT 3 minutes.
Is that 'normal'?
November 13, 2017, 05:22 AM
Frans
That is very slow. How long does it take to get the first record from your view?

Is it doing updates and inserts on the target or just inserts. What is the commitsize? Any heavy indexes on target?


Test: WF 8.2
Prod: WF 8.2
DB: Progress, REST, IBM UniVerse/UniData, SQLServer, MySQL, PostgreSQL, Oracle, Greenplum, Athena.
November 13, 2017, 10:27 AM
FP Mod Chuck
Doug

Which Load Type are you using on the target? If it is insert/update then that forces record at a time processing that can be very slow.


Thank you for using Focal Point!

Chuck Wolff - Focal Point Moderator
WebFOCUS 7x and 8x, Windows, Linux All output Formats
November 13, 2017, 10:33 AM
Doug
Yep, insert/update. We did notice that the time was 'quite a bit longer than a SQL Pull. However, we are doing this to preserve history. So, the insert/update is required. The fortunate side of this is that the BLOBs are not yet used (nulls).I should also mention that this is a new WebFOCUS, etc., client. The first ETLs were done last week.
November 13, 2017, 11:36 AM
Wep5622
Is there an index on the columns that you use to check whether an update can be used? Without one, the database will most likely fall back to a sequential scan, which will only get slower with more records.

Many indexes on the table (including foreign key references to and from other tables) will also slow things down as Frans mentioned.

What we often do is E(T)L to a staging table and do the data-merge within the database. That way the database engine gets better opportunity to optimize things. Another benefit is that you don't need to bother with updates in the ETL procedure, it's all inserts.
I recently saw someone call that ELT instead of ETL, but that's probably just being modern about it.


WebFOCUS 8.1.03, Windows 7-64/2008-64, IBM DB2/400, Oracle 11g & RDB, MS SQL-Server 2005, SAP, PostgreSQL 11, Output: HTML, PDF, Excel 2010
: Member of User Group Benelux :
November 13, 2017, 01:18 PM
Doug
Thanks Wep5622,

There are indexes on the primary fields with the BLOB being column 123 of 185. We can for now, deal with the slowness Frowner and will address that at a later date Smiler.
November 14, 2017, 07:23 AM
Wep5622
Does an UPDATE in SQL, with a similar statement as DM generates, take long too?

If it does, there are tools to track down where the most time is spent and on what kind of internal database operation that occurs. That usually provides the hints needed to tackle the problem. It's usually called something like EXPLAIN PLAN, but make sure you use the version that actually executes the query (in a transaction that you subsequently roll back if you want to prevent modifying the data).

If it doesn't, then you know that the problem is probably inside of DM somewhere.


WebFOCUS 8.1.03, Windows 7-64/2008-64, IBM DB2/400, Oracle 11g & RDB, MS SQL-Server 2005, SAP, PostgreSQL 11, Output: HTML, PDF, Excel 2010
: Member of User Group Benelux :
November 14, 2017, 08:10 AM
Frans
I suspect that because of the BLOB, it does inserts and updates one by one.

I would first check if you can use Change Data Capture on SQL Server. Then your process is only processing new and changed data. If that is not fast I would divide the process in 2 steps: inserts and updates. Check if you have the right indexes for the slowly changing dimensions. Insert whatever is new, update whatever is changed. But because of the BLOBS I'm not sure if the inserts will be done in batches. If not I would convert the blob to varchar and store it as varchar in the target system. That should be fast.


Test: WF 8.2
Prod: WF 8.2
DB: Progress, REST, IBM UniVerse/UniData, SQLServer, MySQL, PostgreSQL, Oracle, Greenplum, Athena.
November 14, 2017, 08:18 AM
Doug
Thanks guys, I'll check this out and pass it on to the DMs and let you know. The Explain Plan is fine. It's on the DM side that it's slow.