Focal Point
[CLOSED] performance update with DM

This topic can be found at:
https://forums.informationbuilders.com/eve/forums/a/tpc/f/7971057331/m/9037091196

March 21, 2019, 12:38 PM
JOSELUIS
[CLOSED] performance update with DM
Good morning

I'm working with DataMigrator and I need to update only a specific field in a table.

I need to find an alternative for this type of update, because currently it takes more than 12 hours for 1 million records.

I currently use a process flow with the option "If the record exists, update the existing record".
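
Roughly speaking, that option makes the flow issue one keyed lookup and one single-row UPDATE per source record, something like this (the table and column names below are only placeholders, not my real schema):

  -- repeated once per source record, about 1,000,000 times:
  UPDATE control_table
     SET consumed_flag = 'Y'            -- the one field being updated
   WHERE record_id = :record_id;        -- key match against the existing row

so most of the 12 hours seems to be spent on per-record round trips.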

Thank you.



WebFOCUS 8.2.03, RedHat 7 Linux
March 21, 2019, 12:51 PM
Maintain Wizard
IBI has a product called Maintain which is used for Update, Include and Delete. It can read in the data from one database and use it to update another one. We can do this record at a time, in blocks of records, or in one large block.

Where is the data coming from and where is it going?

Mark Derwin
March 21, 2019, 02:15 PM
JOSELUIS
It is used with different databases (Oracle, SQL Server, MySQL, FirebirdSQL, PostgreSQL). The target is a control table in which we mark already-consumed records in a specific field.


Thank you.


quote:
Originally posted by Maintain Wizard:
IBI has a product called Maintain which is used for Update, Include and Delete. It can read in the data from one database and use it to update another one. We can do this record at a time, in blocks of records, or in one large block.

Where is the data coming from and where is it going?

Mark Derwin



WebFOCUS 8.2.03, RedHat 7 Linux
March 21, 2019, 02:34 PM
John_Edwards
Stretch out the time between commits, raising the interval to as many as 1,000,000 records. That dramatically reduces update times; it might buy you as much as a six-to-one improvement.
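
A rough before/after sketch of what that means at the database, with placeholder table and column names:

  -- commit after every row (slow):
  UPDATE control_table SET consumed_flag = 'Y' WHERE record_id = 1001;
  COMMIT;

  -- stretched commit interval: many rows in one transaction, one commit:
  UPDATE control_table SET consumed_flag = 'Y' WHERE record_id = 1001;
  UPDATE control_table SET consumed_flag = 'Y' WHERE record_id = 1002;
  -- ... the rest of the batch ...
  COMMIT;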



March 21, 2019, 02:47 PM
JOSELUIS
In a database like Oracle we have no problem increasing the commit interval, but MySQL, for example, does not support a value as high as the one you recommend. Can you help me with some other alternative for the update?

quote:
Originally posted by JOSELUIS:
It is used with different databases (Oracle, SQL Server, MySQL, FirebirdSQL, PostgreSQL). The target is a control table in which we mark already-consumed records in a specific field.


Thank you.


quote:
Originally posted by Maintain Wizard:
IBI has a product called Maintain which is used for Update, Include and Delete. It can read in the data from one database and use it to update another one. We can do this record at a time, in blocks of records, or in one large block.

Where is the data coming from and where is it going?

Mark Derwin



WebFOCUS 8.2.03, RedHat 7 Linux
March 21, 2019, 04:15 PM
John_Edwards
Alright, time to put the safety goggles on.

First of all, you may be able to ease the Commit number up enough to get performance that is sufficient for your needs. I've bumped into the limit as well, but leaving it at 1000 is not necessary. Play with it a bit if you haven't already, might get 2-1 and take 6 hours out of your load time. Just a thought, easy change to make.

Option 2 -- blow away and rebuild each time.

a) Use your destination as a source, read the whole doggone data table into an XFocus database as a temporary, which will preserve your data for when this option goes horribly wrong. Write this output to a specific location on your drive, don't let it go to its default location in Edatemp. I recommend a FOCUS procedure to do this, super fast, super simple, super dependable. Now you've got a backup, and a source for your next step.

b) Use that new XFocus database as a source, join it with your updates data, and use that join to write to your destination table. This allows you to truncate your destination table in your Prior To Load Option (i.e., blow away all its data) and then use the Insert Records From Memory option instead of Insert/Update which is faster.

I will draw your attention to the part in step a), where you make sure the XFocus file is saved to a specific drive location, one more time, as that is your backup in the event the power drops midway through this process.
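
For what it's worth, here is a rough SQL analogue of the rebuild; the real version keeps the backup in an XFocus hold file on disk and uses the DM load options, and every table and column name here is a placeholder:

  CREATE TABLE control_backup AS            -- step a: safety copy of the target
  SELECT * FROM control_table;

  TRUNCATE TABLE control_table;             -- prior-to-load: empty the target

  INSERT INTO control_table (record_id, consumed_flag)
  SELECT b.record_id,
         COALESCE(u.consumed_flag, b.consumed_flag)
    FROM control_backup b                   -- step b: the backup joined with
    LEFT JOIN updates u                     -- the incoming updates
      ON u.record_id = b.record_id;

(CREATE TABLE ... AS SELECT is the Oracle/MySQL/PostgreSQL spelling; SQL Server would use SELECT ... INTO.)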


"Safety lights are for dudes." -- Weaver



March 21, 2019, 05:46 PM
JOSELUIS
The option you describe does not help me, because the source has more than 20 million records and it takes too long to pull that volume down into FOCUS format; we only need to mark the new records.


quote:
Originally posted by John_Edwards:
Alright, time to put the safety goggles on.

First of all, you may be able to ease the Commit number up enough to get performance that is sufficient for your needs. I've bumped into the limit as well, but leaving it at 1000 is not necessary. Play with it a bit if you haven't already, might get 2-1 and take 6 hours out of your load time. Just a thought, easy change to make.

Option 2 -- blow away and rebuild each time.

a) Use your destination as a source, read the whole doggone data table into an XFocus database as a temporary, which will preserve your data for when this option goes horribly wrong. Write this output to a specific location on your drive, don't let it go to its default location in Edatemp. I recommend a FOCUS procedure to do this, super fast, super simple, super dependable. Now you've got a backup, and a source for your next step.

b) Use that new XFocus database as a source, join it with your updates data, and use that join to write to your destination table. This allows you to truncate your destination table in your Prior To Load Option (i.e., blow away all its data) and then use the Insert Records From Memory option instead of Insert/Update which is faster.

I will draw your attention to the part in step a), where you make sure the XFocus file is saved to a specific drive location, one more time, as that is your backup in the event the power drops midway through this process.


"Safety lights are for dudes." -- Weaver



WebFOCUS 8.2.03, RedHat 7 Linux
March 22, 2019, 04:34 AM
Martin vK
First of all, as your question is about DataMigrator, it should be in the iWay forum.

The performance issue is that you are using key matching: as it works, it reads records from the database one by one and updates them record by record, so indeed it will take a long time.

I can see you are on 8203, so you should have the new DM option 'Optimize Load'; you can find it in the Flow Properties. Nowadays I use this a lot. It transforms your flow into a single MERGE statement in the database, provided the target database supports that statement, and most RDBMSs do. The MERGE is performed entirely within the database, so there is no per-record processing via the DM server.

The main requirement is that the source table(s) must be in the same database; otherwise the database server cannot find the input data. But even if your source table(s) are not in the same database, you can still use this to get the required performance. Split your flow: first use your source table(s) from the other database and output to a temporary table in the target database, which should go fast using bulk load or insert from memory. Then make a second flow from the temporary table to the target table using Optimize Load, so it is performed through a MERGE statement.
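
As a sketch with placeholder names, the single statement that Optimize Load hands to the database looks something like this, with staging_updates standing in for the temporary table loaded by the first flow:

  MERGE INTO control_table t
  USING staging_updates s
     ON (t.record_id = s.record_id)
  WHEN MATCHED THEN
       UPDATE SET t.consumed_flag = s.consumed_flag
  WHEN NOT MATCHED THEN
       INSERT (record_id, consumed_flag)
       VALUES (s.record_id, s.consumed_flag);

(MySQL has no MERGE statement; there the same idea is usually written as INSERT ... ON DUPLICATE KEY UPDATE.)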

success, Martin


WebFocus 8206M, iWay DataMigrator, Windows, DB2 Windows V10.5, MS SQL Server, Azure SQL, Hyperstage, ReportCaster
March 22, 2019, 09:17 AM
John_Edwards
quote:
Originally posted by Martin vK:
First of all, as your question is about DataMigrator, it should be in the iWay forum.


There is nobody in the iWay forum. If you want to ask a Data Migrator question I very much recommend asking it here.



March 22, 2019, 10:07 AM
Marina
quote:
Oracle, SQL Server, MySQL, FirebirdSQL, PostgreSQL


The response that Martin vK provided could be the most suitable for you. In case you need more clarification, could you please provide a bit more information? For example, when you say that you need to "update a table", what are the source and target databases? Also, DataMigrator provides different types of loads on a target (Insert/Update, bulk, SCD, IUD, etc.), and it also offers different types of flows. Perhaps a DBMS SQL Flow could serve your purposes. Bottom line: the design depends on your data and requirements/rules.