February 13, 2008, 02:12 AM
VHayward
ETL - Performance Question regarding SAP to FOCUS
Dear people,
This question has been troubling me for ages. It could of course be tested, but maybe someone here already has the experience.
I am currently working on an ETL process.
It's about getting 8 million SAP records to a FOCUS file the quickest way using Data Migrator.
What would be the best way, given that filters, defines and computes could be used to transform the data?
1) SAP -> FOCUS with defines & filters
2) SAP -> FOCUS with computes & filters
3) SAP -> FLAT with filters BUT without defines, computes, then FLAT -> FOCUS with defines and computes
4) .... you come up with a suggestion ....
February 13, 2008, 08:57 AM
Jessica Bottone
I'm not all that familiar with SAP, so please forgive me for what may seem to be a stupid question. What type of database is the SAP data stored in? If it's not an RDBMS, you can stop reading here. :-) If it is an RDBMS, then it will really depend on exactly what your defines, computes and filters are. For any that can be passed over to the RDBMS engine, I would include them in the extract from SAP. I would also do all of this in a Data Migrator stored procedure: hold the results in a hold file, then apply the remaining defines/computes/filters against the first hold file and create a second hold file. The second hold file can then be used as the 'source' in a data flow to load your FOCUS database target. XFocus might actually be a better option for your target, if you get a choice of target database type. I have a document on how to do this in a stored procedure and pull the results of the stored procedure back into a data flow. If you'll send me your email address, I'll be happy to forward it to you.
Good Luck.
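The two-stage idea above can be sketched in a language-neutral way. This is not Data Migrator syntax; it is a small Python illustration using an in-memory SQLite table as a stand-in for the SAP source. The table name `sap_extract`, its columns, and the helper functions are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory table standing in for the SAP source (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sap_extract (id INTEGER, amount REAL, currency TEXT)")
    conn.executemany(
        "INSERT INTO sap_extract VALUES (?, ?, ?)",
        [(1, 50.0, "EUR"), (2, 150.0, "EUR"), (3, 300.0, "EUR")],
    )
    return conn


def stage_one(conn, min_amount):
    """Stage 1: the filter is passed to the database engine, so only matching rows are transferred."""
    cur = conn.execute(
        "SELECT id, amount, currency FROM sap_extract WHERE amount >= ? ORDER BY id",
        (min_amount,),
    )
    return cur.fetchall()  # plays the role of the first hold file


def stage_two(rows, rate):
    """Stage 2: apply the remaining derived field locally, against the first hold file."""
    return [(id_, amount, currency, amount * rate) for id_, amount, currency in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    hold1 = stage_one(conn, 100.0)
    hold2 = stage_two(hold1, 2.0)  # plays the role of the second hold file
    print(hold2)
```

The point of the split is that stage 1 shrinks the data before it leaves the source, so stage 2 only touches rows that survived the filter.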
February 13, 2008, 10:07 AM
PBrightwell
It has been a while since I did any SAP work, but generally speaking, defines will add more time than computes. As for option #3, it depends on the speed of the server. If the FOCUS files are on the same server as SAP, then you won't see a significant difference between that and option #2. If they are on different servers, do the computes/defines on the faster server.
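One common explanation for the define/compute difference, as I understand FOCUS, is that a DEFINE is evaluated for every record read, while a COMPUTE is evaluated on the rows left after sorting and aggregation. The following is a rough Python sketch of that assumption only, not FOCUS code; the function names are hypothetical.

```python
from collections import defaultdict


def define_style(records, fn):
    """Evaluate fn on every input record, then aggregate (DEFINE-like)."""
    calls = 0
    totals = defaultdict(float)
    for key, value in records:
        totals[key] += fn(value)
        calls += 1
    return dict(totals), calls


def compute_style(records, fn):
    """Aggregate first, then evaluate fn once per output row (COMPUTE-like)."""
    totals = defaultdict(float)
    for key, value in records:
        totals[key] += value
    result = {key: fn(total) for key, total in totals.items()}
    return result, len(result)


if __name__ == "__main__":
    records = [("A", 1.0), ("A", 2.0), ("B", 3.0)]
    double = lambda x: x * 2  # linear, so both styles give the same totals
    print(define_style(records, double))
    print(compute_style(records, double))
```

If the load does no aggregation, both styles evaluate the expression once per record, so any gap would have to come from elsewhere; testing on a sample of the 8 million records is the only way to be sure.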