Is there any way to track, through logs or monitoring, the result that each data flow passes to the next flow? A result can be valid as far as the computer is concerned and still look abnormal to a human reader. Suppose one data flow within a process miscalculates (a single flow gives a wrong result), so the final result of the ETL is not what we expect.
How do we know the result is not as expected? Answer: from human experience, from logic, or because the result is far off from previous results.
Scenario: a process contains a total of 5 data flows. We plan to run this process 4 times, once per week. Each run of the entire process completes successfully, but the resulting record counts for the 4 weeks are as follows:
1st: 1000 (human: expected)
2nd: 1100 (human: expected)
3rd: 40 (human: abnormal, because the 2nd data flow generated an abnormal result, so the remaining flows referenced a wrong result)
4th: 50 (human: abnormal, same as the 3rd week)
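To make the "way off from previous results" check concrete, here is a minimal Python sketch. It is not an ETL-tool feature: the function name `flag_abnormal_runs` and the 50% `tolerance` are assumptions chosen for illustration. Each run is compared against the median of the earlier runs that were not flagged.

```python
from statistics import median

def flag_abnormal_runs(row_counts, tolerance=0.5):
    """Return a list of (count, is_abnormal) pairs.

    A run is flagged when it deviates from the median of the earlier
    unflagged runs by more than `tolerance` (a fraction; 0.5 means 50%).
    The first run has no baseline, so it is never flagged.
    """
    baseline = []  # counts from runs considered normal so far
    results = []
    for count in row_counts:
        if not baseline:
            abnormal = False
        else:
            expected = median(baseline)
            if expected == 0:
                # Avoid division by zero: any non-zero count is a change.
                abnormal = count != 0
            else:
                abnormal = abs(count - expected) / expected > tolerance
        if not abnormal:
            baseline.append(count)
        results.append((count, abnormal))
    return results

# The four weekly record counts from the scenario.
weekly_counts = [1000, 1100, 40, 50]
for week, (count, abnormal) in enumerate(flag_abnormal_runs(weekly_counts), start=1):
    print(week, count, "abnormal" if abnormal else "expected")
```

With the scenario's numbers, weeks 1 and 2 come out as expected and weeks 3 and 4 are flagged. The threshold would need tuning to whatever variation is normal for the data.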
In the scenario above, we know the 3rd week's result is abnormal, yet every process completed successfully. In this scenario, the issue may be caused by source data having been updated or replaced. So, is there any record or log that tracks which table(s) were updated?
I'm thinking about CDC (Change Data Capture) to track changes within the source database, but I'm not sure whether it is the only way or whether there is another way to backtrack a flow. (I haven't tested CDC yet.)
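One possible alternative to CDC, sketched here only as an idea: take a fingerprint (row count plus checksum) of each source table before every run and compare it with the previous run's snapshot. The sketch uses Python's built-in `sqlite3` as a stand-in for the real source database. The table names `orders` and `customers` and the helper names are hypothetical. It reads whole tables into memory, so it only suits small or sampled tables.

```python
import hashlib
import sqlite3

def table_fingerprint(conn, table):
    """Return (row_count, sha256 hex digest) for one table.

    `table` is interpolated into the SQL text, so it must come from a
    trusted, hard-coded list. Rows are sorted so that physical row order
    does not affect the digest.
    """
    rows = sorted(repr(row) for row in conn.execute(f'SELECT * FROM "{table}"'))
    digest = hashlib.sha256("\n".join(rows).encode("utf-8")).hexdigest()
    return len(rows), digest

def snapshot(conn, tables):
    """Fingerprint every table in `tables`."""
    return {table: table_fingerprint(conn, table) for table in tables}

def changed_tables(previous, current):
    """Names of tables whose fingerprint differs between two snapshots."""
    return sorted(t for t in current if previous.get(t) != current[t])

# Hypothetical source tables in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "a"), (2, "b")])

tables = ["orders", "customers"]
before = snapshot(conn, tables)
conn.execute("UPDATE orders SET amount = 99.0 WHERE id = 2")  # source data changes
after = snapshot(conn, tables)
print(changed_tables(before, after))
```

Storing each snapshot alongside the run ID would let an abnormal week be traced back to the source tables that changed since the last normal week.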
My objective is to be able to backtrack through the process and its flows to find out which flow produced the abnormal result. Any suggestions?
Creating a statistics table for comparison would be one idea, but would it cause performance issues once there are many flows to run? To keep track of each table within a flow, would we need to set up statistics for each table?
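As a rough sketch of the statistics-table idea, assuming each flow can report its output row count when it finishes: one small row is logged per flow per run, and the earliest flow whose count moved sharply against the previous run is reported. It is written in Python with the built-in `sqlite3`. The table name `flow_stats`, the flow names, and the intermediate row counts are all made up for illustration; only the final counts (1100 and 40) come from the scenario.

```python
import sqlite3

def create_stats_table(conn):
    conn.execute(
        """CREATE TABLE IF NOT EXISTS flow_stats (
               run_id     INTEGER NOT NULL,
               flow_order INTEGER NOT NULL,
               flow_name  TEXT    NOT NULL,
               row_count  INTEGER NOT NULL,
               PRIMARY KEY (run_id, flow_name)
           )"""
    )

def record_flow(conn, run_id, flow_order, flow_name, row_count):
    """Log one row per flow per run."""
    conn.execute(
        "INSERT INTO flow_stats (run_id, flow_order, flow_name, row_count) "
        "VALUES (?, ?, ?, ?)",
        (run_id, flow_order, flow_name, row_count),
    )

def first_suspect_flow(conn, run_id, prev_run_id, tolerance=0.5):
    """Return the earliest flow whose row count changed by more than
    `tolerance` (as a fraction) compared with the previous run, or None."""
    rows = conn.execute(
        """SELECT cur.flow_name, prev.row_count, cur.row_count
           FROM flow_stats AS cur
           JOIN flow_stats AS prev ON prev.flow_name = cur.flow_name
           WHERE cur.run_id = ? AND prev.run_id = ?
           ORDER BY cur.flow_order""",
        (run_id, prev_run_id),
    ).fetchall()
    for name, before, after in rows:
        if before and abs(after - before) / before > tolerance:
            return name
    return None

conn = sqlite3.connect(":memory:")
create_stats_table(conn)

# Hypothetical per-flow counts for week 2 (normal) and week 3 (abnormal).
week2 = [5000, 3000, 2000, 1500, 1100]
week3 = [5100, 120, 80, 60, 40]
for run_id, counts in ((2, week2), (3, week3)):
    for order, count in enumerate(counts, start=1):
        record_flow(conn, run_id, order, f"flow_{order}", count)

print(first_suspect_flow(conn, run_id=3, prev_run_id=2))
```

On the performance question: the statistics table itself stays small (five rows per run here), so the extra cost is mostly whatever it takes to obtain each count. If the ETL tool already reports rows processed per flow, logging that number should add little overhead, though this is worth measuring rather than assuming.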